
Data News — Week 22.30

Data News #22.30 — Hightouch acquisition, Contentsquare and Neon fundraising, RStudio new direction, Data versioning and the fast news.

Christophe Blefari
4 min read
Summer (credits)

Dear readers, I hope this email finds you well. This is still the summer edition of the Data News. I'm so happy to see people reading the news even if it's summer, so thank you all for the support once again.

Data fundraising 💰

RStudio news

Rebranding time. During rstudio::conf(2022), the RStudio team introduced a lot of changes and a new direction. First, they are becoming Posit. In summary, they are making this change because they want to reach beyond R for data science. They aim to help all data scientists, which means they will develop multi-language tooling (incl. Python).

The first manifestation of this vision is the release of Quarto. This is an open-source scientific and technical publishing system in which you can create dynamic content in Python, R and Julia. It has been inspired by R Markdown.

Next, they also announced Shiny for Python. Shiny, which is a way to create and publish web apps directly from your R code, will be available in Python. This could become a credible alternative to Streamlit.

To be honest, I don't know the R world very well. I've written so little R code that my opinion could be wrong and bad. So, sorry in advance. I really like the vision and the initiative. R developers are legion, and a lot of people still use R because their niche library is only available in R. So if the vision is to empower everyone no matter the language, this is good. Still, it'll be hard to break through the scientific-tool wall and become an enterprise-ready tool, by which I mean production-ready.

Just as a side note: please don't become like Anaconda. I feel they tried to become the one-stop shop for everything, and now it is too big to be the relevant player I want.

I've also read on Twitter that it would be awesome if R's dependency system could improve Python's. I don't disagree.

The R door (credits)

Data versioning

With today's cloud storage capacities, we can afford to keep every change made to our data. A lot of different technologies can work with data versions. Christian wrote about how you can version your data lake (with LakeFS).

Also, shame on me, I just discovered today that BigQuery has a time travel feature (up to 7 days). See how Guillaume does BigQuery table snapshots.
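To make the time travel idea concrete, here is a minimal sketch that only builds the SQL text using BigQuery's FOR SYSTEM_TIME AS OF clause. The helper name time_travel_query and the table my_dataset.my_table are hypothetical, and the 7-day limit is simply the figure quoted above, so check the window configured on your own dataset. The sketch does not call BigQuery; you would still have to run the query with your usual client.

```python
from datetime import timedelta

# Assumption: the 7-day window mentioned in the text; your dataset may differ.
MAX_TIME_TRAVEL = timedelta(days=7)


def time_travel_query(table: str, age: timedelta) -> str:
    """Build a query reading `table` as it was `age` ago (hypothetical helper)."""
    if age < timedelta(0) or age > MAX_TIME_TRAVEL:
        raise ValueError("age must be between 0 and 7 days")
    seconds = int(age.total_seconds())
    # FOR SYSTEM_TIME AS OF reads the table state at the given timestamp.
    return (
        f"SELECT * FROM `{table}` "
        "FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {seconds} SECOND)"
    )


# Example: the table as it looked one hour ago (table name is made up).
print(time_travel_query("my_dataset.my_table", timedelta(hours=1)))
```

Asking for anything older than the window raises a ValueError here instead of producing a query that would fail once submitted.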

In addition, if you use dbt, here is how you can do Change Data Capture in dbt, or two ways to create incremental models.

Analytics time

Q&A to learn from others

This week we got a small Q&A from the Picnic data engineering team sharing thoughts on the lakehouse and the data mesh. Elsewhere, Instacart's VP of data science shared how you can build a data-driven company, which you should put in perspective with Benn's post from last week: do data-driven companies always win?

Me reading stuff (credits)

Fast News ⚡️


