
Here's a new edition of the Data News newsletter. Since my 2-year anniversary post, I've been struggling to find the right writing rhythm. I've been sick and stuck on a client project, so writing the newsletter hasn't been an easy exercise, even though I keep telling myself "it's not a question of motivation, it's a question of discipline" like a LinkedIn guy. I do things because I enjoy the process of doing them, not for the results.

That's why I'll try to change the way things are done a bit for the next 3 months. As of today, I do the newsletter every Friday: I search for and read articles first, then I write. Starting next week I'll do it on Thursday, so I can schedule the sending for the same hour every Friday, at 2PM.

This way, I'll dedicate my Fridays to writing original articles, exploring ideas and preparing a stock of articles for the summer holidays. I plan to take a 1-month break in August, but at the same time I have FOMO (fear of missing out), so I need to schedule articles in advance. I can already tease that I'll create content about "Create a data platform in 2023", with live examples.

In September I'll do a retro and decide whether this is the right way to continue.


In terms of content, I've recorded a new podcast episode (in French) that will be out next week. The French version will be a bit different from Minds of data: it'll be more round tables and discussions about the present and the future of our ecosystem.

We also scheduled the next Paris Airflow Meetup at Mirakl's offices. Pierre, an Airflow committer and PMC member, will present his Airflow journey. Join us!

Data contracts, dbt and modeling

Back to the roots: it's been a long time since I shared dedicated stuff about dbt. This week a natural cluster of articles emerged. A few people have already implemented things with the new model governance features dbt introduced last month in v1.5.

Julian shared a nice way to use dbt model governance when you have 1000+ models. In a nutshell, you can add new properties to models that give dbt more context: models can have a group, an access level, a contract and versions. In the article, Julian draws a great comparison with software development, likening model management to managing programmatic APIs with public or private visibility. Finally, he proposes 6 logical data layers to sort your models: source, base, cleanse, core, business and marts.
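To make this concrete, here's a minimal sketch of what those governance properties look like in a dbt (v1.5+) model YAML file. The model, group and column names are illustrative, not taken from Julian's article:

```yaml
# models/core/_core__models.yml
groups:
  - name: core
    owner:
      name: Data Engineering

models:
  - name: dim_customers
    group: core
    access: private        # only models in the "core" group can ref() this one
    latest_version: 2
    config:
      contract:
        enforced: true     # dbt verifies column names and data types at build time
    columns:
      - name: customer_id
        data_type: int
    versions:
      - v: 1
      - v: 2
```

With `access: private`, a `ref('dim_customers')` from a model outside the `core` group fails at parse time, which is exactly the public/private API boundary the article describes.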

This structure also gives the team more visibility, because you can draw clear boundaries, like: data engineers are responsible for the first 3 layers, analytics engineers for the others.

To go deeper into data contract concepts applied to the warehouse and dbt, you can activate ownership with dbt data contracts. Mikkel also showcases his tool, synq.io, which runs tests and alerts on top of dbt.

In addition there are 2 awesome articles about related topics:

Gen AI 🤖

The single greatest risk of AI is that China wins global AI dominance and we – the United States and the West – do not.

I propose a simple strategy for what to do about this – in fact, the same strategy President Ronald Reagan used to win the first Cold War with the Soviet Union.
A ControlNet-generated QR code; the link leads to a website, developed by the author, for personalising QR codes

Fast News ⚡️

Data Economy 💰


See you next week ❤️.