This is what we call a Chat in French (credits)

Hello there, this is Christophe, live from the human world. Last week was totally driven by the ChatGPT frenzy: the social networks I follow are flooded with conversation screenshots and hype. I don't know what the future holds for us, but MaaS (Models as a Service) does not look bright to me. OpenAI executed it perfectly: they dedicated a gigantic amount of computing power to offer a neat pay-as-you-query experience, like BigQuery. And I bet it will transform our industry as much as BigQuery did. But do we want big companies holding decision power in their own pre-trained models, leaving real data science to the big ones?

I don't want to be alarmist, this is not the tone I have here in the Data News, but do we want a future where the support chat of our home train service or our mobile carrier is run under the hood by one of Musk's companies? Ok, it's a caricature, but imagine. I can't wait to see Excel sheets comparing the average cost per word written by a human and by a machine.

🎄 Let's switch topics. It's time for the Advent of Data heads-up. Since last week's edition, 6 new articles have been published in the calendar. Go taste your daily chocolates. In a nutshell, you can now develop an internal pip package for your data team, handle governance, explain to stakeholders what you're doing, and send AI models to small devices, while understanding Rust for data engineering and 3 key geospatial metrics.

Paris Airflow Meetup 🧑‍🔧

On Tuesday I organised the 4th Paris Apache Airflow Meetup, the first one since 2019, and it was awesome: I met a lot of people, and the talks and the venue were great. The goal now is to run one meetup per month in 2023. For this I'll be looking for speakers and hosts, so if you live in France and want to share something with the French community, reach out to me. I have a lot of ideas.

After a short introduction, the evening started with a presentation by Clément and Steff from the leboncoin data engineering team. They shared the good practices they implemented to scale their Airflow development. To give a figure, at leboncoin 7 teams use Airflow to operate more than 1000 DAGs. Here is a short takeaway of their presentation, in English:

Then Charles & Charles from the Qonto data engineering team shared how they integrated dbt within Airflow. After a short introduction to the classic modern data stack combo (snowflake-dbt-tableau-airflow), Charles presented what dbt is and what the options are for integrating dbt within Airflow.

In a nutshell you have 3 options to do it:

Qonto decided to go for the last option. Then the other Charles detailed what it means and how they monitor what is happening. Obviously, this approach comes with a few pros and cons:

In the end they showcased the Metabase dashboard that helps them understand every dbt run. It is very complete and mixes the dbt artifacts with data from Airflow, thanks to a clever trick: they use XCom to save metadata in the database so that it can be queried from Metabase.
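To make the XCom trick a bit more concrete, here is a minimal sketch in Python. It is not Qonto's code (the talk's implementation is not reproduced here); it only assumes that a task reads dbt's `run_results.json` artifact and returns a small summary. The function name `summarize_dbt_run` and the fields of the summary are hypothetical. In Airflow, the value returned by a Python task callable is pushed to XCom by default, and XComs are stored in the Airflow metadata database, which a BI tool can then query.

```python
from collections import Counter


def summarize_dbt_run(run_results: dict) -> dict:
    """Reduce a dbt run_results.json payload to a small, JSON-serializable summary.

    Hypothetical helper: if used as an Airflow Python task callable, the returned
    dict would be pushed to XCom (default behaviour for return values).
    """
    results = run_results.get("results", [])

    # Count nodes per status (e.g. "success", "error", "skipped").
    status_counts = Counter(r.get("status", "unknown") for r in results)

    # Find the node that took the longest to execute, if there is any result.
    slowest = max(results, key=lambda r: r.get("execution_time", 0.0), default=None)

    return {
        "nodes_total": len(results),
        "status_counts": dict(status_counts),
        "elapsed_time": run_results.get("elapsed_time"),
        "slowest_node": slowest.get("unique_id") if slowest else None,
    }
```

Keeping the summary small matters here: XCom is designed for passing light metadata between tasks, so the full dbt artifacts are better kept in files or object storage and only the key figures pushed to XCom.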

👀 See the slides

PS: shout-out to people I met there reading the newsletter, your kind words are important and it gives me a lot of motivation. See you soon ❤️.

Studious atmosphere while listening to Charles^2 (credits: Alaeddine)

Fast News ⚡️

Data Fundraising 💰🇫🇷


See you next next week ❤️.