group of cyclists marching on highway
Who's leading the data peloton? (credits)

Hey you, this is the Saturday Data News edition 🥲. Time flies. I'm working in advance on a series of articles for August about "creating data platforms" and I'm looking for ideas about the data I could use for it. Some kind of simulated real-time data would be best, but that requires writing a simulation, which is complicated enough. What would you use?
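To give an idea of what I mean by simulated real-time data, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the `Transaction` fields, the `simulate_transactions` function, the start date and the distributions are all hypothetical, not something I have settled on for the series.

```python
import random
import time
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterator


@dataclass
class Transaction:
    """Hypothetical event shape, chosen only for this example."""
    transaction_id: int
    customer_id: int
    amount_usd: float
    occurred_at: str  # ISO 8601 timestamp


def simulate_transactions(
    n_events: int,
    seed: int = 42,
    mean_gap_seconds: float = 2.0,
    realtime: bool = False,
) -> Iterator[Transaction]:
    """Yield fake transactions with random gaps between arrivals."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    clock = datetime(2023, 7, 1, tzinfo=timezone.utc)  # arbitrary start time
    for transaction_id in range(1, n_events + 1):
        # Exponentially distributed gaps mimic a Poisson arrival process.
        gap = rng.expovariate(1.0 / mean_gap_seconds)
        clock += timedelta(seconds=gap)
        if realtime:
            time.sleep(gap)  # only wait when emulating a live stream
        yield Transaction(
            transaction_id=transaction_id,
            customer_id=rng.randint(1, 100),
            amount_usd=round(rng.lognormvariate(3.0, 1.0), 2),
            occurred_at=clock.isoformat(),
        )


if __name__ == "__main__":
    for event in simulate_transactions(n_events=5):
        print(asdict(event))
```

With `realtime=True` the generator sleeps between events, so it can feed a queue or an API as if it were live. With the default it produces the same events instantly, which is handy for backfills and tests.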

Small French aside 🇫🇷

(A short aside for French speakers, jump to the next section)

This week I launched my French-language podcast, À l'heure des données. In this monthly podcast I will talk with French-speaking experts who shape the ecosystem. We will discuss the present, but also the future.

In the first episode I talked with Benoit Pimpaud, who was a data scientist at Olympique de Marseille and later moved to Deezer as a data engineer. Today he works on product at Kestra, an open-source orchestrator developed in France.

🎧 Listen to us on: Apple, Spotify, Deezer, Amazon

On a completely different topic, Stéphane Bortzmeyer took part in the CNRS conference Penser et Créer avec les IA génératives and wrote a report on those 2 days.

PS: would a French version of my content interest you?

The new dbt Semantic Layer

Following the acquisition of Transform by dbt Labs a few months ago, dbt Core is integrating MetricFlow, the semantic layer built by the acquired company. This week Nick Handel, co-founder of Transform, wrote about how the dbt Core specs will adapt.

As a reminder, a semantic layer is a reusable definition that sits on top of your models. The idea is then to use the semantics to generate SQL queries. You can read my article on the semantic layer.

In the new vision it will be possible to define multiple things:

Semantics and metrics in dbt Core explained. (credits: the example is reworked from Nick's examples)

Just above I gave you a precise example of how the new nomenclature will behave for a simple case with a fact_transaction model. It is important to note that the semantic layer sits on top of your current dbt model definitions.

To complete the picture, it is important to note that the revenue_usd metric can currently be queried either with a CLI or via the API that dbt Labs will release through their dbt Cloud offering.
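To make the "semantics generate SQL" idea concrete, here is a toy sketch in Python. It is not MetricFlow and it does not follow the dbt spec: the `Metric` class, the `compile_metric` function and the `amount_usd` and `country` columns are all hypothetical names I made up. Only fact_transaction and revenue_usd come from the example above.

```python
from dataclasses import dataclass
from typing import Sequence

# Aggregations this toy compiler accepts (an arbitrary choice for the sketch).
ALLOWED_AGGS = {"sum", "avg", "min", "max", "count"}


@dataclass(frozen=True)
class Metric:
    """A reusable metric definition that sits on top of a model."""
    name: str   # name exposed to consumers, e.g. "revenue_usd"
    model: str  # model the metric is computed from
    agg: str    # SQL aggregate function to apply
    expr: str   # column or expression to aggregate


def compile_metric(metric: Metric, group_by: Sequence[str] = ()) -> str:
    """Turn a metric definition plus requested dimensions into a SQL string."""
    if metric.agg not in ALLOWED_AGGS:
        raise ValueError(f"unsupported aggregation: {metric.agg}")
    # Dimensions come first, then the aggregated measure aliased to the metric name.
    select = [*group_by, f"{metric.agg.upper()}({metric.expr}) AS {metric.name}"]
    sql = f"SELECT {', '.join(select)}\nFROM {metric.model}"
    if group_by:
        sql += f"\nGROUP BY {', '.join(group_by)}"
    return sql


if __name__ == "__main__":
    revenue_usd = Metric(
        name="revenue_usd",
        model="fact_transaction",
        agg="sum",
        expr="amount_usd",  # hypothetical column name
    )
    print(compile_metric(revenue_usd, group_by=["country"]))
```

The point of the design is that the metric is defined once and every consumer asks for it by name with the dimensions it needs, instead of rewriting the aggregation in each dashboard or notebook.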

As an extension, I've seen 2 things this week that I feel make sense here:

These two examples are not really semantic layers in the strict sense, but revolve around the concept.

Gen AI 🤖

cooked food on round white ceramic plate
Now you want to think twice before eating a pizza (credits)

Fast News ⚡️

Data Economy 💰


See you next week ❤️