My weeks are like (credits)

Hello Data News readers. The weeks are pretty intense for me and every Friday comes in the blink of an eye. I write the introduction before the content of the newsletter, so I don't know how it'll turn out today. But I hope you'll enjoy it.

For a future deep-dive, I'm looking for data engineering career paths. If you have one, or something similar, in your company, I'd love to have a look at it (everything will be anonymized by default, ofc).

No fundraising this week. I didn't find any news worth shining a light on.

Data roles

Every tech lead faces this identity crisis one day or another, and it's the same for every data lead. How should you divide your time between management, individual contribution and stakeholders? Mikkel describes well the difficult life of the data lead. I was previously in a lead role, and the main advice I can give to people in the same situation is: do your grieving and stop the contribution work, except for code reviews.

In the same vein, two other posts I liked this week:

The metrics layer

Pedram produced a deep-dive on the metrics layer. He explains what's behind the concept and what the current solutions proposing a metrics layer are: Looker, dbt Metrics and Lightdash.

In the current state of the technology, the metrics layer is nothing more than a declarative way (a file) to describe the metrics, dimensions, filters and segments in your warehouse tables. In Looker you write it in LookML, in dbt and Lightdash you use the dbt YAML, and in Cube you use JavaScript.
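To make this concrete, here is a minimal sketch of what such a declarative definition could look like, in the spirit of the dbt metrics YAML; the model, column and metric names are all hypothetical, illustrative placeholders.

```yaml
# Hypothetical dbt-style metric definition (names are made up for illustration).
metrics:
  - name: weekly_active_users
    label: Weekly Active Users
    model: ref('fct_user_events')    # a hypothetical warehouse table
    calculation_method: count_distinct
    expression: user_id
    timestamp: event_at
    time_grains: [day, week, month]
    dimensions: [country, device]    # the slices a BI tool could offer
    filters:
      - field: is_internal
        operator: '='
        value: 'false'
```

The point is that this is pure declaration: what the metric is, where it lives, and how it can be sliced, with no tool-specific query logic.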

The end vision of the metrics layer is an interoperable way to define metrics and dimensions that every BI tool will understand natively, avoiding the hours spent recreating this knowledge in each tool. But we are far from there.

McDonald’s event-driven architecture

Event flows at McDonald's (credits)

A two-post series details what's behind McDonald's events architecture. In the first post, they define what it means to develop such an architecture: something that needs to be scalable, available, performant, secure, reliable, consistent and simple. Quite conventionally, they picked Kafka (but managed by AWS), the Schema Registry, DynamoDB to store the events, and API Gateway to create an API endpoint that receives events. Nothing fancy on the face of it, but it looks solid.

In the second post, they give the big picture and show how everything orchestrates together, defining the typical data flow. We can summarize it like this: define the event schema, produce the event, validate it, publish it, and if something goes wrong, use a dead-letter topic or write directly to DynamoDB.
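The flow above can be sketched in a few lines. This is not McDonald's actual code, just a toy Python model of the routing logic, with stand-in stubs for the schema check, the broker and the fallback store; every name here is an assumption.

```python
# Toy sketch of the publish flow: validate against a schema, try the main
# topic, and fall back to a dead-letter path or a DynamoDB-like store.
# All names and behaviors are illustrative, not McDonald's real API.

REQUIRED_FIELDS = {"event_id", "event_type", "payload"}

def validate(event: dict) -> bool:
    """Stand-in for a Schema Registry check: required fields present?"""
    return REQUIRED_FIELDS.issubset(event)

def publish(event: dict, broker_up: bool = True) -> str:
    """Route an event and return the destination it ends up in."""
    if not validate(event):
        return "dead-letter-topic"     # malformed events are quarantined
    if not broker_up:
        return "dynamodb-fallback"     # broker outage: persist for replay
    return "events-topic"              # happy path

good = {"event_id": "1", "event_type": "order", "payload": {}}
bad = {"event_id": "2"}

print(publish(good))                   # → events-topic
print(publish(bad))                    # → dead-letter-topic
print(publish(good, broker_up=False))  # → dynamodb-fallback
```

The design choice worth noting is that failures never drop events: they are always parked somewhere replayable.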

ML Friday 🤖

Fast News ⚡️

The true Uber alternative (credits)

See you next week and please stop writing about data contracts.