Me reading Hacker News comments (credits)

Hello dear readers, this is year 2021, week 43. Yeah, that's the big advantage of being a subscriber: you get the week number directly in your mailbox.

This week I published 3 stories about the time I deleted data in production, even if the title is deliberately clickbait because the term "production" is blurry. It was my first time on the Hacker News front page and I did not like it: as you can see there, the comments are a bit negative. But isn't that a good way to learn something?

I've changed something in the usual fast news, so read everything to see it!

Data fundraising 💰

The next big challenge for Data is organizational

I know that you are a diligent reader of the data news, so you have already noticed that I really like articles dealing with data organization topics, even if I'm a data engineer 🤓. We already see the same organizational issues recurring across data teams, and solving them is going to be the next big challenge for data.

I totally agree with the article: today, tools are important but not as crucial as people. Tomorrow your data teams will shine because you hire the right people for your organization and because you embrace techniques that suit them, probably inspired by software engineering.

Also do not hesitate to Subscribe if you like this kind of content.

How Postman's data team works

We have already seen how Postman's data team organized their platform around Atlan to create a data workspace for everyone. This week we'll see how they organized the data team itself to meet operational needs. It seems they are organized around 2 teams, engineering and science, with analysts included in the second one. They also give analysts different hats, as they can be central, embedded or distributed.

I really like this article because it shows the thinking behind how they built their team, which could help everyone.

Grow as an ML engineer

The ML engineer position is probably the newest in the tech ecosystem. The boundaries of the role are still blurry, but the more people write about it, the faster we will converge on a good definition. This week I suggest you read this Roadmap: from Backend Engineer to ML Engineer. It covers the major topics an MLE is involved in and the main skills they need to master.

If you also plan to go freelance as a machine learning engineer, Pau wrote about how you can defeat your impostor syndrome and launch yourself as an independent.

To finish, and because I'm a data engineer, I'd also like to share the 7 things you need to know to become a data engineer that Medhi proposed.

An ML engineer, a data engineer and an analytics engineer are in a pot... (credits)

The rise and evolution of data engineering — what’s it all about? 🌙→☀️

Data engineering is truly changing. The first reason is all the tools appearing each day that simplify our daily job. The second is that the field is becoming hyped, which attracts money and talent. Because of this, data engineers are being dragged out of the shadows. But what's outside the shadows? Is it light? Olivier Molander puts words on the evolution of data engineering.

What is data versioning and 3 ways to implement it

We've all built a data lake or a data warehouse at least once in our lives. And we probably did it the batch way. This is normal. But as time passes and volume grows, the batch method will start to struggle. You'll need to understand concepts like change data capture (CDC) or data versioning, and here is how to do it.
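To make the idea concrete, here is a minimal sketch of change capture done by comparing two batch snapshots. It is an illustration under assumptions, not one of the approaches from the linked article: the `diff_snapshots` function, the `id` key and the sample rows are all hypothetical, and real CDC tools usually read the database log rather than diffing full snapshots.

```python
from typing import Any

Row = dict[str, Any]


def diff_snapshots(previous: list[Row], current: list[Row], key: str = "id") -> list[dict[str, Any]]:
    """Compare two full snapshots and emit insert/update/delete change events."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}
    changes: list[dict[str, Any]] = []

    # Rows present today: either brand new or modified since the last snapshot.
    for k, row in curr_by_key.items():
        if k not in prev_by_key:
            changes.append({"op": "insert", "key": k, "after": row})
        elif row != prev_by_key[k]:
            changes.append({"op": "update", "key": k, "before": prev_by_key[k], "after": row})

    # Rows that disappeared since the last snapshot.
    for k, row in prev_by_key.items():
        if k not in curr_by_key:
            changes.append({"op": "delete", "key": k, "before": row})

    return changes


# Hypothetical sample data: two daily snapshots of the same table.
yesterday = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
today = [{"id": 1, "name": "ada lovelace"}, {"id": 3, "name": "cy"}]
changes = diff_snapshots(yesterday, today)
```

Storing these change events instead of overwriting the table is one simple way to keep a history of your data, at the cost of scanning both snapshots on every run.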

To go further on this concept, I suggest this episode of the Data Engineering Podcast about Streaming data pipelines made SQL with Decodable (I like the pun) and this state of the art about Feature Stores. Look at the end of that post for a link to the Feature Store Summit if you're interested.

Data Engineers shouldn't write Airflow DAGs: new episode [1]

The Leroy Merlin Russia team explained how they implemented configuration-driven DAG generation for their data platform using Airflow. Thanks to YAML templates, they are able to describe what data they want and how they transform it.
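As a rough illustration of the configuration-driven idea, the sketch below turns a declarative pipeline description into an ordered list of tasks. Everything in it is an assumption made for the example: the config layout, the `depends_on` field and the task names are hypothetical, the config is written as a Python dict (what a YAML loader would return), and plain Python stands in for Airflow so that the snippet runs on its own. It is not the Leroy Merlin team's implementation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline description; in practice this would be loaded from a YAML file.
PIPELINE_CONFIG = {
    "dag_id": "orders_daily",
    "tasks": {
        "extract_orders": {"depends_on": []},
        "clean_orders": {"depends_on": ["extract_orders"]},
        "aggregate_orders": {"depends_on": ["clean_orders"]},
    },
}


def build_execution_order(config: dict) -> list[str]:
    """Validate the declared dependencies and return tasks in a runnable order."""
    tasks = config["tasks"]

    # Fail early if a task depends on something that is not declared.
    for name, spec in tasks.items():
        for dep in spec["depends_on"]:
            if dep not in tasks:
                raise ValueError(f"{name} depends on unknown task {dep}")

    # Map each task to its predecessors and sort so that dependencies come first.
    graph = {name: set(spec["depends_on"]) for name, spec in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_execution_order(PIPELINE_CONFIG)
```

In a real Airflow setup, the same loop over the config would create operators and wire their dependencies instead of returning names, so that data engineers only edit the configuration file.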

Let's release data products (credits)

Fast News ⚡ and Releases 👻

Let's bring some data product news and releases into the fast news to follow how our beloved data products are evolving.

Below, the usual Fast News.


[1] Props to Anthony Henao for the title (cf. here)