
Data News — Week 37

Data News #37: upcoming online conferences, OLAP cube explained, salary survey, dbt seeds, a huge AI Friday and more.

Christophe Blefari
Back to school — let's learn new things this year (credits)

Hi folks, I hope you all had a great week. A new professional year has started (yes, to me the new year still starts in September, like I'm still going to school) and I wish you the best for yours. What do you plan to learn this year? On my side, I'd like to get better at pottery 🫖.

With this new year, new community events have been announced. Since I don't have any awesome fundraising to share this week, I'll share upcoming events with you instead.

Upcoming events 📺

Also worth noting: the Kafka Summit took place this week. If you want an idea of what happened there, Robin Moffatt, Dev Advocate at Confluent, wrote some Twitter threads about his attendance.

What's an OLAP cube? 🧊

From interview questions to onboarding chats, data people often mention OLAP or OLTP without really knowing what they mean. Claire Carroll wrote an awesome post explaining what an OLAP cube is. It was written a month ago, but it's a personal favorite.

As a side note, I really like the conclusion about "Jargon as a gatekeeper", which says that we, the data community collectively, keep using complicated terms that create a barrier excluding new people.
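To make the term a bit less of a gatekeeper, here is a minimal sketch of the idea behind a cube: pre-aggregating a measure for every combination of dimensions so that lookups are instant. The sales rows, dimension names and the `build_cube` helper are made up for illustration and are not taken from Claire Carroll's post.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows, invented for this example.
SALES = [
    {"country": "FR", "product": "book", "year": 2021, "revenue": 10},
    {"country": "FR", "product": "game", "year": 2021, "revenue": 20},
    {"country": "US", "product": "book", "year": 2020, "revenue": 5},
    {"country": "US", "product": "book", "year": 2021, "revenue": 15},
]
DIMENSIONS = ("country", "product", "year")


def build_cube(rows, dimensions, measure):
    """Pre-aggregate the measure for every subset of the dimensions."""
    cube = defaultdict(int)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                # The key lists the (dimension, value) pairs being fixed.
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)


cube = build_cube(SALES, DIMENSIONS, "revenue")
total = cube[()]                                            # all revenue
fr_total = cube[(("country", "FR"),)]                       # slice on one dimension
us_books = cube[(("country", "US"), ("product", "book"))]   # slice on two dimensions
```

Every "slice" is then a dictionary lookup instead of a scan over the raw rows, which is the trade-off a cube makes: more storage and build time for faster reads.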

O'Reilly 2021 Data/AI Salary Survey

O'Reilly published the results of their salary survey this week (mainly based on US-based respondents). The charts are interesting: they were able to split salaries by gender (a gap still needs to be closed), by programming language and also by tools and platforms (this last split is not that relevant, as the tools are too heterogeneous).

The 20% salary gap at executive level explained (credits)

GDPR compliance in a nutshell

It's a first in the Data News: we are talking about the GDPR. People from Sifflet wrote a FAQ / glossary post covering all you need to know about the European regulation to be compliant.

🌱 Use dbt seeds for your Lookup tables

Daniel Mateus Pires explained how his team uses dbt seeds to better manage lookup tables (or reference tables). This post is a super good introduction to the dbt seeds feature.
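If you have never used seeds: they are CSV files kept in the dbt project that `dbt seed` loads into the warehouse as tables. The sketch below only mimics that idea in plain Python with SQLite, it is not dbt itself. The CSV content, the table names and the `load_seed` helper are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical seed content, as it could live in a version-controlled CSV file.
COUNTRY_CODES_CSV = """code,name
FR,France
US,United States
"""


def load_seed(conn, table, csv_text):
    """Load a small CSV into a table, replacing any previous version."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f"{name} TEXT" for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


conn = sqlite3.connect(":memory:")
load_seed(conn, "country_codes", COUNTRY_CODES_CSV)

# A made-up fact table that only stores the short code.
conn.execute("CREATE TABLE orders (id INTEGER, country_code TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "FR"), (2, "US"), (3, "FR")])

# The lookup table turns codes into readable names at query time.
orders_with_country = conn.execute(
    """
    SELECT o.id, c.name
    FROM orders AS o
    JOIN country_codes AS c ON c.code = o.country_code
    ORDER BY o.id
    """
).fetchall()
```

The point is that the lookup table is reviewed and versioned like code, while the join stays a normal SQL join.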

Airflow hidden features

Did you know the Airflow CLI contains a command to generate an image of your Airflow DAG? I didn't before reading this cheat sheet about the Airflow CLI.

Folks at Databand.ai also explained how to use Airflow cluster policies and task callbacks to add observability to your tasks without too much overhead.
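To give an idea of how the two fit together, here is a rough sketch, not the Databand.ai implementation. In Airflow, a `task_policy(task)` function defined in `airflow_local_settings.py` is applied to every task, and `on_failure_callback` receives the task context. The `log_failure` callback and the fake task below are assumptions made so the snippet runs without Airflow installed.

```python
import logging
from types import SimpleNamespace

log = logging.getLogger(__name__)


def log_failure(context):
    """Hypothetical task callback: log which task failed."""
    ti = context.get("task_instance")
    log.error("Task failed: %s", getattr(ti, "task_id", "unknown"))


def task_policy(task):
    """Cluster policy sketch: attach a default failure callback to every task.

    Tasks that already define their own callback are left untouched.
    """
    if getattr(task, "on_failure_callback", None) is None:
        task.on_failure_callback = log_failure


# Stand-in for an Airflow operator, only for this example.
fake_task = SimpleNamespace(task_id="extract", on_failure_callback=None)
task_policy(fake_task)
```

Because the policy is applied centrally, DAG authors get the callback without changing a single DAG file, which is where the "not too much overhead" comes from.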

Understand Materialized Views — Part 1 & 2

Dunith Dhanushka wrote two articles on Medium to help you understand how materialized views can be useful for you and how they can speed up queries.
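As a quick illustration of the concept (not taken from the articles): a materialized view stores the result of a query so reads are cheap, at the price of a refresh step. SQLite has no materialized views, so this sketch emulates one with a plain table. The `events` and `user_totals` tables and both helper functions are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)])


def refresh_user_totals(conn):
    """Recompute and store the aggregated result (the 'materialized' part)."""
    conn.execute("DROP TABLE IF EXISTS user_totals")
    conn.execute(
        "CREATE TABLE user_totals AS "
        "SELECT user_id, SUM(amount) AS total FROM events GROUP BY user_id"
    )


def read_user_totals(conn):
    """Read the stored result without touching the raw events."""
    return conn.execute(
        "SELECT user_id, total FROM user_totals ORDER BY user_id"
    ).fetchall()


refresh_user_totals(conn)
before = read_user_totals(conn)

# New data arrives: the stored result is stale until the next refresh.
conn.execute("INSERT INTO events VALUES (2, 3)")
stale = read_user_totals(conn)

refresh_user_totals(conn)
after = read_user_totals(conn)
```

The stale read in the middle is the main thing to keep in mind: the speed-up comes from not recomputing, so you have to decide how and when the stored result gets refreshed.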

AI Friday

This week I want to share with you some AI articles from the last few weeks that I found really well written and inspiring! It sometimes makes me want to do AI 🙃.

The Deezer team explained what they use to recommend music to new users. It is nice to see people writing about cold start.

Marie-Fleur Sacreste from Preligens, a French defense AI company, described in a well-detailed post how they created a unique agile framework to deploy deep learning algorithms in the blink of an eye.

Finally, if these two articles motivated you to work with AI, here are some lessons learned from 2 years as a data scientist.

Don't forget to put on your Data Scientist glasses to do AI (credits)

Fast News ⚡


Thank you and see you next week.


