Hey you, the end of the year is coming soon. I really liked this year with you. It was super fun to write my opinion on data topics every Friday of the year. I don't know yet whether I'll be able to come up with new material next year without repeating myself (I hate repeating myself), but I'll certainly try, and I'll continue.
We still have 2 Fridays left until the end of the year. Like last year, I'll try to do special editions, but no promises.
As a small reminder, the Advent of Data 🎄 is still running, and this week we got awesome articles again! So go check them out. For instance, Marie and Bryan wrote great pieces to help you get started with data: "Is tourism back to its pre-COVID-crisis level?" and "How to get started with data and help your local community".
dbt Cloud pricing update 🎁
Yesterday dbt Labs announced a nice Christmas present for all dbt Cloud customers: a new pricing model. But you know, this is the kind of Christmas present your uncle gives you and that you don't like. Something you want to return right away because it does not suit you.
Let's have a look at it. The major changes are listed below:
- Team plan price x2: from $50/month per dev to $100/month per dev, and limited to 8 devs
- Team plan is now limited to only one project
- Team plan will include the Semantic Layer that no one is asking for
- The free tier is now announced as US-based only
Small teams will see their dbt Cloud budget increase by 100%. For instance, a small team of 2 analytics engineers will now pay $2400/year just to have a server running their SQL queries and a web IDE that is yet to be perfected.
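To make the arithmetic explicit, here is a minimal back-of-the-envelope sketch in Python. It only uses the per-dev prices and the 8-dev cap from the list above. The function names are mine, and it assumes plain monthly billing with no discounts, taxes or other seat types, so treat it as an illustration rather than dbt Labs' actual billing logic.

```python
# Back-of-the-envelope check of the Team plan price change.
# Per-dev prices and the dev cap come from the change list above;
# everything else (names, no discounts or taxes) is an assumption.
OLD_PRICE_PER_DEV_MONTH = 50   # USD, previous Team plan price
NEW_PRICE_PER_DEV_MONTH = 100  # USD, new Team plan price
MAX_TEAM_DEVS = 8              # new cap on Team plan developers


def yearly_cost(devs: int, price_per_dev_month: int) -> int:
    """Yearly bill in USD for `devs` developer seats."""
    return devs * price_per_dev_month * 12


def increase_pct(old: int, new: int) -> float:
    """Relative increase from `old` to `new`, in percent."""
    return (new - old) / old * 100


if __name__ == "__main__":
    for devs in (1, 2, MAX_TEAM_DEVS):
        old = yearly_cost(devs, OLD_PRICE_PER_DEV_MONTH)
        new = yearly_cost(devs, NEW_PRICE_PER_DEV_MONTH)
        pct = increase_pct(old, new)
        print(f"{devs} dev(s): ${old}/year -> ${new}/year (+{pct:.0f}%)")
```

For the team of 2 from the example, this gives $1200/year before and $2400/year after, which is where the 100% increase comes from.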
Obviously, dbt Labs has all the data points on activity and feature usage to make this decision, but it feels weird, as dbt Cloud was a simple and inexpensive way for small users to enter the dbt world.
In terms of strategy, it also means that dbt Labs wants to push companies toward the Enterprise plan, with its hidden pricing. Don't forget that "transparency always wins": it is one of dbt Labs' core values.
Regular readers of the Data News might have noticed that I don't go easy on dbt Labs when it comes to their Cloud product, but this is the reality. If I caricature a bit, right now dbt Cloud is only a web IDE with the capability to run your models; it should be a commodity. For the moment, the real value of dbt exists only in Core and in the community. In the open-source part.
As a comparison, I have been paying for PyCharm for years. It costs me €99/year and I can do almost everything that is included in the dbt web IDE, plus I have my whole comfortable developer setup. The pricing difference is not worth it.
Fast News ⚡️
- Meta (Facebook) has been sued in the Northern District of California by a Californian law firm over leftovers of the Cambridge Analytica scandal. You'll probably say: "there is nothing new under the sun". OK. Then the court files went public and listed the tables storing user identifiers for ads: 11051 Hive tables and 1190 Python pipelines. Nothing new under the sun.
- Yep, 11051 Hive tables in the previous bullet point, you didn't misread. They need 11051 tables to run their ads system.
- Query your data in Kafka using SQL — This post compares Flink, ksqlDB, Trino, Materialize, RisingWave and Timeplus (the authors) as ways to query Kafka. Even if it's vendor-oriented, it is a good starting point for an overview of what you can expect from these tools.
- Traditional vs modern analytics data processing (part 2) — Petrica compares two ways to write data models, with schema auto-discovery on and off.
- Airbyte move(data) conf videos — A YouTube playlist with 38 videos from the online data engineering conference Airbyte organised a few weeks ago. I did not watch them for lack of time, but you can read Matt's takeaways.
- A Zero ETL Future — Benjamin explores the promise of Zero ETL, following the announcements from AWS and Snowflake.
- How HomeToGo has connected Superset Dashboards to dbt Exposures — Small article but great ideas.
- Why is everyone trying to kill Airflow? — Imagine a game of Cluedo where Airflow is Dr. Black. Who did it, when and with which weapon?
- Migration of Postgres from 9.6 to 10 via PgLogical for a Debezium.
- Unit Test SQL using dbt — Small setup to use seeds and tests to create a framework where you have unit tests.
Data Fundraising 💰
- Dataiku raised, once again: a $200m Series F. This new round brings the total amount of money raised to $846m, but with the global economic slowdown they did it at a lower valuation: $3.7b. Dataiku was one of the first companies to take the AI path with an all-in-one product. But it seems that over the years, as they focused on big corporations, they have struggled to sell their graphical drag-n-drop UI to smaller businesses.
As a side note, it is crazy to compare dbt Labs' valuation with Dataiku's. They are almost the same, but even if I don't like Dataiku, the depth of the two products is nowhere near comparable.
See you next week ❤️