Data News — Week 38

Data News #38 — ClickHouse and Bigeye fundraising, Snowflake and Databricks competition, dbt v1, Infra as SQL and new trendy OS technologies.

Christophe Blefari
4 min read
Data trends — same shape but different colors (credits)

Hello, a fresh edition of the Data News, delivered to you on time. This week we may have fewer articles than in previous weeks, but they still raise some interesting trends and questions.

Data fundraising 💰

  • Yesterday Bigeye announced their $45m Series B, following a Series A 6 months earlier. Bigeye (formerly called Toro Data) is a data product focused on data monitoring and alerting. You connect your sources, set up your metrics (auto or not), set up thresholds (auto or not) and you're done.
  • ClickHouse, Inc announced $50m in Series A funding. The high-performance columnar database will be incorporated into this new company and will spin out from Yandex (its founding home company). I imagine they will now try to compete with others in the cloud database segment.

Snowbricks & Dataflake

This is a topic I've already mentioned in the newsletter. Snowflake and Databricks are converging: they both compete in the "Cloud Data Platform" segment (and also on Data Cloud 🤷). Annika Lewis wrote a two-part analysis of where it's going and of the similarities between this competition and the earlier SAP & Oracle one.

A house with Snowbricks (credits)

Is BI dead?

Benn asked this question this week about the future of BI. Is BI dead? In a sense, yes: the terms BI and BI tools became untrendy because we now speak of the Modern Data Stack, so we don't want old stuff, we want modern tools. But what does that even mean? Is Looker really modern, and why is Tableau considered too old to be at the table? I'd say it's actually only a vocabulary discussion. The original BI is no longer the same; now we have visualization tools within a whole data ecosystem.

dbt v1.0 — get ready

In December 2021, dbt Core v1.0.0 will be released. I'm not sure it will change a lot in the product (some great improvements are still coming), but the perception will at least be different. dbt is now used by more than 6000 teams and is here to last. Version 1.0.0 means that you can start building on a stable version for the future. So prepare to upgrade.

Infrastructure as SQL

This week, thanks to Octavian Zarzu, I've discovered a new Data x DevOps range of tools: Infrastructure as SQL. Imagine being able to run SQL queries to know how many EC2 instances you run and how much memory they represent. It's amazing. 3 tools came out recently aiming to do that:

  • CloudQuery: open-source and working with AWS, GCP, Azure, Yandex and DigitalOcean (+ Slack and Kubernetes)
  • Infrastructure as SQL (iasql): only in early access, being built by Alan Technologies, a company based in SF
  • Steampipe: open-source and working with AWS, GCP, Azure (+ Slack, GitHub, Zendesk)
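To give a feel for the idea, here is a minimal sketch in Python using an in-memory SQLite database. The `ec2_instances` table, its columns and the sample rows are hypothetical and only there for illustration; each of the tools above exposes its own schema, so a real query would look different.

```python
import sqlite3

# Hypothetical inventory table standing in for what an
# "Infrastructure as SQL" tool would expose. Names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ec2_instances (instance_id TEXT, state TEXT, memory_gib REAL)"
)
conn.executemany(
    "INSERT INTO ec2_instances VALUES (?, ?, ?)",
    [
        ("i-001", "running", 8.0),
        ("i-002", "running", 16.0),
        ("i-003", "stopped", 4.0),
    ],
)

# How many instances are running, and how much memory do they represent?
running_count, total_memory_gib = conn.execute(
    "SELECT COUNT(*), SUM(memory_gib) FROM ec2_instances WHERE state = 'running'"
).fetchone()

print(running_count, total_memory_gib)
```

The point is less the query itself than the interface: once the infrastructure looks like tables, the usual SQL toolbox (filters, aggregates, joins) applies to it.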

Verticalized product analytics suite: PostHog

The product analytics space is one of the first to get specialized solutions in the modern data stack. This week I want to share PostHog, which provides a self-hosted solution for companies that want to own the whole product analytics workflow.

They want to replace those complex product SQL queries with a platform that already contains funnel analysis and product usage trends, to name a few.

Under the hood, if you plan to self-host the platform, it will launch a ClickHouse instance to keep the UI interactive. If you need to go deeper into ClickHouse, I suggest this Reddit thread comparing it with Apache Pinot (another low-latency columnar database, developed at LinkedIn).

Ad event processing at Uber

The Uber team wrote a post detailing the technologies they use for real-time, exactly-once event processing. What I found fun here is that Uber has become so huge that they say they use Pinot in that post, but they also appear in the portfolio on the ClickHouse website.

A/B test explained by Netflix

If you are still new to A/B testing, or if you need a post to explain it to your colleagues, Netflix wrote it for you: What is an A/B Test? This is the second post in their dedicated series. It talks about the product challenges and how everything starts with an idea.

A guy doing coffee tests (credits)

Does empathy play a role in being data-driven?

Adam Votava writes in TDS about empathy. Do we need empathy to create a data-driven company? As he says, "data doesn't lie!", so why bother being empathetic? Go there to find out.

PS: this summary has been deliberately written without empathy 🙃.

Fast News

Christophe Blefari

Data Engineering Coach who enjoys all kinds of data platforms.