Hi, this is a late edition of Data News. I missed you last Friday because I was lost among the trees, without electricity or internet. But here I am, and I'll give you a glimpse of last week's articles.
Data fundraising 💰 and Preset Cloud GA
- Databricks is going to the moon (or going crazy): Bloomberg reported that a new investment round led by Morgan Stanley will inject at least $1.5B. This could bring Databricks' valuation to $38 billion.
- Monte Carlo, a data observability product, raised $60m in Series C. Monte Carlo is bringing the data ecosystem the observability we deserve (inspired by Datadog).
- Dataiku raised $400m in Series E. This announcement follows the launch of Dataiku Online, a version focused more on startups and SMBs.
- Preset announced the Preset Cloud GA and raised $35.9m in Series B. Preset Cloud is the fully managed version of Apache Superset, the modern, open-source data exploration & visualization platform. It includes a freemium version.
Modern Data Stack comparators
Do you want to start your data platform, or do you want to survey all the available tools before benchmarking new features for your platform? The following two links will help you.
Datafold compared multiple categories: collection, warehousing, transformation, cataloging and analysis. The comparison is not exhaustive, but they made a good start by comparing the communities behind each open-source tool and giving key insights on each.
On the other side, moderndatastack.xyz is finally available. It gives us a more exhaustive view of the whole data landscape: categories, companies, influencers and also a list of useful resources to start your journey.
To finish this category, Tech Ninja wrote a small comparison of Feature Store technologies. If you want to compare Feast, Hopsworks, iguazio, bytehub and QuintoAndar, this is for you.
Modern Data Experience
Benn (I really like his views) wrote about what he calls the Modern Data Experience. He gives us principles that should help us define standards between modern data tools. Benn also shares a lot of links to help you understand the field.
Suresh Srinivas, founder of Hortonworks and ex-VP engineering, announced the OpenMetadata initiative, which aims to create an open standard for metadata with a centralized store. The goal is to improve discovery and collaboration.
dbt Cloud — DAGs in the IDE
The dbt Cloud team released a new feature that lets you see your full dbt DAG directly in the development IDE in your browser. I would love to see this kind of feature inside my PyCharm (or VSCode) in the future.
Passing the Google Cloud Professional Data Engineer exam
If you are preparing for a Google Cloud certification, Chenming Yong wrote a Medium post about his own journey to getting certified. It's worth checking out.
- If you want to get a new job or post a job for your team, there is a new simple job board you can use: Data Stack Jobs. The board is clean and the tags are useful.
- Get a glimpse of AWS Glue Data Catalog and QuickSight with this AWS technical blog post.
- Why You Should Probably Never Use pandas inplace=True: a must-read to understand more of pandas' internals.
- Data Warehouse Migration with AWS DMS: the post describes how Servian used DMS to migrate from Aurora PostgreSQL to Redshift.
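For the pandas link above, here is a minimal sketch of the two styles side by side. The DataFrame and its column names are made up for illustration and do not come from the linked article. It only shows two behaviours I am sure of: a call with inplace=True returns None, so it cannot be chained, while reassignment leaves the original frame untouched.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
raw = pd.DataFrame({"city": ["Paris", None, "Lyon"], "visits": [3.0, 5.0, None]})

# Style 1: inplace=True mutates the frame and returns None,
# so nothing can be chained after the call.
mutated = raw.copy()
returned = mutated.dropna(inplace=True)

# Style 2: reassignment keeps the original intact and allows method chaining.
cleaned = (
    raw.dropna()
    .rename(columns={"visits": "visit_count"})
    .reset_index(drop=True)
)

print(returned)       # the inplace call gives back nothing useful
print(len(raw))       # the original frame still has all its rows
print(cleaned)
```

The second style is usually easier to read and test, because each step produces a new object instead of silently changing one that other code may still reference.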
See you on Friday 🥰.