DBRX & Competition in the Data Cloud
Why DBRX signals a serious, serious challenge for Snowflake and the brilliance of Open Source as marketing.
Hey Friend,
Bill here.
You haven’t received an email from me in a while. Certainly not on this topic. All this DBRX talk has brought me out of the shadows.
First, a quick update: Bill, what have you been up to?
9 months ago, I decided to take a leap and try to start my own business. I wrote about the start of the journey here. That’s what I call my Scrappy Startup journey (yes, when projects have failed, my friends have told me they’re crappy startups😭 - all in good fun).
After various failed projects (turns out starting a business is hard, who knew?), I’ve finally landed on Hyperlint, which is getting traction.
Hyperlint helps developer teams write, edit, and maintain technical content.
If you’re interested in learning more about how Hyperlint can help you, just respond to this email.
I am also running a cohort course for PMs in June - How to build gen AI products for product managers.
Back to our regularly scheduled programming…
DBRX
First of all: is DBRX the stock ticker that we’re waiting for?
Second of all, let’s talk about the brilliance of the DBRX release. There are several key factors if you’re going to compete against OpenAI + Anthropic / Mega Model companies:
You need fear.
You need a model.
You need durable advantage.
Example: HuggingFace
While a company like HuggingFace has amazing marketing and developer mindshare, as far as I can tell, they’ve got no durable advantage.
They’re going to be a part of the conversation at enterprises, but they don’t have a business identity. The homepage proves that.
Last time I checked, you can’t “buy an AI community to implement at your company”. Nothing against HuggingFace, it’s just challenging to see how that’s going to ring true with enterprises.
So what do they have?
❓ Fear
✅ Model(s)
❌ Durable Advantage
They’re not selling a commodity or really managing anything major for enterprises. I would put them as more of a utility used by the community than anything. Sure, that can change, but that’s what it looks like right now.
What enterprises need
With all this talk of AI, LLMs, and so on, there’s an insane amount of budget being thrown around right now to solve the problems.
Enterprises, large companies, are looking for solutions.
The problem is, they don’t know the solutions that they’re looking for because they don’t understand this new tech.
There’s talk of prompting, model tuning, fine-tuning, and whatnot. But enterprises are going to be scared to give all of their data to OpenAI and these mega model companies, period. Some companies just won’t do it.
So what are they actually looking for?
A reasonable story to counter the fear of AGI attacking their business.
A vendor that can reasonably fit into that story.
I assert that data + models have to be that story.
If enterprises are missing half of that equation, it’s significantly less convincing.
That’s why Databricks is in such a dominant competitive position. They’ve spent the last decade moving data into Data Lakes operated by Databricks for enterprises. Now it’s time to take it to the next level.
The fear is there.
They’ve got the data.
Now they can build models.
And they can build them on your data.
DBRX: The checkmate of open models
Here’s what’s so smart about open sourcing DBRX.
“Open” makes companies feel that they own the IP. In principle, they do, but it’s hard to operate this level of infrastructure.
You need data to make these models happen. If you don’t have the data, you’re in trouble. That's a challenge for companies that don’t have data gravity.
Since DBRX is the “best” open source model, all the other vendors are now going to have to offer DBRX as a provided model on their platform. That’s basically free advertising. For $10M, it’s not bad to have all of your competitors talking about your model…
This open model now FORCES a conversation about Databricks at large companies. If you’re going to a third party provider and want to evaluate DBRX, aren’t you going to talk to the source?
Starting today, DBRX is available for Databricks customers to use via APIs, and Databricks customers can pretrain their own DBRX-class models from scratch or continue training on top of one of our checkpoints using the same tools and science we used to build it. DBRX is already being integrated into our GenAI-powered products, where - in applications like SQL - early rollouts have surpassed GPT-3.5 Turbo and are challenging GPT-4 Turbo. It is also a leading model among open models and GPT-3.5 Turbo on RAG tasks.
If the story of this model is all about customization for your business, why would you go somewhere else?
What that means for Snowflake ❄️
Let’s take Snowflake.
Snowflake is going to have to offer DBRX on the Data Cloud (assuming they can).[1] The problem is, what does that announcement look like?
“Snowflake is proud to offer the best Open Source Model (DBRX) on the Data Cloud.”
That reads a bit strange. They’re telling customers that they’ve got to go elsewhere for this expertise.
They did as much in the Mistral AI announcement.
Basically “we don’t know how to build this, so we had to partner”.
Snowflake is an awesome business, don’t get me wrong, but they are so flat-footed about AI. Ali calls that out at the end of the video.
Snowflake is a great product.
Wrong product for AI.
Now a new CEO can help, but these decisions to make models and take charge of this space were made a long time ago.
Snowflake has some serious catching up to do.
Closing Thoughts & Snow in the Desert
Closing out this thought - finding product-market fit is like wandering in the desert.
You’re not sure what you should do.
Go left?
Go right?
Go straight?
Lord it’s hot out here.
You are so tired.
You don’t have enough water.
A mirage appears.
Water?
Snow?
Your mind knows it’s not snow.
There’s no snow in the desert.
Snowflake is wandering in the desert when it comes to AI.
They bought Streamlit for a billion and have done basically nothing with it as far as I can tell (it’s integrated, with limitations according to the docs).
Neeva was probably an acquihire CEO play (h/t Lauren Balik, I didn’t consider this at the time).
Now they’re going to have to provide a competitor’s model (at some point) to make sure they’ve got the best open models on their platform.
That’ll be awkward.
Friends. Reconnect! Let me know how you’re doing. Let me know your thoughts.
Cheers,
Bill
[1] This is not an eye poke. Snowflake literally only has the most basic models available on Cortex, which is in preview. They’re only available in certain regions, and they’re probably provided by a partner. Not 100% sure.
Gonna be interesting to see what they do and how awkward it gets for Snowflake or other companies
Side note: I just discovered Streamlit and love it!
Thanks for the article.
Hey Bill, an Andrew sent me your blog post recently. Glad you are doing well and great article!