What's Brewing in ML?
The commoditization of ML should be front of mind for startups entering this space.
The other day my friend asked me for a cup of tea. I had to make several choices to deliver on this mission:
How to heat the water up (stove, microwave, water boiler)
At what temperature
Which tea leaves to use
How long to let the tea leaves steep
What mug to serve it in
What to add to the tea (milk, sugar, honey)
At its core, making tea is a process to deliver value to someone.
Building an ML-fueled application is similar: you need to select data sets, then pre-process, clean, and label the data - maybe even weave in synthetic data to complement sparse original data sets. Then you have to design and train your model, which raises questions such as how long to train and how many layers and features to use. Then you have to deploy your models and monitor them to make sure their accuracy doesn't slide over time. The aim is often to optimize a workflow, to relieve stretched humans or teams, or to serve a better experience to some form of consumer or client.
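To make the monitoring step a bit more concrete, here is a minimal sketch of checking whether accuracy has slid since launch. The function names, the 5-point tolerance, and the data are all made up for illustration; a real setup would depend on your model and metrics.

```python
from statistics import mean

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return mean(1.0 if p == y else 0.0 for p, y in zip(predictions, labels))

def has_slid(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """True when recent accuracy is more than `tolerance` below the baseline.

    The tolerance is an assumed threshold, not a recommended value.
    """
    return (baseline_accuracy - recent_accuracy) > tolerance

# Hypothetical true labels, plus predictions at launch and from a later batch.
labels = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
launch_predictions = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
recent_predictions = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]

launch_acc = accuracy(launch_predictions, labels)
recent_acc = accuracy(recent_predictions, labels)

if has_slid(launch_acc, recent_acc):
    print(f"Accuracy slid from {launch_acc:.2f} to {recent_acc:.2f}: time to retrain?")
```

In tea terms: you keep tasting the brew after you have settled on the recipe.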
ML has turned into a means to an end. It is similar to programming: in the beginning it was kind of sexy to focus on programming itself, and using or creating the newest programming tools on the block was a thing. Over time the hype faded and programming became a commodity - a means to serve an end. From what I can see, ML is heading the same way. Data science, ML tooling, and ML application development have definitely become less sexy. Less new. The colors have faded - or shifted? ML has become a commodity.
When we observed the rapid progress of Machine Learning (ML) and its tooling in various areas over the last decade, we were all excited at first. But over the years, the novelty has subsided. The ecosystem has sprawled, a few clearly strong community leaders have emerged as its pillars, and less defensible innovation is rising. Everyone can make their own cup of tea from what is out there in open source.
Please let me elaborate… With a few algorithm patents to my maiden name, I speak from experience when I say that applied math or algorithms are hard to transform into a competitive moat. Kudos to the ones who try, but in short: math is math, and math is free. Using many different flavors of math you can solve problems in many ways, so protecting one does not mean much in the bigger picture. I am not sure you can compete on math. You can, however, compete on what you achieve with the math. What is competitive is, in my opinion:
How well you apply math to real business problems,
How much representative and unique training data you can access, acquire, or generate, and
How seamlessly it can serve humans in existing workflows.
In other words, does your trained model beat the off-the-shelf, free, pre-trained models that are openly available? Do you have access to the most representative data sets? Is your user experience going to win?
I guess what I am trying to get across is: anyone can do ML with off-the-shelf assets. So the question becomes: how will you compete in a world where ML has turned into a commodity? Is what the ML produces enough to build a company and business around - and not just a feature enhancement? How do you scale if almost everyone has already settled on their favorite tea and teapots?
In my experience so far:
ML is hard to scale beyond a single domain
ML-tooling is hard to democratize (i.e. enable beyond experts)
ML-tooling often focuses on one persona and forgets to serve the end-to-end workflow for an organization, hence failing to become a truly transformational platform company
ML-tooling is hard to make sticky in an organization because, as with tea, everyone has their own favorite
This makes me doubt that democratization and the horizontal scale of ML businesses could become real.
So, how could you succeed as an ML startup? Well, here are a few of my current thoughts:
Focus on a specific domain. You could potentially grow the business over time by adding “other kinds of tea”, but the business risk would be that you’d have to eat the cost of optimizing, over and over again, for each tea variant added. It would still be hard to achieve exponential growth.
Instead of becoming a tea provider, you could aim to build a full cafe experience where tea is just one part of your customer-perceived value. The ML is applied, a means to an end.
Focus on the hard parts - getting data into a useful form, data quality, model security, fair and ethical data representation. Preferably focus where no one else is looking but everyone is struggling - in the A-to-Z organizational workflow. This may mean less sexy areas than ‘ML tooling’, but perhaps a more scalable business?
If you must enter the noisy space of ML-tooling, at least make sure to focus on how you are different from all the others. The ML tooling space is particularly crowded and, in the end, does not create very sticky businesses.
If you want to dig deeper into current ML trends, I tend to agree with many of the points made in the PitchBook report published a week or so ago. Otherwise, feel free to debate me on any of the above points. Curious to hear your thoughts!