Tools like ClearML, Pachyderm, Tecton, and Algorithmia are what the world needs right now to make everyone an AI Superhero

Jonathan Ballon
7 min read · Jan 18, 2021

Recognizing that every business is a data business, many leaders are rapidly investing in long-term machine learning capability development and implementing organizational changes necessary for the domain to become a core function. Framework tools and machine learning libraries employed by data scientists over the past five years have allowed researchers to make sense of vast, previously unmanageable quantities of data with the help of on-demand computing power in the cloud.

While many challenging aspects of the science have been simplified, there is work to be done to further automate complex tasks, bring multi-disciplinary teams together with a common set of tools, and ultimately make the entire end-to-end process efficient for all stakeholders in the enterprise.

Companies looking to compete will need to do these things adeptly, drawing on heterogeneous sources of data that grow exponentially and in real time, interpreted by sophisticated self-learning algorithms whose output can be put to use in measurable ways across the company.

The stakes are incredibly high; for some, they are existential…

By harnessing AI to its full potential, companies can do any or all of these things more adeptly than their competition:

  • Know their customers (and yours) intimately
  • Design better products, faster
  • Take advantage of market transitions before they are obvious
  • Consume fewer resources
  • Deliver better service
  • Streamline their supply chain
  • Understand and manage risks

Can they predict the future more accurately? Perhaps. One thing is clear: adopters of advanced data analytics will win, and they will keep winning because they will leverage the same technology to target and attract the best talent, raise more capital, and acquire more intellectual property.

In other words, machine learning will impact every aspect of how a company executes its strategy. Insert Darwinian moment.

Are you kidding me? It’s already happening…

According to McKinsey’s 2020 study on enterprise adoption of AI, half of the companies surveyed are already adopting elements of the technology in at least one corporate function and are seeing material impact. When you double-click on the outliers (the companies that attribute 20 percent (!) or more of enterprise-wide EBIT to their machine learning use), several strengths set them apart from other respondents: a) better overall year-over-year growth, b) an engaged and knowledgeable C-suite champion, and c) more (and growing) resource commitment.

High performers also tend to have the ability to develop AI solutions in-house — as a core capability — with more AI-related talent, such as data engineers, data architects, and data scientists.

These companies are also much more likely than others to say they have built a standardized end-to-end platform for AI-related data science, data engineering, and application development.

Enterprises must also design their machine learning (ML) development processes to include data engineers, data scientists and ML engineers in a single automated development, integration, testing, and deployment pipeline.

Many of the tools coming to market right now from startups are the work of engineers who grappled with these issues at the earliest and most advanced adopters of machine learning, then struck out on their own to productize their innovations and help other companies, essentially doing for MLOps what the industry has done for DevOps.

Tools to the rescue!

I need a hero
I’m holding out for a hero till the end of the night
She’s gotta be strong, and she’s gotta be fast
And she’s gotta be fresh from the fight
She’s gotta be sure, and it’s gotta be soon
And she’s gotta be larger than life
— Bonnie Tyler

Tools can be powerful, which is why most superheroes employ them, even if they are already gifted with otherworldly talent.

In Pixar’s first “Incredibles” movie, a misguided and vengeful inventor named Buddy Pine (aka Syndrome) was possessed by the belief that technology and tools could turn anyone into a superhero. Syndrome created super tools for the masses, because “when everyone’s super, no one will be.”

Super or not, heroic companies that are advancing their competence in harnessing data through machine learning methods share common attributes when it comes to repeatable processes, organizational capability, and standardized tool adoption.

According to McKinsey, high performers are 2–3x more likely to adopt standard tools and processes than average machine learning enterprises.

It’s clear that the industry needs non-proprietary tools, and many companies are looking to fill the need. By one engineer’s recent count, there are over 280 companies in the space, 180 of them startups, with around half of those newly formed in just the past twelve months.

These companies fall across a few broad categories:

  • Purpose-built silicon, optimized to accelerate machine learning algorithms, either with a focus on workloads in data centers, or chips optimized for inference on edge/consumer devices with low power consumption
  • End-to-end platforms for developing & deploying AI applications
  • Discrete tools designed for specific tasks like data management, modeling & training, and monitoring

Up to now it has been the early adopters navigating this complex and fragmented landscape of machine learning tools. As more companies begin evaluating and adopting, I think three things will occur over the next twelve to eighteen months:

  1. Applications that align to an end-to-end, multi-user, collaborative workflow, with time-saving functionality to standardize, automate, and streamline the process, will grow in adoption and capture a greater share of monetized features
  2. Armed with huge amounts of fresh venture capital, several ambitious companies are poised to consolidate the (often very good) single-function tools currently operating as standalone applications and companies
  3. Many of the remaining ventures will be acquired for their talent or run out of capital as the market matures and consolidates

There are so many companies building tools that it’s often hard to discern one from another until you start working with them. The following companies are worth watching because they offer some combination of the following (more is better):

  • technically sophisticated & comprehensive feature set
  • team collaboration, supporting cross-functional, multi-disciplinary users
  • simple user experience that aligns to the end-to-end process
  • cloud agnostic, with data and model portability, including an on-premises option
  • open source

Algorithmia puts ML models into production fast, securely, and cost-effectively within existing operational processes, across all stages of the ML lifecycle. Algorithmia automates ML deployment, optimizes collaboration between operations and development, leverages existing software DevOps processes, and provides advanced security and governance. It has over 110,000 users working in government intelligence agencies and Fortune 500 companies.
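As a rough illustration of the kind of workflow this enables, here is a minimal sketch of calling an already-deployed model with Algorithmia’s Python client; the API key and the algorithm path are placeholders, not a real endpoint.

```python
# Minimal sketch: invoking a model hosted on Algorithmia.
# "YOUR_API_KEY" and "demo/Hello/0.1.0" are placeholders.
import Algorithmia

client = Algorithmia.client("YOUR_API_KEY")

# Hosted algorithms are addressed as user/algorithm_name/version
# and invoked over Algorithmia's serving API with pipe().
algo = client.algo("demo/Hello/0.1.0")
response = algo.pipe("world")
print(response.result)
```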

*ClearML by Allegro.ai is a pioneer in deep learning and machine learning software tools. With ClearML, businesses are able to manage and bring higher-quality products to market quickly and cost-effectively. ClearML’s suite of integrated tools includes experiment management, MLOps, data management, and more. ClearML is supported by a growing open source community, partners, and over 1,000 customers, including global brands such as NVIDIA, NetApp, Samsung, Hyundai, Bosch, Microsoft, Intel, IBM, and Philips.
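To give a feel for how lightweight experiment tracking can be, here is a minimal ClearML sketch; the project name, task name, and hyperparameters are illustrative.

```python
# Minimal sketch of ClearML experiment tracking.
from clearml import Task

# Registers this run with the ClearML server and auto-logs framework calls,
# console output, and artifacts to the experiment dashboard.
task = Task.init(project_name="demo-project", task_name="baseline-experiment")

# Hyperparameters connected to the task appear (and are editable) in the UI.
params = {"learning_rate": 0.01, "epochs": 10}
task.connect(params)
```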

Domino is an enterprise-wide platform for centralizing data science work and infrastructure, giving data scientists the ability to experiment, collaborate, deploy, and monitor data science models. It is currently in use by twenty Fortune 100 customers.

Pachyderm delivers a robust data versioning and data lineage platform for AI/ML that acts like Git for data. Its advanced pipelining system uses Kubernetes and Docker to quickly scale data transformation, training, and model development across a distributed data science team. Its customers include some of the world’s most advanced organizations, including cutting-edge automakers, banks, healthcare and biotech companies, and defense agencies.
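As a hedged sketch of the “Git for data” idea, the snippet below uses the python_pachyderm client to create a versioned data repo and a containerized pipeline over it; the repo, script, and image names are placeholders, and exact API details may vary between client versions.

```python
# Hedged sketch: a versioned repo plus a pipeline in Pachyderm.
import python_pachyderm

client = python_pachyderm.Client()  # connects to pachd (localhost by default)

# A repo versions every commit of data, giving Git-like lineage.
client.create_repo("raw-data")

# The pipeline runs a container over each new commit to "raw-data";
# cmd and image are illustrative placeholders.
client.create_pipeline(
    pipeline_name="preprocess",
    transform=python_pachyderm.Transform(
        cmd=["python3", "/preprocess.py"],
        image="acme/preprocess:latest",
    ),
    input=python_pachyderm.Input(
        pfs=python_pachyderm.PFSInput(glob="/*", repo="raw-data")
    ),
)
```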

Tecton provides an enterprise feature store that makes it easy to build, deploy, and share features for machine learning. It transforms raw data into feature values, stores those values, and serves them for model training and online predictions. Most hyperscale AI companies (Twitter, Airbnb, Google, Facebook, Netflix, Comcast) have built internal feature stores, and this is the team that built Uber’s. Until there is a database designed for data scientists, feature stores certainly help bridge the skill and functionality gap.

Seldon is an open source, enterprise-level platform that helps deploy ML models at scale on Kubernetes. The platform is framework-agnostic, built to scale, and can be run on a preferred cloud, on-premises, or fully managed by Seldon.
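For a sense of how model wrapping works here, below is a minimal sketch of Seldon Core’s Python wrapper pattern; the class name and model artifact path are illustrative, and in practice the class is packaged into a container image and referenced from a SeldonDeployment resource on the cluster.

```python
# Minimal sketch of the Seldon Core Python model wrapper pattern.
import joblib


class IrisClassifier:
    def __init__(self):
        # Load a pre-trained model artifact baked into the container image
        # (path is a placeholder).
        self.model = joblib.load("model.joblib")

    def predict(self, X, features_names=None):
        # Seldon's runtime calls predict() with the request payload
        # and returns the model's predictions to the caller.
        return self.model.predict(X)
```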

Of course, there are many, many other firms doing very interesting things.

Others to watch include:

Abacus
BentoML
CometML
Dataloop
Determined
Fiddler
FloydHub
Iterative
Maiot
Mona
Neptune
OctoML
Peltarion
Polyaxon
Run
Spell
Superb
TerminusDB
Valohai
WhyLabs
YData

To infinity — and beyond…

Cheers —

JB

*disclaimer — the author is an advisor to allegro.ai, producer of ClearML
