Partnering with Hugging Face: A Machine Learning Transformation
By Pat Grady and Sonya Huang
Published May 9, 2022
Over the last few decades, humans have largely mastered the work of analyzing data: transaction records, click streams, anything in a structured format. But we are a storytelling species; most information is language, not data—and for most of history, machines haven’t been able to process our words.
What wasn’t possible yesterday will be tomorrow. Thanks to the breakthrough emergence of pre-trained transformers and the accompanying flood of language models, one of the hardest, most valuable problems in machine learning is beginning to crack. Text is quickly becoming just as easy to analyze as numbers—and along with it, context and intent. This profound change in the power of software has applications that reach far beyond chatbots and AI assistants, to everything from fraud detection to bias mitigation to categorizing groceries.
At the heart of this decade-defining trend sits Hugging Face, the company that bridges academia and applications and puts ready-to-use, state-of-the-art machine learning models in the hands of developers everywhere. The first to implement Google’s landmark model BERT in the popular ML library PyTorch and share it with the open-source community, Hugging Face now offers a curated collection of more than 50,000 public models and both cloud and on-prem hosting options, allowing users to easily go from “I have data” to “my model is running in production.”
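To give a sense of how little code that journey requires, here is a minimal sketch using Hugging Face’s open-source `transformers` library; the checkpoint named below is one public sentiment model from the Hub, chosen for illustration, and any of the thousands of other public models could be substituted.

```python
# Minimal sketch: pull a ready-to-use model from the Hugging Face Hub
# and run inference in a few lines. Assumes `transformers` is installed;
# the checkpoint name is one illustrative public model on the Hub.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# The pipeline handles tokenization, model execution and decoding.
result = classifier("Transformers make text as easy to analyze as numbers.")[0]
print(result["label"], round(result["score"], 3))
```

The `pipeline` abstraction is what lets developers skip the usual tokenizer/model plumbing: swapping in a different task or checkpoint is a one-line change.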
We at Sequoia are longtime fans of Hugging Face founders Clément Delangue, Julien Chaumond, Thomas Wolf and their team, and we’re not alone: their drive to build an open and collaborative platform, with transparency and value-informed decision making by default, has made them beloved by the hundreds of thousands of people they serve. Hugging Face has earned itself a privileged strategic position in the ML ecosystem. It’s the default destination for developers looking for the latest and greatest ML models—and the place natural-language processing scientists and other researchers, from the Allen Institute to Microsoft, go to distribute their models into the world.
What’s more, language is just the beginning. Transformers as a technology have started to appear in computer vision, structured data, biological chemistry and other modalities, accelerating the adoption of machine learning more broadly—and as use cases expand, so does Hugging Face. In the past year, they have more than tripled users, models and datasets. Customers tell us the new AutoTrain offering in particular has been a game-changer in terms of both model performance and resources required. They simply upload labeled data, specify parameters like model size and iteration count, and let Hugging Face automatically find, train and fine-tune the best model—without code. By abstracting away the underlying infrastructure and simplifying ops management around training and deploying models, Hugging Face is making production machine learning broadly accessible to all citizen developers—including those without access to dedicated ML engineering resources.
We are proud to partner with Hugging Face. Their accomplishments thus far are nothing short of transformative, but it’s no surprise that a team led by Clément, Julien and Thomas—three smart, hard-working and genuinely good humans—has been so successful, and we look forward to supporting them as they continue to grow. In a future where machine learning is becoming the default way to build technology, the successor to the big data revolution may be the “big content” revolution—and we believe Hugging Face and their powerful community can help lead the way.