Today, we’re opening a new chapter in Arrikto’s story. We have raised $10 million in Series A funding, led by Unusual Ventures, and are adding Unusual’s Co-Founder & Managing Partner John Vrionis to our board. While this round is obviously a major milestone for the company, it’s also just the first step in fulfilling our mission to break down the technical barriers that keep most companies from implementing large-scale machine learning capabilities.
It’s been a long journey since 2015, when our small founding team worked day and night in a small flat in Athens building what would become Arrikto, and a lot has changed since then. We’ve seen our product evolve as cloud technologies and infrastructure grew more sophisticated and began unlocking new and exciting possibilities for businesses.
Of all the technologies that have bubbled up since we started Arrikto, none will prove as consequential as Kubernetes. We think that, not too long from now, literally everything will run on Kubernetes. As computing gets bigger and more democratized – think large-scale machine learning applications – data becomes more essential. The size and complexity of a company’s data must not be an inhibitor to scaling. Data will have to become truly portable, and data infrastructure more elastic. With Kubernetes everywhere, this will be the new normal.
While Kubernetes is showing the world what simplified, large-scale infrastructure for machine learning looks like, Arrikto wants to show the world that it’s possible to put that infrastructure to work by easily creating end-to-end machine learning pipelines that quickly put apps into production.
How do we do this? By treating Data As Code.
The New Era of Data As Code
Software, as is often said, is eating the world. DevOps philosophies and technologies like containers and automated pipelines have created huge efficiencies in the software development lifecycle, and made it possible for teams of all sizes to ship great products quickly. The ability to put data into action, however, has been what defines market leaders. Think Netflix, FedEx, and Spotify.
Managing and putting data into action is not an easy thing to do. Where a team of four developers can build an automated pipeline that puts code into production in a week, it can take a data science and data engineering team five times that size a year to build, train, and deploy a single machine learning project. The problem? Machine learning and dev teams don’t speak the same language. Data scientists and engineers build models with one set of tools, while infrastructure and development teams use completely different tools on different infrastructure to run models in production.
We all know how fun redundant work is.
What if we treated data like code by applying the same DevOps principles used for software development and infrastructure deployment to the data managed in the ML lifecycle? What if we could manage data programmatically, set up automated continuous integration and deployment pipelines for data, add the ability to version, package, clone, branch, diff, and merge data, and make it all collaborative across different clouds and workspaces? Most importantly, what if we made it possible for data scientists to do all this using the tools they’re already familiar with?
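To make the idea of versioning and diffing data a little more concrete, here is a minimal sketch in Python. It is not how Arrikto's products work; it only illustrates the general principle, assuming a dataset is a directory of files and that a "version" is a map from file path to content hash. The function names `snapshot`, `diff_snapshots`, and `demo` are hypothetical and exist only for this illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed, or modified between two versions."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def demo() -> dict[str, list[str]]:
    """Version a toy dataset twice and diff the two versions."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "train.csv").write_text("x,y\n1,2\n")
        (root / "labels.csv").write_text("id,label\n1,cat\n")
        v1 = snapshot(root)

        (root / "train.csv").write_text("x,y\n1,2\n3,4\n")  # modify a file
        (root / "labels.csv").unlink()                       # remove a file
        (root / "test.csv").write_text("x,y\n5,6\n")         # add a file
        v2 = snapshot(root)
        return diff_snapshots(v1, v2)


if __name__ == "__main__":
    print(demo())
```

Because each version is identified purely by content hashes, two snapshots can be compared without copying the data itself, which is the same property that makes diffing and branching cheap for source code.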
Pipelines Aren’t Just For Software Development
That’s where Arrikto comes in. Much like DevOps shifted deployment left to developers in the software development lifecycle, Arrikto shifts deployment left to data scientists in the machine learning lifecycle. Arrikto introduces a feature-rich version of Kubeflow built for enterprises that empowers data scientists and machine learning engineers to work their way, using familiar tools to rapidly build, train, debug, and serve models on Kubernetes in any cloud.
More specifically, Arrikto enables any company to realize the full potential of Kubeflow for machine learning GitOps automation: a production workflow that is portable, automated, reproducible, and secure. We enable data science and MLOps teams to collaborate and continuously develop, debug, and deploy machine learning models with DevOps efficiency, while treating data as code through the whole machine learning lifecycle.
To achieve this vision, we have closed the “GitOps gap for ML” in four key areas:
- Portability: One standardized, consistent environment from ground to cloud using MiniKF;
- Automation: Enable data scientists to easily generate production-ready pipelines for MLOps in their Jupyter Notebook, or any other tool of their choice (e.g., VS Code) using Kale;
- Reproducibility: Debug and collaborate on model development with versioned persistent volumes for code, libraries, and data, along with a publish and subscribe model to manage versions across teams and clouds using Arrikto Rok and Rok Registry;
- Security: Full audit trail and user isolation through secrets management, role-based access control, and fine-grained access management.
Machine Learning For All
Machine learning can unlock a world of new possibilities for companies. They can easily find and create efficiencies, build and ship better products and efficiently customize how they market and sell to customers. This shouldn’t be something available only to the few companies that have the resources to overcome the technical barrier that exists today.
Arrikto puts machine learning capabilities within reach for any company, helping them achieve faster model time to market (from months to minutes), higher-quality models through rapid debugging and easy collaboration, and improved model governance through fully reproducible pipelines.
As data becomes more and more valuable and as the ability to put that data into action continues to be what separates market leaders from the rest of the pack, any company, regardless of size, should have access to the tools that will help them build a great business.
Data As Code Beyond Kubeflow
Data challenges aren’t limited to data science, however, and we envision Data As Code extending to the entire ecosystem of Kubernetes applications surrounding a machine learning environment, and even beyond to stateful applications unrelated to machine learning. Data As Code presents a new way to manage data that brings to every company the invaluable benefits the DevOps movement brought to their software development lifecycle. We believe it will free Kubernetes to scale massively and unlock its multi- and hybrid-cloud potential. We are thrilled to be leading this change.
We’re so excited about what’s coming next and can’t wait to make the impossible a reality. We are extremely thankful to everyone in the Arrikto team for their hard work and perseverance, to our customers and partners for their trust, and to our initial investor, Odyssey VP, who believed in our vision as early as 2015, as well as our all-star team of angel investors and advisors.
Constantinos & Vangelis
If you’d like to see more, please get in touch and we’ll show you what Arrikto can do for your company.
Co-Founder and CEO
Constantinos is a Co-Founder and the CEO of Arrikto. Originally from Athens, Greece, he studied computer science at NTUA, where he earned his master’s degree. Before Arrikto, he helped design and build a large-scale public cloud service, one of the largest in Europe at the time. With Arrikto, he arrived in Silicon Valley to redefine what’s possible in AI/ML by rethinking how applications manage and store data. He tries hard to keep learning every day, and to infuse each day with a shot of creativity, drawing from his brief but exciting time studying footwear design in Florence after NTUA.
Co-Founder and CTO
Vangelis is a Co-Founder and the CTO of Arrikto. He holds a PhD in computer science and has a long history of working in storage, data management, and cloud computing. At Arrikto, he is leading a team of talented engineers working hard to bridge the world of low-latency, high-performance local storage with eventually consistent, massively scalable object stores, and decentralized synchronization. Their vision is to empower real change in how Data Scientists and ML Engineers create end-to-end workflows for building, training, and serving models at global scale.