As this year draws to a close, we want to reflect on how our platform has evolved over the past year.
In 2024, we introduced many key additions and improvements to our end-to-end MLOps platform, all designed to help data science teams work more efficiently and scale their ML workflows.
Valohai’s evolution over 2024 embodies our mission: making MLOps scalable, efficient, and effortless.
This year, we’ve helped leading AI teams deliver real business value by streamlining their processes and automating workflows, all while continuously refining our MLOps platform and shipping new features.
In this part of our annual review, you'll get to hear about all of the additions and improvements to our platform. In the next part, you'll hear how some of the world's best ML teams scaled their operations with Valohai.
Kubernetes & Slurm support
We integrated with Kubernetes and Slurm, making it simpler to run ML workloads across both cloud-native and HPC environments.
Our Kubernetes support abstracts away infrastructure complexity, enabling you to focus on experimentation and model development rather than managing Kubernetes clusters.
Similarly, our Slurm integration allows you to run ML jobs on HPC clusters directly through Valohai, optimizing resources and reducing operational costs.
Hugging Face integration
We introduced a Hugging Face integration that gives teams immediate access to a vast library of models. Fine-tuning Transformers and running batch inference without writing custom code significantly streamlines experimentation, helping you iterate faster and explore a broader range of models.
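To give a sense of the kind of boilerplate this streamlines, here is a minimal, generic sketch of running batch inference with the open-source transformers library directly. The model name and inputs are purely illustrative; with the integration, the equivalent step is configured in Valohai rather than hand-coded.

```python
from transformers import pipeline

# Load a sentiment-analysis model from the Hugging Face Hub
# (the model name here is purely illustrative).
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# Run batch inference over a small list of example texts.
texts = ["The new release is great!", "Setup took longer than expected."]
for text, result in zip(texts, classifier(texts)):
    print(f"{text!r} -> {result['label']} ({result['score']:.2f})")
```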
Model Hub
Our new Model Hub is a centralized solution for managing models throughout their lifecycle. It provides a unified view of all models, automated versioning, lineage tracking, performance comparisons, and the ability to trigger automated workflows. Leveraging our reproducibility-first design, the Model Hub simplifies collaboration, ensures compliance, and accelerates iteration.
Pipeline caching & parallel downloads
ML workflows running on Valohai became more efficient this year with the introduction of pipeline caching and parallel downloads. By reusing outputs from earlier runs and fetching input data in parallel so executions start sooner, data scientists can experiment more rapidly, turning ideas into results faster than ever.
Underutilization alerts & smart instance selection
To ensure optimal resource usage, we introduced automated underutilization alerts. These alerts flag underused compute, helping you reduce costs and focus on building valuable models.
Similarly, smart instance selection further optimizes resource utilization by automatically choosing instances where data is already cached, cutting down on wait times and costs.
Webhooks & notifications
With webhook support, workflows running on Valohai can easily be triggered from external systems. These triggers (along with Valohai's robust notification system) help you automate workflows, accelerate handoffs, and minimize the operational burden of repetitive tasks.
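As a rough illustration, triggering a workflow via a webhook typically boils down to an HTTP POST from the external system. The URL and payload below are hypothetical placeholders; the actual endpoint and schema depend on how the trigger is configured in Valohai.

```python
import requests

# Hypothetical trigger URL; in practice this comes from the webhook
# configuration of your Valohai project.
WEBHOOK_URL = "https://example.com/hypothetical-valohai-trigger"

# Hypothetical payload describing the event that should kick off a run,
# e.g. an upstream system announcing that fresh data has landed.
payload = {"event": "new-data-available", "dataset": "customer-feedback"}

response = requests.post(WEBHOOK_URL, json=payload, timeout=10)
response.raise_for_status()
print("Trigger accepted:", response.status_code)
```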
Other ongoing improvements
While the features above represent key milestones, our work this year extended far beyond them.
Among other things, we also refined the user interface, introduced dataset tagging, added image comparison capabilities, improved the command-line experience, and released countless quality-of-life enhancements.
Guided by continuous user feedback and a rapid release cycle, we focused on ensuring that every improvement (whether a minor UI tweak or a foundational infrastructure enhancement) aligned with what agile data science teams need to meet the evolving demands of modern MLOps.
Looking ahead to 2025
As we move into 2025, we’ll continue our mission of enabling effortless, scalable MLOps by targeting the cutting edge of AI development.
- We plan to expand our platform’s capabilities to accelerate the development of GenAI applications, ensuring that adopting and refining LLMs and agentic AI systems becomes even more intuitive and efficient on Valohai.
- We also aim to streamline the path to production with advanced deployment and monitoring features, offering you fine-grained control and real-time insights into model performance.
Beyond that, as data ecosystems evolve, we’ll seek deeper integrations and strategic partnerships with leading data and cloud platforms (such as OVHcloud) to ensure that your data and models are always in sync.
Stay tuned as we continue to remove roadblocks and accelerate the path to scalable MLOps in the year ahead! If you haven't already, you can stay up to date by subscribing to our newsletter: