Automating machine learning pipelines for spend management
Spendesk is a FinTech SaaS company that provides a complete spend management platform that saves businesses with 50 to 1,000 employees time and money by streamlining financial processes and automating manual tasks.
Machine learning is close to the heart of Spendesk’s product and internal operations. The machine learning team consists of data science and machine learning experts who support a wide range of use cases across diverse domains:
- They build ML-enabled features for the core product, the spend management platform, such as auto-categorization of payments.
- They also enhance ML capabilities within internal operations with use cases ranging from lead scoring and churn detection to auto-categorizing customer feedback.
In this story, we’ll look at Spendesk’s end-to-end ML pipeline for training and deploying models for payment categorization into their core product.
In summary, the team at Spendesk was searching for an all-in-one platform for pipeline management. They found exactly that in Valohai along with a unified standard for building automated pipelines and these key results:
- Reduced dependency on DevOps and IT for compute provisioning and overall support
- Increased experimentation rate by reusing pipeline steps from previous projects
- Faster deployment to production by setting unified standards for pipeline automation
Looking for an all-in-one platform for ML pipeline automation
Initially, the team used a combination of Airflow and MLflow. While this pairing had its strengths, it imposed limits on the team, such as constraints on model size, slow model fetching, and difficulties in comparing experiments.
In order to fulfill its ambition, the team needed a robust solution that enabled comprehensive pipeline management with unified standards for future work. One of the key requirements was the ability to retrain models on a weekly schedule and deploy them within the same pipeline.
As a FinTech company, Spendesk works with highly sensitive payment data and financial documents. For this reason, they were looking for an MLOps platform that adheres to the strictest security standards without compromising on ease of use and access to data and compute.
Based on the benchmark that we did for an MLOps platform, Valohai was far above other tools that we tested in terms of capacity, pricing, and integrations.
Adèle Guillet – Senior Lead Data Scientist, Spendesk
Productionizing ML at scale with thousands of models in one pipeline
The machine learning team at Spendesk refines its payment auto-categorization feature with the help of Valohai's pipeline automation capabilities.
In more detail, the team trains models for every customer account based on their spending habits. This results in thousands of separate models deployed in production.
Valohai enables us to train more than 3,600 models in parallel every week. This capability is one of Valohai’s strengths for our use case.
Adèle Guillet – Senior Lead Data Scientist, Spendesk
On top of that, the team re-trains every customer model on a weekly schedule with new transaction data. This means 52 training runs per model per year.
In total, this talented team of three executes over 250,000 training runs in a year in order to perfect one of many features of the product.
Here’s how they set up this pipeline:
- Pre-process the training data. Valohai fetches every customer's transaction data from a data lake and pre-processes it, aggregating it in cloud storage buckets for the subsequent training steps.
- Train the models. Valohai then starts a separate auto-scaled compute instance to train each model. The team at Spendesk has set separate training parameters based on each customer's account size.
- Aggregate the models. The re-trained models are aggregated in cloud storage buckets so they can be served more efficiently in groups.
- Deploy the models. The models are deployed to production on a compute cluster.
- Revoke old deployments. If the deployment of the latest model versions succeeds, Valohai automatically revokes the deployment nodes of the previous model versions.
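The per-account flow behind these steps can be sketched in plain Python. This is a minimal, illustrative sketch only, not Spendesk's actual code or the Valohai API; every name (`TrainParams`, `select_params`, `train_account_model`, `run_pipeline`) and every parameter value is hypothetical, standing in for the idea of size-dependent training parameters and one model per customer account.

```python
from dataclasses import dataclass

# Hypothetical training parameters; the team sets these
# separately based on each customer's account size.
@dataclass
class TrainParams:
    epochs: int
    min_samples: int

def select_params(account_size: str) -> TrainParams:
    """Pick training parameters for an account-size tier (illustrative values)."""
    tiers = {
        "small": TrainParams(epochs=5, min_samples=50),
        "medium": TrainParams(epochs=10, min_samples=200),
        "large": TrainParams(epochs=20, min_samples=1000),
    }
    return tiers[account_size]

def train_account_model(account_id: str, transactions: list, params: TrainParams) -> dict:
    """Stand-in for one training run; in the real pipeline each account's
    model is trained on its own auto-scaled compute instance."""
    return {"account_id": account_id, "trained_on": len(transactions), "epochs": params.epochs}

def run_pipeline(accounts: dict) -> list:
    """Sketch of the weekly flow: pre-process, train per account, aggregate."""
    models = []
    for account_id, info in accounts.items():
        params = select_params(info["size"])
        # The pre-processing step would aggregate each account's transaction
        # data from the data lake into cloud storage before training.
        models.append(train_account_model(account_id, info["transactions"], params))
    # The aggregated models are then deployed in groups; old deployment
    # nodes are revoked only after the new versions deploy successfully.
    return models
```

In the actual pipeline these steps run as separate, parallelized pipeline nodes rather than a sequential loop, which is how more than 3,600 models can train in parallel each week.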
What is important for us in Valohai is that we can do the training and the deployment in the same pipeline.
Maxime Rihouey – Senior Machine Learning Engineer, Spendesk
As a side note, Valohai seamlessly integrates with any technology, including storage and compute systems. In practice, you can build your pipeline to store files in S3 buckets, execute runs on EC2 instances, and deploy models to a Kubernetes cluster.
For ready-made integrations, check out our ecosystem library.
Boosting ML team productivity and reducing IT overhead
After switching over to Valohai, the team has become more autonomous and minimized its dependency on IT and DevOps for compute provisioning and other support tasks.
If we didn't have Valohai, we would need one more person on the team to help us with the maintenance.
Adèle Guillet – Senior Lead Data Scientist, Spendesk
As Valohai's core benefit is ease of use, every team member can carry a project independently from start to finish: from experimentation and training to deployment.
The key highlight is the newfound standard for building ML pipelines. This enables the team members to reuse pipeline steps from previous projects and further save time for experimentation and model development.
Looking forward
Spendesk’s success story is one of many examples of how the right tooling can help machine learning teams extend the impact of their work and achieve world-class results.
Its ML team gained more autonomy in a highly regulated space while scheduling thousands of training runs per week and deploying models to the core product in less time.
Thanks to its flexibility and technology-agnostic approach, Valohai supports Fortune 500 companies and industry disruptors on a wide range of use cases across multiple industries.
Read their success stories