I often hear from full-stack developers looking to expand their knowledge in MLOps. Two tools make a good starting point: MLflow and Kubeflow. Fittingly, the Linux Foundation asked me to present on this topic at the AI.dev Open Source GenAI summit in San Jose, CA, this December.
First of all, bear in mind that MLOps is similar to, yet more complex than, DevOps. The crucial difference is the added intricacy of handling continuous changes in data and models. In other words, your scope needs to expand past the build and release of software artifacts that typical CI/CD toolchains cover, to also version, track, and validate the data and models themselves.
Here's the tl;dr on the tools: MLflow works principally as a detailed digital notebook for tracking your machine-learning experiments and for managing models, making runs reproducible and easy to compare. Kubeflow, on the other hand, runs on and with Kubernetes, focused on efficiently orchestrating distributed workloads: notebooks, model training, and pipelines.
One key takeaway: MLflow can be integrated with Kubernetes, facilitating the deployment of models in Docker containers. Containers, in turn, make deployments reproducible and consistent, which makes it practical to retrain and redeploy on a regular cadence so models stay accurate and relevant over time. We implemented what I showed in the talk with our CGDevX platform engineering reference implementation. Spoiler: Kubeflow integration will be pushed to our git repo in early 2024.
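As a sketch of that MLflow-on-Kubernetes integration: MLflow can package a registered model as a Docker image (for example via `mlflow models build-docker`), and a standard Kubernetes Deployment can then serve it. The manifest below is hypothetical; the image name, labels, and replica count are assumptions, while port 8080 is the MLflow scoring server's default.

```yaml
# Hypothetical manifest: assumes an image built with
# `mlflow models build-docker -m <model-uri> -n my-model`
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-model
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-model
  template:
    metadata:
      labels:
        app: my-model
    spec:
      containers:
        - name: my-model
          image: my-registry/my-model:1.0  # hypothetical registry and tag
          ports:
            - containerPort: 8080          # MLflow scoring server default
```

Pushing a new model version then becomes an image rebuild plus a rolling update, the same release mechanics a full-stack team already knows from DevOps.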
Have a look at the talk on YouTube; I invite you to leave questions and comments there. Scroll below for the slides.