Kubeflow and MLOps Workshop Recap – Sept 2021

Kubeflow and MLOps Workshop

Last week we hosted a free Kubeflow and MLOps workshop presented by Kubeflow Community Product Manager Josh Bottum. In this blog post we’ll recap some highlights from the workshop and summarize the Q&A. Ok, let’s dig in.

First, thanks for voting for your favorite charity!

With the unprecedented circumstances facing our global community, Arrikto is looking for even more ways to contribute. With this in mind, we thought that in lieu of swag we could give attendees the opportunity to vote for their favorite charity and help guide our monthly donation to charitable causes. The charity that won this month’s workshop voting was Save the Children. We are pleased to be making a donation of $250 to them on behalf of the Kubeflow community. Again, thanks to all of you who attended and voted!

What topics were covered in the workshop?

  • How to install Kubeflow via MiniKF locally or on a public cloud
  • Take a snapshot of your notebook
  • Clone the snapshot to recreate the exact same environment
  • Create a pipeline starting from a Jupyter notebook
  • Go back in time using Rok. Reproduce a step of the pipeline and view it from inside your notebook
  • Create a Katib experiment starting from your notebook
  • Create an AutoML experiment
  • Serve a model from inside your notebook by creating a KF Serving server

What did I miss?

Here’s a short teaser from the 45-minute workshop, where Josh walks us through some of the steps required to turn models into pipelines using Kubeflow.

Install MiniKF

In the workshop, Josh discussed how MiniKF is the easiest way to get started with Kubeflow on the platform of your choice (AWS, GCP or locally), and covered the basic mechanics of the installs.

Here are the links:

Hands-on Tutorials

Although Josh focused primarily on the examples shown in tutorial #3 (which makes heavy use of the Open Vaccine Covid-19 example), make sure to also try out tutorial #4, which walks you through all the steps you’ll need to master when bringing the Kubeflow components together to turn your models into pipelines. Get started with these hands-on, practical tutorials.

Need help?

Join the Kubeflow Community on Slack and make sure to add the #minikf channel to your workspace. The #minikf channel is your best resource for immediate technical assistance regarding all things MiniKF!

Missed the Sept 23rd workshop?

If you were unable to join us last week but would still like to attend a workshop in the future, you can sign up for the next workshop, happening on Oct 28.

FREE Kubeflow courses and certifications

We are excited to announce the first of several free instructor-led and on-demand Kubeflow courses! The “Introduction to Kubeflow” series of courses will start with the fundamentals, then go on to deeper dives into various Kubeflow components. Each course will be delivered over Zoom, with the opportunity to earn a certificate upon successful completion of an exam. To learn more, sign up for the first course.

Q&A from the workshop

Below is a summary of some of the questions that popped into the Q&A box during the workshop.

Can I install MiniKF on GCP?

MiniKF runs on GCP, AWS and locally. Links to the installers are here.

Is the Jupyter notebook Josh is showing available?

Here’s the link to the “Open Vaccine Covid-19” tutorial highlighted in the workshop.

Are Kale tags per cell, or do they apply to everything until the next tag?

The main idea behind Kale is to exploit the JSON structure of Notebooks to annotate them, both at the Notebook level (Notebook metadata) and at the single cell level (Cell metadata). These annotations allow you to:

  • Assign code cells to specific pipeline components
  • Merge together multiple cells into a single pipeline component
  • Define the (execution) dependencies between them

Additional information can be found here.
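
As a rough illustration of the idea, the sketch below reads the cell metadata from a notebook's JSON and groups code cells into pipeline steps. It is not Kale's implementation. The tag strings `block:<name>` and `prev:<name>`, the helper name `group_cells_into_steps`, and the rule that an untagged code cell joins the most recent tagged step are all assumptions made for this example; check the Kale documentation for the exact behavior.

```python
import json

# A tiny notebook in the standard .ipynb JSON layout.
# ASSUMPTION: the "block:<name>" and "prev:<name>" tag strings are used here
# to illustrate cell-level annotations; they may not match Kale exactly.
NOTEBOOK_JSON = """
{
  "metadata": {},
  "cells": [
    {"cell_type": "code", "metadata": {"tags": ["block:load"]},
     "source": ["df = load()"]},
    {"cell_type": "code", "metadata": {},
     "source": ["df = clean(df)"]},
    {"cell_type": "markdown", "metadata": {},
     "source": ["# Training"]},
    {"cell_type": "code", "metadata": {"tags": ["block:train", "prev:load"]},
     "source": ["model = fit(df)"]}
  ]
}
"""


def group_cells_into_steps(notebook):
    """Group code cells into steps keyed by their block tag.

    ASSUMPTION for this sketch: an untagged code cell is merged into the
    most recent tagged step.
    """
    steps = {}
    current = None
    for cell in notebook["cells"]:
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells never become pipeline code
        tags = cell.get("metadata", {}).get("tags", [])
        block = next(
            (t.split(":", 1)[1] for t in tags if t.startswith("block:")), None
        )
        if block is not None:
            current = block
            steps.setdefault(current, {"source": [], "depends_on": []})
            steps[current]["depends_on"] += [
                t.split(":", 1)[1] for t in tags if t.startswith("prev:")
            ]
        if current is None:
            continue  # untagged cells before the first step are skipped here
        steps[current]["source"].append("".join(cell["source"]))
    return steps


steps = group_cells_into_steps(json.loads(NOTEBOOK_JSON))
for name, step in steps.items():
    print(name, step["depends_on"], step["source"])
```

In this sketch the second cell carries no tag, so it is merged into the `load` step, while the `train` step records `load` as its dependency.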

Where is Kubeflow storing the snapshots we are creating?

MiniKF stores its snapshots on Arrikto Rok, which in the background stores the snapshot data on the cluster’s local object storage service. On GCP this would be GCS.

How does GCP’s Vertex AI compare to Kubeflow?

Vertex is Google’s managed service that actually uses the Kubeflow Pipelines component of Kubeflow. The rest of the services on Vertex AI are closed source. Kubeflow is open source and provides the same experience on every cloud or on-prem environment.

Does Kale work outside Jupyter notebooks? For example, in PyCharm or VS Code?

Yes, you can use the Kale SDK with PyCharm or VS Code. For more information about the Kale SDK, check out this post.

Can you run multiple algorithms for modeling in parallel and select the best model based on a specified metric on Kubeflow?

Yes. Take a look at Tutorial #4 which covers AutoML on Kubeflow.

Can we view the final deployed format of the ML models, i.e. the output of the Kubeflow pipeline?

Kubeflow supports any format, depending on the libraries you use in the pipeline. For example, people build TensorFlow and PyTorch models, or even run distributed training with the MPI operator.

Do all the components at each stage (preprocess, training, etc.) use the same image?

With the Rok storage class underneath Kubeflow and Kale, a snapshot is taken of each stage, and the next stage starts from that snapshot. So you don’t need to create a new image for each stage, but you still get all the state and data from the previous stage.

Is it possible to reproduce these tutorials locally using MiniKF?

Yes, you can reproduce all of these tutorials with MiniKF. However, it is best to try them out using the GCP or AWS marketplace solutions, since they need at least 32GB of RAM. If you have a big machine, you can also try out the Vagrant version on your laptop, but if this is your first time trying MiniKF, it is much easier to do on GCP or AWS.

At any step of the pipeline, can I trigger an external API or script?

Yes, you can do that.
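
As a minimal sketch of what that could look like, the function below runs an external command from inside the Python code of a step and returns its output. The helper name `run_external_script` and the one-line stand-in script are hypothetical and only for illustration. The sketch assumes the step's container has the script (or network access to the API) available.

```python
import subprocess
import sys


def run_external_script(args, timeout_seconds=60):
    """Run an external command and return its standard output.

    Raises subprocess.CalledProcessError if the command exits non-zero,
    so a failing script also fails the code that called it.
    """
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as str rather than bytes
        check=True,           # raise on a non-zero exit code
        timeout=timeout_seconds,
    )
    return completed.stdout.strip()


# Stand-in for a real script: a one-line Python program.
output = run_external_script(
    [sys.executable, "-c", "print('external job done')"]
)
print(output)
```

Calling an HTTP API from a step follows the same pattern, using an HTTP client in place of `subprocess`.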

Is real time inferencing such as Prediction API supported?

Yes, Kubeflow includes the KFServing component for live inferencing via a Prediction API.
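
To give a feel for the request shape, here is a sketch that builds (but does not send) a prediction request. It assumes the served model speaks the V1 "predict" protocol, with a `/v1/models/<name>:predict` path and an `instances` payload, and that the request is routed by a `Host` header at the cluster ingress. The ingress address, hostname, model name, and the helper `build_predict_request` are all hypothetical placeholders.

```python
import json
import urllib.request


def build_predict_request(ingress_url, service_hostname, model_name, instances):
    """Build a POST request for a V1-style prediction endpoint.

    ASSUMPTION: the path and payload follow the V1 predict protocol;
    verify both against the documentation for your model server.
    """
    url = f"{ingress_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Host": service_hostname,  # used by the ingress to route the call
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_predict_request(
    ingress_url="http://203.0.113.10",
    service_hostname="my-model.my-namespace.example.com",
    model_name="my-model",
    instances=[[1.0, 2.0, 3.0]],
)
print(request.get_method(), request.full_url)

# To actually send it against a live endpoint:
# with urllib.request.urlopen(request) as response:
#     predictions = json.load(response)
```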

When you take a snapshot, does Kubeflow take a snapshot across all 40 pods?

No. When you take a snapshot, it consists only of the pods that run the workload, along with all the K8s PVCs that hold the data and libraries, as well as the metadata from the K8s objects.

Is there any difference between MiniKF and the installation using the manifest repo of Kubeflow?

Yes. MiniKF is a single-node installation for educational and training purposes that makes it very easy to get started. MiniKF is packaged inside a VM image that includes K8s. MiniKF also includes Kale and Arrikto Rok out of the box for data versioning and snapshotting, which are not part of the official manifests. For the same experience as MiniKF, but on a multi-node cluster and with enterprise-grade integrations, you can check out Arrikto’s distribution of Kubeflow: Arrikto Enterprise Kubeflow.

Are there any integrations available with other ML tools, like a feature store (Feast), data quality validation, etc.?

Kubeflow is open source and very extensible. Members of the Kubeflow community have integrated Kubeflow with a number of external tools, and there is also a contrib directory in the Kubeflow manifests where third-party vendors provide integrations for their products. For example, Feast is part of that contrib directory.

What’s Next?

Free Technical Workshop

Turbocharge your team’s Kubeflow and MLOps skills with a free workshop.