Introduction to Kubeflow: Fundamentals Training and Certification Recap – Jan 27, 2021

We recently hosted the third delivery of the “Intro to Kubeflow: Fundamentals Training and Certification” prep course. In this blog post we’ll recap some highlights from the class and summarize the Q&A. Ok, let’s dig in!

Congratulations to Traian Antonescu!

The first student to earn the “Fundamentals” certificate at the conclusion of the course was Traian Antonescu from Capital One. A free MiniKF hoodie and shirt are on the way. Well done!

First, thanks for voting for your favorite charity!

With the unprecedented circumstances facing our global community, Arrikto is looking for even more ways to contribute. With this in mind, we thought that in lieu of swag we could give workshop attendees the opportunity to vote for their favorite charity and help guide our monthly donation to charitable causes. The charity that won this workshop’s voting was Action Against Hunger. They are a global humanitarian organization which originated in France and is committed to ending world hunger. The organization helps malnourished children and provides communities with access to safe water and sustainable solutions to hunger. We are pleased to be making a donation of $250 to them on behalf of the Kubeflow community. Again, thanks to all of you who attended and voted!

What topics were covered in the course?

This initial course aimed to get data scientists and DevOps engineers with little or no experience familiar with the fundamentals of how Kubeflow works.

  • Kubeflow architecture
  • Overview of machine learning workflows
  • Kubeflow components
  • Tools and add-ons (Kale, Rok, Istio, etc.)
  • Distributions
  • Installing Kubeflow on AWS
  • Community overview

What did I miss?

Here’s a short teaser from the 90-minute training. In this video we demonstrate three things about Katib (a Kubeflow component that provides AutoML, hyperparameter tuning, and early stopping):

  • How to view an experiment
  • How to set up experiments
  • How to identify the best run

Missed the Jan 27 Kubeflow Fundamentals training?

If you were unable to join us last week, but would still like to attend a future training, the next “Kubeflow Fundamentals” training is happening on Feb 16. You can sign up for this and the upcoming Notebooks, Pipelines and Kale/Katib courses here.

NEW: Advanced Kubeflow, Notebooks and Pipelines Workshops

We are excited to announce a new series of FREE workshops focused on taking popular Kaggle and Udacity machine learning examples from “Notebook to Pipeline.” Registration is now open for the following workshops:

Arrikto Academy

Are you ready to put what you’ve learned into practice with hands-on labs? Then check out Arrikto Academy! On this site you’ll find a variety of FREE skills-building exercises including:

  • Kale 101: Transform Jupyter Notebooks into Kubeflow Pipelines
  • Katib 101: Automated Hyperparameter Tuning for Models in Kubeflow Pipelines
  • Rok 101: Manage and Restore Kubeflow Pipeline Snapshots

Q&A from the training

Below is a summary of some of the questions that popped into the Q&A box during the course. [Edited for readability and brevity.]

Will a session recording be available for offline study & learning?

Yes, you can view the lectures and demos on this YouTube playlist.

What’s the cost per month/year of deploying MiniKF on AWS or GCP?

As of this writing, pricing is ~$0.51 per hour on AWS and ~$0.57 per hour on GCP.
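To translate those hourly rates into the monthly and yearly figures the question asks about, here is a rough back-of-the-envelope sketch in Python. It assumes the instance runs around the clock (24 hours a day, a 30-day month, a 365-day year). That is our assumption for illustration, not a billing rule, and the function name is hypothetical.

```python
# Rough cost estimate from the hourly rates quoted above.
# Assumption (ours, not from any billing documentation): the VM runs 24x7.
HOURLY_RATES_USD = {"AWS": 0.51, "GCP": 0.57}


def estimate_cost(hourly_rate, hours_per_day=24, days=30):
    """Return the approximate cost in USD for the given period."""
    return round(hourly_rate * hours_per_day * days, 2)


for cloud, rate in HOURLY_RATES_USD.items():
    monthly = estimate_cost(rate, days=30)
    yearly = estimate_cost(rate, days=365)
    print(f"{cloud}: ~${monthly:,.2f}/month, ~${yearly:,.2f}/year")
```

Under that always-on assumption, the AWS rate works out to roughly $367 per month. Stopping the instance when it is idle would lower the total accordingly.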

What is Rok?

Rok is a data management solution for Kubeflow. Arrikto Kubeflow’s built-in Rok integration simplifies operations and increases performance, while enabling data versioning in notebooks, pipeline steps and generic pods. It also helps with packaging and secure sharing across teams and cloud boundaries.

What are the advantages of using Jupyter Notebooks inside of Kubeflow vs externally?

  • Jupyter Notebooks in Kubeflow can easily integrate with enterprise authentication and access control mechanisms
  • You can create notebook pods/servers directly in the Kubeflow cluster using images provided by admins
  • You can easily submit single-node or distributed training jobs instead of having to configure everything on your laptop
  • You can convert the Notebooks to Kubeflow Pipelines

Can we enable Kale in Google Colab? Or is Kale specific to Jupyter Notebooks?

Kale currently supports JupyterLab natively as an extension. There is also an SDK you can interact with programmatically. You can use the SDK from an IDE right now (e.g., VS Code), but native, GUI-based support for VS Code and other IDEs is planned for future releases.

Where are Rok snapshots stored?

Rok snapshots are stored on the object storage service that Rok is connected to, which depends on where it runs. Examples include AWS S3, Azure Blob, and Google’s GCS in Arrikto Enterprise Kubeflow deployments.

Is it possible for Kubeflow to integrate with code review workflow tools such as GitLab?

Yes, there are users in the community that have integrated different workflow tools with Kubeflow, including GitLab. 

Are there Helm charts for Kubeflow?

No. Kubeflow uses kustomize instead.

How many Kubernetes master and worker nodes are required in order to get a basic MiniKF installation up and running?

MiniKF is based on a single-node minikube deployment. Running kubectl get pods --all-namespaces against a fresh install of MiniKF shows 120 pods running across 10 namespaces.
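The pod and namespace counts mentioned above can be reproduced on your own install with a small script. The Python sketch below counts rows and distinct namespaces in the text that kubectl get pods --all-namespaces prints. The helper names are hypothetical, and the sample rows are made up for illustration; they are not real MiniKF output.

```python
import subprocess


def summarize_pods(kubectl_output):
    """Count pods and distinct namespaces in `kubectl get pods --all-namespaces` text."""
    lines = [ln for ln in kubectl_output.splitlines() if ln.strip()]
    # Skip the header row (it starts with NAMESPACE) if present.
    if lines and lines[0].split()[0] == "NAMESPACE":
        lines = lines[1:]
    # The first column of every remaining row is the pod's namespace.
    namespaces = {ln.split()[0] for ln in lines}
    return len(lines), len(namespaces)


def fetch_pod_listing():
    """Run kubectl; requires kubectl on PATH and access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Made-up sample rows for illustration only, not real MiniKF output.
    sample = """NAMESPACE     NAME            READY   STATUS    RESTARTS   AGE
kubeflow      example-pod-a   1/1     Running   0          5m
kubeflow      example-pod-b   1/1     Running   0          5m
kube-system   example-pod-c   1/1     Running   0          5m
"""
    pods, namespaces = summarize_pods(sample)
    print(f"{pods} pods across {namespaces} namespaces")
```

On a machine with cluster access, passing fetch_pod_listing() to summarize_pods() instead of the sample text gives the live counts.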

Is Kubeflow an MLOps platform?

Yes. Kubeflow aims to solve problems that both data scientists and DevOps will encounter when moving models from development to production.

Does Kubeflow help with anomaly detection?

Kubeflow is an MLOps platform, so it is independent of the ML solutions that different teams build on top of it. You can use Kubeflow to implement anomaly detection use cases.

Are there instructions for installing MiniKF on a local Minikube deployment?

MiniKF is a VM-based solution that includes minikube, so you can install it locally using Vagrant and VirtualBox. However, we don’t recommend this as the preferred method. You’ll need to dedicate 2 CPUs, 32 GB of RAM, and at least 40 GB of free disk space to get up and running. Depending on the pipelines and experiments you intend to run, you can also easily tax your system and introduce undesirable effects. This is why we recommend MiniKF deployments on AWS or GCP, which also include minikube but where resources are plentiful and relatively cheap: about $0.51 per hour to run MiniKF on AWS.
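Before attempting a local install, it can help to check a machine against the resource figures quoted above. The Python sketch below is one way to do that. The function names are hypothetical, the RAM probe is best effort (it relies on os.sysconf, which is not available on Windows), and a machine sold as “32 GB” may report slightly less, so treat the RAM result as approximate.

```python
import os
import shutil

# Thresholds taken from the answer above.
MIN_CPUS, MIN_RAM_GB, MIN_FREE_DISK_GB = 2, 32, 40


def find_shortfalls(cpus, ram_gb, free_disk_gb):
    """Return a human-readable message for every requirement that is not met."""
    problems = []
    if cpus < MIN_CPUS:
        problems.append(f"need {MIN_CPUS} CPUs, found {cpus}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"need {MIN_RAM_GB} GB RAM, found {ram_gb:.0f} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"need {MIN_FREE_DISK_GB} GB free disk, found {free_disk_gb:.0f} GB"
        )
    return problems


def detect_ram_gb():
    """Best-effort RAM detection; returns 0.0 when the platform lacks os.sysconf."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return 0.0  # unknown, so it is reported as a shortfall


if __name__ == "__main__":
    cpus = os.cpu_count() or 0
    free_gb = shutil.disk_usage(".").free / 1024**3
    messages = find_shortfalls(cpus, detect_ram_gb(), free_gb)
    for msg in messages or ["all checks passed"]:
        print(msg)
```

Keeping the comparison in find_shortfalls separate from the probing code means the thresholds can be checked against any numbers, for example the specifications of a cloud instance type you are considering.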
