Kubeflow and MLOps Meetup Recap – Feb 2022

February 7, 2022

Last week we hosted our fifth “Data Science, Machine Learning and Kubeflow” Meetup. Special thanks to our awesome speakers Danny D. Leybzon and Trevor Grant. In this blog post we’ll recap some highlights from the Meetup and preview what’s next. Ok, let’s dig in.

Join a Meetup near you

First, did you miss last week’s Meetup? No need to suffer from FOMO. Here’s a list of the Meetups that are part of the “Data Science, Machine Learning and Kubeflow” Meetup network. Please join the one whose schedule best fits your time zone.

Get involved in the Kubeflow community

Join Kubeflow Community Slack
Are you interested in speaking at a future Meetup?
Is your company interested in sponsoring a Meetup?
Would you like to be a co-organizer of a local Meetup?

If you answered yes to any of the above, send one of the organizers/hosts a message on Meetup.com or jump onto Kubeflow Community Slack and DM @rawkintrevo.

Thanks for voting for your favorite charity!

With the unprecedented circumstances facing our global community, Arrikto is looking for even more ways to contribute. With this in mind, we thought that in lieu of swag we could give Meetup attendees the opportunity to vote for their favorite charity and help guide our monthly donation to charitable causes. The charity that won this month’s voting was Doctors Without Borders, an international humanitarian medical non-governmental organization of French origin, best known for its projects in conflict zones and in countries affected by endemic diseases. We are pleased to be making a donation of $200 to them on behalf of the Kubeflow community. Again, thanks to all of you who attended and voted!

Talk #1: AI Observability: How To Fix Issues With Your ML Model

When machine learning models are deployed to production, their performance starts degrading. Now that ML models are increasingly mission critical for enterprises and startups alike, root cause analysis and observability into your AI systems are just as critical. However, many organizations struggle to prevent model performance degradation and to assure the quality of the data being fed to their ML models, largely because they don’t have the tools and organizational knowledge to do so.

In this talk, MLOps Architect Danny D. Leybzon will explain the problems associated with ML models deployed in production, and how many of these problems can be addressed with data monitoring and AI observability best practices. Taking it a step further, the speaker will discuss steps that data scientists and machine learning engineers can take to proactively ensure the performance of their models, rather than reacting to the impacts of performance degradation reported by their customers.
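To make the idea of data monitoring concrete, here is a minimal sketch of one common kind of check: comparing the distribution of a feature at training time against what the model sees in production. It is not taken from the talk. The choice of the Population Stability Index (PSI), the function name, the bin count, and the alert threshold are all assumptions made for illustration.

```python
import math


def population_stability_index(baseline, production, bins=10):
    """Return the PSI between two samples of one numeric feature.

    Hypothetical helper for illustration; bin edges come from the baseline.
    """
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 if every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(baseline)
    q = proportions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def has_drifted(baseline, production, threshold=0.2):
    """Flag drift when PSI exceeds a threshold (assumed value; tune per feature)."""
    return population_stability_index(baseline, production) > threshold
```

Running a check like this on a schedule, per feature, is one way to notice degraded inputs before customers report degraded predictions.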

Resource Links from the Talk

Danny D. Leybzon, currently MLOps architect at WhyLabs, studied computational statistics at UCLA, and was an analyst and then a product manager for the big data platform Qubole.

Talk #2: Using Apache Spark in Kubeflow: A non-trivial Usecase

Working with big data matrices is challenging. Kubernetes lets users scale elastically, but a pod can only be as large as a node, which may not be large enough to fit the matrix in memory. Kubernetes does allow other paradigms on top of it that let pods coordinate on individual jobs, but setting them up and making them play nicely with ML platforms is not straightforward. Using Apache Spark and Apache Mahout, we can work with matrices of any dimension and distribute them across an unbounded number of pods/nodes, and we can use Kubeflow to make our work quickly and easily reproducible. In this talk, we’ll discuss how we used Apache Spark and Mahout to denoise DICOM images of the lungs of COVID patients and published our pipeline with Kubeflow to make the process easily repeatable, which could help doctors in more resource-limited hospitals as well as other researchers seeking to automate the detection of COVID.
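The core idea, that no single pod has to hold the whole matrix, can be sketched in a few lines of plain Python. This is not the speakers’ code and does not use Spark or Mahout; the function names and the choice of computing the Gram matrix (A transposed times A) are assumptions picked for illustration. Each row block stands in for the slice one worker would hold, and only the small partial results are combined.

```python
def partial_gram(block):
    """Compute block^T * block for one row block (the work of one worker)."""
    n = len(block[0])
    gram = [[0.0] * n for _ in range(n)]
    for row in block:
        for i in range(n):
            for j in range(n):
                gram[i][j] += row[i] * row[j]
    return gram


def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gram_by_blocks(rows, block_size):
    """Sum the per-block results; only n x n partials are ever combined."""
    n = len(rows[0])
    total = [[0.0] * n for _ in range(n)]
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        total = add_matrices(total, partial_gram(block))
    return total
```

The result is the same for any block size, which is what lets a framework such as Spark spread the blocks over as many pods as are available. In a real cluster the framework also handles scheduling, shuffling, and fault tolerance, none of which this sketch attempts.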

Resource Links from the Talk

Trevor Grant’s Twitter handle: @rawkintrevo
Book: Kubeflow for Machine Learning: From Lab to Production
Peer reviewed article
Code

Trevor is the Director of Developer Relations at Arrikto and an international speaker excited to be back on the road after a two-year COVID hiatus. He is also a member of several projects at the Apache Software Foundation and involved in their leadership, PMC Chair of Apache Mahout, and author of Kubeflow for Machine Learning: From Lab to Production.

Lightning Talks

There was also one short lightning talk at the Meetup worth checking out.

A 10 Minute Introduction to Kubeflow: Basics, Architecture & Components – Jimmy Guerrero, VP Developer Relations (Arrikto)

Questions and Answers

Here’s a recap of some of the Q&A during the Meetup, edited for brevity and readability.

Is it possible to connect to Kubeflow compute pods using VS Code to run .py files (not notebooks)?

Yes, starting with Kubeflow 1.3, you can spin up VS Code instances in a self-service manner.

Can you share some resources on how to spin up VS Code instances in a self-service manner?

https://www.kubeflow.org/docs/components/notebooks/overview/

Developer Relations Engineer!? Get paid to write blogs and give fun talks?! Where can I apply?!?!

Drop an inquiry here: https://apply.workable.com/arrikto/j/87A42E1D3B/ …Trevor should see it.

Upcoming March 2022 Meetup

We are excited to announce that we have our speakers locked in for the next Meetup.

March 3, 2022

Deep Learning in Robotic Vision – A Confluence of Architectures – Kausthub Krishnamurthy
Installing Kubeflow: Manifests vs Packaged Distributions – Jimmy Guerrero @Arrikto

If you are new to Kubeflow – install MiniKF

MiniKF is the easiest way to get started with Kubeflow on the platform of your choice (AWS or GCP).

Here are the links:

Get started with Kubeflow – hands-on tutorials

Installed but don’t know where to start? Get started with these hands-on, practical Kubeflow tutorials.

FREE Kubeflow courses and certifications

We are excited to announce the first of several free instructor-led and on-demand Kubeflow courses! The “Introduction to Kubeflow” series of courses will start with the fundamentals, then go on to deeper dives of various Kubeflow components. Each course will be delivered over Zoom with the opportunity to earn a certificate upon successful completion of an exam. Visit us to learn more.

We hope to see you at a future Meetup!