Automating GPU Infrastructure for Kubernetes
| Field | Value |
| --- | --- |
| Talk Title | Automating GPU Infrastructure for Kubernetes |
| Speakers | Lucas Servén Marín (Senior Software Engineer, Red Hat) |
| Conference | KubeCon + CloudNativeCon Europe |
| Conf Tag | |
| Location | Copenhagen, Denmark |
| Date | Apr 30-May 4, 2018 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Kubernetes has seen broad interest from the machine learning community, and many users are bringing GPUs to their clusters. However, compiling, installing, and updating the NVIDIA kernel modules needed to run workloads on those GPUs remains a cumbersome and largely manual process. Furthermore, distributions like Container Linux, which update frequently, can require new kernel modules every other week. In this presentation, Lucas Servén explains how to automate all of these operations for Kubernetes deployed on Container Linux and describes his experience running GPU Kubernetes clusters on both AWS and bare metal.
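The talk covers the specifics of the automation; as a rough illustration of the general pattern (running a per-node builder that compiles and loads kernel modules), the sketch below uses the official Kubernetes Python client to deploy a privileged DaemonSet onto GPU nodes. The `nvidia-module-builder` image, the `gpu: "true"` node label, and the namespace are hypothetical placeholders, not details taken from the talk.

```python
# Sketch: schedule a privileged module-builder pod on every GPU node via a DaemonSet.
# Image name, node label, and namespace are illustrative placeholders.
from kubernetes import client, config


def deploy_module_builder(namespace: str = "kube-system") -> None:
    config.load_kube_config()  # use config.load_incluster_config() when running in-cluster

    container = client.V1Container(
        name="nvidia-module-builder",
        image="example.com/nvidia-module-builder:latest",  # hypothetical builder image
        # A real builder needs host access to compile against the running kernel
        # and to insert the resulting modules.
        security_context=client.V1SecurityContext(privileged=True),
        volume_mounts=[
            client.V1VolumeMount(name="dev", mount_path="/dev"),
            client.V1VolumeMount(name="modules", mount_path="/lib/modules"),
        ],
    )

    pod_spec = client.V1PodSpec(
        containers=[container],
        node_selector={"gpu": "true"},  # only target nodes labeled as having GPUs
        host_pid=True,
        volumes=[
            client.V1Volume(
                name="dev",
                host_path=client.V1HostPathVolumeSource(path="/dev"),
            ),
            client.V1Volume(
                name="modules",
                host_path=client.V1HostPathVolumeSource(path="/lib/modules"),
            ),
        ],
    )

    daemon_set = client.V1DaemonSet(
        api_version="apps/v1",
        kind="DaemonSet",
        metadata=client.V1ObjectMeta(name="nvidia-module-builder"),
        spec=client.V1DaemonSetSpec(
            selector=client.V1LabelSelector(
                match_labels={"app": "nvidia-module-builder"}
            ),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(
                    labels={"app": "nvidia-module-builder"}
                ),
                spec=pod_spec,
            ),
        ),
    )

    client.AppsV1Api().create_namespaced_daemon_set(namespace, daemon_set)


if __name__ == "__main__":
    deploy_module_builder()
```

Because the DaemonSet reschedules the builder whenever a node reboots into a new OS image, a frequently updating distribution like Container Linux can pick up freshly built modules without manual intervention.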