@igorbrigadir
Last active July 30, 2020 14:13
Working with Notebooks on GPUs with CUDA

Like a lot of people, I prototype things in notebooks. When it comes to working with GPUs, a problem I have had to deal with over and over is keeping my environment in a delicate balance.

A stable arrangement I ended up using:

Installing drivers on Ubuntu:

Add the GPU driver repository and update the package lists:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

Check which driver to install (this depends on the card). Run the first command to see the recommendations, then install the driver (410 here) together with its Xorg package:

ubuntu-drivers devices

sudo apt-get install xserver-xorg-video-nvidia-410 nvidia-driver-410

sudo reboot

(Optional, you may not need this.) If Secure Boot is enabled, a blue MOK management screen may appear on reboot: Enroll MOK -> Continue -> enter the Secure Boot password -> Reboot

Verify card is being recognised:

nvidia-smi
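
The full nvidia-smi table is a lot to read when all you want is the driver version to match against a CUDA image. A minimal sketch of a shorter check using the query flags; the function name gpu_summary is only illustrative, and it assumes nvidia-smi is on the PATH once the driver is installed:

```shell
# gpu_summary: print name, driver version and total memory for each GPU.
# Illustrative helper (not part of the original setup). Returns 1 when
# nvidia-smi is missing, which usually means the driver is not installed.
gpu_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: driver not installed or not on PATH" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

gpu_summary || echo "no usable GPU driver detected"
```

Each GPU is printed as one comma-separated line, which is also easy to paste into a bug report.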

If the machine has multiple graphics cards, check which one is selected and switch to the NVIDIA one:

prime-select query
sudo prime-select nvidia

Docker installation:

Prerequisites, keys, repos:

sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
sudo apt update

Logging in to Docker Hub may need additional packages (they fix the "Cannot create an item in a locked collection" error):

sudo apt-get install gnupg2 pass

Make sure docker-ce is coming from the right place:

apt-cache policy docker-ce

Install Docker:

sudo apt-get install docker-ce

Check that it’s running:

sudo systemctl status docker

Add your user to the docker group (replace igor with your username):

sudo usermod -aG docker igor
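
The new group only applies to login sessions started after the change, so a "permission denied" error on the Docker socket right after usermod usually just means the current session is stale. A small sketch to check this; session_in_group is a hypothetical helper name and it relies only on the standard id tool:

```shell
# session_in_group GROUP: succeed if the current session's groups include GROUP.
# Hypothetical helper for illustration. `id -nG` without a user name lists the
# groups of the running process, which is what decides access to the socket.
session_in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "$1"
}

if session_in_group docker; then
    echo "docker group is active in this session"
else
    echo "docker group not active yet: log out and back in after usermod"
fi
```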

Now install nvidia-docker (the NVIDIA Container Toolkit):

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update
sudo apt-get install nvidia-container-toolkit
sudo systemctl restart docker

Verify GPU is working:

docker run --gpus all --rm nvidia/cuda nvidia-smi

Enable docker-compose to use GPUs:

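Edit /etc/docker/daemon.json so that it registers the nvidia runtime as below, then restart Docker (sudo systemctl restart docker) so the change takes effect: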

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },

    "experimental": true
}
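
With the nvidia runtime registered, a compose file can ask for it per service. A minimal sketch, assuming Compose file format 2.3 (the version that introduced the runtime key) and reusing the notebook image and folder from this guide; the service name notebook is arbitrary:

```shell
# Write a minimal docker-compose.yml that runs the notebook image on the GPU.
# Assumptions: Compose file format 2.3 (which supports the `runtime` key) and
# the "nvidia" runtime registered in /etc/docker/daemon.json.
cat > docker-compose.yml <<'EOF'
version: "2.3"
services:
  notebook:
    image: tensorflow/tensorflow:latest-gpu-py3-jupyter
    runtime: nvidia
    ports:
      - "8888:8888"
    volumes:
      - /home/notebooks:/tf/notebooks
EOF

# Then start it from the same folder with: docker-compose up
```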

Working with notebooks:

After everything is configured, I launch notebooks in a Docker container, mounting my local notebooks folder /home/notebooks as a volume. Change this to whatever local folder you keep your notebooks and data in:

docker run --runtime=nvidia --gpus all -u $(id -u):$(id -g) -it --rm -v $(realpath /home/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
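
Since the only parts of that command that change between projects are the local folder and the port, it can be wrapped in a small function. A sketch under that assumption; notebook_cmd is a hypothetical name, and it only prints the command so you can look it over before running it:

```shell
# notebook_cmd [DIR] [PORT]: print the docker run command for a GPU notebook.
# Hypothetical helper; it echoes the command instead of running it.
notebook_cmd() {
    dir="${1:-/home/notebooks}"   # local folder with notebooks and data
    port="${2:-8888}"             # host port that Jupyter is published on
    echo docker run --runtime=nvidia --gpus all \
        -u "$(id -u):$(id -g)" -it --rm \
        -v "$dir:/tf/notebooks" -p "$port:8888" \
        tensorflow/tensorflow:latest-gpu-py3-jupyter
}

# Example: a second notebook server on port 8889
notebook_cmd /home/notebooks 8889
```

Copy the printed line into the terminal to run it. Folder names containing spaces would need quoting by hand, because echo drops the quotes.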

If you run that on a server and want to connect to it remotely, forward port 8888 over SSH and open http://localhost:8888 on your own machine:

ssh -N -f -L localhost:8888:localhost:8888 you@remote.server

I use this TensorFlow image as a general-purpose one even when I am not using TensorFlow, simply because it is easy to use.

For installing dependencies, I use the notebook itself, with

!pip install ...

For heavier or more complex things, I build my own Docker image, based on the NVIDIA CUDA images, with everything already installed. For example:

FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04

ARG DEBIAN_FRONTEND=noninteractive

RUN ...

This starts you off with most of what you need. You can install Miniconda here and reuse the image for later work. Other tags, for different versions of CUDA, are listed at https://hub.docker.com/r/nvidia/cuda/tags.
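
As one possible way to add Miniconda to such an image, here is a sketch that writes a complete Dockerfile into the current folder. The /opt/conda prefix and the wget and ca-certificates packages are my own illustrative choices, not something the base image requires, and the installer URL is the generic "latest" Linux x86_64 one:

```shell
# Write a Dockerfile that adds Miniconda on top of the CUDA base image.
# Assumptions (illustrative only): install prefix /opt/conda, wget as the
# download tool. This overwrites any Dockerfile already in this folder.
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04

ARG DEBIAN_FRONTEND=noninteractive

# Tools needed to download the installer
RUN apt-get update && \
    apt-get install -y --no-install-recommends wget ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# Install Miniconda in batch mode (-b) into /opt/conda (-p)
RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    bash /tmp/miniconda.sh -b -p /opt/conda && \
    rm /tmp/miniconda.sh

ENV PATH=/opt/conda/bin:$PATH
EOF
```

The heredoc delimiter is quoted so that $PATH and the backslashes end up in the Dockerfile unchanged.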

More useful Docker commands:

When running docker build, set DOCKER_BUILDKIT=1 to enable BuildKit, which gives faster builds with better caching:

DOCKER_BUILDKIT=1 docker build -t mycontainer:latest .

Cleaning up after builds (Docker's cached layers take up a lot of disk space after a while):

docker system prune
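
To see how much space is actually being used before and after pruning, docker system df helps. A small sketch that combines the two; docker_cleanup and the DOCKER_BIN override are hypothetical names I made up for illustration:

```shell
# docker_cleanup: show Docker disk usage, prune unused data, show usage again.
# Hypothetical helper. DOCKER_BIN can point at another binary; it defaults to docker.
docker_cleanup() {
    bin="${DOCKER_BIN:-docker}"
    "$bin" system df       # space used by images, containers, volumes, build cache
    "$bin" system prune    # asks for confirmation, then removes stopped containers,
                           # unused networks, dangling images and build cache
    "$bin" system df
}

# Usage: docker_cleanup
```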

Removing tagged images:

docker rmi name:tag

Renaming tags (this adds the new tag; the old one stays until you remove it with docker rmi name:oldtag):

docker tag name:oldtag name:newtag