PyTorch Jupyter Notebook Development with Docker and GPU Support

Complete guide to setting up a PyTorch development environment with Jupyter notebooks, Docker, and GPU acceleration for ML/AI projects

Quick Navigation

Difficulty: 🟡 Intermediate
Estimated Time: 20-30 minutes
Prerequisites: Basic Docker knowledge, NVIDIA GPU with drivers, Docker Compose, Basic Python knowledge

What You'll Learn

This tutorial covers essential PyTorch development concepts and tools:

  • Docker Setup - Setting up PyTorch with Docker and GPU support
  • Jupyter Environment - Jupyter notebook development environment
  • NVIDIA Integration - CUDA integration for accelerated training
  • Docker Compose - Easy management and configuration
  • GPU Testing - Testing GPU availability and PyTorch functionality
  • Performance Monitoring - GPU monitoring and optimization
  • Security Best Practices - Secure development environment setup

Prerequisites

Before starting, ensure you have:

  • Docker and Docker Compose installed
  • NVIDIA GPU with proper drivers
  • NVIDIA Container Toolkit configured
  • Basic understanding of Docker concepts

Introduction

Developing machine learning models with PyTorch benefits from a reproducible environment. Docker containers help by pinning dependencies in an image, exposing the host GPU to the container through the NVIDIA Container Toolkit, and letting collaborators run the same setup on different machines.

This tutorial covers setting up PyTorch with Docker and GPU support, a Jupyter notebook development environment, NVIDIA CUDA integration for accelerated training, Docker Compose for easy management, and tests of GPU availability and PyTorch functionality.

Step-by-Step Instructions

Create Project Directory

mkdir pytorch-jupyter-project
cd pytorch-jupyter-project

Create Docker Compose Configuration

Create a docker-compose.yml file with the following content:

version: '3.8'

services:
  pytorch-jupyter:
    image: quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.9
    container_name: pytorch-jupyter-gpu
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/home/jovyan/work
      - ./data:/home/jovyan/data
    environment:
      - JUPYTER_ENABLE_LAB=yes
      - JUPYTER_TOKEN=your_secure_token_here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all  # or an integer no larger than the number of GPUs on the host
              capabilities: [gpu]
    restart: unless-stopped
    networks:
      - pytorch-network

networks:
  pytorch-network:
    driver: bridge

Create Project Structure

mkdir -p notebooks data
touch notebooks/README.md

Start the Environment

docker-compose up -d

Access Jupyter Lab

Open your browser and navigate to:

http://localhost:8888

Use the token: your_secure_token_here
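
Jupyter can take a few seconds to start after docker-compose up -d returns. The following is a minimal sketch, using only the Python standard library, of waiting on the host until the published port accepts connections. The helper name wait_for_port and the 30-second default are illustrative assumptions, and an open port shows only that something is listening, not that Jupyter has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # brief pause before the next attempt
    return False
```

On the host, wait_for_port("localhost", 8888) would return True once the mapped port from the Compose file is reachable.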

Testing GPU Support

Test PyTorch CUDA Availability

Create a new notebook and run:

import torch

# Check CUDA availability
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")

if torch.cuda.is_available():
    print(f"Current GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
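
The check above reports details for device 0 only. When the container can see several GPUs, a sketch along these lines lists every visible device. The function name describe_gpus and the dictionary keys are illustrative choices, and the import guard is there only so the snippet still loads where PyTorch is not installed.

```python
try:
    import torch
except ImportError:  # keeps the sketch importable where PyTorch is absent
    torch = None


def describe_gpus():
    """Return one dict per visible CUDA device; an empty list if CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return []
    devices = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        devices.append({
            "index": index,
            "name": props.name,
            "memory_gb": round(props.total_memory / 1e9, 2),
            "compute_capability": f"{props.major}.{props.minor}",
        })
    return devices


for gpu in describe_gpus():
    print(gpu)
```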

Test GPU Tensor Operations

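import torch
import time

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Create the test matrices directly on the target device
x = torch.randn(1000, 1000, device=device)
y = torch.randn(1000, 1000, device=device)

def sync():
    # CUDA kernels run asynchronously; synchronize only when a GPU is in use
    if device.type == 'cuda':
        torch.cuda.synchronize()

# Warm-up run so one-time CUDA initialization is not included in the timing
torch.mm(x, y)
sync()

start_time = time.perf_counter()
z = torch.mm(x, y)
sync()  # Wait for the multiplication to finish before stopping the timer
elapsed = time.perf_counter() - start_time

print(f"Matrix multiplication time on {device}: {elapsed:.4f} seconds")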
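
A single timed run is noisy, and GPU work is launched asynchronously, so a measurement is more trustworthy when it is repeated and synchronized. The sketch below is one way to do that with the standard library. The helper name time_op and its default warmup and repeats values are assumptions made for illustration, not part of PyTorch.

```python
import statistics
import time


def time_op(fn, sync=None, warmup=3, repeats=10):
    """Return the median wall-clock seconds of fn() over `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # untimed runs absorb one-off costs such as CUDA initialization
    if sync is not None:
        sync()  # finish the warm-up work before timing starts
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous GPU work before stopping the clock
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

In a notebook, with x and y on a CUDA device, this could be called as time_op(lambda: torch.mm(x, y), sync=torch.cuda.synchronize). For CPU tensors, leave sync unset.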

Configuration Options

Custom PyTorch Version

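To use a specific PyTorch version, create a Dockerfile in the project directory that extends the base image, then replace the image: line in docker-compose.yml with build: . so that Compose builds it: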

FROM quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.9

# Install specific PyTorch version
RUN pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
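
After rebuilding, it is worth confirming in a notebook that the pinned build is the one actually imported. Wheels installed from the CUDA-specific index report a version string with a build suffix, such as 2.1.0+cu121. The sketch below splits that string and compares it with the pin; the helper names split_torch_version and matches_pin are hypothetical.

```python
def split_torch_version(version: str):
    """Split a string like '2.1.0+cu121' into ('2.1.0', 'cu121')."""
    release, _, build = version.partition("+")
    return release, build or None  # build is None when there is no '+' suffix


def matches_pin(version: str, expected_release: str, expected_build: str) -> bool:
    """True if the version string matches the pinned release and build tag."""
    return split_torch_version(version) == (expected_release, expected_build)
```

In the container this might be used as matches_pin(torch.__version__, "2.1.0", "cu121") after import torch.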

Additional Dependencies

Add to your docker-compose.yml:

services:
  pytorch-jupyter:
    # ... existing configuration ...
    environment:
      - JUPYTER_ENABLE_LAB=yes
      - JUPYTER_TOKEN=your_secure_token_here
    volumes:
      - ./notebooks:/home/jovyan/work
      - ./data:/home/jovyan/data
      - ./requirements.txt:/home/jovyan/requirements.txt
    # The token is read from JUPYTER_TOKEN above, so it is not repeated on the command line
    command: >
      bash -c "pip install -r /home/jovyan/requirements.txt &&
      start.sh jupyter lab"
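
Because the packages are installed each time the container starts, a notebook cell can confirm they are actually present before a long run begins. This is a minimal sketch using importlib.metadata from the standard library. The function names are illustrative, and the parser only handles plain name-and-version lines, not URLs or editable installs.

```python
import re
from importlib import metadata


def requirement_names(text: str):
    """Extract bare package names from requirements.txt content."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        # the name ends at the first version specifier, extras bracket, or marker
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that importlib.metadata cannot find in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Inside the container, missing_packages(requirement_names(open("/home/jovyan/requirements.txt").read())) should return an empty list when the install step succeeded.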

Performance Monitoring

GPU Monitoring with nvidia-smi

# Run nvidia-smi inside the running container
docker exec pytorch-jupyter-gpu nvidia-smi

# Continuous monitoring, refreshed every second (Ctrl+C to stop)
docker exec -it pytorch-jupyter-gpu nvidia-smi -l 1
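
For logging or alerting, the same information can be collected programmatically through nvidia-smi's CSV query mode. The sketch below runs on the host and assumes the container name from the Compose file. The function names and dictionary keys are illustrative, and the parser assumes every queried field is numeric, which may not hold on devices that report [N/A].

```python
import subprocess

QUERY_FIELDS = ["index", "name", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_stats(csv_text: str):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, name, util, used, total = [part.strip() for part in line.split(",")]
        stats.append({
            "index": int(index),
            "name": name,
            "utilization_pct": int(util),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return stats


def query_gpu_stats(container: str = "pytorch-jupyter-gpu"):
    """Run nvidia-smi inside the container via docker exec and parse the result."""
    cmd = [
        "docker", "exec", container, "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(output)
```

Calling query_gpu_stats() in a loop with a short sleep gives a simple poller whose results can be written to a log file.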

Memory Usage Monitoring

import torch

def print_gpu_memory():
    if torch.cuda.is_available():
        print(f"GPU memory allocated: {torch.cuda.memory_allocated(0) / 1e9:.2f} GB")
        print(f"GPU memory reserved: {torch.cuda.memory_reserved(0) / 1e9:.2f} GB")

# Use in your training loops
print_gpu_memory()
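
Point-in-time readings can miss short spikes inside a training step. PyTorch also tracks a peak value, which the sketch below wraps in a context manager. The name track_peak_gpu_memory and the result dictionary are illustrative assumptions; the guarded import only keeps the snippet loadable where PyTorch is not installed.

```python
import contextlib

try:
    import torch
except ImportError:  # keeps the sketch importable where PyTorch is absent
    torch = None


@contextlib.contextmanager
def track_peak_gpu_memory(label: str, device: int = 0):
    """Record the peak allocated GPU memory (GB) of the with-block in result['peak_gb']."""
    result = {"label": label, "peak_gb": None}
    use_cuda = torch is not None and torch.cuda.is_available()
    if use_cuda:
        torch.cuda.reset_peak_memory_stats(device)  # restart the peak counter
    try:
        yield result
    finally:
        if use_cuda:
            torch.cuda.synchronize(device)  # let pending kernels finish first
            result["peak_gb"] = torch.cuda.max_memory_allocated(device) / 1e9
            print(f"{label}: peak GPU memory {result['peak_gb']:.2f} GB")
```

A training step could then be wrapped as with track_peak_gpu_memory("train step"): followed by the forward and backward pass. On a machine without CUDA the block still runs and peak_gb stays None.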

Troubleshooting

Common Issues and Solutions

Issue: CUDA not available

# Check NVIDIA drivers
nvidia-smi

# Verify Docker can access GPU
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu20.04 nvidia-smi

Issue: Port already in use

# Change port in docker-compose.yml
ports:
  - "8889:8888"  # Use port 8889 instead
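
Before editing the mapping, it can help to find a host port that is actually free. The sketch below, run on the host, tries to bind each candidate port. The function names and the 8888 to 8899 range are illustrative assumptions, and a successful bind is only a hint, since another process could take the port a moment later.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:  # typically "address already in use"
            return False


def first_free_port(start: int = 8888, end: int = 8899, host: str = "0.0.0.0"):
    """Return the first free port in [start, end], or None if all are taken."""
    for port in range(start, end + 1):
        if port_is_free(port, host):
            return port
    return None
```

The value returned by first_free_port() would go on the left-hand side of the ports entry, with 8888 kept on the right as the port inside the container.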

Issue: Permission denied

# Fix volume permissions
sudo chown -R 1000:1000 ./notebooks ./data

Debug Commands

# Check container logs
docker-compose logs pytorch-jupyter

# Enter container for debugging
docker exec -it pytorch-jupyter-gpu bash

# Inside the container shell, check which GPU devices are visible
nvidia-smi

Security Best Practices

Token Security

  • Use strong, unique tokens
  • Store tokens in environment variables
  • Rotate tokens regularly
  • Never commit tokens to version control
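
One way to follow these points is to generate the token with Python's secrets module and keep it in a .env file next to docker-compose.yml, which Compose reads for variable substitution. The sketch below is an illustration under those assumptions; the function name write_token_env is hypothetical, and it overwrites any existing file at the given path.

```python
import os
import secrets
from pathlib import Path


def write_token_env(path: str = ".env", nbytes: int = 32) -> str:
    """Generate a random token and store it as JUPYTER_TOKEN in an env file."""
    token = secrets.token_urlsafe(nbytes)  # cryptographically strong, URL-safe text
    env_file = Path(path)
    env_file.write_text(f"JUPYTER_TOKEN={token}\n")
    os.chmod(env_file, 0o600)  # readable and writable by the owner only
    return token
```

With the file in place, the Compose entry can become JUPYTER_TOKEN=${JUPYTER_TOKEN}, and .env should be listed in .gitignore so the token is never committed. Running the function again rotates the token.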

Network Security

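# Pin the bridge network to a fixed subnet. This alone does not restrict access;
# to keep Jupyter off external interfaces, publish the port as "127.0.0.1:8888:8888".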
networks:
  pytorch-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16

Next Steps

Advanced Topics

  • Multi-GPU training with PyTorch Distributed
  • Custom Docker images for specific ML frameworks
  • Integration with MLflow for experiment tracking
  • Kubernetes deployment for production workloads

Conclusion

You've successfully set up a PyTorch development environment with:

  • Docker containerization for reproducibility
  • GPU acceleration with NVIDIA CUDA support
  • Jupyter Lab for interactive development
  • Docker Compose for easy management
  • GPU testing and monitoring capabilities

This environment provides a solid foundation for machine learning development and can be easily shared with team members or deployed to different machines.

Key Takeaways

  • Docker Containerization - Reproducible development environments
  • GPU Acceleration - NVIDIA CUDA integration for faster training
  • Jupyter Integration - Interactive notebook development
  • Easy Management - Docker Compose for simple deployment

Next Steps

  1. Start Developing - Create your first PyTorch notebook
  2. Test GPU Performance - Run GPU benchmarks and tests
  3. Customize Environment - Add your preferred ML libraries
  4. Share with Team - Export and share your Docker setup

Tags: #PyTorch #Jupyter #Docker #GPU #CUDA #MachineLearning #AI #Development #DockerCompose