PyTorch Jupyter Notebook Development with Docker and GPU Support

A complete guide to setting up a PyTorch development environment with Jupyter notebooks, Docker, and GPU acceleration for ML/AI projects.

Difficulty: 🟡 Intermediate
Estimated Time: 20-30 minutes
Prerequisites: Basic Docker knowledge, NVIDIA GPU with drivers, Docker Compose, Basic Python knowledge

What You'll Learn

This tutorial covers essential PyTorch development concepts and tools:

Docker Setup - Setting up PyTorch with Docker and GPU support
Jupyter Environment - Jupyter notebook development environment
NVIDIA Integration - CUDA integration for accelerated training
Docker Compose - Easy management and configuration
GPU Testing - Testing GPU availability and PyTorch functionality
Performance Monitoring - GPU monitoring and optimization
Security Best Practices - Secure development environment setup

Prerequisites

Before starting, ensure you have:

Docker and Docker Compose installed
NVIDIA GPU with proper drivers
NVIDIA Container Toolkit configured
Basic understanding of Docker concepts
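
Part of the checklist above can be verified from a script. The following is a minimal sketch, assuming that `docker` and `nvidia-smi` are on the PATH when installed; the function name `check_prerequisites` is hypothetical. It only checks that the executables exist and cannot confirm that the NVIDIA Container Toolkit is configured correctly.

```python
import shutil


def check_prerequisites(tools=("docker", "nvidia-smi")):
    """Return a dict mapping each tool name to its path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


# Print a short report for the default tool list.
for tool, path in check_prerequisites().items():
    status = path if path else "NOT FOUND"
    print(f"{tool:<12} {status}")
```

A missing `nvidia-smi` on the host usually points to a driver installation problem rather than a Docker problem.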

Related Tutorials

GPU-Ready Docker Setup - Complete GPU Docker environment
CUDA Compatibility Guide - GPU compatibility matrix
Main Tutorials Hub - Step-by-step implementation guides

Introduction

Developing machine learning models with PyTorch requires a robust, reproducible environment. Docker containers help by pinning dependencies inside an image, exposing the host GPU to the container through the NVIDIA Container Toolkit, and letting the same setup run unchanged on different machines.

The steps below start a GPU-enabled Jupyter Lab container with Docker Compose, verify that PyTorch can see the GPU, and then cover configuration, monitoring, troubleshooting, and basic security.

Step-by-Step Instructions

Create Project Directory

mkdir pytorch-jupyter-project
cd pytorch-jupyter-project

Create Docker Compose Configuration

Create a docker-compose.yml file with the following content:

version: '3.8'

services:
  pytorch-jupyter:
    image: quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.9
    container_name: pytorch-jupyter-gpu
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/home/jovyan/work
      - ./data:/home/jovyan/data
    environment:
      - JUPYTER_ENABLE_LAB=yes
      - JUPYTER_TOKEN=your_secure_token_here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all  # Expose every host GPU; use 1 to reserve a single GPU
              capabilities: [gpu]
    restart: unless-stopped
    networks:
      - pytorch-network

networks:
  pytorch-network:
    driver: bridge

Create Project Structure

mkdir -p notebooks data
touch notebooks/README.md

Start the Environment

docker-compose up -d

Access Jupyter Lab

Open your browser and navigate to:

http://localhost:8888

When prompted, enter the value of JUPYTER_TOKEN from docker-compose.yml (your_secure_token_here in this example).
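
The container can take a few seconds to start, so the page may not load immediately. The following is a minimal sketch of a readiness check using only the standard library; the function name `wait_for_server` and its default timeouts are assumptions, not part of Jupyter or Docker. Any HTTP response, including an error status, is treated as proof that the server is up.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, total_timeout=60.0, interval=2.0):
    """Poll `url` until it returns any HTTP response or the timeout expires."""
    deadline = time.monotonic() + total_timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is running.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: wait and try again.
            time.sleep(interval)
    return False
```

After `docker-compose up -d`, calling `wait_for_server("http://localhost:8888")` returns True once Jupyter Lab responds and False if nothing answers within the timeout.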

Testing GPU Support

Test PyTorch CUDA Availability

Create a new notebook and run:

import torch

# Check CUDA availability
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
print(f"GPU count: {torch.cuda.device_count()}")

if torch.cuda.is_available():
    print(f"Current GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
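
Because the Compose file can expose more than one GPU, it can help to list every visible device rather than only device 0. The following is a minimal sketch; the function name `list_gpus` is hypothetical, and it returns an empty list when PyTorch is not installed or CUDA is unavailable.

```python
def list_gpus():
    """Return one dict per visible CUDA device; empty if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    gpus = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        gpus.append({
            "index": index,
            "name": props.name,
            "memory_gb": props.total_memory / 1e9,
            "compute_capability": f"{props.major}.{props.minor}",
        })
    return gpus


for gpu in list_gpus():
    print(gpu)
```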

Test GPU Tensor Operations

import torch
import time

# Create tensors on GPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

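# Time a matrix multiplication on the selected device
x = torch.randn(1000, 1000, device=device)
y = torch.randn(1000, 1000, device=device)

torch.mm(x, y)  # Warm-up so one-time CUDA initialization is not timed
if device.type == 'cuda':
    torch.cuda.synchronize()

start_time = time.perf_counter()
z = torch.mm(x, y)
if device.type == 'cuda':
    torch.cuda.synchronize()  # Wait for GPU operations to complete
elapsed = time.perf_counter() - start_time

print(f"Matrix multiplication time on {device}: {elapsed:.4f} seconds")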
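
A single timed call is noisy, and GPU kernels run asynchronously, so a fair comparison needs warm-up runs and explicit synchronization. The following is a minimal sketch of a reusable timing helper; the name `time_op` and its parameters are assumptions. The PyTorch part only runs if `torch` is importable.

```python
import statistics
import time


def time_op(fn, sync=None, warmup=3, repeats=10):
    """Return the median wall-clock seconds of fn() over `repeats` runs."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        if sync is not None:
            sync()  # Make sure earlier queued work has finished
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # Wait for the work launched by fn() to finish
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    x = torch.randn(1000, 1000)
    y = torch.randn(1000, 1000)
    print(f"CPU: {time_op(lambda: torch.mm(x, y)):.6f} s")
    if torch.cuda.is_available():
        xg, yg = x.cuda(), y.cuda()
        gpu_time = time_op(lambda: torch.mm(xg, yg), sync=torch.cuda.synchronize)
        print(f"GPU: {gpu_time:.6f} s")
```

Reporting the median rather than the mean keeps a single slow run from skewing the result.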

Configuration Options

Custom PyTorch Version

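To pin a specific PyTorch version, create a Dockerfile in the project directory with the content below, then replace the image: line in docker-compose.yml with build: . so that Compose builds from it: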

FROM quay.io/jupyter/pytorch-notebook:cuda12-python-3.11.9

# Install specific PyTorch version
RUN pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
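
After rebuilding the image, it is worth confirming that the pinned versions are the ones actually installed. The following is a minimal sketch using `importlib.metadata` from the standard library; the function name `check_pins` is hypothetical. It assumes that wheels from the PyTorch index may carry a local suffix such as `+cu121`, which is stripped before comparing.

```python
from importlib import metadata


def check_pins(pins):
    """Compare installed package versions against expected pins.

    Returns {package: (expected, installed)} for every mismatch;
    installed is None when the package is not installed.
    """
    mismatches = {}
    for package, expected in pins.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local version suffix such as "+cu121" before comparing.
        base = installed.split("+")[0] if installed else None
        if base != expected:
            mismatches[package] = (expected, installed)
    return mismatches


pins = {"torch": "2.1.0", "torchvision": "0.16.0", "torchaudio": "2.1.0"}
print(check_pins(pins) or "All pinned versions match.")
```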

Additional Dependencies

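To install extra packages at startup, create a requirements.txt in the project directory, then mount it and override the command in docker-compose.yml: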

services:
  pytorch-jupyter:
    # ... existing configuration ...
    environment:
      - JUPYTER_ENABLE_LAB=yes
      - JUPYTER_TOKEN=your_secure_token_here
    volumes:
      - ./notebooks:/home/jovyan/work
      - ./data:/home/jovyan/data
      - ./requirements.txt:/home/jovyan/requirements.txt
    # The token is read from JUPYTER_TOKEN above, so it is not repeated here
    command: >
      bash -c "pip install -r /home/jovyan/requirements.txt &&
               start.sh jupyter lab"

Performance Monitoring

GPU Monitoring with nvidia-smi

# One-off snapshot of GPU usage (run on the host; executes inside the container)
docker exec pytorch-jupyter-gpu nvidia-smi

# Continuous monitoring, refreshed every second (Ctrl+C to stop)
docker exec -it pytorch-jupyter-gpu nvidia-smi -l 1
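
For logging or alerting, the same information can be collected in machine-readable form with the `--query-gpu` and `--format=csv,noheader,nounits` options of nvidia-smi. The following is a minimal sketch; the function names `parse_gpu_csv` and `query_gpus` are hypothetical, and the sample line is illustrative input rather than measured output. It assumes every field is numeric, which may not hold on GPUs that report `[N/A]` for some values.

```python
import subprocess

QUERY_FIELDS = "index,name,utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, name, util, used, total = [part.strip() for part in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "utilization_pct": int(util),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return rows


def query_gpus(container="pytorch-jupyter-gpu"):
    """Run nvidia-smi inside the container via docker exec and parse the result."""
    cmd = ["docker", "exec", container, "nvidia-smi",
           f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Illustrative input only, not measured output.
sample = "0, Example GPU, 37, 2048, 24576\n"
print(parse_gpu_csv(sample))
```

Calling `query_gpus()` on the host returns one dict per GPU while the container is running.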

Memory Usage Monitoring

import torch

def print_gpu_memory():
    if torch.cuda.is_available():
        print(f"GPU memory allocated: {torch.cuda.memory_allocated(0) / 1e9:.2f} GB")
        print(f"GPU memory reserved: {torch.cuda.memory_reserved(0) / 1e9:.2f} GB")

# Use in your training loops
print_gpu_memory()
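
The function above reports memory at a single instant. To find the peak usage of a specific block, such as one training step, PyTorch's peak-memory counters can be reset before the block and read afterwards. The following is a minimal sketch; the context manager name `track_peak_gpu_memory` is hypothetical, and it reports 0.0 when PyTorch or CUDA is unavailable.

```python
from contextlib import contextmanager

try:
    import torch
except ImportError:
    torch = None


@contextmanager
def track_peak_gpu_memory(device=0):
    """Yield a dict whose 'peak_gb' is filled in when the block exits."""
    stats = {"peak_gb": 0.0}
    use_cuda = torch is not None and torch.cuda.is_available()
    if use_cuda:
        torch.cuda.reset_peak_memory_stats(device)
    try:
        yield stats
    finally:
        if use_cuda:
            torch.cuda.synchronize(device)
            stats["peak_gb"] = torch.cuda.max_memory_allocated(device) / 1e9


with track_peak_gpu_memory() as stats:
    if torch is not None and torch.cuda.is_available():
        a = torch.randn(4096, 4096, device="cuda")
        b = a @ a
print(f"Peak GPU memory in block: {stats['peak_gb']:.2f} GB")
```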

Troubleshooting

Common Issues and Solutions

Issue: CUDA not available

# Check NVIDIA drivers
nvidia-smi

# Verify Docker can access GPU
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu20.04 nvidia-smi

Issue: Port already in use

# Change port in docker-compose.yml
ports:
  - "8889:8888"  # Use port 8889 instead
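
Before editing the port mapping, it can be useful to confirm which host ports are actually free. The following is a minimal sketch using the standard `socket` module; the function names `port_is_free` and `first_free_port` and the scanned range are assumptions. It checks the loopback address only.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8888, end=8899):
    """Return the first free port in [start, end], or None if all are taken."""
    for port in range(start, end + 1):
        if port_is_free(port):
            return port
    return None


print(f"Suggested host port: {first_free_port()}")
```

The suggested number goes on the left side of the mapping, for example "8889:8888"; the container side stays 8888.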

Issue: Permission denied

# Fix volume permissions (the jovyan user in the image is UID 1000, GID 100)
sudo chown -R 1000:100 ./notebooks ./data

Debug Commands

# Check container logs
docker-compose logs pytorch-jupyter

# Enter container for debugging
docker exec -it pytorch-jupyter-gpu bash

# Check GPU devices (run inside the container shell opened above)
nvidia-smi

Security Best Practices

Token Security

Use strong, unique tokens
Store tokens in environment variables
Rotate tokens regularly
Never commit tokens to version control
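
A strong token can be generated with Python's `secrets` module instead of being typed by hand. The following is a minimal sketch; the function name `generate_token` is hypothetical, and the 32-byte length is an assumption rather than a Jupyter requirement.

```python
import secrets


def generate_token(num_bytes=32):
    """Return a random hex token (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


# Print a line suitable for a .env file next to docker-compose.yml.
print(f"JUPYTER_TOKEN={generate_token()}")
```

One way to use the output is to save it in a `.env` file, reference it in docker-compose.yml as `${JUPYTER_TOKEN}`, and add `.env` to `.gitignore` so the token stays out of version control.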

Network Security

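# Pin the bridge network to a fixed subnet (useful for firewall rules).
# This alone does not restrict access; to keep Jupyter off the LAN,
# publish the port on loopback only, e.g. "127.0.0.1:8888:8888".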
networks:
  pytorch-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16

Next Steps

Related Tutorials

GPU-Ready Docker Setup - Complete GPU Docker environment
CUDA Compatibility Guide - GPU compatibility matrix
ML Model Registry - Model versioning and deployment

Advanced Topics

Multi-GPU training with PyTorch Distributed
Custom Docker images for specific ML frameworks
Integration with MLflow for experiment tracking
Kubernetes deployment for production workloads

Conclusion

You've successfully set up a PyTorch development environment with:

Docker containerization for reproducibility
GPU acceleration with NVIDIA CUDA support
Jupyter Lab for interactive development
Docker Compose for easy management
GPU testing and monitoring capabilities

This environment provides a solid foundation for machine learning development and can be easily shared with team members or deployed to different machines.

Key Takeaways

Docker Containerization - Reproducible development environments
GPU Acceleration - NVIDIA CUDA integration for faster training
Jupyter Integration - Interactive notebook development
Easy Management - Docker Compose for simple deployment

Next Steps

Start Developing - Create your first PyTorch notebook
Test GPU Performance - Run GPU benchmarks and tests
Customize Environment - Add your preferred ML libraries
Share with Team - Export and share your Docker setup

Tags: #PyTorch #Jupyter #Docker #GPU #CUDA #MachineLearning #AI #Development #DockerCompose
