Appendices
Appendix C: Python Libraries and Patterns for LLM Development

C.2 Virtual Environments and Dependency Management

LLM libraries have complex dependency trees, and version conflicts are a daily reality. Isolating each project in its own virtual environment is not optional; it is essential for reproducibility and sanity. Code Fragment C.2.1 below puts this into practice.
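When a conflict does appear, pip can diagnose it directly (a quick sketch using pip's built-in checker; it inspects only packages already installed in the active environment):

```shell
# Report any installed package whose declared dependencies are
# missing or pinned at incompatible versions; exits non-zero on problems
pip check

# List exact installed versions, sorted, for diffing across machines
pip list --format=freeze | sort
```

Running pip check after every install is a cheap habit that catches broken upgrades before they surface as confusing errors at runtime.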

Option 1: venv (Built-in)

This snippet shows how to create and activate a virtual environment using Python's built-in venv module.

# Create a virtual environment
python -m venv llm-env

# Activate it
# On Linux/macOS:
source llm-env/bin/activate
# On Windows:
llm-env\Scripts\activate

# Install packages
pip install torch transformers datasets accelerate

# Freeze dependencies for reproducibility
pip freeze > requirements.txt

# Recreate the environment elsewhere (inside a fresh, activated venv)
pip install -r requirements.txt
Code Fragment C.2.1: Creating an isolated virtual environment with venv. The pip freeze command captures exact package versions for reproducibility, and requirements.txt can recreate the environment on any machine.
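The frozen requirements.txt is just a plain-text list of exact pins, one package per line (illustrative versions for the packages installed above; yours will reflect whatever pip actually resolved):

```
accelerate==0.33.0
datasets==2.21.0
torch==2.4.0
transformers==4.44.2
```

Note that pip freeze also records every transitive dependency, not just the packages you asked for, which is what makes the file fully reproducible.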

Option 2: Conda (Recommended for GPU Work)

This snippet shows how to set up a Conda environment with GPU support for LLM projects.

# Create an environment with a specific Python version
conda create -n llm-project python=3.11
conda activate llm-project

# Install PyTorch with CUDA support (conda handles CUDA toolkit)
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

# Install Hugging Face libraries via pip (within conda env)
pip install transformers datasets peft trl

# Export environment
conda env export > environment.yml
# Recreate it on another machine
conda env create -f environment.yml
Code Fragment C.2.2: Setting up a Conda environment with a pinned Python version and installing PyTorch with CUDA support. Conda manages the CUDA toolkit automatically, avoiding manual driver configuration.

Key Insight: Why Conda for GPU Work?

The main advantage of Conda over venv for LLM work is CUDA management. Installing PyTorch with conda automatically includes the correct CUDA toolkit version, sidestepping the need to install system-level NVIDIA drivers and CUDA separately. This is especially valuable on shared machines or when you need different CUDA versions for different projects.

Option 3: uv (Fast Alternative)

uv is a modern, Rust-based package manager that is dramatically faster than pip. It is gaining traction in the ML community for its speed and drop-in compatibility with pip's interface. Code Fragment C.2.3 below puts this into practice.

# Install uv
pip install uv

# Create a virtual environment and activate it
uv venv llm-env
source llm-env/bin/activate

# Install packages (uv pip mirrors the pip command line)
uv pip install torch transformers datasets
Code Fragment C.2.3: Creating a virtual environment with uv, a Rust-based package manager that resolves and installs dependencies significantly faster than pip.
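Beyond creating environments, uv can also compile a loose list of top-level dependencies into a fully pinned requirements.txt, in the style of pip-tools (a sketch; assumes uv is installed as above and can reach a package index):

```shell
# Declare only the packages you depend on directly
printf 'torch\ntransformers\ndatasets\n' > requirements.in

# Resolve the complete dependency tree to exact pinned versions
uv pip compile requirements.in -o requirements.txt

# Install precisely those versions into the active environment
uv pip install -r requirements.txt
```

Keeping requirements.in (your intent) separate from requirements.txt (the resolved result) makes upgrades deliberate: edit the former, recompile, and diff the latter.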