With PyTorch installed, the remaining libraries install easily via pip. The following commands install the full stack used throughout the textbook:
# Core Hugging Face ecosystem
pip install transformers datasets tokenizers accelerate
# Fine-tuning and alignment
pip install peft trl bitsandbytes
# Inference serving (requires Linux or WSL2; see "Installing vLLM" below)
pip install vllm
# RAG and embeddings
pip install sentence-transformers langchain langchain-openai langchain-community langchain-text-splitters chromadb
# Experiment tracking
pip install wandb mlflow
# Data processing
pip install pandas numpy scikit-learn
# Jupyter notebooks
pip install jupyterlab ipywidgets
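After running the installs above, a quick sanity check catches a broken environment early. The sketch below is illustrative — the package names come from the commands above, and the `pkg_version` helper is ours, not part of any library. It prints each installed version and, when PyTorch is present, whether a CUDA GPU is visible:

```python
from importlib import metadata

def pkg_version(dist: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None

# Core distributions from the install commands above.
for dist in ["torch", "transformers", "datasets", "accelerate", "peft", "trl"]:
    v = pkg_version(dist)
    print(f"{dist}: {v if v else 'not installed'}")

# If torch is present, check whether a CUDA GPU is visible — needed for vLLM,
# bitsandbytes, and most of the fine-tuning examples.
if pkg_version("torch"):
    import torch
    print("CUDA available:", torch.cuda.is_available())
```

Running this in a fresh environment immediately shows which of the sections below still need their dependencies installed.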
Code Fragment D.4.1: Installing the core library stack for LLM development: Hugging Face Transformers, fine-tuning tools (peft, trl), experiment tracking (wandb, mlflow), and data processing utilities.
Practical Example: Minimal Setup for Each Part of the Textbook
Parts 1 and 2 (Foundations, Understanding LLMs): pip install torch transformers datasets numpy
Part 3 (Working with LLMs): Add openai langchain langchain-openai
Part 4 (Training and Adapting): Add peft trl bitsandbytes accelerate wandb
Part 5 (Retrieval and Conversation): Add sentence-transformers chromadb langchain-community langchain-text-splitters
Parts 6 and 7 (Agents, Production): Add vllm mlflow
Installing vLLM
vLLM requires Linux and a CUDA-compatible GPU; it does not support Windows natively, so use WSL2 on Windows. Code Fragment D.4.2 below puts this into practice.
# Install vLLM (Linux only, or WSL2 on Windows)
pip install vllm
# Start a local OpenAI-compatible inference server
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --dtype auto \
    --max-model-len 4096
Code Fragment D.4.2: Installing vLLM and launching a local OpenAI-compatible inference server for Llama 3.1 8B.
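Once launched, the server speaks the OpenAI chat-completions protocol (by default on port 8000), so any OpenAI-compatible client can talk to it. The sketch below uses only the standard library; the helper names and the localhost URL are illustrative assumptions, not part of vLLM itself:

```python
import json
from urllib import request

# Assumed local defaults: vLLM serves on port 8000 unless told otherwise,
# and the model name must match the --model flag used to start the server.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def build_chat_request(prompt: str, model: str = MODEL, max_tokens: int = 128) -> bytes:
    """Encode a JSON body for the OpenAI-compatible /chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload).encode("utf-8")

def chat(prompt: str) -> str:
    """Send one user message to the local server and return the reply text."""
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=build_chat_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data["choices"][0]["message"]["content"]

# Example (requires the server from Code Fragment D.4.2 to be running):
# print(chat("In one sentence, what is vLLM?"))
```

The openai Python client works the same way: point it at the server with OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY") and call client.chat.completions.create(...) exactly as you would against the hosted API.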