With PyTorch installed, the remaining libraries install easily via pip. The following commands install the full stack used throughout the textbook:
# Core Hugging Face ecosystem
pip install transformers datasets tokenizers accelerate
# Fine-tuning and alignment
pip install peft trl bitsandbytes
# Inference serving (requires Linux or WSL2; see "Installing vLLM" below)
pip install vllm
# RAG and embeddings
pip install sentence-transformers langchain langchain-openai langchain-community langchain-text-splitters chromadb
# Experiment tracking
pip install wandb mlflow
# Data processing
pip install pandas numpy scikit-learn
# Jupyter notebooks
pip install jupyterlab ipywidgets
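After running the installs above, a quick sanity check catches a broken environment early. The sketch below is illustrative — the package names come from the commands above, and the `pkg_version` helper is ours, not part of any library. It prints each installed version and, when PyTorch is present, whether a CUDA GPU is visible:

```python
from importlib import metadata

def pkg_version(dist: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None

# Core distributions from the install commands above.
for dist in ["torch", "transformers", "datasets", "accelerate", "peft", "trl"]:
    v = pkg_version(dist)
    print(f"{dist}: {v if v else 'not installed'}")

# If torch is present, check whether a CUDA GPU is visible — needed for vLLM,
# bitsandbytes, and most of the fine-tuning examples.
if pkg_version("torch"):
    import torch
    print("CUDA available:", torch.cuda.is_available())
```

Running this in a fresh environment immediately shows which of the sections below still need their dependencies installed.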
Code Fragment D.4.1: Installing the core library stack for LLM development: Hugging Face Transformers, fine-tuning tools (peft, trl), experiment tracking (wandb, mlflow), and data processing utilities.
Practical Example: Minimal Setup for Each Part of the Textbook
Parts 1 and 2 (Foundations, Understanding LLMs): pip install torch transformers datasets numpy
Part 3 (Working with LLMs): Add openai langchain langchain-openai
Part 4 (Training and Adapting): Add peft trl bitsandbytes accelerate wandb
Part 5 (Retrieval and Conversation): Add sentence-transformers chromadb langchain-community langchain-text-splitters
Parts 6 and 7 (Agents, Production): Add vllm mlflow
Installing vLLM
vLLM requires Linux and a CUDA-compatible GPU; it does not support Windows natively, so use WSL2 on Windows. Code Fragment D.4.2 below puts this into practice.
# Install vLLM (Linux only, or WSL2 on Windows)
pip install vllm
# Start a local OpenAI-compatible inference server
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --dtype auto \
    --max-model-len 4096
Code Fragment D.4.2: Installing vLLM and launching a local OpenAI-compatible inference server for Llama 3.1 8B.
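Once launched, the server speaks the OpenAI chat-completions protocol (by default on port 8000), so any OpenAI-compatible client can talk to it. The sketch below uses only the standard library; the helper names and the localhost URL are illustrative assumptions, not part of vLLM itself:

```python
import json
from urllib import request

# Assumed local defaults: vLLM serves on port 8000 unless told otherwise,
# and the model name must match the --model flag used to start the server.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def build_chat_request(prompt: str, model: str = MODEL, max_tokens: int = 128) -> bytes:
    """Encode a JSON body for the OpenAI-compatible /chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload).encode("utf-8")

def chat(prompt: str) -> str:
    """Send one user message to the local server and return the reply text."""
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=build_chat_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data["choices"][0]["message"]["content"]

# Example (requires the server from Code Fragment D.4.2 to be running):
# print(chat("In one sentence, what is vLLM?"))
```

The openai Python client works the same way: point it at the server with OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY") and call client.chat.completions.create(...) exactly as you would against the hosted API.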