After completing the installation steps, run this verification script to confirm everything is working. Save it as verify_setup.py and run it in your activated environment.
"""
LLM Environment Verification Script
Run this to confirm your setup is ready for the textbook exercises.
"""
import sys
def check_python():
v = sys.version_info
assert v.major == 3 and v.minor >= 10, f"Need Python 3.10+, got {v.major}.{v.minor}"
print(f"[OK] Python {v.major}.{v.minor}.{v.micro}")
def check_torch():
import torch
print(f"[OK] PyTorch {torch.__version__}")
if torch.cuda.is_available():
print(f"[OK] CUDA {torch.version.cuda}, GPU: {torch.cuda.get_device_name(0)}")
vram = torch.cuda.get_device_properties(0).total_mem / 1e9
print(f" VRAM: {vram:.1f} GB")
else:
print("[WARN] No CUDA GPU detected. CPU-only mode.")
def check_transformers():
import transformers
print(f"[OK] Transformers {transformers.__version__}")
def check_optional(name):
try:
mod = __import__(name)
version = getattr(mod, "__version__", "installed")
print(f"[OK] {name} {version}")
except ImportError:
print(f"[SKIP] {name} not installed (optional)")
def test_inference():
"""Quick smoke test: load a tiny model and generate one token."""
from transformers import pipeline
gen = pipeline("text-generation", model="sshleifer/tiny-gpt2", device="cpu")
output = gen("Hello", max_new_tokens=5)
assert len(output[0]["generated_text"]) > 0
print("[OK] Inference smoke test passed")
if __name__ == "__main__":
print("=== LLM Environment Check ===")
check_python()
check_torch()
check_transformers()
for lib in ["datasets", "peft", "trl", "bitsandbytes", "wandb", "vllm"]:
check_optional(lib)
print("\n=== Smoke Test ===")
test_inference()
print("\n=== All checks complete! ===")
Expected output on a properly configured machine with an NVIDIA GPU:
=== LLM Environment Check ===
[OK] Python 3.11.7
[OK] PyTorch 2.5.1
[OK] CUDA 12.4, GPU: NVIDIA GeForce RTX 4090
VRAM: 24.0 GB
[OK] Transformers 4.47.0
[OK] datasets 3.2.0
[OK] peft 0.14.0
[OK] trl 0.13.0
[OK] bitsandbytes 0.45.0
[OK] wandb 0.19.1
[SKIP] vllm not installed (optional)
=== Smoke Test ===
[OK] Inference smoke test passed
=== All checks complete! ===
Code Fragment D.6.1: A comprehensive environment verification script that checks Python, PyTorch, CUDA, GPU availability, and all key libraries, then runs a quick inference smoke test.
Code Fragment D.6.2: Expected output from the verification script, confirming that Python, PyTorch, CUDA, and all key libraries are installed correctly.
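The check_optional helper above treats any importable module as a pass. If you also want to enforce minimum versions, the standard library's importlib.metadata can read an installed package's version without importing the package itself. A minimal sketch (the helper name check_min_version and the naive X.Y.Z comparison are ours, not part of the verification script; version strings with local suffixes like "2.5.1+cu124" would need extra parsing):

```python
from importlib import metadata

def check_min_version(name, minimum):
    """Report whether an installed package meets a version floor.
    Naive integer-tuple comparison; fine for plain X.Y.Z versions."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        print(f"[SKIP] {name} not installed")
        return False
    ok = tuple(map(int, version.split(".")[:3])) >= tuple(map(int, minimum.split(".")[:3]))
    status = "OK" if ok else "WARN"
    print(f"[{status}] {name} {version} (need >= {minimum})")
    return ok

check_min_version("pip", "20.0")
```

Because metadata.version only reads package metadata, this check is also much faster than importing heavyweight libraries like torch just to inspect their __version__ attribute.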
Fun Fact: The Setup Tax
In a 2023 survey of ML practitioners, environment setup was ranked as the second most time-consuming part of starting a new project (after data cleaning). The good news: once you have a working environment, you can clone it, export it, and reuse it across projects. The initial investment pays dividends for months.
Setup Checklist
- Confirm your GPU has enough VRAM for your target model size (see Section D.1 table).
- Install the latest NVIDIA driver and verify with nvidia-smi.
- Use Conda to create an isolated environment with the correct CUDA version.
- Install PyTorch first, then transformers and other libraries.
- Run the verification script to confirm everything works end to end.
- For cloud work, start with Google Colab (free T4) and scale to RunPod or Lambda as needed.
- Set HF_HOME to control where model weights are cached on disk.
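For the first checklist item, a back-of-the-envelope rule helps before consulting the Section D.1 table: bytes per parameter times parameter count, plus headroom for activations and the KV cache. A rough sketch (the helper name and the 20% overhead factor are illustrative assumptions, not measured values):

```python
def estimate_vram_gb(n_params_b, bytes_per_param=2, overhead=1.2):
    """Rough VRAM (in GB) needed to load a model for inference.
    n_params_b: parameter count in billions.
    bytes_per_param: 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit.
    overhead: ~20% headroom for activations and the KV cache (assumed)."""
    return n_params_b * bytes_per_param * overhead

print(f"7B model in fp16:  ~{estimate_vram_gb(7):.1f} GB")       # ~16.8 GB
print(f"7B model in 4-bit: ~{estimate_vram_gb(7, 0.5):.1f} GB")  # ~4.2 GB
```

By this estimate a 7B model in fp16 will not fit on a 12 GB card but comfortably fits in 4-bit, which is exactly the kind of decision the checklist item asks you to make up front.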
What Comes Next
Continue to Appendix E: Git, DVC, and Reproducibility for the next reference appendix in this collection.