After completing the installation steps, run this verification script to confirm everything is working. Save it as verify_setup.py and run it in your activated environment.
"""
LLM Environment Verification Script
Run this to confirm your setup is ready for the textbook exercises.
"""
import sys
def check_python():
v = sys.version_info
assert v.major == 3 and v.minor >= 10, f"Need Python 3.10+, got {v.major}.{v.minor}"
print(f"[OK] Python {v.major}.{v.minor}.{v.micro}")
def check_torch():
import torch
print(f"[OK] PyTorch {torch.__version__}")
if torch.cuda.is_available():
print(f"[OK] CUDA {torch.version.cuda}, GPU: {torch.cuda.get_device_name(0)}")
vram = torch.cuda.get_device_properties(0).total_mem / 1e9
print(f" VRAM: {vram:.1f} GB")
else:
print("[WARN] No CUDA GPU detected. CPU-only mode.")
def check_transformers():
import transformers
print(f"[OK] Transformers {transformers.__version__}")
def check_optional(name):
try:
mod = __import__(name)
version = getattr(mod, "__version__", "installed")
print(f"[OK] {name} {version}")
except ImportError:
print(f"[SKIP] {name} not installed (optional)")
def test_inference():
"""Quick smoke test: load a tiny model and generate one token."""
from transformers import pipeline
gen = pipeline("text-generation", model="sshleifer/tiny-gpt2", device="cpu")
output = gen("Hello", max_new_tokens=5)
assert len(output[0]["generated_text"]) > 0
print("[OK] Inference smoke test passed")
if __name__ == "__main__":
print("=== LLM Environment Check ===")
check_python()
check_torch()
check_transformers()
for lib in ["datasets", "peft", "trl", "bitsandbytes", "wandb", "vllm"]:
check_optional(lib)
print("\n=== Smoke Test ===")
test_inference()
print("\n=== All checks complete! ===")
Expected output on a properly configured machine with an NVIDIA GPU:
=== LLM Environment Check ===
[OK] Python 3.11.7
[OK] PyTorch 2.5.1
[OK] CUDA 12.4, GPU: NVIDIA GeForce RTX 4090
VRAM: 24.0 GB
[OK] Transformers 4.47.0
[OK] datasets 3.2.0
[OK] peft 0.14.0
[OK] trl 0.13.0
[OK] bitsandbytes 0.45.0
[OK] wandb 0.19.1
[SKIP] vllm not installed (optional)
=== Smoke Test ===
[OK] Inference smoke test passed
=== All checks complete! ===
Code Fragment D.6.1: A comprehensive environment verification script that checks Python, PyTorch, CUDA, GPU availability, and all key libraries, then runs a quick inference smoke test.
Code Fragment D.6.2: Expected output from the verification script, confirming that Python, PyTorch, CUDA, and all key libraries are installed correctly.
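The check_optional helper above treats any importable module as a pass. If you also want to enforce minimum versions, the standard library's importlib.metadata can read an installed package's version without importing the package itself. A minimal sketch (the helper name check_min_version and the naive X.Y.Z comparison are ours, not part of the verification script; version strings with local suffixes like "2.5.1+cu124" would need extra parsing):

```python
from importlib import metadata

def check_min_version(name, minimum):
    """Report whether an installed package meets a version floor.
    Naive integer-tuple comparison; fine for plain X.Y.Z versions."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        print(f"[SKIP] {name} not installed")
        return False
    ok = tuple(map(int, version.split(".")[:3])) >= tuple(map(int, minimum.split(".")[:3]))
    status = "OK" if ok else "WARN"
    print(f"[{status}] {name} {version} (need >= {minimum})")
    return ok

check_min_version("pip", "20.0")
```

Because metadata.version only reads package metadata, this check is also much faster than importing heavyweight libraries like torch just to inspect their __version__ attribute.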
Fun Fact: The Setup Tax
In a 2023 survey of ML practitioners, environment setup was ranked as the second most time-consuming part of starting a new project (after data cleaning). The good news: once you have a working environment, you can clone it, export it, and reuse it across projects. The initial investment pays dividends for months.
Setup Checklist
- Confirm your GPU has enough VRAM for your target model size (see Section D.1 table).
- Install the latest NVIDIA driver and verify with nvidia-smi.
- Use Conda to create an isolated environment with the correct CUDA version.
- Install PyTorch first, then transformers and other libraries.
- Run the verification script to confirm everything works end to end.
- For cloud work, start with Google Colab (free T4) and scale to RunPod or Lambda as needed.
- Set HF_HOME to control where model weights are cached on disk.
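For the first checklist item, a back-of-the-envelope rule helps before consulting the Section D.1 table: bytes per parameter times parameter count, plus headroom for activations and the KV cache. A rough sketch (the helper name and the 20% overhead factor are illustrative assumptions, not measured values):

```python
def estimate_vram_gb(n_params_b, bytes_per_param=2, overhead=1.2):
    """Rough VRAM (in GB) needed to load a model for inference.
    n_params_b: parameter count in billions.
    bytes_per_param: 2 for fp16/bf16, 1 for int8, 0.5 for 4-bit.
    overhead: ~20% headroom for activations and the KV cache (assumed)."""
    return n_params_b * bytes_per_param * overhead

print(f"7B model in fp16:  ~{estimate_vram_gb(7):.1f} GB")       # ~16.8 GB
print(f"7B model in 4-bit: ~{estimate_vram_gb(7, 0.5):.1f} GB")  # ~4.2 GB
```

By this estimate a 7B model in fp16 will not fit on a 12 GB card but comfortably fits in 4-bit, which is exactly the kind of decision the checklist item asks you to make up front.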
What Comes Next
Continue to Appendix E: Git, DVC, and Reproducibility for the next reference appendix in this collection.