Appendices
Appendix D: Development Environment Setup

D.1 Hardware Requirements

Large language models earn their name. Even "small" open models have billions of parameters, and each parameter requires memory. Understanding hardware requirements before you begin will save you from frustrating out-of-memory errors.

GPU (Graphics Processing Unit)

A GPU is the most important piece of hardware for LLM work. NVIDIA GPUs with CUDA support are the standard; AMD and Intel GPUs have improving support but lag in ecosystem maturity.

GPU Comparison

Task                             Minimum VRAM     Recommended GPU
Inference (7B model, 4-bit)      6 GB             RTX 3060 12GB, RTX 4060 Ti
Inference (7B model, 16-bit)     14 GB            RTX 4090 24GB, A5000
Inference (70B model, 4-bit)     40 GB            A100 80GB, 2x RTX 4090
Fine-tuning (7B, LoRA, 4-bit)    12 GB            RTX 4070 Ti, T4 (Colab free)
Fine-tuning (7B, full)           40+ GB           A100 80GB
Pretraining                      Hundreds of GB   Multi-node A100/H100 clusters
Key Insight: VRAM Is the Bottleneck

The parameter count of a model multiplied by the bytes per parameter gives a floor on the VRAM needed. A 7B parameter model in float16 (2 bytes per parameter) requires about 14 GB just for the weights; inference adds some overhead on top of that for activations and the KV cache. During training, optimizer states and gradients can triple this requirement. Quantization (loading weights in 4-bit or 8-bit precision) is the most practical way to fit larger models onto smaller GPUs.
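The rule of thumb above can be sketched as a quick back-of-the-envelope calculation. The 3x training multiplier is the rough figure mentioned in the text, not an exact accounting of optimizer state:

```python
def min_vram_gb(params_billions: float, bytes_per_param: float,
                training: bool = False) -> float:
    """Rough VRAM floor: weights alone for inference; roughly
    3x that during training (gradients + optimizer states)."""
    # 1e9 params * bytes/param / 1e9 bytes/GB cancels out neatly
    weights_gb = params_billions * bytes_per_param
    return weights_gb * 3 if training else weights_gb

print(min_vram_gb(7, 2.0))                  # 7B in float16: 14.0 GB of weights
print(min_vram_gb(7, 0.5))                  # 7B in 4-bit:    3.5 GB
print(min_vram_gb(7, 2.0, training=True))   # full fine-tune: 42.0 GB
```

The numbers line up with the table above: full fine-tuning of a 7B model lands in the 40+ GB range, which is why LoRA with 4-bit quantization is the usual route on consumer GPUs.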

RAM (System Memory)

You need sufficient system RAM to load model weights before transferring them to the GPU. A general rule: have at least as much system RAM as GPU VRAM, plus extra for your dataset and operating system. For most setups, 32 GB of system RAM is comfortable; 16 GB is workable but tight.

Disk Space

Model weights are large files. A 7B parameter model typically occupies 13 to 14 GB on disk in float16, or about 4 GB in 4-bit quantized GGUF format. If you plan to download several models, allocate at least 100 GB of free disk space. SSDs are strongly preferred over HDDs for loading speed.
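A quick way to verify you have that headroom before kicking off downloads is a standard-library check; this is a minimal sketch, with the 100 GB threshold taken from the guidance above:

```python
import shutil

def free_gb(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in GB."""
    return shutil.disk_usage(path).free / 1e9

# Warn before downloading several models if headroom is below ~100 GB:
if free_gb() < 100:
    print(f"Only {free_gb():.0f} GB free; consider clearing space first.")
```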

Hugging Face Cache Location

By default, the transformers library caches downloaded models in ~/.cache/huggingface/hub/. This directory can grow to tens or hundreds of gigabytes. To move it to a larger drive, set the environment variable HF_HOME=/path/to/your/cache before running any code. You can also use huggingface-cli scan-cache to see what is cached and huggingface-cli delete-cache to clean up.
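If you prefer to inspect the cache without the CLI, a small standard-library sketch does the job. The path logic assumes the defaults described above ($HF_HOME, falling back to ~/.cache/huggingface, with models under its hub/ subdirectory); note that if you set HF_HOME from within Python, it must happen before importing transformers, since the cache path is resolved at import time:

```python
import os

def dir_size_gb(path: str) -> float:
    """Total size of all regular files under `path`, in GB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):  # skip broken symlinks
                total += os.path.getsize(fp)
    return total / 1e9

cache = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
hub = os.path.join(cache, "hub")
if os.path.isdir(hub):
    print(f"{hub}: {dir_size_gb(hub):.1f} GB cached")
```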