How techniques like model pruning, quantization, and knowledge distillation can optimize LLMs for faster, cheaper inference.
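
To make one of these techniques concrete, here is a minimal sketch (not from the article) of post-training dynamic quantization with PyTorch. The tiny feed-forward network is an illustrative stand-in for a transformer block, and the layer sizes are assumptions chosen for the example.

```python
# Minimal sketch: post-training dynamic quantization with PyTorch.
# The model below is a toy stand-in for an LLM feed-forward block;
# sizes (768/3072) are illustrative assumptions, not from the article.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
)
model.eval()

# Convert Linear weights from fp32 to int8; activations are quantized
# on the fly at inference time, trading a little accuracy for lower
# memory use and faster CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)
with torch.no_grad():
    print(quantized(x).shape)  # torch.Size([1, 768])
```

Dynamic quantization is the lowest-effort entry point because it needs no calibration data or retraining; pruning and knowledge distillation, by contrast, typically involve additional training steps.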