Running VLLM

Typical runs:

vllm serve --dtype=half --max_model_len 3424  Qwen/Qwen2.5-1.5B-Instruct