vLLM on Google Cloud TPU: A Model Size vs Chip Cheat Sheet (With Interactive Tool)

Dev.to