KV Cache and Inference Scheduling: Energy Modeling for High-QPS Services