Inference Job Starvation During Sale Events
At ~4k concurrent users, rec inference batch jobs started queuing. The 4x increase in queue depth was traced to a vCPU quota block; approval of the add-on quota was delayed by a weekend. Ops workaround: temporarily scaled down a less critical logging pipeline to free vCPUs.
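
The workaround above amounts to reclaiming vCPUs from the least critical workloads first until the shortfall is covered. A minimal sketch of that planning step follows. It is illustrative only and is not the tooling ops used: the `Workload` fields, the `plan_scale_down` helper, and the priority scheme (lower number means less critical) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload sharing the vCPU quota."""
    name: str
    vcpus: int      # vCPUs currently allocated
    min_vcpus: int  # floor below which the workload must not be scaled
    priority: int   # assumed convention: lower value = less critical


def plan_scale_down(workloads, vcpus_needed):
    """Return (plan, shortfall).

    plan maps workload name -> vCPUs to release, taking from the least
    critical workloads first. shortfall is the number of vCPUs that could
    not be reclaimed without breaching a workload's floor.
    """
    plan = {}
    remaining = vcpus_needed
    for w in sorted(workloads, key=lambda w: w.priority):
        if remaining <= 0:
            break
        reclaimable = max(0, w.vcpus - w.min_vcpus)
        take = min(reclaimable, remaining)
        if take > 0:
            plan[w.name] = take
            remaining -= take
    return plan, max(0, remaining)
```

With a hypothetical logging pipeline at 32 vCPUs (floor of 8) and an inference pool pinned at its floor, a request for 16 vCPUs would be met entirely by the logging pipeline. A request for 40 would reclaim 24 and report a shortfall of 16, which is the portion that would still depend on the quota approval.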