Rate limits

deepface.dev applies rate limiting and queue controls at the gateway.

Enforcement points

  • Per-account requests per minute
  • Maximum inflight compute requests
  • Queue depth
  • Queue timeout

Retry behavior

A 429 rate_limited response includes a Retry-After value, in seconds, that tells the client how long to wait before retrying. 503 queue_full and 503 queue_timeout indicate temporary capacity pressure at the gateway. Treat both 503 responses as retryable with backoff unless your own SLA policy says otherwise.
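
The retry guidance above can be sketched as a small client-side helper. This is a minimal illustration, not an official client. It assumes Retry-After arrives as a response header and that the error code (rate_limited, queue_full, queue_timeout) can be read from the response as a string. The names retry_delay, backoff, and call_with_retries, the backoff parameters, and the send callable are all hypothetical.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# Error codes named in the text above. How they appear in a response is an assumption.
RETRYABLE_503 = {"queue_full", "queue_timeout"}

# Hypothetical response shape: (HTTP status, error code, headers).
Response = Tuple[int, str, Mapping[str, str]]


def backoff(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def retry_delay(status: int, error_code: str, headers: Mapping[str, str],
                attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the response is not retryable."""
    if status == 429 and error_code == "rate_limited":
        try:
            # Assumes headers is a plain mapping keyed exactly as "Retry-After".
            return max(0.0, float(headers["Retry-After"]))
        except (KeyError, ValueError):
            return backoff(attempt)  # header missing or malformed
    if status == 503 and error_code in RETRYABLE_503:
        return backoff(attempt)
    return None


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-retryable response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        response = send()
        delay = retry_delay(*response, attempt)
        attempt += 1
        if delay is None or attempt >= max_attempts:
            return response
        sleep(delay)
```

Here send would wrap whatever HTTP client you use and map its response into the tuple shape above. Capping the number of attempts keeps a client from adding sustained load while the gateway is under capacity pressure, and the jitter spreads retries from many clients over time.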

Soft-launch note

Limits are intentionally conservative during the soft launch. Teams with higher throughput needs are onboarded manually so concurrency and queue settings can be tuned safely.