Back to overview
Degraded
[RETROACTIVE] Service Degradation
Oct 29 at 11:50am CET
Affected services
REST API
Dashboard UI
Webhook Delivery via [clever.par.1]
Webhook Delivery via [scw.ams.1]
Resolved
Oct 29 at 05:15pm CET
Incident is resolved.
Affected services
Created
Oct 29 at 11:50am CET
Summary
We are seeing 100% database CPU usage. Analysis shows two causes: lots of malicious subscriptions have been created by a customer's user and a misconfiguration on our side has amplified the impact on our database.
Impact
- Significant slowdowns and timeouts in our API
- Delay in processing webhooks
Resolution
A few improvements are made to reduce the impact of housekeeping operations to database's CPU. After disabling incriminated subscriptions, webhooks queue starts to reduce and system recovers.
Next Steps
- patch worker so that it ignores webhooks from disabled subscriptions
- design and implement automatic deactivation of long-failing subscriptions
Affected services