Degraded performance on EU instance

Incident Report for seatsio

Postmortem

This is the postmortem for two incidents, one on 27 November and one on 5 December.

Both incidents occurred in the EU region; no other seats.io regions (NA, SA, OC) were impacted.

On 27 November, between 14:17 and 14:53 CET, the seats.io EU instance experienced an extraordinarily high number of requests, partly because of a DoS attack on the systems of (at least) one of our customers. At the same time, it was experiencing unexpected memory issues, which required periodic server restarts.

We identified the faulty API endpoint and issued a fix on 28 November: https://docs.seats.io/changelog/#version-786---28112024. Multiple testing rounds showed that the memory issue was indeed fixed.

However, yesterday, 5 December, between 16:47 and 17:02 CET, the seats.io EU region experienced downtime once again. Some customers reported not being able to reach the server at all; others experienced intermittent timeouts but were still able to access the system. There were no memory issues this time around. Once more, though, the EU instance was under an extraordinarily high load: multiple concurrent on-sales, combined with another DoS attack on the systems of (at least) one customer.

Analysis of this incident uncovered a second, unrelated issue in the CDN setup. For security reasons, we can’t go into detail, but the CDN behaved unexpectedly under load. The issue was positively identified and fixed earlier today (6 December).

In conclusion, two separate and unrelated issues were in play at the same time during the 27 November incident: a memory issue and a CDN issue. We failed to notice the CDN issue at the time because the memory issue masked it.

Both issues are now fixed. As seats.io continues to grow, we are working hard to continually improve the security and throughput of the seats.io API.

If you have any further questions, please don’t hesitate to contact us at support@seats.io.

Posted Dec 06, 2024 - 12:33 CET

Resolved

This incident has been resolved.

Posted Dec 05, 2024 - 19:03 CET

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Dec 05, 2024 - 18:25 CET

Update

We are continuing to investigate this issue.

Posted Dec 05, 2024 - 17:16 CET

Investigating

We're seeing slow requests on the eu instance. Other regions (na, sa, oc) are not impacted.

Posted Dec 05, 2024 - 17:12 CET

This incident affected: Seats.io API, Seats.io Web Application, and Seats.io Rendered Charts.