Syncs are not starting on expected schedule | REST API endpoints are unavailable | Dashboard login fails
Incident Report for Fivetran
Postmortem

What happened:

The service which handles storing and retrieving credentials failed due to a certificate issue resulting in failures of all core services that interact with it (connector syncs, transformations, setup test runners, API etc.). The outage lasted ~88 minutes and occurred on 2022-05-27 from 20:43 UTC to 22:45 UTC.

Resolution:

A new certificate was created and installed. The service was then manually restarted and Fivetran services were restored.

Root Cause Analysis:

Our alert system informed us 2 months ago about the majority of client certificates expiring May 27. A project was created to replace all expiring certs. However, this alerting system was unable to detect the expiring server certificate.

What Could Be Improved?

We are enhancing the certificate expiring alerting system to include the server certificates of the credential storage system.

Posted Jun 04, 2022 - 02:27 UTC

Resolved
Prior incident was accidentally deleted. This is to backfill the incident. Postmortem will have timeline of the outage.
Posted May 27, 2022 - 21:00 UTC