WPS Service is down, affecting all connectors that use Webhooks
Incident Report for Fivetran
Postmortem

Issue: The webhook client shared by all webhook connectors calls a Google Cloud Metric API/SDK, which is returning an empty response due to a Google Cloud incident. The client needs this API to fetch the oldest unacknowledged message so that it can sync the latest webhook data.

Cause/Impact: All webhook connectors are failing with this error and cannot sync webhook events. However, the events are still being captured properly, and the next successful run should process all of them.
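
As an illustration of the failure mode described above, here is a minimal sketch of how a sync planner could treat an empty metrics response as "unknown" and defer the sync, rather than as "no backlog". This is not Fivetran's implementation: `MetricPoint`, `EmptyMetricResponse`, `oldest_unacked_age`, and `plan_sync` are hypothetical names, and the data points stand in for whatever the real metrics API returns.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MetricPoint:
    """Hypothetical stand-in for one data point returned by a metrics API."""
    timestamp: float  # seconds since epoch when the point was recorded
    value: float      # age, in seconds, of the oldest unacknowledged message


class EmptyMetricResponse(Exception):
    """Raised when the metrics API returns no data points at all."""


def oldest_unacked_age(points: Sequence[MetricPoint]) -> float:
    """Return the most recent reading of the oldest-unacknowledged-message age."""
    if not points:
        # An empty response means the age is unknown, not that it is zero.
        raise EmptyMetricResponse("metrics API returned no data points")
    latest = max(points, key=lambda p: p.timestamp)
    return latest.value


def plan_sync(points: Sequence[MetricPoint], now: float) -> Optional[float]:
    """Return the timestamp to start reading from, or None to defer the sync."""
    try:
        age = oldest_unacked_age(points)
    except EmptyMetricResponse:
        # Captured events stay in storage; a later run can pick them up.
        return None
    return now - age
```

In this sketch a deferred sync loses nothing, because the events remain stored and the next run that receives a non-empty response resumes from the oldest unacknowledged message.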

Actions Taken

Posted Oct 13, 2021 - 19:49 UTC

Resolved
The root cause was an incident within Google Cloud that caused the API calls we use to read from the webhook storage buckets to suddenly return empty responses:

https://status.cloud.google.com/incidents/rXqQALuw55aCKd2QHfM3

We do not expect any data loss: all events exist within our storage buckets, and the failure was in the API calls that read from those buckets. Events received during the incident will be synced on the next successful webhook sync.
Posted Oct 13, 2021 - 19:46 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Oct 13, 2021 - 17:24 UTC
Update
The fix is being deployed and will be in production within the next two hours.
Posted Oct 13, 2021 - 15:46 UTC
Update
A fix is in development and is currently going through testing and review.

The following connectors are currently affected:

- AppsFlyer
- Branch
- Eloqua
- GitHub
- Greenhouse
- Helpscout
- HubSpot
- Intercom
- Iterable
- Jira
- Mandrill
- Pipedrive
- Recharge
- Segment
- Sendgrid
- Shopify
- Snowplow
- Webhook
Posted Oct 13, 2021 - 14:56 UTC
Update
While syncing our Webhook-based connectors, we use Google Cloud metrics to fetch the oldest unacknowledged message, which allows us to sync the latest Webhook data. However, the Google Cloud Monitoring API is returning an empty response.

We are able to capture events properly, so there is no data loss, but we cannot sync that data.

We are currently investigating the root cause for the empty response.
Posted Oct 13, 2021 - 14:29 UTC
Identified
The issue has been identified and we are working to resolve it.
Posted Oct 13, 2021 - 14:24 UTC
This incident affected: Connectors (Events).