Database connectivity
Incident Report for Cyan Mobility
Postmortem

Root cause of the issues on 7/27/18, 7/29/18, and 7/30/18

These issues were caused by one instance of the Cyan API failing to connect to the Azure SQL and MongoDB clusters due to a physical connection issue on the VM host. The Blue Dot team took steps to restart and redeploy the instance to a new host, but because of connection issues, this fix was not always successful.

To mitigate the chances of this happening again, we have done the following:

  1. Implemented an automation script that restarts an API instance if a connection issue is detected on that instance.
  2. Scaled Azure SQL to a higher performance tier which supports a larger connection pool.
  3. Doubled memory capacity on MongoDB primary servers.
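
For readers curious how the check in item 1 might work, the following is a minimal sketch and not the actual Blue Dot script. It assumes the check is a plain TCP probe of each database endpoint; the function names, the endpoint list, and the restart hook are all hypothetical.

```python
import socket
from typing import Callable, Iterable, Tuple

Endpoint = Tuple[str, int]  # (host, port); hypothetical values are supplied by the caller


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_and_restart(
    endpoints: Iterable[Endpoint],
    restart: Callable[[], None],
    probe: Callable[[str, int], bool] = can_connect,
) -> bool:
    """Probe every endpoint and call restart() once if any probe fails.

    Returns True if a restart was triggered. The restart callable is an
    assumption: it stands in for whatever actually restarts the API instance.
    """
    failed = [ep for ep in endpoints if not probe(*ep)]
    if failed:
        restart()
        return True
    return False
```

A scheduler would run check_and_restart on each instance at a fixed interval, with a restart hook that calls the hosting platform's own restart mechanism.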

We are also testing a new MongoDB feature, retryable writes, which lets a write be retried after a transient connection failure. This should further help avoid these types of issues in the future.
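
As a hedged illustration of how retryable writes are typically switched on, the sketch below adds the standard retryWrites=true option to a MongoDB connection string. It does not reflect the Cyan API's actual configuration: the helper name and the host names in the test are hypothetical. The feature requires MongoDB 3.6 or later running as a replica set or sharded cluster.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_retryable_writes(uri: str) -> str:
    """Return the MongoDB connection string with retryWrites=true set.

    Any existing retryWrites option is replaced; other options are kept
    in their original order.
    """
    parts = urlsplit(uri)
    # Connection string option names are case-insensitive, so compare lowercased.
    options = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "retrywrites"
    ]
    options.append(("retryWrites", "true"))
    # A "/" must precede the "?" that starts the options section.
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(options), parts.fragment)
    )
```

With this option set, a supporting driver retries certain write operations once after a network error or a primary failover.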

Thank you,

Blue Dot Support

Posted Aug 03, 2018 - 09:28 PDT

Resolved
One instance of the API was having outbound connection issues to SQL for request header validation and to MongoDB. We're working with Azure to determine why this instance is having these issues.
Posted Jul 29, 2018 - 23:00 PDT