Good morning, Denali users. A telecom issue has been identified that is affecting inbound calls. Our team is actively working to diagnose and address the issue. More information will be provided here as it becomes available.
The issue is affecting only inbound calls; outbound calls are not affected.
UPDATE 11:00am PT: Service to affected numbers is restored. We have confirmed that only a small number of older inbound lines were affected. The underlying cause was an issue with an upstream provider. We are working with them to obtain an RCA.
UPDATE 11:20am PT: Based on our findings, it appears that broader telephony issues are occurring on a large scale. We have identified an issue with a different provider that is also impacting inbound calls. As before, we are working with the provider to maintain performance. We will continue to post updates as more information becomes available.
UPDATE 2pm PT: The second issue has been confirmed by the provider to be resolved. We are currently awaiting an RCA from them.
ROOT CAUSE ANALYSIS
On February 19, 2020, Dialsource experienced an event that affected some inbound phone numbers. Dialsource utilizes multiple inbound providers, which deliver calls to multiple call processing servers. At approximately 9:46 am PST, one of the call processing servers experienced a hardware failure that required a reboot to correct. Unfortunately, due to the age of the configuration, these particular inbound phone numbers were not able to fail over to alternate equipment and were not part of our standard production monitoring. This ultimately contributed to a delay in our response to the issue. By 10:32 am PST, Dialsource personnel were working with their hosting provider to determine the root cause of the issue. At 10:39 am PST, it was discovered that an out-of-memory condition on the server had caused processes to fail intermittently. The servers were rebooted and service was restored by 10:59 am PST.
Unrelated to this incident, one of Dialsource's upstream carriers supplying inbound phone numbers also experienced an issue that overlapped with the above-mentioned event. The event experienced by the carrier caused intermittent delays in inbound call processing. During this second event, users would have experienced one of three outcomes: no noticeable effect, calls failing to set up properly, or calls hanging up within 1 minute of being answered. This event continued to improve from 11:00 am PST until about 1:15 pm PST, when it cleared. The upstream carrier has explained that the cause was a networking issue on their side that has already been completely addressed.
Due to the way inbound calls work, it is not possible to add additional carrier redundancy on a single number. However, Dialsource takes your calls very seriously and implements redundancy wherever possible. In all of Dialsource's interconnections, carriers utilize multiple switches that send calls to multiple call processors within Dialsource. Because of the way calls are processed and how failover works, it is possible for a major issue to cause failures that cascade across server boundaries. This is a very unlikely scenario and is the direct result of the age of that equipment. Dialsource has been in the process of migrating services off of this dated equipment. This migration has now been escalated to a priority to ensure that we can continue to offer the reliable service you expect.
It is important to note that this event did not affect outbound calls in any way. Outbound calling utilizes a completely separate pair of redundant call processors, and the affected phone numbers will be migrated to this cluster of call processors. Your CSM team will keep you informed of our progress as this migration occurs. No action will be required on your part to complete this migration, and it will be performed after hours to minimize the impact.