API Performance Outliers Based on Location

Monitoring API performance from locations around the world is critical to knowing what your global customers are experiencing. The “Weekly API Science Report” that comes with your API Science membership summarizes the status of the APIs you are monitoring. A week ago, for my post “Why Location Matters in API Performance Testing,” I created four monitors for the World Bank’s Countries API, each calling the API from a different location around the world. The weekly report I received for the May 11-18 period showed this result:

20150518-weekly-report

The monitors with “br” in their names are my calls to the World Bank Countries API for Brazil. The report shows that there were no failed calls in the past week, and that the average latency when the API was called from Washington, DC, USA, was much lower than when it was called from elsewhere. This suggests that the World Bank’s Countries API data center may be located in Washington, DC (which wouldn’t be surprising, since the World Bank headquarters is in Washington, DC).

So, that’s the overview of the average API latency your customers are seeing, based on their location. But API Science allows you to dig deeper into that data. You can view the results of every monitor test, and when you do, you may find that API latency sometimes swings to large extremes, even while the API remains up.

Let’s say you have a chat system that lets users reach you when they are experiencing a problem. Someone contacts your team to say your application is timing out. As it turns out, that person’s location may be the reason they are seeing timeouts, while your application is up for the rest of the world.

For example, on May 13 at 23:10 local time (May 14 3:10 GMT), there was an anomaly in the performance of the World Bank Countries API for systems that were accessing the API from Washington, DC:

World Bank Countries API May 13, 2016 Performance Anomaly

The weekly API report states that the Countries API was called from Washington, DC 1406 times during the week that includes May 13, and that the average latency for the call was 39 msec. Yet the API call at 23:10 on May 13 took 797 msec to complete, more than 20 times the average.

Consider the impact if your application made dozens of calls to the API to handle a single button click by a user. For users who access your application through your Washington, DC data center, your application could appear to be down, due to timeout errors.
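To make that impact concrete, here is a rough sketch of how per-call latency adds up when one user action triggers many sequential API calls. The call count (24) and the 10-second timeout are hypothetical values chosen for illustration; only the 39 msec and 797 msec latencies come from the report. It also assumes the worst case, in which every call in the sequence hits the same delay.

```python
# Hypothetical values for illustration; not taken from the report.
CALLS_PER_CLICK = 24      # "dozens" of sequential API calls per button click
TIMEOUT_MS = 10_000       # assumed client-side timeout for the whole action


def total_latency_ms(per_call_ms, calls):
    """Total time for `calls` sequential requests at a fixed per-call latency."""
    return per_call_ms * calls


def would_time_out(per_call_ms, calls, timeout_ms):
    """True if the summed latency of the sequential calls exceeds the timeout."""
    return total_latency_ms(per_call_ms, calls) > timeout_ms


if __name__ == "__main__":
    # 39 msec is the weekly average from Washington, DC; 797 msec is the outlier.
    for label, latency in (("average", 39), ("outlier", 797)):
        total = total_latency_ms(latency, CALLS_PER_CLICK)
        timed_out = would_time_out(latency, CALLS_PER_CLICK, TIMEOUT_MS)
        print(f"{label}: {total} msec total, timeout={timed_out}")
```

Under these assumptions, the action finishes in under a second at the average latency but takes about 19 seconds at the outlier latency, well past the assumed timeout.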

Meanwhile, the performance for accessing the Countries API from the other locations at that time was well within the norm. For example, the average performance for accessing the API from Ireland that week was 214 msec, and the access time at GMT 3:10 on May 14 was 194 msec, within 10% of the average value. The performance from Tokyo at that time was within 5% of the average value for the week.

So, what happened when we accessed the World Bank Countries API from Washington, DC at GMT 3:10 on May 14? The World Bank’s data center appears to be in Washington, DC. API performance from locations other than Washington, DC was nominal, yet latency from Washington, DC was 1944% above normal!
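The comparison above is simple arithmetic, and a small script can apply it to every location at once. The sketch below uses only the latencies quoted in this post (Washington, DC and Ireland; the Tokyo figures were not given, so they are left out). The 50% outlier threshold is an arbitrary assumption, not an API Science setting.

```python
def percent_deviation(sample_ms, mean_ms):
    """Percent by which a single sample differs from the weekly mean."""
    return (sample_ms - mean_ms) / mean_ms * 100.0


def is_outlier(sample_ms, mean_ms, threshold_pct=50.0):
    """Flag a sample whose deviation exceeds the (assumed) threshold."""
    return abs(percent_deviation(sample_ms, mean_ms)) > threshold_pct


# (sample at GMT 3:10 on May 14, weekly average), both in msec, from the post.
observations = {
    "Washington, DC": (797, 39),
    "Ireland": (194, 214),
}

if __name__ == "__main__":
    for location, (sample, mean) in observations.items():
        pct = percent_deviation(sample, mean)
        status = "OUTLIER" if is_outlier(sample, mean) else "normal"
        print(f"{location}: {pct:+.0f}% vs. weekly average ({status})")
```

Washington, DC comes out about 1944% above its weekly average, while Ireland is about 9% below its own.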

To find out why, we can click on the date/time stamp for that specific monitor test instance. Clicking on “05/13/16 23:10” brings up the details page for that monitor run. At the top of this page, we see:

br-was-may13-detail

This page shows us that the abnormal performance was due entirely to a “Resolve” problem. In other words, it took a very long time (783.64 msec) to establish initial communication between our Washington, DC data center and the World Bank’s data center. Once that link was established, the remainder of the communication and processing took only 12.88 msec.
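If you want to reproduce this kind of breakdown yourself, a rough sketch follows. It assumes the “Resolve” phase corresponds to DNS name resolution, which this post does not state explicitly, and it is not how API Science measures timings. The function names and the injectable `resolver` parameter are illustrative.

```python
import socket
import time


def timed_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return the time in msec spent resolving `host` (assumed: DNS lookup)."""
    start = time.perf_counter()
    resolver(host, port)
    return (time.perf_counter() - start) * 1000.0


def phase_share(resolve_ms, remainder_ms):
    """Return (total msec, fraction of the total spent in the resolve phase)."""
    total = resolve_ms + remainder_ms
    return total, resolve_ms / total


if __name__ == "__main__":
    # Figures from the monitor detail page for the May 13 anomaly.
    total, share = phase_share(783.64, 12.88)
    print(f"total {total:.2f} msec, resolve share {share:.1%}")
```

With the figures from the detail page, the resolve phase accounts for roughly 98% of the 797 msec total. Calling `timed_resolve("api.worldbank.org")` would time a live lookup from your own machine; treat that hostname as an assumption, since the post does not give the API’s URL.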

This suggests that some sort of outage or delay occurred in the Internet infrastructure of the Washington, DC area at that time. What actually happened is likely knowable only by the companies that manage that infrastructure. Still, if you received calls or messages from your East Coast USA customers about the performance of your API around GMT 3:10 on May 14, your API Science monitor would have enabled you to provide an answer: “It wasn’t our API that went down; rather, there was an Internet glitch in the Washington, DC area that affected our data feeds from the World Bank.”

Kevin Farnham