In my previous two posts, "Use cron and curl to Regularly Download API Performance Data" and "How to Use Python to Extract API Performance Data," I illustrated how your team can automate the retrieval and parsing of API performance data. In this post, I illustrate how you can use the Python Matplotlib library to create plots of the downloaded data that will be of value to your API performance monitoring team.
In the first article, we utilized the API Science Performance Report API to download JSON data and store it in a file on our local disk. In the second article, we created a Python script that reads the JSON text file into a Python dictionary. Now, we can use Matplotlib to create plots of the data in the Python dictionary.
The JSON data contains 168 results (one week of data for our example monitor) that look like this:
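The raw records aren't reproduced here, but based on the fields used in this series, the overall shape of the download resembles the following (the values and timestamp are illustrative, and the API returns additional fields omitted for brevity):

```json
{
  "meta": {
    "numberOfResults": 168,
    "endPeriod": "2018-05-27T00:00:00Z"
  },
  "data": [
    {
      "averageResolve": 12.5,
      "averageConnect": 40.2,
      "averageTotal": 256.7
    }
  ]
}
```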
The Python dictionary allows us to access this data using the JSON titles for each data item. In other words, the Python dictionary contains individual arrays titled averageResolve, averageConnect, etc.
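As a quick sketch of that access pattern (using a small hand-built JSON string in place of the real download; the values are illustrative), a field is reached via the same titles that appear in the JSON:

```python
import json

# Tiny stand-in for the real perf.json download; values are illustrative
perf = json.loads("""
{
  "meta": {"numberOfResults": 2},
  "data": [
    {"averageTotal": 256.7},
    {"averageTotal": 301.4}
  ]
}
""")

# Each result's fields are addressed by their JSON titles
print(perf['meta']['numberOfResults'])
print(perf['data'][0]['averageTotal'])
```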
In this Matplotlib example, we'll plot the averageTotal data: the total number of milliseconds between the initiation of a request to the API and the completed download of the response.
Here’s the complete script for reading the API performance data JSON file and creating a plot of the averageTotal data:
# generate_report - 27 May 2018
# generate a report based on JSON data
import numpy as np
import matplotlib.pyplot as plt
import json

with open('perf.json') as f:
    perf = json.load(f)

n_results = perf['meta']['numberOfResults']
print(n_results, 'results')

hourly_perf_total = np.zeros(n_results, dtype=float)
for i in range(n_results):
    print(i, perf['data'][i]['averageTotal'])
    hourly_perf_total[i] = perf['data'][i]['averageTotal']

plt.plot(hourly_perf_total)
plt.xticks(np.arange(0, n_results + 1, 24.0))
plt.ylabel('Average Total Milliseconds')
plt.xlabel('Hours Since ' + perf['meta']['endPeriod'])
title = 'Monitor 1572020 Past Week Performance'
plt.title(title)

# log y axis
plt.semilogy()
plt.grid(True)
plt.show()
perf is the dictionary into which the JSON was read, as described in the last post. n_results is the number of results stated in the JSON download from the API Science API (a week of data averaged in one-hour time bins).
Following the printing of n_results, we initialize hourly_perf_total, a floating-point array sized to the number of results. Then, in the for i in range(n_results) loop, we copy the 168 averageTotal values into the hourly_perf_total array. (Programmers well versed in Python will note that this element-by-element copy is unnecessary; I did it this way to provide clarity for non-Python programmers who might read this post.)
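For the record, the idiomatic shortcut that parenthetical alludes to builds the array in a single step with a list comprehension (sketched here with a small stand-in for perf['data']):

```python
import numpy as np

# Stand-in for perf['data']; in the real script this comes from the JSON file
data = [{'averageTotal': 256.7}, {'averageTotal': 301.4}, {'averageTotal': 1120.0}]

# Build the array in one step instead of copying element by element
hourly_perf_total = np.array([r['averageTotal'] for r in data], dtype=float)
print(hourly_perf_total)
```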
Now we have an hourly_perf_total array that contains 168 values representing the total performance data for the API binned hourly over the past week.
The remaining lines use Matplotlib to produce a plot of the past week’s total performance, for example:
The average total milliseconds are plotted on a logarithmic scale to highlight for the team the instances where the API’s performance was particularly poor. For example, the two peaks that reach or exceed 1000 milliseconds (one second) could represent times when many customers perceived your product as down, especially if this API is only one of many APIs your product depends on.
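If you also want those poor-performance hours called out in text rather than spotted by eye, a small NumPy filter can list them (a sketch with stand-in data; the 1000 ms threshold is simply the one-second level discussed above):

```python
import numpy as np

# Stand-in hourly totals in milliseconds; real values come from the API data
hourly_perf_total = np.array([256.7, 301.4, 1120.0, 280.3, 998.9, 1564.2])

# Indices (hours) where the average total reached one second or more
slow_hours = np.where(hourly_perf_total >= 1000)[0]
print(slow_hours)
```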
In the first three posts of this series, we’ve downloaded a week of API performance data from the API Science API, read the JSON into a Python dictionary, and used Matplotlib to create a plot of the total performance data — and accomplished all of this using a one-line csh script and 20 simple lines of Python code!
In my next post, we’ll create a simple web site that publishes this information to your team as they need it, wherever they are located at the time.