What are APIs? Why they matter and how to use them
If you work with data, you’ve heard the three-letter initialism before. If you don’t, you’ve almost certainly seen others mention APIs in the context of powering data visualizations, or seen one listed as a source in an investigative report. In any case, one can’t help but suspect their importance, but many may still be wondering: What exactly are APIs? Why are they such valuable tools?
This post will provide a basic introduction to APIs, using a practical example to clarify what they are, how simple it is to use them and why it’s so essential for anybody who works with data to become comfortable using them.
What does “API” stand for?
In its most abstract definition, an API (Application Programming Interface) is a specification for a computer component that exposes its functionality. Although APIs are implemented in many forms, when nondevelopers refer to them, they typically mean a particular kind called a web API. These proliferated on the Internet as a means to connect the data and functionality of web applications. Most major websites that you may frequent, such as Twitter, The New York Times and even the U.S. Government, have their own web APIs that provide methods for querying and communicating with their respective internal databases. Together, these allow for powerful interconnectivity and mashing of data from across the Internet.
Sounds like something for programmers…
Au contraire! If you’re able to use a web browser, then you possess all the prerequisite skills that you need. It may take some time before you become comfortable, but being able to use APIs provides a powerful vehicle for accessing data as quickly as it goes online. You will have a direct pipeline, so you won’t need to rely on a developer or third party to send you bulk data dumps. Further, if something is missing from a dump or you want additional information, you may not be able to obtain it immediately; with an API, however, you have the freedom to explore and grab data as you need it. And the best part is that there are no arcane barriers to stop you: good APIs have excellent documentation to help you find what you’re looking for.
Okay, so how do I use one?
Perhaps the best way to learn is through an example. Suppose your task is to research state bills and you want to compare bills from Virginia and Maryland that pertain to crime. To obtain the data, you could navigate to each state’s government website, track down the bills manually and copy the data you want by hand, one by one, into a spreadsheet for analysis. Of course, this would be a lot of tedious and unnecessary work.
Let’s first make the task much simpler by using Sunlight’s Open States project to access the legislative data in a normalized and comparable form. While one could browse and search this site for the desired data, the problem of transforming it into an easily usable form is still present.
Enter the API.
The API from Open States can provide us with the data we seek in an open, machine-readable and structured format called JSON. You can read more in Sunlight’s Open Data Policy Guidelines, but essentially this format maximizes transferability and makes analysis of large datasets significantly easier.
Your first API call
Let’s examine the API documentation for Open States now. We see that there are six core “data types” and dozens of “methods” for querying them. We’re searching for bills so let’s examine the bill search method. This page shows us how we can use the method along with examples.
Let’s construct our query now.
Communication with an API is done over HTTP, which means access is virtually no different than navigating to any other URL in our browser. For now, we’re literally going to type our query into the URL bar. From the documentation, we see that we can filter bills by both “state” and full-text query “q” attributes. If we combine the base URL http://openstates.org/api/v1 with the method /bills and our filter /?state=VA&q=crime, then we have our query http://openstates.org/api/v1/bills/?state=VA&q=crime.
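For readers who would rather script this step than type it into the URL bar, here is a minimal Python sketch of the same assembly. The helper name build_query is made up for illustration, and the base URL is the one quoted in this post, which may no longer be live. Nothing is sent over the network; the code only builds the URL string.

```python
from urllib.parse import urlencode

# Base URL as quoted in the post (assumption: it may no longer be served).
BASE_URL = "http://openstates.org/api/v1"


def build_query(method, **filters):
    """Join the base URL, a method path and filter parameters into one URL.

    `build_query` is a hypothetical helper for this example, not part of
    any Open States library.
    """
    return f"{BASE_URL}/{method.strip('/')}/?{urlencode(filters)}"


# The "state" and "q" filters come from the bill search method in the post.
query = build_query("bills", state="VA", q="crime")
print(query)
```

Using urlencode rather than gluing strings together means that any spaces or special characters in a search term are escaped for you.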
Drat! Why am I being denied access?
Many APIs require a token or key for authorization purposes. This is done to monitor usage statistics, limit abuse and, in some cases, to charge based on usage. All of Sunlight’s APIs are free for use, so let’s obtain a key to complete our query.
Typically, a key will be a long string of gibberish, but I’ve made a key specifically for this post with better readability. We can thus simply append the key to our other parameters from above to obtain /?state=VA&q=crime&apikey=blogpost. You can append more parameters in the same fashion with the general form &parameter=value.
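The same appending can be sketched in Python with the standard library. The helper add_params is a made-up name for this example, and the blogpost key is the demo key from this post, which is assumed to work only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_params(url, **extra):
    """Return `url` with extra `&parameter=value` pairs appended.

    `add_params` is a hypothetical helper for this example.
    """
    parts = urlsplit(url)
    # Keep the existing parameters in order, then append the new ones.
    params = parse_qsl(parts.query) + list(extra.items())
    return urlunsplit(parts._replace(query=urlencode(params)))


unauthenticated = "http://openstates.org/api/v1/bills/?state=VA&q=crime"
with_key = add_params(unauthenticated, apikey="blogpost")
print(with_key)
```

Parameter order does not matter to the server in general, so appending the key at the end is as good as placing it anywhere else.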
Let’s see what we get this time with our final query http://openstates.org/api/v1/bills/?q=crime&state=VA&apikey=blogpost:
[{"title": "Child day centers and family day homes; regulations, national background check required, report.", "created_at": "2015-01-08 20:21:23", "updated_at": "2015-05-07 04:51:59", "id": "VAB00015499", "chamber": "lower", "state": "va", "session": "2015", "type": ["bill"], "subjects": ["Other", "Welfare and Poverty", "Family and Children Issues", "Municipal and County Issues", "Crime"], "bill_id": "HB 1570"}, ... truncated ...
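To give a sense of what a program sees in that output, here is a small Python sketch that parses the first record shown above. The text is copied from the response in this post, with a closing bracket added because the original is truncated; no request is made.

```python
import json

# First record of the response shown in the post. The closing "]" is added
# here only because the original output is truncated after this record.
raw = '''[{"title": "Child day centers and family day homes; regulations, national background check required, report.", "created_at": "2015-01-08 20:21:23", "updated_at": "2015-05-07 04:51:59", "id": "VAB00015499", "chamber": "lower", "state": "va", "session": "2015", "type": ["bill"], "subjects": ["Other", "Welfare and Poverty", "Family and Children Issues", "Municipal and County Issues", "Crime"], "bill_id": "HB 1570"}]'''

# json.loads turns the text into ordinary Python lists and dictionaries.
bills = json.loads(raw)

for bill in bills:
    print(bill["bill_id"], bill["session"], bill["title"])
    print("  subjects:", ", ".join(bill["subjects"]))
```

The response is a list of bills, and each bill is a set of named fields, some of which (such as subjects) are themselves lists.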
Whoa, how do I read that?
The result from the call is in JSON format, as explained above. Humans aren’t intended to read it raw, but if you wish to make JSON easier to read, there are dozens of browser plugins and websites that accomplish this task. Most important, however, is that you can transform this data into another format called CSV (comma-separated values), which can then be imported into spreadsheet software like Microsoft Excel. My favorite tool for this task can be found here, created by former Sunlighter Eric Mill. (Note: The data is significantly nested, so converting to a CSV may be more difficult; one may benefit from some light programming knowledge in this case.)
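As one illustration of the light programming the note mentions, the sketch below converts a list of bill records to CSV text in Python. It is not the tool linked above. The sample record is a shortened copy of the one in this post, and joining list fields with a semicolon is only one possible way to flatten them.

```python
import csv
import io

# Shortened copy of the record shown in the post (some fields omitted).
bills = [
    {
        "bill_id": "HB 1570",
        "state": "va",
        "session": "2015",
        "type": ["bill"],
        "subjects": ["Other", "Welfare and Poverty", "Crime"],
    }
]


def bills_to_csv(records):
    """Return CSV text for a list of dictionaries.

    List values are flattened by joining their items with "; ". Deeper
    nesting (dictionaries inside a record) would need additional handling.
    """
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        row = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in record.items()
        }
        writer.writerow(row)
    return buffer.getvalue()


csv_text = bills_to_csv(bills)
print(csv_text)
```

Saving csv_text to a file with a .csv extension should give something a spreadsheet program can open.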
So, what’s the next step?
To continue our example, we could grab the same data for Maryland by changing state=VA to state=MD and following the same process as above. Once we retrieve all the bills, we can then issue separate API calls to get further details on each one. From the documentation, we see this method has the pattern /bills/state/session/bill_id/. Thus, one such query for one of the bills from Virginia may look as follows: http://openstates.org/api/v1/bills/va/2015/HB 1570/?apikey=blogpost. This API call will retrieve all of the details about the bill HB 1570 from Virginia in the 2015 session, including the actions, sponsors, votes, summary and so forth.
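The two follow-up steps can be sketched in Python as well: one search URL per state, then a detail URL for a single bill. The function names are made up for this example, and the base URL and key are the ones quoted in this post. One detail worth noting is the space in the bill ID: a browser encodes it as %20 automatically, while a script has to do so explicitly.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://openstates.org/api/v1"  # base URL quoted in the post
API_KEY = "blogpost"                       # demo key from the post


def search_url(state, query):
    """Bill search URL for one state (hypothetical helper)."""
    params = urlencode({"state": state, "q": query, "apikey": API_KEY})
    return f"{BASE_URL}/bills/?{params}"


def detail_url(state, session, bill_id):
    """Bill detail URL following the /bills/state/session/bill_id/ pattern."""
    # quote() escapes characters such as the space in "HB 1570".
    path = "/".join(quote(part, safe="") for part in (state, session, bill_id))
    params = urlencode({"apikey": API_KEY})
    return f"{BASE_URL}/bills/{path}/?{params}"


# Same search for both states, changing only the state filter.
search_urls = {state: search_url(state, "crime") for state in ("VA", "MD")}
hb1570_url = detail_url("va", "2015", "HB 1570")

for url in search_urls.values():
    print(url)
print(hb1570_url)
```

In a fuller script, the state, session and bill_id values would be read from each record returned by the search rather than typed in by hand.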
That’s about it!
Hopefully this post was helpful in illustrating what APIs are and why they’re valuable for people who work with data. Indeed, in our fast-paced information era, being able to access up-to-date data as needed in a machine-readable format is pivotal for those who wish to report on or analyze it. And there are so many APIs on the Internet speaking the same language and ready to accept requests that there’s really no reason not to learn now.
Check out Sunlight’s data services page to see the different APIs we have available, sign up for a free key and dig into government data!
—–
Head to the 51:10 mark to listen to Sunlight Labs’ Clayton Dunwell talk more about APIs with Joshua Scheer on KPFK radio.