OpenFEC makes campaign finance data more accessible with new API: Here’s how to get started

by Bob Lannon

technology

Jul 8, 2015 12:35 pm

The FEC’s new API promises a whole new level of access to campaign finance data. (Image credit: 18F)

Big news in the campaign finance world: The Federal Election Commission (FEC) is taking a huge step forward by making data accessible through a modern API. With the help of a team of intrepid 18F developers, the FEC is rethinking both its website and its data offerings to better serve its mission of educating the public with real-time disclosure of campaign finance information. It’s part of the larger OpenFEC project, and we think it’s a very encouraging sign that this collaboration is going to improve access to a crucial information resource.

This is a beta release, but we’re really excited to see what’s been accomplished so far. What follows is meant as both an introduction to what’s available through this new resource and a critique of what’s working well, and the changes Sunlight would like to see in future releases of the API.

## Doesn’t the FEC already release data?

The FEC is a model disclosure authority: It has made federal campaign finance data available through a searchable web portal, in bulk CSV files, and, most impressively, a live feed of submitted disclosures. On [Influence Explorer](http://influenceexplorer.com/), we’ve made use of each of these sources in different ways — most recently turning that live feed into a searchable data resource, our [Real-Time Federal Campaign Finance tracker](http://realtime.influenceexplorer.com).

Sunlight has consistently called on government sources to make all data available in bulk. It’s difficult to know how a dataset might be used by a researcher, reporter, citizen or advocacy group; that’s why it’s important that government bodies release all of it in machine-readable bulk files to allow the fullest exploration of what’s available and to give context to any given data point. The FEC has historically set an excellent example in making bulk data available.

## Additional benefits from an API

We think pretty highly of what the FEC already offers, and encourage them to continue to make both bulk and streamed data available. Here at Sunlight, though, we tend to make the data we release available both in bulk and through APIs, because we think that APIs are the right kind of access for particular users and use-cases. So what additional advantages are offered by an API?

### Selective data views

Not every user or developer can effectively make use of bulk data. It typically doesn’t fit in a spreadsheet, so the point-and-click crowd can be at a loss right away. Even if you’re technically skilled enough to load it into R or Pandas, though, you may hit a barrier if the operations you want to carry out require that the data be loaded into memory.

Furthermore, a bulk release may contain a lot of data that isn’t relevant to a particular use-case or investigation. Let’s say I want to look at contributions to House candidates who are Democrats in 2012. If I use the bulk release, I’m going to get a lot of data that’s not interesting to me, including all of the contributions to noncandidate PACs, members of other parties and contributions to presidential or senatorial candidates.

Pulling out exactly what you need usually requires loading everything into a database, setting up some indices and running queries. True, there are some tools for working directly with CSV files, like the excellent [csvkit](https://csvkit.readthedocs.org/en/0.9.1/), but depending on your query, you again might run into memory issues. Good old *nix standbys `grep`, `cut`, `sed` and `awk` can also get you pretty far, if you’re willing to hone your shell scripts.
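As a rough illustration of the streaming approach, the sketch below filters a CSV one row at a time with Python's standard `csv` module, so the whole file never has to sit in memory. The column names (`recipient`, `office`, `party`, `cycle`, `amount`) and the sample rows are hypothetical stand-ins, not the layout of the FEC's actual bulk files.

```python
import csv
import io

# Hypothetical sample data; a real bulk file would be opened with
# open(path, newline='') instead of io.StringIO.
SAMPLE_CSV = """recipient,office,party,cycle,amount
SMITH FOR CONGRESS,H,DEM,2012,500
JONES FOR SENATE,S,DEM,2012,1000
DOE FOR CONGRESS,H,REP,2012,250
LEE FOR CONGRESS,H,DEM,2010,300
"""


def filter_rows(handle, **criteria):
    """Yield rows whose columns match every key=value pair in criteria."""
    for row in csv.DictReader(handle):
        if all(row.get(col) == val for col, val in criteria.items()):
            yield row


# Values read from a CSV are strings, so the cycle is compared as '2012'.
house_dems_2012 = list(
    filter_rows(io.StringIO(SAMPLE_CSV), office='H', party='DEM', cycle='2012'))
print([r['recipient'] for r in house_dems_2012])
```

Because `filter_rows` is a generator, memory use stays flat however large the input is. The trade-off is that every new question means another full pass over the file.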

With an API, though, you can hand off this selection business to the data’s source (in this case, the FEC’s data warehouse). As long as the API supports it, you can formulate your query and retrieve it with confidence. That’s an important qualifier, though: The onus is on the API’s design team to make sure that the views which are offered meet the needs of its likely users.

### Aggregate views

Another advantage to having an API is the ability to show up-to-date aggregations of the records in your database. This includes totals, rankings and metadata that might change over time.

Again, aggregation is something that anyone can perform on bulk data. That is, anyone with the technical know-how discussed in the last section and the domain knowledge needed to properly compute aggregates.

In the case of FEC data in particular, summing the dollar amounts of individual transactions can be deceptively difficult. Whether or not two amounts can be added together depends on the type of committee, the type of transaction and sometimes also the type of contributor.

Calculating these sums takes a certain level of legal and regulatory expertise, which may be out of scope for a developer who wants to add or explore some summary statistics from campaign finance but for whom campaign finance is not the main focus of investigation. Maybe the focus is on projected vote share, and the campaign finance information is going to be added for context. In cases like that, it’s useful to source the aggregate totals published by the FEC itself, and an API is a great way to deliver that data.

### Live data

In addition to being more focused and infused with expertise, data views made available through an API can be tied to live data. In the case of the FEC’s new API, the data is updated daily. This partially avoids the need for a developer to repeatedly update their database with successive bulk data releases.

In fact, for some use-cases, an API might obviate the need for a database altogether. Imagine again the case of a website that shows some other, non-campaign-finance data, such as legislative activity or election results. If campaign finance data could be a helpful addition to that kind of app, the developer can avoid having to build a big addition onto their database by making client-side calls with JavaScript. The site, which may be backed by a large database, can deliver its own data in a web app and then obtain the FEC’s aggregate totals or summary facts on the fly, allowing them to show up if and when the site’s designers choose.

—

## Brief tour

Here’s a very quick tour of what you can expect from the OpenFEC API.

### What’s available

The [official documentation](https://api.open.fec.gov/developers) for the API is the best source for getting to know what it has to offer, but here are some of the things you can look forward to interacting with.

1. **Search By Name**: Nearly every question that can be answered with FEC data requires knowing the unique identifiers that FEC has assigned to the entities involved. The search endpoints make acquiring these identifiers straightforward.
2. **Candidate and Committee Details**: Armed with the right identifiers, one can access a lot of important information about any candidate or committee, including location information, FEC designations and the entity’s history.
3. **Financial Reports**: For each committee, you can obtain the top-line numbers describing contributions, receipts, expenditures and loans. These are sourced from the committee’s periodically filed financial reports. These come complete with links to the PDF (shudder) of the original filing.
4. **Per-Cycle Committee Summaries**: In addition to the individual reports’ numbers, the API also makes available top-line numbers aggregated on a per-cycle basis.

In other words, FEC’s first API is off to a very promising start. Armed with just this data, there are already a lot of opportunities to keep up with campaign finance during the 2016 election.

### Our wish list

While we were excited to see the progress made so far, there are a few things we’d really like to see added to the API.

1. **Endpoints for Itemized Data**: Having endpoints dedicated to the itemized transactions that show up on Schedule A (receipts), Schedule B (disbursements) and Schedule E (expenditures) should be the first priority for new additions to the API. There is a tremendous amount of useful information contained in these line items. Without an endpoint for itemized transactions, the bulk data and live feeds still offer much that the API does not. *NOTE: Thanks to the OpenFEC team’s [in-the-open development](https://github.com/18F/openFEC) on GitHub, there’s [evidence](https://github.com/18F/openFEC/blob/2022c0a965f878e0c62521d321104a52c9e500e5/webservices/rest.py#L178-L194) that this is on the way!*
2. **Per-Contributor/Per-Recipient Aggregates**: Campaign finance data is essentially the description of the relationships between contributors and their recipients. Endpoints are needed that list (a) the per-contributor, per-cycle aggregate totals of receipts and (b) the per-recipient, per-cycle aggregate totals of disbursements. It’s unfortunate that, given the state of disclosure, these can only include PAC-to-candidate and PAC-to-PAC transactions, but they’re very useful nonetheless.
3. **Independent Expenditure Aggregates**: In a world strongly influenced by the behavior of independent-expenditure-only PACs (super PACs), it’s very important to be able to ask two questions. For a given super PAC, whom has it spent the most money targeting, negatively or positively? And for a given candidate, who has spent the most targeting them, negatively or positively? This data is available from FEC, but isn’t an endpoint yet.

## Sunlight from the inside out

A quick note: The group of developers working on OpenFEC includes two former Sunlight Labs members. We couldn’t be prouder of the work they’ve been doing during their time “on the inside.” It’s unsurprising, though, that they’ve been effective at 18F, and more specifically on the OpenFEC project. Lindsay Young developed our portal for accessing a live feed of filings related to the Foreign Agents Registration Act, and Alison Rowland was my predecessor as project lead on Influence Explorer. We miss them both, but we’re very grateful for the hard work they and their team are putting into improving public access to campaign finance disclosure at the federal level.

—

## Exploring the API

The base URL for the API is:

    BASE_URL = 'http://api.open.fec.gov/v1'

You’ll also need a Data.gov API key, which you can obtain [here](https://api.data.gov/signup/). I save my API keys in a plain text file in my home directory, so that they’re always handy and so that I can use them without revealing them in notebooks like this one:

    import os

    # Read the key from a private file so it never appears in the notebook itself.
    API_KEY = open(os.path.expanduser('~/.api-keys/data.gov'), 'r').read().strip()
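If you run the same code on more than one machine, a slightly more flexible loader can help. The sketch below checks an environment variable first and falls back to the key file. The variable name `DATA_GOV_API_KEY` and the helper `load_api_key` are hypothetical choices for this example, not anything the API requires.

```python
import os


def load_api_key(env_var='DATA_GOV_API_KEY', path='~/.api-keys/data.gov'):
    """Return the key from the environment if set, else from a text file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    with open(os.path.expanduser(path), 'r') as handle:
        return handle.read().strip()


# Demonstration only: a placeholder value, not a real key.
os.environ['DATA_GOV_API_KEY'] = ' not-a-real-key '
print(load_api_key())
```

Either way, the key stays out of the code you share.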

Conceptually, there are two main areas of focus for the API: candidates and committees. When looking at contributions, however, remember that recipients are always committees. Candidates do not receive contributions directly; their committees do. Here are the relevant branches:

- `/candidate`: individual candidate information
- `/committee`: individual committee information

### Documentation

We’re going to cover a fair bit of ground in this introduction, but for more details on what’s possible, check the [official OpenFEC API documentation](https://api.open.fec.gov/developers).

## Helpful utils

Some methods and global vars to help us stay succinct are below:

    from copy import deepcopy
    import logging

    import requests


    def all_results(endpoint, params):
        """Yield every record for a query, requesting one page at a time."""
        _params = deepcopy(params)
        _params.update({'api_key': API_KEY})
        _url = BASE_URL + endpoint
        logging.info('querying endpoint: {}'.format(_url))

        initial_resp = requests.get(_url, params=_params)
        logging.debug('full url eg: {}'.format(initial_resp.url))
        initial_data = initial_resp.json()

        num_pages = initial_data['pagination']['pages']
        num_records = initial_data['pagination']['count']
        logging.info('{p} pages to be retrieved, with {n} records'.format(
            p=num_pages, n=num_records))

        current_page = initial_data['pagination']['page']
        logging.debug('page {} retrieved'.format(current_page))

        for record in initial_data['results']:
            yield record

        # Request the remaining pages until the reported page count is reached.
        while current_page < num_pages:
            current_page += 1
            _params.update({'page': current_page})
            _data = requests.get(_url, params=_params).json()
            logging.debug('page {} retrieved'.format(current_page))
            for record in _data['results']:
                yield record

        logging.info('all pages retrieved')


    def count_results(endpoint, params):
        """Return the total number of records the API reports for a query."""
        _params = deepcopy(params)
        _params.update({'api_key': API_KEY})
        _url = BASE_URL + endpoint
        _data = requests.get(_url, params=_params).json()
        return _data['pagination']['count']

## FEC identifiers: The keys to all data

To get data associated with a candidate or a committee, you need to know the identifier that FEC has assigned to that entity. In case you don't have those memorized, though, there are two ways to obtain the IDs that you need: You can search for them, or obtain optionally filtered lists.

### Searching

Data on candidate and committee entities can be found using the search endpoints for each type:

- `/candidates/search`
- `/committees/search`

Let's try looking for a candidate.

    q_obama = {
        'q': 'obama',
    }

    [r for r in all_results('/candidates/search', q_obama)]

Here's the result:

    [{u'active_through': 2000,
      u'candidate_id': u'H0IL01087',
      u'candidate_status': u'P',
      u'candidate_status_full': u'Statutory candidate in a prior cycle',
      u'cycles': [2000],
      u'district': u'01',
      u'election_years': [2000],
      u'incumbent_challenge': None,
      u'incumbent_challenge_full': u'Unknown',
      u'name': u'OBAMA, BARACK H',
      u'office': u'H',
      u'office_full': u'House',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'principal_committees': [
          {u'candidate_ids': [u'H0IL01087'],
           u'committee_id': u'C00347583',
           u'committee_type': u'H',
           u'committee_type_full': u'House',
           u'cycles': [2000, 2002, 2004],
           u'designation': u'P',
           u'designation_full': u'Principal campaign committee',
           u'expire_date': None,
           u'first_file_date': None,
           u'last_file_date': u'2004-10-13T00:00:00+00:00',
           u'name': u'OBAMA FOR CONGRESS 2000',
           u'organization_type': None,
           u'organization_type_full': None,
           u'party': u'DEM',
           u'party_full': u'Democratic Party',
           u'state': u'IL',
           u'treasurer_name': u'LIONEL BOLIN'}],
      u'state': u'IL'},
     {u'active_through': 2010,
      u'candidate_id': u'S4IL00180',
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2004, 2006, 2008, 2010],
      u'district': None,
      u'election_years': [2004, 2010],
      u'incumbent_challenge': u'I',
      u'incumbent_challenge_full': u'Incumbent',
      u'name': u'OBAMA, BARACK',
      u'office': u'S',
      u'office_full': u'Senate',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'principal_committees': [
          {u'candidate_ids': [u'S4IL00180'],
           u'committee_id': u'C00411934',
           u'committee_type': u'S',
           u'committee_type_full': u'Senate',
           u'cycles': [2006, 2008, 2010],
           u'designation': u'P',
           u'designation_full': u'Principal campaign committee',
           u'expire_date': u'2015-05-11T00:00:00+00:00',
           u'first_file_date': u'2005-05-25T00:00:00+00:00',
           u'last_file_date': u'2009-10-19T00:00:00+00:00',
           u'name': u'OBAMA 2010 INC',
           u'organization_type': None,
           u'organization_type_full': None,
           u'party': u'DEM',
           u'party_full': u'Democratic Party',
           u'state': u'IL',
           u'treasurer_name': u'HARVEY S WINEBERG'},
          {u'candidate_ids': [u'S4IL00180'],
           u'committee_id': u'C00381442',
           u'committee_type': u'S',
           u'committee_type_full': u'Senate',
           u'cycles': [2002, 2004, 2006],
           u'designation': u'P',
           u'designation_full': u'Principal campaign committee',
           u'expire_date': None,
           u'first_file_date': u'2002-08-22T00:00:00+00:00',
           u'last_file_date': u'2005-08-05T00:00:00+00:00',
           u'name': u'OBAMA FOR ILLINOIS INC',
           u'organization_type': None,
           u'organization_type_full': None,
           u'party': u'DEM',
           u'party_full': u'Democratic Party',
           u'state': u'IL',
           u'treasurer_name': u'HARVEY S. WINEBERG'}],
      u'state': u'IL'},
     {u'active_through': 2012,
      u'candidate_id': u'P80003338',
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'incumbent_challenge': u'I',
      u'incumbent_challenge_full': u'Incumbent',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'principal_committees': [
          {u'candidate_ids': [u'P80003338'],
           u'committee_id': u'C00431445',
           u'committee_type': u'P',
           u'committee_type_full': u'Presidential',
           u'cycles': [2008, 2010, 2012, 2014, 2016],
           u'designation': u'P',
           u'designation_full': u'Principal campaign committee',
           u'expire_date': u'2015-05-11T00:00:00+00:00',
           u'first_file_date': u'2007-01-16T00:00:00+00:00',
           u'last_file_date': u'2013-01-31T00:00:00+00:00',
           u'name': u'OBAMA FOR AMERICA',
           u'organization_type': None,
           u'organization_type_full': None,
           u'party': u'DEM',
           u'party_full': u'Democratic Party',
           u'state': u'IL',
           u'treasurer_name': u'NESBITT, MARTIN H'}],
      u'state': u'US'}]

Wait, there are three Barack Obamas? Well, not quite. The FEC assigns an identifier each time someone runs for a particular office.
Obama has an FEC ID that starts with `P` because he ran for president, but also picked up two more when he ran for seats in the House (`H`) and Senate (`S`).
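As a small illustration of those prefixes, here is a sketch of a helper that reads the office off the first character of a candidate ID. The function name `office_from_candidate_id` is made up for this example, and the mapping covers only the three prefixes mentioned in the text.

```python
# Prefix meanings as described in the text: H = House, S = Senate, P = President.
OFFICE_BY_PREFIX = {'H': 'House', 'S': 'Senate', 'P': 'President'}


def office_from_candidate_id(candidate_id):
    """Return the office implied by the first character of a candidate ID."""
    return OFFICE_BY_PREFIX.get(candidate_id[:1].upper(), 'Unknown')


for cid in ['H0IL01087', 'S4IL00180', 'P80003338']:
    print(cid, office_from_candidate_id(cid))
```

A check like this is handy when you are holding a bare list of identifiers and want to confirm which race each one belongs to before making further calls.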

The FEC data doesn’t do any formal reconciliation of these records, so it’s something to look out for when you’re looking at someone’s history. For instance, if we were to use `P80003338` to look up Obama’s history using the `/candidate/{candidate_id}/history` endpoint, we might expect to see those other identifiers somewhere. Unfortunately, that’s not the case:

    [r for r in all_results('/candidate/P80003338/history', {})]

Here’s the result:

    [{u'address_city': u'CHICAGO',
      u'address_state': u'IL',
      u'address_street_1': u'PO BOX 8102',
      u'address_street_2': None,
      u'address_zip': u'60680',
      u'candidate_id': u'P80003338',
      u'candidate_inactive': None,
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'expire_date': None,
      u'form_type': u'F2Z',
      u'incumbent_challenge': u'I',
      u'incumbent_challenge_full': u'Incumbent',
      u'load_date': u'2015-05-11T12:15:43+00:00',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'state': u'US',
      u'two_year_period': 2012},
     {u'address_city': u'CHICAGO',
      u'address_state': u'IL',
      u'address_street_1': u'PO BOX 8102',
      u'address_street_2': None,
      u'address_zip': u'60680',
      u'candidate_id': u'P80003338',
      u'candidate_inactive': None,
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'expire_date': u'2015-05-11T00:00:00+00:00',
      u'form_type': u'F2',
      u'incumbent_challenge': u'O',
      u'incumbent_challenge_full': u'Open seat',
      u'load_date': u'2015-05-11T12:15:43+00:00',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'state': u'US',
      u'two_year_period': 2010},
     {u'address_city': u'CHICAGO',
      u'address_state': u'IL',
      u'address_street_1': u'PO BOX 8102',
      u'address_street_2': None,
      u'address_zip': u'60680',
      u'candidate_id': u'P80003338',
      u'candidate_inactive': None,
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'expire_date': u'2015-05-11T00:00:00+00:00',
      u'form_type': u'F2',
      u'incumbent_challenge': u'O',
      u'incumbent_challenge_full': u'Open seat',
      u'load_date': u'2015-05-11T12:15:43+00:00',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'state': u'US',
      u'two_year_period': 2008}]

### Listing

We can also obtain a list of many candidates, applying optional filtering constraints if we don’t want the entire list. This can be done at the `/candidates` endpoint. The metadata in the records returned can help when building a local reference resource or lookup table.

    q_all_2012_candidates = {
        "cycle": 2012,
    }

This query is going to return quite a lot of candidates:

    count_results('/candidates', q_all_2012_candidates)

Here’s the result:

3024

You can limit the list by specifying the `candidate_status`. Most of the time, what we care about are candidates with `candidate_status=C`, which means they are a declared candidate who has raised at least $5,000 in that cycle.

    q_all_2012_present_candidates = {
        "cycle": 2012,
        "candidate_status": "C"
    }

    count_results('/candidates', q_all_2012_present_candidates)

Here’s the result:

1885

It’s true that we’re looking at all federal races in 2012, but that’s still a pretty big number. Let’s pull that data down and see how it looks.

    candidates_2012 = [c for c in all_results('/candidates', q_all_2012_present_candidates)]
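That one line hides a fair number of requests, since `all_results` walks every page of the response. To make the pattern easier to see without touching the network, here is a minimal sketch of the same page-walking loop run against a stand-in fetch function. The names `paginate`, `fake_fetch` and `FAKE_RECORDS` are invented for the illustration; only the `pagination` and `results` keys mirror the response shape used by the helper.

```python
def paginate(fetch_page):
    """Yield every record from a paged source.

    fetch_page(page_number) must return a dict shaped like
    {'pagination': {'page': ..., 'pages': ...}, 'results': [...]}.
    """
    page = 1
    while True:
        data = fetch_page(page)
        for record in data['results']:
            yield record
        if page >= data['pagination']['pages']:
            break
        page += 1


# Stand-in for a network call: five fake records served two per page.
FAKE_RECORDS = ['a', 'b', 'c', 'd', 'e']


def fake_fetch(page, per_page=2):
    start = (page - 1) * per_page
    pages = -(-len(FAKE_RECORDS) // per_page)  # ceiling division
    return {
        'pagination': {'page': page, 'pages': pages, 'count': len(FAKE_RECORDS)},
        'results': FAKE_RECORDS[start:start + per_page],
    }


print(list(paginate(fake_fetch)))
```

Because both versions are generators, records become available as each page arrives, and wrapping the call in a list comprehension simply drains the generator.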

Picking one at “random”:

    [c for c in candidates_2012 if 'OBAMA' in c['name']]

Here’s the result:

    [{u'active_through': 2012,
      u'candidate_id': u'P80003338',
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'incumbent_challenge': u'I',
      u'incumbent_challenge_full': u'Incumbent',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'state': u'US'}]

For ease of use and demonstration, let’s convert the results to a Pandas DataFrame:

    import pandas as pd

    candidates_2012_df = pd.DataFrame(candidates_2012)
    candidates_2012_df.head()

	active_through	candidate_id	candidate_status	candidate_status_full	cycles	district	election_years	incumbent_challenge	incumbent_challenge_full	name	office	office_full	party	party_full	state
0	2012	S2UT00229	C	Statutory candidate	[2012]	None	[2012]	C	Challenger	AALDERS, TIMOTHY NOEL	S	Senate	REP	Republican Party	UT
1	2012	H2CA01110	C	Statutory candidate	[2012]	01	[2012]	C	Challenger	AANESTAD, SAMUEL	H	House	REP	Republican Party	CA
2	2012	H2AZ02279	C	Statutory candidate	[2012]	02	[2012]	C	Challenger	ABOUD, PAULA ANN	H	House	DEM	Democratic Party	AZ
3	2012	H2CA25176	C	Statutory candidate	[2012]	25	[2012]	C	Challenger	ACOSTA, DANTE	H	House	REP	Republican Party	CA
4	2014	H8NC03043	C	Statutory candidate	[2008, 2010, 2012, 2014]	03	[2008, 2014]	C	Challenger	ADAME, MARSHALL RICHARD	H	House	DEM	Democratic Party	NC

Since we had some high counts, let’s look at how they break down (note the log scale on the x axis).

    import numpy as np

    # Count candidates for each party/office pair, then draw one chart per office.
    candidates_2012_df.pivot_table(
        index='party', columns='office', values='candidate_id', aggfunc=np.size
    ).plot(
        kind='barh', subplots=True, figsize=(6, 10), logx=True, legend=False,
        xticks=[1, 10, 100, 1000]
    )

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/0a/30/faa7fc064388/candidate__count_by_office_and_party-800.png)

So while these numbers seem a bit higher than you might expect, they’re in the right proportion: Democrats and Republicans are the most common parties (at least among the congressional candidates), there are far more candidates for House than there are for Senate, and candidates for president make up the smallest population. Still, why are there so many more candidates than we remember seeing in 2012?

The answer is that, while mainstream election coverage typically focuses on candidates that are likely to be competitive and/or associated with a major national party, the FEC is responsible for reporting the campaign finance records for everyone who registers with the FEC as a candidate. As a result, it’s a much higher number than many people perceive.
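If you want the same office-by-party tally without Pandas or a plot, a plain-Python count is enough. The sketch below uses only the five rows printed by `head()` earlier, reduced to two fields, so its counts describe that tiny sample and not the full 2012 field.

```python
from collections import Counter

# The five rows shown in the head() output above, reduced to two fields.
sample = [
    {'office': 'S', 'party': 'REP'},
    {'office': 'H', 'party': 'REP'},
    {'office': 'H', 'party': 'DEM'},
    {'office': 'H', 'party': 'REP'},
    {'office': 'H', 'party': 'DEM'},
]

# Tally each (office, party) pair, the same grouping the pivot table uses.
counts = Counter((rec['office'], rec['party']) for rec in sample)
for (office, party), n in sorted(counts.items()):
    print(office, party, n)
```

Feeding the full `candidates_2012` list through the same `Counter` expression should give the numbers behind the chart.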

### Focusing on select entities

Let’s look at the names of those candidates who raised at least $5,000 in a bid for the Oval Office:

    q_all_2012_present_prez_candidates = {
        "cycle": 2012,
        "candidate_status": "C",
        "office": "P",
    }

    count_results('/candidates', q_all_2012_present_prez_candidates)

Here’s the result:

39

Now, we’ll build the DataFrame:

    prez_candidates_2012 = [c for c in all_results('/candidates', q_all_2012_present_prez_candidates)]
    prez_candidates_2012_df = pd.DataFrame(prez_candidates_2012)
    # sort_values replaces DataFrame.sort, which newer Pandas releases no longer provide.
    prez_candidates_2012_df[['name', 'party', 'candidate_id']].sort_values('party')

	name	party	candidate_id
12	GOODE, VIRGIL H JR	999	P20004685
6	CARTER, WILLIE FELIX	DEM	P80000268
14	HERMAN, RAPHAEL	DEM	P20002184
23	OBAMA, BARACK	DEM	P80003338
27	RICHARDSON, DARCY G	DEM	P20001376
22	MESPLAY, KENT P	GRE	P40003279
33	STEIN, JILL	GRE	P20003984
11	FARNSWORTH, VERL	IND	P20002853
21	MCCALL, JAMES HATTON	IND	P80003361
25	RAKOWITZ, ARTHUR FABIAN	IND	P20003448
28	RISLEY, MICHEALENE CRISTINI	IND	P20004727
34	TERRY, RANDALL A	IND	P20002424
35	WELLS, ROBERT CARR JR	IND	P20004065
37	WIFORD, SAMUEL TIMOTHY II	IND	P20003489
13	HARRIS, RICHARDJASON SATAWK	LIB	P20003364
38	WRIGHTS, ROGER LEE	LIB	P20002952
5	BROWN, HARLEY D	NNE	P00004275
17	KOTLIKOFF, LAURENCE J	NNE	P20004511
26	REED, JILL ANN	NNE	P20003208
31	ROTH, CECIL JAMES	NNE	P20003836
1	ANDERSON, ROSS C (ROCKY)	OTH	P20004263
2	BARR, ROSEANNE CHERRI	OTH	P20002804
10	DURHAM, STEPHEN	OTH	P20004651
20	LOPEZ, CHRISTINA (VICE PRES)	OTH	P20004669
29	ROEMER, CHARLES E. ”BUDDY” III	OTH	P20002523
36	WHITE, JEROME S	OTH	P20004677
0	ADESHINA, YINKA ABOSEDE	REP	P60004793
3	BLANKENSHIP, JARED	REP	P20002598
7	CISNEROS, CESAR	REP	P20002390
8	DAVIS, L JOHN JR	REP	P20002325
9	DRUMMOND, KEITH	REP	P20003430
15	HILL, CHRISTOPHER V	REP	P20002838
16	KARGER, FRED	REP	P20002564
18	LAWSON, EDGAR A	REP	P20003950
24	PAUL, RON	REP	P80000748
30	ROMNEY / PAUL D. RYAN, MITT	REP	P80003353
4	BLOCK, JEFF	UNK	P20003398
19	LINDSAY, PETA	UNK	P20004636
32	SCHRINER, JOSEPH CHARLES	UNK	P00003962

Yep, that’s quite a large field. Keep this in mind when pulling your data: You’ll probably want to make editorial choices about which candidates you’d like to focus on. That could be as easy as filtering your results after obtaining them from the API:

    candidates_to_focus_on = ['PAUL, RON', 'OBAMA, BARACK', 'ROMNEY / PAUL D. RYAN, MITT']

    candidate_filter = prez_candidates_2012_df.name.str.match(
        '|'.join(candidates_to_focus_on), case=False)

    prez_candidates_2012_df[candidate_filter].T

	23	24	30
active_through	2012	2012	2012
candidate_id	P80003338	P80000748	P80003353
candidate_status	C	C	C
candidate_status_full	Statutory candidate	Statutory candidate	Statutory candidate
cycles	[2008, 2010, 2012]	[1988, 1990, 1992, 1994, 1996, 1998, 2000, 200…	[2008, 2010, 2012]
district	None	None	None
election_years	[2008, 2012]	[1988, 1990, 2008, 2012]	[2008, 2012]
incumbent_challenge	I	C	C
incumbent_challenge_full	Incumbent	Challenger	Challenger
name	OBAMA, BARACK	PAUL, RON	ROMNEY / PAUL D. RYAN, MITT
office	P	P	P
office_full	President	President	President
party	DEM	REP	REP
party_full	Democratic Party	Republican Party	Republican Party
state	US	US	US
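One caution about name-based filters: the names are joined into a regular expression, and characters such as `.` have special meaning there. The sketch below shows a stricter variant in plain Python that escapes each name with `re.escape` and anchors the pattern so only whole names match. The short `names` list is a hand-picked sample from the table above, used here in place of the DataFrame column.

```python
import re

# A few names taken from the candidate table above.
names = ['OBAMA, BARACK', 'PAUL, RON', 'ROMNEY / PAUL D. RYAN, MITT',
         'STEIN, JILL', 'GOODE, VIRGIL H JR']
candidates_to_focus_on = ['PAUL, RON', 'OBAMA, BARACK',
                          'ROMNEY / PAUL D. RYAN, MITT']

# Escape each name so characters such as "." are matched literally,
# and anchor the pattern so only complete names match.
pattern = re.compile(
    '^(?:' + '|'.join(re.escape(n) for n in candidates_to_focus_on) + ')$',
    re.IGNORECASE)

selected = [n for n in names if pattern.match(n)]
print(selected)
```

With the names used in this walkthrough the unescaped pattern happens to select the same rows, so this is a defensive habit more than a fix.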

If you plan to regularly update your data, though, you might want to store the identifiers for the entities you’re interested in and use those for future API calls.

    q_my_2012_prez_candidates = {
        "cycle": 2012,
        "candidate_status": "C",
        "office": "P",
        "candidate_id": ['P80003338', 'P80000748', 'P80003353', 'P20002523', 'P20003984']
    }

    my_2012_prez_candidates = [c for c in all_results('/candidates', q_my_2012_prez_candidates)]
    my_2012_prez_candidates_df = pd.DataFrame(my_2012_prez_candidates)
    my_2012_prez_candidates_df.T

	0	1	2	3	4
active_through	2012	2012	2012	2012	2012
candidate_id	P80003338	P80000748	P20002523	P80003353	P20003984
candidate_status	C	C	C	C	C
candidate_status_full	Statutory candidate	Statutory candidate	Statutory candidate	Statutory candidate	Statutory candidate
cycles	[2008, 2010, 2012]	[1988, 1990, 1992, 1994, 1996, 1998, 2000, 200…	[2012]	[2008, 2010, 2012]	[2012]
district	None	None	None	None	None
election_years	[2008, 2012]	[1988, 1990, 2008, 2012]	[2012]	[2008, 2012]	[2012]
incumbent_challenge	I	C	C	C	C
incumbent_challenge_full	Incumbent	Challenger	Challenger	Challenger	Challenger
name	OBAMA, BARACK	PAUL, RON	ROEMER, CHARLES E. ”BUDDY” III	ROMNEY / PAUL D. RYAN, MITT	STEIN, JILL
office	P	P	P	P	P
office_full	President	President	President	President	President
party	DEM	REP	OTH	REP	GRE
party_full	Democratic Party	Republican Party	Other	Republican Party	Green Party
state	US	US	US	US	US

## Using identifiers to obtain candidate data

If we want to know more about a given candidate, we have some options. Using the `candidate_id` field, we can make requests to the `/candidate` endpoint to get a detailed profile. Note that the identifier needs to be included as part of the path, not as a GET argument.

    [r for r in all_results('/candidate/P80003338', {})]

Here’s the result:

    [{u'active_through': 2012,
      u'address_city': u'CHICAGO',
      u'address_state': u'IL',
      u'address_street_1': u'PO BOX 8102',
      u'address_street_2': None,
      u'address_zip': u'60680',
      u'candidate_id': u'P80003338',
      u'candidate_inactive': None,
      u'candidate_status': u'C',
      u'candidate_status_full': u'Statutory candidate',
      u'cycles': [2008, 2010, 2012],
      u'district': None,
      u'election_years': [2008, 2012],
      u'expire_date': None,
      u'form_type': u'F2Z',
      u'incumbent_challenge': u'I',
      u'incumbent_challenge_full': u'Incumbent',
      u'load_date': u'2015-05-11T12:15:43+00:00',
      u'name': u'OBAMA, BARACK',
      u'office': u'P',
      u'office_full': u'President',
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'state': u'US'}]
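To make the path-versus-query distinction concrete, here is a sketch of a small URL builder that puts the identifier in the path and any filters in the query string. The helper name `candidate_url` is hypothetical, and the `api_key` parameter that real requests need is left out so that the example stays printable.

```python
from urllib.parse import urlencode

BASE_URL = 'http://api.open.fec.gov/v1'


def candidate_url(candidate_id, *parts, **params):
    """Build a URL with the candidate ID in the path and filters in the query."""
    path = '/'.join(['/candidate', candidate_id] + list(parts))
    query = urlencode(sorted(params.items()))
    return BASE_URL + path + ('?' + query if query else '')


# The identifier is a path segment; cycle is an ordinary GET argument.
print(candidate_url('P80003338'))
print(candidate_url('P80003338', 'committees', cycle=2012))
```

The `requests` library does the query-string half of this for you through its `params` argument, which is why the helpers in this walkthrough only ever format the path by hand.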

### Looking up candidate committees

Let’s continue to look at those presidential candidates. How much did each one raise in 2012? We can start to answer that question by looking at their committees, using the following endpoint:

    /candidate/{candidate_id}/committees/history/{cycle}

Let’s look up the committees associated with Barack Obama.

    count_results('/candidate/P80003338/committees', {'cycle': 2012})

Here’s the result:

21

Hm, that’s odd. He probably didn’t have 21 committees.

    [r['name'] for r in all_results('/candidate/P80003338/committees', {'cycle': 2012})]

Here’s the result:

    [u'ALASKAN WOMEN FOR OBAMA',
     u'CALIFORNIANS FOR CHANGE',
     u'COALITION FOR CHANGE',
     u'DC LGBT FOR SECOND TERM',
     u'OBAMA – COMMITTEE TO ELECT',
     u'OBAMA FOR AMERICA',
     u'OBAMA VICTORY FUND',
     u'OBAMA VICTORY FUND 2012',
     u'PA MOVING FORWARD',
     u'REALISTIC AND TRUTHFUL',
     u'SUPPORT THE PREZ',
     u'SWING STATE VICTORY FUND',
     u'WNC FOR CHANGE',
     u'YES WE CAN NEBRASKA']

What’s happening here is that the API is returning all committees that claim to be associated with Obama. Some do so because they intended to raise money specifically for him, and others are “Single Candidate Independent Expenditure” groups. Most, though, are of designation “Unauthorized”.

    [(r['designation_full'], r['committee_type_full'])
     for r in all_results('/candidate/P80003338/committees', {'cycle': 2012})]

Here’s the result:

    [(u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Unauthorized', u'PAC – Nonqualified'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Principal campaign committee', u'Presidential'),
     (u'Joint fundraising committee', u'PAC – Nonqualified'),
     (u'Joint fundraising committee', u'PAC – Nonqualified'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Unauthorized', u'Single Candidate Independent Expenditure'),
     (u'Joint fundraising committee', u'PAC – Nonqualified'),
     (u'Unauthorized', u'PAC – Nonqualified'),
     (u'Unauthorized', u'PAC – Nonqualified')]

For now, let’s focus on Obama’s principal campaign committee. We can limit the results using the `designation` field and the `committee_type` field:

    [r for r in all_results('/candidate/P80003338/committees',
                            {'cycle': 2012, 'designation': 'P', 'committee_type': 'P'})]

Here’s the result:

    [{u'candidate_ids': [u'P80003338'],
      u'city': u'CHICAGO',
      u'committee_id': u'C00431445',
      u'committee_type': u'P',
      u'committee_type_full': u'Presidential',
      u'custodian_city': None,
      u'custodian_name_1': None,
      u'custodian_name_2': None,
      u'custodian_name_full': None,
      u'custodian_name_middle': None,
      u'custodian_name_prefix': None,
      u'custodian_name_suffix': None,
      u'custodian_name_title': None,
      u'custodian_phone': None,
      u'custodian_state': None,
      u'custodian_street_1': None,
      u'custodian_street_2': None,
      u'custodian_zip': None,
      u'cycles': [2008, 2010, 2012, 2014, 2016],
      u'designation': u'P',
      u'designation_full': u'Principal campaign committee',
      u'email': u'OFAFEC@BARACKOBAMA.COM',
      u'expire_date': u'2015-05-11T00:00:00+00:00',
      u'fax': None,
      u'filing_frequency': u'Q',
      u'first_file_date': u'2007-01-16T00:00:00+00:00',
      u'form_type': u'F1Z',
      u'last_file_date': u'2013-01-31T00:00:00+00:00',
      u'leadership_pac': None,
      u'load_date': u'2015-05-11T12:36:16+00:00',
      u'lobbyist_registrant_pac': None,
      u'name': u'OBAMA FOR AMERICA',
      u'organization_type': None,
      u'organization_type_full': None,
      u'party': u'DEM',
      u'party_full': u'Democratic Party',
      u'party_type': None,
      u'party_type_full': None,
      u'qualifying_date': None,
      u'state': u'IL',
      u'state_full': u'Illinois ',
      u'street_1': u'PO BOX 8102',
      u'street_2': None,
      u'treasurer_city': u'CHICAGO',
      u'treasurer_name': u'NESBITT, MARTIN H',
      u'treasurer_name_1': None,
      u'treasurer_name_2': None,
      u'treasurer_name_middle': None,
      u'treasurer_name_prefix': None,
      u'treasurer_name_suffix': None,
      u'treasurer_name_title': u'TREASURER',
      u'treasurer_phone': u'3129851700',
      u'treasurer_state': u'IL',
      u'treasurer_street_1': u'PO BOX 8102',
      u'treasurer_street_2': None,
      u'treasurer_zip': u'60680',
      u'website': u'HTTP://WWW.BARACKOBAMA.COM',
      u'zip': u'60680'}]
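If you have already downloaded the unfiltered committee list, the same selection can be made locally and no second request is needed. The sketch below filters three sample records on `designation_full`. Pairing these names with these designations assumes the two listings above came back in the same order, and the helper name `principal_committees` is made up for the example.

```python
# Three sample records; the name/designation pairing assumes the name listing
# and the designation listing above share the same ordering.
committees = [
    {'name': 'ALASKAN WOMEN FOR OBAMA',
     'designation_full': 'Unauthorized'},
    {'name': 'OBAMA FOR AMERICA',
     'designation_full': 'Principal campaign committee'},
    {'name': 'OBAMA VICTORY FUND',
     'designation_full': 'Joint fundraising committee'},
]


def principal_committees(records):
    """Keep only records designated as principal campaign committees."""
    return [r for r in records
            if r['designation_full'] == 'Principal campaign committee']


print([r['name'] for r in principal_committees(committees)])
```

Filtering on the server keeps responses small. Filtering locally is handy when you want several different slices of a list you have already fetched.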

This endpoint takes one candidate ID at a time, so we’ll have to combine multiple API calls to cover every candidate we care about.

my_2012_prez_committees = []

for i, row in my_2012_prez_candidates_df.iterrows():
    endpoint = '/candidate/{c}/committees'.format(c=row.candidate_id)
    for res in all_results(endpoint, {'cycle': 2012, 'designation': 'P', 'committee_type': 'P'}):
        res['candidate_id'] = row.candidate_id
        my_2012_prez_committees.append(res)

my_2012_prez_committees_df = pd.DataFrame(my_2012_prez_committees)
my_2012_prez_committees_df[['name', 'committee_id', 'candidate_id']]

	name	committee_id	candidate_id
0	OBAMA FOR AMERICA	C00431445	P80003338
1	RON PAUL 2012 PRESIDENTIAL CAMPAIGN COMMITTEE …	C00495820	P80000748
2	BUDDY ROEMER FOR PRESIDENT, INC.	C00493692	P20002523
3	ROMNEY FOR PRESIDENT, INC.	C00431171	P80003353
4	JILL STEIN FOR PRESIDENT	C00505800	P20003984
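The loop above makes one request per candidate and tags each result with the candidate it came from. The same pattern can be sketched as a small function that takes the fetching callable as an argument, which makes it easy to try out without network access. The names `collect_committees`, `FAKE_RESPONSES` and `fake_all_results` are hypothetical; the stub stands in for the `all_results` helper used in the calls above and returns two rows copied from the table.

```python
def collect_committees(candidate_ids, fetch, params):
    """Call `fetch` once per candidate and tag each row with its candidate_id."""
    rows = []
    for candidate_id in candidate_ids:
        endpoint = '/candidate/{c}/committees'.format(c=candidate_id)
        for res in fetch(endpoint, params):
            tagged = dict(res)  # copy so the fetched row is not mutated
            tagged['candidate_id'] = candidate_id
            rows.append(tagged)
    return rows


# Canned responses keyed by endpoint, using rows from the table above.
FAKE_RESPONSES = {
    '/candidate/P80003338/committees': [
        {'name': 'OBAMA FOR AMERICA', 'committee_id': 'C00431445'}],
    '/candidate/P80003353/committees': [
        {'name': 'ROMNEY FOR PRESIDENT, INC.', 'committee_id': 'C00431171'}],
}


def fake_all_results(endpoint, params):
    """Yield canned rows instead of calling the API (an assumption for illustration)."""
    for row in FAKE_RESPONSES.get(endpoint, []):
        yield row


params = {'cycle': 2012, 'designation': 'P', 'committee_type': 'P'}
committees = collect_committees(['P80003338', 'P80003353'], fake_all_results, params)
```

Passing the real helper in place of `fake_all_results` should give the same rows as the loop above, assuming it accepts an endpoint and a parameter dict as it does in the earlier calls.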

### Obtaining committee summaries

Now that we have identifiers for the principal campaign committee associated with each candidate, we can pull summary financial information about them. There are two different endpoints for getting financial information:

- `/committee/{committee_id}/totals` (straightforward cycle-wide totals)
- `/committee/{committee_id}/reports` (actual reports submitted; advanced content!)

Let’s look at the more straightforward totals endpoint:

my_2012_prez_committee_totals = []

for i, row in my_2012_prez_committees_df.iterrows():
    endpoint = '/committee/{c}/totals'.format(c=row.committee_id)
    for res in all_results(endpoint, {'cycle': 2012}):
        my_2012_prez_committee_totals.append(res)

my_2012_prez_committee_totals_df = pd.DataFrame(my_2012_prez_committee_totals)
my_2012_prez_committee_totals_df[['committee_id', 'contributions', 'disbursements', 'receipts']]

	committee_id	contributions	disbursements	receipts
0	C00431445	549594250	737507855	738503770
1	C00495820	39928730	39968390	41060317
2	C00493692	400036	739453	780900
3	C00431171	304959168	483073478	483452331
4	C00505800	819034	1122027	1263540
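Even these three columns support some simple derived figures. As a rough sketch, the snippet below computes contributions as a share of receipts and the gap between receipts and disbursements for each committee, using the values from the table above. The function name `derived_figures` and the labels `contribution_share` and `net` are our own, and `net` covers the cycle only; it is not the same thing as cash on hand.

```python
# Totals copied from the table above: (contributions, disbursements, receipts).
totals = {
    'C00431445': (549594250, 737507855, 738503770),
    'C00495820': (39928730, 39968390, 41060317),
    'C00493692': (400036, 739453, 780900),
    'C00431171': (304959168, 483073478, 483452331),
    'C00505800': (819034, 1122027, 1263540),
}


def derived_figures(contributions, disbursements, receipts):
    """Return contributions as a share of receipts, and receipts minus disbursements."""
    return {
        'contribution_share': contributions / float(receipts),
        'net': receipts - disbursements,
    }


derived = {cid: derived_figures(*vals) for cid, vals in totals.items()}
```

For Obama for America (`C00431445`), contributions come to roughly 74 percent of receipts, which hints at how much of the remainder arrived through other channels such as transfers.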

Merging these facts together with the metadata that we’ve already collected, we can start to produce some good comparisons:

comparison = my_2012_prez_committees_df.set_index('committee_id').join(
    my_2012_prez_committee_totals_df.set_index('committee_id'), rsuffix='.cmte')

comparison = comparison.set_index('candidate_id').join(
    my_2012_prez_candidates_df.set_index('candidate_id'), rsuffix='.cand')

comparison.set_index('name.cand')[['disbursements', 'receipts']].plot(kind='barh')

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/d8/d6/302006cb41f2/committee__total_disbursements_and_reciepts-800.png)

comparison.set_index('name.cand')[[
    'individual_itemized_contributions',
    'individual_unitemized_contributions',
    'transfers_from_affiliated_committee',
    'other_political_committee_contributions',
    'candidate_contribution',
]]

	individual_itemized_contributions	individual_unitemized_contributions	transfers_from_affiliated_committee	other_political_committee_contributions	candidate_contribution
name.cand
OBAMA, BARACK	315170951	234409690	181700000	0	5000
PAUL, RON	21916605	18009455	1000500	2670	0
ROEMER, CHARLES E. "BUDDY" III	374937	0	0	0	25100
ROMNEY / PAUL D. RYAN, MITT	103245581	25499257	146516071	1126219	0
STEIN, JILL	386655	427592	0	1786	0
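Because the dollar amounts differ by orders of magnitude between candidates, it can help to look at each source as a share of the row total. The sketch below does that in plain Python for three of the rows above; `source_shares` is a hypothetical helper, and the shares are relative to the sum of these five columns only, not to total receipts.

```python
COLUMNS = [
    'individual_itemized_contributions',
    'individual_unitemized_contributions',
    'transfers_from_affiliated_committee',
    'other_political_committee_contributions',
    'candidate_contribution',
]

# Rows copied from the table above, in the same column order.
amounts = {
    'OBAMA, BARACK': [315170951, 234409690, 181700000, 0, 5000],
    'PAUL, RON': [21916605, 18009455, 1000500, 2670, 0],
    'STEIN, JILL': [386655, 427592, 0, 1786, 0],
}


def source_shares(values):
    """Return each column's share of the row total as a dict keyed by column name."""
    total = float(sum(values))
    return {col: val / total for col, val in zip(COLUMNS, values)}


shares = {name: source_shares(vals) for name, vals in amounts.items()}
```

Seen this way, Stein's committee is the only one of the three where unitemized individual contributions outweigh itemized ones, a pattern that is hard to spot in the raw totals.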

comparison.set_index('name.cand')[[
    'individual_itemized_contributions',
    'individual_unitemized_contributions',
    'transfers_from_affiliated_committee',
    'other_political_committee_contributions',
]].plot(kind='barh', stacked=True, figsize=(10, 10))

![png](https://horseradish.s3.amazonaws.com/CACHE/images/photos/cc/92/4b231faf4f90/committee__other_totals-800.png)

---

So, there you have it: a brief rundown of the OpenFEC API and some quick pointers on how to use it. The FEC making data available through an API is an encouraging step forward, and though it could use some improvements, we’re excited to see the FEC making positive changes to better educate the public on campaign finance in America.

Sunlight Foundation
