Datafest: ‘Amazing things can happen in a very short time’
The regression analysis, data visualization and computationally driven sound effects were definitely different. Still, there was much about the weekend’s bicoastal datafest that made a newsroom veteran feel right at home: the room full of bleary-eyed obsessives, the wrinkled piles of notes, clothes and discarded potato chip bags, and yes, the bouquet of stale bagels and flop sweat as deadline approached.
Something new and hopeful for journalism emerged from the storied room on the Columbia University campus where the Pulitzer Prizes are juried each year and where, on Sunday, six judges deliberated over datafest entries. Along with counterparts at Stanford University, they awarded $7,000 in prizes to teams that used data and technology to examine the influence of money in politics.
Conceived by Teresa Bouza, the deputy Washington bureau chief of the global news agency EFE, the bicoastal datafest was administered by the Sunlight Foundation in partnership with the Brown Institute for Media Innovation, Columbia’s journalism school, and Stanford’s graduate program in journalism. Major funding for the event came from the MacArthur Foundation. The awards ceremonies, held simultaneously in California and New York on Sunday, capped a 36-hour marathon in which teams worked on their projects while gleaning lessons from the technology and data experts on hand to help with their work. They also got pep talks from journalists about the importance of their effort.
“I have seen the future of journalism and its name is Big Data,” Steve Engelberg, the editor-in-chief of ProPublica, told participants in a New York keynote address that was live-streamed to Stanford.
One example of what it can achieve: the FMS Symphony entry that New York judges made one of their top prize winners and that the audiences at Stanford and Columbia voted best in show. Data analysts partnered with journalists from Reuters, the New York Times, the Huffington Post and the Daily Beast to scrape eight years of otherwise unparsable balance sheets that the U.S. Treasury issues every day to “create the first-ever electronically searchable database of the Federal government’s daily cash spending and borrowing.”
CSV Soundsystem, as the group puckishly dubbed itself, turned this into revealing data visualizations that illustrate, for instance, on how many days the government’s Medicare spending tops what it collects in taxes. And the team literally made music of its work, interpreting the data in sound. “Chords were selected based on the derivative of account balance, and a melody was composed based on the federal interest rate. We also included a contrapuntal riff driven by the distance between accumulated federal debt and the legal debt ceiling,” the team wrote.
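For readers curious how such a mapping might work, here is a minimal sketch of the general idea in Python. It is not the team's code: the function names, the three chord labels, the C major scale and the sample numbers are all assumptions made for illustration. It picks a chord from the day-to-day change in a balance (a discrete stand-in for the derivative) and maps an interest rate onto a note of the scale.

```python
# Illustrative sketch only; names, chords, and scale are assumptions,
# not the CSV Soundsystem implementation.

# MIDI note numbers for one octave of C major (60 = middle C).
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]


def daily_changes(balances):
    """Discrete derivative: change from each day to the next."""
    return [later - earlier for earlier, later in zip(balances, balances[1:])]


def pick_chord(change):
    """Choose a chord quality from the sign of the balance change."""
    if change > 0:
        return "major"
    if change < 0:
        return "minor"
    return "suspended"


def melody_note(rate, low, high, scale=SCALE):
    """Map a rate linearly onto the scale, clamped to its ends."""
    if high == low:
        return scale[0]
    position = (rate - low) / (high - low)
    index = min(len(scale) - 1, max(0, int(position * len(scale))))
    return scale[index]


def sonify(balances, rates):
    """Pair a chord and a melody note for each day after the first."""
    low, high = min(rates), max(rates)
    changes = daily_changes(balances)
    return [
        (pick_chord(change), melody_note(rate, low, high))
        for change, rate in zip(changes, rates[1:])
    ]


if __name__ == "__main__":
    # Made-up sample values, not Treasury data.
    print(sonify([100, 120, 90, 90], [1.0, 2.0, 3.0, 4.0]))
```

A rising balance yields a major chord and a falling one a minor chord; the debt-ceiling riff the team describes could be added the same way, by mapping the gap between debt and ceiling onto a third voice.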
Other winners used data from Sunlight’s Influence Explorer, among other sources, to:
- Map campaign contributions of Silicon Valley tech firms that, the group noted, will be lobbying for more visas for highly skilled workers in the upcoming immigration debate;
- Apply rigorous statistical analysis to assumptions about the role of money in politics;
- And try to figure out whether Congress is really as useless as much of the public seems to think it is.
Another team used a well-known statistical analysis methodology to build a tool that will help journalists or voters flag potential fraud in county government spending patterns.
Still another winning team, co-piloted by a student in Columbia’s journalism school, combined painstaking data entry with a couple of mashups to produce an interesting visual study of lawmakers’ stock portfolios.
One participant, Jeremy Merrill, may not have won a prize, but he did come away with a good story. Merrill, a data journalist at ProPublica, showed that 136 lawmakers were AWOL for votes at times when Sunlight’s Political Party Time showed them at fundraisers.
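The cross-referencing behind a finding like this can be sketched in a few lines of Python. This is an assumption-laden illustration, not Merrill's method: the record layout, the function name and the placeholder lawmakers are hypothetical, and it matches on calendar date only, which is coarser than matching on the time of day.

```python
# Illustrative sketch only; record fields and sample rows are hypothetical.
from datetime import date


def absent_while_fundraising(missed_votes, fundraisers):
    """Return lawmakers with a missed vote on the same date as a fundraiser."""
    fundraiser_days = {(f["lawmaker"], f["date"]) for f in fundraisers}
    matches = {
        vote["lawmaker"]
        for vote in missed_votes
        if (vote["lawmaker"], vote["date"]) in fundraiser_days
    }
    return sorted(matches)


if __name__ == "__main__":
    # Placeholder records, not real voting or fundraising data.
    missed = [
        {"lawmaker": "Lawmaker A", "date": date(2013, 1, 15)},
        {"lawmaker": "Lawmaker B", "date": date(2013, 1, 16)},
    ]
    events = [
        {"lawmaker": "Lawmaker A", "date": date(2013, 1, 15)},
        {"lawmaker": "Lawmaker B", "date": date(2013, 1, 20)},
    ]
    print(absent_while_fundraising(missed, events))
```

A same-day match is only a lead: a reporter would still need to compare the vote time with the event time before calling anyone AWOL.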
The datafest focused a dazzling array of talent on the challenge of bringing more transparency to politics. In addition to journalists, teams included PhD candidates in marketing and finance, a business professor from Iowa, a master’s candidate in biostatistics, and an energy researcher from MIT. Participants were exuberant about the cross-disciplinary cooperation and the results it achieved. “I never would have found a PhD in math if I hadn’t come here,” exulted one.
Both the Stanford and Columbia judging teams boasted a similarly diverse array of talent and credentials.
“What we have done this weekend is one of the noblest things that a university can do: really bring people together and build connections,” said Bernd Girod, a professor of electrical engineering who directs the Brown Institute for Media Innovation at Stanford University. “Amazing things can happen in a very short time.”
Spoken like a true newspaper editor.
See below to learn more about the bicoastal datafest and share videos and background materials for the presentations. Check the event wiki for a full list of the projects and links to the work.