Tag Archive: tools

Some tools that might help Pajamas Media’s Transparency Project

Roger L. Simon, writing at Pajamas Media, announces a new transparency project, soliciting suggestions from readers on what the blogosphere-bloomed news organization should dig into. I wouldn't presume to play assigning editor for the effort, but I hope I can help by pointing to some resources (full disclosure--many, but not all, are built by or supported by the Sunlight Foundation) that might help Pajamas Media readers do some digging on their own and get the ball rolling.

Simon notes that government spending is a big issue, and starts by asking about spending on government employees. He writes, "Some of ...


Announcing Checking Influence

This morning the Data Commons team released their newest tool: Checking Influence, a bookmarklet that gives online banking users insight into how the merchants with whom they do business are influencing our political system. We think it's a great example of the future of influence disclosure -- hopefully you'll agree.

But I won't prattle on about it any more here. The announcement blog post goes into more detail. I hope you'll give that a read, and give the tool a try.


ScraperWiki is Extremely Cool

ScraperWiki is a project that's been on my radar for a while. Last week Aine McGuire and Richard Pope, two of the people behind the project, happened to be in town, and were nice enough to drop by Sunlight's offices to talk about what they've been up to.

Let's start with the basics: remedial screen scraping 101. "Screen scraping" refers to any technique for getting data off the web and into a well-structured format. There's lots of information on web pages that isn't available as a non-HTML download. Making this information useful typically involves writing a script to process one or more HTML files, then spit out a database of some kind.
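To make that concrete, here is a minimal sketch of the "process HTML, spit out a database" pattern, using only Python's standard library. The page fragment, the table layout, and the names `TableScraper`, `scrape`, `store` and `spending` are all invented for illustration; a real scraper would fetch a live page and would need to match that page's actual markup.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment, made up for this example.
SAMPLE_HTML = """
<table>
  <tr><th>Agency</th><th>Amount</th></tr>
  <tr><td>Parks</td><td>$1,200</td></tr>
  <tr><td>Roads &amp; Bridges</td><td>$3,450</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # rows with only <th> cells stay empty and are skipped
                self.rows.append(tuple(self._row))
            self._row = None


def scrape(html):
    """Turn the HTML table into (agency, amount) tuples with integer amounts."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return [
        (agency, int(amount.replace("$", "").replace(",", "")))
        for agency, amount in parser.rows
    ]


def store(rows, conn):
    """Write the scraped rows into a small SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending (agency TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO spending VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
store(scrape(SAMPLE_HTML), conn)
print(conn.execute("SELECT agency, amount FROM spending ORDER BY amount").fetchall())
```

Note how much of the code depends on the page's layout: if the site switched from a table to a list, or moved the dollar figure into another column, the scraper would quietly return the wrong thing or nothing at all.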

It's not particularly glamorous work. People who know how to make nice web pages typically know how to properly release their data. Those who don't tend to leave behind a mess of bad HTML. As a result, screen scrapers often contain less-than-lovely code. Pulling data often involves doing unsexy things like treating presentation information as though it had semantic value, or hard-coding kludges ("# ignore the second span... just because"). Scraper code is often ugly by necessity, and almost always of deliberately limited use. It consequently doesn't get shared very often -- having the open-sourced code languish sadly in someone's GitHub account is normally the best you can hope for.

The ScraperWiki folks realized that the situation could be improved. A collaborative approach can help avoid repetition of work. And since scrapers often malfunction when changes are made to the web pages they examine, making a scraper editable by others might lead to scrapers that spend less time broken.


Open Notebook: Tax Havens

Just a few odds, ends and bits of reporting that didn't make it into this post, which relies on data from the Foreign Lobbying Influence Tracker, a project we collaborated on with our friends from ProPublica:

Numbers: I really hesitated to use the administration's claims of $210 billion in tax revenue raised (there's a fairly good breakdown of which proposals raise what part of the $210 billion in the New York Times here). Tax havens offer secrecy, so even if the government knows how much money they hold (I doubt that it does) it can't determine who ...

