It's back to work after a 4th of July filled with hamburgers, hot dogs, and other non-meat options. Here's what the Labs was up to over the past month...
I admit it: we missed the May Labs Update entirely. I'm sorry! It's been as busy as always around here, with a number of really neat -- but also really involved -- projects beginning to see some light at the end of their respective tunnels. Amidst that effort, we just plain forgot to get out a timely update last month. We promise we won't keep you in the dark like this again.
The most exciting news is that we have a bunch of new faces in the Labs offices:
Drew Vogel has joined the Subsidyscope team, and after a few weeks of working remotely, he's now in the office, in person, and doing great things.
Ryan Sibley recently moved over to Labs from the Reporting Group. Ryan's been with Sunlight for a while, but the idea of having a journalist embedded with our developers is a new one, and something that I'm pretty excited about.
Casey Kimmey has joined the Open States team for the summer, where she'll be doing some invaluable data quality validation.
And finally, Montserrat Lobos is joining us for three weeks from our friends at Ciudadano Inteligente in Chile. She's going to be working with the design team; we're excited to have her.
Here's what the team has been working on:
Tom has been finally -- finally! -- finishing the process of getting the team back to a full headcount. He's also been doing the usual mixture of proposal writing, project oversight, and general triage. Also: some soldering and messing around with Titanium Mobile, the results of which will hopefully be published in a few months.
Alison added a new, corporate accountability dataset to TransparencyData and Influence Explorer. She has also been working on several name matching tasks and making some major additions, including parsing for organization names, to our Name Cleaver name parsing library. In addition, she has been experimenting with some visualizations based upon campaign finance data, which you may be able to look forward to cropping up on our site or in a blog post in the future.
Drew added a few usability enhancements to the Subsidyscope search tool. Now he is working on changes to the data importer that will allow us to provide direct expenditure figures based on more current USASpending data.
Jeremy has been hard at work destroying Sunlight web sites, but in a good way. We have decided to retire Public Equals Online and integrate its features into the main Sunlight Foundation site and organizing page. Jeremy also rebuilt TransparencyCamp.org to add a brand-new mobile app and HTML-based informational screens that display session information on monitors at the conference.
Eric has been integrating full-text searching with ElasticSearch into our Real Time Congress API for bills. There'll be new endpoints and features announced soon. He's also been wrapping up work on our soon-to-be-released iOS/Android mobile app to help people make better local health care decisions. Finally, he's been working on getting the first round of House expenditure data from the 112th Congress up into our expenditure database and House staff directory.
Aaron updated Follow the Unlimited Money for the new election cycle -- just in time for the special election in New York's 26th District -- and made some improvements to the Reporting Group's Lobbyist Tracker. He's also been working on Capitol Words (preview: the top words in Congress so far this year are "job", "cut", "create" and "repeal").
Kaitlin has been working on a third Roku app as well as building some backend tools for Subsidyscope. She's also been busy writing bombastic blog posts and checking up on her FOIA for contracting data quality reports. Also, she's been working with the other Caitlin to update the design and functionality of the Subsidyscope site.
Chris has been working on a variety of projects, including coding the new design for the House Staff Directory; creating graphics, signs, and other deliverables for Transparency Camp; designing a new background theme for Sunlight's YouTube channel; and producing miscellaneous graphics for other Sunlight projects. She is currently working on a new theme for Sunlight's Data Viz Tumblr blog and continuing to work on the House Staff Directory site.
Ethan has been coordinating a number of new products and features in Data Commons. This week we released Inbox Influence and added POGO's contractor misconduct database to Transparency Data and Influence Explorer. We're hard at work on several new data sets to be released in July.
With Sunlight Health teetering on the edge of completion, Caitlin has turned her focus to building out the Subsidyscope redesign, interrupted briefly by a jaunt through the South and once again to help with updates to the Roku app that Kaitlin has been working on.
James has been adding support for Maine, New Hampshire, and Oregon to the Open State Project. He's also been working with new Open States intern Casey who has been doing data checking and cleaning to help promote more states from experimental to ready status.
Andrew prepped the Inbox Influence project for its launch, which was announced at the PdF conference in New York. He has also continued to work on extracting data from Regulations.gov.
Ali has been working on numerous small tasks to support the foundation's design needs. She's been doing a little bit of everything, from branding to visualizing data to giving the CFPB advice on its new mortgage forms. Currently the big project on her desk is the new Capitol Words site.
Tom has been working on finding some new team members, organizing an event about open corporate identifiers, writing grant proposals, and -- fingers crossed! -- arranging a grant from Sunlight to get a very cool project's very cool code open-sourced. More on that soon, he hopes.
Eric has been chugging along on building Sunlight Health, an upcoming iOS/Android app that aims to use open data about hospitals, pharmaceuticals, and more to help people make better local health care decisions.
Luigi conducted a webinar on HTML5 for the online News University. Naturally, an interactive HTML5 version of the presentation is available online. He also prepared an article on WebSockets and EventSource which should be published soon, and has continued work on Datajam.
Upon returning from SXSW, Jeremy released django-mediasync 2.1 which added support for Django 1.3. He has also been working with David on an analytics dashboard project funded by the Knight Foundation. As always, the month has been filled with a slew of improvements to various Sunlight properties including a relaunch of Read the Bill.
James and Michael continued their work on Open States by writing more scrapers for state legislative data. At PyCon we hosted another Open Government Hackathon, and the Open State Project saw quite a few new contributions. Additionally, James expanded our bulk data download offerings and Michael changed the Open States Geo lookups to use boundaryservice, a project from our friends at the Chicago Tribune. The speed at which new states are being brought online is increasing, and we expect we'll start turning on 2-3 new states per month.
Andrew has been working on a new scraper for pulling and parsing public comment data from Regulations.gov, as well as implementing the front-facing portions of some new functionality for Influence Explorer.
Ethan has been working on a clustering tool for detecting duplicate comments in federal rule making. The tool will be used by Reporting to find corporate influence in public comments.
Alison has been working on two name-matching tasks, matching politicians with officers of non-profit organizations and White House visitors with lobbyists. She has also been working on streamlining our data update process for Transparency Data and Influence Explorer.
Last week Kaitlin released the housing sector on Subsidyscope and interviewed some candidates for the open position on the Subsidyscope team. She also continues to plug away on new features for the tax expenditure database on Subsidyscope and some internal tools for grants and contracts analysis.
Caitlin has completed the design work for Sunlight Health and is working with Eric to build it out. She continues to work with Kaitlin on the redesign of Subsidyscope.
For the month of March, Chris continued working on wireframes and comps for the House Staff Directory; designed promotional materials for the Advisory Committee on Transparency; created some graphics for an upcoming partnership; made a bunch of presentations for Ellen; and created assorted design elements for other projects, like an icon for the Foreign Lobbying Influence Tracker.
The internet is a scary place, and timball had to deal with that badness for the month of March: first via an ISP switchover, and then with a small, contained "security" issue on a dev instance when a former consultant's keyring got hax0red. Otherwise, the month of March has been spent learning and cooking with Chef (learning the finer points of Ruby has been interesting). He also bought a comically large amount of styrofoam peanuts for an April Fools' prank.
Aaron added yet another data set to the Reporting Group's lobbying tracker. We've already got a database of foreign lobbying filings, but it's updated infrequently. This new feature, scheduled to launch this week, will allow users to see Foreign Agent Registration Act filings as soon as they're posted on the government's fara.gov. But instead of having to use the government's search interface, users will be able to see the filings as a stream as they come in. He also continues to work on other Reporting Group projects and on Capitol Words.
Ali has been working on promoting Transparency Camp, creating some design elements for a cool email tool that the Data Commons team has been working on, building small organizing campaign pages, starting to build out the new design for Sunlight Live, and teaching CSS classes to the organization.
... or what was going on in the labs. I'm horribly late in posting this -- it turns out that I'm much, much worse at this than Josh was. Just another piece of evidence that we need more talented folks around here! Remember, we still have open positions.
Ethan attended the Computer Assisted Reporting Conference, worked on an algorithm for fast entity matching in text, and researched new content for the Influence Explorer homepage. He's now planning for new corporate accountability datasets and new lobbying-related features.
Kaitlin had a lovely vacation and then spent several days updating the USASpending data on Subsidyscope and is now squashing bugs in the soon-to-be-expanded tax expenditure database on the site. She also interviewed many a candidate for Subsidyscope and pitched in a little bit on the Clearspending testimony.
timball has been crying a lot over ISPs and is starting to familiarize himself with Chef, a new Ruby-based scaling solution. Also, he says he gained 5 lbs from eating in NOLA. We thought you should know.
Chris has been fabulously wireframing new layouts for the House Staff Directory, designing magically delicious HTML emails and newsletters, creating spectacular presentations promoting Sunlight's awesomeness, and providing Sugar-free-Red-Bull-fueled graphics support for a variety of little projects along the way (e.g. Capitol Defense, one Influence Explorer postcard, Sunlight's meetup page, new Twitter background, etc).
James and Michael have continued the process of expanding the reach of the Open States Project and migrating content to the new site. The most recent update brings the project to 20 states and the District of Columbia. New functionality in the API is in the works, including the ability to query for bills by sponsor or issue area. We are also working on adding more ways for people to access the data without having to access the API directly.
Aaron added an additional lobbying dataset to the Reporting Group's lobbying tracker. Users can now see a list of post-employment notifications for former congressional staffers and members, including when they'll be eligible to lobby their old colleagues. He's also continued work on Capitol Words.
David is working on an analytics dashboard. He uploaded some sample data to Google's Public Data Explorer. He also worked on pulling structured data out of GAO reports, making some progress but also hitting some obstacles.
Caitlin has been working with Eric and the reporting team on nailing down wireframes for the healthcare app and has been translating them into pretty sexy comps. She is also working with the other Kaitlin to redesign and streamline the Subsidyscope site. ...and stuff. She also helped launch the new Openstates site since the last Labs update.
Ali has been making a lot of ads lately to remarket the Sunlight Foundation and the reporting group and for new and upcoming Sunlight Live events. She has also been working on building out a new page for the organizing section of the Foundation and Sunlight Live.
Andrew has been working on new tools for adding influence-related context to text, focusing on a plugin for enhancing Gmail. He has also been experimenting with new scraping technologies.
Alison has been updating our Wikipedia scraper to pull in corporate logos to display on the organization pages in Influence Explorer. She has also been working on adding information to Influence Explorer detailing which bills organizations hired lobbyists to work on.
...and I (Tom) have been working on a bunch of proposals, organizing meetings around the corporate ID issue, writing some testimony related to Clearspending, and trying to find staff to fill the spots left by Josh and Kevin's departures. Also, daydreaming about what we're going to do with these enormous 7-segment LEDs.
For those of you keeping an eye on the ball, working hard on your Apps for America 2 entries, I've got some great news for you: Data.gov has given itself a slight upgrade, adding a bunch more feeds. To compensate, Data.gov has split itself into three subcatalogs: a raw data catalog, a tool catalog, and a geodata catalog.
By far and away, the Tool and Geodata catalogs exceed the Raw Data catalog, but we still don't have our 100,000 "feeds." We have 999 data sources in the Geodata Catalog, 999 data sources in the Tool Catalog, and 267 in the Raw Data Feeds catalog. These 999 numbers are troubling. Hopefully the software supports more than 1000 data feeds in each subcatalog.