No PDFs!


This week, Speaker Pelosi asked House administrators to post House members’ expenses on the Web for the first time. We are quite excited about Speaker Pelosi’s action, as it demonstrates a strong commitment to increasing transparency and accountability. (Hard as it may be to imagine, the House currently collects and publishes members’ expenses in a bound paper book on a quarterly basis.)

So now we hear that the first batch of expenditure reports will be posted before Aug. 31 in PDF format. PDFs are notoriously challenging because they are difficult for computers to index and for people to search. Now we are not so happy.

Congress needs to be urged to provide these reports in a format that is structured, searchable, downloadable and mashable. This will enable the reuse of information to improve public scrutiny. The public should be assured that these records will be permanently archived, and the House should be encouraged to publish these reports as close to real time as feasible.
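To illustrate what “structured, searchable and mashable” could mean in practice, here is a minimal sketch in Python. It assumes the reports were released as a CSV file; the column names (`member`, `quarter`, `category`, `amount`) and every row below are invented for illustration and do not reflect any actual House data or format.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample data: column names and amounts are invented for illustration.
SAMPLE_CSV = """member,quarter,category,amount
Member A,2009Q3,Travel,1200.50
Member A,2009Q3,Office Supplies,310.00
Member B,2009Q3,Travel,875.25
Member B,2009Q3,Franked Mail,2200.00
"""


def totals_by_member(csv_text):
    """Sum the amount column for each member."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["member"]] += Decimal(row["amount"])
    return dict(totals)


def search(csv_text, category):
    """Return every row whose category matches, ignoring case."""
    wanted = category.lower()
    return [
        row
        for row in csv.DictReader(io.StringIO(csv_text))
        if row["category"].lower() == wanted
    ]


if __name__ == "__main__":
    print(totals_by_member(SAMPLE_CSV))
    print(search(SAMPLE_CSV, "travel"))
```

With data in a form like this, a few lines of code are enough to total, filter, or combine the records with other sources. Doing the same with a PDF would first require extracting the figures from the page layout.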

All this will improve the public’s ability to analyze the data, and that is key to making this new disclosure mandate fulfill Pelosi’s promise to increase transparency and ensure greater accountability to the public.

And there’s no reason the Senate should be allowed to keep its expenditures in the dark. The Senate leadership should be encouraged to follow the House’s lead and also publish senators’ expenditures online.

The Sunlight Foundation has repeatedly called for Congress to post these expenditure reports online, initially in March 2008 as part of our model Transparency in Government Act, which we posted on the Web. Since then, we have further encouraged online disclosure through blog posts and communications with Pelosi’s staff.

Speaker Pelosi’s new mandate also follows a recent scandal in the UK over rampant misuse of public funds by Members of Parliament, who spent them on personal items including the repair of a castle moat. The scandal has led several lawmakers, including the Speaker of the British Parliament, to step down. News of this scandal has hit the front pages of newspapers around the world. Undoubtedly, British Members of Parliament would have spent their allowances differently had they expected their purchases to be under public scrutiny.

Today, our newly networked citizenry has rising expectations of greatly expanded access to governmental information, so that it may play a fuller role in understanding, evaluating and participating in the workings of its government. At the same time, online transparency enhances the press’ ability to mind the work of government and be the eyes, ears and voice of the people.

Congress should be encouraged to maximize this new opportunity to show its dedication to truly creating the kind of transparency that technology now makes possible and that the public has come to expect.

  • tshirtman

    The main point is that PDFs are “finished documents”: it’s very difficult to manipulate the data inside. It’s a good thing to be able to produce them, but it’s a bad thing to have only that. What people need is the ability to search the data and produce the reports THEY want. Having some PDFs is not a bad thing for people who want general-purpose information, but an API to query any independent piece of information in those PDFs is what is really needed.

    Well, now I hope one day my country will take this step…

  • PDF has its place. Except if you’re suggesting sending documents in TeX format…

  • Vincent Manis

    There’s a serious case of babies and bathwater here. Some PDF documents indeed can’t be searched, notably those that are scanned from paper documents. Others can be, and quite straightforwardly. Google does it all the time. So too do the PDF readers I normally use, Acrobat Reader, Mac Preview, and okular and evince on Linux. I find them just as easy to search as an HTML document, and easier to search than a tree of HTML documents.

    Speaking as a document producer, I find PDF invaluable. I produce documents on computer science, and HTML handles math too poorly to be of any use to me.

    But I spoke of babies and bathwater. It is essential to define a specific compliance level; I would recommend PDF 1.7, which is an ISO standard (please, no remarks about OOXML, the PDF standard is kosher), but maybe it’s too soon for that; perhaps an earlier level, say PDF 1.4, would support more software. It is quite reasonable to insist that PDF be searchable EXCEPT if it’s a scanned document (and no doubt software that searches such scanned documents will become practical eventually). It is also reasonable to insist that all DRM be turned off in the resulting document, so no password is needed to open the document, and it may reasonably be printed. But please, don’t take away a format that allows the producer to define the layout, use strange fonts that may not be on the consumer’s computer, and supports math, or chemical notation, or music.

  • rgz

    @Gerald Ellis

    Rubbish, PDF can be edited by anyone.

  • Greg

    I don’t understand why it is so common to equate the PDF file format with security. If I cannot edit your website then who cares if I cannot edit a file downloaded from it? And why do people seem to believe that PDF files are somehow edit resistant just because it costs money to buy editor software? Is this not true of MS Word documents as well?

    And since this is public data regarding public money, should we not be able to take the files and edit them in order to perform analysis or combine them with other sources of data? Making them PDF files creates barriers that run counter to the reason for posting them in the first place.

  • P Roy

    1. I, too, have found PDF files, thus far, to be easily accessible and resistant to being edited by just anyone.

    2. The Senate should be held to the same levels of accountability as the House!

  • Gerald Ellis

    I have found PDF files to be easily accessed and downloaded. However, and I consider this an advantage, they cannot be edited by just anyone. I feel this will keep certain malicious acts from occurring in this high-speed era.

  • We need to get through to the folks in government that PDF != transparency. In some ways it’s worse than print, because it lets them say the information is online, yet it’s almost totally inaccessible.