Vivek Kundra

 

Momentum Building To Avert Budget Technopocalypse

While members of Congress and the White House debate whether $33 billion is the right amount by which to cut the federal budget, the rest of us are focused on where these cuts will fall. For our part, we’re trying to save the $34 million that funds Obama-era tech innovations -- like Data.gov, USASpending.gov, and the IT Dashboard -- from the budget ax. And we’re not alone.

For example:

  • And we’re giving Charlie Sheen a run for his money with hundreds if not thousands of tweets at #savethedata (& we’re #winning!)

So how can you help? Go to our website to sign our letter to Congress, write a letter to the editor, and spread the word via Facebook or Twitter. For more information, read my original report, our open letter, and this update from yesterday.

Budget Technopocalypse Deepens: Transparency Sites Will Go Dark In A Few Months

Federal News Radio has an interesting follow-up to my interview with them yesterday on the budget technopocalypse. I wrote last week that Data.gov, USASpending.gov, and other Obama tech innovations face virtual extinction because it appears that Congress will cut their collective budgets from $34m to $2m. We and many others are sending an open letter to Congress in an effort to save these vital transparency programs.

Federal News Radio executive editor Jason Miller reports on the stakes:

One government official, who requested anonymity because they didn't get permission to discuss the topic, said funding will begin to run out on April 20 for public sites IT Dashboard, Data.gov and paymentaccuracy.gov. The source said OMB also is planning on shutting down internal government sites, including Performance.gov, FedSpace and many of the efforts related the FEDRamp cloud computing cybersecurity effort.

The official said two other sites, USASpending.gov and Apps.gov/now, will run through July 30 but go dark soon after. "We need at least another $4 million just to keep USASpending.gov operating this year," the official said. "We are looking at a pass-the-hat approach, but it could be challenging to get that done in time."

Rep. Serrano weighed in:

"The detrimental effect of HR 1 on so many areas of government is clear—and perhaps no more so than on the efforts to ensure the government's IT infrastructure upgrades are proceeding on schedule and on budget," said Rep. Jose Serrano (D-N.Y.), ranking member of the House Appropriations Subcommittee on Financial Services and General Government. "We cannot have a more streamlined, efficient and open government without using the best technology available. Unfortunately the cuts in H.R. 1 to e-government fund will have the unintended consequence of making government less accountable and transparent."

As did Senator Lieberman:

"Economic conditions demand wise budget decisions, but cutting money from multiple federal IT programs is penny-wise and pound foolish," said Leslie Phillips, a spokeswoman for the Senate Homeland Security and Governmental Affairs Committee, which Lieberman is the chairman of. "Programs that modernize technology ultimately improve management and save taxpayers billions of dollars. Transparency and e-government programs encourage public participation in government. Small investments in IT modernization can reap enormous rewards, which is why Senator Lieberman opposes the proposed cuts to the e-gov fund and the administration's IT reform efforts."

I’ll keep you updated as developments happen.

Open Letter: Congress Must Protect Transparency Programs in Budget Negotiations

Today we are releasing an open letter to congressional leaders in an effort to save vital transparency programs. In light of quickly evolving circumstances, we prepared the following document and are encouraging organizations and individuals to sign on. Please add your names and organizations in the comments. Later on we will transmit the final version with the signatories.

Last week I wrote about proposed cuts to the Electronic Government Fund that would effectively defund Data.gov, USASpending.gov, the IT Dashboard, and other programs. Time is running out for Congress to pass a budget for FY 2011, and a rush to avert a government shutdown may result in these programs falling by the wayside. We cannot afford to let the government's transparency efforts go dark.


Budget Technopocalypse: Proposed Congressional Budgets Slash Funding for Data Transparency

Data.gov, USASpending.gov, and other Obama tech innovations face virtual extinction if the FY 2011 budget bill passed by the House of Representatives in February or considered by the Senate in March becomes law. The funding source for these e-government initiatives is the Electronic Government Fund, a $34 million bucket of money that would be drained to $2 million for the remainder of this fiscal year. The House and Senate’s inability to agree on long-term budget legislation has kept these initiatives alive at FY 2010 levels.

Some projects facing defunding include the recently launched cloud computing initiative, the information repository Data.gov, the government-spending reporting site USASpending.gov, citizen engagement tools, and online collaboration tools. Altogether, six project areas apparently will be affected by the cuts. Vivek Kundra, the Federal CIO who is responsible for allocating the Electronic Government Fund, will have to make some very difficult choices.

Although the Electronic Government Fund was never allocated the kind of money envisioned by the authors of the E-Government Act of 2002, starting in FY 2010 the fund was beefed up to $34 million by the incoming Obama administration and Democratic-controlled Congress. Funding levels for the past decade had hovered around $2-3 million.

The funding necessary to keep these programs in place is illuminated by the IT Dashboard, one of the spending-tracking initiatives under Vivek Kundra's leadership. According to the Dashboard, over the last few years data.gov has cost $8.3 million; the cloud computing initiative has cost $1.4 million; and USASpending.gov has cost $13.3 million -- the legislation creating USASpending.gov was co-sponsored by Senator Coburn and then-Senator Obama.

The returns from these e-government initiatives in terms of transparency are priceless. They will help the government operate more effectively and efficiently, thereby saving taxpayer money and aiding oversight. Although we have significant issues with the data quality of some of these programs, and we are concerned that the government may be paying too much for the technology, there should be no doubt that we need the transparency they enable. For example, fully realized transparency would allow us to track every expense and truly understand how money -- like that in the Electronic Government Fund -- flows to federal programs. Government spending and performance data must be available online, in real time, and in machine-readable formats.

Ultimately, it’s unlikely that either budget bill will be enacted into law in its current form. But there is reason for alarm. Each house has considered providing only $2 million for the Electronic Government Fund, although the six continuing resolutions have so far sustained current funding levels on a pro-rated basis. Looking ahead, the Administration called for $34 million in its budget request for FY 2012. The unsettled financial climate means that we can expect this funding fight to continue.

An open and accountable government is a prerequisite for democracy, and keeping these programs alive costs a mere pittance when compared to the value of bringing the federal government into the sunlight.

White House Announces Leading Practices Winners

On Thursday, the White House announced the winners of its Leading Practices initiative, which it first outlined in April.

The Leading Practices were designed to highlight examples where agencies have risen above the expectations set by the White House, and proactively attained a higher standard of transparency. (I participated in helping to establish the leading practices standards.)

The winners are a collection of some of the best transparency work being done at federal agencies, with HHS taking the slot for transparency (quite deservingly). These winners are a nice counterpart to the White House page on Open Government Highlights.

As I wrote when the Leading Practices were first announced, though, there is a bittersweet element to this congratulatory platform. As the White House rightly points to the great work some agencies are undertaking, we can't help but wonder whether there is an analogous effort being undertaken with agencies that are struggling with (or blowing off) the Directive's requirements.

While we can hardly expect the White House or OMB to publicly chastise any laggard agencies, we do have to wonder how much of a private stick exists to go along with this public carrot.

Improvements Needed For High Value Datasets On Data.gov

This morning a number of organizations -- POGO, OMB Watch, CREW, National Security Archive, the Center for Democracy and Technology, and the Open The Government coalition -- and Sunlight sent a letter to Vivek Kundra, Federal CIO, about improvements needed to the release of High Value Datasets on Data.gov. Here are the core recommendations included. Please tell us what you think in the comments below.

As advocates for government openness, we support the Administration’s efforts to provide the public with access to information through Data.gov. We are eager to work with you to ensure the success of Data.gov and, in that spirit, write to raise our concerns with the datasets submitted by agencies to fulfill their requirement under the Open Government Directive to post three high value datasets by January 22, and to offer constructive suggestions for improving their usefulness.

As an overall recommendation, we urge you to add public representatives to the Open Government Initiative interagency working committee and ask the committee to address the problems and recommendations identified below.

Release Format and Usability by the Public

We understand one of the primary purposes of Data.gov is to enable the technology community and transparency advocates to most effectively use the data to make a direct impact on the daily lives of the American people. The format of the data plays a key role in its usability; many within the community of advocates who re-use and repackage government data would prefer data in CSV format, rather than the XML format in which many of the posted databases are provided. Accordingly, we recommend that you strike an appropriate balance between formats (such as XML) that serve the coding community and web-based presentations by agencies that can be used and understood by the general public.

In addition, some of the currently posted files are quite large, ranging upward to several hundred megabytes. Their large size undermines their usefulness for most people or organizations. The large number of currently posted datasets also makes it difficult to find a particular database of interest. We therefore recommend that if a Data.gov dataset is available from an agency through a web-based interface, Data.gov link to that interface on the dataset's Data.gov landing page. For a consumer looking for information on a car seat, for example, it would be far easier to search the Department of Transportation's online database rather than scrolling through screen after screen of raw data in XML format. Additionally, as agencies continue to post datasets to Data.gov, efforts should be made to identify those of greatest public interest that lack such interfaces and develop web interfaces that allow the data to be explored online.

Further, while we agree there is value in aggregating government data in a single site, it is questionable how much the collocation of the currently posted information on Data.gov actually benefits the public. The site is not searchable by topic and does not provide any way to bring together data from different sources on similar topics. As an enhancement to the organization of the site, we recommend that you use tagging or metadata to enable the public to bring together information on a topic. The thesaurus that USA.gov uses provides a useful example of the needed vocabulary.

Value of Data

The release of the datasets also has prompted discussions about the value and the quality of the released data, and the additional value provided by access to existing data in a new format. We believe repackaging old information is of marginal value, yet that is what many agencies have done with their recent postings on Data.gov. According to the Sunlight Foundation, of 58 datasets posted by major agencies, only 16 were previously unavailable in some format online. This leaves the impression that agencies posted easily available data, the proverbial low-hanging fruit, rather than seriously considering which of their datasets truly are of high value. While these initial postings can be considered a test run, more attention needs to be directed toward ensuring the overall quality and usefulness of the data.

In addition, sustained attention should be paid to the possibility of making some of the datasets available as feeds that are constantly up to date, rather than as static datasets that are pulled down and then reposted on an occasional basis.

We recommend that agencies be required to explain why the data is high value by having them designate which of the “high value criteria” the data meets: information that can be used to increase agency accountability and responsiveness; improve public knowledge of the agency and its operations; further the core mission of the agency; create economic opportunity; or respond to need and demand as identified through public consultation. Similarly, we recommend requiring agencies to indicate whether a high value dataset was previously unavailable, available only with a FOIA request, available only for purchase, or available, but in a less user-friendly format. Going forward, this will make it much easier to track how agencies are complying with the other requirements of the Open Government Directive.

While we appreciate the value of data that furthers the mission of an agency, we believe it is equally important to make available to the public data that holds an agency accountable for its policy and spending decisions. We hope to see more datasets of this type available in the near future.

Quality

As is to be expected in efforts of this type, there were a number of glitches -- datasets that could not be downloaded or, once downloaded, could not be opened (the Central Contractor Registration FOIA extract from the General Services Administration seems to have caused several users problems). Additionally, some datasets were incomplete (the Hazard Grant Mitigation Program data released by FEMA is missing 23 years of data between 1966 and 1989). Even more troubling, some did not have header rows, and for those that did, their Data.gov pages did not always link to code sheets explaining what those header rows meant. Without this information, the data cannot be used.

We therefore urge the implementation of a responsive feedback mechanism that allows the public to alert an agency that a specific dataset is not working, lacks information, or is missing explanatory material and provides a response to the concerns within a specified time. One way to address this may be to include an agency contact with the ability to resolve any database problems or provide information about the database. The interagency working group could sample the quality of these agency-specific dialogues to ensure that they are having an impact and to develop recommendations on best practices to improve the responsiveness. Additionally, we strongly recommend that all datasets on Data.gov be directly associated with their code sheets.

Finally, we are concerned with the current lack of public notice when data is removed from the site. We respectfully urge you to note all raw tools and data that are removed from Data.gov, and to provide an explanation for their removal.

Many of the concerns outlined above apply across all or many of the agencies’ datasets. Accordingly, we think that standards for handling these types of problems can easily be addressed through the interagency working group and then disseminated amongst the agencies.
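To make the letter's format point concrete, here is a rough sketch of the kind of conversion that re-users of government data often end up doing themselves: flattening a simple XML file into CSV with a header row. The sample records, the `record` tag name, and the `xml_to_csv` helper are all made up for illustration; real Data.gov files have their own schemas and would need their own handling.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only; not taken from any real dataset.
SAMPLE_XML = """<records>
  <record><year>1989</year><state>VA</state><amount>1200</amount></record>
  <record><year>1990</year><state>MD</state><amount>800</amount></record>
</records>"""


def xml_to_csv(xml_text, row_tag="record"):
    """Flatten one level of child elements under each row_tag into CSV text."""
    root = ET.fromstring(xml_text)
    rows = [
        {child.tag: (child.text or "").strip() for child in row}
        for row in root.iter(row_tag)
    ]

    # Build the header from field names in the order they first appear,
    # so the output always carries a header row.
    header = []
    for row in rows:
        for field in row:
            if field not in header:
                header.append(field)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields missing from a row are written as empty cells
    return out.getvalue()


if __name__ == "__main__":
    print(xml_to_csv(SAMPLE_XML))
```

This only handles flat records. Nested XML, attributes, and very large files (the letter mentions several hundred megabytes) would call for a streaming parser and a documented mapping, which is where the code sheets the letter asks for come in.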

Hearing on Contractor Database Transparency

If you've ever tried to research federal contracts, you know that the online databases that house those contracts are not so great. Sen. Claire McCaskill held a hearing yesterday titled, "Improving Transparency and Accessibility of Federal Contracting Databases." Nancy Scola wrote up the hearing and it isn't pretty:

All told, there are a million lines of code involved. But there's really no all told here, because the databases don't talk to one another. For example, FPDS, the Federal Procurement Data System doesn't communicate with EPLS, which stands for Excluded Parties List. Which means that the USASpending.gov website -- heralded as the American public's window into the inner-workings of government, but powered by FPDS -- doesn't even know that contractors contained within it have been banished from government service for defrauding the United States government or otherwise behaving badly. What's more, on some of these legacy systems, a search for Contractor X, Inc. won't return results for Contractor X Inc. The shorthand for that particular wrinkle came to be known, during the hearing, as "the comma problem." In fact, GAO's William Woods explained to the senators, the poor state of those databases meant that when his agency was asked by Congress to detail how many contractors were billing the United States government for work in Afghanistan and Iraq, the government watchdog group was forced by technology to admit its ignorance. "We could not answer those questions," said Woods. How many KBRs are at work in American war zones, being paid with taxpayer dollars? How many Blackwaters? Dunno.

The biggest problem, however, didn't turn out to be the current state of disrepair, but rather the inability to figure out what to do with the whole disclosure regime. To the surprise of almost everyone in the committee room, the General Services Administration (GSA) has been working to create a more sensible contractor disclosure regime with a more accessible public face. It was difficult for federal Chief Information Officer Vivek Kundra to identify who exactly would be overseeing the -- yes -- contract to revamp the databases. Ultimately that responsibility came down to either the GSA, the Office of Management and Budget or the Office of Federal Procurement.

As Scola writes, "Senator Robert Bennett spoke for many of us today when he sat up on the dais in room 342 of the Dirksen Senate Office Building and rubbed his temples over, and over, and over, and over again."
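For the curious, "the comma problem" from the hearing is the sort of thing a small normalization step can paper over. The sketch below is only an illustration under our own assumptions: the `normalize_name` and `search` helpers and the sample names are hypothetical, and nothing here reflects how FPDS or EPLS actually work.

```python
import re


def normalize_name(name):
    """Lowercase a contractor name, drop punctuation, and collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(no_punct.split())


def search(query, names):
    """Return every name that matches the query once both are normalized."""
    key = normalize_name(query)
    return [n for n in names if normalize_name(n) == key]


if __name__ == "__main__":
    # Hypothetical records, written the way two different systems might store them.
    records = ["Contractor X, Inc.", "Contractor X Inc", "Contractor Y LLC"]
    print(search("Contractor X, Inc.", records))
```

String cleanup like this catches commas and periods, but it will not reconcile "Inc." with "Incorporated" or a parent company with its subsidiaries. Reliable matching across databases would need a shared unique identifier for each contractor.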

Real-Time Data Program Wins Innovation Award

I know this is a couple days old, but it hasn't been mentioned here yet. The District of Columbia's real-time online data disclosure project was one of six winners of the Innovations in American Government awards given out by the Harvard Kennedy School's Ash Institute for Democratic Governance and Innovation. The project was spearheaded by then-D.C. Chief Technology Officer (CTO) and current federal Chief Information Officer (CIO) Vivek Kundra. You can see the two sites singled out for praise below:

According to the Ash Institute, "this is the first initiative in the country that makes virtually all current district government operational data available to the public in its raw form rather than in static, edited reports." Real-time data disclosure is becoming far more common in cities across the nation with San Francisco introducing DataSF.org and the New York City legislature examining open data legislation. (Vancouver, Canada has also endorsed the release of city data in raw form.)

Real-time, raw data disclosure is the cutting edge in transparency and government innovation. While the federal government has released Data.gov, a raw data site similar to D.C.'s, there are countless sets of public data compiled by the federal government that fall short in one or more of the following three ways: 1) Not online; 2) Not in a structured format; 3) Not compiled and disclosed in real time. As many public data sets as possible should clear all three bars: online, structured, and disclosed in real time. For some data it is unreasonable to ask for real-time disclosure. These sets should then, at least, be online and in a structured format.

Side note: It's great to see my city defy our Rodney Dangerfield-like existence and finally get some respect.

This Week in Transparency - July 17, 2009

Here are a few of the more interesting media mentions of Sunlight and our friends and allies from the week:

Jeff Jacoby, columnist for The Boston Globe, mentioned ReadTheBill.org in a piece he wrote calling on congressional lawmakers to read legislation before they vote on it. Glenn Reynolds, at his Instapundit blog, linked to Jacoby's column. Andrew Sullivan's blog, The Daily Dish, followed by linking to Reynolds.

In Washington Monthly's July/August edition, Charles Homans wrote about the Obama administration's "experiments with data-driven democracy." The article centers on the work of Vivek Kundra, the White House's chief information officer, and mentions both the District of Columbia's Apps for Democracy contest and Sunlight's Apps for America contest. Homans quotes Clay Johnson, Sunlight Labs' director, saying Kundra has his work cut out for him. "I have nothing but respect for what he’s trying to do. But it’s a hard job, and it’s going to take some time for this to actually happen right. I mean years." While discussing Kundra's launch of Data.gov, Homans again quotes Clay, "The top data source is on the world’s copper smelters, which isn’t going to tell us very much about what’s going on inside of our government."

As Ellen Miller, Sunlight's director, wrote earlier this week, "When it comes to following the money that’s flowing to power on Capitol Hill, no one does it better than the Center for Responsive Politics." For instance, MAPLight.org used CRP data to show how money watered down the energy bill, the American Clean Energy and Security Act of 2009 (HR 2454). With Congress debating health care reform, Forbes used CRP data to show how America's Health Insurance Plans, the political advocacy and trade group for the health insurance industry, has spent nearly $10 million on lobbying Congress in the past two years. Robert J. S. Ross, writing at The Huffington Post, quotes CRP about how the insurance industry has contributed $568 million to political campaigns since 1998. CNN's Jonathan Mann used CRP data in noting how doctors have spent roughly two-thirds of a billion dollars lobbying lawmakers in the last 10 years.

Sunlight's launch of the National Data Catalog generated a number of good media mentions. Federal News Radio's Dorothy Ramienski interviewed Clay about the launch. He said the impetus for the new site is that Data.gov can't go as far as some would like because of laws that are already in place, such as the Paperwork Reduction Act. "For instance, right now Data.gov only has information around the executive branch of government. It doesn't have any information around the judicial or the legislative branch of government and we don't have any indication as to whether or not it can." Marshall Kirkpatrick at ReadWriteWeb asked, "Can Sunlight build a one-stop-shopping destination for public data, and will people make use of that? Time will tell, but it sounds like a very important project." And Next.gov's Aliya Sternstein referred to the catalog as "a public-service Web site that pulls and repackages federal data - fulfilling the aim of the White House's 'democratizing data' campaign."

National Public Radio's Dina Temple-Raston, in a piece that aired on the network's "Morning Edition," reported how analysts at the FBI and CIA are turning to software to help find patterns among terrorists, hoping to spot clues in everything from phone calls to credit card and ATM usage. She interviewed Jim Dempsey, the director of the Center for Democracy and Technology, who said, "There had been, over the past seven years, this sense that if you collect more and more data and put it into a powerful enough computer, shake it and bake it the right way you'll come up with the unknowns," meaning terrorists who aren't yet on law enforcement's radar screens. "I think, and other people who are more technically adept than I think, that's really a fool's errand."

John Moore at Federal Computer Week wrote how Web 3.0 could help make President Obama’s dream of government transparency a reality, but he’ll need a second term to see it happen. "The Web’s traditional function is to simply present content, such as a government report posted online. The Semantic Web goes a step further by seeking to illuminate the content’s meaning," Moore wrote. Among the challenges, Moore lists the time and effort required to tag and describe the government’s vast data holdings. He quotes Clay expressing concern that the government might become preoccupied with formatting data rather than releasing it. “I would hate to see them get bogged down in trying to make their data Semantic Web compatible before it even sees the light of day,” Clay said. Gary Bass, director of OMB Watch, said his group would like to look at government contractors to see if they comply with Occupational Safety and Health Administration, Equal Employment Opportunity Commission and other agency directives. But the group would need to know that a company listed in one database is the same entity listed in others. “Semantic technology, if done properly, should be able to tell us that,” Gary said.

Veteran reporter J. Scott Orr, writing at Parade magazine, reports on how federal contracts often waste taxpayer money for lack of proper oversight. He cites an investigation (PDF) by the Government Accountability Office that found required performance assessments were conducted for less than one-third of the 23,000 contracts it surveyed. Orr quotes Scott Amey, general counsel to the Project on Government Oversight, saying the feds would save billions of dollars if they would more efficiently collect and share performance data. “Considering Uncle Sam spent over $530 billion last year,” Amey says, “a higher priority must be placed on choosing contractors that are a wise investment.”

U.S. Rep. Bill Cassidy (La.) wrote a column in The Huffington Post calling for more earmark disclosure in Congress. He wrote how he and Rep. Jackie Speier (Calif.) worked with Taxpayers for Common Sense and Sunlight to introduce House Resolution 440, which would strengthen transparency and accountability in the earmarking process.

Think Progress' Matt Corley wrote about a memo GOP message guru Frank Luntz wrote defining the Republican rhetoric on health care reform. Corley quotes from and links to Sunlight senior writer Paul Blumenthal's blog post where he used Capitol Words to show how congressional Republicans are following Luntz's advice. At his Liberaland blog, Alan Colmes, the liberal commentator, syndicated radio talk show host and Fox News Channel political contributor, also linked to Paul's post and republished the infographic that used Capitol Words data to show the impact of the memo.