On company identifiers, the web and reinventing the wheel


This guest post is by Chris Taggart, who co-chaired a workshop on organisational identifiers at the Open Government Data Camp in Warsaw last month. Chris is co-founder of OpenCorporates, the open database of the corporate world.

How do you identify a company in a global world? Such a simple question, just 10 words, yet it is one that has been exercising many august institutions over the past few years, and one that is becoming more urgent as the world becomes more interconnected.

One of the problems is that ‘global’ aspect, requiring something that works across the world, respecting the different and varied jurisdictions and sovereignties, and avoiding a heavyweight governance structure, with all that that entails. Another has to do with answering the question, ‘what is a company?’ Is it something like this:

Or this:

Or like this:

The answer is, it depends on your perspective, and what you are trying to do – tracking political donations, identifying securities issued, identifying where the liability for a bad debt lies, etc. Most large ‘companies’ are in fact a complex network of interlinked corporate legal entities, in different countries, and with complex ownership structures.

In addition, different countries, different states even, take a different view of what they register as companies. And it’s from such uncertainties that Dun & Bradstreet have profited with their near 50-year domination (in the US at least) with the proprietary DUNS system.

However, from a legal perspective (which in the end is what all these things may come down to) it is a form of business that has a distinct legal identity, and this legal identity or personality allows it to agree contracts, employ people, have a bank account. This legal identity has to be created by something, and in virtually every case this is by a company register, which acts on behalf of the state to create it, to give it legal personality – and in fact in many countries, companies are referred to as ‘unnatural persons’.

The question is, however, how to identify these things, and how to refer to them. A common solution in the US is to use Dun & Bradstreet’s proprietary DUNS number. For a number of reasons, this is a terrible way to attack the problem – brilliantly demonstrated by Kaitlin Lee and colleagues at the Sunlight Foundation in their blog post.

In fact it was in trying to solve this problem that we first came up with the idea of OpenCorporates, whose goal is to have an entry for every legal corporate entity in the world. Only when we have such a list, openly licensed for free reuse (including commercially), will we be able to get a handle on how they are connected, where they exist, and what we think about them.

So, what would a working company identification system look like? What would its key properties be?

  1. Free to reuse, unencumbered by Intellectual Property issues
  2. One that doesn’t create a monopoly ID system, that is it doesn’t require a lookup to an ‘owner’ of these IDs. Monopoly IDs create a single point of failure/dominance and are nearly always encumbered with subtle or de facto intellectual property (IP) issues. (Look at many of the standards organisations in the world and check out their IP restrictions.)
  3. Relate uniquely to the legal entity. This is a critical but overlooked requirement, and goes to the heart of company identifiers. What exactly are we trying to identify? If it’s the vague idea of ‘Microsoft’, we’re talking about a complex organism that changes over time and, depending on your perspective, includes or doesn’t include various majority and minority holdings, joint ventures and non-trading special purpose vehicles. Frankly, if you’re talking about this, you might as well use the Wikipedia entry as a proxy for what ordinary people consider to be the company.
  4. Be useful. From it you must have a route to get information about the legal entity – its status, filings, legal address, etc.
  5. Be up-to-date, i.e. there’s little or no time-lag between a company forming and an ID coming into existence
  6. Work on a global level. While there are many companies restricted to a single country, there are very few large ones, and it’s questionable whether very large companies even have a home country, as the issues surrounding BP and the Gulf Oil spill made clear.

Let’s look at some of the suggested solutions:

  1. Dun & Bradstreet’s DUNS system – This spectacularly fails every single one of these tests, being proprietary, not linked to the legal entity (a company may have many factories, each of which has a DUNS number; also many legal entities don’t have one), and not useful. There have been attempts by various other companies to mirror this, tinkering with some of the issues, but all are essentially trying to create a monopoly system that locks you into their ID system.
  2. Company names – These change over time to an amazing extent, and not only can the same company have many different names over its life, the same company name can apply to multiple different and unconnected legal entities over a period of time.
  3. Stock market tickers (e.g. NASDAQ:GOOG) and security numbers (e.g. ISIN) – These fall down after just a couple of minutes of examination: a single company may have multiple listings, for different types of stock or security, listed on multiple exchanges. In addition many large companies aren’t listed at all, or have just one corporate legal entity out of many hundreds listed. Finally, they are rarely explicitly linked to legal entities, although that’s something OpenCorporates is working on fixing.
  4. Tax numbers – These superficially appear to tick many of the boxes, until you look at them more closely. For a start, tax numbers generally relate to a lot more than just companies, including individuals, unincorporated associations, etc. Corporate tax information has also in recent times been private (although that’s not always been the case, and there are certainly arguments that it shouldn’t be), and for this reason a company’s tax number is generally not made public. Many countries also have multiple tax systems, and multiple tax numbers – income tax, employee tax (social security), sales tax, national, local. Finally the tax number doesn’t usually map to legal entities – depending on the country a company may have multiple tax numbers (one for each division), or share a parent company’s tax number – and even when they do there’s rarely a route to the information about the legal entity. In fact, many of the most ‘interesting’ companies (those used in tax planning, and complex corporate structures) don’t actually have tax numbers – in the US this apparently can include S Corporations, some Limited Partnerships, and various flow-through companies. Having said that, tax offices undoubtedly have a lot of information about corporate structure, and opening up that would certainly shine a light on complex corporate entities. It’s also worth adding that in some jurisdictions (Spain, and apparently also Massachusetts) the tax number and the company number are the same.
  5. SEC numbers – these are great to know, but only relate to the biggest companies doing business in the US, and not only don’t they map to US corporate entities (many are for foreign companies), there’s no explicit link to the legal entity. In fact, the information they give is very problematic unless you know what you’re looking at. This, for example, is the SEC entry for Bunge, which at first sight would appear to be a New York company, but is actually registered in Bermuda.

Over the past few years, many have looked at this problem, coming up with increasingly complex solutions to address these issues. Most centre on a new system of IDs, with a new global register (with some mix of mandatory/voluntary registration), introducing errors, time lag, governance, technical issues, and of course meaning that many of the companies that the wider world wants to identify won’t be on there. In short, they are proposed 20th-century solutions for a 21st-century world.

What we need is not AOL, but the web.

Bizarrely, this problem was encountered – and solved – a long time ago, when the first company registers were set up. In the UK (or rather England & Wales), this was in 1856, as the result of the 1844, 1855 and 1856 companies acts, and the first company on the register was the National Savings Bank Association Ltd with company number 1.

That company is no longer in existence, but the Ashford Cattle Market Company Limited is, registered on 25 September 1856, with company number 118. Over that time much has changed in UK and global company law and practice, but that company remains and its company number is still the same (though today it’s normally displayed as 00000118), and information on it can be found on OpenCorporates, other websites, and of course, the official company register, Companies House.
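As a small illustration of how the number issued in 1856 and the form displayed today are the same identifier, here is a minimal Python sketch. The function name is hypothetical, the eight-digit width is taken from the 00000118 example above, and the handling of non-numeric values is an assumption rather than any register's documented rule.

```python
def normalise_company_number(raw: str, width: int = 8) -> str:
    """Left-pad a purely numeric company number with zeros.

    Assumption: anything that is not all digits (for example a number
    with a letter prefix) is returned unpadded, only trimmed and
    upper-cased, because padding rules for those are not given here.
    """
    value = raw.strip()
    if not value.isdigit():
        return value.upper()
    return value.zfill(width)


# The historic number and the modern display form compare equal
# once both have been normalised.
historic = normalise_company_number("118")
modern = normalise_company_number("00000118")
print(historic, modern, historic == modern)
```

The point is only that the identifier itself has not changed; the padding is a matter of presentation.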

And in fact this pattern, of company registers issuing identifying numbers, is pretty much universal. But how do you distinguish between company number 1234 in Delaware, say, and company number 1234 in Michigan?

A simple solution is to combine the name of the jurisdiction with the company number, so DELAWARE1234 and MICHIGAN1234 or UNITEDKINGDOM1234. A better solution in an interconnected internet world is to use web URIs to identify companies, for example the Ashford Cattle Market Company Limited is identified by the OpenCorporates URI of http://opencorporates.com/id/companies/gb/00000118 and the UK Companies House URI of http://business.data.gov.uk/id/company/00000118.

This has several advantages. A URI is by its nature unique, as the domain part of it can only be registered by one organisation at a time, which will serve up a web page corresponding to the bit after the domain – this makes it easy to distinguish between company number 1234 in Georgia the US state and company number 1234 in Georgia the former Soviet republic. It also allows linking between ID systems – should the UK government start publishing tax numbers, for example, it’s easy for those to be added to the information at the Companies House page.
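To make the Georgia example concrete, the sketch below shows what goes wrong when records are keyed on the bare company number, and how qualifying the number with its jurisdiction avoids the clash. The jurisdiction codes ("us_ga", "ge") and both company names are invented for illustration and are not taken from any register.

```python
# Hypothetical records: (jurisdiction code, company number, name).
records = [
    ("us_ga", "1234", "Example Peach Holdings, Inc."),
    ("ge", "1234", "Example Tbilisi Trading LLC"),
]

by_number = {}      # keyed on the bare number: the entries collide
by_qualified = {}   # keyed on (jurisdiction, number): they stay distinct

for jurisdiction, number, name in records:
    by_number[number] = name
    by_qualified[(jurisdiction, number)] = name

print(len(by_number))     # the second record has overwritten the first
print(len(by_qualified))  # both records survive
```

A URI does the same job as the qualified key, with the added benefit that it can be looked up on the web.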

In fact the W3C wrote a very interesting paper in 2009, referenced by Kaitlin in her list of resources for the six degrees project. As I said on the open-companies mailing list when I came across it, when we decided on the OpenCorporates URI structure (which is OpenCorporates.com/id/companies/[ISO code for the jurisdiction]/[company number] ) we must have been channelling their thoughts, as there’s so much in common. A more likely explanation is that we were both looking at this in the context of the global interconnected network of the web.
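The URI structure described above is simple enough to build and take apart mechanically. The following is a rough Python sketch, not OpenCorporates code: the class and function names are hypothetical, and lower-casing the jurisdiction code is an assumption based on the "gb" in the Ashford Cattle Market example.

```python
from typing import NamedTuple
from urllib.parse import urlparse

# Base taken from the example URI quoted earlier in this post.
BASE = "http://opencorporates.com/id/companies"


class CompanyId(NamedTuple):
    jurisdiction: str  # code for the jurisdiction, e.g. "gb"
    number: str        # company number as issued by the register


def to_uri(company: CompanyId) -> str:
    """Build a URI of the form BASE/[jurisdiction]/[company number]."""
    return f"{BASE}/{company.jurisdiction.lower()}/{company.number}"


def from_uri(uri: str) -> CompanyId:
    """Recover the jurisdiction and company number from such a URI."""
    parts = urlparse(uri)
    segments = parts.path.strip("/").split("/")
    if (
        parts.netloc.lower() != "opencorporates.com"
        or len(segments) != 4
        or segments[:2] != ["id", "companies"]
    ):
        raise ValueError(f"not a company URI of the expected shape: {uri}")
    return CompanyId(jurisdiction=segments[2], number=segments[3])


ashford = CompanyId("gb", "00000118")
uri = to_uri(ashford)
print(uri)
print(from_uri(uri) == ashford)
```

Because the jurisdiction and the register's own number are both visible in the URI, nothing has to be looked up from a central ‘owner’ to work out which legal entity is meant.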

Now that’s not to say there aren’t issues to be resolved – not least weaning the US government and other US institutions off the awful DUNS numbering system. There are also two jurisdictions in the world we have found that don’t issue company numbers – Sri Lanka and New York State. But these are issues of corporate governance in those jurisdictions rather than of identifiers as such (without a number, there’s no way of knowing which legal entity you’re dealing with), and compared with the alternatives they are trivial problems to solve.

And of course, it would be good to require US companies to put their company number on all their correspondence and websites, as happens in many other countries. But this system doesn’t require that to happen; it would simply be sensible for them to do so.

Using jurisdiction-based company numbers also has far fewer legal issues than central register solutions, allowing those that bring a company into existence to continue to identify it. Overriding the right of the individual US states to regulate and identify their own companies, still less other countries, particularly in the highly charged political backdrop of the financial crisis, is not somewhere anyone should want to go.