Metaweb has announced an open source release of structured data extracted from Wikipedia. Via the get.theinfo email list:
“Hello from Metaweb. We’ve just released a GFDL licensed extraction of
Wikipedia in XML + relational form. Anyone is welcome to use it for
This follows Reuters' recent announcement of the Open Calais API, which extracts people, places, things, and simple relationships from unstructured text. (We are experimenting with similar techniques of entity tagging via open protocols at Sunlight.) Metaweb's WEX is 57GB of downloadable structured data from the largest peer-production encyclopedia project ever.
The Semantic Web, so long discussed, is now beginning a virtuous cycle of innovation. We are entering the age of open source semantics. Like compound interest, Moore's Law, and exercise, the results of this cycle of innovation around open source semantics will multiply quickly. If you thought Google circa 2007 was impressive, buckle your seatbelt and reach for your helmet. Things are about to move even faster.
Addendum: DBpedia is another project that extracts data from Wikipedia, in its case as RDF.