RWW: Earlier this year you gave an inspiring talk at TED about Linked Data. You described Linked Data as a sea change akin to the invention of the WWW itself – i.e. we’ve gone from a Web of documents to a Web of data. Can you explain, though, how Linked Data relates to the Semantic Web? Is it a subset of it?
TBL: They fit together completely, in that Linked Data actually uses a small slice of all the various technologies that people have put together and standardized for the Semantic Web.
We started off with the Semantic Web roadmap, which had lots of languages that we wanted to create. [However] the community as a whole got a bit distracted from the idea that actually the most important piece is the interoperability of the data. The fact that things are identified with URIs is the key thing.
The Semantic Web and Linked Data connect because when we’ve got this web of linked data, there are already lots of technologies which exist to do fancy things with it. But it’s time now to concentrate on getting the web of linked data out there.
Web inventor Tim Berners-Lee and ReadWriteWeb founder Richard MacManus
Linked Data and Governments
RWW: In a recent Design Issues note, you urge governments to put their data online as Linked Data (although you’d also be happy for governments to just make available the raw data – presumably so that others can then structure it). What do you realistically expect, for example, the U.S. or U.K. governments to do over the next year? And in the near future, do you foresee different governments interconnecting their Linked Data sets?
TBL: One can’t generalize; governments are (like most big organizations) fascinatingly diverse inside. So you’ll find that there are places inside governments where you get a champion who gets Linked Data and who’s just written a script and produced some Linked Data. In the UK government, for example, you’ll find there’s RDFa [in the code of its website] for civil service jobs. So if somebody wants to make a database of all the jobs, they can do that very easily.
There are other cases where the easiest thing for somebody to do is to just put data up in whatever form it’s available. Comma-separated values (CSV) files are remarkably popular. They’re sometimes exported from spreadsheets (it’s remarkable how much information is in spreadsheets), or sometimes pulled out of a database and then put up on the web. That’s not as good, not as useful to the community, as if Linked Data had been put up there and linked. But the first step of actually putting the data out there is the one that nobody else can do.
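To make the CSV-to-Linked-Data step concrete, here is a minimal Python sketch of how rows of a published CSV file might be turned into RDF triples in N-Triples syntax. The namespaces, column names and job data are invented for illustration; the sketch also assumes that the key values and column headers are safe to place inside a URI.

```python
import csv
import io

# Hypothetical namespaces; a real publisher would choose its own stable URIs.
BASE = "http://example.gov/id/job/"
VOCAB = "http://example.gov/def/"


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text, key_column):
    """Emit one subject URI per row and one predicate per remaining column."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[key_column]}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # the key becomes the URI; empty cells yield no triple
            predicate = f"<{VOCAB}{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)


# Made-up sample data standing in for an exported spreadsheet.
sample = "id,title,department\n101,Policy Analyst,Treasury\n102,Statistician,Home Office\n"
print(csv_to_ntriples(sample, "id"))
```

In this sketch every value stays a plain literal. The "linked" part would come from a further step, not shown here, that replaces values such as the department name with URIs shared with other datasets.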
The way to go is for government departments to go the extra step and convert [their data] into Linked Data. One of the nice things about Linked Data, when they have a pile of it, is that they could run a SPARQL server on it. SPARQL servers are a commodity product, a solution for all of the people who say ‘but actually I wanted to have XML.’ A SPARQL server will generate an XML file [and] allow somebody to write out, effectively, a URL for the XML file.
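As a rough illustration of "a URL for the XML file": the SPARQL protocol lets a client send a query as the `query` parameter of an HTTP GET request, so a query can be written out as a plain URL. The endpoint address and vocabulary below are hypothetical, and a client would typically also ask for the XML results format through an `Accept: application/sparql-results+xml` header.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; substitute the address of a real SPARQL server.
ENDPOINT = "http://example.gov/sparql"

# Hypothetical vocabulary URI, used only for illustration.
QUERY = (
    "SELECT ?job ?title "
    "WHERE { ?job <http://example.gov/def/title> ?title } "
    "LIMIT 10"
)


def sparql_url(endpoint, query):
    """Encode a SPARQL query into a GET URL using the protocol's 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0] == QUERY)
```

Anyone who only wants the XML can then bookmark or share that URL without knowing anything about RDF.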
In fact, I don’t see why SPARQL servers shouldn’t provide CSV files, something which as far as I know isn’t in the standards. But I’d recommend it, certainly in a government context, because CSV files are what people have and what people want.
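Until a server offers CSV directly, a client can derive it from the XML results. The following Python sketch, using only the standard library, flattens a SPARQL XML results document into CSV. The sample document and its URIs are invented, and the sketch keeps only the text of each bound term, dropping datatypes and language tags.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def results_xml_to_csv(xml_text):
    """Flatten a SPARQL XML results document into CSV text, one row per result."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variables)  # header row uses the query's variable names
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            term = binding[0]  # the <uri>, <literal> or <bnode> child element
            row[binding.get("name")] = term.text or ""
        # Unbound variables become empty cells.
        writer.writerow([row.get(name, "") for name in variables])
    return out.getvalue()


# Made-up results document of the kind a SPARQL server might return.
SAMPLE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="job"/><variable name="title"/></head>
  <results>
    <result>
      <binding name="job"><uri>http://example.gov/id/job/101</uri></binding>
      <binding name="title"><literal>Policy Analyst</literal></binding>
    </result>
  </results>
</sparql>"""

print(results_xml_to_csv(SAMPLE))
```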
So the message [for government] is to use RDF. Linked Data is the backplane; it’s the thing that you connect to in both directions. As a [web] producer, your job is to make sure that you produce Linked Data one way or another. And as a consumer, there are lots of ways to consume that data once it’s out there as Linked Data.