And now: Semantic Statistics (SemStats)

Official Statistics has a long tradition of creating and providing high-quality metadata. And the Semantic Web needs just this: metadata!

So it’s not surprising that these two are coming together more and more.
A special workshop will be organized during the 12th International Semantic Web Conference (ISWC), 21-25 October 2013, Sydney, Australia.

It is the 1st International Workshop on Semantic Statistics (SemStats 2013), organized by Raphaël Troncy (EURECOM), Franck Cotton (INSEE), Richard Cyganiak (DERI), Armin Haller (CSIRO) and Alistair Hamilton (ABS).

‘ISWC 2013 is the premier international forum for the Semantic Web / Linked Data Community. Here, scientists, industry specialists, and practitioners meet to discuss the future of practical, scalable, user-friendly, and game changing solutions.’

The workshop summary

How to publish linked statistics? And: How to use linked data for statistics? These are the key questions of this workshop.

‘The goal of this workshop is to explore and strengthen the relationship between the Semantic Web and statistical communities, to provide better access to the data held by statistical offices. It will focus on ways in which statisticians can use Semantic Web technologies and standards in order to formalize, publish, document and link their data and metadata.

The statistics community faces sometimes challenges when trying to adopt Semantic Web technologies, in particular:

  • difficulty to create and publish linked data: this can be alleviated by providing methods, tools, lessons learned and best practices, by publicizing successful examples and by providing support.
  • difficulty to see the purpose of publishing linked data: we must develop end-user tools leveraging statistical linked data, provide convincing examples of real use in applications or mashups, so that the end-user value of statistical linked data and metadata appears more clearly.
  • difficulty to use external linked data in their daily activity: it is important to develop statistical methods and tools especially tailored for linked data, so that statisticians can get accustomed to using them and get convinced of their specific utility.’

A tradition

RDF, Triples, Linked Data … these are topics statisticians have already worked on and adopted, but mostly as individuals rather than as organizations.

This blog has a lot of information about the Semantic Web and Official Statistics: about 40 posts since 2007.

See this post (2012) with a recent paper from Statistics Switzerland (where a study on publishing linked data has just been finished in collaboration with the Bern University of Applied Sciences):

Or this (2009) about SDMX and RDF or about LOD activities in 2009:

Europe has an Open Data Portal, too

The European Commission opened its Open Data Portal some days ago.
Powered by CKAN.
Most of the 5811 datasets (97%) are statistical ones provided by Eurostat.
‘Top Publishers
  • Eurostat (5634 datasets)
  • European Environment Agency (106 datasets)
  • Joint Research Centre (37 datasets)
  • Directorate-General for Health and Consumers (12 datasets)
  • Publications Office (11 datasets)
  • Directorate-General for Education and Culture (3 datasets)
  • Directorate-General for Communications Networks, Content and Technology (2 datasets)
  • Directorate-General for Employment, Social Affairs and Inclusion (1 datasets)
  • Directorate-General for Enterprise and Industry (1 datasets)
  • Directorate-General for Regional and Urban Policy (1 datasets)’

Linked Open Data are provided

An important step! ‘The European Commission Open Data Portal is well aligned with the initiatives of linked data and semantic web technologies. The dataset metadata is available as triples on triple store and attached to the dataset records.’

One App for the moment…

… but an interesting one: it creates visualisations from RDF input, with JavaScript-based output (Highcharts charting library).
‘CubeViz is a facetted browser for statistical data utilizing the RDF Data Cube vocabulary which is the state-of-the-art in representing statistical data in RDF. This vocabulary is compatible with SDMX and increasingly being adopted.’
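
For readers who have never seen the RDF Data Cube vocabulary, here is a minimal sketch in Python of what a single statistical observation might look like as triples. The vocabulary URIs (Data Cube and the SDMX-RDF dimension and measure namespaces) are the published ones. Everything under example.org, the helper names and the figure itself are made up for illustration and have nothing to do with the portal's or CubeViz's actual data.

```python
# Published vocabulary namespaces (RDF Data Cube and SDMX-RDF).
QB = "http://purl.org/linked-data/cube#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical base URI, used only for this example.
EX = "http://example.org/stats/"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return "<" + value + ">"


def literal(value, datatype=None):
    """Serialize a simple literal, escaping backslashes and double quotes."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    if datatype is None:
        return '"' + text + '"'
    return '"' + text + '"^^' + uri(datatype)


def observation_triples(dataset, area, period, value):
    """Describe one observation: one area, one period, one measured value."""
    obs = uri(EX + "obs/" + dataset + "/" + area + "/" + period)
    return [
        (obs, uri(RDF_TYPE), uri(QB + "Observation")),
        (obs, uri(QB + "dataSet"), uri(EX + "dataset/" + dataset)),
        (obs, uri(SDMX_DIM + "refArea"), uri(EX + "area/" + area)),
        (obs, uri(SDMX_DIM + "refPeriod"), literal(period)),
        (obs, uri(SDMX_MEASURE + "obsValue"), literal(value, XSD_INTEGER)),
    ]


def to_ntriples(triples):
    """Join (subject, predicate, object) tuples into N-Triples lines."""
    return "\n".join(" ".join(triple) + " ." for triple in triples)


if __name__ == "__main__":
    # The value 1000 is invented, not a real statistic.
    print(to_ntriples(observation_triples("population", "CH", "2012", 1000)))
```

A real publication would also declare the dataset and its structure definition (dimensions, measures, attributes) and would normally rely on an RDF library rather than string building. The sketch only shows the basic shape of an observation.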