LOD MOOC

Massive Open Online Courses (MOOCs) are available worldwide and cover a huge range of topics, including Linked Open Data (LOD). They are an easy way into the Semantic Web.

Two examples:

HPI

snip_opehpi

The Hasso Plattner Institute in Potsdam has for some years offered a course in Linked Data Engineering, with a certificate. I took it some years ago and enjoyed it.

FUN (INRIA)

snip_lodcourse2

The French platform FUN offers a LOD course, too. (Thanks to Adrian at zazuko.com for the hint.)

And books

Step by step, Bob DuCharme introduces RDF, SPARQL, LOD …

snip_ducharmesparql


snip_ducharmesparql-preface

Open Data Portals: News

Some new or refurbished open data portals have recently been announced.

opendata.swiss

Switzerland has just relaunched opendata.swiss with a new look that presents the data better. See the press release.

snip_opendataswiss

snip_swissopendataabout

europeandataportal.eu

The European Commission launched the European Data Portal some months ago.

snip_EuropeanDataPortal

europeandataportal.eu is much more than a collection of open data. It is an ecosystem with lots of documents explaining and promoting open data.

snip_euportalaims

SPARQL inside!

The portal offers its metadata as linked open data, with a SPARQL endpoint for powerful searching.

snip_sparql

The following query returns all data categories (themes) with their number of datasets:

    PREFIX dcat: <http://www.w3.org/ns/dcat#>
    SELECT ?theme (COUNT(?theme) AS ?count)
    WHERE { ?s a dcat:Dataset . ?s dcat:theme ?theme }
    GROUP BY ?theme
    LIMIT 100
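
The same query can also be sent from a script. The sketch below is a minimal illustration and not the portal's documented client: the endpoint URL is a placeholder assumption (look up the real address on the portal), and the helper names are made up for this example. It relies only on the standard SPARQL protocol (a `query` parameter on a GET request) and the W3C SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder address, not the portal's confirmed endpoint URL.
ENDPOINT = "https://example.org/sparql"

QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
SELECT ?theme (COUNT(?theme) AS ?count)
WHERE { ?s a dcat:Dataset . ?s dcat:theme ?theme }
GROUP BY ?theme
LIMIT 100
"""


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_theme_counts(results):
    """Turn a SPARQL JSON result document into a {theme URI: dataset count} dict."""
    counts = {}
    for row in results["results"]["bindings"]:
        counts[row["theme"]["value"]] = int(row["count"]["value"])
    return counts


def fetch_theme_counts(endpoint=ENDPOINT, query=QUERY):
    """Run the query over HTTP (needs network access and a real endpoint)."""
    request = urllib.request.Request(
        build_request_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_theme_counts(json.load(response))
```

Calling `fetch_theme_counts()` with the real endpoint address should return a dictionary that maps each theme URI to its dataset count. Parsing is kept separate from fetching so that it can be tried on a saved result file without network access.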

Impact studies

Most of these data are already published on other websites. The advantages of such open data portals are centralized access and clear licence information. A main intention is to attract developers and to foster data usage and, with it, economic growth.

A Swiss study (January 2014) assesses the economic impact of Open Government Data: ‘The report determined that the economic benefits from OGD for Switzerland lie most likely between CHF0.9 B and CHF1.2 B’.

snip_ogdstudie

All the details >>> here (look for the extended executive summary).

A European study (November 2015), carried out in the context of the launch of the European Data Portal, reports these results: “The aim of this study is to collect, assess and aggregate economic evidence to forecast the benefits of the re-use of Open Data for the EU28+. Four key indicators are measured: direct market size, number of jobs created, cost savings, and efficiency gains. Between 2016 and 2020, the market size of Open Data is expected to increase by 36.9%, to a value of 75.7 bn EUR in 2020. The forecasted number of direct Open Data jobs in 2016 is 75,000 jobs. From 2016 to 2020, almost 25,000 extra direct Open Data jobs are created. The forecasted public sector cost savings for the EU28+ in 2020 are 1.7 bn EUR. Efficiency gains are measured in a qualitative approach.”

snip_EUimpact

See the details >>> here
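
The figures quoted from the European study imply a couple of numbers that the summary does not spell out. The short calculation below is simple arithmetic on those quoted figures, not additional results from the study; the variable names are only for illustration.

```python
# Figures quoted from the European study summary.
MARKET_2020_BN_EUR = 75.7     # forecast direct market size in 2020
GROWTH_2016_TO_2020 = 0.369   # 36.9% increase between 2016 and 2020
JOBS_2016 = 75_000            # forecast direct Open Data jobs in 2016
EXTRA_JOBS_BY_2020 = 25_000   # "almost 25,000", so treated as an upper bound

# Market size in 2016 implied by the growth rate and the 2020 value.
implied_market_2016 = MARKET_2020_BN_EUR / (1 + GROWTH_2016_TO_2020)

# Upper bound on the number of direct jobs in 2020.
jobs_2020_upper = JOBS_2016 + EXTRA_JOBS_BY_2020

print(f"Implied 2016 market size: about {implied_market_2016:.1f} bn EUR")
print(f"Direct jobs in 2020: just under {jobs_2020_upper:,}")
```

In other words, the forecast starts from a market of roughly 55 bn EUR in 2016 and ends with close to 100,000 direct jobs in 2020.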

Next: LOD

Open and machine-readable formats make data easier to access and so foster its economic impact. It is even better when the data come with metadata in a standardized description. Linked Open Data (LOD) in RDF format provides this; europeandataportal.eu uses this format to describe the harvested datasets (metadata). The next step will, and must, be to publish the data themselves in this format, so that masses of data can be linked in the linked data cloud.
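
To make "metadata in RDF" concrete, here is a minimal sketch of how a dataset description can be written as triples. The dataset URI, title and theme URI are invented for illustration (example.org); only the vocabulary terms are real (dcat:Dataset and dcat:theme from DCAT, dct:title from Dublin Core). The helper names are hypothetical, and a real project would more likely use an RDF library than build strings by hand.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"


def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(text, lang=None):
    """Format a string as an N-Triples literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def describe_dataset(dataset_uri, title, theme_uri, lang="en"):
    """Return N-Triples lines describing one dataset with DCAT and Dublin Core terms."""
    subject = uri(dataset_uri)
    triples = [
        (subject, uri(RDF_TYPE), uri(DCAT + "Dataset")),
        (subject, uri(DCT + "title"), literal(title, lang)),
        (subject, uri(DCAT + "theme"), uri(theme_uri)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


# Invented example values, for illustration only.
for line in describe_dataset(
    "http://example.org/dataset/population-2015",
    "Resident population 2015",
    "http://example.org/theme/population",
):
    print(line)
```

Each printed line is one triple (subject, predicate, object). Because every part is a URI or a typed value, descriptions from different portals can be merged and queried together.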

With data.admin.ch, a first step has been made in Switzerland.

snip_dataadmin

Linked Data? Well-made videos in europeandataportal.eu’s ecosystem explain it:

snip_learnLOD

 

 

LOD Cloud Growing

The Linked Open Data Cloud is growing. The new diagram, as of April 2014, shows this development compared to 2011 (diagram below).

Linked Data Cloud 2014

2014-07-26LODCloudDiagram

In the government sector, growth is especially visible in the geospatial reference portal provided by the Office for National Statistics (ONS).

‘The ONS linked data portal is the access point for information on statistical geographies required to support the use of official statistics. It is designed to allow users to discover, view and use geospatial data.’

Other statistical data portals are now visible too, as linked data becomes a new standard: for example data.admin.ch by the Swiss Federal Statistical Office (BFS), IMF, FAO, World Bank and Eurostat.

2014-07-26-LODCloud2014-partial

 

 

Linked Data Cloud 2011

2014-07-26-LODCloud2011

 

And now: Semantic Statistics (SemStats)

Official Statistics has a long tradition in creating and providing high-quality metadata. And the Semantic Web needs just this: metadata!

So it’s not surprising that the two are coming together more and more.
A special workshop will be held during the 12th International Semantic Web Conference (ISWC), 21-25 October 2013, Sydney, Australia.

It is the 1st International Workshop on Semantic Statistics (SemStats 2013), organized by Raphaël Troncy (EURECOM), Franck Cotton (INSEE), Richard Cyganiak (DERI), Armin Haller (CSIRO) and Alistair Hamilton (ABS).

‘ISWC 2013 is the premier international forum for the Semantic Web / Linked Data Community. Here, scientists, industry specialists, and practitioners meet to discuss the future of practical, scalable, user-friendly, and game changing solutions.’

The workshop summary

How to publish linked statistics? And: How to use linked data for statistics? These are the key questions of this workshop.

‘The goal of this workshop is to explore and strengthen the relationship between the Semantic Web and statistical communities, to provide better access to the data held by statistical offices. It will focus on ways in which statisticians can use Semantic Web technologies and standards in order to formalize, publish, document and link their data and metadata.

The statistics community faces sometimes challenges when trying to adopt Semantic Web technologies, in particular:

  • difficulty to create and publish linked data: this can be alleviated by providing methods, tools, lessons learned and best practices, by publicizing successful examples and by providing support.
  • difficulty to see the purpose of publishing linked data: we must develop end-user tools leveraging statistical linked data, provide convincing examples of real use in applications or mashups, so that the end-user value of statistical linked data and metadata appears more clearly.
  • difficulty to use external linked data in their daily activity: it is important to develop statistical methods and tools especially tailored for linked data, so that statisticians can get accustomed to using them and get convinced of their specific utility.’

A tradition

RDF, triples, Linked Data … statisticians have already taken up and adapted these topics, but mostly as individuals rather than as organizations.

This blog has a lot of information about Semantic Web and Official Statistics, about 40 posts since 2007.

See this post (2012) with a recent paper from Statistics Switzerland (where a study on publishing linked data has just been finished in collaboration with the Bern University of Applied Sciences): https://blogstats.wordpress.com/2012/10/15/imaodbc-2012-and-the-winner-is/

Or this (2009) about SDMX and RDF https://blogstats.wordpress.com/2009/10/27/sdmx-and-rdf-getting-acquainted/ or about LOD activities in 2009: https://blogstats.wordpress.com/2009/04/25/semantic-web-and-official-statistics/

LOGSD

Curious about abbreviations? Here’s a new one: Linked Open Government Statistical Data (LOGSD).

LOGSD are statistical data that official statistics agencies provide in a LOD format for reuse. Such reuse may combine (mash up) statistical LOD with other sources in the LOD Cloud.

For example: ONS

The Office for National Statistics (ONS) and others in the UK are very active in this field, for example in improving access to geographical metadata, which are essential for presenting statistics:

‘The solution is to use data.gov.uk as a single access point for discovery of geographic data, and to link from there to a geoportal (that is currently in development) where users could download the geographic products online. This goes most of the way to delivering the tools that users need to work with statistical data but there is also an opportunity to go further and provide geographic data as linked data, using the GSS codes that uniquely identify each geography to link the attributes from the different geographic products. Now, instead of a 9 character GSS identifier, each geography is given a URI that allows it to not only be uniquely identified but also makes it available online. We therefore end up with identifiers such as http://statistics.data.gov.uk/id/statistical-geography/E05008305 that only require users to change the GSS code at the end to get to the geographic information that they need.’ http://data.gov.uk/blog/update-from-ons-on-data-interoperability-0
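
The URI pattern described in the ONS quotation can be sketched in a few lines. This only illustrates that pattern; the function name and the simple format check (one capital letter followed by eight digits, as in E05008305) are assumptions inferred from the example, not an official ONS specification.

```python
import re

# Base of the identifiers given in the ONS quotation.
BASE_URI = "http://statistics.data.gov.uk/id/statistical-geography/"

# Assumption: a GSS code is one capital letter followed by eight digits,
# inferred from the example E05008305, not from an official specification.
GSS_CODE = re.compile(r"[A-Z][0-9]{8}")


def geography_uri(gss_code):
    """Build the linked data URI for a 9-character GSS code."""
    code = gss_code.strip().upper()
    if not GSS_CODE.fullmatch(code):
        raise ValueError(f"not a 9-character GSS code: {gss_code!r}")
    return BASE_URI + code


print(geography_uri("E05008305"))
# http://statistics.data.gov.uk/id/statistical-geography/E05008305
```

Swapping in another GSS code gives the URI of another geography, which is exactly the convenience the ONS text points out.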

Explain-a-LOD

And here is an example of how LOD and statistical (not yet LOGSD) data could work together. It’s an experimental proof of concept that uses data from the Mercer Quality of Living Survey and Transparency International, enriches these data with more information from DBpedia, and calculates correlations that lead to hypotheses about the data.

Heiko Paulheim from Technische Universität Darmstadt carried out this interesting experiment, which illustrates how linking data works. Abstract of Paulheim’s study ‘Generating Possible Interpretations for Statistics from Linked Open Data’:

‘Statistics are very present in our daily lives. Every day, new statistics are published, showing the perceived quality of living in different cities, the corruption index of different countries, and so on. Interpreting those statistics, on the other hand, is a difficult task. Often, statistics collect only very few attributes, and it is difficult to come up with hypotheses that explain, e.g., why the perceived quality of living in one city is higher than in another. In this paper, we introduce Explain-a-LOD, an approach which uses data from Linked Open Data for generating hypotheses that explain statistics. We show an implemented prototype and compare different approaches for generating hypotheses by analyzing the perceived quality of those hypotheses in a user study.’
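
The core idea, correlating a published statistic with attributes pulled in from Linked Open Data, can be sketched as follows. This is a toy illustration and not Paulheim's implementation: the four cities are fictional, all numbers are made up, the attribute names are hypothetical stand-ins for properties one might retrieve from DBpedia, and plain Pearson correlation stands in for whatever measures the prototype actually uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_hypotheses(statistic, attributes):
    """Rank attributes by the absolute strength of their correlation with the statistic."""
    scored = [(name, pearson(statistic, values)) for name, values in attributes.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up toy data for four fictional cities (not real survey results).
quality_of_living = [90.0, 80.0, 70.0, 60.0]
enriched_attributes = {
    "green_area_share": [0.40, 0.35, 0.20, 0.15],  # hypothetical attribute
    "population_millions": [1.0, 3.0, 2.0, 4.0],   # hypothetical attribute
}

for name, r in rank_hypotheses(quality_of_living, enriched_attributes):
    print(f"{name}: r = {r:+.2f}")
```

A strong correlation is only a candidate explanation ("cities with more green space score higher"), which is why the study has users judge the quality of the generated hypotheses.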

LOD Essentials

www.semantic-web.at provides a quick start guide for everyone interested in Open Data, Open Government Data and especially Linked Open Data (LOD), which is the five-star format for publishing data.

‘This is a quick start guide for decision makers who need to quickly get up to speed with the Linked Open Data (LOD) concept, and who want to make their organization a part of this movement.

It gives a quick overview of all key aspects of LOD, and gives practical answers to many pertinent questions including:
• What do the terms Open Data, Open Government Data and Linked Open Data actually mean, and what are the differences between them?
• What do I need to take into account in developing a LOD strategy for my organization?
• What does my organization need to do technically in order to open up and publish its data sets?
• How can I make sure the data is accessible and digestible for others?
• How can I add value to my own data sets by consuming LOD from other sources?
• What can be learned from three case studies of best practices in LOD?
  • REEEP’s clean energy information portal reegle.info
  • NREL’s Open Energy Information Portal
  • The official home of UK legislation: legislation.gov.uk
• What are the potentials offered by this fundamental step-change in the way data is shared and consumed via the web?’