State of Open Data in Europe
The European Commission (Directorate General for Communications Networks, Content and Technology) just published the second Open Data Maturity Report.
‘The two key indicators used to measure Open Data Maturity are Open Data Readiness and Portal Maturity.
— The first key indicator, Open Data Readiness, assesses to what extent countries have an Open Data policy in place, licensing norms and the extent of national coordination regarding guidelines and setting common approaches.
— The second key indicator, Portal Maturity, explores the usability of the portal regarding the availability of functionalities, the overall re-usability of data such as machine readability and accessibility of data sets, for example, as well as the spread of data across domains.
The two key indicators as well as the sub-indicators are shown in the table below.
Open Data Maturity in Europe 2016: The results
Overview by countries:
Results from the Open Data Maturity Assessment for the EU28+ countries for 2016 (p. 59)
From European Data Portal, all the details here.
And one more result that a blog about stats enjoys:
One of the criteria for open data maturity is the re-usability of data and especially machine readability of data. Six questions focus on this item:
‘When looking at the data on the European Data Portal, over 49 different file formats are used. The most used data formats are CSV, HTML and WMS. The fourth most used data format is PDF. PDF is one of the few data formats that is not machine-readable. The following most frequent distributions are ZIP, JSON, XLS and XLSX, followed by WFS and XML. Numbers range from nearly 49,000 CSV formats to just over 23,000 JSON formats to the least used 263 shape formats. Most data formats are or are related to a spreadsheet, which enables to analyse the data more swiftly.’ (p. 49).
That is not enough, and the report recommends:
‘On the more technical side, some improvements are still necessary. To further develop automated processes each national portal should have an API in combination with a complete metadata profile. This allows a portal to share the data with data users more easily. This can for instance enable harvesting data directly from public administrations in an automated fashion, saving efforts in manual uploading of data and limiting errors when editing data and metadata manually. … Typos or different spellings can limit the discovery of data. Here activities conducted at EU level on controlled vocabularies can be of interest to learn from in order to increase semantic interoperability.’ (p. 63)
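To make the point about typos and controlled vocabularies concrete, here is a minimal Python sketch. It assumes a tiny, made-up vocabulary (the terms and variants below are hypothetical and not taken from any EU vocabulary) and shows how variant spellings of a keyword could be mapped to one canonical term before a dataset is published.

```python
import unicodedata

# Hypothetical controlled vocabulary: canonical term -> known variant spellings.
VOCABULARY = {
    "transport": ["transportation", "trasnport", "transports"],
    "environment": ["enviroment", "environnement"],
}


def _fold(text):
    """Lowercase, trim and strip accents so that lookups are lenient."""
    decomposed = unicodedata.normalize("NFKD", text.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Reverse index: every folded spelling points to its canonical term.
LOOKUP = {}
for canonical, variants in VOCABULARY.items():
    LOOKUP[_fold(canonical)] = canonical
    for variant in variants:
        LOOKUP[_fold(variant)] = canonical


def normalise_keyword(keyword):
    """Return the canonical term, or None when the keyword is unknown."""
    return LOOKUP.get(_fold(keyword))
```

With such a mapping, `normalise_keyword("  Enviroment ")` and `normalise_keyword("Environnement")` both resolve to the same term, so a search for that term would find both datasets. A real portal would load an agreed vocabulary instead of a hand-written dictionary.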
There is more
PDF is poor, XLS is better, and CSV is better still, as are JSON and APIs; and metadata are of crucial importance. The European Data Portal sets a good example: it organizes the datasets in triple format (RDF) and offers a SPARQL search.
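For readers who have not met the triple format, the following Python sketch may help. It is not the portal's implementation and it does not use a real SPARQL engine: it keeps a few subject-predicate-object triples in a list and matches them against a pattern, which is the basic idea behind a SPARQL query. The `ex:` identifiers and the dataset titles are invented for illustration; `dct:title` and `dcat:keyword` are property names from the Dublin Core and DCAT vocabularies.

```python
# Dataset descriptions as (subject, predicate, object) triples.
# The "ex:" identifiers and the titles are hypothetical examples.
TRIPLES = [
    ("ex:dataset1", "rdf:type", "dcat:Dataset"),
    ("ex:dataset1", "dct:title", "Air quality 2016"),
    ("ex:dataset1", "dcat:keyword", "environment"),
    ("ex:dataset2", "rdf:type", "dcat:Dataset"),
    ("ex:dataset2", "dct:title", "Bus timetables"),
    ("ex:dataset2", "dcat:keyword", "transport"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


def titles_with_keyword(keyword):
    """Find the titles of all datasets that carry the given keyword."""
    subjects = [s for s, _, _ in match((None, "dcat:keyword", keyword))]
    return [o for s in subjects for _, _, o in match((s, "dct:title", None))]
```

Here `titles_with_keyword("environment")` joins two patterns on the shared subject, much as a SPARQL query joins two triple patterns on a shared variable.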
But there is more.
Semantic interoperability could be fostered not only at the level of the datasets but also by the data inside them. Linked data and suitable formats can ensure this interoperability and, with it, broader machine readability and use of the data. So why not add this criterion to the questionnaire (Question 7.6 +) and lead the national portals in this direction?
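What linked data inside a dataset could look like can be sketched in a few lines of Python. The example assumes a small CSV table with dummy values and a placeholder base address (`http://example.org/`); both are invented for illustration. Each row becomes a set of triples, and the country code is turned into an identifier that other datasets could point to as well.

```python
import csv
import io

# Placeholder namespace; a real publisher would use its own stable URIs.
BASE = "http://example.org/"

# Dummy table: the values are illustrative, not real statistics.
CSV_TEXT = "country,year,value\nBE,2016,100\nLU,2016,200\n"


def rows_to_triples(csv_text, key="country"):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"{BASE}observation/{row[key]}-{row['year']}"
        for column, value in row.items():
            # The key column becomes a shared identifier instead of a bare code.
            obj = f"{BASE}country/{value}" if column == key else value
            triples.append((subject, f"{BASE}property/{column}", obj))
    return triples
```

Because the country is now an identifier rather than a string, two datasets that both refer to `http://example.org/country/BE` can be combined without guessing whether "BE", "Belgium" and "Belgique" mean the same thing.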
Linked data? Tim Berners-Lee explains