Two Years Ago

He was a pioneer and a great inspiration for what public statistics always strives for: more visibility, more understanding and more resonance. Two years ago Hans Rosling (27 July 1948 – 7 February 2017) died too young.

An encounter with Hans Rosling was both demanding and enriching. His demand on public statistics was urgent, and a prerequisite for his enlightening work: that statistical data should be open to all. Here he saw successes. It was and is enriching how he conveyed these data combined with a message. With innovative, precise, entertaining and always very personal presentations, he clarified what had happened and what developments could be desired. He was a realist regarding his effectiveness and yet always an optimist … or better: a “possibilist”. What remains for me is how he taught us to see with numbers – a constant challenge for public statistics.

“One little humble advice” he gave to his audience at the end of a presentation in 2013:

Full presentation here:  
DON'T PANIC — Hans Rosling showing the facts about population

It Goes On

Gapminder (“a fact tank, not a think tank”), with its innovative tools and commitment, lives on with Anna Rosling Rönnlund and Ola Rosling.

And recently Factfulness, a book by the three (Hans Rosling, Anna Rosling Rönnlund, Ola Rosling), has been published with the subtitle “Ten Reasons We’re Wrong About The World – And Why Things Are Better Than You Think”.

“Factfulness: The stress-reducing habit of only carrying opinions for which you have strong supporting facts.”


Open Data Portals: News

There are new or refurbished open data portals to be announced.

opendata.swiss

Switzerland just published opendata.swiss in a new look for a better presentation of data. See the press release.


europeandataportal.eu

The European Commission launched the European Data Portal some months ago.


europeandataportal.eu is much more than a collection of open data. It is an ecosystem with lots of documents explaining and promoting open data.


SPARQL inside!

The portal offers metadata as linked open data with a SPARQL endpoint for powerful searching.


For example, the query

select ?theme (count(?theme) as ?count) where {?s a dcat:Dataset . ?s dcat:theme ?theme} GROUP BY ?theme LIMIT 100

returns all data categories (themes) and the number of datasets in each.
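As written, the query relies on the endpoint already knowing the dcat: prefix. A minimal Python sketch of how the same query could be prepared for any SPARQL endpoint: it declares the prefix explicitly and builds a GET request URL with the query in the `query` parameter, as the SPARQL protocol allows. The endpoint URL is a placeholder and the function names are made up for this illustration; neither comes from the portal's documentation.

```python
from urllib.parse import urlencode

# Standard namespace of the W3C Data Catalog Vocabulary (DCAT).
DCAT_PREFIX = "PREFIX dcat: <http://www.w3.org/ns/dcat#>"

# Placeholder endpoint (assumption): replace with the portal's real SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"


def theme_count_query(limit=100):
    """Return the theme-count query from the post, with the dcat prefix declared."""
    return (
        f"{DCAT_PREFIX}\n"
        "SELECT ?theme (COUNT(?theme) AS ?count)\n"
        "WHERE { ?s a dcat:Dataset . ?s dcat:theme ?theme }\n"
        "GROUP BY ?theme\n"
        f"LIMIT {int(limit)}"
    )


def request_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = request_url(ENDPOINT, theme_count_query())
print(url)
```

The resulting URL can be opened with any HTTP client; which result formats the endpoint returns depends on the server.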

Impact studies

Most of these data are already published on other websites. The advantages of such open data portals are centralized access and clear licence information. A main intention is to attract developers and to foster data usage and, with it, economic growth.

A Swiss study (January 2014) assesses the economic impact of Open Government Data: ‘The report determined that the economic benefits from OGD for Switzerland lie most likely between CHF0.9 B and CHF1.2 B’.

All the details are here (look for the extended executive summary).

A European study (November 2015), published in the context of the launch of the European Data Portal, reports these results: “The aim of this study is to collect, assess and aggregate economic evidence to forecast the benefits of the re-use of Open Data for the EU28+. Four key indicators are measured: direct market size, number of jobs created, cost savings, and efficiency gains. Between 2016 and 2020, the market size of Open Data is expected to increase by 36.9%, to a value of 75.7 bn EUR in 2020. The forecasted number of direct Open Data jobs in 2016 is 75,000 jobs. From 2016 to 2020, almost 25,000 extra direct Open Data jobs are created. The forecasted public sector cost savings for the EU28+ in 2020 are 1.7 bn EUR. Efficiency gains are measured in a qualitative approach.”
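The quoted figures can be cross-checked with a little arithmetic. The following Python sketch derives the market size in 2016 implied by the 2020 value and the growth rate, and the number of direct jobs in 2020. It treats "almost 25,000" as exactly 25,000, so the job total is an upper bound; the variable names are chosen for this illustration only.

```python
# Figures quoted from the study.
market_2020_bn_eur = 75.7      # forecast market size in 2020
growth_2016_to_2020 = 0.369    # expected increase of 36.9 % between 2016 and 2020
jobs_2016 = 75_000             # forecast direct Open Data jobs in 2016
extra_jobs_by_2020 = 25_000    # "almost 25,000", taken here as an upper bound

# Market size in 2016 implied by the 2020 value and the growth rate.
implied_market_2016_bn_eur = market_2020_bn_eur / (1 + growth_2016_to_2020)

# Direct jobs in 2020 (upper bound) and the corresponding relative increase.
jobs_2020 = jobs_2016 + extra_jobs_by_2020
job_growth = extra_jobs_by_2020 / jobs_2016

print(round(implied_market_2016_bn_eur, 1))  # 55.3 (bn EUR)
print(jobs_2020)                             # 100000
print(round(job_growth * 100, 1))            # 33.3 (%)
```

So the forecast implies a market of roughly 55 bn EUR in 2016 and close to 100,000 direct jobs in 2020.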

See the details here.

Next: LOD

Open and machine-readable formats make data easier to access and foster its economic impact. It is even better when the data have metadata in a standardized description. Linked Open Data (LOD) in RDF format provides this; europeandataportal.eu uses this format to describe the harvested datasets (metadata). The next step will and must be data in this format in order to link masses of data in the linked data cloud.

With data.admin.ch a first step has been made in Switzerland.


Linked Data? In europeandataportal.eu’s ecosystem, well-made videos explain it.


For a fact-based Worldview


Hans Rosling, co-founder and promoter of the Gapminder Foundation and of gapminder.org, fights myths with statistics (‘Our goal is to replace devastating myths with a fact-based worldview.’) and tries to counterbalance media focussing on war, conflicts and chaos.

Here is one more example (and this in a media interview…): ‘You can’t use media if you want to understand the world’ (sic!)

And this statement on gapminder.org: ‘Statistical facts don’t come to people naturally. Quite the opposite. Most people understand the world by generalizing personal experiences which are very biased. In the media the “news-worthy” events exaggerate the unusual and put the focus on swift changes. Slow and steady changes in major trends don’t get much attention. Unintentionally, people end-up carrying around a sack of outdated facts that you got in school (including knowledge that often was outdated when acquired in school).’ http://www.gapminder.org/ignorance/

 

 

From Quantity to Quality

Open Data is a much-debated topic and – since the Obama administration launched Data.gov on May 21, 2009 – an international competition, too. Nearly 400 open data portals have emerged in the meantime. But very often there is more concern about the quantity of published data than about their content.

GODI

This issue has been addressed by Open Knowledge (okfn) with its Global Open Data Index:

‘… simply putting a few spreadsheets online under an open license is obviously not enough. Doing open government data well depends on releasing key datasets in the right way.
Moreover, with the proliferation of sites it has become increasingly hard to track what is happening: which countries, or municipalities, are actually releasing open data and which aren’t? Which countries are releasing data that matters? Which countries are releasing data in the right way and in a timely way?
The Global Open Data Index was created to answer these sorts of questions, providing an up-to-date and reliable guide to the state of global open data for policy-makers, researchers, journalists, activists and citizens.’


The Challenge: Be more than a simple measurement tool

The Open Knowledge community has just started ‘a discussion with the open data community and our partners in civil society to help us determine which datasets are of high social and democratic value and should be assessed in the 2015 Index. We believe that by making the choice of datasets a collaborative decision, we will be able to raise awareness of and start a conversation around the datasets required for the Index to truly become a civil society audit of the open data revolution.’ See more at http://blog.okfn.org/2015/08/20/the-2015-global-open-data-index-is-around-the-corner-these-are-the-new-datasets-we-are-adding-to-it/

The result is a list of datasets that can be found on Google docs.

Statistics

For National Statistics (in okfn’s definition), these are the (few) chosen sets:

‘Key national statistics such as demographic and economic indicators (GDP, unemployment, population, etc).
To satisfy this category, the following minimum criteria must be met:
– GDP for the whole country updated at least quarterly
– Unemployment statistics updated at least monthly
– Population updated at least once a year’
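These minimum criteria amount to a simple rule check. A minimal Python sketch, assuming that "quarterly", "monthly" and "once a year" can be approximated by maximum update intervals of 92, 31 and 366 days; these thresholds and all names are assumptions made for this illustration and are not part of okfn's definition.

```python
# Maximum allowed update interval in days per indicator (assumed approximations
# of "at least quarterly", "at least monthly" and "at least once a year").
MAX_UPDATE_INTERVAL_DAYS = {
    "gdp": 92,
    "unemployment": 31,
    "population": 366,
}


def meets_minimum_criteria(update_interval_days):
    """Return True if every required indicator exists and is updated often enough.

    update_interval_days maps an indicator name to the number of days
    between two updates of the published data.
    """
    return all(
        indicator in update_interval_days
        and update_interval_days[indicator] <= limit
        for indicator, limit in MAX_UPDATE_INTERVAL_DAYS.items()
    )


# Hypothetical country: quarterly GDP, monthly unemployment, yearly population.
print(meets_minimum_criteria({"gdp": 91, "unemployment": 30, "population": 365}))
# Hypothetical country publishing GDP only once a year.
print(meets_minimum_criteria({"gdp": 365, "unemployment": 30, "population": 365}))
```

The first call prints True and the second False, because yearly GDP figures do not satisfy the quarterly requirement.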

Open, Useful, Reusable

In OECD’s brand new publication ‘Government at a Glance 2015’ we can find a new indicator: The OUR Index. It stands for ‘Open, Useful, Reusable Government Data’.

‘The new OECD OURdata Index reveals that many countries have made progress in making public data more available and accessible, but large variations remain, not least with respect to the quality of data provided. Governments need to make participation initiatives more accessible, targeted, relevant and appealing.’ (p.8)

Method

‘The data come from the 2014 OECD Survey on Open Government Data. Survey respondents were predominantly chief information officers in OECD countries and two candidate countries (Colombia and Latvia). Responses represent countries’ own assessments of current practices and procedures regarding open government data. Data refer only to central/federal governments and exclude open government data practices at the state/local levels.’ (p.150)


Based on G8 recommendations

‘The OECD OURdata Index measures government efforts to implement the G8 Open Data charter based on the availability, accessibility and government support to promote the reuse of data, focusing on the central OGD portal in each country’ (p.33)

‘The G8 Open Data Charter defines a series of five principles: 1) open data by default; 2) quality and quantity data; 3) usable by all; 4) releasing data for improved governance and; 5) releasing data for innovation, as well as three collective actions to guide the implementation of those principles.’
‘As a first step in producing a comprehensive measure of the level of implementation of the G8 Open Data Charter, the OECD pilot Index on Open government data assesses governments’ efforts to implement open data in three dimensions:
1. Data availability on the national portal (based on principle 1 and collective action 2);
2. Data accessibility on the national portal (based on principle 3) and
3. Governments’ support to innovative re-use and stakeholder engagement (principle 5).
The only principle not covered in this year’s index is Principle 4: Releasing Data for improved governance value (e.g. transparency) as existing measurement efforts have focused primarily on socio economic value creation’ (p.150)


And here comes the ranking

Data for this chart: http://dx.doi.org/10.1787/888933249180

Detailed data for the countries: http://dx.doi.org/10.1787/888933249175

The publication

The publication: OECD (2015), Government at a Glance 2015, OECD Publishing, Paris. http://dx.doi.org/10.1787/gov_glance-2015-en


The Cui-bono Approach to Open Data

What’s the problem? Which data are needed to solve it? Who benefits from it?

These few questions are a valuable key to implementing an open data culture: open data not as ‘l’art pour l’art’ but in a pragmatic approach, demonstrating that the ‘proof of the pudding is in the eating’.


It seems to work very well, as Ton Zijlstra showed in his presentation at the Swiss Opendata Conference 2015.

He gives some examples of situations where open data helped to provide a solution to a problem and where stakeholders got an answer to their issues.


Next Step after OGD: Government’s Big Data Scientist

Open Government Data (OGD) Initiatives have been important steps helping to give broader access to administrative data.

But there was some disappointment because OGD didn’t produce the mass of apps many had hoped for. And meanwhile big discussions about using Big Data emerged.

Now the US is taking a step forward with a Big Data initiative: President Obama has just appointed DJ Patil as Chief Data Scientist in his Office.

https://m.whitehouse.gov/blog/2015/02/18/white-house-names-dr-dj-patil-first-us-chief-data-scientist

‘Patil’s new role will involve the application of big data to all government areas, but particularly healthcare policy.’ (Source)


Open Data Index

There are lots of indexes.
The most famous one may be the Index Librorum Prohibitorum, listing books prohibited by the Catholic Church. It contained eminent scientists and intellectuals (see the list in Wikipedia) and was abolished only in 1966, after more than 400 years.

Open Data Index

One index in which everybody would like to be listed, and with a high rank, is the Open Data Index.

‘An increasing number of governments have committed to open up data, but how much key information is actually being released? …. Which countries are the most advanced and which are lagging in relation to open data? The Open Data Index has been developed to help answer such questions by collecting and presenting information on the state of open data around the world – to ignite discussions between citizens and governments.’


‘The Open Data Index is an initiative of the Open Knowledge Foundation based on contributions from open data advocates and experts around the world. …. The Open Data Index is a community-based effort initiated and coordinated by the Open Knowledge Foundation with participation from many different groups and individuals. The Open Data Census, upon which the Open Data Index is based, was launched in April 2012 to coincide with the OGP meeting in Brasilia.’
See also https://blogstats.wordpress.com/2013/06/15/open-data-census/

‘The 2013 Open Data Index launches just before the Open Government Partnership summit in London, at a time when governments and civil society meet to make commitments, monitor progress, and plan for greater open government and transparency around the world.’ (more).

Country Comparison

2013-10-29_odindex


Country Details: Switzerland

2013-10-29_odindexCH


What criteria matter in the assessment of the datasets?

‘When submitting a dataset, there is a list of questions to answer about the availability and openness of the datasets. These answers appear in the Country overview page for each country:

| Question | Details | Weighting |
|---|---|---|
| Does the data exist? | Does the data exist at all? The data can be in any form (paper or digital, offline or online etc). If it is not, then all the other questions are not answered. | 5 |
| Is data in digital form? | This question addresses whether the data is in digital form (stored on computers or digital storage) or if it only in, for example, paper form. | 5 |
| Publicly available? | This question addresses whether the data is “public”. This does not require it to be freely available, but does require that someone outside of the government can access it in some form (examples include if the data is available for purchase, if it exist as PDFs on a website that you can access, if you can get it in paper form – then it is public). If a freedom of information request or similar is needed to access the data, it is not considered public. | 5 |
| Is the data available for free? | This question addresses whether the data is available for free or if there is a charge. If there is a charge, then that is stated in the comments section. | 15 |
| Is the data available online? | This question addresses whether the data is available online from an official source. In the cases that this is answered with a ‘yes’, then the link is put in the URL field below. | 5 |
| Is the data machine readable? | Data is machine readable if it is in a format that can be easily processed by a computer. Data can be digital but not machine readable. For example, consider a PDF document containing tables of data. These are definitely digital but are not machine-readable because a computer would struggle to access the tabular information (even though they are very human readable!). The equivalent tables in a format such as a spreadsheet would be machine readable. Note: The appropriate machine readable format may vary by type of data – so, for example, machine readable formats for geographic data may be different than for tabular data. In general, HTML and PDF are not machine-readable. | 15 |
| Available in bulk? | Data is available in bulk if the whole dataset can be downloaded or accessed easily. Conversely it is considered non-bulk if the citizens are limited to just getting parts of the dataset (for example, if restricted to querying a web form and retrieving a few results at a time from a very large database). | 10 |
| Openly licensed? | This question addresses whether the dataset is open as per http://opendefinition.org. It needs to state the terms of use or license that allow anyone to freely use, reuse or redistribute the data (subject at most to attribution or sharealike requirements). It is vital that a licence is available (if there’s no licence, the data is not openly licensed). Open Licences which meet the requirements of the Open Definition are listed at http://opendefinition.org/licenses/. | 30 |
| Is the data provided on a timely and up-to-date basis? | This question addresses whether the data is up-to-date and timely – or long delayed. For example, is election data made available immediately or soon after the election, or is it only available many years later? Any comments around uncertainty are put in the comments field. | 10 |
| URL of data online? | The link to the specific dataset if that is possible. Otherwise to the home page for the data. If that is not possible, then the link to main page of site on which the data is located. Only links to official sites are eligible, not third party sites. When it is necessary for submitters to provide third party links, then they are put in the comments section. | |
| Date the data became available? | This question describes when the data first became openly available (online, in digital form, openly licensed etc). Sometimes this is approximate. For example, “2012” or “Jan 2012”. If there is a precise date, then they are typed in in a yyyy-mm-dd format. If the data is not open, then this question will instead describe the date the data first became available at all. (Note: some open data will have been available in other forms previously, so the date specified here is the date it became openly available). | |
| Format of data? | This question describes the form that the data is available in. For example, for tabular data it might be: Excel, CSV, HTML or even PDF. For geodata it might be shapefiles, geojson or something else. If available in multiple formats, the format descriptors are listed separated with commas. Any further information is put in the comments section.’ | |
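The nine weighted questions add up to 100 points. A minimal Python sketch of how a dataset score could be derived from the yes/no answers, assuming the score is simply the sum of the weights of all questions answered with yes, and that a dataset that does not exist scores zero. The quoted text gives the weights but not the exact formula, so this is an illustration and not the Index's official calculation; the key names are invented here.

```python
# Weights from the question list above; the short key names are invented here.
WEIGHTS = {
    "exists": 5,
    "digital": 5,
    "public": 5,
    "free": 15,
    "online": 5,
    "machine_readable": 15,
    "bulk": 10,
    "open_license": 30,
    "up_to_date": 10,
}


def dataset_score(answers):
    """Sum the weights of all questions answered with yes (assumed formula).

    answers maps a question key to True (yes) or False (no); missing keys
    count as no. If the data does not exist, the other questions are not
    answered, so the score is 0.
    """
    if not answers.get("exists", False):
        return 0
    return sum(weight for key, weight in WEIGHTS.items() if answers.get(key, False))


# Hypothetical dataset: everything fulfilled except free access and open licence.
answers = {key: True for key in WEIGHTS}
answers["free"] = False
answers["open_license"] = False

print(sum(WEIGHTS.values()))   # 100
print(dataset_score(answers))  # 55
```

The example shows how heavily licensing counts: a dataset that is otherwise fully available loses 45 of 100 points when it is neither free nor openly licensed.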

For Switzerland, Timetables (of major government operated (or commissioned) *national-level* public transport services (specifically bus and train)) and National government budget (at a high level (e.g. spending by sector, department etc)) are less open.
Data from swisstopo and Statistics Switzerland (partially thanks to the new opendata.admin.ch/ portal) have most criteria in green; the main open question is licensing (not freely available, not free for commercial use).


Featured Visualisation: An example how to present Open Data

NYC Open Data Site Finder
‘This interactive graphic, inspired by Chris Whong’s d3.js network diagram, allows users to access every link in the NYC Open Data site. Hover over a circle in the packed bubble chart to see link info, and click on a circle to access the site in a new browser tab. Use the bar charts and filters to focus your view.’
