The Cui-bono Approach to Open Data

What’s the problem? Which data are needed to solve it? Who benefits from it?

These few questions are a valuable key to implementing an open data culture: open data not as ‘l’art pour l’art’ but as a pragmatic approach, demonstrating that ‘the proof of the pudding is in the eating’.


It seems to work very well, as Ton Zijlstra showed in his presentation at the Swiss Opendata Conference 2015.

He gives some examples of situations where open data helped to provide a solution to a problem and where stakeholders got an answer to their issues.



Basic Needs and Delighters

How to find out user needs? Which method to choose?

These questions find an innovative answer in an article by Ilka Willand (of Destatis, the German Federal Statistical Office), published in volume 31 of the Statistical Journal of the IAOS:

Beyond traditional customer surveys: The reputation analysis
Authors: Willand, Ilka
DOI: 10.3233/sji-150866
Journal: Statistical Journal of the IAOS, vol. 31, no. 2, 2015

Here is a short version, with excerpts taken from the article:


‘An important strategic goal of Destatis is to continuously collect information about the customer satisfaction and the perception of important stakeholders and target groups. We conduct frequent customer surveys since 2007. But not all important stakeholders and target groups are necessarily registered customers. To learn more about their demands a reputation analysis was conducted in 2013 in cooperation with a market researcher. To determine a manageable frame for the study, we focused on three target groups: Respondents (households and enterprises), fast multipliers (online and data journalists) and young multipliers (young academics). The analysis was mainly based on the “Kano-Model”, a methodological approach, which is often used in quality management and product development. In the following article the survey design and the main results will be presented.’

Basic needs and Delighters

‘The most important category is the basic needs. Basic needs are taken for granted and they are typically unspoken. If they are fulfilled, they do not increase satisfaction. If they are not fulfilled, they will cause dissatisfaction.
Delighters are unexpected features that make customers happy. They do not necessarily cause dissatisfaction when not fulfilled, because they are not expected.’
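To make the two categories concrete, here is a minimal sketch of how answers could be sorted into Kano categories. It assumes the classic Kano questionnaire, in which each feature is asked about twice (how would you feel if it were there, and how if it were missing). The excerpt does not say how the Destatis study evaluated its answers, so the answer labels, the simplified evaluation table and the sample responses below are illustrative assumptions, not the study's method.

```python
from collections import Counter

# Answer options of a classic Kano questionnaire (labels are an assumption).
ANSWERS = ("like", "expect", "neutral", "tolerate", "dislike")


def kano_category(if_present: str, if_absent: str) -> str:
    """Classify one respondent's answer pair for a single feature."""
    if if_present not in ANSWERS or if_absent not in ANSWERS:
        raise ValueError("unknown answer")
    # Liking (or disliking) both presence and absence is contradictory.
    if if_present == if_absent and if_present in ("like", "dislike"):
        return "questionable"
    if if_present == "like":
        # Liked when present: missed when absent -> performance, else delighter.
        return "performance" if if_absent == "dislike" else "delighter"
    if if_absent == "dislike":
        # Not actively liked when present, but missed when absent.
        return "basic need"
    if if_present == "dislike" or if_absent == "like":
        return "reverse"
    return "indifferent"


def dominant_category(answer_pairs):
    """Return the most frequent category over all respondents for one feature."""
    counts = Counter(kano_category(p, a) for p, a in answer_pairs)
    return counts.most_common(1)[0][0]


# Hypothetical responses for a feature such as a telephone service.
telephone_service = [("expect", "dislike"), ("neutral", "dislike"), ("like", "neutral")]
print(dominant_category(telephone_service))
```

A feature that people take for granted but would miss ends up as a basic need, and a feature that people like but would not miss ends up as a delighter, which matches the description quoted above.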

Three Target Groups in Focus

‘To determine a manageable frame we focused on three target groups who became increasingly important for the work of the Federal Statistical Office in the past years:
a) Respondents (households, enterprises)
b) Fast multipliers (online and data journalists)
c) Young multipliers (young graduates and PhD students of social and economic sciences).’

‘Target groups were asked for their basic needs and delighters concerning data search, data use and the reporting process.
On a scale from 0 (very bad) to 7 (very good) the reputation values are 5.3 for the fast and the young multipliers, 4.7 for the households and 4.6 for the enterprises.’




‘Most important basic needs and delighters: Especially for the responding enterprises it is a basic need important to get survey results after the survey is completed. A telephone service is a basic need especially for the bigger companies and the households to support the reporting process.
It is a delighter for enterprises to respond only online. This is currently being implemented in Germany, regardless of the results of the survey.’


Fast Multipliers

‘Most important basic needs and delighters: Fast multipliers expect more than databases and datasets. For almost every second a telephone-support is a basic need. This is quite interesting because there are many internal discussions at Destatis to give up that service for the journalists. Also they expect to find data they are looking for as fast as possible and for free on the internet. After an average of 14 minutes of searching on the Destatis website they will contact the information service if they are not able to find what they are looking for. To satisfy their basic need to find data as quick as possible we have to improve the search engine.
Most of our data is already available for free. Interactive charts would delight most of the journalists. Application programming interfaces (APIs) to grab huge amounts of primary data are the delighter especially for the data journalists.’


Young Multipliers

‘Most important basic needs and delighters: There are intersections between the young and the fast multipliers. Young multipliers also want data as fast as possible and for free on the internet. Most of the PhD students expect detailed methodological descriptions related to the datasets. What are the delighters? Surprisingly one half of the young academics mentioned examples on how to read tables and charts as a delighter. Similar to the fast multipliers we have overestimated their statistical knowledge in the past. Already more than one third of them see the opportunity to search for data via smartphone or tablet as a delighter. That means we have to offer more appropriate publication formats in the future.’


Results at a Glance


See also

Ilka Willand received the award at IMAODBC 2013 for presenting this reputation study. See her slides at

Big Data and Official Statistics



Big Data is THE topic of the freshly published Statistical Journal of the IAOS – Volume 31, issue 2.


Five articles deal with Big-Data topics:

In the editorial, Fride Eeg-Henriksen and Peter Hackl give an overview of the Big Data discussions held in Official Statistics. Here are some remarks taken from this editorial:

‘In spite of the wide interest in and the great popularity of Big Data, no clear and commonly accepted definition of the notion Big Data could be established so far [3]. Modern technological, social and economic developments including the growth of smart devices and infrastructure, the growing availability and efficiency of the internet, the appeal of social networking sites and the prevalence and ubiquity of IT systems are resulting in the generation of huge streams of digital data. The complexities of the structure and dynamic of corresponding datasets, the challenges in developing the suitable software tools for data analytics, generally the diversity of potentials in making use of the masses of available data make it difficult to find a suitable and generally applicable definition. The often mentioned characterization of Big Data by 3 – or more – Vs (volume, velocity, variety – as well as veracity and value), does not capture the enormous scope of the corresponding data sets and the extensive potentials of making use of these data. A highly relevant aspect is that Big Data are so large and complex that traditional database management tools and data processing applications are not feasible and efficient means. This is illustrated by a look at the categories of data sources which typically are seen in the context of Big Data: Such data sources may be
– Administrative, e.g., medical records, insurance records, bank records.
– Commercial transactions, e.g., credit card transactions, scanners in supermarkets.
– Sensors, e.g., satellite imaging, environmental sensors, road sensors.
– Tracking devices, e.g., tracking data from mobile telephones, GPS.
– Tracks of human behaviour, e.g., online searches, online page viewing.
– Documentation of opinion, e.g., comments posted in social media.’


‘A general conclusion from the set of articles in this Special Section can be drawn as follows: The feasibility and the potentials of using Big Data in official statistics have to be assessed from case to case. In some areas the use of Big Data sources has already proved to be feasible. The choice of the appropriate IT technology and statistical methods must be specific for each situation. Also issues like the representativity and the quality of the resulting statistics, or the confidentiality and the risk of disclosure of personal data need to be assessed individually for each case. There is no doubt that Big Data will have a place in the future of official statistics, helping to reduce costs and burden on respondents. However, major efforts will be necessary to establish the routine wise use of Big Data, and new approaches will be needed for assessing all aspects of quality.’

[3] C. Reimsbach-Kounatze (2015), “The Proliferation of ‘Big Data’ and Implications for Official Statistics and Statistical Agencies: A Preliminary Analysis”, OECD Digital Economy Papers, No. 245, OECD Publishing.


See also: Big Data in Action May 2015


Data Journalism avant la lettre

From Data to Insight

Where there are data, there is insight. However, insight needs know-how: know-how about data sources, about analyzing data (with particular tools), about the context of the data and, last but not least, about presenting and communicating the insight.


William Playfair

These steps characterize what has for some time now been called data journalism. More than 200 years ago we can already find a brilliant example of ‘data journalism avant la lettre’ by the person who is thought to have invented statistical charts (or ‘lineal arithmetic’): William Playfair.

In his book ‘Lineal Arithmetic’, published in 1798, he presents several short articles about trade relations and the income produced by this trade. His aim is to describe long-term developments, not the current situation in his difficult period of revolution and war. Mercantilism seems to be the context of his argumentation, but his primary interest surely is to demonstrate his innovative visual presentation.


Open Data 1798

Playfair gets his data from the House of Commons’ yearly accounts. Open Data 18th century!


Analyzing and presenting

Playfair’s data research is quite easily done. There are no big data to be trawled through. The result is a few time series of import and export data. It is his presentation that makes the difference!



Playfair presents his findings in a new form. The visual presentation of data is his invention, and he proudly explains this visual ‘mode of representing’ in the introduction of his work. That’s scientific and convincing.


And to make his readers familiar with charts, especially bar charts, he gives a fascinating explanation leading from real-world piles of money to the abstract bars of a drawn chart:

‘This method has struck several persons as being fallacious, because geometrical measurement has not any relation to money or to time; yet here it is made to represent both. The most familiar and simple answer to this objection is by giving an example. Suppose the money received by a man in trade were all in guineas, and that every evening he made a single pile of all the guineas received during the day, each pile would represent a day, and its height would be proportioned to the receipts of that day; so that by this plain operation, time, proportion, and amount, would all be physically combined.
Lineal arithmetic then, it may be averred, is nothing more than those piles of guineas represented on paper, and on a small scale, in which an inch (perhaps) represents the thickness of five millions of guineas, as in geography it does the breadth of a river, or any other extent of country.’ (p.7/8)
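Playfair's piles of guineas translate almost directly into a few lines of code. The sketch below draws one pile per day, with a length proportional to that day's receipts at a fixed scale. The function name, the scale of ten guineas per symbol and the daily figures are hypothetical and chosen only for illustration.

```python
def piles_chart(receipts, guineas_per_symbol=10):
    """Render one 'pile' per day; pile length is proportional to the receipts."""
    lines = []
    for day, guineas in receipts:
        # Scale the day's receipts down to a pile of '#' symbols.
        pile = "#" * round(guineas / guineas_per_symbol)
        lines.append(f"{day:<4}{pile} {guineas}")
    return "\n".join(lines)


# Hypothetical daily receipts in guineas.
receipts = [("Mon", 40), ("Tue", 70), ("Wed", 20)]
print(piles_chart(receipts))
```

As in Playfair's description, time (one line per day), proportion (the relative length of the piles) and amount (the scale) are all combined in one picture.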



Charts and textual explanation go hand in hand. Playfair discusses all charts in short texts. For chart 3 (Germany, see above) it looks like this:



‘ … to aim at facility, in communicating information’ (p.8)

Communicating information is where Playfair excels. And he has studied how to do this and where his target groups are:

‘ …. we think it better to confine this work to mere matter of fact, as much as possible, being fully satisfied that in this small volume is contained what every man in this country, who aims at the reputation of a well-informed merchant, ought to be acquainted with; at the same time, that the Statesman will find in it things which he perhaps already knows, but which are here painted to the eye in a more agreeable and distinct manner than is possible to be done by writing or figures. It is on these grounds that this small, but compendious volume, claims the public attention.’ (p.4)


The title has the message


Big Data in Action

Not long ago in Official Statistics the topic ‘Big Data’ was mostly discussed in a theoretical manner.


However, more and more real and solid examples are now appearing, demonstrating how Big Data work and what their outcome could be.

Some of these examples come from (Official) Statistics. These institutions use Big Data as a source and start applying a new analytical paradigm.


Example 1: Global Pulse (UN)

‘Global Pulse is a flagship innovation initiative of the United Nations Secretary-General on big data. Its vision is a future in which big data is harnessed safely and responsibly as a public good. Its mission is to accelerate discovery, development and scaled adoption of big data innovation for sustainable development and humanitarian action. … Big data represents a new, renewable natural resource with the potential to revolutionize sustainable development and humanitarian practice.’

See some examples of using Big Data below:

  • analyse social media data for perceptions related to sanitation, in order to baseline public engagement
  • use of mobile phone data as a proxy for food security and poverty indicators
  • how risk factors (e.g., tobacco, alcohol, diet and physical activity) of non-communicable diseases (e.g., cancer, diabetes, depression) could be inferred from big data sources such as social media and online internet searches


‘This paper outlines the opportunities and challenges, which have guided the United Nations Global Pulse initiative since its inception in 2009. The paper builds on some of the most recent findings in the field of data science, and findings from our own collaborative research projects. It does not aim to cover the entire spectrum of challenges nor to offer definitive answers to those it addresses, but to serve as a reference for further reflection and discussion. The rest of this document is organised as follows: section one lays out the vision that underpins Big Data for Development; section two discusses the main challenges it raises; section three discusses its application. The concluding section examines options and priorities for the future.’


Example 2: CBS

At Statistics Netherlands (CBS), Big Data is an important research topic.




Several examples were studied:

  • road sensors for traffic and transport statistics
  • mobile phone data for travel behaviour (of active phones) or tourism (new phones that register to network)
  • social media data for a sentiment analysis tracking words with their associated sentiment in Twitter, Facebook, Google+, LinkedIn, etc.
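The third example, tracking words with their associated sentiment, can be illustrated with a very small lexicon-based sketch. This is not the CBS method, which is not described here. The word list, the scores and the sample messages are hypothetical, and a real analysis would need a far larger lexicon and proper language handling.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: word -> sentiment score.
LEXICON = {"good": 1, "great": 1, "happy": 1, "bad": -1, "poor": -1, "angry": -1}


def message_sentiment(text):
    """Label one message by summing the scores of the lexicon words it contains."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_shares(messages):
    """Return the share of positive, neutral and negative messages."""
    counts = Counter(message_sentiment(m) for m in messages)
    total = len(messages) or 1  # avoid division by zero for an empty list
    return {label: counts[label] / total for label in ("positive", "neutral", "negative")}


# Hypothetical social media messages.
messages = ["good day", "bad day", "a day", "great"]
print(sentiment_shares(messages))
```

Computed per day or per week, such shares would form a time series of overall sentiment.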



Example 3: Report of the Global Working Group on Big Data for Official Statistics

In March 2015, the forty-sixth session of the UN Statistical Commission received a report about Big Data in Official Statistics:

‘The report presents the highlights of the International Conference on Big Data for Official Statistics, the outcome of the first meeting of the Global Working Group and the results of a survey on the use of big data for official statistics.’ …

‘The potential of big data sources resides in the timely — and sometimes real‑time — availability of large amounts of data, which are usually generated at minimal cost.  …. before introducing big data into official statistics …. it needs to adequately address issues pertaining to methodology, quality, technology, data access, legislation, privacy, management and finance, and provide adequate cost-benefit analyses.’

UN Statistical Commission Forty-sixth session 3-6 March 2015,
The full report (


Example 4: UNECE Statistics Wiki on Big Data in Official Statistics

A dedicated wiki offers an overview of the ever-growing activities in the field of Official Statistics and Big Data. It’s managed by the Geneva Office of UNECE.

The wiki provides an interesting Big Data Inventory


A new animated population pyramid for Germany 1950–2060

Today Destatis released a new projection of Germany’s population up to 2060, accompanied by an all-new animated population pyramid. It is the first population pyramid that really moves upwards.


In case the above doesn’t display in your preferred language, here are the distinct links for English, French, Spanish, Russian and German.

The pasted screenshot is the mobile version you will automatically see on small screens. There is much more to explore on larger displays: birth years are labeled directly, you can lock an outline for comparison, and there are four different variants to choose from, so that you can judge the outcome under different assumptions.

Apart from starting the animation with the (Play) button, you can navigate through the years with the mouse wheel, the left/right cursor keys or, on touch devices, by swiping up or down directly on the pyramid.