Visual insights

Large amounts of data hide information that can hardly be recognized by simple means. Special methods of data analysis are in demand, and visualization techniques in particular help to get an overview of the information gained and to pass it on in an understandable way.

The media recognised the potential of statistical and other data years ago; this led to what is now practised as data journalism at various large newspapers and in collaborations between newspapers.

The Datablog

A pioneer is The Guardian, whose datablog celebrated its 10th anniversary in March 2019:

Computer-assisted reporting

But hardly anyone is ever the first. Especially when it comes to the visualization of data, there are examples that date back centuries.
Still, a new era dawned when computers began to be used in data analysis to generate interesting journalistic stories.
A central figure here is Philip Meyer, who began using computer-assisted reporting as a journalist in the 1960s.

In his book ‘Precision Journalism: A Reporter’s Introduction to Social Science Methods’, first published in 1973, Meyer describes demands on journalism that remain valid today and that data journalism is now taking up.

‘There was a time when all you [as a journalist] needed was dedication to truth, plenty of energy, and some talent for writing. You still need those things, but they are no longer sufficient. The world has become so complicated, the growth of available information so explosive, that the journalist needs to be a filter, as well as a transmitter; an organizer and interpreter, as well as one who gathers and delivers facts. In addition to knowing how to get information into print, online, or on the air, he or she also must know how to get it into the receiver’s head. In short, a journalist has to be a database manager, a data processor, and a data analyst. …
In the information society, the needs are more complex. Read any of the popular journals of media criticism and you will find the same complaints about modern journalism. It misses important stories, is too dependent on press releases, is easily manipulated by politicians and special interests, and does not communicate what it does know in an effective manner. All of these complaints are justified. Their cause is not so much a lack of energy, talent, or dedication to truth, as the critics sometimes imply, but a simple lag in the application of information science—a body of knowledge—to the daunting problems of reporting the news in a time of information overload.
…
Today’s journalist must also be familiar with the growing journalistic body of knowledge, which, therefore, must include these elements:
1 How to find information.
2 How to evaluate and analyze it.
3 How to communicate it in a way that will pierce the babble of information overload and reach the people who need and want it.
4 How to determine, and then obtain, the amount of precision needed for a particular story.’

(Meyer, p. 1-2)


‘Data is not just about numbers’

Today’s data journalism is closely linked to the philosophy of open data: data should be available in easily usable formats and open for anyone to analyse. But the ambition of current data journalism, as represented by the Guardian authors, still follows the essential ideas of Philip Meyer.

‘We keep some of Meyer’s approach alive in how we do data journalism and we work alongside reporters to get the most out of the combination of data and specialist knowledge. Data is not just about numbers, and behind every row in a database there is a human story. They’re the stories we’re striving to tell.’ (The Guardian, Sat 23 Mar 2019)

Examples

Since then, data-based journalism has set a trend. Many others publish data using graphics and are always looking for new ways to communicate the analysed data in an understandable way.
One of many examples is the New York Times, which celebrated The Upshot’s 5th anniversary in 2019:

‘Five years ago today, The New York Times introduced The Upshot with the aim of examining politics, policy and everyday life in new ways. We wanted to experiment with formats, using whatever mix of text, data visualizations, images and interactive features seemed best for the subject at hand.’


In the meantime, networks have formed that share their knowledge and offer help with data journalism, or data-driven journalism (DDJ). One of them (mostly in German) is datenjournalismus.net.

Outstanding

Among the thousands of data-based stories and their visualizations, highlights appear again and again. I don’t want to withhold my recent favourite: the analysis and visualization of internal migration after German reunification. Die Zeit presented this with a lot of effort and fascinating results in May 2019.

… and much more

Two Years Ago

He was a pioneer and a great inspiration for what public statistics always strives for: more visibility, more understanding and more resonance. Two years ago Hans Rosling (27 July 1948 – 7 February 2017) died too young.

An encounter with Hans Rosling was both demanding and enriching. His demand on public statistics was urgent and a prerequisite for his enlightening work: that statistical data should be open to all. Here he saw successes. It was and is enriching how he conveyed these data combined with a message. With innovative, precise, entertaining and always very personal presentations, he made clear what had happened and which developments were desirable. He was a realist regarding his effectiveness and yet always an optimist … or better: a “possibilist”. What remains for me is how he taught us to see with numbers, a constant challenge for public statistics.

“One little humble advice” he gave to his audience at the end of a presentation in 2013:

Full presentation here:  
DON'T PANIC — Hans Rosling showing the facts about population

It Goes On

Gapminder (“a fact tank, not a think tank”), with its innovative tools and commitment, lives on with Anna Rosling Rönnlund and Ola Rosling.

And recently Factfulness, a book by the three (Hans Rosling, Anna Rosling Rönnlund, Ola Rosling), has been published with the subtitle “Ten Reasons We’re Wrong About The World – And Why Things Are Better Than You Think”.

“Factfulness: The stress-reducing habit of only carrying opinions for which you have strong supporting facts.”


Heroes of Slovenia

My colleagues published the Slovene multi-player statistical quiz app on Tuesday 22nd. I love it!

We’ve all heard “statistics is boring”, but once you add lovely design, humorous content and a strategic game to it, it can be fun. Within two days of the announcement, more than 1,200 players had already played about 24,000 games. And we know a lot more about Slovenia than we knew three days ago 😉

How the game works: first you choose a player name, then a favourite character (a hero from Slovenia), spin the wheel of fortune (which automatically selects the region you play for) and then look for an opponent (a region or a player). Each game has 7 questions, the last one always being a number range slider (statistical data). The winner gets some resources, and so each region gradually evolves from prehistory to the future. The competition between the 12 regions lasts for 7 days, then the game resets to the starting point (keeping the overall scoreboard of players).

In the current version we have about 2,000 questions, 500 of which are statistical (the others cover interesting facts about local peculiarities, history, literature, language, geography …).

At the moment there is only a Slovene version, but the app is ready to be translated or adapted for another country (adapting it would also require some investment in graphic design). I sincerely hope we will make an English version of the Slovene game someday, for our foreign visitors or fans of Slovenia 😉

Website announcement:  https://lnkd.in/gg3byB9

Game website: http://junaki-slovenije.si/

FB page: https://www.facebook.com/JunakiSlovenije

Developer: http://proxima.si/sl/project/junaki-slovenije


Statistical Self-Defense

Not a day goes by without numbers in the (social) media and in everyday life. They aim not only to inform us but also to steer us in one direction or another.

And every day, some of those numbers are false or misleading, whether deliberately or unintentionally.

Therefore, statisticians must arm themselves against the incorrect use of data and keep teaching how to handle statistical data correctly.

There have long been numerous works on this subject. Here is another quite basic presentation by the Dutch journalist Sanne Blauw.

She picks out five statistical sins.

The fact that such presentations often use numbers themselves, which would also have to be viewed critically, does not diminish the value of her warnings.

Easy-to-understand Statistics for the Public

In a recently published Eurostat publication, the authors demand innovative forms of communication from public statistics so that it does not lose its socially important role. Among other things, they demand ‘… to tell stories close to the people; to create communities around specific themes; to develop among citizens the ability to read the data and understand what is behind the statistical process.’

Telling Stories

The UNECE hackathon that has just been completed responds to this challenge.
‘A hackathon is an intensive problem-solving event. In this case, the focus is on statistical content and effective communication. The teams will be challenged to “Create a user-oriented product that tells a story about the younger population”. During the Hackathon, fifteen teams from nine countries had 64.5 hours to create a product that tells a story about the younger population. The teams were multidisciplinary – with members from statistical offices and other government departments. The product created should be innovative, engaging, and targeted towards the general public (that is, not specialists). There was no limit on the form of the product, but the teams had to include a mandatory SDG indicator in the product.
The mandatory indicator was “Proportion of youth (aged 15-24 years) not in education, employment or training” SDG indicator (Indicator 8.6.1).’ (Source)
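
To make the mandatory indicator concrete, here is a minimal sketch of how such a proportion could be computed from person-level records. The record layout (`YouthRecord`), the function name `neet_rate` and the sample data are assumptions made purely for illustration; they do not come from the hackathon or from the official SDG methodology, and real survey estimates would normally also apply sampling weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YouthRecord:
    """Hypothetical person-level record; the field names are illustrative only."""
    age: int
    in_education: bool
    in_employment: bool
    in_training: bool


def neet_rate(records):
    """Return the unweighted share (in percent) of people aged 15-24
    who are not in education, employment or training."""
    youth = [r for r in records if 15 <= r.age <= 24]
    if not youth:
        raise ValueError("no records in the 15-24 age group")
    neet = sum(
        1
        for r in youth
        if not (r.in_education or r.in_employment or r.in_training)
    )
    return 100.0 * neet / len(youth)


# Invented sample data: four people aged 15-24 and one aged 30 (ignored).
sample = [
    YouthRecord(16, True, False, False),
    YouthRecord(20, False, True, False),
    YouthRecord(22, False, False, False),  # the only NEET person in the sample
    YouthRecord(24, False, False, True),
    YouthRecord(30, False, False, False),
]

print(f"NEET rate: {neet_rate(sample):.1f} %")
```

In this invented sample, one of the four people in the age group is neither in education, employment nor training, so the sketch reports a rate of 25 %.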

Winners

The hackathon produced impressive results, even though only a few organisations took part.

The four winners are:

My Favourites

My favourites are number 3 from the National Institute of Statistics and Geography (INEGI-Mexico) and number 2 from the Central Statistical Office of Poland.

Why?

The Mexican solution…

…is aesthetically pleasing and easy to use. Interaction is left to the users, who can control the pace themselves.

The diagrams do not stand alone, but are explained by short texts while scrolling.

The results are not simply taken at face value. Rather, the concepts are explained and questioned: statistics are presented together with their methodological background.

The Polish solution…

…starts with a journalistic approach. Here too, users can control the interactivity at their own pace.

At the end, the authors also seek direct contact with the users; a quiz personalizes the statistical data and gives an individual assessment of where the users stand personally with regard to these statistics.

Success Factors

The two applications mentioned above combine decisive user-friendly features:
– visual appeal,
– easy-to-understand navigation that users can control according to their needs,
– the journalistic approach,
– concise and instructive explanations,
– personalization,
– hints on the methodological background.

Many of the other applications show frequently encountered weaknesses: they try to provide too much information and lack the courage to leave things out and concentrate on the most important elements. This leads to long texts and complex navigation, with the effect that users quit quickly.

The Good, the Bad and the Ugly

Communication of statistics in times of fake news

In a recent paper, Emanuele Baldacci (Director, Eurostat) and Felicia Pelagalli (President, InnovaFiducia) deal with the ‘challenges for official statistics of changes in the information market spurred by network technology, data revolution and changes in information consumers’ behaviours’ (p.3).

Three scenarios

The status-quo or bad scenario:

‘Information will continue to be consumed via multiple decentralized channels, with new information intermediaries emerging through social platforms, digital opinion leaders, technologies that reinforce belonging to peers with similar profiles and backgrounds, including in terms of beliefs.’  … ‘Under this scenario it is likely that increased competition from alternative data providers will put pressure on the official statistics position in the information ecosystem and lead to drastic reduction of public resources invested in official statistics, as a result of the perceived lack of relevance.’ (p.8)

 

The ugly scenario:

‘Big oligopoly giants will emerge by integrating technologies, data and content and providing these to a variety of smaller scale platforms and information intermediaries, with limited pricing power for further dissemination. In this scenario, data generated by sensors and machines connected to the network will increasingly create smart information for individuals. However, individuals will not participate in the data processing task, but will be mostly confined to crowdsourcing data for digital platforms and using information services.’
‘In this scenario, official statistics will be further marginalized and its very existence could be put in jeopardy. More importantly, no public authority with significant influence could be in charge of assessing the quality of data used in the information markets. Statistics as a public good may be curtailed and limited to a narrow set of dimensions. …  Official statisticians will appear as old dinosaurs on the way to extinction, separated from the data ecosystem by a huge technology and capability gap.’ (p.9)

 

The good scenario:

The authors do not stop here. They also see a good scenario, but one that demands a huge commitment.

This scenario is ‘predicated on two major assumptions.
First, the information market will be increasingly competitive by sound regulations that prevent the emergence of dominant positions in countries and even more important across them.
Second, official statistics pursue a strong modernization to evolve towards the production of smart statistics, which fully leverage technology and new data sources while maintaining and enhancing the quality of the data provided to the public.
In this scenario, official statistics will generate new more sophisticated data analytics that cater to different users by tailored information services. It uses network technologies (e.g., blockchain, networks) to involve individuals, companies and institutions in the design, collection, processing and dissemination of statistics. It engages users with open collaborative tools and invests heavily in data literacy to ensure their usability. It strengthens skills and capacity on statistical communication to help users understand in transparent manners what are the strengths and limitations of official statistics.’ (p. 9/10)

 

Actions needed to face the challenges ahead

The good scenario already outlines some of the actions official statisticians need to take. The authors conclude with proposals that are not really new: ideas that have been on the table for some time but are not so easy to implement.

‘It is important to change mindsets and practices which have been established, in order to put in contact the citizens with official statistics, to make data accessible, to expand the understanding of their analysis, to support individuals, business and institutions in the decision-making process.

The key issue is how to be authoritative and to develop quality knowledge in the new and changing information market. It is important to know the rules and languages of the media platforms used for communication; to overcome the technicalities; to tell stories close to the people; to create communities around specific themes; to develop among citizens the ability to read the data and understand what is behind the statistical process. In summary, put people at the center (overused phrase, but extremely valuable):
⎯ communicate statistics through engaging experiences and relevant to the people who benefit from them;
⎯ customize the content;
⎯ adopt “user analytics” to acquire the knowledge of the “users” through the analysis of data (web and social analytics) and the understanding of people’s interaction with the different platforms.’ (p.11)

And the concluding words call for external assistance:

‘It will be essential for statisticians to build more tailored data insight services and team up with communication experts to play a more proactive role in contrasting fake news, checking facts appropriately and building users’ capacity to harness the power of data.’ (p.12)


Corporate nieuws

Eurostat’s biennial scientific conference on New Techniques and Technologies for Statistics (NTTS) is over, a labyrinth of a website is online and tons of documents have been published somewhere.

CBS Corporate nieuws summarizes the important trends discussed:
1) New data sources and the consequences
2) The importance of a proactive communication
3) Big Data and algorithms in official statistics

(Source: CBS, 06-06-2017, Miriam van der Sangen)

Corporate websites

Why take this information from CBS (the Dutch Statistical Office) in particular? Because CBS Corporate nieuws is an excellent example of the second trend: proactive communication, that is, proactively delivering (statistical) information to users. The website makes corporate information public and gives insights into the activities of CBS and into statistics. You see topics …

… and the people behind it.

The target audience of this corporate website includes enterprises, administrations, journalists, students and anyone else who may be interested.

A shorter English version is integrated into the CBS website.

Corporate websites like CBS’ are rather unusual. They are resource-intensive, but probably very good at helping people understand statisticians’ mission and work … and at motivating employees.