What are they doing …. ?

… and how do statistical institutions present what they do?

In times of fake news and austerity measures, statistical offices increasingly feel the need to inform the public about themselves and about the usefulness and necessity of trustworthy statistics.

But how to proceed?

Public relations specialists know countless ways to get messages to their target groups. One traditional and usually rather dull way is the annual report. It is often seen as a mere obligation and treated accordingly.

Annual reports as ambassadors for public statistics

Are annual reports still such dull reading under the changed circumstances mentioned above? Let’s look at a few examples.

 

#1 European Official Statistics

The European Statistical Governance Advisory Board publishes the report, which focuses on fake news and trust issues. It’s mainly a control report with recommendations to be re-evaluated next year.

It is not reading for everyone, but it contains some interesting facts about the European statistical infrastructure.


‘ … this year’s Report focuses on the importance of good governance to maintain and increase trust in official statistics, ensuring appropriate access to administrative and privately-held data, and the practical challenges of coordinating NSSs.
Chapter 1 looks first at the challenge of maintaining and enhancing trust in official statistics when there is conflicting information provided by non-official sources or when statistical indicators fail to relate to citizens’ actual experiences. Access to administrative records and privately-held data is then examined, highlighting some of the difficulties encountered by NSIs and the need to ensure that the transposition of the new Regulation on General Data Protection into national law does not hinder access to data for statistical purposes. Finally, the challenge of coordination within NSSs is discussed, particularly in relation to ONAs.
Chapter 2 provides ESGAB’s overview of the implementation of the Code of Practice, ..
Chapter 3 reviews ESGAB’s activities over its first nine years, … ‘ (p.10)
ESGAB-Recommendations-2017
p.8

Glossary

European Statistics Code of Practice (‘the Code’)
The European Statistics Code of Practice sets the standards for developing, producing and disseminating European statistics. It builds on a common definition of quality in statistics used in the European Statistical System, composed of national statistical authorities and Eurostat. ….

European Statistical Governance Advisory Board (ESGAB, ‘the Board’)
ESGAB provides an independent overview of the implementation of the Code of Practice. It seeks to enhance the professional independence, integrity and accountability of the European Statistical System, key elements of the Code, and the quality of European statistics …..

European Statistical System (ESS)
The European Statistical System is a partnership between the European Union’s statistical authority, i.e. the Commission (Eurostat), the National Statistical Institutes (NSIs) and Other National Authorities (ONAs) ….

Some interesting facts given in this report:

gdp-EU


#2 UK

The UK report is similar in type to the EU one, but somewhat more systematic, with clear performance targets and evaluated indicators … and tons of financial data.

‘This year has been a challenging one for those of us working in official statistics. Numbers were very much in the news in the run-up to the EU referendum and since. Examples of bad use of numbers and misrepresentation of statistics can cast a shadow over the validity and integrity of evidence. However, information that can be accepted and used with confidence is essential to good decision making by governments, businesses and individuals. …’ (John Pullinger, p. 4)
‘The 2007 Act requires that the Authority produces a report annually to Parliament and the devolved legislatures on what it has done during the year, what it has found during the year and what it intends to do during the next financial year. This report fulfills that responsibility.’ (p.9)
‘STRATEGIC OBJECTIVES
To achieve its mission, over five years the Authority will focus on five perspectives:
a helpful, professional, innovative, efficient and capable statistical service will, we believe, serve the public good and help our nation make better decisions.’ (p.9)
‘KEY PERFORMANCE INDICATORS
The Authority’s Business Plan includes a number of Performance Metrics through which we monitor performance. Our performance against these indicators is summarised in the table below. It is important to note our targets are always used to stretch performance ..’ (p. 9)
And some interesting facts:

 

#3 Sweden

Sweden reports concisely on a few central goals and includes the obligatory information on organisation and infrastructure.

‘Statistics Sweden plays a key role in public infrastructure. Its task is to develop, produce and disseminate official and other government statistics. The Official Statistics Act sets out a number of criteria concerning statistical quality, in which statistical relevance is a top priority.’ (Joakim Stymne, p. 4)

‘Punctuality in publishing remained high and amounted to 99 percent. No corrections that were considered serious were made to the published statistics during the year, and there were fewer internal error reports than in 2016.’ (p. 7)


‘During 2017, Statistics Sweden has studied how its customers and users view the agency and its products in different ways.’ (p. 10)

#4 Switzerland

Switzerland’s report differs from the others in two ways:
– The report shows not only the activities of the Office, but also the state of the country by topic (the milestones of the multi-annual statistical programme, and at the same time a small Statistical Yearbook).
– And it is very personal: the people responsible for the statistics become visible.

The report is available in German and French only.

‘The first half of the legislative period is over, and with it the first two years of the multi-annual statistical programme 2016–2019. The objectives and priorities set out in it form the guidelines for the work of federal statistics. The milestones planned for 2017 were successfully implemented. … … the mandate of federal statistics is summarised as follows: «At the centre of the mandate of federal statistics are the production and dissemination of user-oriented information on important areas of life in our society. This information serves, among other things, the planning and steering of central policy areas, whose state and development can be observed and assessed with the help of statistical information.»’ (Georges-Simon Ulrich, p. 5; translated from the German)

The state of statistics in the topic areas: e.g. Population

And the targets for the future: focal points and priority developments in the coming year:

Some interesting facts about structure and publishing

Staff

Publishing


#5 Germany

Germany takes quite a different approach: the annual report is more like a scientific magazine, with interviews and contributions on focal topics.


‘ People are being guided more by their emotions and less and less by facts – this is how we might sum up the post-truth debate which reached its hitherto climax last year, culminating in “postfaktisch” (post-factual, or post-truth) being chosen as the German Word of the Year 2016. …
I hope that all of the other topics dealt with in this report provide you with a good insight into all matters figure-related and that, in so doing, we can enhance your trust and confidence in official statistics.’ (Dieter Sarreither, p.3).

The table of contents shows how this report is designed as a magazine.

Some interesting information about the office

This report, too, has a personal touch and shows the management staff responsible.

# Conclusion

Annual reports are certainly not the most effective way of informing the public about the activities and importance of statistical institutions. They need to be accompanied by other measures and embedded in a broader PR effort. Then they can – especially if they are well made – contribute a lot to understanding official statistics.

 

 

 

 

 

Reading a Picture

Visual storytelling

Visualising data helps in understanding facts.
Sometimes a graph is very easy to understand; sometimes it has to be read and studied closely before it reveals unknown territory.

Such graphs are little masterpieces. Here is one of them, and I am sure the authors went through more than one iteration and discussion while creating it.
The graph tells the story of the average disposable income and savings of households in Switzerland, published by the Swiss Federal Statistical Office FSO.

snip_disposable-income2

The authors kindly give a short explanation:

How to read this graph.
In one-person households aged 64 or under, the upper-income group has a disposable income of CHF 8487 per month and savings of CHF 2758 per month. Representing 4.0% of all households, this income group corresponds to a fifth of one-person households aged 64 or under (20.1%)
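The reading example can be followed with a little arithmetic. This is a sketch only: the four input figures are the ones quoted above, while the two derived percentages are simple calculations on rounded published values and are not figures given by the FSO.

```python
# Figures quoted in the reading example above.
disposable_income = 8487            # CHF per month
savings = 2758                      # CHF per month
share_of_all_households = 4.0       # upper-income group, % of all households
share_within_household_type = 20.1  # same group, % of one-person households aged 64 or under

# Savings expressed as a share of disposable income.
savings_rate = savings / disposable_income * 100

# Implied weight of this household type among all households
# (derived from two rounded percentages, so only approximate).
household_type_share = share_of_all_households / share_within_household_type * 100

print(f"Savings rate: {savings_rate:.1f}%")
print(f"One-person households aged 64 or under: about {household_type_share:.1f}% of all households")
```

Under these assumptions the upper-income group saves roughly a third of its disposable income, and one-person households aged 64 or under make up roughly a fifth of all households.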

There’s another nice graph, a little less elaborate, also explained by the authors:

snip-povertyrates

Statistics ♥

But there’s one thing that is not explained:

snip_poverty-ci

the confidence interval!

‘A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data,‘ and the above poverty data are from a sample of ‘approximately 7000 households, i.e. more than 17,000 persons who are randomly selected…’.
Or:
‘The confidence intervals for the mean give us a range of values around the mean where we expect the “true” (population) mean is located (with a given level of certainty, see also Elementary Concepts). ….. as we all know from the weather forecast, the more “vague” the prediction (i.e., wider the confidence interval), the more likely it will materialize. Note that the width of the confidence interval depends on the sample size and on the variation of data values…..’
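To make the quoted definition concrete, here is a minimal sketch of a confidence interval for a mean. It assumes a simple random sample and uses the normal approximation; the function name `mean_confidence_interval` and the income data are invented for illustration, and this is not the estimation method used for the poverty figures above, which come from a survey with its own design.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of a simple random sample."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return centre - z * standard_error, centre + z * standard_error


# Hypothetical population of monthly incomes, invented for illustration only.
random.seed(1)
population = [random.gauss(6000, 2000) for _ in range(100_000)]

# The same population sampled with a small and a large sample size.
for sample_size in (100, 7000):
    sample = random.sample(population, sample_size)
    low, high = mean_confidence_interval(sample)
    print(f"n={sample_size}: mean between {low:.0f} and {high:.0f} (width {high - low:.0f})")
```

The larger sample gives a much narrower interval, and a higher confidence level gives a wider one, which is the weather-forecast point made in the quotation.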

Khan Academy gives lectures about topics like confidence intervals, sampling, etc.


Which one?

The above graphs use just one of multiple possibilities for visualising data.

snip_graph-catalogue

Severino Ribecca’s Data Visualisation Catalogue is one of many websites trying to give an overview. And there is a risk of getting lost in these compilations.

snip_swimring © listverse.com

Statistics is Dead – Long Live Statistics

To be an expert in a thematic field!

Lee Baker wrote an article that will please the whole community of official statistics, where specialists from many thematic fields (and not only statisticians or mathematicians or … data scientists) collect, analyse, interpret, explain and publish data.
It’s this core message that counts:
“… if you want to be an expert Data Scientist in Business, Medicine or Engineering”  (or vice versa: An expert statistician in a field of official statistics like demography, economy, etc.)  “then the biggest skill you’ll need will be in Business, Medicine or Engineering…. In other words, …. you really do need to be an expert in your field as well as having some of the other listed skills”

Here is his chain of arguments:

“Statistics is Dead – Long Live Data Science…

by Lee Baker

I keep hearing Data Scientists say that ‘Statistics is Dead’, and they even have big debates about it attended by the good and great of Data Science. Interestingly, there seem to be very few actual statisticians at these debates.

So why do Data Scientists think that stats is dead? Where does the notion that there is no longer any need for statistical analysis come from? And are they right?

Is statistics dead or is it just pining for the fjords?

I guess that really we should start at the beginning by asking the question ‘What Is Statistics?’.
Briefly, what makes statistics unique and a distinct branch of mathematics is that statistics is the study of the uncertainty of data.
So let’s look at this logically. If Data Scientists are correct (well, at least some of them) and statistics is dead, then either (1) we don’t need to quantify the uncertainty or (2) we have better tools than statistics to measure it.

Quantifying the Uncertainty in Data

Why would we no longer have any need to measure and control the uncertainty in our data?
Have we discovered some amazing new way of observing, collecting, collating and analysing our data that we no longer have uncertainty?
I don’t believe so and, as far as I can tell, with the explosion of data that we’re experiencing – the amount of data that currently exists doubles every 18 months – the level of uncertainty in data is on the increase.

So we must have better tools than statistics to quantify the uncertainty, then?
Well, no. It may be true that most statistical measures were developed decades ago when ‘Big Data’ just didn’t exist, and that the ‘old’ statistical tests often creak at the hinges when faced with enormous volumes of data, but there simply isn’t a better way of measuring uncertainty than with statistics – at least not yet, anyway.

So why is it that many Data Scientists are insistent that there is no place for statistics in the 21st Century?

Well, I guess if it’s not statistics that’s the problem, there must be something wrong with Data Science.

So let’s have a heated debate…

What is Data Science?

Nobody seems to be able to come up with a firm definition of what Data Science is.
Some believe that Data Science is just a sexed-up term for statistics, whilst others suggest that it is an alternative name for ‘Business Intelligence’. Some claim that Data Science is all about the creation of data products to be able to analyse the incredible amounts of data that we’re faced with.
I don’t disagree with any of these, but suggest that maybe all these definitions are a small part of a much bigger beast.

To get a better understanding of Data Science it might be easier to look at what Data Scientists do rather than what they are.

Data Science is all about extracting knowledge from data (I think just about everyone agrees with this very vague description), and it incorporates many diverse skills, such as mathematics, statistics, artificial intelligence, computer programming, visualisation, image analysis, and much more.

It is in the last bit, the ‘much more’ that I think defines a Data Scientist more than the previous bits. In my view, if you want to be an expert Data Scientist in Business, Medicine or Engineering then the biggest skill you’ll need will be in Business, Medicine or Engineering. Ally that with a combination of some/all of the other skills and you’ll be well on your way to being in great demand by the top dogs in your field.

In other words, if you want to call yourself a Data Scientist you really do need to be an expert in your field as well as having some of the other listed skills.

Are Computer Programmers Data Scientists?

On the other hand – as seems to be happening in Universities here in the UK and over the pond in the good old US of A – there are Data Science courses full of computer programmers that are learning how to handle data, use Hadoop and R, program in Python and plug their data into Artificial Neural Networks.

It seems that we’re creating a generation of Computer Programmers that, with the addition of a few extra tools on their CV, claim to be expert Data Scientists.

I think we’re in dangerous territory here.

It’s easy to learn how to use a few tools, but much much harder to use those tools intelligently to extract valuable, actionable information in a specialised field.

If you have little/no medical knowledge, how do you know which data outcomes are valuable?
If you’re not an expert in business, then how do you know which insights should be acted upon to make sound business decisions, and which should be ignored?

Plug-And-Play Data Analysis

This, to me, is the crux of the problem. Many of the current crop of Data Scientists – talented computer programmers though they may be – see Data Science as an exercise in plug-and-play.

Plug your dataset into tool A and you get some descriptions of your data. Plug it into tool B and you get a visualisation. Want predictions? Great – just use tool C.

Statistics, though, seems to be lagging behind in the Data Science revolution. There aren’t nearly as many automated statistical tools as there are visualisation tools or predictive tools, so the Data Scientists have to actually do the statistics themselves.

And statistics is hard.
So they ask if it’s really, really necessary.
I mean, we’ve already got the answer, so why do we need to waste our time with stats?

Booooring….

So statistics gets relegated to such an extent that Data Scientists declare it dead.”

The original article and discussion –>here


About the Author

Lee Baker is an award-winning software creator with a passion for turning data into a story.
A proud Yorkshireman, he now lives by the sparkling shores of the East Coast of Scotland. Physicist, statistician and programmer, child of the flower-power psychedelic ‘60s, it’s amazing he turned out so normal!
Turning his back on a promising academic career to do something more satisfying, as the CEO and co-founder of Chi-Squared Innovations he now works double the hours for half the pay and 10 times the stress – but 100 times the fun!


This post is taken from datascience.central and has been published previously in Innovation Enterprise and LinkedIn Pulse

For a fact-based Worldview


Hans Rosling, co-founder and promoter of the Gapminder Foundation and of gapminder.org, uses statistics to fight myths (‘Our goal is to replace devastating myths with a fact-based worldview.’) and tries to counterbalance media that focus on war, conflicts and chaos.

Here is one more example (and this in a media interview …): ‘You can’t use media if you want to understand the world’ (sic!)

And this statement on gapminder.org: ‘Statistical facts don’t come to people naturally. Quite the opposite. Most people understand the world by generalizing personal experiences which are very biased. In the media the “news-worthy” events exaggerate the unusual and put the focus on swift changes. Slow and steady changes in major trends don’t get much attention. Unintentionally, people end-up carrying around a sack of outdated facts that you got in school (including knowledge that often was outdated when acquired in school).’ http://www.gapminder.org/ignorance/

 

 

Dürer’s Rhinoceros and Statistics

Mixing Dürer’s Rhino with Statistics might sound a little bit strange.

Dürer’s Rhinoceros – Wikipedia, the free encyclopedia.

But from an epistemological perspective there is a point. Dürer never saw a rhinoceros; he created it, in accordance with some information he had received, in the process of drawing. Statistics, in a sense, do the same, and this even with objects which do not exist in ‘reality’.

This topic is itself the subject of several studies. The Norwegians Ann Rudinow Saetnan, Heidi Mork Lomell and Svein Hammer treat it in their reader ‘The mutual construction of statistics and society’.

‘How does the act of counting affect the world? How does it change the objects counted, change the lives of those who count (double entendre intended)? … Our argument, briefly stated, is that society and the statistics that measure and describe it are mutually constructed. This argument addresses two counterarguments from seemingly opposite directions. On the one hand, we oppose the notion that statistics are simple, straightforward, objective descriptions of society, gathered from nonparticipant points of observation…. Like all other specific forms of viewing, it is a social act. Counting acts in and upon the social world. Of course, this also means that not counting has an effect on the aspects of the world we (do and/or don’t) count. ….
On the other hand, we also oppose the notion that statistics and/or society are mere fictions, to be invented at will.’ (Introduction, p. 1)

And in his all-time classic ‘The Politics of Large Numbers’, Alain Desrosières treats the same question: ‘… it is difficult to think simultaneously that the objects being measured really do exist, and that this is only a convention’ (p. 1)

And here’s the real rhinoceros: the Indian rhinoceros (Rhinoceros unicornis; in German, Panzernashorn)

Statistics are not so bad … ;-)

IMAODBC 2010: And the winner is . . .

The Bo Sundgren Award of the International Marketing and Output Database Conference IMAODBC 2010 in Vilnius goes to Vincenzo Patruno from Statistics Italy ISTAT. In his presentation about Data Sharing, Vincenzo Patruno demonstrates the use of widgets for the dissemination of statistical information. Widgets are small pieces of code which can be embedded in a website and interact with an application, such as a database. Once embedded, the information they provide is updated automatically whenever the application itself is updated.
See some examples on Vincenzo’s blog, especially the post ‘How to Share a whole application on the Web’. The small table with figures for Rome in the right-hand column of his blog is such a widget.
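The principle behind such a widget can be sketched in a few lines. This is an illustration only, not the implementation shown at the conference: real web widgets are usually embedded as JavaScript or an iframe, and the class `FiguresApplication`, the function `render_widget` and the figures are all invented here. The point is simply that the snippet is built from the application's current data each time it is requested.

```python
from html import escape


class FiguresApplication:
    """Hypothetical stand-in for the application (for example a database) behind a widget."""

    def __init__(self, figures):
        self._figures = dict(figures)

    def update(self, indicator, value):
        """Change a figure in the application; widgets pick it up on their next render."""
        self._figures[indicator] = value

    def figures(self):
        return dict(self._figures)


def render_widget(application, title):
    """Build the HTML snippet at request time, so it always reflects the current data."""
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>{escape(str(value))}</td></tr>"
        for name, value in application.figures().items()
    )
    return f"<table><caption>{escape(title)}</caption>{rows}</table>"


# Invented figures, for illustration only.
app = FiguresApplication({"Population": 1000})
before = render_widget(app, "Example city")
app.update("Population", 1010)
after = render_widget(app, "Example city")
print(before)
print(after)
```

The website embedding the widget never stores the figures itself, which is why an update in the application shows up everywhere the widget is embedded.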