What DataUsa is doing could be – I guess – the next step in the evolution of Open Government Data websites. It’s the step from offering file downloads to presenting data (and not files) interactively. And it’s a kind of presentation many official statistical websites would surely be proud of.
César A. Hidalgo from MIT discusses the philosophy behind this. More on that at the end of this post; first, a short look at the website.
Bringing data together
Merging data from different sources may have been the most expensive and challenging task and the conditio sine qua non for the existence of this website. And perhaps it’s more an organizational than a technical challenge.
Seven public data sources are accessible via DataUsa
Presenting data
Adapting to what web users normally do, the main entrance is a search bar.
Thematic and geographical profiles are available too, but in a hidden menu.
The presentation of the data is a mix of generated text and various types of graphs.
The options above every graph allow you to share, embed or download it, view the data as a table and even access the data via an API.
And finally, thematic maps provide other views and insights:
Storytelling
But the fascinating part is Stories.
Various authors write stories focussing on special topics and using the presentation techniques of the site.
Background
A glossary explains technical terms and the About Section presents the authors and their aim:
‘In 2014, Deloitte, Datawheel, and Cesar Hidalgo, Professor at the MIT Media Lab and Director of MacroConnections, came together to embark on an ambitious journey — to understand and visualize the critical issues facing the United States in areas like jobs, skills and education across industry and geography. And, to use this knowledge to inform decision making among executives, policymakers and citizens.’
Here’s the design philosophy in a visual nutshell:
‘Our hope is to make the data shopping experience joyful, instead of maddening, and by doing so increase the ease with which data journalists, analysts, teachers, and students, use public data. Moreover, we have made sure to make all visualizations embeddable, so people can use them to create their own stories, whether they run a personal blog or a major newspaper.’
And:
‘After all, the goal of open data should not be just to open files, but to stimulate our understanding of the systems that this data describes. To get there, however, we have to make sure we don’t forget that design is also part of what’s needed to tame the unwieldy bottoms of the deep web.’
How to explain important demographic indicators? Try it by telling the story of Laura and Luca. Statistical storytelling at its best!
“From the first breath of life until death, the road of life in Switzerland (and not only here) is strewn with an abundance of data averages from official statistics.
For the past ten years, an average 75,700 babies have been born annually in Switzerland, the majority of whom are boys: 106 boys to 100 girls. His name is Luca. As with most other children, this little boy will grow up in a couple with child(ren), go to school and move on to new horizons. On reaching adolescence he will have to be careful, however. Surveys show that young men aged 15 to 24 years have constituted for a number of years the group at greatest risk of death among persons under 25 years of age. …
… Then, around the age of 32, and as for almost two-thirds of Swiss nationals, Luca will join the group of people who have the opportunity to get married, completing the list of 40,700 marriages registered each year in Switzerland. Her name will be Laura. She will be approximately 29 years old when she says “I do” to Luca …
… Laura and Luca’s marriage will probably go through some rough times, in particular towards the 6th and 7th year of marriage, a difficult milestone for many couples who often decide to get divorced at this moment. Their marriage will in fact last an average of 15.5 years.” …
Read the full article written by Fabienne Rausa (FSO) here and the magazine ValueS here.
There are (at least) two big challenges that official statistics will face in the next few years and that may change its quasi-monopolistic position.
On the input side it’s Big Data
‘“Big Data” is a term used to describe massive information stores – generally measured in petabytes and exabytes – and also refers to the methods and technologies used to analyze these large data volumes. The core principles of Big Data (data mining, analytics) have been around for some time, but recent technology has enabled the collection and analysis of previously unimaginable data volumes at extremely high speeds.’ So says SAP, for example, which also gives some examples of how Big Data will change your life (big words, and they show how the big software and hardware players are beginning to occupy the field).
Official Statistics has already put this on the agenda! And so has the United Nations Statistics Division (UNSD) in its Friday Seminar on Emerging Issues, 22 February 2013.
Some papers from this Seminar:
Gosse van der Veen (Statistics Netherlands), High Level Group for the Modernization of Statistical Products and Services: Big Data: Big Opportunity!
Aspects of Big Data and real-time analytics are provided in another paper by Global Pulse (an innovation initiative launched by the Executive Office of the United Nations Secretary-General): Big Data for Development: Opportunities & Challenges
The discussion has been launched and, as the HLG paper mentions: ‘To use Big data, statisticians are needed with a different mind-set and new skills. The processing of more and more data for official statistics requires statistically aware people with an analytical mind-set, an affinity for IT (e.g. programming skills) and a determination to extract valuable ‘knowledge’ from data. These so-called “data scientists” can be derived from various scientific disciplines.’
On the output side it’s (Linked) Open Data in combination with APIs
Open Data is not at all a new topic for Official Statistics. National Statistical Institutes were forerunners in openly providing data; organizations like the UN or EUROSTAT went this way as well.
Several Open Data initiatives (USA, UK, France, EU …) consist mostly of data catalogues, and are in that sense also public relations initiatives. A large part of the data provided in this way consists of statistical data that is often already available on the website of the National Statistical Institute concerned. The EU portal, for instance, offers 5716 datasets of statistical data out of a total of 5893 (as of April 2013).
Further central questions are the licensing of data, as well as their availability in machine-readable formats.
Machine-readable statistical data, Application Programming Interfaces (APIs) to the data and especially Linked Open Data (LOD) (→ essentials, → tutorial) open the way to creative applications and new models of presenting information.
A Europe-wide Linked Open Data (LOD2) project ‘was launched in September 2010 and will run for four years. It addresses exploitation of the web as a platform for data and information integration, and the use of semantic technologies to make government data more useable.’
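What "machine-readable" buys a reuser can be shown in a few lines. The following is only a minimal sketch: the JSON record and its field names are hypothetical (not the format of any real portal), and the two counts are simply the EU portal figures quoted above.

```python
import json

# Hypothetical machine-readable record. The field names are assumptions
# made for this illustration; the counts are the EU portal figures
# quoted in the text (as of April 2013).
record = '{"portal": "EU", "statistical_datasets": 5716, "total_datasets": 5893}'


def statistical_share(raw: str) -> float:
    """Return the share of statistical datasets among all datasets."""
    data = json.loads(raw)
    return data["statistical_datasets"] / data["total_datasets"]


share = statistical_share(record)
print(f"Share of statistical datasets: {share:.1%}")
```

Because the figures arrive as structured data rather than as text on a page, a third party can recompute, combine or visualise them without retyping anything, which is the precondition for the apps and mashups discussed here.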
Looking for third-party APPs
Data Providers are looking at applications or mashups made with their data with much interest, and they are even sponsoring competitions and hack days (like Apps4EU) to stimulate the reuse of open data, especially from the public sector.
The most popular app creator and statistical storyteller is Hans Rosling with Gapminder. Rosling himself is a pioneer in fighting for open data.
Open Data, Linked Open Data and APIs are changing the dissemination paradigm of statistical agencies. More people with new skills will do new things. Coding is becoming the new literacy, says, for example, Garrett Heath in his advice for his unborn daughter: ‘I was blown away that the buzz is not around mobile apps, but rather around using APIs. Ten years ago saw the creation of the social networking platforms. The past five years has been about accumulating the data. The next five years and beyond will be about interpreting that data. [My daughter will have access to] a boatload of interesting data sitting in accessible databases that is waiting to be exposed and interpreted with her [the programmer’s] creativity.’
Storytelling with data
Storytelling based on data is less and less the domain of statistical agencies. Storytelling can access multiple (new) resources and take on new forms. To satisfy the basic idea of an easily understandable and appealing presentation of statistical content, statistical institutions cannot avoid taking certain measures to improve their content and presentation. The “composer” must know how the music is to be played, that is as a quick, competent, qualitatively unique, reliable and indispensable data source.
But this presentation job can no longer be done on one’s own: cooperative partnerships are necessary and have already begun to some extent, both with partners outside statistical institutions and between such institutions. This discussion has been launched.
And this: Many small open data give big data insights
FORGET BIG DATA, SMALL DATA IS THE REAL REVOLUTION, says Rufus Pollock, co-Director of the Open Knowledge Foundation: ‘… the discussions around big data miss a much bigger and more important picture: the real opportunity is not big data, but small data. Not centralized “big iron”, but decentralized data wrangling. Not “one ring to rule them all” but “small pieces loosely joined”.’
Have it all on one page: graphs, text, audio, image and all this as a timeline. ONS shows how to do visual, textual and interactive storytelling. Great!
An example of well-made storytelling with simple dynamic graphics and short textual explanations for a broad public. In (Flash and) Italian only … but pictures say more than 1000 words in every language 😉
More visual storytelling here on the website of Statistics Italia
It’s like an app, a separate presentation not embedded into the navigation of the mother website. One single link only – behind the NSI logo – gives the context to the whole statistical content offered by Statistics Italia. I speak of … noi italia.
And it’s beautiful. A mix of static and dynamic (Flash, delivered by NCVA) content, text (some call it storytelling) with interactive graphs and maps.
FFunction is a Montréal-based company specializing in user interfaces and data visualization.
In an interview with ReadWrite, FFunction gives a short description of what data visualisation should do. It reminds us a lot of the long discussions in official statistics:
‘Visualization should reveal hidden patterns and trends within the data. It should explore a topic, help make a discovery or tell a story. Whatever the goal, you have to turn the data into information that people can understand.’
And these are the elements of the graph:
Fields: Design, Communication, Information and their mix: Visual Communication, Data journalism, User Interface
Raw elements: Look & Feel, Idea, Data
Disciplines: Journalism, Information Architecture, Typography
Process elements: Visual Design, Objective, Dataset
Outputs: Layout, Story, Report, Data Analysis, Dashboard, Interface
Since its emergence in the 19th century, the continuous publication of results has been an important activity of official statistics. This publication activity, as well as its conditions, objectives and forms, has been debated time and again in the official statistics community.
About 10 years ago, one theme in particular came to the forefront of discussions: storytelling and its role in the dissemination and communication of statistics. Storytelling is a programme to make the results of official statistics accessible and understandable to people and – in fulfilment of an information mandate – to make “evidence based decision making” possible.
Looking back, it becomes apparent that storytelling meant, and still means, many different things to statisticians. The goal is largely undisputed, but the implementation varies widely and is influenced by developments in the media sector.
Where are we today? What and where is the potential of the storytelling approach in the world of the social and semantic web?
The goal of the following paper is to make an inventory of what storytelling comprises, what role storytelling plays within the framework of official statistics and which challenges official statistics faces in view of the rapidly changing media environment. This paper is the contribution of the author to the International Marketing and Output Database Conference IMAODBC 2010 in Vilnius.