Touching Statistics

This week the German parliament lifted its ban on using tablet computers like the iPad during sessions. Earlier this year one MP dared to read his speech from an iPad and was cited for it. Laptop computers are still not allowed because of the noise they make and the visual barrier they present.

Now, I don’t have any aspirations of speaking in parliament, but I do know what a difference a touch device makes. It is so much more convincing in one-on-one talks. Modern tablet-sized computers are very likely to run SVG-capable browsers, so data visualisation is a given. In this video you will see some additions to the web-based visualisations for the touch interface, which works a little differently from pointer devices like the traditional mouse.

Try it out as a beta version on my personal site at

Additionally, this population pyramid uses the HTML5 application cache, which means this web app will work even without an internet connection. So if you have a Wi-Fi-only device or are afraid of roaming costs during international conferences, just visit the above URL once and bookmark it. From then on it will work offline.

If you want to start developing yourself, just view the source at the above URL or read further in the Safari Reference Library.

BBC Dimensions: Compare Size to Familiar Areas

This BBC website does two clever things in data visualisation: It lets you compare the size – typically an area – of a flood, oil spill or festival ground to an area you’re familiar with.

Area of 2010 Pakistan Floods Overlaid the UK

So it not only uses convincing comparisons but also offers ways to engage with the data: you can enter your home zip code and get a much better feel for the data.

While the areas affected by the 2010 Pakistan floods have to be measured in areas of European countries, other sizes, like the route Neil Armstrong and Buzz Aldrin walked on the moon’s surface, can be compared to your front lawn.

Ever worried about your SVG graphics?

Not any more. Yesterday Microsoft published its fourth platform preview of Internet Explorer 9 – the last major browser to get native SVG support – and I am here to tell you that it renders a lot of SVG content as intended, right out of the box. The only caveat: your box must run at least Windows Vista.

Check it out for yourself, e.g. with this Price Kaleidoscope, which you might have seen at some conferences.

Internet Explorer 9 with native SVG support

If you’re stuck with Windows XP in your office and wonder when this ubiquitous SVG world will come to a desk near you, you might be interested in SVG in Internet Explorer, a paper about possible transition strategies.

Mandatory Independence

Among all those nice posts about the latest data visualizations and web 2.0 activities, we must not forget how the data that we later distribute, publish or visualize is gathered. Keeping the balance between the burden of filling out forms and privacy concerns on the one hand and demands for high-quality data on the other has occupied us for ages. This becomes especially visible with a census, which typically affects the largest number of people.

It probably comes as no surprise that statistical offices aren’t the only ones juggling that balance. More often than not they have superiors, and those are in politics. Recent events in Canada are worth spreading in this community, in case they haven’t been already.

From my understanding of two articles in The Globe and Mail, Canada’s Industry Minister wants to make the long-form census voluntary, against the advice of Statistics Canada. The debate ended with the Canadian chief statistician stepping down:

Dr. Sheikh’s Wednesday night resignation as Statistics Canada’s chief statistician over the census is all the more remarkable because of its rarity. In a world where loyalty is king, bureaucrats of his standing do not tend to quit over differences of opinion.
He did. In doing so, he displayed qualities that have emerged through his 38-year career: stubbornness and independence of mind.
The Globe and Mail, July 23, 2010

This is indeed remarkable. The lack of people speaking up when it comes to political interference with official statistics is no proof that such interference does not exist. In theory there are provisions in some countries such as the following:

The professional independence of statistical authorities from other policy, regulatory or administrative departments and bodies, as well as from private sector operators, ensures the credibility of European Statistics.
Principle 1 of the European Statistics Code of Practice

But at least off the record, many people involved might regard such codes as mere lip service and wouldn’t be surprised to read something like this in the news:

In an interview published Wednesday, Clement said that some people at Statistics Canada “like to think” they are an independent agency, but in fact they report to him as minister.

At the very least, one should spread the word about such instances.

Analog Visualisation

Another Hans Rosling TED talk, this time on global population growth, in which you will see that his theatrical talents matter much more than the Gapminder software.

And it might encourage us to sometimes step back from the screen and try out different ideas. Do you have similar examples in mind or have you already put them into practice?

Visualizing absolute numbers

Quite often we deal with quantities that differ a lot, and when it comes to visualizing them we tend to play tricks like using logarithmic scales or calculating relative numbers, a process by which a lot of the story gets lost.
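
To make the trade-off concrete, here is a minimal Python sketch. The two quantities are made-up round numbers, not data from any source, and the helper names are only for illustration. It shows how a thousandfold difference shrinks to a gap of three units on a base-10 logarithmic axis, and how a relative number hides the absolute size behind it.

```python
import math

# Hypothetical quantities, chosen only for illustration (not real data).
SMALL = 1_000
LARGE = 1_000_000


def log_position(value: float) -> float:
    """Position of a value on a base-10 logarithmic axis."""
    return math.log10(value)


def share(part: float, total: float) -> float:
    """Relative number: the part expressed as a fraction of the total."""
    return part / total


# On a linear axis the larger bar is 1000 times as long as the smaller one.
linear_ratio = LARGE / SMALL

# On a logarithmic axis the same two values sit only three units apart.
log_gap = log_position(LARGE) - log_position(SMALL)

# A relative number is identical for very different absolute sizes.
same_share = share(1, 2) == share(500_000, 1_000_000)

print(linear_ratio, round(log_gap, 2), same_share)
```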

Here is an artists’ project called Of All The People In All The World (Stan’s Cafe), which uses grains of rice to depict human beings:

Related: Running the Numbers looks at contemporary American culture through the austere lens of statistics. Each image portrays a specific quantity of something: fifteen million sheets of office paper (five minutes of paper use); 106,000 aluminum cans (thirty seconds of can consumption) and so on. My hope is that images representing these quantities might have a different effect than the raw numbers alone, such as we find daily in articles and books. Statistics can feel abstract and anesthetizing, making it difficult to connect with and make meaning of 3.6 million SUV sales in one year, for example, or 2.3 million Americans in prison, or 32,000 breast augmentation surgeries in the U.S. every month.

Comparing population pyramids

Comparing population data for regions that differ a lot in absolute numbers poses some challenges. While percentages come to mind, population pyramids using percentages are a lot less familiar and are prone to misinterpretation. But using absolute numbers often means you have to adjust scales.
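
The two options can be sketched in a few lines of Python. The age groups and counts below are invented for illustration, and `to_shares` and `common_axis_max` are hypothetical helper names, not part of the pyramid’s actual code. The first converts absolute counts into shares so that regions of different size can use one scale; the second finds the axis maximum needed if both pyramids keep absolute numbers on a shared axis.

```python
from typing import List


def to_shares(counts: List[int]) -> List[float]:
    """Convert absolute counts per age group into shares of the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot compute shares of an empty population")
    return [count / total for count in counts]


def common_axis_max(*regions: List[int]) -> int:
    """Largest single age-group count across all regions.

    Using this as the axis maximum keeps absolute pyramids comparable,
    at the price of squeezing the smaller region into a thin sliver.
    """
    return max(max(region) for region in regions)


# Invented counts for three age groups (young, middle, old), not real data.
large_region = [100_000, 300_000, 600_000]
small_region = [1_000, 3_000, 6_000]

print(to_shares(large_region))
print(to_shares(small_region))
print(common_axis_max(large_region, small_region))
```

With shares, both invented regions produce the same shape. With absolute numbers on a shared axis, the small region’s widest bar reaches only one percent of the axis, which is why differently scaled pyramids need a clear indicator.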

In the example below a clear indicator appears when the two population pyramids are scaled differently (which is not the case in all combinations). Here you see a region in the western part of Germany (the state of Hesse on the left) compared to one in the eastern part (Saxony). The latter shows an additional bulge (people born in the 1980s).

Two population pyramids side by side with table

While the above example has been around for a while, it was updated today in terms of both data and technology. The data is Germany’s latest population projection broken down to the “Länder” level (=NUTS1). You can check out the population pyramid comparison at

This population pyramid now uses the SVG Web toolkit, so it runs out of the box in modern browsers and, with the help of the Flash plugin, just as well in Internet Explorer.

And while we’re on the topic, let me plug the Animated Population Pyramid of Estonia, which was recently published using the same code base.

Tufte’s Granddad

Are you in need of holiday presents for the office and on a tight budget? Why not go back in time and shop for books out of copyright? The Internet Archive is here to help. Check out Willard Cope Brinton: Graphic presentation (1939), and delve into an ancestor of the Tufte books.

You can read this book online through the beautiful web-based book reader or download in a number of formats that allow for high quality printing. For free.

An even better population pyramid

Today Statistics Germany published their latest population projection until the year 2060. Together with this data the animated population pyramid was updated as well.

Most notable is a new layout that will put the assumptions right beside the pyramid and will let you switch between four different scenarios for the future (different assumptions for: fertility, life-expectancy, net-migration).

Thanks to the SVG Web library it will work in any browser and takes full advantage of open web standards, namely Scalable Vector Graphics (SVG). Watch a short screencast to see all of the functionality.

Then check it out for yourself. It’s available in English, French, German and Russian at
Internet Explorer will need the Flash plugin to make this happen; all other browsers won’t.

Postscriptum: It seems ONS published similar data today with a different approach to visualization. Check it out, compare and please comment.

Why open standards matter

On this blog we usually showcase best practices of how to communicate statistics and keep the technological aspects in the background, which is the right way to do it. But we also never tire of mentioning that statistics is a basis for informed decision-making and therefore a foundation of democracy. To live up to these standards we make sure our methods are well documented, and I would argue we should also give some thought to what technology we use.

SVG in Internet Explorer

There are two reasons for this. First, we should allow our users to learn from our applications on the web, build upon them, mash them up with other stuff we didn’t imagine or even improve them. And secondly, when we talk about archiving in the digital age, we are well advised to use open standards. Every one of us who recently tried to open some old Word 2.0 documents will understand what I mean.
The topic comes up because several statistical offices have moved from the SVG format for interactive statistical graphics to Flash. See the latest population pyramid from ONS or the election atlas in Germany. While SVG is an open graphics standard just like HTML, you can think of Flash files as more like Word documents: closed binary files. Users cannot look behind the scenes, and if the source code gets lost or the technology changes dramatically, all is lost.
Now there is a flip side to it: just like Microsoft Word, Flash is ubiquitous, works really well and works the same way across all supported platforms. SVG, on the other hand, has had its ups and downs. Since 2008 it has been very well supported in modern browsers such as Opera, Firefox, Safari and Google Chrome, but even Internet Explorer 8 doesn’t handle it at all. There was a plugin for Internet Explorer, but it never had the significant market share that Flash enjoys (no YouTube without Flash!) and was “end of lifed” in January 2009, meaning SVG support in Internet Explorer was gone at the beginning of this year. So why all the bemoaning? Free and open doesn’t always win.
Well, things are changing right now. With the help of Google, an open source project aims at adding SVG support to Internet Explorer through the Flash plugin. And they are aiming high: they want to implement it in Wikipedia, which uses SVG as a base format for all its graphics and maps. It’s called the SVG Web project, and it already works well enough to support our use cases. I’ve put up an animated population pyramid and an interactive map with it and couldn’t be happier.
Even in Internet Explorer you can right-click in the graphics and “view source”, see how everything was done, adapt it, improve it … And when you are using Firefox or Safari you can print these graphics into PDF and get print-quality vector graphics.
Give it a look, talk about it with your tech people and let me know what you think. We have comments here for a reason.

Comparing Thematic Maps

Statistical graphics are most convincing when they allow for interesting comparisons. A pie or bar chart allows comparisons in one data dimension, as does a single map: it shows how one variable varies across regions. But data analysis shouldn’t stop here. Diagrams like the animated population pyramid or the Gapminder/Trendalyzer allow comparisons in more than one dimension, where one of the dimensions is usually time, depicted through animation.
Comparing regional patterns is a little trickier. A standard use case could be the question of whether people have more children in regions where conservative votes are higher. Statistically, this would be done by calculating correlations. However, regression analysis is not for everybody. At the very least it would be nice to show two related patterns side by side and give people an idea of what correlated variables look like. Below is an example of how this could be implemented:

You can check out this mapping application at
It will work for at least 95% of internet users.
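
For readers who would like a number to go with the side-by-side maps, the correlation mentioned above can be computed in a few lines of Python. This is a minimal sketch of the Pearson correlation coefficient; the function name and the regional figures are made up for illustration and are not the data behind the mapping application.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented values for five regions, for illustration only (not real data).
children_per_woman = [1.2, 1.4, 1.5, 1.7, 1.9]
conservative_vote_share = [30.0, 35.0, 33.0, 41.0, 45.0]

print(round(pearson(children_per_woman, conservative_vote_share), 2))
```

A value close to +1 or -1 means the two maps show similar or opposite patterns, while a value near 0 means there is no linear relationship. The coefficient alone says nothing about cause.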