API and Apps: An example from official statistics

An example of an API access to statistical data

The U.S. Census Bureau now offers some of its public data in machine-readable format. This is done via an Application Programming Interface (“API”).
Based on this API, an app has been developed that helps to query data from the 2010 Census:

No data without legal clarification. The Census Bureau handles it as follows:

‘You may use the Census Bureau API to develop a service or service to search, display, analyze, retrieve, view and otherwise “get” information from Census Bureau data.
All services, which utilize or access the API, should display the following notice prominently within the application: “This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.” You may use the Census Bureau name in order to identify the source of API content subject to these rules. You may not use the Census Bureau name, or the like to imply endorsement of any product, service, or entity, not-for-profit, commercial or otherwise.’
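A minimal sketch of what a query against such an API could look like. The endpoint path, the variable name `P001001` and the helper names are assumptions made for illustration, not taken from the post; check the Census Bureau's own API documentation for the current values. The sample response uses placeholder values, not real census figures.

```python
import json
from urllib.parse import urlencode

# Notice that the terms quoted above ask every application to display.
NOTICE = ("This product uses the Census Bureau Data API but is not "
          "endorsed or certified by the Census Bureau.")

# Assumed endpoint for the 2010 Census summary file (verify before use).
BASE_URL = "https://api.census.gov/data/2010/dec/sf1"


def build_query_url(variables, geography, key=None):
    """Build a query URL for the given variables and geography filter."""
    params = {"get": ",".join(variables), "for": geography}
    if key:
        params["key"] = key  # API key, if the service requires one
    # Keep ',', ':' and '*' readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


def rows_to_records(payload):
    """Turn a JSON array of arrays (header row first) into dictionaries."""
    header, *rows = json.loads(payload)
    return [dict(zip(header, row)) for row in rows]


if __name__ == "__main__":
    print(NOTICE)
    print(build_query_url(["P001001", "NAME"], "state:*"))
    # Placeholder response, shaped as a header row followed by data rows.
    sample = '[["P001001","NAME","state"],["100","Exampleland","99"]]'
    print(rows_to_records(sample))
```

The sketch only builds the URL and parses a response string; fetching the data (for example with `urllib.request.urlopen`) is left out so that the example runs without network access.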

Open Government Data Benchmark: FR, UK, USA

Finally there’s a very interesting comparison of OGD in three leading countries.

qunb did it. Have a look at this presentation.

1) There are lots of duplicates on OGD platforms


2) There is very little structured data yet



3) Apps are the real challenge

There are different strategies for fostering the development of apps made with open data. The U.K. method seems to be one of the most productive.


The presentation in French

Taking You Back

Where statistics meet individuals: US Census Bureau publishes the 1940 Census records.



And lots of interesting infographics


Official Statistics: Identify Common Challenges

In his blog, Director Groves of the US Census Bureau reports on an important discussion among his colleagues (thanks to Xavier for this hint):

‘Several weeks ago, at the initiative of Brian Pink, the Australian statistician, leaders of the government statistical agencies from Australia, Canada, New Zealand, United Kingdom, and the United States held a summit meeting to identify common challenges and share information about current initiatives. …

… They perceive the same likely future challenges for central government statistical agencies, and they are making similar organizational changes to prepare for the future. …

Ingredients of the future vision:

  1. The volume of data generated outside the government statistical systems is increasing much faster than the volume of data collected by the statistical systems; almost all of these data are digitized in electronic files.
  2. As this occurs, the leaders expect that relative cost, timeliness, and effectiveness of traditional survey and census approaches of the agencies may become less attractive.
  3. Blending together multiple available data sources (administrative and other records) with traditional surveys and censuses (using paper, internet, telephone, face-to-face interviewing) to create high quality, timely statistics that tell a coherent story of economic, social and environmental progress must become a major focus of central government statistical agencies.
  4. This requires efficient record linkage capabilities, the building of master universe frames that act as core infrastructure to the blending of data sources, and the use of modern statistical modeling to combine data sources with highest accuracy.
  5. Agencies will need to develop the analytical and communication capabilities to distill insights from more integrated views of the world and impart a stronger systems view across government and private sector information.
  6. There are growing demands from researchers and policy-related organizations to analyze the micro-data collected by the agencies, to extract more information from the data.

… In short, the five countries are actively inventing a future unlike the past, requiring new ways of thinking and calling for new skills. The payoff sought is timelier, more trustworthy, and lower cost statistical information measuring new components of the society, economy, and environment, telling a richer story of our countries’ progress.’

Read the full blog post here: http://directorsblog.blogs.census.gov/2012/02/02/national-statistical-offices-independent-identical-simultaneous-actions-thousands-of-miles-apart/

Open data: Waiting ….

The UK and US governments support open data … and not only in their own countries. In an official letter they ask the OECD to join this movement.

‘On behalf of US Secretary of State Hillary Rodham Clinton and UK Foreign Secretary William Hague, the heads of the two countries’ missions to the OECD delivered a letter this week to the Organisation’s Secretary General, Angel Gurría. In it, Mrs Clinton and Mr Hague called on the OECD to commit to the principles of the Open Government Partnership, and make all of its core data freely available online.’ https://usoecd.cms.getusinfo.com/data.html


Awaiting an answer ……..



The OGDcamp 2011 was held in Warsaw.
Waiting for the keynotes to be posted …



An instructive introduction to Open data.


And …

a key message

from Vincenzo Patruno’s presentation at ISTAT for the Italian Statistics Day (yes! October 20th!!), where Open Data and Open Government were discussed during the workshop “Open Official Statistical Data”.

The same from his presentation at IMAODBC 2011. Have a look at it.

Waiting for the paper …. ;-)



Infovis vs. Statistical Graphs?

Two statements from a controversy on data visualisation: statisticians vs. visualisation specialists, statistical graphics vs. information visualization (a.k.a. infovis). A controversy? Not really!

The visualisation expert: ‘And yet, visualization is much, much more than what it appears to be at first glance. The real power of visualization goes beyond visual representation and basic perception. Real visualization means interaction, analysis, and a human in the loop who gains insight. Real visualization is a dynamic process, not a static image.  Real visualization does not puzzle, it informs.’

Robert Kosara, UNC Charlotte, http://eagereyes.org/

The statistician: ‘… differences between statistical graphics and infovis. In statistical graphics we aim for transparency, to display the data points (or derived quantities such as parameter estimates and standard errors) as directly as possible without decoration or embellishment.’ ‘In a modern computing environment, a display such as Nightingale’s [infovis] could link to a more direct graphical presentation …, which in turn could link to a spreadsheet with the data. The statistical graphic serves as an intermediate step, allowing readers to visualize the patterns in the data.’

Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York; Antony Unwin, Department of Mathematics, University of Augsburg

Read the two articles published in the joint newsletter of the Statistical Computing & Statistical Graphics Sections of the American Statistical Association, Volume 22.

‘This volume features two articles both looking at the aspects of “graphical displays of quantitative data”. In the first paper “Visualization: It’s More than Pictures!” by Robert Kosara, Robert sheds a light from the point of view of an InfoVis person, i.e. someone who primarily learned how to design tools and techniques for data visualization. With the second article “Visualization, Graphics, and Statistics” by Andrew Gelman and Antony Unwin, we get a similar view, but now from someone whose primary training is in math and/or statistics.’

In the introduction, Jürgen Symanzik gives an excellent crash course in data visualization and its power:

‘It appears as if statistical graphics have helped to detect the unknown and unexpected — again! Most of us know the classical examples from the last 150 years where statistical graphics have helped to discover the previously unknown. This includes John Snow’s discovery that the 1854 cholera epidemic in London most likely was caused by a single water pump on Broad Street, a fact he observed after he had displayed the deaths arising from cholera on a map of London. A second, well–known example is Florence Nightingale’s polar area charts from 1857, the so–called Nightingale’s Rose (sometimes incorrectly called coxcombs), that demonstrated that the number of deaths from preventable diseases by far exceeded the number of deaths from wounds during the Crimean War. These figures convinced Queen Victoria to improve sanitary conditions in military hospitals. Many additional important scientific discoveries based on the proper visualization of statistical data could be mentioned, but the most important fact is: New discoveries based on the visualization of data can happen here and now!

This is a message we should carry to our collaborators, students, supervisors, etc.: Statistical graphics (or visual data mining, visual analytics, or any other name you like) typically do not provide a final answer. But, statistical graphics often help to detect the unexpected, formulate new hypotheses, or develop new models. Later on, additional experiments or ongoing data collection as well as more formal methods (and p–values if you really want) may be used to verify some of the original graphical findings.’

Jürgen Symanzik, Utah State University

Nightingale’s Rose


Mapping America

A recent project of The New York Times lets you ‘browse local data from the Census Bureau’s American Community Survey, based on samples from 2005 to 2009’. It’s a great visual and interactive application designed by Matthew Bloch, Shan Carter and Alan McLean.

Several topics and maps are available

and provide insights down to cities and blocks

‘Because these figures are based on samples, they are subject to a margin of error, particularly in places with a low population, and are best regarded as estimates.’
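To make the caveat concrete, here is a minimal sketch of how a published margin of error can be read. It assumes the margin is stated at the 90 percent confidence level (z = 1.645), which is how the American Community Survey publishes its margins; the function names and the numbers in the usage lines are made up for illustration and are not real survey figures.

```python
# z-value for a 90 percent confidence level (assumed publication convention).
Z_90 = 1.645


def standard_error(moe):
    """Convert a 90 percent margin of error into a standard error."""
    return moe / Z_90


def interval(estimate, moe):
    """Lower and upper bound of the estimate, plus or minus its margin."""
    return (estimate - moe, estimate + moe)


def coefficient_of_variation(estimate, moe):
    """Standard error as a percentage of the estimate (relative precision)."""
    return standard_error(moe) / estimate * 100


if __name__ == "__main__":
    # Illustrative values only: an estimate of 1000 with a margin of 200.
    print(interval(1000, 200))
    print(round(coefficient_of_variation(1000, 200), 1))
```

A small area with a low population typically shows a wide interval and a high coefficient of variation, which is why such figures are best treated as estimates.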