Hint: A new kind of search service?
The blog: http://ow.ly/blYdV
From the New York Times:
“PARIS — Google, which organizes the world’s information digitally, is linking up with a precursor that aimed to do something similar, on paper.
It plans to announce Tuesday [13 March 2012] that it is forming a partnership with a museum in Mons, Belgium, dedicated to a long-ago venture to compile and index knowledge in a giant, library-style card catalog with millions of entries — an analog-era equivalent of a search engine or Wikipedia. …
… Long before them, in 1895, two Belgians, Paul Otlet and Henri La Fontaine, began the project that grew into the Mundaneum. Their card catalog, initially called the Universal Bibliographic Repertory, compiled links to books, newspaper and magazine articles, pictures and other documents from libraries and archives around the world. People were able to submit queries via the mail or telegraph. The collection expanded to 16 million cards, and Mr. Otlet and Mr. La Fontaine envisioned a “city of knowledge,” complete with museum exhibits and other archival material. …
…The partnership is part of a broader campaign by Google to demonstrate that it is a friend of European culture, at a time when its services are being investigated by regulators on a variety of fronts.”
Two years ago Wolfram launched Wolfram|Alpha, a search engine (‘computational knowledge engine’) that does more than find objects or link them in the sense of linked data: it computes answers from a huge amount of (partially manually) curated data and millions of algorithms used in Wolfram’s software Mathematica.
‘Oh, and by the way, these days the majority of queries to Wolfram|Alpha give zero hits in a search engine; they don’t ever appear literally on the web. So the only way to get an answer is to actually compute it.’
So said Stephen Wolfram in his keynote speech at the Wolfram Summit 2010. The speech gives extensive insight into Wolfram’s philosophy and objectives. Fascinating!
And now, some days ago, Wolfram launched a new data format – CDF: Computable Document Format. From the announcement: ‘CDF is a new standard that’s as everyday as a document, but as interactive as an app. It empowers readers to drive content and generate results live for a deeper understanding. And authoring interactivity is easy enough for teachers, journalists, analysts, managers, or researchers to add to reports, presentations, blogs, infographics, articles, and textbooks.’
A viewer (100 MB download) is needed to read CDFs, and Mathematica (Wolfram’s software) is needed to edit them.
Take this example: Age Distribution in the World as a CDF document. It is a 78 KB download and presents this ternary graph:
This ‘ternary diagram is a graph that shows the proportion of three variables as a position in an equilateral triangle. The three variables have a constant sum (in this case to unity). This particular diagram shows the population proportion of children (0<=age<15), adults (15<=age<65), and elderly (65<=age) for different countries. The proportions have been color-coded to facilitate interpretation. … You can choose a continent or the whole world.’ (manually or with the autorun function).
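To make the geometry concrete, here is a minimal Python sketch (not the Mathematica code behind the CDF) of how three shares with a constant sum can be turned into a position in an equilateral triangle. The function name, the corner layout and the sample shares are assumptions chosen for illustration only; they are not taken from the CDF or from real population data.

```python
import math


def ternary_to_xy(children, adults, elderly):
    """Map three shares to a point in an equilateral triangle with side 1."""
    total = children + adults + elderly
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    # Normalise so that the three shares sum to unity.
    a, b, c = children / total, adults / total, elderly / total
    # Assumed corner layout: children at (0, 0), adults at (1, 0),
    # elderly at (0.5, sqrt(3)/2). The point is the weighted mean of the corners.
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y


# Hypothetical shares, made up for illustration (not real country data).
print(ternary_to_xy(0.25, 0.60, 0.15))
```

A country with only children would sit exactly on the first corner, and equal shares land in the centre of the triangle, which is why countries with similar age structures cluster together in the diagram.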
More examples –> here.
The code underlying this presentation is Mathematica (first lines only):
The code of a CDF itself is generated by Mathematica:
Until now I haven’t seen much use of Mathematica and its visualisations in the dissemination of official statistics. CDF could be an interesting tool for interactive offline publications in this field.
But it’s not an open standard, and there is an alternative: PDF also allows interactive elements (Flash, RIA) to be embedded, although this is rarely seen.
data.gov.uk has a linked data lead, John Sheridan, and he prepared a short presentation promoting (of course) linked data with a lot of interesting examples!
Some days ago in a post I mentioned how Google and others are going semantic, providing in their search results not only information about information (that is, links to web pages) but the information itself, for example cinema showtimes.
And Google does even more: Google search directly provides statistical information.
Unemployment Spain gives this:
In context (hint: use Google without the country redirect, that is, google.com/ncr):
And the source is Eurostat via Google Public Data Explorer:
So why go to Eurostat or another statistical site ;-).
Search on Google for cinema or weather in a region and you will get more than a link: the weather forecast and the showtimes for today or tomorrow … .
Increasingly, search engines provide more than just links: they provide the information that was looked for. To do so, Google has used semantic markup on web pages since 2009 in order to present search results with information instead of links to sites containing that information. These so-called rich snippets describe people, reviews, products, recipes, etc.
Wolfram Alpha has this ambition, too. But Wolfram follows another road: Incoming search questions are analyzed via language recognition, linked to the Wolfram Alpha knowledgebase which then delivers corresponding content:
For weather Spain Wolfram Alpha does even better than Google ;-)
And now we see a step forward by Google & Co in the direction of the Semantic Web: on 2 June 2011 Google, Bing and Yahoo! announced schema.org, a ‘new initiative to create and support a common set of schemas for structured data markup on web pages. Schema.org aims to be a one stop resource for webmasters looking to add markup to their pages to help search engines better understand their websites.’
This is the next step after rich snippets and one further step towards the Semantic Web in action. But: Google unfortunately doesn’t use an existing standard like RDF! :-(
Many new markup categories will be added. Something relevant for statistical sites? Perhaps ‘GovernmentOrganization’ and ‘DataType’.
Website providers now have to decide how to integrate this new markup into their content in order to be well represented in search engines.
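As a rough idea of what such markup could look like for a statistical site, here is a small Python sketch that generates an HTML snippet with schema.org microdata attributes (itemscope, itemtype, itemprop) for the ‘GovernmentOrganization’ type. The helper function, the office name and the URL are hypothetical and only serve as an illustration; a real site would have to choose the types and properties that fit its own content.

```python
from html import escape


def government_org_microdata(name, url):
    """Return an HTML snippet describing an organisation with schema.org microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/GovernmentOrganization">\n'
        # escape() protects the attribute value and the visible text.
        f'  <a itemprop="url" href="{escape(url, quote=True)}">'
        f'<span itemprop="name">{escape(name)}</span></a>\n'
        '</div>'
    )


# Hypothetical office name and URL, for illustration only.
snippet = government_org_microdata(
    "Example Statistical Office", "https://stats.example.org/"
)
print(snippet)
```

The visible page stays the same for human readers; the extra attributes merely tell a search engine which part of the page is the name and which is the URL of the organisation.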
Best wishes for 2011 and have a great holiday far away from work!
Let Google work for you: search 5.000.000 books with more than 500.000.000.000 words and find out how often specific words were used in the last 200 years.
For ‘work’ and ‘holiday’ it looks rather sobering ;-)
In German books we find some slight nuances:
To finish with statistical terms: GDP, GNP: