The discussion about an intelligent web that makes searching easy and successful is really not a new one. A short “history” can be found in Nova Spivack’s weblog.
“Some are focused on creating a vast new structure to supplant the existing Web” (John Markoff, see here)
There are already some tools and standards. A good overview of this supplanting structure:
The future of the Web is Semantic. Ontologies form the backbone of a whole new way to understand online data. By Naveen Balani, 18 Oct 2005.
“Benefits of the Semantic Web to the World Wide Web.
The World Wide Web is the biggest repository of information ever created, with growing contents in various languages and fields of knowledge. But, in the long run, it is extremely difficult to make sense of this content. Search engines might help you find content containing specific words, but that content might not be exactly what you want. What is lacking? The search is based on the contents of pages and not the semantic meaning of the page’s contents or information about the page.
Once the Semantic Web exists, it can provide the ability to tag all content on the Web, describe what each piece of information is about and give semantic meaning to the content item. Thus, search engines become more effective than they are now, and users can find the precise information they are hunting. Organizations that provide various services can tag those services with meaning; using Web-based software agents, you can dynamically find these services on the fly and use them to your benefit or in collaboration with other services.”
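To make the idea concrete, here is a minimal sketch (my own, not from Balani’s article) of what such tagging could look like in RDF, using Python’s rdflib library; the page URL and the topic URI are invented for the example:

```python
# Sketch only: tagging a content item with machine-readable meaning,
# using rdflib and the Dublin Core / FOAF vocabularies.
# The page URL and topic URI are invented, not real resources.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC, FOAF

g = Graph()
page = URIRef("http://example.org/reports/employment-2006")  # hypothetical page

# Statements about what the page *is about*, not just the words it contains:
g.add((page, DC.title, Literal("Employment statistics 2006")))
g.add((page, DC.subject, Literal("labour market")))
g.add((page, DC.language, Literal("en")))
g.add((page, FOAF.topic, URIRef("http://example.org/topics/employment")))

# Any RDF-aware agent can now query for pages about a topic,
# independent of the wording on the page itself.
print(g.serialize(format="turtle"))
```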
And: W3.org!
“The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF). … The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, where the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.”
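The “connected by being about the same thing” part is easy to demonstrate. In the sketch below (all URIs and properties are illustrative, not real endpoints), two independent datasets make statements about the same URI, and merging their RDF graphs combines the facts with no agreed schema beyond the shared identifier:

```python
# Sketch: two independent RDF datasets describe the same resource.
# Because both use the same URI for Zurich, a plain graph union
# connects their facts automatically.
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")
zurich = URIRef("http://example.org/city/Zurich")

stats = Graph()  # say, a statistical office's database
stats.add((zurich, EX.population, Literal(370000)))

geo = Graph()    # say, a geographic database
geo.add((zurich, EX.canton, Literal("ZH")))

merged = stats + geo  # graph union: facts about the same URI combine
for s, p, o in merged.triples((zurich, None, None)):
    print(p, o)
```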
And last but not least: Topic Maps
http://www1.y12.doe.gov/capabilities/sgml/sc34/document/0323.htm
http://www.isotopicmaps.org/sam/
“Others are developing pragmatic tools that extract meaning from the existing Web.” (John Markoff)
Introducing metadata into existing web content is a cumbersome task. What if intelligent robots could do it? I think Google is working on this, which could also explain why the company is expanding so fast and opening new local offices (for example in Zurich, Switzerland, in 2006, with further expansion in 2007). Wait and see …
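What such a metadata robot might do, in the most naive form imaginable (a toy sketch of the general idea, with no relation to Google’s actual methods): crawl existing HTML and lift whatever implicit structure is already there into explicit metadata.

```python
# Toy sketch of a "metadata robot": extract the title and <meta> tags
# from existing HTML as a first, crude layer of metadata.
# Real crawlers are vastly more sophisticated.
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs and "content" in attrs:
            self.metadata[attrs["name"]] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = data.strip()

html = ('<html><head><title>Labour Report 2006</title>'
        '<meta name="keywords" content="employment, statistics">'
        '</head><body>...</body></html>')
robot = MetaExtractor()
robot.feed(html)
print(robot.metadata)  # {'title': 'Labour Report 2006', 'keywords': 'employment, statistics'}
```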
And some are trying to do it with traditional but effective instruments and with cooperation.
For example, statistical institutes are introducing predefined keywords into their search queries and starting to harmonize them with others. For some topics there are also standardized, thesaurus-like compilations of terms describing huge amounts of content (for instance NOGA 2002, the General Classification of Economic Activities). A lot of work is being done in the field of semantics. An international, language-independent search in statistical data becomes possible. Really? (see also Alf’s post and comment)
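To sketch why a shared classification makes language-independent search at least conceivable (the code “A” and its labels below are illustrative, not an authoritative NOGA extract): documents are indexed by classification code, and a query term in any language is first mapped to that code.

```python
# Sketch: language-independent search via a shared classification.
# The classification entry and its labels are illustrative only.
CLASSIFICATION = {
    "A": {"en": "agriculture", "de": "Landwirtschaft", "fr": "agriculture"},
}

DOCUMENTS = {
    "doc1": {"codes": {"A"}},   # a dataset tagged with classification code A
    "doc2": {"codes": set()},   # an unrelated document
}

def code_for_term(term):
    """Map a free-text query term in any language to a classification code."""
    term = term.lower()
    for code, labels in CLASSIFICATION.items():
        if term in (label.lower() for label in labels.values()):
            return code
    return None

def search(term):
    code = code_for_term(term)
    return [doc for doc, meta in DOCUMENTS.items() if code in meta["codes"]]

print(search("Landwirtschaft"))  # ['doc1'] - a German query
print(search("agriculture"))     # ['doc1'] - an English query finds the same document
```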