Big Analytics

Technology Predictions for 2015

For 2015, Bing predicts that the same three technologies will take the top places on all continents: Wearables, Digital Personal Assistants and Home Automation.
All these technologies are part of the Internet of Things (IoT). They have many sensors delivering data, they are connected to several clouds and they provide information about processes or behaviours of people and environments.
Devices communicate with devices, devices with humans and humans with devices. So the IoT is not a separate Internet alongside the one we already know. It is an expansion of the Internet, and it produces ever more data accessible to known and unknown parties: telecom and cloud providers, enterprises, governments …
In the end, humans too become part of the IoT.

IoT Infographic

These aspects are explained in a good infographic made by Postscapes in collaboration with Harbor Research.
The complete infographic can be found here:

IoT, Big Data and then?

Data from classical statistical surveys as well as from connected devices must be analyzed in order to be of use.
Analytics of things (that is, of data produced by the sensors of connected things, including humans) are the new kind of statistical information that every organisation, private person or government can use for its own purposes, for well-informed decisions and for steering activities. The uses are countless and under no control.
Examples of such statistics and applications are already numerous:
  • Singapore itraffic
  • Some more example applications of IoT in Postscapes IoT toolkit
  • And some things are intelligent even without being connected, like thermostats.


Big Analytics, Data Science and Official Statistics

The quasi-monopoly of national statistical agencies, which alone had the resources to undertake huge surveys, is vanishing. A new kind of statistics is emerging, collected, controlled and used by new agents.
To realise the full potential of these data, special qualifications are needed. Classical statistical analysis expands into Data Science. (Big) Data need (Big) Analytics, and this explains why many predict that statistician will be the sexiest profession.
To get an idea of this expanding or even exploding business, Diego Kuonen’s frequent tweets and presentations are an ideal source.
As new sources of data appear and broader analytical techniques become necessary, Statistical Agencies are challenged. Tackling the paradigm change is on the agenda.

Who owns the data?

Who owns the data emerging from my devices (smartphone, wearables, home automation …)? What data come from sensors out of my personal reach?
And what are they doing with these data?
Huge data protection issues wait for answers. Reality seems to be faster than rules …
'Estimates vary, but by 2020 there could be over 30 billion devices connected to the Internet. Once dumb, they will have smartened up thanks to sensors and other technologies embedded in them and, thanks to your machines, your life will quite literally have gone online.'


Big Data – Big Projects – Big Discussions

‘Old’ Data vs. Reality Mining

For a long time, Official Statistics were synonymous with data. With the emergence (or better: the stronger awareness) of new information sources, aka Big Data, this is about to change. And the opportunities these data offer are changing, too, with all the risks (privacy!) included.
In the light of the research activities and projects around Big Data and reality mining, traditional statistical data management seems to date from another time. Evidence-based decision making is moving to a new level.
Some prominent examples of Big Data research:

Project FuturICT

One very interesting and very ambitious project facing such a BIG Data opportunity is FuturICT, led by Dirk Helbing from the Swiss Federal Institute of Technology in Zurich (ETHZ).
FuturICT’s ‘ultimate goal … is to understand and manage complex, global, socially interactive systems’ (Homepage FuturICT).
Introducing FuturICT by Dirk Helbing:
Some points (taken from Edge and the FuturICT brochure):
‘There are two big global trends. One is big data. That means in the next ten years we’ll produce as many data, or even more data than in the past 1,000 years.
The other trend is hyperconnectivity. That means we have networking our world going on at a rapid pace; we’re creating an Internet of things. So everyone is talking to everyone else, and everything becomes interdependent. ….
But on the other hand, it turns out that we are, at the same time, creating highways for disaster spreading. We see many extreme events, we see problems such as the flash crash, or also the financial crisis. That is related to the fact that we have interconnected everything. In some sense, we have created unstable systems. We can show that many of the global trends that we are seeing at the moment, like increasing connectivity, increase in the speed, increase in complexity, are very good in the beginning, but (and this is kind of surprising) there is a turning point and that turning point can turn into a tipping point that makes the systems shift in an unknown way. ……
We really need to understand those systems, not just their components. It’s not good enough to have wonderful gadgets like smartphones and computers; each of them working fine in separation. Their interaction is creating a completely new world, and it is very important to recognize that it’s not just a gradual change of our world; there is a sudden transition in the behavior of those systems, as the coupling strength exceeds a certain threshold.’

Three components

‘The [first] component to ‘measure the state of the world’ is called the Planetary Nervous System. It can be imagined as a global sensor network, where ‘sensors’ include anything able to provide data in real-time about socio-economic, environmental or technological systems (including the Internet). Such an infrastructure will enable real-time data mining – reality mining – and the calibration and validation of coupled models of socio-economic, technological and environmental systems with their complex interactions. It will even be possible to extract suitable models in a data-driven way, guided by theoretical knowledge.’ (Future ICT. Global computing for our complex world, p.18)
‘The second component, the Living Earth Simulator will be very important here, because that will look at what-if scenarios. It will take those big data generated by the Planetary Nervous System and allow us to look at different scenarios, to explore the various options that we have, and the potential side effects or cascading effects, and unexpected behaviors, because those interdependencies make our global systems really hard to understand.’
‘The third component will be the Global Participatory Platform. That basically makes those other tools available for everybody: for business leaders, for political decision-makers, and for citizens. We want to create an open data and modeling platform that creates a new information ecosystem that allows you to create new businesses, to come up with large-scale cooperation much more easily, and to lower the barriers for social, political and economic participation.’

Scoop.IT: FuturICT



Social Physics: Another Approach

Alexander Pentland from the MIT Media Lab is also dealing with the opportunities of Big Data. In his book “Social Physics” he reflects on what can be done with this treasure of information. And the approach he follows is a rather technocratic one.

Social physics?

‘Social physics is a quantitative social science that describes reliable, mathematical connections between information and idea flow on the one hand and people’s behavior on the other. Social physics helps us understand how ideas flow from person to person through the mechanism of social learning and how this flow of ideas ends up shaping the norms, productivity, and creative output of our companies, cities, and societies. It enables us to predict the productivity of small groups, of departments within companies, and even of entire cities. It also helps us tune communication networks so that we can reliably make better decisions and become more productive.’ …
See also Pentland at a Google show:


‘The engine that drives social physics is big data: the newly ubiquitous digital data now available about all aspects of human life. Social physics functions by analyzing patterns of human experience and idea exchange within the digital bread crumbs we all leave behind us as we move through the world—call records, credit card transactions, and GPS location fixes, among others. These data tell the story of everyday life by recording what each of us has chosen to do. And this is very different from what is put on Facebook; postings on Facebook are what people choose to tell each other, edited according to the standards of the day. Who we actually are is more accurately determined by where we spend our time and which things we buy, not just by what we say we do.
The process of analyzing the patterns within these digital bread crumbs is called reality mining, and through it we can tell an enormous amount about who individuals are.’ (From ‘Social Physics: How Good Ideas Spread-The Lessons from a New Science’, The Penguin Press, 2014).

‘How to re-engineer the world’: The Economist’s critical voice

‘Institutions should be redesigned around social physics, [Pentland] says. For instance, to improve health-care, anonymous medical records could be used to show what treatments work best. Mr Pentland’s research also offers lessons for policymakers and business people. He advances a new way to protect privacy by creating something of a property right for personal information. People would in most cases control what personal data were collected, how they are used, and with whom they are shared, treating their personal data as assets, as they do money in a bank. Yet he is less convincing when he strays from his research to make broader points about politics and economics. He reduces too much of the world’s complexity to something to be solved by data, when they are just part of the solution. His enthusiasm for a world run by datacrats rings of a zealotry that could easily go awry. Still, “Social Physics” is a fascinating look at a new field by one of its principal geeks.’ (From The Economist, ‘How to re-engineer the world’)


‘A society enabled by Big Data’

‘Reality mining’ is the buzzword, and it is tied to the other buzzword, ‘Human-Data Interaction’ (HDI).
  • Human-Data Interaction (HDI):
    ‘Personal data about and by each of us, whether we are aware of it or not, feeds into black-box analytics algorithms to infer facts, both correct and incorrect. These drive actions, whose effects may or may not be visible to us’.

Internet of things

Sources of big data are not only humans with or without their devices but also objects equipped with sensors and machines communicating with machines (M2M). In the Internet of things (IoT), things exchange data, semantic descriptions make things interoperable, and interconnected smart objects become reality.
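
How a semantic description can help interoperability may be easier to see in a small sketch. The following Python snippet is only an illustration under stated assumptions: the property and unit identifiers (`prop:air_temperature`, `unit:fahrenheit`, …), the device names and the `Reading` type are all made up for this example and do not come from any real IoT vocabulary. The idea is simply that each thing ships its value together with a description of what the value means, so a consumer can combine readings from different devices.

```python
from dataclasses import dataclass

# Hypothetical unit vocabulary: maps a unit identifier to a conversion into Celsius.
TO_CELSIUS = {
    "unit:celsius": lambda v: v,
    "unit:fahrenheit": lambda v: (v - 32.0) * 5.0 / 9.0,
    "unit:kelvin": lambda v: v - 273.15,
}


@dataclass(frozen=True)
class Reading:
    """A sensor value plus a (hypothetical) semantic description of it."""
    device: str
    prop: str   # what is measured
    unit: str   # in which unit
    value: float


def to_celsius(reading: Reading) -> float:
    """Normalise a temperature reading, using only its description."""
    if reading.prop != "prop:air_temperature":
        raise ValueError(f"not a temperature reading: {reading.prop}")
    return TO_CELSIUS[reading.unit](reading.value)


# Two devices that do not know each other and report in different units.
readings = [
    Reading("thermostat-1", "prop:air_temperature", "unit:celsius", 21.5),
    Reading("wearable-7", "prop:air_temperature", "unit:fahrenheit", 68.0),
]

normalized = [round(to_celsius(r), 2) for r in readings]
print(normalized)
```

Without the description attached to each value, the two numbers could not be compared safely. With it, the consumer needs no device-specific knowledge.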

‘The internet of things is a way to deliver cheap information that could be used for good or ill. So let’s start talking about what we want as a society.’ This is the motto of one of several conferences dealing with this topic.

Data Explosion: Analytics Software Must Adapt or Die

From ReadWriteWeb: Written by Richard MacManus / June 2, 2010 12:30 AM

In my previous few articles, I’ve explored the potential impact of sensors on the Internet. Soon there will be a trillion sensors connected to the Web, which will result in an explosion of online data. How will this mass of new and mostly real-time data be processed and analyzed? Will current data analytics software be able to cope? The short answer is, no it won’t. New types of analytics software will be required, together with much more powerful computers.

During my visit to HP Labs last month, I sat down with Meichun Hsu – director of the Intelligent Information Management Lab at Hewlett Packard – to discuss this issue. Hsu has been researching new real-time, sensor analytics solutions for the coming Internet of Things era.

Read more……

Web Wide World – Internet of Things (IoT)

The Web evolves. Everything is being tracked. Data and real world objects are linked together, and the web is the medium where all this happens. This, at least, is the (perhaps not so distant) vision.

Nova Spivack discusses this in his article ‘From World Wide Web to Web Wide World — The Web Breaks Out of its Petri Dish’.

And a European Union Conference starting on 6 October 2008, entitled “Internet of Things – Internet of the Future”, is also focussing on this issue: ‘The Internet is at a crossroads of its evolution. Mobile internet and Radio Frequency Identification (RFID), among other key technologies, will soon allow the creation of an « Internet of objects » whose services will weave themselves into users’ daily life. Tomorrow’s Internet services will expand to various fields like health, education, proximity services and energy management.’

Currently the EU has launched a Consultation on the early challenges regarding the “Internet of Things”: ‘The context of this consultation is the preparation of a Communication from the European Commission on the Internet of Things (IoT), planned for the second quarter of 2009. … The Communication on the Internet of Things will propose a policy approach addressing the whole range of political and technological issues related to the move from RFID and sensing technologies to the Internet of Things. It will focus especially on architectures, control of critical infrastructures, emerging applications, security, privacy and data protection, spectrum management, regulations and standards, broader socio-economic aspects.’

A Working Paper of the EU Commission explains, with some instructive examples, what IoT could mean: ‘The phrase “Internet of Things” heralds a vision of the future Internet where connecting physical things, from banknotes to bicycles, through a network will let them take an active part in the Internet, exchanging information about themselves and their surroundings. This will give immediate access to information about the physical world and the objects in it – leading to innovative services and gains in efficiency and productivity.’

What could this mean for statistical information?

First of all, changes in data collection (e.g. ‘The Internet of Things will have a profound effect on the way traffic, weather, particles in the air, water pollution, and the environment can be monitored and statistics collected.’ Working Paper of the EU Commission, p. 5).

But also (and much more difficult to anticipate) changes in the dissemination of information. The Semantic Web is often seen as a system of linked data (not documents). Every object gets its URI (its unique address) and is described with a set of specific properties (e.g. as RDF triples).

So in a world of described objects (data or also real world objects), search engines can bring together objects with common properties and open new dimensions of information and knowledge.
In theory (and perhaps in a distant future) objects in the real world can be linked with many other objects, among them (object-specific) statistical data.
To think about!
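
To make the idea of described objects a little more concrete, here is a minimal sketch in plain Python. It is not an RDF library and not a real dataset: triples are modelled as simple (subject, predicate, object) tuples, and every URI under `http://example.org/` as well as the helper names `subjects_with` and `objects_of` are assumptions made up for illustration.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Each object has a URI and is described by (subject, predicate, object) triples.
triples = [
    (EX + "station/1", EX + "prop/type", EX + "class/AirQualityStation"),
    (EX + "station/1", EX + "prop/locatedIn", EX + "city/Zurich"),
    (EX + "station/1", EX + "prop/hasStatistic", EX + "stat/pm10-annual-mean"),
    (EX + "bridge/9", EX + "prop/type", EX + "class/Bridge"),
    (EX + "bridge/9", EX + "prop/locatedIn", EX + "city/Zurich"),
    (EX + "station/2", EX + "prop/type", EX + "class/AirQualityStation"),
    (EX + "station/2", EX + "prop/locatedIn", EX + "city/Geneva"),
]


def subjects_with(triples, predicate, obj):
    """Return all subjects that share the given property value."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})


def objects_of(triples, subject, predicate):
    """Return everything a subject is linked to via the given predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Bring together objects with a common property: everything located in Zurich.
in_zurich = subjects_with(triples, EX + "prop/locatedIn", EX + "city/Zurich")

# Follow a link from a real-world object to its (object-specific) statistics.
stats = objects_of(triples, EX + "station/1", EX + "prop/hasStatistic")

print(in_zurich)
print(stats)
```

The first query groups otherwise unrelated objects (a measuring station and a bridge) by a shared property. The second follows a link from one object to a statistical resource, which is the kind of connection the paragraph above speculates about.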

See also the ‘Real World Internet’ Position Paper and the ‘Future Internet Portal’.