Earth project aims to ‘simulate everything’
01 January BBC
It could be one of the most ambitious computer projects ever
conceived. An international group of scientists are aiming to create a
simulator that can replicate everything happening on Earth - from global
weather patterns and the spread of diseases to international financial
transactions or congestion on Milton Keynes’ roads.
Nicknamed the Living Earth Simulator (LES), the project aims to
advance the scientific understanding of what is taking place on the
planet, encapsulating the human actions that shape societies and the
environmental forces that define the physical world.
“Many problems we have today - including social and economic
instabilities, wars, disease spreading - are related to human behaviour,
but there is apparently a serious lack of understanding regarding how
society and the economy work,” says Dr Helbing, of the Swiss Federal
Institute of Technology, who chairs the FuturICT project which aims to
create the simulator.
Knowledge collider
Thanks to projects such as the Large Hadron Collider, the particle
accelerator built by Cern, scientists know more about the early universe
than they do about our own planet, claims Dr Helbing.
What is needed is a knowledge accelerator, to collide different
branches of knowledge, he says.
“Revealing the hidden laws and processes underlying societies
constitutes the most pressing scientific grand challenge of our
century.”
The result would be the LES. It would be able to predict the spread
of infectious diseases, such as Swine Flu, identify methods for tackling
climate change or even spot the inklings of an impending financial
crisis, he says.
But how would such a colossal system work? For a start it would need to
be populated by data - lots of it - covering the entire gamut of
activity on the planet, says Dr Helbing.
It would also be powered by an assembly of yet-to-be-built
supercomputers capable of carrying out number-crunching on a mammoth
scale.
Although the hardware has not yet been built, much of the data is
already being generated, he says.
For example, the Planetary Skin project, led by US space agency Nasa,
will see the creation of a vast sensor network collecting climate data
from air, land, sea and space.
In addition, Dr Helbing and his team have already identified more
than 70 online data sources they believe can be used including Wikipedia,
Google Maps and the UK government’s data repository Data.gov.uk.
Drowning in data
Integrating such real-time data feeds with millions of other sources
of data - from financial markets and medical records to social media -
would ultimately power the simulator, says Dr Helbing.
The next step is to create a framework to turn that morass of data into
models that accurately replicate what is taking place on Earth today.
That will only be possible by bringing together social scientists and
computer scientists and engineers to establish the rules that will
define how the LES operates.
Such work cannot be left to traditional social science researchers,
where typically years of work produce limited volumes of data, argues
Dr Helbing. Nor is it something that could have been achieved before -
the technology needed to run the LES will only become available in the
coming decade, he adds.
Human behaviour
For example, while the LES will need to be able to assimilate vast
oceans of data it will simultaneously have to understand what that data
means. That becomes possible as so-called semantic web technologies
mature, says Dr Helbing.
Today, a database chock-full of air pollution data would look much
the same to a computer as a database of global banking transactions -
essentially just a lot of numbers. But semantic web technology will
encode a description of data alongside the data itself, enabling
computers to understand the data in context.
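As a rough illustration of that idea, the Python sketch below pairs the
same list of numbers with two different descriptions. It is a toy model
only, not RDF or any other real semantic web standard, and every name in
it (Described, summarise, the field names and the sample figures) is
invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Described:
    """A bare list of numbers paired with a description of what they mean."""
    values: list
    quantity: str  # what is being measured
    unit: str      # unit of measurement
    source: str    # hypothetical label for where the data came from


def summarise(ds):
    """Report the mean of the values using the attached description."""
    mean = sum(ds.values) / len(ds.values)
    return f"mean {ds.quantity}: {mean:.1f} {ds.unit} ({ds.source})"


# Identical numbers: without a description they are indistinguishable.
raw = [12.0, 48.0, 30.0]
pollution = Described(raw, "PM2.5 concentration", "ug/m3", "air-quality sensors")
payments = Described(raw, "transaction value", "USD million", "bank transfers")

print(summarise(pollution))
print(summarise(payments))
```

The two summaries differ only because of the attached description, which
is the point being made about data in context.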
What’s more, the approach to aggregating data stresses the need to
strip out any information that relates directly to an
individual, says Dr Helbing.
That will enable the LES to incorporate vast amounts of data relating
to human activity, without compromising people’s privacy, he argues.
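A minimal sketch of that kind of aggregation, in Python, might look like
the following. The field names, the sample records and the choice of
which fields count as direct identifiers are all assumptions made for
illustration; the article does not describe how the FuturICT team would
actually do it, and removing direct identifiers is not on its own a
guarantee of anonymity.

```python
from collections import defaultdict

# Hypothetical field names treated as directly identifying a person.
DIRECT_IDENTIFIERS = {"name", "email"}


def strip_identifiers(record):
    """Return a copy of the record without directly identifying fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def flu_cases_by_region(records):
    """Count flu cases per region using only de-identified records."""
    counts = defaultdict(int)
    for record in records:
        cleaned = strip_identifiers(record)
        if cleaned["has_flu"]:
            counts[cleaned["region"]] += 1
    return dict(counts)


# Invented sample data.
records = [
    {"name": "A. Example", "email": "a@example.org", "region": "North", "has_flu": True},
    {"name": "B. Example", "email": "b@example.org", "region": "North", "has_flu": False},
    {"name": "C. Example", "email": "c@example.org", "region": "South", "has_flu": True},
    {"name": "D. Example", "email": "d@example.org", "region": "North", "has_flu": True},
]

print(flu_cases_by_region(records))
```

Only the regional totals leave the function; the names and email
addresses are dropped before anything is counted.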
Once an approach to handling large-scale social and economic data
is agreed upon, it will be necessary to build the supercomputer centres
needed to crunch that data and produce the simulation of the Earth, says
Dr Helbing. Generating the computational power to deal with the amount
of data needed to populate the LES represents a significant challenge,
but it’s far from being a showstopper.
If you look at the data-processing capacity of Google, it’s clear
that the LES won’t be held back by processing capacity, says Pete
Warden, founder of the OpenHeatMap project and a specialist on data
analysis.
While Google is somewhat secretive about the amount of data it can
process, in May 2010 it was believed to use in the region of 39,000
servers to process an exabyte of data per month - that’s enough data to
fill 2 billion CDs every month.
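Those figures can be sanity-checked with some quick arithmetic, sketched
here in Python. The calculation assumes a decimal exabyte (10^18 bytes)
and a 700 MB CD, neither of which is stated in the article; the server
count and the two billion figure are taken from the text above.

```python
EXABYTE = 10**18           # bytes, assuming the decimal (SI) definition
CD_BYTES = 700 * 10**6     # bytes, assuming a standard 700 MB disc
SERVERS = 39_000           # server count quoted in the article
CDS_QUOTED = 2_000_000_000 # "2 billion CDs" quoted in the article

# How many 700 MB discs would one exabyte fill?
cds_per_month = EXABYTE / CD_BYTES

# What disc size does the quoted two billion figure imply?
implied_cd_bytes = EXABYTE / CDS_QUOTED

# Average monthly load per server if the work were spread evenly.
bytes_per_server = EXABYTE / SERVERS

print(f"CDs per month at 700 MB each: {cds_per_month / 1e9:.2f} billion")
print(f"Disc size implied by 2 billion CDs: {implied_cd_bytes / 1e6:.0f} MB")
print(f"Data per server per month: {bytes_per_server / 1e12:.1f} TB")
```

Under these assumptions an exabyte fills about 1.4 billion discs, so the
two billion figure corresponds to roughly 500 MB per disc: the same order
of magnitude, and consistent with a rounded estimate.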
Reality mining
“If you accept that only a fraction of the several hundred exabytes
of data being produced worldwide every year... would be useful for a
world simulation, the bottleneck won’t be the processing capacity,” says
Mr Warden. “Getting access to the data will be much more of a challenge,
as will figuring out something useful to do with it,” he adds.
Simply having lots of data isn’t enough to build a credible
simulation of the planet, argues Mr Warden. “Economics and sociology have
consistently failed to produce theories with strong predictive powers
over the last century, despite lots of data gathering. I’m sceptical
that larger data sets will mark a big change,” he says.
“It’s not that we don’t know enough about a lot of the problems the
world faces, from climate change to extreme poverty, it’s that we don’t
take any action on the information we do have,” he argues.