Grex Agora46 Conference

Item 140: JAPANESE EARTH SIMULATOR: A CHALLENGE AND AN OPPORTUNITY

Entered by spectrum on Fri Aug 1 18:44:33 2003:

The article by John Markoff, which appeared in the New York Times on 20th
April, describing the Earth Simulator in Japan, and a similar article in
HPCwire 4/26/2002, have given a seismic jolt to the complacency of US
supercomputer policy makers. They have debunked the notion that there is only
one path to supercomputing via off-the-shelf chips in a massively parallel
configuration. To paraphrase Mark Twain, "the reports of the death of parallel
vector supercomputers were an exaggeration". 

Statements of fact such as "A Japanese laboratory has built the world's fastest
computer, a machine so powerful that it matches the raw processing power of
the 20 fastest American computers combined and far outstrips the previous
leader, an IBM-built machine" are hardly surprising. 

What is puzzling is the feigned surprise that this has happened. The Earth
Simulator project started in 1997 and has been in the public domain ever
since. The Japanese reported its progress in supercomputer forums worldwide.
I have also reported it in this publication on several occasions over the past
two years. And yet we have quotes: "the arrival of the Japanese supercomputer
evokes the type of alarm raised by the Soviet Union's Sputnik satellite in
1957. In some sense we have a Computenik on our hands," said Jack Dongarra,
who reported the achievement. Jack maintains the authoritative list of the
world's 500 fastest computers based on the LINPACK benchmark and, as a leading
light in the development of MPI, is a legend in the supercomputing field. As
far back as 1987 he was sufficiently well-known for a UK delegation, which
I led, to visit him at the Argonne National Laboratory to discuss
state-of-the-art HPC issues. 
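
For readers who have not met it, LINPACK ranks a machine by how fast it solves
a dense linear system Ax = b. A minimal sketch of the idea in Python
(illustrative only: the real benchmark uses optimised libraries and far larger
problems; the 2n^3/3 + 2n^2 figure is the standard LINPACK operation count):

```python
import random
import time

def linpack_sketch(n=150, seed=1):
    """Solve a random dense n x n system Ax = b by Gaussian elimination
    with partial pivoting, and report the rate in Mflop/s."""
    rng = random.Random(seed)
    A = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in A]          # so the exact solution is all ones

    t0 = time.perf_counter()
    for k in range(n):                   # forward elimination
        p = max(range(k, n), key=lambda i: abs(A[i][k]))  # partial pivot
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):       # back substitution
        s = b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    elapsed = time.perf_counter() - t0

    flops = 2.0 * n**3 / 3.0 + 2.0 * n**2    # standard LINPACK count
    error = max(abs(xi - 1.0) for xi in x)
    return flops / elapsed / 1e6, error
```

The Mflop/s figure this reports is "sustained" performance: it depends as much
on memory behaviour as on the CPU's peak rate, which is exactly why the
benchmark is used to rank machines.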

For those of us long in the tooth and with grey hair, this reaction is not
new. In 1984 a number of articles were published in the US press stating that
the Japanese had achieved, or were just about to achieve, supremacy over the
US in the manufacture of supercomputers. Raul Mendez published two articles in the
Society for Industrial and Applied Mathematics News and this triggered intense
interest. Mendez's assertion that the Japanese would gain supremacy in the
supercomputer field was immediately refuted by Peter Gregory then of Cray
Research and Neil Lincoln then of ETA Systems, in statements reported in High
Technology. They both exuded confidence that the US would keep its substantial
lead. At the same time they used this assertion as leverage to extract a
deluge of new funds from the Department of Defence for research in new
supercomputers. 

One has to remember that in the early eighties the US academic community
swallowed the concept that small systems (DEC-VAX systems) were more cost
effective and user-friendly. This led them into a desert. They then realised
that whatever the merits of small systems they were not able to do real
science on them. The lack of exposure of their new graduates to the "big
ideas" made them less suitable to industry and their competitive edge started
to wane vis-a-vis Japan and Western Europe. The Lax report in 1984 articulated
with clarity the damaging economic consequences of neglecting supercomputing.


The decision to set up an initial five National Science Foundation (NSF)
supercomputer centres was swiftly implemented. After that, supercomputer
centres and resources grew rapidly. 

As early as 1990, these lessons were forgotten and the proponents of
microprocessor-based off-the-shelf Massively Parallel Processors (MPPs) and
clusters were in the ascendant, raising doubts about vector processors'
survival. "Wither vector and Fortran!" they shouted. Some, in the guise of
the US ASCI programme, made decisions that almost destroyed Cray Research,
the main vector processor provider in the USA. There were also other factors
in Cray's difficulties, not least its dependence on users from the weapons
industry, which declined after the collapse of the Soviet Union and the end
of the Cold War. 

Cheap off-the-shelf processors have their attraction. Funding bodies like them
because they often give the illusion of cheap capability computing. In
reality, insufficient memory bandwidth and slow interconnect switches
intercede and deliver a mirage. For example, when Intel released the iPSC/2
in the late eighties, offering a 60Mflop/s processor for $64,000, there was
much excitement and a prediction that the Cray, the workhorse of scientific
computing at that time, would become obsolete overnight. In reality the iPSC/2
could only deliver around 2Mflop/s sustained, about the same as the Control
Data CDC 6600 twenty-five years earlier, and was no match for the Cray Y-MP
for large-scale computation. The added difficulty of parallel programming in
the late eighties hammered the last nail into the iPSC/2's commercial coffin. 
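
The gap between peak and sustained rates has a simple back-of-envelope form,
nowadays called the "roofline" model: sustained flop/s cannot exceed either the
arithmetic peak or the memory bandwidth multiplied by the flops performed per
byte moved. A sketch in Python, with figures chosen only to echo the 60Mflop/s
peak and ~2Mflop/s sustained quoted above (they are not measurements of the
iPSC/2):

```python
def attainable_flops(peak_flops, mem_bandwidth_bytes, flops_per_byte):
    """Roofline model: the sustained rate is capped by whichever is lower,
    the raw arithmetic peak or the rate at which memory can feed operands."""
    return min(peak_flops, mem_bandwidth_bytes * flops_per_byte)

# Hypothetical node: 60 Mflop/s peak, 8 MB/s effective memory bandwidth,
# running a streaming kernel that performs 0.25 flops per byte moved.
sustained = attainable_flops(60e6, 8e6, 0.25)
print(sustained / 1e6)  # 2.0 -- bandwidth-bound, a small fraction of peak
```

No amount of extra peak helps a code stuck on the bandwidth side of the
roofline; only faster memory, or a higher flops-per-byte ratio, does -- which
is precisely the case the parallel vector vendors make.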

By the mid-nineties the new breed of computers made from off-the-shelf
commodity chips arrived on the market. Those funded from the US ASCI programme
and consisting of several thousand CPUs grabbed the headlines, but because
of communication and memory bandwidth limitations, they often delivered very
little of their potential peak performance to the user. As one US Earth
Systems scientist told me in May 2000: "In the USA it was as if we were
entering a Grand Prix, and some of us said: don't give us all that money for
the best car, just pay us less and we will buy a commodity off-the-shelf
General Motors car and soup it up. Of course we lost". 

Several universities and government agencies have tried to buy NEC machines
over the last decade for purposes like aircraft simulation, seismic studies
and molecular modelling. But resistance from the Commerce Department and
members of Congress, who complained that NEC was "dumping" the machines or
selling them below cost, thwarted these sales. This infamous protectionist
posture taken by the US government against Japanese vendors with viable vector
parallel systems, and the near collapse of Cray Inc., sent shivers through aircraft
companies with a substantial stake in weapons production, who had previously
been using Cray vector supercomputers. 

The concerns from this important section of industry sparked a debate in
Washington. This debate is now over. It was resolved that there is a need for
high bandwidth systems and money has been made available for Cray Inc. to
develop the SV2 and successor products. NEC sells the SX-6, a scaled-down
version of the Earth Simulator supercomputer. Last year Cray Inc. entered into
a marketing agreement to sell these machines in the United States, but no
sales have been announced to date. 

The US policy of favouring scalar MPPs in the 1990s gave a great fillip to
vendors with microprocessor-based systems inside its protected home market,
but elsewhere by year 2000 the aerospace and weather/climate sectors were
dominated by NEC. 

Teraflop/s, for example, are essential for realistic simulation of global
climate changes. Today's weather models are too crude, too large-grained. To
extend models to include the earth's hydrology (underground water reservoirs
as well as clouds), computers require teraflop/s power. As far back as 1990,
supercomputer model simulations were exemplified by studies on smog in cities
such as Hamburg, London and the Los Angeles basin. This last was used to
develop a cost-effective technical basis for key revisions of the US Clean
Air Act. The annual cost of environmental control projects exceeds $100
billion and a modest cost reduction pays for many supercomputers. 
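
The teraflop/s requirement follows from straightforward scaling arithmetic:
halving the grid spacing in all three dimensions multiplies the number of
points by eight, and numerical stability usually forces the time step to halve
as well, so each refinement costs roughly sixteen times more. A sketch (the
flops-per-point figure is an arbitrary assumption):

```python
def model_cost(nx, ny, nz, steps, flops_per_point=1000.0):
    """Rough flop count for one simulated period of a grid-based model."""
    return nx * ny * nz * steps * flops_per_point

coarse = model_cost(360, 180, 30, 10_000)   # illustrative ~1-degree grid
fine   = model_cost(720, 360, 60, 20_000)   # spacing halved, steps doubled
print(fine / coarse)  # 16.0
```

Two or three such refinements, of the kind needed to resolve clouds and
hydrology, turn a gigaflop/s-class workload into a teraflop/s one.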

A few days ago I had an email from the Earth Simulator Centre in Yokohama
informing me how they are putting their 35.6Teraflop/s to good use. Several
projects are up and running. As some of you know the Earth Simulator was
developed to study two important areas. One area is atmospheric and
oceanographic science: prediction of long-term climate change (global
warming), middle/short-term climate change (El Niño), and weather disasters
(typhoons). The other area is solid earth science: evolution of the solid
earth (core, mantle and crust), evolution of the crust and mantle near Japan,
evolution of earthquake generation processes (seismic wave tomography) and
so on. 

They are very keen to have international collaborators from the US and Europe
especially on climate change. As I understand it, for prediction of typhoons
they are already negotiating collaboration with the Meteorological Agency of
Canada and the University of Taiwan. 

As we have seen from the Earth Simulator the key to Teraflop/s sustained
performance is fast CPUs, fast memories and fast interconnect switches. The
(type C), tightly coupled parallel vector supercomputer architectures from
Cray and NEC have the edge by a large margin over (type T) microprocessor-based
systems at present. 

As Burton Smith from Cray Inc. said on many occasions: "each system type is
adapted to its ecological niche; Type T systems perform well with local data,
well-balanced workload, explicit methods and domain decomposition. Type C
systems (represented by NEC and Cray) perform well with global access of data,
poorly balanced workloads, sparse linear algebra, implicit methods, and
adaptive or irregular meshes". 
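
Burton Smith's taxonomy can be caricatured as a checklist. The toy function
below encodes it as a majority vote over workload traits; the trait names are
my own shorthand for the quote above, not any official classification:

```python
# Traits favouring commodity MPPs (type T) and parallel vector systems
# (type C), paraphrased from Burton Smith's characterisation quoted above.
TYPE_T_TRAITS = {"local data", "well-balanced workload",
                 "explicit methods", "domain decomposition"}
TYPE_C_TRAITS = {"global data access", "poorly balanced workload",
                 "sparse linear algebra", "implicit methods",
                 "adaptive or irregular meshes"}

def suggested_niche(workload_traits):
    """Return 'T', 'C', or 'either' by counting which trait set the
    workload matches more often."""
    t = len(workload_traits & TYPE_T_TRAITS)
    c = len(workload_traits & TYPE_C_TRAITS)
    return "C" if c > t else ("T" if t > c else "either")

print(suggested_niche({"implicit methods", "adaptive or irregular meshes"}))
```

A real procurement decision weighs far more than a trait count, of course; the
point is only that each architecture has workloads it serves well.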

At present both of these types of supercomputer are capable of being massively
parallel, as the ASCI White and the Earth Simulator have demonstrated. The
issue is how to put dogma aside and pursue architectural diversity, to produce
capability computers suited to their selected niches. Remember that 200 light
aircraft may be cheaper, but they are not equivalent to a Boeing 747 for
traversing the Atlantic. 

Although the supercomputer market is relatively small, about $5 billion a year,
it is overlaid with strategic importance spanning technological and scientific
advances as well as national security imperatives. In many ways the politics
overwhelm the rationale, with vendors bending politicians' ears with unhealthy
consequences. 

On the same day as the New York Times article was written, Friday April 19,
the NSF Blue Ribbon Panel on Cyber-infrastructure was meeting at the
University of Michigan to present their draft report on the next phase of the
Terascale programme. Instructively the only member on the panel from outside
education was from IBM Research. He gave a presentation on what has changed
in computing. The meeting proposed a new initiative to revolutionize science
and engineering research, at NSF and worldwide, and to capitalize on new
computing and communications opportunities. 21st Century Cyber-infrastructure
would include supercomputing, but also massive storage, networking, software,
collaboration, visualization and human resources. They foresee that current
centers (NCSA, SDSC, PSC) are a key resource for the initiative. The budget
estimate is an incremental $650 million/year (continuing), although some perceive
the requested funds as rather low. 

Hopefully, NSF will have the courage to grasp the opportunity to pursue
architectural diversity and propose that a healthy chunk of the funds are used
to procure parallel vector systems. To paraphrase Queen Elizabeth speaking
earlier this week to both houses of parliament in the UK, on the occasion of
celebrating her 50th year on the throne: "Change has become constant in our
lives, the way it is managed and how we embrace it will determine the future".


4 responses total.

#1 of 4 by russ on Sat Aug 2 01:29:45 2003:

I'm not sure what Markoff is getting at.  Simulating the Earth *is*
one of those easily-parallelizable problems that Beowulf clusters
are so good at.  It's problems like Fourier transforms and such
that require access to all of the data at once, which is why it's
so hard to compute holograms.

Given this, I get the feeling that the article is planted to try
to push policymakers to support some program or option for a
purpose that Markoff (or the person he's working for) wishes to
remain unstated.  That would explain the apparent inconsistencies.


#2 of 4 by polytarp on Sat Aug 2 18:04:15 2003:

John Markoff sucks.  Free Kevin.


#3 of 4 by nydus on Sun Sep 21 20:25:52 2003:

japanese are too powerful... earth simulator is a very powerful machine but
i think that is not possible to calculate all... math is not life.
in my opinion japan are going to decrypt all net passwords for conquer the
world. hehehe crazy? wild? simply the truth... hehehe
italy forever
nydus


#4 of 4 by jp2 on Mon Sep 22 00:08:11 2003:

This response has been erased.


