DOE's Computational Chemistry Grand Challenge addresses the problem of modeling large molecular systems. One application area is complex problems in environmental chemistry, such as the disposal of chlorofluorocarbons and the percolation of toxic chemicals through clay-based landfills and waste sites, which are of interest to both DOE and industry. Other application areas are product design and process optimization (for example, the design of polymers and composite materials) and molecular biology (for example, drug design).
Quantum chemical self-consistent field (SCF) methods for studying molecular structure on vector computing systems were limited both in scalability and by inefficiencies arising from extra computation. The new parallel SCF implementation overcomes these limitations. For example, to speed data communication among processors, a new programming strategy makes it easier to write efficient software for computing systems with nonuniform memory access costs. This strategy is embodied in the Fortran-callable Global Array communications library. The research has also enabled scientists to determine a scaling factor for assigning the optimal number of processors to the molecule under study.
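The Python sketch below illustrates, in toy form, the one-sided programming pattern that a Global Array-style library supports: any processor can fetch a block of a globally addressed matrix, compute on it locally, and accumulate the result back, regardless of which processor owns the data. The class and method names are hypothetical illustrations, not the library's actual Fortran API, and the arithmetic is a placeholder for the real integral-driven SCF work.

```python
# Illustrative sketch only: the one-sided "get / compute / accumulate" pattern.
# Names such as get_block and accumulate_block are hypothetical, not the
# Global Array library's real interface.

import numpy as np

class GlobalArray:
    """Toy stand-in for a globally addressable, distributed 2-D array."""
    def __init__(self, n):
        self.data = np.zeros((n, n))

    def get_block(self, rows, cols):
        # A real library may fetch remote data without interrupting the
        # owning processor (one-sided access).
        return self.data[rows[0]:rows[1], cols[0]:cols[1]].copy()

    def accumulate_block(self, rows, cols, block):
        # Atomically add a local contribution into the global array.
        self.data[rows[0]:rows[1], cols[0]:cols[1]] += block

def fock_like_update(density, fock, block=64):
    """Each task grabs a block of the density matrix, does local work, and
    accumulates its contribution; task assignment need not match data ownership."""
    n = density.data.shape[0]
    for i in range(0, n, block):
        for j in range(0, n, block):
            rows, cols = (i, min(i + block, n)), (j, min(j + block, n))
            d = density.get_block(rows, cols)
            contrib = 0.5 * d          # placeholder for integral-driven work
            fock.accumulate_block(rows, cols, contrib)

n = 128
density, fock = GlobalArray(n), GlobalArray(n)
density.data[:] = np.eye(n)
fock_like_update(density, fock)
```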
This work is being conducted by researchers from DOE's Argonne and Pacific Northwest laboratories and pharmaceutical and chemical companies such as AlliedSignal. They have run the new parallel software on the IBM SP-1, the Intel Delta and Paragon, the Kendall Square Research KSR-2, the Cray Research T3D, and workstation clusters.
The 38-atom carbonate system on the left illustrates the most advanced modeling capability at the beginning of the HPCC Program; the 389-atom zeolite system on the right was produced by a recent simulation. Computational complexity effectively grows as the cube of the number of atoms, implying roughly a thousandfold increase in computational power between the two images.
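A quick back-of-the-envelope check of that claim (simple arithmetic, not a measurement from the actual codes):

```python
# If cost grows as the cube of the atom count, scaling from 38 to 389 atoms gives:
ratio = (389 / 38) ** 3
print(round(ratio))   # ~1073, i.e. roughly a thousandfold increase in work
```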
More than half of the U.S. population depends on groundwater for its water supply. Groundwater is also an important source of irrigation and industrial process water. In many regions, available sources of groundwater are a fundamental constraint on development and economic activity. Groundwater supplies are increasingly threatened by organic, inorganic, and radioactive contaminants introduced into the environment by improper disposal or accidental release. Estimates of remediation costs of U.S. government sites alone range into the hundreds of billions of dollars. Protecting the quality of groundwater supplies is a problem of broad societal importance.
Remediation methods remain extremely (and potentially prohibitively) expensive and unpredictable in their success. The software developed under the DOE-sponsored Grand Challenges has been critical to devising effective remediation strategies. Grand Challenge numerical modeling of groundwater transport on massively parallel computing systems improves U.S. competitiveness by (1) applying groundwater modeling technologies directly to groundwater problems, (2) applying these technologies to related industrial processes, and (3) applying generic massively parallel computational methods to industrial processes.
Earthquake Ground Motion Modeling in Large Basins: The Quake Project
A tool is being developed to simulate earthquake ground motion on parallel computing systems in order to determine how the intensity and duration of ground motion vary over a region. The knowledge gained will be used to design earthquake-resistant structures suited to local conditions, leading to greater economy and safety. This work is illustrated below.
This NSF-funded project is being conducted by engineers, computer scientists, and seismologists at Carnegie Mellon University, the University of Southern California, the Southern California Earthquake Center, and the National University of Mexico.
The upper image shows a computational model of a valley that has been automatically partitioned for solution on a parallel computing system, with one color per processor. The lower image shows the response of the valley as a function of frequency and position within the valley. It is well known that the response of a building to an earthquake is greatest when the frequency of the ground motion is close to the natural frequency of the building itself. These results show that damage can vary considerably depending on building location and frequency characteristics. Obtaining this kind of information for large basins such as the Greater Los Angeles Basin requires high performance computing.
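The resonance effect described above can be illustrated with the standard dynamic amplification factor for a damped single-degree-of-freedom oscillator. The sketch below is a generic textbook illustration of why response peaks when the forcing frequency approaches a structure's natural frequency; it is not the Quake project's simulation code.

```python
import numpy as np

def amplification(freq_ratio, damping=0.05):
    """Steady-state dynamic amplification of a damped single-degree-of-freedom
    oscillator; freq_ratio = forcing frequency / natural frequency."""
    r = np.asarray(freq_ratio, dtype=float)
    return 1.0 / np.sqrt((1 - r**2) ** 2 + (2 * damping * r) ** 2)

# Response peaks sharply as the ground-motion frequency nears the natural frequency.
for r in (0.5, 0.9, 1.0, 1.1, 2.0):
    print(f"frequency ratio {r:.1f}: amplification {amplification(r):.1f}")
```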
Understanding land cover dynamics is critical to the study of global climate change. Databases of land cover dynamics are needed for global carbon models and for biogeochemical, hydrological, and ecosystem response modeling. Over the span of several decades, changes in vegetation take place at scales of less than 1 kilometer and therefore require analysis of high-resolution satellite images. Portable, scalable software for a variety of image and map data processing applications is being developed and will later be integrated with new models for parallel I/O of large-scale images and maps.
An initial focus area is generating maps of the world's tropical rain forests over the last three decades. The image below illustrates the application of a new algorithm for solving the mixture modeling problem to a remotely sensed image of part of Africa. Mixture modeling allows environmental scientists to estimate the proportions of different vegetation types present in a single pixel, thereby characterizing the vegetation more realistically than a classification that labels each pixel as a single vegetation type. Accurate descriptions of the land surface are important boundary conditions for climate and other global environmental models. Using the new algorithm, vegetation proportions are estimated by comparing the reflectance observed within a pixel to the reflectance expected if the pixel contained only one vegetation type, then solving for the proportions with mathematical optimization procedures. The algorithm, which builds on techniques for solving similar image restoration problems, is both more accurate and faster than the labor-intensive classical methods.
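The per-pixel estimation step can be posed as a small constrained optimization: find nonnegative proportions, summing to one, whose mixture of "pure" reflectance spectra best matches the observed pixel. The sketch below solves that problem with generic tools (SciPy's SLSQP solver) and made-up spectra; it is only an illustration of the mixture model being solved, not the project's parallel algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def unmix_pixel(observed, endmembers):
    """Estimate vegetation proportions for one pixel.

    observed   : reflectance vector for the pixel, shape (n_bands,)
    endmembers : matrix of pure-type reflectances, shape (n_bands, n_types)
    Returns proportions that are >= 0 and sum to 1.
    """
    n_types = endmembers.shape[1]
    x0 = np.full(n_types, 1.0 / n_types)            # start from an equal mixture

    def residual(p):
        return np.sum((endmembers @ p - observed) ** 2)

    result = minimize(
        residual, x0,
        bounds=[(0.0, 1.0)] * n_types,
        constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}],
        method="SLSQP",
    )
    return result.x

# Example with made-up 4-band spectra for desert, grass, and forest.
E = np.array([[0.45, 0.10, 0.05],
              [0.40, 0.15, 0.07],
              [0.35, 0.45, 0.30],
              [0.30, 0.35, 0.50]])
pixel = 0.6 * E[:, 0] + 0.3 * E[:, 1] + 0.1 * E[:, 2]
print(unmix_pixel(pixel, E))   # approximately [0.6, 0.3, 0.1]
```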
This NSF-funded work is being conducted at the University of Maryland.
This figure encodes the proportions of desert, grass, and forest within each pixel of a satellite image using color mixing. The Grand Challenge result, on the left, was produced using a new parallel algorithm and is a much more accurate estimate of mixture proportions than the least squares algorithm traditionally employed by environmental scientists.
The objective of this research is to demonstrate a high performance computing system for large-scale, high-resolution ecosystem simulation. To date, a DEVS-GIS (Discrete Event System Specification interfaced to Geographic Information System) has been implemented in C++ to run on a single processor or in a multiprocessor environment using PVM. An object-based interface enabling models to transparently access commonly used GIS databases has also been implemented. Watershed models have been developed and tested on a single-processor DEVS-GIS system and are being ported to the Thinking Machines CM-5. The next step is extensive performance analysis, which will include the development of interconnect "middleware" and visualization software specifically for the DEVS-GIS environment. The long-term goal of this NSF-supported research at the University of Arizona is to enable experimentation with complex ecosystem models that would be impossible without high performance computing.
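For readers unfamiliar with the DEVS formalism, an atomic model consists of a state, an external transition (triggered by input events), an internal transition (triggered when its scheduled time elapses), an output function, and a time-advance function. The toy Python class below sketches that structure with a made-up "storage cell" example; it is only a conceptual illustration, not the project's C++/PVM DEVS-GIS implementation.

```python
class AtomicDEVS:
    """Minimal sketch of a DEVS atomic model: state, external and internal
    transitions, an output function, and a time advance. The model here is a
    toy storage cell that releases accumulated rainfall after a fixed delay."""

    def __init__(self, delay=2.0):
        self.storage = 0.0
        self.delay = delay
        self.sigma = float("inf")      # time until the next internal event

    def ext_transition(self, elapsed, rainfall):
        # External event: rain arrives on an input port.
        self.storage += rainfall
        self.sigma = self.delay        # schedule a release

    def output(self):
        # Output is emitted just before the internal transition fires.
        return {"runoff": self.storage}

    def int_transition(self):
        # Internal event: the cell empties and goes passive again.
        self.storage = 0.0
        self.sigma = float("inf")

    def time_advance(self):
        return self.sigma
```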
("A Case Study Combining Grand Challenge and National Challenge Technologies: Modeling San Diego Bay to Aid Resource Management" is also presented.)