On the national scene, this quarter saw major changes: the installation of a new President, an unprecedented economic downturn, and a stimulus bill that offers Science and Education new funding opportunities. This quarter also saw the MC2 carry out its NSF award for a planning meeting and bring to a culmination its effort to establish a Center for Hybrid Multicore Productivity Research (CHMPR) in collaboration with its three university partners: Georgia Tech, U Cal San Diego and U Minnesota. Overshadowing preparations for the planning meeting was the economy's effect on the ability of industry and government agencies to travel and to participate in academic support programs, which made it difficult for each university to obtain the letters of commitment (LOCs) from prospective CHMPR members. In spite of these obstacles, our efforts produced a well-attended meeting on Jan. 22-23 with more than 65 attendees, mostly from industry and government. The Co-I Directors gave very interesting vision presentations of their Labs, as did some of their staff. To view the presentations, visit our web site at http://mc2.umbc.edu/chmpr and click on Projects. A student poster session also helped deliver the message on what the CHMPR could contribute to meeting the challenges of attaining scalable performance and productivity in developing multicore applications.
Even more impressive was our team's effort to obtain LOCs from industry in these difficult financial times. MC2 obtained 11 LOCs, plus three more after the proposal was submitted to NSF, while two of our colleagues obtained five each. The LOCs came from some leading federal research agencies, nearly every major multicore chip vendor, and several leading defense IT contractors. Based on the response we were able to generate, we are optimistic in our pursuit of this NSF award.
This quarter also saw substantial overall improvements to our bluegrit hybrid cluster as a result of the efforts of John Dorband, with the support of Charles Lohr and Terrence Celeste. At the end of 2008 we received from IBM nine additional QS22 blades (four with 32 GB of RAM and five with 8 GB of RAM) and four IBM JS22 blades (three with 32 GB of RAM). John Dorband has installed them, and all the QS blades are now operational. Working with Terrence Celeste, he expects to have the JS series operational before the end of the month. John Dorband also upgraded the operating system by installing Fedora 8 on all our blades, giving us the ability to run SDK v3.1. In addition, with the help of Charles Lohr, John acquired and installed 20 TB of disk storage and made available 16 TB of RAID 5 storage running under a PVFS file system that can access data at rates approaching a maximum of 1 GB/s. The storage system is now available to all bluegrit jobs.
In an effort complementary to these system improvements, Shujia Zhou, working with Chao Cheng Wu and Ashwini Lahane, has created a hybrid ring of QS and JS blade processors executing MPI jobs under MPICH2. Moreover, Chao Cheng Wu, working with Shujia Zhou, has installed NetCDF4, the latest version of this essential geoscience data library. Shujia Zhou also tested and evaluated two compilers installed by John Dorband: the XLC 10.1 compiler's OpenMP feature for Cell/SPE and the XLF 11.1 compiler for Cell/SPE. I am pleased to report that these efforts have made the system more stable and have given users the capability to run a large class of modeling applications that were greatly limited prior to these changes.
Profs. John Dorband and Shujia Zhou of the MC2 are also offering a graduate course for the Spring 09 semester called “Introduction to Parallel Computing with the Emphasis of IBM Cell Processor”. The course was limited to 20 students and is currently overenrolled. Students in the course have been using the JS and QS blades, separately and in combination (i.e., as a hybrid system), to do their homework as well as their course projects. The system has proven to be reliable under this extended use. Faculty members outside the MC2 are now using the bluegrit system as well. One professor has used the JS system to benchmark his MPI code for a proposal to NSF.
Substantial progress was made on the following projects:
1. WRF Model. After many months, Po Lun Ma, a graduate student of Prof. Arking at Johns Hopkins University, with the aid of Prof. Zhou and our system staff, has successfully compiled and executed the NCAR/NOAA community Weather Research and Forecasting (WRF) model on the Cell. This model has been run on most other computer systems, and to our knowledge this is the first time the WRF model has run on the Cell processor. So far the model has been executed on only one Cell blade and only on the PPU. We are trying to run it on multiple Cell blades with MPI. The program had to use netCDF 3.6.3 rather than netCDF 4.0. The next step is to get WRF to run on the SPEs, we hope without having to convert the Fortran code to C. Unfortunately, we do not believe the IBM Fortran compiler can support OpenMP, so we will be examining other alternatives.
2. Cloud Computing. Two major breakthroughs took place this quarter in implementing a cloud computing infrastructure on the bluegrit system using the MapReduce parallel programming paradigm. (i) Navid Golpayageni, a student of Prof. Halem, installed and executed a version of Hadoop (an Apache open source implementation of Google's MapReduce and distributed file system) on the 32-blade JS20 cluster. To test it, he processed a month's worth of satellite data from the AIRS instrument using the high resolution visible channel data, and compared the result with the current SOAR parallel gridding approach, which assigns each blade a day's worth of data. He found that the new parallel paradigm performed the gridding almost five times faster. These results are being written up for his Master's thesis and will be presented at the IBM/NCSU cloud computing seminar on March 26-27 at NC State University in Raleigh, NC. (ii) David Chapman has successfully written and tested a MapReduce program for the Cell processor on a single blade. This work is currently being extended to execute on multiple blades. This version is distinct from the Wisconsin-developed version in that its structure is extensible to multiple blades using MPI and it can execute in a hybrid manner on the QS series and JS series simultaneously. It also uses a more efficient radix sort routine in the map-to-reduce stage.
3. SOAR Multi Instrument Infrastructure. A multi-instrument, multi-satellite pair of software packages called ‘Downloader’ and ‘Gridder’ has been written by David Chapman, Phuong Nguyen and Navid Golpayageni and installed into the operational version of the Service Oriented Atmospheric Radiance (SOAR) gridding system. SOAR is now able to download and grid more than six different instrument data sets from over a dozen satellites. This capability will enable instrument scientists and general users to incorporate data from a wide variety of planned, current and prior satellite missions into the SOAR system. In addition, the system can ingest radiance output products from weather and climate models in the same format and grid them for comparison with observed climate data records. This work is being written up for consideration as a chapter in a book to be published by In-Tech called “Geoscience and Remote Sensing”. An undergraduate student, Obinna Uozima, has been trained and has started the initial production processing of a seven-year AIRS and MODIS data set. This will then be compared with a seven-year collection of processed HIRS 2/3 data sets. We are also pleased to report that our paper by M. Halem, _D. Chapman_, P. Nguyen, C. Tilmes, Y. Yelena, N. Most, K. Stewart, titled "Service Oriented Atmospheric Radiances (SOAR): Services for Gridding and Analysis of Multi-Sensor Satellite Radiance Data for Climate Studies" appeared in the Trans. Geosciences and Remote Sensing Journal, Vol. 47, No. 1, Jan. 2009, pp. 114-122.
4. OMI Processing Algorithm. We have received the OMI benchmark algorithm from the NASA ozone processing team. Ashwini Lahane, working with Prof. Zhou, has been able to compile it on the bluegrit JS cluster. This is a very large Fortran program that needs a factor of 100 speed improvement for NOAA operational use by the end of the year. After testing, the goal is to execute the benchmark on the Cell. We plan to employ xlf11 with overlay. Obtaining such a performance improvement over conventional clusters will be a real computing challenge, but we are nevertheless hopeful that our system can deliver at least part of that factor.
5. RHSEG on the Cell. Srinivastan Kannan, a student working on his Master's thesis, has successfully tested the RHSEG segmentation algorithm on the bluegrit JS series and is in the process of converting the algorithm to run on the Cell SPU. NIST is interested in segmenting and analyzing thousands of molecular cell images.
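For readers unfamiliar with the MapReduce gridding approach mentioned under Cloud Computing above, the idea can be sketched in a few lines of Python. This is only an illustrative sketch and not the Hadoop or Cell code our students wrote: the `Obs` record, the 1-degree grid, the `cell_key` helper and the simple averaging are all assumptions made for the example. The map step tags each observation with a grid-cell key, a radix sort brings equal keys together in the map-to-reduce stage, and the reduce step averages each cell.

```python
from collections import namedtuple

# Hypothetical observation record; the real satellite data formats are richer.
Obs = namedtuple("Obs", ["lat", "lon", "radiance"])

N_LON = 360  # assumed 1-degree grid: 180 rows x 360 columns


def cell_key(lat, lon):
    """Map a lat/lon pair to an integer grid-cell index (hypothetical helper)."""
    row = min(int(lat + 90.0), 179)   # clamp lat = +90 into the last row
    col = min(int(lon + 180.0), 359)  # clamp lon = +180 into the last column
    return row * N_LON + col


def map_phase(observations):
    """Map: emit one (cell_key, radiance) pair per observation."""
    return [(cell_key(o.lat, o.lon), o.radiance) for o in observations]


def radix_sort_pairs(pairs, key_bits=16, radix_bits=8):
    """Stable LSD radix sort of (key, value) pairs by non-negative integer key.

    key_bits=16 is enough here because the largest cell key is 64799.
    """
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        buckets = [[] for _ in range(1 << radix_bits)]
        for pair in pairs:
            buckets[(pair[0] >> shift) & mask].append(pair)
        pairs = [pair for bucket in buckets for pair in bucket]
    return pairs


def reduce_phase(sorted_pairs):
    """Reduce: average the radiances in each run of equal keys."""
    grid = {}
    i = 0
    while i < len(sorted_pairs):
        key = sorted_pairs[i][0]
        total, count = 0.0, 0
        while i < len(sorted_pairs) and sorted_pairs[i][0] == key:
            total += sorted_pairs[i][1]
            count += 1
            i += 1
        grid[key] = total / count
    return grid


def grid_observations(observations):
    """Run map, sort (shuffle) and reduce in sequence on one machine."""
    return reduce_phase(radix_sort_pairs(map_phase(observations)))
```

In a real MapReduce system the map and reduce steps run in parallel on many blades and the sort also moves data between them; here everything runs in one process so the three stages are easy to see.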
We plan to track the progress of these projects, and of others not covered here, and to report on them in our next quarterly blog report.
Sincerely,
Milt Halem
Director, Multicore Computational Center (MC2)