
March 4, 2002 [Number 222]

New Scientific Applications Are on Biowulf

Scientists at the NIH can utilize the high-speed computational capabilities of the Biowulf/Lobos3 supercluster to enrich their research. Managed by the Helix Systems staff, Biowulf is a parallel-processing system available to all NIH researchers.

Biowulf: 800 processors running Red Hat Linux

Biowulf consists of a main login/administrative node and about 800 processors running the Red Hat Linux operating system.

The scientific software available on Biowulf has been recently expanded so that even more researchers can take advantage of the supercluster.

Computational Chemistry

  • Molecular Dynamics Packages

    Two major molecular dynamics (MD) packages are available on Biowulf—CHARMM (Chemistry at HARvard Molecular Mechanics) and AMBER (Assisted Model Building with Energy Refinement).

    MD calculations can involve heavy message passing between nodes, so some jobs benefit significantly from a faster internode network. Two executables are available for each program to support the different types of internode communication on Biowulf (e.g., fast Ethernet and Myrinet 2000).

      CHARMM
      Cover scripts for CHARMM, developed by Dr. Richard Venable (FDA/CBER), allow the user to select the CHARMM version and the communication network. Novice CHARMM users may find their runs greatly simplified by use of these scripts. Advanced users can use these scripts with their own versions of CHARMM.

      AMBER
      AMBER is a molecular simulation package that contains a large number of modules. Note that only the sander, sander classic and gibbs modules can run in parallel.

      AMBER: one iteration of sander for a single-stranded dinucleotide (5'-AT-3') plus one sodium ion in a periodic box of water molecules, comparing timings on different node and network types.


  • Quantum Mechanical Calculations

    Quantum mechanical calculations can be performed using GAMESS or GAUSSIAN 98.

      GAMESS
      GAMESS is a general ab initio quantum chemistry package. Among the possible quantum mechanical computations are molecular wavefunctions, energy corrections, optimized molecular geometries, and potential energy surfaces.

      GAUSSIAN 98
      Gaussian 98 can predict the energies, molecular structures, and vibrational frequencies of molecular systems, model those systems in both ground and excited states, and perform many other electronic structure calculations.

      GAMESS is parallelized, so a single job can run in parallel on multiple nodes. In contrast, GAUSSIAN 98 is single-threaded, so the benefit of using it on the Biowulf cluster lies in running many single-node jobs (a "swarm" of jobs) simultaneously.

Statistics

As measured by CPU cycles on the supercluster, statistics follows immediately behind molecular dynamics and quantum chemistry applications. GAUSS and R, the statistical packages available on Biowulf, are not parallelized, so the best way to use them is to run "swarms" of single-threaded jobs.

GAUSS—a fast matrix programming language designed for computationally intensive tasks—has a wide variety of statistical, mathematical, and matrix-handling routines. R is a language and environment for statistical computing and graphics, and provides a wide variety of statistical and graphical techniques such as linear and nonlinear modeling, time series analysis and clustering.
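To see why a swarm pays off for packages that are not parallelized, here is a rough sketch of the wall-clock arithmetic. It assumes every job takes the same time, each job occupies one processor, and there is no queueing or I/O overhead. The job count, processor count and per-job time are illustrative values, not Biowulf measurements.

```python
import math


def swarm_wall_time(n_jobs: int, n_processors: int, hours_per_job: float) -> float:
    """Estimate wall-clock hours for n_jobs independent single-threaded jobs.

    Assumes equal-length jobs, one job per processor, and no queueing
    or I/O overhead (all simplifications for illustration).
    """
    if n_jobs < 0 or n_processors < 1:
        raise ValueError("need n_jobs >= 0 and n_processors >= 1")
    # Jobs run in "waves": each wave fills every available processor.
    waves = math.ceil(n_jobs / n_processors)
    return waves * hours_per_job


# Illustrative numbers only: 200 one-hour jobs.
serial_hours = swarm_wall_time(200, 1, 1.0)   # one job at a time
swarm_hours = swarm_wall_time(200, 50, 1.0)   # 50 processors at once
print(serial_hours, swarm_hours)
```

Under these assumptions the run time shrinks in proportion to the number of processors, up to the point where there are more processors than jobs.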

Sequence Analysis

In this post-genomic era at NIH, many computational projects naturally involve nucleotide and protein sequences. Whole-genome studies, or those involving microarrays, produce vast amounts of sequence data that needs to be compared against other enormous sequence databases.

    BLAST, BLAT and Pfsearch
    BLAST, BLAT and Pfsearch are three sequence analysis packages that have been set up on the Biowulf cluster to analyze hundreds or thousands of sequences. BLAST and BLAT use different algorithms to compare query sequences against databases, while Pfsearch searches a database for a specified pattern. A convenient system has been set up on Biowulf to make it easy for users to perform large numbers of BLAST or BLAT searches. In a typical project, one user BLASTed 20,000 DNA sequences against the human genome database in 8 hours—instead of 500 to 1000 hours using more conventional systems.

    HMMER
    Soon to come is HMMER, Sean Eddy’s package that uses profile hidden Markov models to perform sensitive database searching.
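The speedup implied by the BLAST project described above can be worked out directly from the figures quoted there (20,000 sequences, 8 hours on Biowulf, 500 to 1000 hours on more conventional systems). The variable names below are chosen only for illustration.

```python
# Figures quoted in the article for the 20,000-sequence BLAST project.
n_sequences = 20_000
biowulf_hours = 8
conventional_hours = (500, 1000)  # low and high estimates

# Speedup = conventional time / Biowulf time.
speedup_low = conventional_hours[0] / biowulf_hours
speedup_high = conventional_hours[1] / biowulf_hours

# Average time per query sequence on Biowulf.
seconds_per_sequence = biowulf_hours * 3600 / n_sequences

print(f"speedup: {speedup_low:g}x to {speedup_high:g}x")
print(f"average time per sequence on Biowulf: {seconds_per_sequence:.2f} s")
```

This works out to a speedup of roughly 60 to 125 times, or an average of about 1.4 seconds per query sequence.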

Other Ways to Use Biowulf

Of course, use of Biowulf is not restricted to the software described above. Any parallelized program can be installed on the supercluster. Even more useful, any project that requires many runs of a single-threaded program can take advantage of "swarm"—a general-purpose command that makes it easy to submit large numbers of independent jobs.
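As an illustration of preparing many independent runs, the sketch below builds one shell command per input file. The article does not describe the input format that "swarm" expects, so the one-command-per-line layout is an assumption here, and the program name `myprog` and the input file names are hypothetical placeholders.

```python
import shlex
from pathlib import Path


def build_command_lines(program: str, input_files: list[str]) -> list[str]:
    """Return one shell command per input file, each an independent job."""
    lines = []
    for name in input_files:
        # Write each job's output next to its input, with a .out suffix.
        out = str(Path(name).with_suffix(".out"))
        # shlex.quote protects file names containing spaces or shell characters.
        lines.append(f"{program} {shlex.quote(name)} > {shlex.quote(out)}")
    return lines


# "myprog" and the input names are hypothetical placeholders.
commands = build_command_lines("myprog", ["run1.in", "run2.in", "my data.in"])
print("\n".join(commands))
```

Because each line is a complete, independent command, the list can be saved to a text file and the runs executed in any order on any free processor.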

Biowulf's System Monitor
Developed by the NIH staff, the monitor shows node activity. Each "dot" represents a dual-processor node. Red, orange and green are busy nodes; blue nodes are idle. The monitor can also be used to display activity associated with a single job or a single user.

Biowulf usage has grown steadily since it was first built, and the hardware and networking have been expanding in tandem. By the end of this year, the supercluster will comprise more than 1000 processors and over 500 gigabytes of RAM. The hundreds of individual nodes are interconnected by six fast Ethernet switches on a gigabit Ethernet backbone. Also connected to the backbone network are four high-performance fileservers with access to over a terabyte of user data.

Visit the Biowulf Web site. More information about installed scientific software can be found online.

 
Published by Center for Information Technology, National Institutes of Health