Data Capacitor


Data Capacitor In the News
April 20, 2009: IU Data Capacitor, Lustre WAN and TeraGrid enable collaborative science
April 29, 2008: IU dedicates new resources to support Lustre over wide area networks
November 16, 2007: Team led by IU wins Bandwidth Challenge at SC07
June 7, 2007: Data Capacitor reaches 977 MB/sec over TeraGrid
April 3, 2007: Data Capacitor team nominated for MIRA award
October 5, 2005: Indiana University Announces Major National Science Foundation Grant for Data Capacitor Project
About the Data Capacitor
The Data Capacitor is a high-speed, high-bandwidth storage system for research computing that serves all IU campuses and NSF TeraGrid users. At peak performance, the Data Capacitor has an aggregate transfer rate of 14.5 gigabytes per second.
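To put these rates in perspective, the following minimal sketch estimates ideal transfer times for a dataset at the 14.5 gigabyte per second peak aggregate rate and at the 977 MB/sec rate reported over TeraGrid. The 1 terabyte dataset size, the decimal unit definitions, and the function name `transfer_seconds` are assumptions made for illustration; real transfers will be slower because of protocol overhead and contention.

```python
# Ideal transfer-time estimate. Assumes decimal units (1 GB = 10**9 bytes)
# and ignores protocol overhead, contention, and client-side limits.

MB = 10**6
GB = 10**9
TB = 10**12

PEAK_RATE = 14.5 * GB  # peak aggregate rate quoted for the Data Capacitor
WAN_RATE = 977 * MB    # rate reported over TeraGrid in June 2007


def transfer_seconds(size_bytes: float, rate_bytes_per_sec: float) -> float:
    """Return the ideal time in seconds to move size_bytes at the given rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_sec


dataset = 1 * TB  # hypothetical dataset size, chosen for illustration
local_s = transfer_seconds(dataset, PEAK_RATE)
wan_s = transfer_seconds(dataset, WAN_RATE)

print(f"1 TB at peak aggregate rate: {local_s:.1f} s")
print(f"1 TB at 977 MB/sec over the WAN: {wan_s / 60:.1f} min")
```

The peak figure is an aggregate across the whole system, so a single client should not expect to reach it on its own.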

The Data Capacitor allows users to obtain high read/write speeds for their data as well as support for very large files. Using a wide area filesystem, the Data Capacitor permits users to access remote data as if the filesystem were mounted locally. This allows researchers at multiple remote sites to share large amounts of data.

The Data Capacitor comprises:

52 Dell servers running Red Hat Enterprise Linux
24 Myricom 10 Gigabit Ethernet cards
12 DataDirect Networks S2A9550 storage controllers
30 DataDirect Networks 48-bay SATA disk chassis
535 terabytes usable disk storage using the Lustre filesystem
Six water-cooled Rittal racks with magnetic locks

You can read frequently asked questions or papers and presentations about the Data Capacitor, apply for a project allocation on the Data Capacitor for your project, or contact the Data Capacitor team.

Projects Using the Data Capacitor
Data Capacitor users span a wide variety of fields yet all share a need to store and manipulate large data sets. Some of our users include:

Center for Genomics and Bioinformatics

The Center for Genomics and Bioinformatics is a multidisciplinary research center serving the Indiana University Bloomington campus. The CGB carries out independent research in genomics and bioinformatics, collaborates with and/or assists projects developed by IUB faculty, and promotes interdepartmental and interdisciplinary interactions to enhance genomics and bioinformatics at IUB.

Computational Biology and Bioinformatics

Located in the Center for Computational Biology and Bioinformatics at IUPUI, Sean Mooney's laboratory is focused on research and training in bioinformatics and computational biology. Specifically, the laboratory's research aims to characterize and predict the effects of genetic variation.

Computational Chemistry

James P. Reilly's laboratory focuses on research in efficient biomolecular ion production, proteomics, photochemistry of peptide ions, protein structure and cellular fingerprinting, and novel time-of-flight instrumentation.

Platform for Computational Comparative Genomics on the Web

PLATCOM is an integrated system for the comparative analysis of multiple genomes. It is designed in a modular way, so that multiple tools and databases can be integrated freely and the whole system can grow easily. The PLATCOM system is built on internal databases, which consist of GenBank, Swiss-Prot, COG, KEGG, and the Pairwise Comparison Database (PCDB). PCDB is a database derived from GenBank, built by performing pairwise comparisons of protein-to-protein and whole genome-to-whole genome sequences with FASTA and BLASTZ respectively. Currently it contains 48,205 entries of unduplicated protein-to-protein and whole genome-to-whole genome pairwise comparison matches. PCDB is designed to incorporate newer genomes automatically, so that PLATCOM evolves as new genomes become available. A suite of genome analysis applications is provided on top of these databases.

Computational Fluid Dynamics Laboratory

The Computational Fluid Dynamics Laboratory was established in 1986 within the Department of Mechanical Engineering to conduct research and develop software in the areas of computational fluid dynamics and heat transfer.

Current research projects include the finite element and finite volume solution of three-dimensional flow problems; high-speed compressible flow calculations for internal and external flows; unsteady flow computations; moving body flows with unstructured meshes; parallel computing; load balancing for parallel computing on parallel processors and networks of workstations; and high-performance grid computing.

Internet Traffic Analysis

This project studies the infrastructure scalability and vulnerabilities of expanding communication networks by analyzing the statistical behavioral patterns that emerge and are observable in Internet traffic data. The idea is that such analysis may lead to robust design, planning, and management tools, as well as methods for mitigating or immunizing against attacks through early detection of anomalous patterns correlated with malicious behavior. The networks considered span a very broad range of scale, from individual interactions (e.g., social engineering, phishing, covert communication) to application-specific flows (e.g., spam, email and Web based DDoS) to global-scale Internet traffic networks (e.g., Internet2 peer networks and worms).

Linked Environments for Atmospheric Discovery

Linked Environments for Atmospheric Discovery (LEAD) makes meteorological data, forecast models, and analysis and visualization tools available to anyone who wants to interactively explore the weather as it evolves. The LEAD Portal brings together all the necessary resources at one convenient access point, supported by high-performance computing systems. With LEAD, meteorologists, researchers, educators, and students are no longer passive bystanders or limited to static data or pre-generated images, but rather they are active participants who can acquire and process their own data.

Proteomics at IU

The Proteomics Core Facility at the Indiana University School of Medicine opened in the fall of 2001 in the Department of Biochemistry and Molecular Biology. It is a component of the INGEN cores supported by the Indiana Genomics Initiative (INGEN). The Proteomics Core Facility became the academic component of the Indiana Centers for Applied Protein Sciences (INCAPS) in May 2004 and was renamed the Protein Analysis and Research Center. It is a service and collaborative research resource that balances applied proteomics research with the development of new and improved methods for protein identification, characterization, and quantification. The Center encourages collaborations that apply the tools of proteomics to cutting-edge biomedical research.


WIYN Observatory

The WIYN Telescope, a 3.5-meter instrument employing many technological breakthroughs, is the newest and second largest telescope on Kitt Peak. The WIYN Observatory (pronounced "win") is owned and operated by the WIYN Consortium, which consists of the University of Wisconsin, Indiana University, Yale University, and the National Optical Astronomy Observatories (NOAO). Most of the capital costs of the observatory, which amounted to $14 million, were provided by these universities, while NOAO, which operates the other telescopes of the Kitt Peak National Observatory, provides most of the operating services. This partnership between public and private universities and NOAO is the first of its kind. The universities benefit from access to a well-run observatory at an excellent site, and the larger astronomical community served by NOAO benefits from the addition of this large, state-of-the-art telescope to Kitt Peak's array of telescopes.

X-ray Crystallography

The Indiana University Molecular Structure Center is a service and research facility in the Department of Chemistry at Indiana University, located in Bloomington, Indiana. The laboratory has a full complement of single crystal and powder diffraction equipment used to characterize crystalline materials using the techniques of X-ray crystallography. Researchers in the laboratory can determine the three-dimensional structure of nearly any material that can be crystallized.


The result of a crystallographic study is a set of atomic coordinates which locate the atoms of a molecule in the "unit cell" of the crystal. This information can then be used to generate images of the molecule and to determine distances and angles in the molecule. In addition, the data allow one to examine the packing of the molecules in the crystal, information which can often lead to understanding the properties of the material.
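As a minimal sketch of how distances and angles follow from atomic coordinates, the example below computes one bond length and one bond angle. It assumes the coordinates have already been converted to Cartesian values in ångströms (crystallographic coordinates are usually fractional and need the unit cell parameters for that conversion). The three atom positions and the function names are invented for illustration and do not come from any IUMSC structure.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z)."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at the middle atom b."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.hypot(*v1)
    norm2 = math.hypot(*v2)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cosine))


# Hypothetical Cartesian coordinates in angstroms (not a real structure).
center = (0.0, 0.0, 0.0)
atom_1 = (0.96, 0.0, 0.0)
atom_2 = (0.0, 0.96, 0.0)

print(f"bond length: {distance(center, atom_1):.2f} angstroms")
print(f"bond angle:  {angle_deg(atom_1, center, atom_2):.1f} degrees")
```

The same two functions can be applied to any pair or triple of atoms in a coordinate list, which is how tables of bond lengths and angles are typically derived.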

The IUMSC Server allows rapid access to the data that are generated in the IUMSC. Nearly all of the materials studied have been synthesized or isolated by researchers from other laboratories, usually within the Indiana University system, but often from laboratories throughout the world.



Copyright 2006, The Trustees of Indiana University



IU's involvement in the TeraGrid and the presentation of this material are based upon work supported by the National Science Foundation under Grants No. 0833618, SCI451237, SCI535258, and SCI504075. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.