Research Computing Team

UAB IT Research Computing provides comprehensive support for researchers who use Cheaha, develops and maintains the cluster, and provides advanced software and consulting support when researchers need it. Research Computing staff also maintain proposal support documentation, such as letters of commitment and facilities descriptions, so that researchers can focus more tightly on their projects.

History of Research Computing at UAB

  • Acquiring the first cluster

    Aug 31

    Using a 2001 infrastructure development grant from the National Science Foundation (NSF EPS-0091853), UAB created the Cheaha high-performance computing cluster by acquiring a 64-node compute cluster with 128 total cores. Cheaha, named after Alabama’s highest peak, expanded the compute capacity available at UAB and was the first general-access resource for the community.

    The cluster led to expanded roles for UAB IT in research computing support through the development of the UAB Shared HPC Facility in the Business and Engineering Complex. It also furthered engagement in Globus-based grid computing resource development, on campus via UABgrid and regionally via SURAgrid.

  • Upgrading hardware

    Mar 31

    UAB IT allocated funds for hardware upgrades, which added 192 cores and migrated Cheaha's core infrastructure to the Dell blade clustering solution. The expansion provided a threefold increase in processor density over the original hardware and allowed more computing power to fit in the same physical space, an important consideration given the continued growth in processing demand. This hardware represented a major technology upgrade, with room for additional expansion to address overall capacity demand and enable resource reservation.

    The 2008 upgrade began a continuous resource improvement plan with a phased development approach for Cheaha. Ongoing increases in capacity and feature enhancements were brought into production via an open community process, leveraging a federated systems model developed through NSF Award #0330543. Software improvements rolled into the 2008 upgrade included grid computing services to access distributed compute resources and orchestrate jobs using the GridWay meta-scheduler. An initial 10 Gigabit Ethernet link, which established the UABgrid Research Network, was designed to support high-speed data transfers between clusters connected to this network.

  • Supporting increased research data sets

    Mar 31

    In 2009, annual investment funds were directed toward two goals: establishing a fully connected double data rate (DDR) InfiniBand network between the compute nodes added in 2008, and laying the foundation for a research storage system with a 60 TB DDN storage system accessed via the Lustre distributed file system. The InfiniBand and storage fabrics were designed to support significant increases in research data sets and their associated analytical demand.

  • Grant increases compute and storage capacity

    Aug 31

    UAB was awarded a National Institutes of Health Small Instrumentation Grant to further increase analytical and storage capacity. The grant funds were combined with the annual investment funds to add 576 cores (48 nodes) based on the Intel Westmere 2.66 GHz CPU, a quad data rate (QDR) InfiniBand fabric with 32 uplinks, an additional 120 TB of storage for the DDN fabric, and additional hardware to improve reliability.

    Additional improvements to the research compute platform included extending the UAB Research Network to link the Business and Engineering Complex and RUST data centers, and adding 20 TB of storage for user and ancillary services.

  • Investing in the foundation

    Aug 31

    UAB IT invested in foundation hardware to expand long-term storage and virtual machine capabilities, acquiring 12 Dell 720xd systems, each containing 16 cores, 96 GB of RAM, and 36 TB of storage. Together they created a 192-core, 432 TB virtual compute and storage fabric. Additionally, hardware investment by the School of Public Health's Section on Statistical Genetics added three 384 GB large-memory nodes and an additional 48 cores to the QDR InfiniBand fabric.

  • Acquiring an OpenStack cloud

    Mar 31

    In 2013, UAB IT Research Computing acquired an OpenStack cloud and Ceph storage software fabric through a partnership between Dell and Inktank in order to extend cloud computing solutions to the researchers at UAB and enhance the interfacing capabilities for HPC.

  • Mission Support Fund seed expansion

    Mar 31

    UAB IT received $500,000 from the university’s Mission Support Fund for a compute cluster seed expansion of 48 teraflops. This investment recognized the crucial role computation plays in supporting and advancing research.

  • Winning a grant to boost storage capacity and stabilize funding

    Sep 30

    With the arrival of a new chief information technology officer at UAB, UAB IT increased its attention to research computing infrastructure. New CIO Dr. Curt Carver secured a $500,000 grant from the Alabama Innovation Fund for a 3-petabyte research storage array. The impact was immediate: the institution stabilized research computing funding and provided the necessary funds so that research computing was not funded from reserves.

  • Growing to become the fastest supercomputer in Alabama

    Sep 15

    UAB IT received additional funding from the deans of the College of Arts and Sciences and the schools of Engineering and Public Health to expand the compute capacity provided by the prior year's seed funding, raising computing speed from 10 teraflops to 110 teraflops within a year. The expansion added compute nodes, and the 6-petabyte file system came online.

    This file system provided each user 5 TB of personal space, additional space for shared projects, and greatly expanded scratch storage, all in a single file system. The 2015 and 2016 investments combined to provide a completely new core for the Cheaha cluster, allowing the retirement of earlier compute generations.

  • Increasing network speeds to share terabyte data sets

    Oct 31

    UAB IT launched a 100 Gbps network, one of the fastest in the state, and implemented a grant-funded Science DMZ to separate research network traffic from the rest of campus. The Science DMZ provides high-speed network access to national and international data repositories and makes it practical to share large research data sets in the era of Big Data.

  • Expanding even further

    Sep 15

    UAB IT added 72 graphics processing units, which will boost the Cheaha supercomputer’s power to 450 teraflops. A teraflop is a unit of computing speed equal to 1 trillion floating-point operations per second. This latest expansion makes Cheaha by far the fastest supercomputer in Alabama and one of the five fastest at academic institutions in the Southeast. It also brings Cheaha close to the TOP500, a list of the fastest supercomputers in the world.