High Performance & Scientific Computing

Node and Storage Investments



Overview

Investment criteria and standard node configurations are reviewed annually and updated on this page at the beginning of each fiscal year by the OIT HPSC Director. This page was updated for FY24 on August 15, 2023, and again in December 2023 due to price fluctuations.

If you are working on a proposal and need the ISAAC facilities document, it is available here (UT authentication required).

Faculty and other University researchers may purchase compute nodes to be placed in the ISAAC HPC cluster using sponsored research funds, startup funds, or other funds for a private condo. These private condo nodes are exclusive to investors, their project(s), and project members. In addition to this exclusive use, private condo resources participate in the responsible node sharing program, which makes unused compute resources available to the campus community with very short limits on allowed run times.

If you wish to investigate purchasing nodes, or know what you want and are ready to purchase nodes for use in a private condo, please submit an HPSC Service Request (ticket) at this link. The Director will contact you to discuss the details of the purchase. Once you are ready to commit, you will work with the Associate CIO for Service Level and Capacity Management to develop a service level agreement (SLA) covering the purchase (equipment, cost, service period, etc.).

For FY24, the standard SLA period for a compute or GPU node investment is five years. Other periods, such as a proposal or award performance period, will be considered on a case-by-case basis.

Note: Costs below are estimates; final costs will be determined based on the equipment configuration, usually in the form of a vendor quote, and specified in an OIT Service Level Agreement document corresponding to the funding source.

Federal research R accounts can only be charged the direct equipment cost, since our facility is not a cost center. A vendor quote would be provided for inclusion in a proposal. After an award, the vendor quote would be requested again, as vendor quotes are usually valid for only 30 days.

The compute node costs below are estimated one-time costs for a five year period.

Note: All prices below are estimates, and pricing changes approximately monthly. Dell advised HPSC in December 2023 that prices are fluctuating and lead times for shipment of equipment can be up to six months due to industry-wide shortages of GPUs, power supplies, memory, and other specialized parts. UTK has a good relationship with Dell and we have not seen delays to date, though other customers have seen delays in shipment. When an order is placed from a quote, we usually receive estimated ship dates at the time of order.

Standard Compute Node – Dell 16th Generation Technology

With the 16th generation standard compute node, investors obtain a powerful compute node based on Intel Xeon Sapphire Rapids processors or AMD EPYC Genoa or Bergamo processors that can handle significant HPC workloads. The pertinent technical specifications are listed below.

Intel Sapphire Rapids based compute node

  • Dell PowerEdge R660
  • 2x 32-Core Intel Xeon Gold 6438M Processors, 64 cores total
  • 256 GB DDR5-4800 memory
  • 480GB SATA SSD for operating system
  • HDR Infiniband and Ethernet network interface cards

The estimated cost for a standard Intel processor compute node is approximately $17,137.11 ($267.77 per core). Modifications can be made to the above configuration on a case-by-case basis.

AMD EPYC based compute node

  • Dell PowerEdge R6625
  • 2x 112-core AMD EPYC 9734 processors, 224 cores total
  • 1,024 GB memory
  • 480GB SATA SSD for operating system
  • HDR Infiniband and Ethernet network interface cards

The estimated cost for a Dell PowerEdge R6625 is approximately $33,447 ($149.31 per core). Modifications can be made to the above configuration on a case-by-case basis.
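The per-core figures quoted for both configurations are simply the estimated node cost divided by the total core count. A minimal sanity-check sketch using the FY24 estimates from this page:

```python
# Per-core cost = estimated node cost / total core count.
# Prices and core counts are the FY24 estimates quoted above.
nodes = {
    "Dell PowerEdge R660 (Intel, 64 cores)": (17_137.11, 64),
    "Dell PowerEdge R6625 (AMD, 224 cores)": (33_447.00, 224),
}

for name, (cost, cores) in nodes.items():
    print(f"{name}: ${cost / cores:,.2f} per core")
```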

GPU Nodes from Dell

GPU configurations and quotes can be quite complex. As of May 2024 there are ten different types of GPUs available: three from AMD and seven from NVIDIA. Many of the lower-end GPUs (L40S, A40, and A16) are only available in smaller server configurations, whereas the NVIDIA H100, the top-of-the-line GPU, is realistically only available in higher-end servers such as the Dell PowerEdge XE8640 and XE9640. The smaller GPUs are readily available, but their performance and memory size are limited. The Director has arranged a way to purchase XE8640 nodes with four NVIDIA H100s and receive them in a reasonable timeframe. E accounts can be used to purchase a portion of an XE8640, which we will sell in 1/4 increments. R accounts can be used if they are not tied to Federal research funds.

With the Dell PowerEdge XE8640 standard GPU node, investors will obtain a high end GPU node that can provide significant GPU performance for GPU-intensive applications. The pertinent technical specifications are listed below.

Intel Sapphire Rapids GPU node with NVIDIA HGX H100 4x GPUs

  • Dell PowerEdge XE8640 (max 5 kilowatts power draw)
  • 2x 32-Core Intel Xeon Platinum 8462Y+ Processors, 64 cores total
  • 1,024 GB DDR5 4800 memory
  • 480GB SATA SSD for operating system
  • NVIDIA HGX H100 4-GPU SXM board, 80GB GPU memory per GPU, with onboard NVLink (allows fast communication among the 4 GPUs – internal only)
  • 51.2 TB NVMe drives (for on server storage and fast data transfer to/from GPU)
  • 1 Gb/s Dual port ethernet (for management)
  • 10 Gb/s Quad port ethernet (for high performance networking)
  • NVIDIA Mellanox ConnectX-6 single port HDR 200 Gb/s Infiniband (for Lustre storage network)
  • NVIDIA Mellanox ConnectX-7 single port NDR 400 Gb/s Infiniband (for application IPC and cross GPU node communication)

The estimated cost for this high performance GPU node with four NVIDIA 80GB H100 GPUs is $187,859. This is a very feature- and component-rich configuration for HPC and AI/ML workloads. We will work to purchase these servers each fiscal year and, as long as the correct account types are used, provide the capability to sub-divide the use of each server down to 1/4 of the GPU node, which would cost $46,965.
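The quarter-share price is the estimated node price divided four ways; a minimal sketch:

```python
# One quarter share of the XE8640 corresponds to one of its four H100 GPUs
# plus a proportional share of CPU cores, memory, and NVMe storage.
NODE_COST = 187_859  # estimated XE8640 price quoted above

quarter_share = NODE_COST / 4
print(f"${quarter_share:,.2f} per 1/4 share")  # $46,964.75, i.e. the ~$46,965 quoted
```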

Storage Investments

Lustre project storage space, managed with quotas, is available on each cluster; project space is not purged. Lustre scratch space, also managed with quotas, is available on all of the ISAAC clusters. See the File Systems submenu item for each cluster in the HPSC website menu for details. OIT HPSC makes a best effort to maintain enough storage to meet the demands of the UT research community, and we work with researchers to make storage resources, including the UT-StorR archival storage, available to serve most research storage needs.

University of Tennessee projects receive 1 terabyte (TB) of Lustre project space on the ISAAC clusters at no additional cost to the research project. Additional project space up to 100 TB can be requested at any time by submitting an HPSC Service Request (see Submit HPSC Service Request in the menu to the left of this page); OIT HPSC staff can approve any reasonable request up to that amount. Requests for more than 100 TB should also be submitted as an HPSC Service Request but will be brought to the HPSC Director for consideration, and the Director may ask the faculty research lead to fund these large storage requests. For any request beyond 100 TB, whether for an existing project or a sponsored research proposal, please request a quote so the additional storage can be purchased with research, department, or faculty project funds, if at all possible.

For FY25 the approximate cost for storage is $100 per terabyte. This includes the storage drives configured as RAID-6; the storage infrastructure (controllers, Infiniband switches and cables, drive drawers, cables, etc.); the metadata storage (SSDs and controllers); and a license and support for the EXAScaler Lustre file system software from DataDirect Networks. With RAID-6, about 70% of the raw capacity ends up usable, and the array can withstand two drive failures without losing data. We strongly encourage faculty with startup funds, research funds, and research incentive funds (RIF) to pay for the Lustre storage their research projects use beyond 10 TB.
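For rough budgeting, the storage figures above reduce to two numbers: approximately $100 per usable terabyte (FY25) and roughly 70% of raw RAID-6 capacity being usable. A minimal sketch, assuming the $100/TB rate applies to usable space:

```python
COST_PER_TB = 100       # USD per terabyte, FY25 estimate from above
USABLE_FRACTION = 0.70  # approximate usable share of raw RAID-6 capacity

def storage_cost(usable_tb: float) -> float:
    """Estimated one-time cost for the requested usable terabytes."""
    return usable_tb * COST_PER_TB

def raw_tb_needed(usable_tb: float) -> float:
    """Approximate raw capacity required to deliver the usable space."""
    return usable_tb / USABLE_FRACTION

# Example: a hypothetical 50 TB project allocation.
print(storage_cost(50))           # 5000 (USD)
print(round(raw_tb_needed(50)))   # about 71 TB raw
```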

Also, Lustre file systems can contain only a finite number of files. Any project or researcher that needs more than 5 million files will have to be addressed on a case-by-case basis.

Backups of Lustre storage are the responsibility of the end users. The three Lustre file systems are 1.3 petabytes, 2.9 petabytes, and 3.6 petabytes, and we do not have the capability to back up that much space. Lustre file systems are very reliable, and backups are not provided on any Lustre space by default. Any request for Lustre storage backups would be addressed on a case-by-case basis.

Note: VM (virtual machine) storage in the Secure Enclave is handled on a case-by-case basis. Please contact HPSC directly via email or help request ticket if your project has questions about VM storage.

On a case-by-case basis, arrangements may be made for some varieties of backups of limited size, such as a duplicate copy or snapshot. If these sorts of backups are desired, project PIs should discuss them with HPSC staff so the requirements can be included in any investment Service Level Agreement. For more information on the available file systems, please refer to the File Systems document.

UT-StorR Long Term Archival Storage

The UT-StorR long term archival storage system is now available and can help with the following use cases: (1) researchers who use the ISAAC campus clusters and have large amounts of primary and results research data and data collections that are no longer actively used or used very infrequently, (2) UTK Core Facilities that manage instrument and other data and data collections for a community, and (3) in cooperation and coordination with UTK Library, the data storage by researchers, research projects, or Core Facilities to meet the data management plans of funded research projects.

For federally (and other) funded projects that require data management plans, you can include the cost of the tapes for long-term storage of your research results, papers, and other documents described in the proposal's data management plan. Tapes cost approximately $100 for each 18 TB LTO-9 cartridge; one petabyte of data would require 56 tapes, costing $5,600. Currently, UT-StorR hardware, software, and maintenance is funded through FY26, so no additional costs beyond the tapes should be included. We are working with ORIED to develop a cost recovery plan for UT-StorR, and UT-StorR may become a Core Facility to implement a cost recovery mechanism. When the cost recovery details are known, this paragraph will be updated with the costs that should be included in budgets for research proposals that have a data management plan requirement.
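The tape arithmetic above can be reproduced directly (taking one petabyte as 1,000 TB):

```python
import math

TAPE_CAPACITY_TB = 18  # LTO-9 cartridge capacity per the text above
TAPE_COST_USD = 100    # approximate cost per cartridge

def tapes_needed(data_tb: float) -> int:
    """Whole cartridges required to hold the given amount of data."""
    return math.ceil(data_tb / TAPE_CAPACITY_TB)

tapes = tapes_needed(1000)           # one petabyte = 1,000 TB
print(tapes, tapes * TAPE_COST_USD)  # 56 cartridges, $5,600 -- the estimate above
```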

For more information on UT-StorR see the Archival Menu on the OIT HPSC pages (in the menu to the left of this page).

Investment History

Historical Server Pricing for Comparisons

Each entry lists the server type, its components, and the date, price, and cost per core or per GPU at that time.

  • Dell PowerEdge R660: Intel Xeon Gold 6438M, 64 cores; 256 GB DDR5-4800 memory; Mellanox ConnectX-6 HDR Infiniband; Ethernet NIC & cables. Jan-2024, $17,137.11 ($267.77 per core)
  • Dell PowerEdge R6625: AMD EPYC 9734, 224 cores total; 1,024 GB DDR5-4800 memory; Mellanox ConnectX-6 HDR Infiniband; Ethernet NIC & cables. Dec-2023, $33,447 ($149.31 per core)
  • Dell PowerEdge R650: Intel Xeon Gold 6348, 56 cores; 256 GB DDR4-3200 memory; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Sep-2022, $13,581.59 ($242.53 per core)
  • Dell PowerEdge R6525: AMD EPYC 7713, 128 cores; 512 GB DDR4-3200 memory; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Sep-2022, $20,999.11 ($164.06 per core)
  • Dell PowerEdge R640: Intel Xeon Gold 6248R, 48 cores; 192 GB DDR4-3200 memory; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. May-2021, $9,131.45 ($190.24 per core)
  • Dell PowerEdge R7515: AMD EPYC 7763, 128 cores; 256 GB DDR4-3200 memory; Mellanox ConnectX-6 HDR; Ethernet NIC & cables (3000078872607.1). Mar-2021, $11,861 ($92.66 per core)
  • Dell PowerEdge XE8640: Intel Xeon Platinum 8462Y+, 64 cores; 1,024 GB DDR5-4800 memory; 4x NVIDIA H100 SXM 80GB GPU; 51.2 TB NVMe SSD storage; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Jul-2023, $144,900 ($36k per GPU)
  • Dell PowerEdge R750: Intel Xeon Gold 6348, 56 cores; 256 GB DDR4-3200 memory; NVIDIA A40 48GB GPU; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Sep-2022, $23,835 ($23k per GPU)
  • Dell PowerEdge R750XA: Intel Xeon Platinum 8358, 64 cores; 1,024 GB DDR4-3200 memory; 4x NVIDIA A40 48GB GPU; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Nov-2021, $57,830 ($14.5k per GPU)
  • Dell PowerEdge R7525: AMD EPYC 7713, 128 cores; 512 GB DDR4-3200 memory; 2x NVIDIA A100 40GB GPU; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Sep-2021, $39,093 ($19k per GPU)
  • Dell PowerEdge R740: Intel Xeon Gold 6248R, 48 cores; 192 GB DDR4-3200 memory; 2x NVIDIA V100S 32GB GPU; Mellanox ConnectX-6 HDR; Ethernet NIC & cables. Dec-2021, $22,655 ($11.3k per GPU); Mar-2021, $20,500 ($10k per GPU)