Quest September 2023 Maintenance Updates and Upgrades

Northwestern IT successfully completed fall 2023 Quest maintenance during the week of September 18. Read on to learn more about the completed work and user-focused outcomes.

As background, Quest includes the Quest Analytics Nodes, the Genomics Compute Cluster (GCC), and the Kellogg Linux Cluster (KLC). Quest downtimes are regularly scheduled to perform critical system patches, software updates, and hardware or infrastructure changes that cannot be completed while jobs are running or data is being accessed. Regular hardware, firmware, and software updates keep the cluster stable and reduce Quest’s exposure to the security vulnerabilities that come with outdated and unsupported software. Updates also improve performance, increase researcher productivity, and open new opportunities for service and system improvements on the cluster.

Highlights of the work performed during the September 2023 downtime are detailed below.

IBM General Parallel File System (GPFS) Software and Interconnect (InfiniBand) Drivers Updated

Two critical components of Quest, the storage system (GPFS) and the InfiniBand interconnect, were updated during this downtime. GPFS is a high-performance parallel file system that provides the concurrent, high-speed data access required by applications and running jobs. InfiniBand is a high-speed, low-latency network for node-to-node and node-to-storage communication.

  • The upgrades lay the foundation for introducing a liquid-cooled node architecture that will be available to researchers this fall. Liquid cooling removes heat from the nodes more effectively and will allow Quest to operate higher-density, higher-power, higher-performance nodes in the future.
  • The upgraded interconnect drivers and GPFS improve performance and stability for overall system communication and for applications that perform heavy file read and write operations.
Older Interconnect Fabric Removed
  • Prior to this Quest downtime, the system supported three generations of InfiniBand interconnect fabric connecting the nodes and the file system at high speed. The slowest of these fabrics, InfiniBand FDR, has been removed from the cluster. This change reduces the complexity of moving data between the nodes and the file system and eliminates the inefficiencies incurred when data crosses from one InfiniBand protocol to another.
Quest 8 Nodes Retired
  • Quest 8 nodes (28 cores and 96 or 196 GB of memory per node) were the oldest generation of computing hardware on the cluster and have now been retired. Removing aging hardware frees up the power, cooling, and space in the Data Center needed to accommodate new nodes.
  • More information about the Quest nodes is available on the Quest Technical Specifications page.
Job Scheduler Upgraded to Slurm Version 23.02
  • The updated Slurm job scheduler offers new possibilities for GPU (graphics processing unit) computing. Slurm can now schedule jobs to partitioned NVIDIA A30, A100, and H100 GPUs. These powerful GPUs can be split into as many as seven smaller GPU instances via a feature called Multi-Instance GPU (MIG), and each instance operates as if it were a physically isolated device. MIG instances can serve workloads from small to large with minimal resource waste, extending the community’s access to GPU computing; a minimal job-script sketch appears after this list.
  • A recently identified security vulnerability has been fixed in the PMIx package, which is required for launching MPI (Message Passing Interface) parallelized jobs with Slurm.
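As an illustration only, here is a minimal sketch of a batch script requesting a single MIG slice of an A100 through Slurm's generic resource (GRES) mechanism. The partition name (gengpu) and the GRES label (gpu:a100_1g.10gb) are assumptions used for illustration; the actual partition and MIG profile names on Quest may differ, so consult the Quest GPU documentation for the exact values.

      #!/bin/bash
      #SBATCH --account=<allocation>        # replace with your Quest allocation
      #SBATCH --partition=gengpu            # assumed GPU partition name
      #SBATCH --gres=gpu:a100_1g.10gb:1     # assumed label for one A100 MIG slice
      #SBATCH --time=01:00:00
      #SBATCH --mem=16G

      nvidia-smi -L                         # list the GPU instance assigned to this job

The script would be submitted with sbatch as usual; from the job's point of view, the MIG slice behaves like a dedicated, smaller GPU.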
Applications on Quest Analytics Nodes Updated
  • R and RStudio upgraded to versions 4.2.3 and 2023.06, respectively. These updates enable support for new R packages on the Quest Analytics Nodes.
  • JupyterHub and Notebook upgraded to versions 4.0.1 and 6.5.5, respectively. Python 3.11 is now the default Python for Jupyter applications.
  • With this upgrade, the Quest Analytics Nodes now offer JupyerLab (version 4.0.5) as a web-based interactive development environment. More information on accessing JupyterLab on Quest Analytics Nodes can be found in this Knowledge Base article.
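For users who want to confirm the new versions from a terminal session on the Analytics Nodes (for example, a terminal opened inside JupyterLab), a quick check along these lines should suffice; exact output formatting may vary:

      R --version | head -n 1     # expect R 4.2.3
      jupyter --version           # reports JupyterLab, Notebook, and related components
      python --version            # expect Python 3.11, the new Jupyter default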

Questions about this maintenance can be directed to Alper Kinaci, manager of research computing support services, at akinaci@northwestern.edu. Users can also contact quest-help@northwestern.edu for any support related to using Quest or globus-help@northwestern.edu for assistance with Globus.