IBM Platform LSF Implementation Scenario in an IBM iDataPlex Cluster

An IBM Redpaper publication



Abstract

The IBM® Platform Load Sharing Facility (LSF®) product family is a workload management platform for demanding, distributed, and mission-critical high-performance computing (HPC) environments. It provides a set of intelligent, policy-driven scheduling features that help you make full use of your compute infrastructure resources and ensure optimal application performance. Its highly scalable and available architecture lets you schedule complex workloads and manage petaflop-scale resources.

This IBM Redpaper™ publication explains how to use IBM Platform LSF features for cluster workload management, with examples of job scheduling, job submission, and application profiles. It describes the general challenges that HPC clusters face and the major IBM Platform LSF features that help to address them.

Table of contents

Valid hardware resources to ensure that the setup can use IBM Platform LSF
Sizing for I/O-intensive clusters
Considerations for GPGPU intensive clusters
Job queues
Job scheduling
Goal-oriented scheduling
Job submission
Compute units
Application profiles
Job submission prechecks and setup
Job resizing
Idle job detection
Defining external resources (elims)
Using advance reservations
Hyper-Threading technology
Changing the paradigm with guaranteed resources



Profile

Publish Date
29 April 2013

Last Update
30 April 2013




Author(s)

IBM Form Number
REDP-5004-00

Number of pages
52