IBM Platform LSF Implementation Scenario in an IBM iDataPlex Cluster

An IBM Redpaper publication


Published on 29 April 2013, updated 30 April 2013

  1. .EPUB (0.5 MB)
  2. .PDF (0.5 MB)


IBM Form #: REDP-5004-00


Authors: Dino Quintero, Ricardo Dobelin Barros, Ashraf Gomaa, José Higino, Archana Kumar, Majid Ouassir, Adam Parker and Joanna Wong


    Abstract

    The IBM® Platform Load Sharing Facility (LSF®) product family is a workload management platform for demanding, distributed, and mission-critical high-performance computing (HPC) environments. It provides a set of intelligent, policy-driven scheduling features so that you can take full advantage of your compute infrastructure resources and ensure optimal application performance. Its highly scalable and highly available architecture lets you schedule complex workloads and manage petaflop-scale resources.

    This IBM Redpaper™ publication explains how to use IBM Platform LSF features for cluster workload management, with examples of job scheduling, job submission, and application profiles. It describes the general challenges that HPC clusters face and how IBM Platform LSF can help such clusters. It also covers some of the major features of IBM Platform LSF and how these features help to address those challenges.

    Table of Contents

    Valid hardware resources to ensure that the setup can use IBM Platform LSF

    Sizing for I/O-intensive clusters

    Considerations for GPGPU intensive clusters

    Job queues

    Job scheduling

    Goal-oriented scheduling

    Job submission

    Compute units

    Application profiles

    Job submission prechecks and setup

    Job resizing

    Idle job detection

    Defining external resources (elims)

    Using advance reservations

    Hyper-Threading technology

    Changing the paradigm with guaranteed resources

     
