HPC Clusters Using InfiniBand on IBM Power Systems Servers
An IBM Redbooks publication
Note: This publication is now archived. For reference only.
This IBM® Redbooks® publication provides information about the InfiniBand standard and how the standard has been implemented to support IBM High Performance Computing (HPC) clusters.
This book will help you understand, plan, implement, and manage InfiniBand using IBM servers and switches to form a high-bandwidth, low-latency communication network for applications.
This book presents the software and hardware components that must be brought together to form the management and application foundation. We cover cluster management, node installation, infrastructure monitoring, and application support, such as IBM TWS LoadLeveler®, Parallel Environment, and various other AIX® and SUSE Linux Enterprise Server (SLES) tools.
This book is intended for IT architects, system designers, data center planners, and system administrators who must design, configure, and manage an InfiniBand infrastructure in an HPC cluster.
Part 1. Understanding InfiniBand
Chapter 1. InfiniBand architecture
Chapter 2. High performance computing hardware components using IB
Chapter 3. Technical description of software components
Part 2. Planning for InfiniBand
Chapter 4. Planning for an HPC cluster
Part 3. Implementing InfiniBand
Chapter 5. Implementation overview
Chapter 6. Configuring the InfiniBand fabric
Chapter 7. Configuring InfiniBand on AIX
Chapter 8. Configuring InfiniBand on Linux on Power
Part 4. Managing the InfiniBand environment
Chapter 9. Fabric management and monitoring
Chapter 10. Node management and monitoring
Appendix A. Advanced syslog-ng configuration
Appendix B. Fabric port counters
Appendix C. Fabric Management scripts
Appendix D. AIX ibstat -s output description
Appendix E. Recommended MPI environment settings