This IBM Redpaper describes the continuous availability features of AIX Version 6, Release 1. It also defines the terms Reliability, Availability, and Serviceability (RAS) as used in an IT infrastructure, and it touches on the global availability picture for an IT environment to clarify how AIX can improve that availability. The paper is intended for AIX specialists, whether customers, business partners, or IBM personnel, who are responsible for server availability.
A key goal of AIX development is to improve overall system serviceability by developing problem determination tools and techniques that have minimal impact on a live system. This document explains the new debugging tools and techniques, as well as the kernel facilities that work in conjunction with new hardware, that can help you provide continuous availability for your AIX systems. The paper provides a broad description of the advanced continuous availability tools and features on AIX that help to capture software problems at the moment they appear, with no need to recreate the failure. Beyond software problems, the AIX kernel works closely with advanced hardware features to identify and isolate failing hardware and to replace hardware components dynamically without bringing down the system. The tools discussed include Dynamic Trace, Lightweight Memory Trace, Component Trace, Live dump and Component dump, Storage protection keys (kernel and user), Live Kernel update, and xmalloc debug.
Table of contents
Chapter 1. Introduction
Chapter 2. AIX continuous availability features
Chapter 3. AIX advanced continuous availability tools and features
Appendix A. AIX features availability