IBM FlashSystem in OLAP Database Environments

IBM Redbooks Solution Guide

Abstract

The IBM FlashSystem family is an enterprise-class flash storage platform that delivers extreme performance with low latency for OLAP (online analytical processing) environments. Analytical processing is typically characterized by a relatively low volume of transactions, but with complex queries that often aggregate data from a data warehouse to support decision making and analysis. For OLAP systems, a low response time is a strong measure of effectiveness. This IBM Redbooks Solution Guide explains how IBM flash storage systems drive extreme performance, cost efficiencies, and enterprise reliability to satisfy the requirements of an OLAP implementation.

Changes in the last update:
* Replaced integration scenario


IBM® FlashSystem storage systems deliver high performance and efficiency in an easy-to-integrate offering so that businesses can more readily compete in the market. IBM's recent acquisition of Texas Memory Systems (TMS), an IBM Company, extends IBM's leadership in flash-optimized storage solutions. By using patented Variable Stripe RAID™ technology and 2D Flash RAID™, FlashSystem storage systems enhance system resiliency without sacrificing performance or usable capacity.

Figure 1 illustrates the value of the IBM FlashSystem storage infrastructure.

Figure 1. IBM FlashSystem storage infrastructure value

FlashSystem storage systems transform the data center environment and enhance performance and resource consolidation to gain the most from business processes and critical applications. Examples of such processes and applications include online transaction processing (OLTP), business intelligence (BI), online analytical processing (OLAP), virtual desktop infrastructures, high-performance computing, and content delivery solutions (such as cloud storage and video on demand). This guide focuses on IBM flash storage solutions for OLAP database environments, which are based on FlashSystem storage systems.


Did you know?

FlashSystem storage systems deliver over 500,000 read IOPS with less than 100 microseconds of latency and provide up to 20 TB of usable data storage in just 1U of rack space. FlashSystem storage systems also offer enterprise-level availability and reliability with no single point of failure, multiple layers of data correction, chip redundancy, and redundant hot-swap components.


Business value

Data warehouses are commonly used with OLAP workloads in decision support systems, such as financial analysis. Unlike OLTP, where transactions are typically relatively simple and deal with small amounts of data, OLAP queries are more complex and process larger volumes of data.

OLAP databases are normally separated from OLTP databases and tend to consolidate historical and reference information from multiple sources. Queries are submitted to OLAP databases to analyze consolidated data from different points of view to make better business decisions in a timely manner.

For OLAP workloads, a fast response time is critical to ensure that strategic business decisions can be made quickly in dynamic market conditions. Delays can significantly increase business and financial risks. Usually, decision making is stalled or delayed because of a lack of accurate, real-time operational data for analytics, which means missed opportunities for the following reasons:
  • Inability to gain insight into a business
  • Inability to predict business outcomes
  • Explosion of volume, variety, and velocity of information

FlashSystem storage systems can help to make businesses more agile and analytics-driven by providing up-to-the-minute analytics based on real-time data, and not yesterday’s news.

The delays come primarily from batch data loads and from performance issues caused by heavy, complex queries that consume I/O resources. A common performance bottleneck in OLAP environments is the I/O that is required to read massive amounts of data (frequently referred to as big data) from storage for processing in the OLAP database server. The servers' ability to process this data is usually not a limiting factor because they typically have significant amounts of RAM and processing power and can parallelize tasks across their computing resources.

In general, clients might experience the following challenges in OLAP environments:
  • Slow query execution and response times, which delay business decision making
  • Dramatic growth in data, which requires deeper analysis


FlashSystem storage systems can help to address these challenges in the following ways:
  • Dramatically boosting the performance of OLAP workloads with distributed scale-out architecture, providing almost linear and virtually unlimited performance and capacity scalability
  • Significantly improving response time for better and timely decision making


Solution overview

An OLAP solution with FlashSystem storage systems consists of the following components:
  • Database servers (IBM System x® or IBM Power Systems™) to run data management software such as IBM DB2®, Microsoft SQL Server, or Oracle Database
  • Flash storage systems (IBM FlashSystem 720 or IBM FlashSystem 820) to host the entire data set or partitioned subsets of data
  • A private network (such as 10 Gb Ethernet or QDR/FDR InfiniBand) that is used to provide high-speed connectivity across database servers in a cluster
  • A storage area network (SAN) that is used to provide connectivity across database servers and storage systems

IBM DB2 for Linux, UNIX, and Windows is the database of choice for robust, enterprise-wide solutions that handle high-volume workloads. It is optimized to deliver industry-leading performance, while lowering costs. IBM servers that run DB2 are proven performance leaders. DB2 uses and optimizes multiple threads automatically, with no change to applications. The unique clustering design of DB2 provides near-linear scalability, continuous availability, and simplified management.

IBM System x servers are developed, tested for quality, and certified by IBM, and they are backed by IBM's incomparable worldwide service and support. IBM System x servers deliver business value over the long term because of advancements in scalability, reliability, and performance, which are combined with flexible configuration options, energy efficient components, and robust systems management tools.

IBM eX5 servers, such as IBM System x3850 X5, are designed for mission-critical enterprise-class workloads such as databases. These systems use open, industry-standard components, which are combined with IBM eX5 technologies, to provide leadership performance, scalability, and reliability.

Ideally suited for compute-intensive workloads, IBM Power Systems deliver leadership performance and scalability in their class. An integrated approach to the design, development, and testing of each IBM POWER® server, blade, or compute node ensures the resiliency that is required for today’s IT infrastructure. All Power Systems server models include innovative reliability, availability, and serviceability features that help you avoid unplanned downtime. And, with capacity on demand, hot-node add, and IBM Active Memory™ Expansion, Power Systems enterprise servers ensure that you can keep your most important applications available, even as you add capacity to handle new business demands.

FlashSystem offerings that are used in OLAP environments include the FlashSystem 720 and FlashSystem 820 purpose-built external flash storage systems. These storage systems offer some of the industry's highest ratios of capacity, bandwidth, and IOPS per 1U of rack space for protected storage, in addition to low power consumption.

Figure 2 shows the FlashSystem 720, a high-availability SLC flash system.

Figure 2. FlashSystem 720

Figure 3 shows the FlashSystem 820, a high-availability eMLC flash system.

Figure 3. FlashSystem 820

FlashSystem 720 and FlashSystem 820 have the following common features:
  • 1U form factor
  • Two dual-port 8 Gbps Fibre Channel controllers or dual-port QDR InfiniBand host interface modules
  • Patented Variable Stripe RAID
  • 2D Flash RAID: RAID-5 across flash chips and RAID-5 across flash modules
  • Hot swappable flash
  • Redundancy for power, data, and management
  • No single point of failure

FlashSystem 720 has the following features:
  • Enterprise-grade SLC flash
  • 5 or 10 TB of usable storage with system-level RAID and spare flash modules
  • 525,000 IOPS (4 KB reads) and 400,000 IOPS (4 KB writes)
  • 25 μs (writes) and 100 μs (reads) latency
  • 5 GBps bandwidth (reads) and 4 GBps (writes)
  • Low power (350 watts)

FlashSystem 820 has the following features:
  • Enterprise-grade eMLC flash
  • 10 or 20 TB of usable storage with system-level RAID and spare flash modules
  • 525,000 IOPS (4 KB reads) and 280,000 IOPS (4 KB writes)
  • 25 μs (writes) and 110 μs (reads) latency
  • 5 GBps bandwidth (reads) and 2.8 GBps (writes)
  • Low power (300 watts)
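
To put the feature lists above into OLAP terms, the following minimal Python sketch derives two figures from the published numbers: the time to read the full usable capacity once at peak read bandwidth, and the throughput implied by the 4 KB read IOPS rating. The class and method names are hypothetical, decimal units (1 TB = 1000 GB) and a 4096-byte block are assumptions, and the results are arithmetic upper bounds, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashSpec:
    """Published figures for one storage system (hypothetical helper class)."""
    name: str
    usable_tb: float     # maximum usable capacity, in TB
    read_gbps: float     # peak read bandwidth, in GB per second
    read_iops_4k: int    # 4 KB read IOPS

    def full_scan_seconds(self) -> float:
        """Seconds to read the full usable capacity at peak read bandwidth."""
        # Assumption: decimal units, 1 TB = 1000 GB.
        return self.usable_tb * 1000 / self.read_gbps

    def read_gbps_at_4k(self) -> float:
        """Throughput implied by the 4 KB read IOPS figure, in GB per second."""
        # Assumption: a 4 KB block is 4096 bytes.
        return self.read_iops_4k * 4096 / 1e9


SPECS = [
    FlashSpec("FlashSystem 720", usable_tb=10, read_gbps=5, read_iops_4k=525_000),
    FlashSpec("FlashSystem 820", usable_tb=20, read_gbps=5, read_iops_4k=525_000),
]

for spec in SPECS:
    print(f"{spec.name}: full scan in about {spec.full_scan_seconds():.0f} s, "
          f"4 KB reads at about {spec.read_gbps_at_4k():.2f} GBps")
```

Under these assumptions, a full scan of the largest configuration takes on the order of an hour per system, which is why partitioning a data set across several systems matters for scan-heavy queries.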


Solution architecture

IBM flash storage solutions for OLAP use a distributed server and storage scale-out approach. This approach satisfies high bandwidth requirements and provides virtually unlimited growth in performance and capacity, matching the growing volumes of data that are being processed.

FlashSystem storage systems are connected to the database hosting platforms by using Fibre Channel SAN. Server hosts, or nodes, are interconnected with the isolated high-speed network (such as 10 Gb Ethernet with IBM RackSwitch™ G8124E) that is used for the inter-node data exchange. Each node runs a copy of the OLAP database application, and the analyzed data set is partitioned and distributed across the storage systems. Depending on the database management software that is used and its architecture, each node might have access to only a certain portion of data, or all nodes can have access to all data that is stored on the external storage. OLAP queries are distributed across nodes and processed in parallel. Figure 4 illustrates this architecture.

Scale-out OLAP architecture
Figure 4. Scale-out OLAP architecture

This solution can scale seamlessly by adding more FlashSystem storage systems and nodes. In such a case, storage capacity and I/O bandwidth are incremented linearly with the increasing number of storage devices, which can help to eliminate storage I/O bottlenecks in OLAP workloads.


Usage scenarios

OLAP applications can be used for risk assessment, business intelligence and reporting, exploration and visualization, predictive analytics, and other similarly profiled industry and functional applications. For example, consider a data warehouse solution that can store and process 160 TB of data. With eight FlashSystem 820 storage systems, the consolidated FlashSystem storage infrastructure can store up to 160 TB of protected data with up to 26.4 GBps of aggregated storage I/O bandwidth.
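
The sizing in this example can be reproduced with a short calculation. The following Python sketch is illustrative only: the function name is hypothetical, and the per-system bandwidth is obtained by simply dividing the quoted aggregate by the number of systems. This guide does not state how the 26.4 GBps figure was derived.

```python
import math

USABLE_TB_PER_SYSTEM = 20   # FlashSystem 820, maximum usable capacity
TARGET_TB = 160             # data warehouse size in the example
AGGREGATE_GBPS = 26.4       # aggregated bandwidth quoted in the example


def systems_needed(target_tb: float, usable_tb_per_system: float) -> int:
    """Smallest number of storage systems whose usable capacity covers target_tb."""
    return math.ceil(target_tb / usable_tb_per_system)


n = systems_needed(TARGET_TB, USABLE_TB_PER_SYSTEM)

# Even split of the quoted aggregate across the systems (an assumption).
per_system_gbps = AGGREGATE_GBPS / n

print(f"{n} systems, about {per_system_gbps:.1f} GBps each")
```

The implied per-system share is lower than the 5 GBps peak read bandwidth in the feature list, so the aggregate figure should be read as a solution-level number rather than a sum of device peaks.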


Integration

An excellent example of a scale-out approach in the OLAP environment is IBM InfoSphere Warehouse. InfoSphere Warehouse is powered by the DB2 for Linux, UNIX, and Windows data server. With its massively scalable, shared-nothing architecture, DB2 provides high performance for mixed-workload query processing of relational and basic XML data. Such advanced features as database and table partitioning, compression, multidimensional clustering (MDC), materialized query tables (MQT), and OLAP capabilities make DB2 a powerful engine for operational warehousing.

InfoSphere Warehouse provides advanced capabilities for database partitioning, so that IT users have multiple ways to distribute data across servers for large-scale parallelism and linear scalability. The shared-nothing architecture of DB2 helps ensure that performance will not degrade as the warehouse grows. Also, because InfoSphere Warehouse can physically cluster data on multiple dimensions, order data by value range, and limit I/O to relevant data partitions, it helps reduce the work that is needed to resolve many queries.

The architecture of the InfoSphere Warehouse database server solution with FlashSystem 820 is shown in Figure 5.


Figure 5. InfoSphere Warehouse database server solution with FlashSystem 820

InfoSphere Warehouse transparently splits the database across multiple partitions stored on FlashSystem logical volumes, and it uses computing power of multiple servers to satisfy requests for large amounts of information. SQL statements are automatically decomposed into subrequests that are run in parallel across each database partition. Results of the subrequests are joined to provide final results.
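
The decompose, run in parallel, and join pattern that is described above can be sketched in a few lines of Python. This is not DB2 or InfoSphere Warehouse code: the hash function, the partition count, the sample rows, and all function names are hypothetical, and threads stand in for database servers. The sketch only illustrates how a grouped aggregate can be computed per partition and then merged.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4  # hypothetical number of database partitions


def partition_of(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a distribution key to a partition with a simple deterministic hash."""
    return sum(key.encode("utf-8")) % num_partitions


def distribute(rows, num_partitions: int = NUM_PARTITIONS):
    """Split rows across partitions by hashing the customer column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_of(row["customer"], num_partitions)].append(row)
    return partitions


def partial_sum_by_region(partition):
    """Subrequest: sum of amount grouped by region, on one partition only."""
    totals = defaultdict(float)
    for row in partition:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def merge(partials):
    """Join the per-partition subtotals into the final result."""
    result = defaultdict(float)
    for partial in partials:
        for region, subtotal in partial.items():
            result[region] += subtotal
    return dict(result)


def parallel_sum_by_region(rows):
    """Run the subrequest on every partition in parallel, then merge."""
    partitions = distribute(rows)
    with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
        partials = list(pool.map(partial_sum_by_region, partitions))
    return merge(partials)


# Hypothetical sample data.
SALES = [
    {"customer": "acme", "region": "east", "amount": 120.0},
    {"customer": "bolt", "region": "west", "amount": 80.0},
    {"customer": "core", "region": "east", "amount": 50.0},
    {"customer": "dyna", "region": "west", "amount": 30.0},
    {"customer": "acme", "region": "east", "amount": 10.0},
]

print(parallel_sum_by_region(SALES))
```

Because a sum can be merged from subtotals, each partition reads only its own data, and only the small per-partition results are exchanged between nodes.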


Supported platforms

FlashSystem storage systems support a wide range of operating systems (Windows Server 2003 and 2008, Linux, and IBM AIX®), hardware platforms (System x, Power Systems, and x86 servers not from IBM), HBAs, and SAN fabrics. For specific information, see the System Storage Interoperation Center (SSIC): http://ibm.com/systems/support/storage/ssic


Ordering information

For FlashSystem 720 and FlashSystem 820 ordering information, see the following IBM Redbooks® Product Guide:


Related information

For more information, see the following documents:

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.

Profile

Publish Date
21 March 2013

Last Update
16 April 2013



Author(s)

IBM Form Number
TIPS0974