Implementing IBM Easy Tier with IBM Real-time Compression

IBM Redbooks Solution Guide

Abstract

IBM® Easy Tier® is a performance function that automatically and non-disruptively migrates frequently accessed data from magnetic media to solid-state drives (SSDs). In that way, the most frequently accessed data is stored on the fastest storage tier, and the overall performance is improved. This IBM Redbooks® Solution Guide describes this topic.


Overview

IBM® Easy Tier® is a performance function that automatically and non-disruptively migrates frequently accessed data from magnetic media to solid-state drives (SSDs). In that way, the most frequently accessed data is stored on the fastest storage tier, and the overall performance is improved.

How does it work

Every volume is split into logical units called extents. Easy Tier is based on algorithms developed by IBM Research that evaluate the access frequency of each extent. Each extent is rated according to the number of I/Os going to that extent. Extents with a high rating, which receive the most I/Os, are marked as “hot” extents and become candidates for migration to SSDs in the same storage pool. Periodically, at intervals of no more than 24 hours, a migration plan is created according to the “heat” of the extents and the data is migrated to the SSD MDisk. When the SSD becomes full and there is a hotter extent to move onto the SSD, the “cooled” extents are migrated back to the lower-tiered MDisk.

Migrations are typically minimal and add up to a maximum of two terabytes of data per day. The number of host read and write operations to a specific extent determines the rating of that extent. Only I/Os smaller than 64 KB are considered when determining “heat”, which prevents sequential I/O patterns from filling up the SSDs with data that is not likely to be accessed again frequently.
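The rating rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Easy Tier implementation: the function names, the I/O log format, and the 1 GiB extent size are assumptions made for the example. Only the 64 KB threshold and the idea of counting I/Os per extent come from the text.

```python
from collections import Counter

# Only I/Os smaller than 64 KB count toward heat (threshold taken from the text).
SMALL_IO_LIMIT = 64 * 1024

def rate_extents(io_log, extent_size):
    """Count small I/Os per extent.

    io_log is a hypothetical list of (offset_bytes, size_bytes, op) tuples.
    """
    heat = Counter()
    for offset, size, _op in io_log:
        if size < SMALL_IO_LIMIT:
            heat[offset // extent_size] += 1
    return heat

def hottest(heat, ssd_extent_slots):
    """Return the extent numbers that would be SSD candidates, hottest first."""
    return [extent for extent, _count in heat.most_common(ssd_extent_slots)]

# Assumed workload: 1 GiB extents, a few small I/Os, and one large sequential read.
EXTENT = 1024 ** 3
log = [
    (0 * EXTENT + 4096, 8 * 1024, "read"),
    (0 * EXTENT + 8192, 8 * 1024, "write"),
    (0 * EXTENT + 12288, 4 * 1024, "read"),
    (2 * EXTENT, 16 * 1024, "read"),
    (1 * EXTENT, 256 * 1024, "read"),  # ignored: not smaller than 64 KB
]
heat = rate_extents(log, EXTENT)
print(hottest(heat, 1))  # prints [0]
```

In this toy workload, extent 0 receives three small I/Os and becomes the only SSD candidate when one slot is available, while the 256 KB sequential read to extent 1 contributes no heat.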

For more information about Easy Tier, see Chapter 7, "Easy Tier", in Implementing the IBM System Storage SAN Volume Controller V6.3, SG24-7933, found at http://www.redbooks.ibm.com/redbooks/pdfs/sg247933.pdf.

Easy Tier with compressed volumes

IBM Real-time Compression™ software is embedded in IBM Storwize® V7000 and IBM SAN Volume Controller systems. Compressed volumes have a unique write pattern to the MDisks. When a host writes data to a certain offset in a compressed volume, the system compresses this data, which is then written to another offset of the underlying volume as it is represented in the storage pool. Such a change in offsets triggers unnecessary migrations of data into SSDs because repetitive writes to the same logical offset end up written in various locations instead. A new Easy Tier algorithm is therefore required to support compression.

What is new in Storwize V7000 and SAN Volume Controller V7.1

Starting with Version 7.1, Easy Tier supports compressed volumes. A new algorithm is implemented to monitor read operations on compressed volumes instead of reads and writes. The extents with the highest number of read operations that are smaller than 64 KB are migrated to SSD MDisks. As a result, frequently read areas of the compressed volumes are serviced from SSDs. Easy Tier on non-compressed volumes operates as before and it is based on read and write operations smaller than 64 KB.
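The difference between the two monitoring rules can be expressed as a small predicate. The following Python sketch is an illustration only: the function name and parameters are hypothetical, and the logic simply restates the rules in the paragraph above (reads only for compressed volumes, reads and writes otherwise, and in both cases only I/Os smaller than 64 KB).

```python
# Threshold from the text: only I/Os smaller than 64 KB are monitored.
SMALL_IO_LIMIT = 64 * 1024

def counts_toward_heat(size_bytes, op, compressed):
    """Return True if an I/O would count toward an extent's heat.

    op is "read" or "write"; compressed says whether the volume is compressed.
    """
    if size_bytes >= SMALL_IO_LIMIT:
        return False
    if compressed:
        # Compressed volumes: only read operations are monitored.
        return op == "read"
    # Non-compressed volumes: both reads and writes are monitored.
    return op in ("read", "write")

print(counts_toward_heat(8 * 1024, "write", compressed=True))   # prints False
print(counts_toward_heat(8 * 1024, "write", compressed=False))  # prints True
```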

Performance results

With Easy Tier and compression, application response time can be up to 3x faster when about 5% of the drives in the configuration are SSDs. Throughput (maximum IOPS) depends on compression processor usage; therefore, in most cases, throughput remains the same.

Figure 1 shows the test results of a Transaction Processing Performance Council benchmark C (TPC-C) on a compressed volume with Easy Tier enabled and disabled. The TPC-C was used with an Oracle database and represents a realistic Online Transaction Processing (OLTP) workload. (For more information about TPC-C, go to http://www.tpc.org/tpcc/default.asp.)

Figure 1. Benchmark results

Test results show that the application response time became faster by more than 3x when the configuration used SSDs with Easy Tier, compared to a similar configuration without SSDs and Easy Tier.

The Storwize V7000 system that was used in the benchmark was running software Version 7.1.0.1 and was using the following disk configuration:

Without Easy Tier:
  • 72 x 300 GB SAS HDDs

With Easy Tier:
  • 68 x 300 GB SAS HDDs
  • 4 x 300 GB SAS SSDs
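As a quick check of the drive mix, the SSD share of the Easy Tier configuration can be computed from the drive counts listed above. The variable names are illustrative; the numbers are the ones from the list.

```python
# Drive counts and size from the "With Easy Tier" configuration.
hdd_count, ssd_count, drive_gb = 68, 4, 300

total_drives = hdd_count + ssd_count
ssd_share = ssd_count / total_drives
ssd_raw_gb = ssd_count * drive_gb

print(f"SSD share of drives: {ssd_share:.1%}")  # prints SSD share of drives: 5.6%
print(f"Raw SSD capacity: {ssd_raw_gb} GB")     # prints Raw SSD capacity: 1200 GB
```

Because all drives are 300 GB, the SSD share is the same whether it is measured by drive count or by raw capacity, and both configurations use 72 drives in total.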

IBM Storage Tier Advisor Tool (STAT): A tool to monitor Easy Tier

The IBM Storage Tier Advisor Tool (STAT) is a Windows console application that analyzes heat data files that are produced by Easy Tier and produces a graphical display of the amount of "hot" data per volume (with predictions about how additional SSD capacity could benefit the performance for the system) and per storage pool.

The tool is available at no additional cost and can be found at the following website:

http://www-01.ibm.com/support/docview.wss?uid=ssg1S4000935

To use the tool, you should use the dpa_heat file as a source file. The tool provides a report of volume heat distribution and recommendations.

To download the file, from the IBM Storwize V7000 GUI, navigate to the Settings icon in the left pane and click Support, as shown in Figure 2.

Figure 2. Support option

Click Show full log listing..., as shown in Figure 3.

Figure 3. Show full log listing option

Download the dpa_heat file from the list of files that is displayed, as shown in Figure 4.

Figure 4. File selection

The dpa_heat file is also in the full support package.


Understanding the results

This section describes how to interpret the results.

Volume heat distribution

The Volume Heat Distribution report is useful for understanding the amount of capacity that is migrated to the SSD when Easy Tier is enabled. The heat areas of compressed volumes are reported based on read operations only. The heat areas of non-compressed volumes are reported based on reads and writes. The “hot” part of the volume is marked in red, as shown in Figure 5.

Figure 5. Volume Heat Distribution report

Note: The tool’s recommendations are based on the state of the volume. Recommendations for generic volumes are based on both reads and writes, but recommendations for compressed volumes are based only on read operations. Therefore, if you consider enabling Easy Tier on a compressed volume, first compress the volume and then use STAT. Otherwise, the tool’s recommendations will differ from the actual results.

Performance improvement

The system recommendation and the system pool recommendation reports show the potential performance improvement in percentages according to the number of SSDs that are added.

Note: When compressed and non-compressed volumes are in the same storage pool, they might affect the predicted performance improvement results of the entire pool.

Compressed volumes are not directly supported by the STAT tool and therefore its recommendations will be inaccurate for compressed volumes.

Use the results that were obtained to estimate the performance improvement. Figure 6 shows the Storage Pool Recommendation.

Figure 6. Storage Pool Recommendation

Configuration

Easy Tier is defined at a storage pool level and the algorithm runs on all the volumes in the pool. If Easy Tier must be disabled for a certain volume, you can disable it by running the following command-line interface (CLI) command:

svctask chvdisk -easytier off volume_name
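When this change has to be applied to several volumes, the command string can be generated by a script. The following Python sketch only builds the text of the command and does not connect to the system. The function name is hypothetical, the `on` value is included on the assumption that the same option re-enables Easy Tier, and the quoting of the volume name is a precaution for names with unusual characters.

```python
import shlex

def easytier_command(volume_name, enabled):
    """Build the CLI command that switches Easy Tier on or off for one volume."""
    state = "on" if enabled else "off"
    # shlex.quote leaves simple names unchanged and quotes anything unusual.
    return "svctask chvdisk -easytier {} {}".format(state, shlex.quote(volume_name))

# Hypothetical volume names, for illustration only.
for name in ["vol01", "vol02"]:
    print(easytier_command(name, enabled=False))
```

The resulting strings can be reviewed first and then run in a CLI session on the system.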

To configure Easy Tier, complete the following steps:
  1. Create a storage pool with HDD MDisks.
  2. Add an MDisk with SSD to the same pool.

Easy Tier is automatically turned on for pools with both SSD MDisks and HDD MDisks, so all the volumes in the pool have Easy Tier enabled. Figure 7 shows Easy Tier activated.

Figure 7. Easy Tier activated

Conclusion

As shown, Easy Tier with Real-time Compression can greatly improve the response time of read I/O. Therefore, you should enable Easy Tier on compressed volumes that have a high read workload.

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.

Profile

Publish Date
23 August 2013


IBM Form Number
TIPS1072