Faster Microsoft SQL Server Performance with IBM FlashSystem Storage

IBM Redbooks Solution Guide

Abstract

This IBM® Redbooks® Solution Guide provides an overview of how to identify storage performance bottlenecks and improve Microsoft SQL Server database performance by using IBM FlashSystem™ storage to accelerate the most resource-intensive data operations.

Contents


This IBM® Redbooks® Solution Guide provides an overview of how to identify storage performance bottlenecks and improve Microsoft SQL Server database performance by using IBM FlashSystem™ storage to accelerate the most resource-intensive data operations.

Fast storage and low latency are vital to maximizing SQL Server performance. Adding servers or processing power often does not improve SQL performance because the system is limited by the speed of data access time. The result is a massive performance gap, most for database servers, which typically carry out more I/O transactions than other systems. Super-fast processors and massive amounts of bandwidth are often wasted as storage devices take several milliseconds to access requested data.

Figure 1 depicts the importance of latency, which applications see as time waiting. Input/output (operations) per second (IOPS), which are defined as a measure of work versus time, are important, but low latency is key to increased SQL Server performance.

Figure 1. FlashSystem products minimize wait time
Figure 1. FlashSystem products minimize wait time


Did you know?
  • While traditional storage solutions take several milliseconds to access data, FlashSystem products use IBM MicroLatency™ to access storage in a matter of microseconds
  • You can move your most I/O intensive applications to FlashSystem products to improve the performance of your entire environment with minimal impact to existing configurations
  • Consolidating your storage onto FlashSystem products can help you reduce floor space, server costs, and software licensing fees


Business value

IBM FlashSystem products provide extremely fast storage performance with an extremely small footprint. They offer extraordinarily high random I/O performance because of their low access times and high bandwidth. You can deliver more and have faster SQL transactions with FlashSystem storage.

Decreasing application performance under heavy user loads is not a new story for most enterprises. In the last three years, there have been dramatic changes in demands placed upon database servers. While the number of users of database systems has increased, so has the average amount of data stored in databases. Additionally, the demand for more complex business analysis has increased the complexity of the work done by database servers. The combination of more users, greater volume of data and more complex queries has frequently resulted in slower database response. Administrators often consider two sources for database performance problems:
  • Server and processor performance. One of the first steps that most IT shops do when performance decreases is to add processors and memory to servers or add servers to server farms. Adding servers and processors has a minimal impact on database performance that is storage bound, and compounds the resources that are wasted as even more processing power waits on the same slow storage.
  • SQL statements. Enterprises invest millions of dollars squeezing every bit of efficiency out of their SQL statements. The software tools that assist programmers with the assessment of their SQL statements can cost tens of thousands of dollars. The personnel required for evaluating and iterating the code costs much more. In the last decade, many consulting firms have appeared that specialize in system tuning, and their number-one billable service is SQL tuning.
    Tuning SQL can create performance improvements, but even the best SQL cannot make up for poor storage I/O. In many cases, applications cannot support features that rely heavily on disk I/O. For example, developers often remove features that cause large queries and return large data sets to protect application performance. For more details, go to the following site:
    http://www.ibm.com/common/ssi/cgi-bin/ssialias?infotype=SA&subtype=WH&htmlfid=TSW03191USEN

In many cases, the money spent in these two pursuits can be significant, although the return is often disappointing. Server performance and SQL tuning alone does not often detect the true cause of poor database performance: the gap between processor performance and storage performance. Adding servers and processors will have minimal impact on database performance and will compound the resources wasted, as more processing power waits on the same slow storage.

Often, features that rely heavily on disk I/O cannot be supported by applications. In particular, programs that result in large queries and those that return large data sets are often removed from applications to protect application performance.

When system administrators look to storage, they frequently try three approaches to resolve performance problems:
  • Increase the number of disks. Adding disks to JBOD (just a bunch of disks) or RAID is one way to improve storage performance. This approach can increase the I/O capability because I/O growth is a linear function when increasing the drive count, but has a trivial impact on the bottleneck if the performance issue is related to latency. You can scale I/O and bandwidth with disks if you have the space and the budget to do so, but you cannot gain the benefit of microsecond latency no matter how many disks you add.
  • Move the most frequently accessed files to their own disk. This approach can deliver the best I/O available from a single disk drive. As frequently pointed out, the I/O capability of a single hard disk drive is limited. At best, a single disk drive can provide hundreds of I/Os per second (IOPS) versus thousands of IOPS for flash storage systems. FlashSystem storage systems are capable of providing hundreds of thousands of IOPS within a single 1U appliance. FlashSystem storage combines low latency with writes at 25 microseconds and reads at 110 microseconds, with IOPS up to 525,000 for FlashSystem 820, to provide extreme performance and MicroLatency for critical business needs.
  • Select and implement a RAID design. A common approach is to move from a JBOD implementation to RAID. RAID systems frequently offer improved performance by placing a cached controller in front of the disk drives and by striping storage across multiple disks. The move to RAID can provide additional performance, particularly in instances where a large amount of cache is used. However, to reach high IOPS and meet user concurrency requirements. the required number of hard disk drives soon becomes prohibitive. FlashSystem storage's hardware-only data path, with extremely low-latency Custom FPGA-based data movement. decreases latency and reduces the need to continually increase the number hard disks to reach high IOPS.

The best solution to the performance gap is to implement solid state disks for the most frequently accessed database components.

While other components in the data chain move in terms of computation times and the raw speed of electricity through a circuit, hard drives move mechanically, relying on physical movement around a magnetic platter to access information. Form factor solid-state drive (SSD) technologies are limited by the speed of SCSI controllers, and many PCI Express (PCIe) solutions require additional software to manage operations. FlashSystem products provide a hardware-only data path that offers much faster access than any of these other solutions.

When servers wait on storage, users wait on servers. This is I/O wait time. FlashSystem storage systems are designed to minimize I/O wait time by providing a fast, hardware-only data path. These products offer 250 times faster access times (200 microseconds instead of 5 milliseconds) and 1333 times more I/O transactions per second (400,000 instead of up to 400) than a hard disk drive. Although multiple hard disk drives can be stacked to obtain thousands of IOPS, this type of solution soon reaches a point of diminishing returns where the costs of power, floor space, and air conditioning become prohibitive. Figure 2 shows latency for flash products compared to other types of storage technologies.

Figure 2. Latencies seen in storage technologies
Figure 2. Latencies seen in storage technologies

FlashSystem products provide four key benefits:
  • Extreme performance. Speed up your drive-critical applications with extreme performance storage. Spinning hard drives can be rapid, FlashSystem provides extreme performance while DRAM has the lowest latency, but is most expensive.
  • MicroLatency. Eliminate I/O wait time with 110 microsecond access times.
  • MacroEfficiency. Save money on IT staffing, power, and cooling.
  • Enterprise reliability. Durable and reliable designs use enterprise-quality flash, Variable Stripe RAID technology to reduce maintenance events, two-dimensional RAID technology, and hot-swap capability.


Solution overview

Disk latency, which is defined as a measure of the time delay from the time a disk I/O request is created until the time the disk I/O request is completed, affects Microsoft SQL Server performance. IBM FlashSystem storage products can reduce bottlenecks caused by frequent disk access. This section discusses methods for identifying I/O performance bottlenecks and lists components that are the best candidates for migration to a flash storage appliance.

Although a solid-state drive can speed up almost any SQL Server database, it is most needed in installations where servers experience I/O wait time. Several elements are involved in server I/O. The PCI (or other) bus, host bus adapter, interface, storage network switch, RAID controller, and hard disk drives are all involved in every I/O operation between server and storage. Theoretically, any one of these points can cause an I/O bottleneck. In practice, however, the hard disk drives are the most likely culprit. Simply put, every component in the I/O process is solid state except for the hard disk drives. Therefore, when you identify I/O wait time, hard disk drives are the most likely cause. Adding a solid-state drive can significantly reduce I/O wait time.

The best way to identify I/O wait time is to look at operating system performance. The tools to evaluate operating system performance vary by operating system.

For Microsoft Windows operating systems, the best tool for system performance analysis is the Performance Monitor (Perfmon) program. This tool uses counters to monitor the speed of disk transfers and the I/O operations that are generated by SQL Server.

Unfortunately, the Performance Monitor program does not provide the actual I/O wait time statistic, but it does include actual processor performance levels. The “Processor: % Processor Time” statistic shows the actual work that is being done by the processor. If transactions have a large impact on your system, but the “% Processor Time” statistic is well under 100%, it is possible to infer an I/O wait problem. Systems that implement solid-state drive show high “% Processor Time” numbers.

For example, Figure 3 and Figure 4 show screen captures from the Windows Performance Monitor program. Figure 3 shows the “Processor: % Processor Time” for a Windows system that is running the Iometer software performing 100% random writes to a hard disk drive. In this exhibit, the processor utilization is less than 1%. Running additional applications on this system might increase the processor utilization only marginally, because the processor is waiting for I/O from the hard disk drive. In the following example, the Iometer software shows that on average, there were 150 writes per second (150 IOPS) to the disk drive.

Figure 3. Processor performance when writing to hard disk
Figure 3. Processor performance when writing to hard disk

Figure 4 shows the same system with the same access specifications in the Iometer software running against a FlashSystem solid-state drive. In this example, the processor averages 68% utilization. The Iometer software shows that 138,000 writes per second are going to the FlashSystem product (138,000 IOPS). As it turns out, this IOPS number is a limitation of the host bus adapter. Nonetheless, this example clearly illustrates how a solid-state drive can improve processor utilization for a Windows-based system.

Figure 4. Processor performance when writing to solid state disk
Figure 4. Processor performance when writing to solid state disk

In addition to processor indicators, you can look at the “Physical Disk: Average Queue Length” and “Physical Disk: Disk Bytes Per Second” indicators to detect bottlenecks in the disk subsystem. If the “Physical Disk: Average Disk Queue Length” is consistently high, consider moving files that are located on that disk to the solid-state drive. A “Disk Queue Length” greater than 3 (per disk in the RAID set) indicates a problem.

The “Physical Disk: Disk Bytes Per Second” indicator helps you visibly determine the level of performance you are getting when your disk array levels off. The point at which disk bytes per second level off and physical disk queue lengths increase is the point when your system is demonstrating an I/O bottleneck. In addition, if you simultaneously graph Processor Utilization, you notice that processor time tends to decrease when your system experiences an I/O bottleneck. After you narrow down the time range causing I/O problems, further refine your search for the problem by using Perfmon’s ability to narrow details by reads/writes and by drive arrays. In this way, you can quickly determine which array is causing I/O problems.

For statistics to confirm the I/O wait witnessed by the application, you can also load the SQL Server: Wait Statistics objects into the Windows Perfmon program. These counters give an application’s perception of Wait categories of real-time queries. The most prominent waits are typically the Page I/O Latch waits, and Log buffer and Log Write waits.

With these tools, you can observe the I/O wait time for a Windows-based system.


Solution architecture

FlashSystem products are a SAN solution with Fibre Channel or InfiniBand connectivity. They have a small, 1U form factor, and can be integrated with IBM SVC to add performance to low latency.


Usage scenarios

IBM FlashSystem solutions benefit the financial industry; FlashSystem storage low latency and improved database performance can improve this usage scenario.

Financial industry

The securities industry is one of the most performance-sensitive sectors of the economy. Electronic exchanges, trading and clearing systems, and algorithmic trading are growing as tick-data response times decline. This has created an environment of fierce competition where the performance of IT infrastructure makes the difference between the firms that capture market share and profits, and those that do not.

Latency matters

System response times (latency) are of critical importance to applications in this environment. In addition to the intense performance requirements of the securities industry, resiliency of IT systems is critical. The need to electronically create legally binding agreements for tremendous monetary sums places unique requirements for redundancy and real-time position management on the business infrastructure. A firm that is not in control of its trades and positions is left without a view of its risk and is at the mercy of counterparties. This can result in dramatic costs from settlement failures, fines for missing regulatory timing requirements, and customer complaints. To avoid this situation, data in securities applications must be fast and must be persistent.

Flash storage has the following characteristics:
  • Lowest possible access times. Flash storage, such as the IBM FlashSystem 720, has access times that start at 0.1 milliseconds for reads (50 times faster than a hard disk of 5 milliseconds) and integrate write buffers for access times as low as 0.025 ms for writes.
  • High bandwidth. The enterprise solid-state storage market includes products such as the IBM FlashSystem 820, which can support over 5 GBps of random data throughput and across 20 TB of two-dimensional RAID-protected capacity.

For more information, see the white paper, Utilizing IBM FlashSystem in the securities industry:
http://public.dhe.ibm.com/common/ssi/ecm/en/tsw03230usen/TSW03230USEN.PDF

Improving database performance

If you determine that your system has I/O subsystem problems, the next step is to identify which components of your SQL Server database are experiencing the highest I/O and in turn causing I/O wait time. Consider moving several components to FlashSystem storage. These components include entire database, transaction logs, temporary database (tempdb), indexes, and frequently accessed tables.

Entire database

Some databases should have all of their files moved to flash storage. These databases tend to have at least one of the following characteristics:
  • High concurrent access. Consider moving all data to solid-state drive for databases that are used by a large number of concurrent users. This ensures that storage is not a bottleneck for the application and maximizes the utilization of servers and networks. I/O wait time will be minimized and servers and bandwidth will be fully utilized.
  • Frequent random accesses to all tables. For some databases, identifying a subset of files that are frequently accessed is not possible. Many times these databases are effectively large indices.
  • Small to medium size databases. Given the fixed costs associated with buying RAID systems, buying FlashSystem storage to store small to medium sized databases is often economical. A FlashSystem 710, for example, can provide 5 TB of database storage for less than the price of most enterprise RAID storage systems.
  • Large read-intensive databases. Given the fixed costs associated with architecting a RAID system for performance (buying a large cache and buying a lot of spindles for striping), it is economical and much simpler to buy a FlashSystem 820 pure flash solution to accelerate large read-intensive databases. A single FlashSystem 820 system can scale up to 20 TB of capacity in a 1U form-factor.
  • Database performance is the key to company profitability. Some subset of databases help companies make more money, lose less money, or improve customer satisfaction if they process faster. Flash storage can help make these companies more profitable.


Transaction logs

Transaction logs are one of the most important factors in the write-performance for SQL Server databases. When a database write occurs, SQL Server creates a transaction log entry. Each SQL Server database has at least one transaction log (*.ldf). SQL Server considers operations committed when the write to the transaction logs is complete.

Because the transaction logs are a source of constant I/O during database operation, store the transaction logs on the fastest disk possible. Writing a transaction log to FlashSystem storage is a natural way to improve overall database performance.

Temporary database (tempdb)

SQL Server uses tempdb to support temporary data during certain operations. The tables support complex queries, joins, and index creations. Because tempdb supports many kinds of operations, locating them on the fastest possible storage is essential.

When complex operations occur, they complete more quickly if the tempdb is moved to FlashSystem storage. Because the I/O to the tempdb can be frequent, disk drives cannot handle them easily.

Indexes

An index is a data structure that speeds up access to database records. An index is usually created for each table in a database. SQL Server updates these indexes when adding records and modifying identifying data for records. When a read occurs, SQL Server consults an index to access the correct record quickly. Furthermore, many concurrent users might read any index simultaneously. This activity to the disk drive creates frequent, small, and random transactions, which makes the disk drives unable to keep up with demand and I/O wait time results.

Storing indexes on a solid-state drive can increase the entire application’s performance. For online transaction processing (OLTP) systems with a high number of concurrent users, this results in faster database access. Historically, indexes have been a common SQL Server component to be moved to solid-state drive because they can be re-created from the existing data.

Frequently accessed tables

Sometimes, only 5 - 10% of tables account for a large percentage of all database activity, and thus I/O to storage. When a large number of users hit a table, they are likely going after different records and different attributes. As a result, the activity on that table is random. Disk drives are notoriously bad at servicing random requests for data. In fact, the peak performance of a disk drive drops as much as 95% when servicing random transactions. When a table experiences frequent access, transaction queues develop where other transactions are waiting on the disk to service the next request. These queues are another sign that the system is experiencing I/O wait time.

Moving frequently accessed tables to FlashSystem storage makes sense. Unlike traditional hard drives, random transactions do not affect solid-state drive performance. Additionally, solid-state drives have faster access times than disk drives. Therefore, moving frequently accessed tables to solid-state drive can improve application performance up to 10 times.

SQL Server supports moving specific tables to specific file groups, which enables you to more easily move the most frequently accessed tables to solid-state drive.

Using the Microsoft SQL Server Profiler tool to identify the most frequently accessed SQL Server files and tables

The SQL Server Profiler tool allows database administrators to monitor events in SQL Server. The profiler monitors SQL Server activities by tracing selected events, including database objects. Database objects include databases, tables, indexes, views, and stored procedures. The SQL Server Profiler specifically supports monitoring when a table is opened. Tables are opened during Select, Insert, and Delete statements. By using the profiler, a database administrator can monitor the number of times all tables are opened during the trace period. The results of the trace can be sent to a table for further analysis to determine which tables were most frequently opened during the trace period and thus which tables are the best candidates for placement onto a solid-state drive.

Object-related traces can impose a significant overhead on database performance because of the frequency with which objects are accessed. If you have a test environment that models production activity and transaction mixes, this might be the best place to use an SQL Server Profiler “Object” trace. Otherwise, ensure you do not use the Profiler during peak transaction periods unless you are willing to take the performance hit.

For more information about SQL Server Profiler, visit the Microsoft MSDN website:
http://msdn.microsoft.com/en-us/library/bb545450%28v=msdn.10%20%29.aspx

To improve performance, physically isolate database objects, such as tables and indexes, into unique file groups. With file groups, management and monitoring become more granular than the database level, and DBAs are able to avoid the high overhead of SQL Server Profiler to find hotspots. This monitoring is done with Dynamic Management Views (DMVs).

The DMV dm_io_virtual_file_stats (Figure 5) is a query of I/O counters and cumulative wait times for each type of I/O. These counters represent the aggregate values since the last boot of the SQL Server instance. Because these are running counters, a difference in query results in needed to get a proper assessment of I/O wait. Record one query result before a busy period of the application, and record a second query result after the busy period. Recording each query result in a spreadsheet application, such as Microsoft Excel, is ideal for this analysis.

Figure 5. Running a DMV dm_io_virtual_file_stats query
Figure 5. Running a DMV dm_io_virtual_file_stats query

From this query, you can extract the counters for every database with their respective file statistics. The results can then be presented into a more visual form using Microsoft Excel with a PivotChart to group files into file groups.

Figure 6 is an example of showing the number of I/O stalls logged against each file’s size in a TPC benchmark database.

Figure 6. Graphic view of query
Figure 6. Graphic view of query


Integration

FlashSystem products can be integrated into your existing storage environment. You can also add an IBM SAN Volume Controller (SVC) to manage FlashSystem storage as Tier 0 storage.

Use an SVC to add advanced storage functionality to the extreme performance of FlashSystem storage. SVC products have a maximum of 100 μs of latency and add the following features:
  • Thin provisioning to allocate storage "just in time"
  • Improved utilization to harvest all SAN capacity
  • Disaster avoidance with location-proof data availability
  • IBM Easy Tier® for storage efficiency
  • IBM FlashCopy® for point-in-time copies
  • Mirroring/copy services for data replication and protection
  • Real-time compression to place up to five times more data in the same physical space


Supported platforms

Supported hardware and platforms for the FlashSystem portfolio of products are listed at theIBM System Storage Interoperation Center (SSIC):
http://www.ibm.com/systems/support/storage/ssic


Ordering information

FlashSystem storage products can be ordered individually or bundled with an SAN Volume Controller to provide advanced functionality such as thin provisioning, IBM Easy Tier, and FlashCopy.

Ordering information is shown in Table 1.

Table 1. Ordering part numbers and feature codes
ItemProduct numberFeature codeDescription
IBM FlashSystem 7209381-AS2
IBM FlashSystem 8209381-AE2
IBM FlashSystem Bundle9381-AE2AF0W(24 TB) FlashSystem capacity with SAN Volume Controller functionality
Enterprise Performance 20 TB Building Block

(2) 2145-CG8 SAN Volume Controller Engines
(1) 9831-AE2 FlashSystem 820 20TB
(20 TB) 5639-VC6 base license
(5) 6911-400 Standard Service Units
IBM FlashSystem Bundle9381-AE2AF0X(12 TB) FlashSystem capacity with SAN Volume Controller functionality
Enterprise Performance 10TB Building Block

(2) 2145-CG8 SAN Volume Controller Engines
(1) 9831-AE2 FlashSystem 820 10TB
(10 TB) 5639-VC6 base license
(5) 6911-400 Standard Service Units
IBM FlashSystem Bundle9381-AE2AF0Y(24 TB) FlashSystem capacity with SAN Volume Controller functionality
Enterprise Performance+HA 20TB Building Block

(2) 2145-CG8 SAN Volume Controller Engines
(2) 9831-AE2 FlashSystem 820 20TB
(40 TB) 5639-VC6 base license
(5) 6911-400 Standard Service Units
IBM FlashSystem Bundle9381-AE2AF0Z(12 TB) FlashSystem capacity with SAN Volume Controller functionality
Enterprise Performance+HA 10TB Building Block

(2) 2145-CG8 SAN Volume Controller Engines
(2) 9831-AE2 FlashSystem 820 10TB
(20 TB) 5639-VC6 base license
(5) 6911-400 Standard Service Units


Related information

For more information, see the following documents:

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment. publib-b.boulder.ibm.com

Profile

Publish Date
30 July 2013


Rating: Not yet rated


Author(s)

IBM Form Number
TIPS1038