Journaling - Configuring for your Fair Share of Write Cache
Published 23 February 2007
Authors: Hernando Bedoya
Of all the hardware choices you can make, the most influential regarding journal performance is usually the quantity of IOA write cache made available for your journal receivers.
While it is tempting to simply configure as many disk units emanating from an IOA as the surrounding hardware connections will allow, doing so in a journal intensive environment can sometimes be counterproductive. Instead, it is often wiser to refrain from attaching so many disk units to the IOA. By spreading the disk units around across a larger quantity of IOAs, you allow the IOA write-cache to keep up with the journal traffic.
Written by Larry Youngren, Software Engineer, IBM Systems & Technology Group, Development iSeries Journaling
The path leading from main memory out to the disk units often passes through an I/O Adaptor (IOA). Many IOAs house some local memory of their own, which is often configured to serve as a write cache.
The presence of such write cache can be a good thing, since it serves as a speed matching buffer allowing your application to experience a rollercoaster of increases and decreases in the rate of arrival of new journal entries without swamping the underlying disk subsystem.
As the rate of arrival of new journal-induced disk traffic rises, it simply accumulates in the IOA write-cache, and is staged for an eventual trip out to the disk surface. Since the journal entries pause within the IOA write-cache, the longer they linger, the more likely it becomes that neighboring disk sector images will arrive and thus that the trip out to the disk surface will have a greater locality-of-reference and hence take fewer total disk rotations to service. In effect, the presence of the IOA write cache improves your disk write efficiency.
The result is often dramatic performance improvements upstream at the application level, as seen in this example:
The question is, “If write cache within the IOA is good, how many disk units should I configure to share the same write cache?”
That is a good question. IOA write cache sizes have grown enormously (years ago they were merely 10 MB, then grew to 26 MB, and are now much larger), so it is tempting to conclude that you should configure as many disk units as the underlying physical connections allow. That is not always the wisest choice.
Some of the largest write caches house an amazing 757 MB or more.
That appears to be a huge amount, and with so much write cache available it is tempting to configure as many disk units as the hardware connections allow, all feeding off the same write cache.
But should you?
Eight is enough
If your rate of arrival of new journal entries is substantial and you want to provide the best possible disk write path (least overhead and contention) for your journal disk traffic, you should NOT configure as many disk units as you can.
Instead, it can be, in some circumstances, wiser to restrict yourself to only associating seven or eight disk units with each IOA write cache.
Why? Because you may want to limit contention from separate disk units within the IOA.
Let us consider an analogy.
Each year many extended families gather to eat a traditional holiday meal. For my family, that gathering day is the USA holiday known as Thanksgiving. It is traditional to prepare a turkey, mashed potatoes, and lots of gravy. The cooks in the family labor all morning preparing the meal, the children gather around the table, and out comes a steaming bowl piled high with mashed potatoes as all of the kids' eyes grow wide. In fact, the real treat (as a kid) is to pile up a huge heaping mound of mashed potatoes in the shape of a volcano, hollow out the middle, pour gravy into the resulting cavity, and seal the top. It is great fun, and every one of my cousins eagerly awaited the opportunity to do so as the heaping bowl of mashed potatoes was passed.
But you know what happens, right?
The first few cousins piled onto their plates far more than their fair share of potatoes, and by the time the bowl reached my place there were none left; only an empty bowl. I had not received my fair share.
IOAs are much the same. Their write cache is a limited resource (like the bowl of mashed potatoes), and if each disk slurps up more than its fair share of space within the write cache of the IOA, there is little room left for those disk write requests that arrive later. That is precisely the problem that can ensue if you attach more than eight disk units to the IOA and then expect to load up each disk with a great deal of scheduled journal traffic.
Not enough capacity for all disk units to be write-intensive
You see, the IOA designers did not really have a journal write-intensive environment in mind when they sized the IOA. Instead, they envisioned a more balanced heterogeneous environment in which an application would spend more time reading from the disk surface than writing to it.
Isolating a set of disk units to a User ASP and thereby setting them aside for the select purpose of absorbing journal traffic (many writes, few reads) is not such an environment. Instead, it is a very skewed kind of behavior from an IOA designer's point of view.
It is for that reason that although IOA designers have allowed you to attach up to 15 disk units per IOA, a journal developer might caution you not to yield to that temptation.
While there may well be many instances in which your application behavior involves far more reads from disk than writes to disk, the use of a journal is not such an environment. Instead, the journal tends to be a very disk-write intensive activity. As a consequence, an aggressive batch job updating many consecutive rows or adding many new rows to a journaled physical file or SQL table can unleash a torrent of back-to-back disk write requests.
Journal receivers try to absorb such disk write traffic and spread it around. To that end, journal receivers deliberately attempt to spread themselves across multiple disk units. If all of these disk units are attached to the same IOA, then this attempt to spread the load can be fraught with danger and an IOA can be overworked.
In addition, there is a second factor at work. The underlying microcode servicing the IOA attempts to limit how much of the IOA’s write cache each disk unit can consume. Once that quantity of buffered page images resides within the memory of the IOA, the microcode demands that the former cached pages be written to a disk surface. This self-limiting mechanism is known as the “fair share” rule. That is, each disk arm associated with an IOA is allowed to accumulate no more than its fair share of unwritten disk sector images. Once that limit has been reached, subsequent attempts to input more data into the IOA write cache on behalf of the same disk unit will experience a pause until the former pages have been emptied to make room for the new arrivals. The fair share amount is 100 MB per disk unit.
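The fair-share behavior described above can be illustrated with a toy model. This is only a sketch, not the IOA microcode: the class name `ToyWriteCache` and its methods are hypothetical, the 100 MB cap is the figure quoted in the text, and the model deliberately ignores the total cache size and the timing of destaging.

```python
from dataclasses import dataclass, field
from typing import Dict

FAIR_SHARE_MB = 100.0  # per-disk-unit limit quoted in the text


@dataclass
class ToyWriteCache:
    """Hypothetical toy model of a write cache with a per-disk cap."""
    fair_share_mb: float = FAIR_SHARE_MB
    # disk unit name -> MB of cached pages not yet written to the surface
    pending_mb: Dict[str, float] = field(default_factory=dict)

    def write(self, disk: str, size_mb: float) -> bool:
        """Return True if the write is absorbed, False if it must pause."""
        used = self.pending_mb.get(disk, 0.0)
        if used + size_mb > self.fair_share_mb:
            return False  # disk unit is at its fair share; the writer waits
        self.pending_mb[disk] = used + size_mb
        return True

    def destage(self, disk: str, size_mb: float) -> None:
        """Model cached pages being written out to the disk surface."""
        used = self.pending_mb.get(disk, 0.0)
        self.pending_mb[disk] = max(0.0, used - size_mb)


cache = ToyWriteCache()
first = cache.write("disk1", 90)   # absorbed, 90 MB now pending
second = cache.write("disk1", 20)  # would exceed 100 MB, so it pauses
cache.destage("disk1", 50)         # 40 MB pending after destaging
third = cache.write("disk1", 20)   # absorbed, 60 MB now pending
```

In this sketch a paused write simply returns False; a real writer would wait and retry once earlier pages have been emptied from the cache.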
With a total IOA write cache size of 757 MB, it is an easy calculation: 757 / 100 ≈ 7.6, or roughly seven to eight disk units maximum. That is, when a journal receiver is being aggressively filled (mostly disk writes occurring, very few disk reads), generally no more than seven or eight disk units (per IOA) can be comfortably serviced without incurring pauses triggered by this fair-share rule.
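The arithmetic can be written out as a short sketch. The 757 MB cache size and the 100 MB fair share come from the text; the function names are hypothetical, and the calculation assumes the worst case in which every disk unit fills its entire fair share at the same time.

```python
def max_write_intensive_units(cache_mb: float, fair_share_mb: float) -> int:
    """Whole disk units that can each hold a full fair share at once."""
    return int(cache_mb // fair_share_mb)


def oversubscription(units: int, cache_mb: float, fair_share_mb: float) -> float:
    """Worst-case demand (every unit at its cap) divided by cache size."""
    return units * fair_share_mb / cache_mb


CACHE_MB = 757       # example cache size from the text
FAIR_SHARE_MB = 100  # fair-share amount from the text

units = max_write_intensive_units(CACHE_MB, FAIR_SHARE_MB)
ratio_8 = oversubscription(8, CACHE_MB, FAIR_SHARE_MB)
ratio_15 = oversubscription(15, CACHE_MB, FAIR_SHARE_MB)
```

Under these assumptions seven units fit entirely, an eighth oversubscribes the cache only slightly (a ratio a little above 1), and fifteen units could demand nearly twice the cache in the worst case.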
For that reason, in a particularly aggressive disk-write environment, it can make sense to limit the quantity of disk units servicing a journal receiver to no more than eight for IOA #1 and eight more for each subsequent IOA. By doing so, the IOA is generally able to grant each disk write request a rapid response, which in turn means that the application producing the torrent of journal entries works smoothly.
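As a planning aid, the guideline can be turned into a small helper that decides how many IOAs a given number of journal disk units would need and spreads the units evenly. This is a sketch only: `plan_ioas` is a hypothetical name, the default of eight units per IOA is the guideline from the text, and the even split is an assumption rather than a documented requirement.

```python
import math
from typing import List


def plan_ioas(total_units: int, max_per_ioa: int = 8) -> List[int]:
    """Return the number of disk units to attach to each IOA."""
    if total_units <= 0:
        return []
    ioas = math.ceil(total_units / max_per_ioa)  # fewest IOAs within the limit
    base, extra = divmod(total_units, ioas)      # spread units evenly
    return [base + 1] * extra + [base] * (ioas - extra)


plan_20 = plan_ioas(20)  # 20 units need 3 IOAs: [7, 7, 6]
plan_16 = plan_ioas(16)  # 16 units need 2 IOAs: [8, 8]
```

Spreading evenly, rather than filling the first IOA to eight before starting the next, keeps every write cache equally lightly loaded.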
The ultimate evidence that sufficient IOA write cache has been configured to match the rate of arrival of journal entries is to use the algorithm identified in Appendix A, "How to calculate the fast write percentage", of Striving for Optimal Journal Performance on DB2 Universal Database for iSeries, SG24-6286. If that algorithm suggests that the ratio of fast writes is 99% or better, you probably have sufficient IOA write cache for your journal traffic. If you measure a value smaller than 99%, you may need to spread the journal across more disk units and ensure that none of the disk units selected are attached to an IOA that is trying to service too many contentious disk write requests. That is when the “eight is enough” guideline may make sense for journal intensive configurations.
Although there are no doubt many sufficiently tame environments in which attaching up to 15 disk units to an IOA causes no performance slowdowns, if you should find that your percentage of IOA fast disk writes is such that you are not keeping up, a helpful move can often be to reduce the number of disk units per IOA. In particularly aggressive journal disk-write environments, limiting the number of disk units to only seven or eight per IOA often seems prudent.
This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.