Journaling - How many journals should I configure?

Abstract

A frequently posed i5/OS™ question goes something like this: “I have 100 tables (files). Should I try to route all the journal traffic into one master journal, or would I be better off with perhaps 10 journals? What are the pros and cons associated with each approach?”

The answer depends on what you’re trying to optimize: run-time performance, IPL recovery duration, target-side replay parallelism, or remote journal transport efficiency.

This Technote helps you think through the trade-offs.

Written by Larry Youngren
Software Engineer
IBM Systems & Technology Group, Development

The Configuration issue

Database tables (files) can be protected via journaling. A frequent inquiry poses the question: “How many journals should I employ?”

Generally, the answer is that it makes life simpler for human beings as well as more efficient for the operating system if multiple tables (files) are all journaled to the same place, within reason. In addition, the underlying i5/OS™ journal support has built-in natural bundling algorithms which tend to run at maximum efficiency when the algorithms are fed lots of consecutive journal entries in rapid succession. That is, an armful of journal entries written in unison to a shared disk surface can be handled more efficiently than separate trips to disk for each individual journal entry.

Hence, the answer to the question “I have 100 tables (files), should they all be journaled to the same place?” is frequently a resounding “Yes!”, up to a point.

However, there are a few circumstances under which it may make sense, depending on your objective, to separate groups of tables (files) into smaller sets and then create an individual private journal for each subset. Such groupings should occur along natural database relationship boundaries. For example, tables (files) participating in the same commit transaction should generally be journaled to the same place. Similarly, the presence of multi-format keyed logical files built over multiple underlying physical files should motivate you to try to journal all the underlying physical files to the same place.
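
One way to find those natural boundaries is to treat each relationship (a commit transaction, or a logical file built over several physical files) as a link between tables and then collect the connected groups. The Python sketch below is only an illustration of that idea, not an i5/OS facility. The function name group_tables and the table names are hypothetical, and the relationship list is assumed to come from your own knowledge of the application.

```python
from collections import defaultdict


def group_tables(tables, relationships):
    """Return lists of tables that should share a journal.

    tables: iterable of table (file) names.
    relationships: iterable of name collections; each collection lists the
    tables touched by one commit transaction, or the physical files under
    one multi-format logical file.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Follow parent links to the group's representative,
        # shortening the path as we go.
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for related in relationships:
        related = list(related)
        for other in related[1:]:
            # Merge the group of each related table into the first one's group.
            parent[find(other)] = find(related[0])

    groups = defaultdict(set)
    for t in list(parent):
        groups[find(t)].add(t)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical example: two commit transactions that overlap on ORDLINES.
example_groups = group_tables(
    ["ORDERS", "ORDLINES", "CUSTOMER", "INVENTORY", "AUDITLOG"],
    [("ORDERS", "ORDLINES"), ("ORDLINES", "INVENTORY")],
)
```

Each resulting group is a candidate for its own journal. Groups that stand alone can be combined freely, so the number of groups is an upper bound on how far the tables can sensibly be spread, not a recommended journal count.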

Whether journaling to the same place or creating a slew of multiple journals, there are pros and cons related to each approach.

Let’s look at a recently posed question concerning this subject.

The question

“I’m working with a customer who is about to configure and enable journaling for the first time in order to feed a high availability replication product they intend to purchase. They have 100 files which they want replicated. They’re trying to decide how many journals to create and whether it makes more sense to have only one journal or create 10 separate journals. Which one is the wiser choice? What are the trade-offs they should consider? Will one choice provide obviously better performance than the other choice?”

The response

There's no single answer.

Instead, there are principles.

These principles lead to differing conclusions and advice depending on the objective.

Let’s think about your high availability replication environment. You’ll have a source/production machine. This machine will have local journaling enabled for your critical production files. You’ll also have a matching remote target machine. On this target system, you’ll be replaying the journaled changes against a set of replicated instances of these same files.

The more journals you have configured for your 100 files, the more journals the target machine must monitor.

Factors which favor employing more than one journal
  1. If high availability (HA) replication replay efficiency and currency or keep-up (on the target machine) are your most important objectives, then the more journals, the better. This approach affords your HA replication product the opportunity to perform more operations in parallel. Parallelism is your friend, and the more parallel operations being replayed, the less likely your HA replay step will be a pinch point and fall behind.
  2. If minimal recovery duration (on the source machine following an abnormal IPL or abrupt iASP disconnect followed by a matching vary-on step) is your most important objective, then the more journals, the better, since it affords the underlying i5/OS™ recovery code the opportunity for the maximum degree of parallelism. Both IPL and iASP vary-on will elect to start overlapped recovery for multiple journals in parallel. The more parallelism, the shorter the duration of these recovery steps, and hence the faster the machine is returned to your users.
  3. If you have multiple communication lines at your disposal between the source and the target machines for your HA environment, but none of them has sufficient bandwidth to handle your total system remote journal traffic, then use of multiple journals can be a helpful configuration choice since it affords the opportunity to drive parallel communication streams between the production and target machine: one per journal.
  4. Sometimes the sheer volume of journal traffic is so high that a single journal simply cannot absorb new images fast enough without slowing down your applications. In those instances, splitting your journal stream into multiple paths makes sense. However, for most shops, that limiting condition is far beyond what they’re likely to witness. There are i5/OS customers who have elected to stay with a single journal and have achieved deposit rates in excess of 2 million new journal entries per minute. Once you exceed that rate, you may need to consider identifying a few objects which ought to be routed to a separate journal.
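
To put the figure in item 4 in perspective, 2 million entries per minute is roughly 33,000 entries per second. The Python sketch below shows one simple way to reason about which objects to route elsewhere once the total exceeds that rate: move the busiest objects first until the remainder fits. This is an assumption-laden illustration. The 2 million figure is the rate quoted above, not a hard system limit, the per-object rates and names are hypothetical, and the heuristic ignores the relationship boundaries discussed earlier, which you would still need to respect.

```python
# Rate quoted in the text for a single journal; treated here as a rough
# planning ceiling, not a documented limit.
SINGLE_JOURNAL_ENTRIES_PER_MINUTE = 2_000_000


def entries_per_second(entries_per_minute):
    """Convert a per-minute deposit rate to a per-second rate."""
    return entries_per_minute / 60


def objects_to_split(rates_per_minute, ceiling=SINGLE_JOURNAL_ENTRIES_PER_MINUTE):
    """Pick the busiest objects to move until the rest fits under the ceiling.

    rates_per_minute: dict mapping object name to its journal entries/minute.
    Returns (names_to_move, remaining_rate_on_original_journal).
    """
    total = sum(rates_per_minute.values())
    moved = []
    busiest_first = sorted(rates_per_minute.items(), key=lambda kv: kv[1], reverse=True)
    for name, rate in busiest_first:
        if total <= ceiling:
            break
        moved.append(name)
        total -= rate
    return moved, total


# Hypothetical measured rates for three objects.
example_rates = {"AUDITLOG": 1_200_000, "ORDERS": 900_000, "CUSTOMER": 300_000}
example_moved, example_remaining = objects_to_split(example_rates)
```

With these hypothetical rates, the total of 2.4 million entries per minute exceeds the ceiling, and moving the single busiest object is enough to bring the original journal back under it.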

Hence, there are multiple factors which favor the use of more than one journal.

You don’t want to carry this too far, however. While 10 journals may be manageable, 1000 would probably be far more than any human being wants to try to manage or remember. Thus, a small amount of spread and matching parallelism may be a good thing, but don’t let it get too complicated.

Factors which favor employing a single journal
  1. On the other hand, if the most efficient use of source-side disk bandwidth and the fewest number of trips out to disk are your most important objectives, then having as few journals as practical is often the best choice. This approach lets both the underlying journal bundling mechanism and the journal caching mechanism stitch together a larger armful of journal entries which make the trip out to disk in unison. You simply end up issuing fewer total disk writes (although wider writes) by sending all the objects to the same journal.
  2. If your HA replication product of choice intends to use remote journaling as the preferred connection/transport mechanism, you’ll probably want to make the communication line transport as efficient as possible. Having fewer journals is the helpful direction since here too the bundles tend to be larger. This makes the communication layer remote journal packet sizes more efficient. The fewer bundles you build in the main memory of the source machine, the fewer separate trips across the communication fabric, and the fewer metadata bytes which must be sent to describe each packet. This saves CPU path length on both the source and the target side.
  3. If you intend to use your journals to react to a site disaster and orchestrate a manual set of recovery steps (APYJRNCHG), then you’re more likely to achieve consistency across related objects if all the objects which your application views as related are present within the same journal. This helps you achieve recovery to a matching point in time more easily. Hence, the fewer journals, the easier it can be to orchestrate and assure consistency.
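
The "fewer but wider writes" effect in item 1 can be illustrated with a deliberately simple model. The Python sketch below is not the actual i5/OS bundling algorithm, which this Technote does not describe. It assumes that, within one short time window, each journal writes its pending entries in bundles of up to bundle_capacity entries, and that entries are spread evenly across the journals. The function name and the capacity of 128 are hypothetical.

```python
import math


def writes_per_window(entries_in_window, journal_count, bundle_capacity):
    """Count disk writes in one time window under the toy model.

    Entries are spread as evenly as possible across the journals, and each
    journal needs one write per bundle of up to bundle_capacity entries.
    A journal with no entries in the window issues no write.
    """
    base, extra = divmod(entries_in_window, journal_count)
    per_journal = [base + 1] * extra + [base] * (journal_count - extra)
    return sum(math.ceil(count / bundle_capacity) for count in per_journal)


# 40 entries arrive in the window; 128 is a made-up bundle capacity.
one_journal = writes_per_window(40, 1, 128)
ten_journals = writes_per_window(40, 10, 128)
```

Under these assumptions a single journal makes one trip to disk carrying 40 entries, while ten journals make ten trips carrying 4 entries each. The same reasoning applies to the remote journal packets described in item 2.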

So, there's NOT one single answer. Rather, it depends on which factors (IPL duration, target-side replay, source-side disk traffic, or remote journal transport efficiency) you're trying to optimize.

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.

Profile

Publish Date
06 June 2006


Author(s)

IBM Form Number
TIPS0606