Journaling – Journal Receiver Diet tip 2: Consider using skinny headers
Published 05 March 2007
Authors: Hernando Bedoya
It is easy to jump to the conclusion that whatever i5/OS™ provides by default must be the best choice for your shop, but that is not always the case. This is particularly true when it comes to thinking about the kinds of extra metadata i5/OS appends to each journal entry (metadata that takes up space!).
You can think of a journal entry like a candy bar. It has a sweet spot inside (your database row image) and it also has a surrounding wrapper. Just like real candy bars, sometimes the wrapper provides essential information you cannot do without (for example, does the candy bar include any ingredients to which my child has an allergy?), and sometimes it provides more information than you really need. Also, much like real candy bars, you can request more elaborate/colorful wrappers or you can settle for a plainer wrapper. The standard wrapper provided by default for journal entries is a middle ground, neither too fancy nor too plain, but that does not mean the information it houses is truly needed by your shop.
Got a million journal entries? If so, you have a million wrappers as well. The question is, “How fancy do your wrappers need to be, and are you really using the optional information housed within the standard wrappers, or are they fancier (and costlier) than you really need?”
This Technote lays out two ways in which you can elect to dispense with the more elaborate wrappers and settle instead for a simpler, smaller, more efficient, and perhaps more practical approach.
Written by Larry Youngren, Software Engineer, IBM Systems & Technology Group, Development iSeries Journaling
All wrappers are not created equal
You may have noticed that when you display a journal entry with the DSPJRN command (say, an R/PT, or Record/Put, journal entry), the display shows not only the matching row image from the underlying table but also descriptive information.
This descriptive information (for example, which program made the change, under which user profile the program was executing, and so on) can be helpful in debugging. This extra data is housed within a surrounding wrapper that accompanies each journal entry. Technically, this wrapper is known as the journal entry header because it physically precedes the row image on the disk.
If you could look deep into the microcode of the machine and see a journal entry just before it is written to disk, it would look something like this:
It is those leading 96 bytes that are fed to the DSPJRN command so that it in turn can produce screens like this:
Portions of the wrapper are “fixed”, that is, they are always present and essential, such as the journal sequence number, object identity, and journal entry type. Such information is labeled “fixed” on the screen above. Because the operating system itself interrogates this information at IPL time, it makes no sense to discard any of it.
Other portions of the wrapper are optional, such as the audit information on the screen below:
We draw your attention to this distinction because some header information is truly essential (for example, the journal sequence number and the identity of the matching object/table affected). The machine itself needs (and uses) such information in order to properly recover your files should the machine terminate abruptly.
Other header information (although normally included by default unless you take explicit action to direct otherwise) is, however, truly optional and nonessential. That is, the machine itself will behave properly and recover your files properly whether such information is present within the wrapper or not. In fact, the operating system itself never even looks for such information!
Yes, debuggers may appreciate its presence but that luxury comes at a price. The question is, “Should you be paying that price time and time again?”
If you truly need (and programmers in your shop regularly use) such information, keep it. If they do not, then you are burdening yourself with extra overhead in terms of main memory and disk space, quantity of disk writes, and CPU path-length. And you are paying this extra cost time and time again with each new journal entry your applications produce.
What we suggest is that if you can live without this optional data, you might want to consider doing so. In effect, you will be reducing the functions of the surrounding wrappers and thereby putting your journal receiver on a diet.
What kind of information can I discard
There are three principal kinds of information normally residing within the header of each journal entry that you might want to consider trimming. These include:
- The name of the program under which the changes were made.
- The name of the user profile under which the changes were made.
- The name of the job under which the changes were made.
In fact, if you use the CHGJRN command and move to the second screen, and look at the default value provided for the keyword FIXLENDTA, you will see *JOBUSRPGM, which stands for Job Name, User Profile, and Program name. That is why all three of those attributes are listed on the screen below for “Fixed Length Data”.
A long and arduous climb
Does collecting all three of these standard pieces of descriptive information cost very much?
Let us take a simple example. Imagine that you have a batch job that is executed every night. It reads through 1 million rows in one file and adds corresponding new rows to a second file.
That means that you will be producing R/PT journal entries one million times for that second file. It also means that i5/OS™ will need to climb the execution stack from the SLIC layer of the operating system, upwards through the i5/OS layer, and ultimately to the user program layer one million times to determine and capture the name of your program. And what will the answer be? It will be the same program name time after time! As you can quickly recognize, this is a somewhat time-consuming chore that probably provides limited debug value.
Instead, if you truly felt the need to capture the program name, you could produce a single user-flavored journal entry by employing the SNDJRNE CL command at the beginning of the batch run so you can deposit the identity and version of the program you are using into the journal and not bother asking i5/OS to climb the stack to harvest this value one million more times.
The result? Efficiency!
How much savings could I glean
It is never easy talking folks into giving up something they have always had, whether they truly use it or not. A more convincing argument can often be made by quantifying the savings they could achieve by taking such actions. So, we set out to do just that.
We put together a simple set of steps you can take to analyze your own journal receivers and determine how much space you are consuming today by utilizing the standard wrapper. If that turns out to be minimal, do not bother trying to trim it. On the other hand, if that turns out to be a noticeable amount for your shop, then you might want to consider putting your journal on a diet.
A standard wrapper includes a few pieces of metadata that you cannot sacrifice. Those pieces that you can trim tend to take up 46 bytes of the total 96 bytes normally present by default within the header area of most journal entries. Hence, a batch job generating one million R/PT journal entries would also bring along 46 * 1,000,000 = 46 Megabytes of extra (but optional) space.
Ten of the 46 bytes are used to identify the user profile, ten more to identify the program, and finally twenty-six to identify the job.
Of the three pieces of default metadata collected in the wrapper, the program name is clearly the most costly to harvest from a CPU perspective while the job name consumes the largest quantity of bytes. Hence, anytime you can sacrifice either of these values, you may reap a performance benefit.
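The arithmetic above can be checked with a few lines of Python. This is only a sketch: the field sizes are the ones quoted in the text, while the dictionary and function names are invented for illustration.

```python
# Sizes of the optional header fields, in bytes, as quoted in the text.
# The names used here are illustrative only, not i5/OS identifiers.
OPTIONAL_FIELD_BYTES = {
    "user profile": 10,
    "program": 10,
    "job": 26,
}


def optional_bytes(entry_count):
    """Bytes of optional header metadata carried by entry_count full-header entries."""
    return entry_count * sum(OPTIONAL_FIELD_BYTES.values())


if __name__ == "__main__":
    total = optional_bytes(1_000_000)
    print(f"{total:,} bytes of optional metadata")
```

For the nightly batch job in the example, one million entries at 46 bytes each comes to 46,000,000 bytes, which is the 46 megabytes mentioned above.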
How much space could I save by omitting the descriptive metadata
If you would like a customized analysis of the space savings possible for one of the journal receivers in your shop, there is a simple query you can execute.
To discover how much space would be saved by omitting this optional metadata, start by displaying your journal in an outfile with the following CL command:
DSPJRN JRN(lib/jrn) OUTPUT(*OUTFILE) OUTFILE(lib/outfile)
Next, run the following SQL statement to determine the number of bytes consumed by the extra 46 bytes associated with each journal entry:
SELECT COUNT(*) * 46 AS BytesReduced FROM lib/outfile
WHERE JOJOB != '*OMITTED'
If you wonder what percentage of space is represented by this optional metadata, you need to know the total disk space consumed by the journal entries you have just analyzed. You can get this value by executing the following SQL statement:
SELECT SUM(JOENTL) FROM lib/outfile
Armed with these two values, you can then easily calculate the percentage of space these pieces of optional metadata consume by dividing the result of the first select statement by the result of the second and, of course, multiplying by 100.
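To make the calculation concrete, here is a small Python sketch that runs the same two queries against an in-memory SQLite table standing in for the DSPJRN outfile. It is an illustration under stated assumptions, not something to run on the system itself: SQLite is not DB2 for i5/OS, the table holds only the two columns the queries touch (JOJOB and JOENTL), and the job name and entry lengths in the sample rows are invented.

```python
import sqlite3

OPTIONAL_BYTES = 46  # optional header bytes per full-header entry, per the text


def optional_metadata_percent(conn):
    """Return (bytes_reduced, total_bytes, percent) for a table named outfile."""
    # First query from the text: count only entries that still carry a job name.
    bytes_reduced = conn.execute(
        f"SELECT COUNT(*) * {OPTIONAL_BYTES} FROM outfile WHERE JOJOB != '*OMITTED'"
    ).fetchone()[0]
    # Second query from the text: total length of all entries analyzed.
    total_bytes = conn.execute("SELECT SUM(JOENTL) FROM outfile").fetchone()[0]
    return bytes_reduced, total_bytes, 100.0 * bytes_reduced / total_bytes


def demo():
    """Build a tiny stand-in outfile with made-up rows and analyze it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE outfile (JOJOB TEXT, JOENTL INTEGER)")
    conn.executemany(
        "INSERT INTO outfile VALUES (?, ?)",
        [
            ("NIGHTLY", 200),   # hypothetical user entries with full headers
            ("NIGHTLY", 200),
            ("NIGHTLY", 200),
            ("*OMITTED", 100),  # hypothetical system entry with a short header
        ],
    )
    return optional_metadata_percent(conn)


if __name__ == "__main__":
    reduced, total, percent = demo()
    print(f"{reduced} of {total} bytes are optional metadata ({percent:.1f}%)")
```

With these made-up rows, three entries qualify, so 138 of the 700 bytes (just under 20 percent) are optional metadata. Your own percentage depends entirely on the mix of entries in your receiver.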
Obviously, the narrower your average database row width, the more influential and dramatic this savings is likely to be.
Why did we use the clause JOJOB != '*OMITTED'
You may have wondered why we included that curious clause in our first select statement.
Note that some journal entries, by default, have a full wrapper and some do not. While most journal entries related to SQL tables and database physical files (that is, those journal entries housing row images) do indeed have a wrapper that normally includes 46 bytes of extra metadata, a journal receiver often houses a set of additional interspersed entries contributed by the operating system itself. These are present to help ensure that the internal physical structure and statistics associated with objects remain intact. Some are also present to ensure that the System Managed Access Path Protection feature (SMAPP) can successfully recover your database indexes.
Both varieties are operating system-induced journal entries and as such have no need to drag along user profile, program, or job information, so they do not. Recovery is needed no matter which program or user profile made the change. As such, these entries already get by with omitting the extra metadata. Insiders say these journal entries have “short” headers. In a sense, the operating system-produced entries have always been on a diet!
What we are suggesting is that you may well be able to put the rest of your journal receiver on a diet and thereby accept shorter headers for ordinary journal entries as well, just like the operating system does for its entries.
The clause JOJOB != '*OMITTED' simply watched for such short headers (as evidenced by the fact that the job name had been omitted) and refrained from factoring them into our calculation of the number of times we could shave off 46 bytes.
How would you go about requesting “short” headers
So, let us assume you saw sufficient potential space savings from the calculations above that you would like to try your hand at trimming down your own journal entry headers. How do you start?
It turns out that there are two approaches. One lets you simply dispense with all three optional pieces of descriptive information. The other allows you to customize the journal entry header, discarding only the items you want to discard. Tossing all three is like diving into the deep end of the swimming pool. Discarding only a subset of the three is like only dipping your toe in the water. Let us look at both choices.
If you are willing to discard all three pieces of information (program name, job name, and user profile identity), execute the following CL command:
CHGJRN Jrn(mylib/myjrn) JrnRcv(*GEN) RcvSizOpt(*MAXOPT2 *RMVINTENT *MINFIXLEN)
The final value, *MINFIXLEN, is the significant one. It says “minimize the fixed-length data”. This is internal journal-speak. The so-called “fixed-length” data is the header section preceding each journal entry. That is where the three pieces of metadata we want to cease collecting are stored. By minimizing this leading header, we are instructing the operating system to refrain from acquiring and storing any of the three optional items of descriptive data. We are saying, in effect, we do not need any of the three pieces, so dispense with all of them.
On the other hand, let us say that you have found someone in your shop who claims they absolutely need to be able to display one of the three pieces of optional metadata but agrees that they do not truly need all three. While you cannot discard all 46 bytes from each journal entry, you might still be able to trim your journal entries.
Let us say you want to continue capturing the identity of the user profile but are willing to forgo capturing the program name and job name. The resulting command to express this desire would look something like this:
CHGJRN Jrn(mylib/myjrn) JrnRcv(*GEN) FixLenDta(*USR)
In this case, we are using the keyword FixLenDta to identify the kind of “Fixed Length Data” (that is, header data) we want to preserve.
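As a rough guide to what partial trimming buys you, the following Python sketch estimates the optional bytes dropped per entry when only some of the three fields are kept. It assumes that each omitted field shrinks the header by exactly the size quoted earlier (26 bytes for the job, 10 for the user profile, 10 for the program); the field keys and function name are invented for illustration and are not CHGJRN keyword values.

```python
# Optional header field sizes in bytes, as quoted earlier in the text.
# Keys are descriptive labels chosen for this sketch only.
FIELD_BYTES = {"job": 26, "user": 10, "program": 10}


def bytes_saved_per_entry(keep):
    """Optional header bytes dropped per entry when only the fields in keep are retained.

    Assumes each omitted field reduces the header by exactly its quoted size.
    """
    unknown = set(keep) - set(FIELD_BYTES)
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    return sum(size for name, size in FIELD_BYTES.items() if name not in keep)


if __name__ == "__main__":
    entries = 1_000_000
    for keep in ((), ("user",), ("job",)):
        saved = bytes_saved_per_entry(keep) * entries
        print(f"keep {keep or 'nothing'}: about {saved:,} bytes saved")
```

Under that assumption, keeping only the user profile still drops 36 of the 46 optional bytes from each entry, whereas keeping the job name drops only 20.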
Most journal entries have wrappers. They take up space. You burn CPU cycles to assemble such information. Unless you are truly using such metadata on a regular basis, the extra space and path-length may be overhead you could trim.
Trimming such journal entry headers can be a helpful way to assist your journal in "shedding a few pounds."
The operating system functions fine without such optional data. It is not needed for IPL, commitment control, SMAPP, or APYJRNCHG.
If you have no one in your shop who regularly depends on and views such data, it may be time to put your journal on a diet and trim a bit of fat. By doing so, you will reduce the quantity of traffic flowing across the communication line if you have a remote journal configuration, as well as reduce the quantity of tape needed to save your journal receivers. Along the way, you will also reduce the quantity of bytes that need to be written to disk.
Are the savings huge? No, they are only modest. But armed with the SQL select statements shown above, you can get a feel for the quantity of overhead you can save in your shop.
This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.