Use of Shared Engines for Coupling Facilities - Implementation and Impact

Abstract

This tip describes how the support for use of shared engines by Coupling Facility LPARs is implemented, and discusses the impact this may have on systems that use a CF with shared engines. Because we still receive many questions from customers asking if the use of a shared engine is appropriate for a given situation, we wrote this tip to help people understand the consequences.

Contents


There are many potential ways to configure a Coupling Facility (CF) for use in a Parallel Sysplex. This tip discusses the use of shared, rather than dedicated, engines for the CF LPAR. For other configuration considerations, review the White Paper entitled "Coupling Facility Configuration Options", available on the Web at:
http://www.ibm.com/servers/eserver/zseries/library/techpapers/pdf/gf225042.pdf

We make a number of assumptions in this tip:
  • We use the term "engine" to refer to a processing unit, regardless of whether it is a CP that can also be used by operating system LPARs, or an ICF, which can only run CFCC, and whether it is in a standalone processor (like a 2064-100) or a general purpose server, like a 2064-216.
  • We assume that you are using z/OS 1.2 or later. z/OS 1.2 introduced a heuristic algorithm that decides whether a given CF request should be driven as a synchronous or asynchronous request based on actual response times and the sysplex configuration. This change is discussed in WSC FLASH 10159.
  • We use the term LP to describe a logical partition, sometimes also referred to as an LPAR.

IBM recommends that dedicated engines should always be used for a CF whose response times are critical. The use of shared engines for the CF always impacts CF response times, compared to an equivalent dedicated engine. Increased synchronous response times in turn increase the overhead on the attached processors. So, while the use of dedicated engines for a CF may appear expensive in some cases, when you take into account the overhead impact on the attached processors, it may actually be a less expensive option than using shared engines (especially if the CF is sharing engines with operating system LPs).

CFs always run in an LP, regardless of whether the CF is in a standalone processor or in a general purpose server. Like all LPs, CF LPs can be assigned processing weights, and the weight of a CF LP plays an important role in the performance delivered by that CF.

However, unlike z/OS, which is interrupt-driven and therefore gives up a shared engine when it has no work to process, Coupling Facility Control Code (CFCC) runs in a polling loop, constantly looking to see if new work has arrived. Because CF response times are so short (typically around 100 or more times faster than DASD), running in this fashion allows the CF to deliver faster response times than if it were interrupt-driven. As a result, a CF LP will use as much of a shared engine as PR/SM will allow, even if it has no work to do.

In order to allow customers to set up a test CF LP that will use less resource, IBM introduced a feature known as Dynamic CF Dispatching (DCFD). If DCFD is turned OFF for a CF, the CF will run in a polling loop as before, ensuring good response times. A CF with DCFD turned off will never release a shared engine until the engine is taken away by PR/SM to be given to another LP. The default minimum timeslice for use of a shared engine is 12.5 milliseconds. As a result, we always recommend turning DCFD OFF for a production CF that is using shared engines. (The exception is if you define a CF LPAR that is using the Dynamic ICF Expansion feature to let it use a dedicated engine and a shared engine - in this case, you are not allowed to turn DCFD off.)

If DCFD is turned ON for a CF, when the CF LP is dispatched, CFCC will wake up and check for waiting work. If there is no work for it to do, it will set a timer and go back to sleep, releasing the shared engine. The longer the CF goes without having any work to do, the longer the amount of time it will sleep, up to a maximum of 20 milliseconds. On the other hand, if that CF has a lot of work to do, the time it will sleep will decrease (to a minimum of 5 milliseconds), eventually reverting back to a polling loop if the amount of work increases sufficiently.
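The sleep behavior described above can be sketched as a small model. This is illustrative only: the 5 and 20 millisecond bounds come from the text, but the doubling and halving rule, the `busy_streak` counter, and the `poll_after` threshold are assumptions, because this tip does not describe the actual CFCC algorithm.

```python
# Illustrative model only. The 5 ms and 20 ms bounds come from the text;
# the doubling/halving rule and the polling threshold are assumptions.
MIN_SLEEP_MS = 5.0
MAX_SLEEP_MS = 20.0
POLL = 0.0  # hypothetical marker meaning "stay in the polling loop"


def next_sleep_ms(current_ms: float, found_work: bool, busy_streak: int,
                  poll_after: int = 3) -> float:
    """Return the next sleep interval after one wake-up of the CF LP.

    busy_streak counts consecutive wake-ups that found work waiting
    (a hypothetical measure of "the amount of work").
    """
    if found_work:
        # Sustained work: stop sleeping and revert to the polling loop.
        if busy_streak >= poll_after:
            return POLL
        # Otherwise wake up sooner next time, but never below the minimum.
        return max(MIN_SLEEP_MS, current_ms / 2)
    # No work waiting: sleep longer next time, up to the maximum.
    base = current_ms if current_ms > 0 else MIN_SLEEP_MS
    return min(MAX_SLEEP_MS, base * 2)
```

In this model an idle CF drifts toward the 20 millisecond maximum, while a CF that keeps finding work shortens its sleep to 5 milliseconds and then returns to polling.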

DCFD is turned on or off for a CF by using the DYNDISP command on the CF console on the HMC. If you change the DCFD setting from the default, the CF will remember it until you make a change to the CF LPAR that would require a different DCFD setting. The following table shows the default DCFD setting for various configurations.

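Server type              Engine configuration                 DCFD default   Can turn DCFD off?
Standalone CF            Dedicated engines                    N/A            N/A
Standalone CF            Shared engines                       OFF            Yes
Standalone CF            Dedicated engines + Shared engines   ON             No
General Purpose Server   Dedicated ICF                        N/A            N/A
General Purpose Server   Dedicated CP                         N/A            N/A
General Purpose Server   Shared CP                            ON             Yes
General Purpose Server   Shared ICF                           OFF            Yes
General Purpose Server   Dedicated ICF + Shared CP            ON             No
General Purpose Server   Dedicated ICF + Shared ICF           ON             No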

The other attribute that determines when an LP is given a shared engine, and how long it is allowed to keep that engine, is the weight per logical CP of the LP, relative to other users of the shared engines. If the CF LP is running in a standalone CF, or is using shared ICFs in a general purpose server, the weight is relative to the weights of other CF LPs. If the CF LP is using shared CPs in a general purpose server, the weight is relative to the weight of all other LPs sharing those CPs. Given equal demand for the shared engine, an LP with a weight of 90 will be given that engine for roughly 9 times as long as a competing LP with a weight of 10.
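The weight arithmetic in the previous paragraph can be worked through with a short sketch. It assumes equal, continuous demand from every LP, and the LP names PRODCF and TESTCF are hypothetical.

```python
from typing import Dict


def engine_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Fraction of the shared engine each LP is entitled to, given equal demand."""
    total = sum(weights.values())
    return {lp: weight / total for lp, weight in weights.items()}


# Hypothetical LP names; the weights 90 and 10 are the ones used in the text.
shares = engine_shares({"PRODCF": 90, "TESTCF": 10})
ratio = shares["PRODCF"] / shares["TESTCF"]
print(f"PRODCF share: {shares['PRODCF']:.0%}, TESTCF share: {shares['TESTCF']:.0%}")
print(f"PRODCF gets the engine about {ratio:.0f} times as long as TESTCF")
```

Note that only the ratio of the weights matters: weights of 900 and 100 would give the same shares.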

As the number of logical CPs in the LP increases, the weight per logical CP decreases, so it is important not to define more logical CPs than necessary. Weights also matter when CFs share an engine: if you plan on sharing an engine between a production and a test CF, it is not sufficient to turn DCFD ON for the test CF and OFF for the production one. You must also set the weights to ensure that the production CF will be protected should the test CF get very busy and revert to polling mode.

When PR/SM is deciding who should have access to a shared engine, two attributes of the candidate LPs are taken into account:
  1. The weight per logical CP (based on the weight of the LP and the number of logical CPs defined for the LP)
  2. How much of its relative share the LP has actually used
If an LP has a low weight, but has used even less than its fair share (as determined by its weight), it will have a higher chance of being given an engine than an LP that has used all of, or more than, its fair share. Because the production CF will be running in the polling loop, it will tend to use at least all its fair share. If the test CF is using less than its fair share, and its timer pops, it stands a good chance of PR/SM taking the shared engine away from the production CF and giving it to the test CF. However, if the utilization of the test CF starts increasing, it gets closer to its fair share, and therefore has a reduced chance of preempting the production CF.
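The interaction between weight and consumed share can be illustrated with a deliberately simplified model. This is not the PR/SM algorithm: the `LP` class, the `used_share` field, and the rule "the LP furthest below its fair share wins" are assumptions chosen only to mirror the reasoning above, and the LP names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LP:
    name: str
    weight: int
    logical_cps: int
    used_share: float  # fraction of the shared engine recently consumed (0..1)


def weight_per_logical_cp(lp: LP) -> float:
    """More logical CPs dilute the weight that each one carries."""
    return lp.weight / lp.logical_cps


def fair_share(lp: LP, lps: List[LP]) -> float:
    """Share of the engine the LP is entitled to, based on relative weight."""
    return lp.weight / sum(other.weight for other in lps)


def pick_next(lps: List[LP]) -> LP:
    """Assumed rule: favor the LP that has used the least of its fair share."""
    return min(lps, key=lambda lp: lp.used_share / fair_share(lp, lps))


# The production CF polls, so it consumes its whole fair share.
prod = LP("PRODCF", weight=90, logical_cps=1, used_share=0.90)
quiet_test = LP("TESTCF", weight=10, logical_cps=1, used_share=0.02)
busy_test = LP("TESTCF", weight=10, logical_cps=1, used_share=0.15)

print(pick_next([prod, quiet_test]).name)  # quiet test CF is far below its share
print(pick_next([LP("PRODCF", 90, 1, 0.85), busy_test]).name)
```

In the first case the lightly used test CF is chosen even though its weight is low; in the second, the test CF has exceeded its fair share and the production CF is favored.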

If you are sharing an engine between an operating system LP and a CF LP, the only way to get acceptable response time for the CF LP is to turn DCFD OFF, and give that LP a high weight relative to the operating system LPs it is sharing the engines with. However, this will have the effect of driving up apparent overall CPU utilization, and have a negative impact on the operating system LPs, which must now wait for the CF LP to use up its timeslice. In addition, because PR/SM knows that the CF and the operating system are sharing engines, all synchronous requests are converted under the covers to asynchronous ones. This is not reported in RMF; however, you will see very long synchronous response times for that CF. For these reasons, plus the fact that you have to pay software licences for the capacity being used by a CF that is using general purpose engines, this is really not a recommended configuration.

Starting with z/OS 1.2, z/OS takes into account the response time of a CF when deciding if a new request should be sent synchronously or asynchronously. Because the response time for a CF with DCFD turned ON will generally be so high, nearly all requests sent to those CFs will be asynchronous. Prior to z/OS 1.2, when the CF response time was not taken into account, it was likely that a synchronous request would be sent to a CF that was not currently dispatched, resulting in one of the logical CPs of that z/OS LP spinning, waiting for the CF to get dispatched and respond to it. If you were really unfortunate, you could send a number of synchronous requests to one of these CFs, resulting in a number of your logical CPs spinning (1 per request) until the CF answered, some milliseconds later. Given that "good" CF response times on current hardware are between 10 and 50 microseconds, having to wait milliseconds for a response is a very wasteful use of z/OS resources.
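A rough calculation shows why millisecond waits are so costly for synchronous requests. The sketch assumes that each synchronous request keeps one logical CP spinning for the full response time, as described above. The request rate of 1,000 per second and the 5 millisecond response time are hypothetical figures chosen for illustration; the 50 microsecond figure is the upper end of the "good" range quoted in the text.

```python
def cpu_seconds_spinning(requests_per_sec: float, response_time_us: float) -> float:
    """CPU seconds consumed per elapsed second by logical CPs spinning
    on synchronous CF requests (1.0 means one full engine)."""
    return requests_per_sec * response_time_us / 1_000_000


RATE = 1_000  # hypothetical synchronous request rate per second

good = cpu_seconds_spinning(RATE, 50)      # 50 microseconds, dedicated engine
slow = cpu_seconds_spinning(RATE, 5_000)   # 5 milliseconds, CF not dispatched

print(f"Spinning cost at 50 us: {good:.2f} engines")
print(f"Spinning cost at 5 ms:  {slow:.2f} engines")
```

Under these assumptions the same workload costs a twentieth of an engine at 50 microseconds and five full engines at 5 milliseconds, which is the kind of overhead the z/OS 1.2 heuristic avoids by driving such requests asynchronously.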

While z/OS 1.2 has addressed the very high overhead associated with the use of CFs with shared engines, it cannot do anything to address the fact that response times from CFs with shared engines and DCFD turned ON are typically 20 to 100 times slower than from CFs with dedicated engines. Also, even though the use of asynchronous requests addresses the impact of long synchronous response times, there is still significantly more overhead in processing an asynchronous request than in processing a short synchronous one.

An additional consideration is the use of shared CF engines with System Managed CF Structure Duplexing. Because enabling duplexing already impacts the response time of the duplexed structures, the additional elongation of response times as a result of using shared engines would probably result in response times that are not acceptable for production environments. Therefore, we recommend that you do not use shared engines in any CF that will contain structures duplexed with System Managed CF Structure Duplexing.

To summarize, the use of shared engines for CFs impacts CF response times, and is unlikely to deliver CF response times that would be acceptable for a production data sharing environment. Lock structures in particular should not be placed in a CF that will be using shared engines. Except for this restriction on lock structures, the use of shared engine CFs would probably provide acceptable performance and overhead for a resource sharing environment.

For additional information about the use of CFs with shared engines, refer to PR/SM Planning Guide, GA22-7236. For additional information about how PR/SM manages LP weights and logical CPs, refer to the section entitled "LPAR Dispatching and Shared CPs" in the IBM Redbook z/OS Intelligent Resource Director, SG24-5952.

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.

Profile

Publish Date
03 September 2003


IBM Form Number
TIPS0237