IBM Technical University in Baltimore

Greetings all,

For those of you in the Baltimore/Washington DC area, there is an IBM Technical University event being held at the Hilton. This covers many z-centric topics, including 3 sessions on MQ.

The site for the event is:

https://www-304.ibm.com/services/learning/ites.wss/zz-en?pageType=page&c=J603289R25742I89

This is NOT a marketing event but a technical one; you can see the complete agenda here:

https://ibmtechu.com/cgi-bin/cms/printA3.cgi?myevent=Baltimore2016

There is a fee for the event, but discounts are available if your company is sending more than one person. Please contact me (elkinsc@us.ibm.com) if you need the correct discount code, or contact your local IBM account representative for assistance.

Thanks,
Lyn

New in MQ V9 SMF data

For years I have been complaining about the log manager data – the checkpoint count (QJSTLLCP) only included the number of times the LOGLOAD limit, defined as a queue manager attribute, had been reached. It did not include, nor was there any way to tell from the SMF data, the checkpoints that were taken as a result of log switches.

Well, that has changed in V9, and it is not yet documented. QJSTLLCP is now a count of all checkpoints. I found this by accident when processing SMF data from a test that used very small log files.

This will be documented in Part 2 of “Using the data from MQSMFCSV,” which I hope to get published in the next month.  In the meantime the MQ Log manager data showing this change looks like this:

[Image: log manager checkpoint counts]

Getting started using MQSMFCSV – Part I

Using the data from MQSMFCSV_Part I

Greetings! This document is the first of three planned on using the output from MQSMFCSV. The first document includes 5 simple examples of how I have used the tool to extract information from the formatted MQ SMF data using SQL. The second document will take some of the traditional statistics (Message Manager, Log Manager, Buffer pool) and create charts from spreadsheets – I may add a couple of other samples as well. The third will include some additional complex queries, some of which have come from our customers.

These papers will also be published with MQSMFCSV on github. The tool can be found here: https://github.com/ibm-messaging/mq-smf-csv

 


How often should we issue CFSTRUCT BACKUP commands?

The frequency and location of your BACKUP CFSTRUCT command should be driven by a few things:
a) Log turnover – if the average time between log switches is long, every half an hour for example, and you manage to keep 2+ hours of logs readily available on DASD for all queue managers in the QSG, then running a back-up every half-hour is probably fine. If, however, your average log switch rate is much higher, especially if you are seeing a lot of long-running UOWs and log shunting messages, then I would recommend running the BACKUP CFSTRUCT every 10 minutes.
b) How much data are you normally seeing written out after a BACKUP command? When looking at this, I use a percentage of the structure size. For example, if the structure is 100G (we wish!) and the past BACKUP commands have written out an average of 5% or less, then I would feel more comfortable setting the back-up interval to a half hour, as long as I was not getting a lot of log turnover from messages going to private queues or other structures. If the BACKUP command resulted in an average of 50G every execution, then I would back up much more frequently.
c) Are you using SMDS and Flash memory? I would take both these factors into consideration when deciding on the time between BACKUP commands. If they are in use, then I feel more comfortable with a longer interval.
d) Do you have an ‘administrative queue manager’ where these commands can be run without impacting the logs where application work is being done? If the BACKUP command is going to cause impact to the applications processing messages, then less frequently might work better for your environment.
e) How long can you wait for a structure to be restored in the event of something dreadful? If your back-up is an hour old and there are 50 log files to go through to recover from each queue manager, it is going to take noticeably more time than if there are only 3.

There is no ‘one size fits all’ answer; the decision on the frequency has to weigh a number of factors, and I am sure there are some I missed. IMHO – every half an hour as a minimum, though I think the official Hursley recommendation is every hour. I am old and paranoid, and am constantly trying to reduce recovery time, because you don’t need these measures on a good day.
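
For reference, the command itself is a single line of MQSC, and DISPLAY CFSTATUS with TYPE(BACKUP) shows when each structure was last backed up. A minimal sketch, assuming an application structure named APPL1 (the name is just a placeholder):

BACKUP CFSTRUCT(APPL1)
DISPLAY CFSTATUS(APPL1) TYPE(BACKUP)

Whatever interval you choose, automating the command – for example from the ‘administrative queue manager’ mentioned in d) – keeps it from being forgotten.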

MQ Clients and z/OS Queue managers – or why is that CIO yelling at me?

This is a tale of licensing and expectations; a consolidation of the experiences of many. A tale of technical decisions vs. licensing.
To start with a simple and straightforward statement: connecting MQ client applications to z/OS queue managers works quite well, and there can be many processing advantages. Foremost is the continuous message availability associated with shared queues, but there are other reasons, including simplification of the infrastructure.
Having said that, some customers have experienced significant sticker shock after implementing direct client attachments to z/OS queue managers – even after they have been warned. When MQ clients connect to any queue manager, whether z/OS or distributed, the CPU cost of the MQ API requests is absorbed by the channel initiator (on z/OS) or its equivalent process on the distributed platforms. MQ on z/OS is typically an MLC (monthly license charge) product, while MQ on the distributed platforms is OTC (one time charge). The monthly license charge is based on use, which in its simplest form is CPU consumption.
If a client application is well behaved – that is, it connects once and processes many requests before disconnecting – the costs are more predictable and controllable. Some years ago my team measured the cost difference between a locally attached application and a very well behaved client-attached application, and found the CPU cost to be about 17% higher for the client – all of that coming from the channel initiator address space. That was on old hardware and an old release of MQ (V7.0.1, I believe).
If a client is not well behaved, then the CPU use, and therefore the cost, is unpredictable. In one particularly horrible example, a customer saw their MLC charges rise by a significant amount in a single month due to a poorly behaved MQ client application. Like many, their first client application had been well behaved and caused just a ripple in increased costs. Their second application was not quite so well behaved, but not bad enough to gain attention. By this time the customer had conveniently forgotten the advice to implement a ‘client concentrator queue manager on distributed’ to absorb the expensive MQCONN requests, and implemented their third MQ client application. This application followed that well known and expensive model of MQCONN->MQOPEN->MQPUT->MQCLOSE->MQDISC followed by MQCONN->MQOPEN->MQGET->MQCLOSE->MQDISC. In a single month their MLC bill went up well over 30% and a very angry CIO was calling me. Fortunately I could point to the recommended topology we had created for them three years previously, which included distributed client concentrator queue managers, and explain why we made that recommendation. They had chosen to ignore the client concentrator advice because they did not want to pay for a couple of ‘unnecessary’ distributed licenses.
The specific API requests vary in CPU consumption as well. The most expensive is usually the MQCONN or MQCONNX, as the CHIN and queue manager do a lot of work to set up the connection. The second most expensive is likely an MQOPEN of a temporary dynamic queue; again, there is a lot of work going on within the queue manager to set up the queue. Others are typically less expensive, but they can add up – especially when misused, like using an MQPUT1 to put multiple messages to a single queue.

So the best advice is to know the applications. Make sure they use connection pooling if it is available, and that they are coded to use the CPU-expensive verbs as sparingly as possible. That is true for any platform, and doubly true for z/OS.
Another word of advice: if you are planning new workload – connecting new applications to your z/OS queue managers – then talk to your IBM sales rep about options for new workload.

Update to the MQ/CICS Properties sample code –

One of the most useful SupportPacs for MQ on z/OS was a Message Broker (now IIB) SupportPac called IP13. It contained a couple of very useful utilities that allowed queues to be loaded, CPU to be tracked, etc. This SupportPac was removed at some point – we don’t know exactly when – but we had a number of TechNotes and lab exercises that used the tools provided by IP13, in particular a program called OEMPUTX.
Colin Paice was gracious enough to add the OEMPUT program into SupportPac MP1B (note there is no X on this newer version). There are some slight differences between this version and the older implementation, so we have begun altering the TechNotes and lab exercises we deliver to use the new OEMPUT version.
The first of these is the sample COBOL CICS/MQ program that applies a message property to messages as they are copied from one queue to another. It can be found here:

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5189 
The MP1B SupportPac can be found here:

http://www.ibm.com/support/docview.wss?rs=171&uid=swg24005907&loc=en_US&cs=utf-8&lang=en

Other COBOL CICS/MQ samples that the Washington Systems Center has provided include:

QPU2 and QSU2 – Topics with embedded blanks
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS4852

The Wise MQ for z/OS Administrator – Defining a request and reply queue

In discussing what an MQ administrator has to know and consider as part of the day-to-day job at IBM, it became apparent that there are many clearly defined steps, and some where the wise and experienced admin has to think about many issues beyond just using the ISPF panels, MQ Explorer, or runmqsc to simply define an object or two.  Some decisions have the potential to be rules driven.  But it seems to me, and to others I have spoken with, that there are many decisions that cannot be as easily quantified, and that depend on the administrator’s knowledge of the environment, the application owner, and the applications themselves.  In addition, even the things that have the potential to be rules driven would require a substantial effort to create feedback and tracking that are not in MQ today.

This is also much more of a challenge for shared resources, which MQ almost invariably is.  That is, the queue managers own and manage queues, connections, and messages for many applications.  This is the typical usage model for MQ on z/OS and, in the case of UNIX and Linux, an ever-increasing usage model.  It will also be the model for the MQ Appliance.

I am sure I have missed some things that need to be considered, especially on the distributed side.  But I hope to use this as the starting point for a ‘wise MQ administrator’ series that will be useful to others in the future.

So, as an example, assume an MQ administrator who controls queue managers on multiple platforms receives a request to add a pair of queues for a request/reply scenario: a new mobile-hosted interface into an existing CICS transaction.  The flow of the messages would look like this:

[Diagram: Request/Reply MQ message flow sample]

 

The decision-making process would include, but certainly not be limited to:

1) Naming standards

Does the requesting application and/or the serving application have a naming standard in place?

If so, do the suggested names meet the standard?

If the suggested names meet the standard, go to 2.

If not,

Alter the suggested names to fit the standard and inform the requester of the ‘approved’ name once the queue creation is complete. Note that this may require application changes, if the application developers have been allowed to provision objects themselves during initial development phases. Go to 2.

Or, Reject the request with suggestions for names that will fit with the standards.

Or, create alias queues with the suggested names for use by the application, and use approved names for the real queues.  Note that this may impact the security requirements. Go to 2.

Or, if there is no standard for the queue names:

Create a standard naming convention for the application queues based on patterns currently in use or based on the application’s requested names.  Note that security needs to be considered; if the suggestions are something like REQUEST.whatever and REPLY.whatever, they will not work well with security profiles.  This step may require interaction between the application team(s), the security admins (for each platform the integration goes through), and the MQ admin.  Remember that the security model on z/OS is very different from the distributed platforms.

Note that the creation of a new standard can be as simple as checking with a couple of people, or it can require negotiation between the teams.  While this step should have been done in the application design phase, it is often not considered until it is necessary to define resources.

Or, there are standards in place but they are very different for the different platforms and application components.  This typically requires queue alias definitions to be added, in addition to the base resources.
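
As a simple illustration of the alias approach (the names here are invented), the application keeps the name it asked for while the real queue follows the approved standard:

DEFINE QLOCAL(PROD.MOBILE.CICS.REQUEST) DESCR('Approved standard name')
DEFINE QALIAS(NEWAPP.REQUEST) TARGET(PROD.MOBILE.CICS.REQUEST) DESCR('Name requested by the application')

Remember that the alias name itself still needs security profiles, as noted above.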

2) Where will the objects reside?

In this scenario, the request queue ultimately must reside on a z/OS queue manager, and the reply queue must ultimately reside on the Linux on Intel queue manager.   So the administrator must define the basic queues, but may have to define other objects as well.

Step 1 – Identify the CICS regions that host the transaction (service).  This requires working with the CICS administrator to identify the target region(s) and get feedback about the workload and availability characteristics.

Step 2 – Identify the queue manager(s) associated with the CICS region(s) hosting the transaction.

If the CICS region(s) is (are) not connected to a queue manager, then the creation task list just expanded to include connecting the CICS region(s) to MQ.  This will require the assistance of the CICS administrator.

Step 3 – Identify the queue manager(s) on Linux on Intel that will host the mobile connections and reply queue(s).

 

3)  Connectivity

Step 1 –  Do the identified queue managers already exist in a cluster?

If yes, the fewest definitions required are:

The request queue and any aliases on z/OS, including cluster membership

The reply queue and any aliases on Linux on Intel, including cluster membership

Proceed to the next set of considerations

Step 2 –  Do the queue managers already have channels defined?

If yes, the definitions required include:

On the z/OS queue manager:

The request queue

The reply queue remote definition using the existing transmission queue

On the Linux on Intel queue manager:

The reply queue

The request queue remote definition using the existing transmission queue

If no, the definitions required include:

On the z/OS queue manager:

The request queue

The transmission queue to the Linux on Intel queue manager

The reply queue remote definition using the new transmission queue

The sender channel to the Linux on Intel queue manager

The receiver channel from the Linux on Intel queue manager

On the Linux on Intel queue manager:

The reply queue

The transmission queue to the z/OS queue manager

The request queue remote definition using the new transmission queue

The sender channel to the z/OS queue manager

The receiver channel from the z/OS queue manager
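
As a sketch of the ‘no existing channels’ case, assuming queue manager names MQZ1 (z/OS) and MQL1 (Linux on Intel), with queue names, connection names, and ports invented for illustration:

* On MQZ1 (z/OS)
DEFINE QLOCAL(MOBILE.REQUEST)
DEFINE QLOCAL(MQL1) USAGE(XMITQ)
DEFINE QREMOTE(MOBILE.REPLY) RNAME(MOBILE.REPLY) RQMNAME(MQL1) XMITQ(MQL1)
DEFINE CHANNEL(MQZ1.TO.MQL1) CHLTYPE(SDR) TRPTYPE(TCP) CONNAME('linuxhost(1414)') XMITQ(MQL1)
DEFINE CHANNEL(MQL1.TO.MQZ1) CHLTYPE(RCVR) TRPTYPE(TCP)

* On MQL1 (Linux on Intel), the mirror image
DEFINE QLOCAL(MOBILE.REPLY)
DEFINE QLOCAL(MQZ1) USAGE(XMITQ)
DEFINE QREMOTE(MOBILE.REQUEST) RNAME(MOBILE.REQUEST) RQMNAME(MQZ1) XMITQ(MQZ1)
DEFINE CHANNEL(MQL1.TO.MQZ1) CHLTYPE(SDR) TRPTYPE(TCP) CONNAME('zoshost(1414)') XMITQ(MQZ1)
DEFINE CHANNEL(MQZ1.TO.MQL1) CHLTYPE(RCVR) TRPTYPE(TCP)

In the clustered case from Step 1, most of this collapses: the transmission queue, remote queue, and channel definitions are not needed, and the request and reply queues just need the CLUSTER attribute added to their definitions.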

 

4) Is triggering the CICS transaction required?

If no, proceed to # 5

Yes, the transaction is expected to be triggered:

What is the triggering scheme?

If First, then make sure the application reads until the queue is empty and issues a get-wait for a reasonable period of time.

If Every, make sure the applications understand the cost of this style of processing.

Is the CICS region already hosting triggered transactions?

If no, then on z/OS:

Define an initiation queue to be used by the CICS region

Define the MQ process that will be used, which includes the name of the transaction that will be triggered.

Create an instance of the CICS trigger monitor (CKTI transaction), pointing to the initiation queue.

Update the queue definition with the triggering information.
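
A sketch of those z/OS definitions, using invented names (TRN1 stands in for the CICS transaction, and the CKTI instance itself is started from the CICS side):

DEFINE QLOCAL(CICS1.INITQ) DESCR('Initiation queue for the CICS trigger monitor')
DEFINE PROCESS(MOBILE.REQUEST.PROCESS) APPLTYPE(CICS) APPLICID('TRN1')
ALTER QLOCAL(MOBILE.REQUEST) TRIGGER TRIGTYPE(FIRST) INITQ(CICS1.INITQ) PROCESS(MOBILE.REQUEST.PROCESS)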

5) Queue and message characteristics gathering:

This includes anticipated workload information, along with availability requirements that can affect the physical location of the queues.    This is an area where experience with the applications and application owners, along with knowledge of the existing message volumes, peaks, SLAs, environment limitations, etc., feeds back into the administrator’s decisions on queue placement.  Other requirements for both high and continuous availability must also be considered.

As an example, if the applications group consistently under-estimates their volume and capacity requirements, a wise administrator would typically add more than the standard reserve capacity.  If this is an entirely new application, he will probably double the estimates and assume their peak times will coincide with another top-tier application’s peak.  He may choose to set up special resource pools on z/OS to host the queues to isolate the new workload from existing work.  He may even conclude that the anticipated workload will not fit in current queue managers (on either platform) and must be hosted on new ones.

The information required for this includes:

Message size – Min, max, and average.

Message rates – Are there anticipated peak times?  How spiky is the workload?

What is the anticipated growth rate?

Maximum Message Depth –

How many uncommitted messages will be held before they are committed?

Are there requirements to hold messages for a period of time in the event of an emergency?

Are there service level agreements on the response time or QoS for the messages?

Message persistence – should the queue definition specify persistence as the default?

Message expiration – will the application be taking advantage of this feature?

Are there serialization requirements, or can multiple instances of the serving transaction be deployed?

What type of transaction is being executed – inquiry only, or an update?

Availability requirements

Does the Linux on Intel queue manager need to be multi instance?

Does the z/OS queue manager need to be in a queue sharing group?  Does the request queue need to be shared?

Security requirements –

Does the request queue need to be put inhibited for applications on the z/OS queue manager?

If yes, then alias queue definitions are required to provide the granular security.

If this is an extension of an existing application, are there rules in place to protect the queues based on the existing naming scheme?

If not, then security decisions need to be made and, typically, rules developed on both platforms to protect the MQ objects.
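
Much of this feeds directly into the queue attributes. Under these assumptions the request queue definition sketched earlier might grow into something like this (the values are illustrative, not recommendations):

DEFINE QLOCAL(MOBILE.REQUEST) +
       DESCR('Mobile request queue for CICS transaction TRN1') +
       MAXDEPTH(100000) +
       MAXMSGL(4194304) +
       DEFPSIST(YES) +
       PUT(ENABLED) GET(ENABLED)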

 

6)  Queue Sharing Group considerations for queue definitions

If there are very high message availability requirements, placing the request queue on the coupling facility is necessary.  The wise administrator will need to think about the following:

  1. Is there sufficient capacity in an existing Coupling Facility structure and the links to the CFs based on the volume and peak estimates?
    1. If no, then a new structure must be added. This will include:
      1. Defining the structure to a CF with available storage, typically done by the z/OS Sysprog.
      2. Determining the offload type (DB2 or SMDS) and rules.
    2. Am I adding a new queue manager to the Queue Sharing group to support the workload?
      1. If yes, then the MQ admin and z/OS Sysprog need to think about:
        1. Is the queue manager on a new LPAR, or on one that has not been connected to the SYSPLEX before? If so, there are a number of new objects that must be defined in the ‘plex.
        2. Additional space may be needed in the MQ Admin structure to support an additional queue manager.
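
If the shared-queue route is taken, the MQ-side definitions are relatively small once the structure exists in the CFRM policy. A sketch with invented names (the SMDS data set prefix is a placeholder):

DEFINE CFSTRUCT(MOBILAPP) +
       CFLEVEL(5) RECOVER(YES) +
       OFFLOAD(SMDS) DSGROUP('MQPROD.SMDS.MOBILAPP.*') +
       DESCR('Structure for the mobile request queue')
DEFINE QLOCAL(MOBILE.REQUEST) QSGDISP(SHARED) CFSTRUCT(MOBILAPP)

RECOVER(YES) is also what makes the structure eligible for the BACKUP CFSTRUCT processing discussed earlier.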

 

7) Private queue (on z/OS) considerations for queue definitions

If the z/OS defined queues can be private, then the MQ administrator needs to consider the following:

  1. Is there sufficient capacity in an existing bufferpool and pageset to accommodate the anticipated workload without impacting existing high priority workload?
    1. If no, then a new bufferpool and pageset are necessary.
      1. Is there enough virtual storage to allocate a new bufferpool? Review the storage messages (CSQY220I).
      2. If not, then the queue manager may need to be upgraded to V8, or a new queue manager may be required.
    2. If there is storage available:
      1. Create the DEFINE BUFFPOOL command to be added to the CSQINP1 input to queue manager startup.
      2. Define and format the pageset.
      3. Create the DEFINE STGCLASS command to be added to the CSQINP2 input to queue manager startup.
  2. Set the queue definition to use the storage class associated with the new bufferpool and pageset (see the sketch after this list).
  3. Please note that for high volume applications, the request queue and the transmission queue used for the reply should be in different bufferpools.
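
A sketch of those pieces with invented names and sizes; the pageset data set itself is allocated and formatted outside of these commands (for example with CSQUTIL):

* CSQINP1 – buffer pool and page set (LOCATION(ABOVE) needs V8)
DEFINE BUFFPOOL(10) BUFFERS(50000) LOCATION(ABOVE)
DEFINE PSID(10) BUFFPOOL(10)
* CSQINP2 – storage class tying queues to the new page set
DEFINE STGCLASS(MOBILE) PSID(10) DESCR('Storage class for the mobile request workload')
* The queue definition then points at the new storage class
ALTER QLOCAL(MOBILE.REQUEST) STGCLASS(MOBILE)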

 

Starting again

After an interlude from the world of blogging about MQ for z/OS performance, it is time to start again. I am going to repost my article on the wise MQ Administrator defining a request and reply queue. I hope it helps people.

As time goes on, I will expand on the theme of wise administration especially as more companies are looking to automate more administration tasks.