In the past few months I have heard of a few incidents in which sudden overuse of a bufferpool and pageset resulted in significant slowdowns, application outages, or queue manager shutdowns. The incidents are similar: an unexpected surge of messages arrived on a queue, which caused the pageset to go into expansion and eventually to fill. The problem was that the same bufferpool and pageset were also used for some system operational queues, like the SYSTEM.COMMAND.INPUT queue.
The SYSTEM.COMMAND.INPUT queue defaults to using the SYSVOLAT storage class, which maps to pageset 3. This pageset is also used by the REMOTE storage class, which is the default STGCLASS value for the SYSTEM.CLUSTER.TRANSMIT.QUEUE and for other transmission queues. While we have long recommended that the SYSTEM.CLUSTER.TRANSMIT.QUEUE be placed in a dedicated storage class with its own bufferpool and pageset, that recommendation is often much easier to make than to implement, especially before MQ V8.
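To see whether a queue manager is exposed in this way, a few MQSC display commands are usually enough. The following is a rough sketch rather than a complete audit; it assumes the default mapping described above (pageset 3), so substitute the pageset number your own definitions use.

```
* Which storage classes map to pageset 3?
DISPLAY STGCLASS(*) PSID(03)

* Which storage class do the command queue and the
* cluster transmission queue use?
DISPLAY QLOCAL(SYSTEM.COMMAND.INPUT) STGCLASS
DISPLAY QLOCAL(SYSTEM.CLUSTER.TRANSMIT.QUEUE) STGCLASS

* How full is pageset 3 right now?
DISPLAY USAGE PSID(03)
```

If the command queue and any application-driven queue (a transmission queue or a dead letter queue, for example) resolve to the same pageset, the scenario described here is possible.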
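The problem arises when the pageset fills and commands to the queue manager can no longer be put on the command input queue. At that point the commands you would normally use to diagnose and relieve the problem cannot get through either, and it can get really awful from there.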
We’ve also recently seen this when a customer used the REMOTE storage class for their dead letter queue. The queue manager had not had a problem for the 15 years of its production life, until there was a surge of messages placed on the DLQ because of an application problem. Again, the underlying pageset filled up and no commands could be issued.
The recommendation is simple: isolate the SYSTEM.COMMAND.INPUT queue and a couple of other control queues in a separate storage class that points to a unique pageset and bufferpool. That way a problem with an application, a channel, or the cluster cannot take away your ability to control the queue manager.
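As a minimal sketch of what that isolation might look like in MQSC: the storage class name SYSCTL, the choice of bufferpool 5 and pageset 5, and the buffer count are all hypothetical values picked for illustration. The sketch also assumes that bufferpool 5 and pageset 5 are not already in use, and that the pageset data set has been allocated, formatted, and added to the queue manager's started task JCL.

```
* In the CSQINP1 initialization input data set:
* a bufferpool and pageset reserved for control queues
DEFINE BUFFPOOL(5) BUFFERS(1000)
DEFINE PSID(05) BUFFPOOL(5)

* In CSQINP2, or issued as commands:
* a storage class that points only at the new pageset
DEFINE STGCLASS(SYSCTL) PSID(05) +
       DESCR('Control queues only')

* Move the command queue to the new storage class
ALTER QLOCAL(SYSTEM.COMMAND.INPUT) STGCLASS(SYSCTL)
```

Bear in mind that a storage class change on a queue generally takes effect only once the queue is empty and closed, so for a queue the command server holds open you should expect to need a queue manager restart; check the documentation for your release before relying on this.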
Recommended queues for the new storage class: