Last Sunday’s extended service outage… Logging!

I promised to blog about last Sunday’s extended service outage and what caused it. First, here’s what you KNOW not to do, right?

1. Do not make a configuration change to a production service without first doing it in a development or preproduction environment. No matter HOW small. I mean it.

2. Never EVER make more than one change at a time. One line. One setting. One clustered application server. ONE. Only ONE. (Okay…CAVEAT: Unless you do it first in an identical non-production environment and document exactly what you did, and in what order, so that you can do it exactly that way in production.)

So, breaking these two rules (because I didn’t do EXACTLY this in a non-production environment first), here’s what I did:

1. I swapped out a config.xml file correcting an SSL reference and removing double references to one of our Nodes. No big deal by itself.

2. I swapped out the original file written in SP3 with my edited version, which changed the logging for the webct.log from size-based to time-based. Here’s the snippet:

# A2 is set to be a file appender
# Original line commented out:
# Next line rolls the log at midnight and noon everyday
# log4j.appender.A2.MaxFileSize=5MB
#log4j.appender.A2.layout.ConversionPattern=%d %-5p [%t] [%3x] %-17c{2} –

3. I also changed the webserver.log and WebCTManagedNodeN.log files in the WebLogic console from “BySize” to “ByTime” rotation.
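That's three changes in one go. A quick way to see how many changes a swapped-in file actually carries is to diff it against the original before it goes anywhere near production. Here's a minimal Python sketch using the standard library's difflib; the function name and the sample text are made up for illustration and are not part of any Bb or BEA tooling.

```python
import difflib


def changed_lines(original_text, edited_text):
    """Return the lines removed from, or added to, the original version."""
    diff = difflib.ndiff(original_text.splitlines(), edited_text.splitlines())
    # ndiff prefixes removals with "- " and additions with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are dropped here.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical before/after text, NOT the real config from this post
original = "a=1\nb=2\nc=3"
edited = "a=1\nb=5\nc=3"

for line in changed_lines(original, edited):
    print(line)
```

If the list comes back with more than one removed/added pair, that's more than one change, and rule #2 says to stop and split it up.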


With these changes in place, the cluster won’t start; it throws an error about this line:

log4j.appender.A2.MaxBackupIndex=20

…so, don’t do that. (And Randal Dalhoff had hinted at this in an email to me earlier in the week!)
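In log4j 1.x, MaxFileSize and MaxBackupIndex are options of the size-based RollingFileAppender; the time-based DailyRollingFileAppender doesn't define them. The snippet above doesn't show the appender class line, so the sketch below assumes the time-based appender is org.apache.log4j.DailyRollingFileAppender. It scans a properties file for size-only settings left behind on a time-based appender. The function name and the sample properties are hypothetical, not the actual file from this outage.

```python
import re

# Options that only log4j 1.x's size-based RollingFileAppender defines
SIZE_ONLY_PROPERTIES = {"maxfilesize", "maxbackupindex"}

# Assumption: this is the class used for time-based rotation
TIME_BASED_CLASS = "org.apache.log4j.DailyRollingFileAppender"

APPENDER_LINE = re.compile(
    r"log4j\.appender\.([^.=\s]+)(?:\.([^=\s]+))?\s*=\s*(.*)$"
)


def find_leftover_size_settings(properties_text):
    """Return (appender, property) pairs that set size-only options
    on an appender declared with the time-based class."""
    classes = {}   # appender name -> declared class
    settings = []  # (appender name, property name)
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        match = APPENDER_LINE.match(line)
        if not match:
            continue
        name, prop, value = match.groups()
        if prop is None:
            classes[name] = value.strip()  # e.g. log4j.appender.A2=<class>
        else:
            settings.append((name, prop))
    return [
        (name, prop)
        for name, prop in settings
        if classes.get(name) == TIME_BASED_CLASS
        and prop.lower() in SIZE_ONLY_PROPERTIES
    ]


# Illustrative sample only; the DatePattern shown rolls at midnight and midday
SAMPLE = """\
log4j.appender.A2=org.apache.log4j.DailyRollingFileAppender
log4j.appender.A2.DatePattern='.'yyyy-MM-dd-a
# log4j.appender.A2.MaxFileSize=5MB
log4j.appender.A2.MaxBackupIndex=20
"""

print(find_leftover_size_settings(SAMPLE))
```

Run against the sample, this flags the MaxBackupIndex line on A2 and ignores the commented-out MaxFileSize line. An empty list means no size-only leftovers were found.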

And BEA 9.2 has a known issue with #3 (CR287029), in the HTTP logging part (webserver.log). When a custom HTTP log file implementation is in use (a .jar, i.e., the Bb Vista app) and you change the logging through the UI, the console blanks out the extensions field (under Advanced options). That causes Java null pointer exceptions, which will also prevent your Nodes from starting. …So, don’t do that either.

Thanks to Joel Diamant-Helpern of Bb support for finding the BEA bug here.

If you do want to change your webserver.log and your WebCTManagedNodeN.log to rotate based on time rather than size, OPEN the Advanced options for HTTP and insert your ELF fields there.

These would be supported by the .jar:

date time time-taken c-ip x-weblogic.servlet.logging.ELFWebCTSession sc-status cs-method cs-uri-stem cs-uri-query bytes x-weblogic.servlet.logging.ELFWebCTExtras
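Since the console can silently blank that field, it may be worth checking config.xml for an empty field list before restarting any Nodes. The Python sketch below is only that, a sketch: it assumes each server's HTTP log settings live in a web-server-log element with log-file-format and elf-fields children, so verify those element names against your own config.xml first. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def servers_missing_elf_fields(config_xml_text):
    """Return names of servers that use the extended HTTP log format
    but have a missing or empty ELF field list."""
    root = ET.fromstring(config_xml_text)
    missing = []
    for server in root.iter():
        if local_name(server.tag) != "server":
            continue
        name = None
        for child in server:
            if local_name(child.tag) == "name":
                name = (child.text or "").strip()
        for log in server.iter():
            # Assumed element names: web-server-log, log-file-format, elf-fields
            if local_name(log.tag) != "web-server-log":
                continue
            log_format = fields = ""
            for item in log:
                if local_name(item.tag) == "log-file-format":
                    log_format = (item.text or "").strip()
                elif local_name(item.tag) == "elf-fields":
                    fields = (item.text or "").strip()
            if log_format == "extended" and not fields:
                missing.append(name)
    return missing


# Hypothetical sample, NOT a real Vista config.xml
SAMPLE_XML = """
<domain>
  <server>
    <name>NodeA</name>
    <web-server><web-server-log>
      <elf-fields>date time time-taken c-ip sc-status</elf-fields>
      <log-file-format>extended</log-file-format>
    </web-server-log></web-server>
  </server>
  <server>
    <name>NodeB</name>
    <web-server><web-server-log>
      <elf-fields></elf-fields>
      <log-file-format>extended</log-file-format>
    </web-server-log></web-server>
  </server>
</domain>
"""

print(servers_missing_elf_fields(SAMPLE_XML))
```

In the sample, only NodeB is reported, because its field list is empty. Any server this reports would need its ELF fields put back under the Advanced options before you try to start it.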

One thought on “Last Sunday’s extended service outage… Logging!”

  1. Wow… Thanks for #3. Our logs are read by AWStats. Apparently, the rolling by size creates a problem that rolling by date would help solve. Not being able to start up the nodes would be really bad.
