Sunday, May 30, 2010

How to create Window's dump files with unique names

We wanted to configure our server in Windows environment, so when one of its process crashes it will produce a dump file.

Our first try was Dr.Watson but we quickly found out that it creates all dump files with the same name i.e., in case there are several crashes only the last one will have a dump file. This was unacceptable.

After googling on this for a while I found this useful article that talks about NTSD (NT Symbolic Debugger).
This debugger has several advantages:
  1. it creates a unique dump file for each process that crashes
  2. it does not need a special installation as it is included in Windows installation (in C:\WINDOWS\system32)
Note: It might be understood from the article above that the NTSD version that comes with Windows Server 2003 does not support unique dump file names but it does.

Here is how to install NTSD:
  1. make sure your C code doe not catch exceptions (so the exception is handled by Windows code)
  2. open "regedit"
  3. go to "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug"
  4. choose "Debugger"
  5. set: "ntsd.exe" -p %ld -e %ld -g -c ".dump /ma /u c:\app.dmp;q"
this will create dump files with names like "app_2010-05-30_08-07-25-406_145C.dmp" in "c:\" folder

you can test the configuration by running CrashMe application.

Friday, May 21, 2010

Field experience: Java Util Logger preformance tunning

I needed to run one of our Java server in production environment with full traces enabled.
The server would crash under load, due to out of memory error. I've looked into the heap dump that was produced, with eclipse MemoryAnalayzer, and found that the memory outage was due to overflow in the incoming message queue. I concluded that running with traces damaged the throughput and caused the outage.

First, I wanted to measure the throughput without traces and found it is takes 0.18 ms to process a message. Then I run with full traces and measure that it took 20 ms to process a message, more than 100 times longer.

My next step was to review the code related to the logger initialization and formatter. I found that when the logger was initialized then it was using a ConsoleHandler (as well as FileHandler) by default while our server runs as NT service and have no console. This code was insert during development phase but was not needed in production. After removing this code, I measured 9.32 ms per message i.e., now it takes only 50 times longer.

Next I tried to change the way we format a trace message e.g., In the formater, we had code that calculated the name of the method from the call stack. Similar, we had code that calculated the class name from the logger record. both were designed to remove the burden form the programmer to write the class name and method each time he writes a trace message and let this information be appended automatically. After removing this code, I measure 1.94 ms per message i.e., now it takes 10 times longer. This code change means that the programmers will have to fix all existing message to include class and method names, and all future traces should be added with this information as well. This is a heavy task as hundreds of trace line should be manually fix but the impact of this change is an improvement by 500%. in case this change will enable running with traces in production than I think its worth the work.

Next, I replaced the time calculation that is appended to each trace line and put System.currentTime, but this change had a minor impact of 1.69 ms per message.
I had to decide if the burden of running a special tool to transfer the currentTime millisecond to date and daily time each time a trace is investigated is worth the minor throughput improvement and decided no to go with this change.

Here is a graph that summarize the results: the blue graph depict the improvements of average time of processing message due to code change and the pink graph is the average time of processing a message without traces.
the Vertical axis is the average time it took to process a message and the horizontal axis is the code changes I did:
  1. original code (name "base" in the graph)
  2. removal of tracing to console (named "console" in the graph)
  3. removal of class and method name from the formatter. (named "class and methods" in the graph)
  4. replacing Date with System.currentTime (named "currentTime" in the graph)


I've conducted a load test against our application with two set ups: one with the original logging we had and the other with logging version were I removed the logging to console and the class and method name from the formatter.
The result of the tests show that in the first set up the application was not available 100% of the time while in the second set up it was not available only 15% of the time. the maximum response time was ~10 minutes and the average response time was 24.6 seconds.

Friday, May 14, 2010

Java Logger with SimpleFormatter is ten time slower than OutputStream



I compared the time it takes to write a log file with several Java I/O classes and found that Logger takes 10 times longer than OutputStream.

I used OutputStream, RandomAccessFile and Logger to write a trace file of million lines. each line contained 250 characters and a prefix of the writing time. I repeated and collected the results 7 times.

The graph below summarize the results: vertical axe is the time it took to write the file in milliseconds and the horizontal is the sampling number.
Here is the I/O setup I used :
FileInputStream #1 - PrintWriter with auto flush enabled
FileInputStream #2 - PrintWriter with auto flush disabled
RandomAccessFile #1 - RandomAccessFile which write to file after each line
RandomAccessFile #2 - RandomAccessFile which store 10,000 lines in a buffer before writing to file
RandomAccessFile #3 - RandomAccessFile which store 200,000 lines in a buffer before writing to file
Logger #1 - Java util logger (JUL) with SimpleFormatter
Logger #2 - JUL with a Formatter that append the time to each message
Logger #3 - JUL with a Formatter that append the time and class name to each message