To extend the time span covered by the traces, I made several changes:
- Reduce the number of messages traced, i.e. "less is more": I lowered the log level of many trace messages to the finest level and kept only the most important messages at info level. The finest-level messages can still be used during development, but they are not written in production environments.
- Compress the messages before writing them to the log file.
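The first change can be illustrated with a minimal sketch. It assumes Python's standard `logging` module, which has no built-in "finest" level, so `FINEST` is registered here as a hypothetical custom level below `DEBUG`. The logger names and messages are made up for illustration and are not taken from the original system.

```python
import io
import logging

# Hypothetical custom level below DEBUG (10), standing in for "finest".
FINEST = 5
logging.addLevelName(FINEST, "FINEST")


def make_logger(name, level, stream):
    """Create a logger that writes '<LEVEL> <message>' lines to stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Production: threshold at INFO, so finest-level messages are dropped.
production_out = io.StringIO()
production = make_logger("trace.production", logging.INFO, production_out)
production.log(FINEST, "entering parse loop")
production.info("request handled")

# Development: threshold at FINEST, so every message is kept.
development_out = io.StringIO()
development = make_logger("trace.development", FINEST, development_out)
development.log(FINEST, "entering parse loop")
development.info("request handled")
```

With the same two calls, the production logger writes only the info message, while the development logger writes both.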
Here are the compression algorithms I tested:
- The Dictionary: when tracing a line, we first split it into substrings and look each one up in a hash table. If the substring exists, the table returns the integer associated with it; otherwise the substring is inserted as a key mapped to a new unique integer. The hash table is also written to a file, which acts as a dictionary. The trace itself contains only integers.
- Zip: when tracing a string, we buffer it in memory, and once we have 100 messages we zip them into a byte array. We then write the array size followed by its content.
- The Dictionary & Zip: when tracing a string, we use method 1 to replace each substring with an integer, then buffer the integers in memory. Once we have 1000 integers we zip them into a byte array. We then write the array size followed by its content.
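The third method, which combines the first two, could look roughly like the sketch below. This is not the original implementation. It assumes lines are split on whitespace, integers are stored as 4-byte little-endian values, and `zlib` (DEFLATE) stands in for "zip". The names `TraceDictionary`, `ZipBatchWriter` and `read_blocks` are hypothetical.

```python
import io
import struct
import zlib


class TraceDictionary:
    """Method 1: map each substring to a unique integer."""

    def __init__(self):
        self.ids = {}  # substring -> integer id

    def encode(self, line):
        # Assumption: substrings are whitespace-separated tokens.
        # A new token gets the next unused integer (the current table size).
        return [self.ids.setdefault(tok, len(self.ids)) for tok in line.split()]


class ZipBatchWriter:
    """Buffer items; compress each full batch; write its size, then its content."""

    def __init__(self, out, batch_size):
        self.out = out
        self.batch_size = batch_size
        self.pending = []

    def add(self, payload):
        self.pending.append(payload)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        block = zlib.compress(b"".join(self.pending))
        self.out.write(struct.pack("<I", len(block)))  # array size
        self.out.write(block)                          # array content
        self.pending.clear()


def read_blocks(data):
    """Yield the decompressed bytes of each size-prefixed block."""
    offset = 0
    while offset < len(data):
        (size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        yield zlib.decompress(data[offset:offset + size])
        offset += size


# Method 3: dictionary-encode each line, then zip the integers in batches.
lines = ["connection opened to host-a", "connection closed to host-a"]
dictionary = TraceDictionary()
out = io.BytesIO()
writer = ZipBatchWriter(out, batch_size=1000)
for line in lines:
    for token_id in dictionary.encode(line):
        writer.add(struct.pack("<I", token_id))
writer.flush()  # write the final, partially filled batch

# Reading back: decompress the blocks and map integers to substrings again.
raw = b"".join(read_blocks(out.getvalue()))
ids = [i for (i,) in struct.iter_unpack("<I", raw)]
reverse = {v: k for k, v in dictionary.ids.items()}
decoded = [reverse[i] for i in ids]
```

Method 2 alone would feed the UTF-8 bytes of each message to `ZipBatchWriter` with `batch_size=100` instead of the packed integers. The sketch does not record line boundaries or write the dictionary to a file, both of which a real trace reader would need.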
Here are the results I got:
Measurement | No compression | Method-1 | Method-2 | Method-3 |
Trace file size (KB) | 343 | 65 | 29 | 10 |
Compression ratio | 1 | 5.27 | 11.82 | 34.3 |
In the chart, the X-axis is the number of the compression algorithm, in the order described above.
The Y-axis is the compression ratio, calculated as (size of the trace file with no compression) / (size of the trace file with compression method x).
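As a quick check of that formula, the ratios can be recomputed from the file sizes in the table. The variable names in this small sketch are only for illustration.

```python
# Trace file sizes in KB, copied from the results table.
sizes_kb = {
    "No compression": 343,
    "Method-1": 65,
    "Method-2": 29,
    "Method-3": 10,
}

baseline = sizes_kb["No compression"]

# Compression ratio = uncompressed size / compressed size.
ratios = {name: baseline / size for name, size in sizes_kb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

The computed values are about 5.2769, 11.8276 and 34.3, so the table's 5.27 and 11.82 correspond to these ratios truncated, not rounded, to two decimals.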
I plan to post soon:
- measurements of the performance degradation caused by method 3
- measurements of how much the time coverage improved by reducing the number of trace messages.