Profiling Suricata with JEMALLOC

JEMALLOC is a memory allocation library: http://www.canonware.com/jemalloc/

It offers many interesting things for a tool like Suricata. Ken Steele of EZchip (formerly Tilera) made me aware of it. In Ken’s testing it helps performance.

Install

wget http://www.canonware.com/download/jemalloc/jemalloc-3.6.0.tar.bz2
tar xvfj jemalloc-3.6.0.tar.bz2
cd jemalloc-3.6.0
./configure --prefix=/opt/jemalloc/
make
sudo make install

Then use it by preloading it:

LD_PRELOAD=/opt/jemalloc/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

I haven’t benchmarked this myself, but if you’re running a high performance setup it may well be worth a shot.

Profiling

The library comes with many interesting profiling and debugging features.

make clean
./configure --prefix=/opt/jemalloc-prof/ --enable-prof
make
sudo make install

Start Suricata like this:

LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

Running it like this we don’t see any change yet, as we still need to tell jemalloc what we want through the MALLOC_CONF environment variable.

Exit stats

Setting stats_print:true dumps a lot of statistics to the screen on exit:

MALLOC_CONF=stats_print:true LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

Memory leak checks

MALLOC_CONF=prof_leak:true,lg_prof_sample:0 LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v
[... suricata output ...]
<jemalloc>: Leak summary: 2011400 bytes, 4523 objects, 645 contexts
<jemalloc>: Run pprof on "jeprof.22760.0.f.heap" for leak detail

Then use the pprof tool that comes with jemalloc to inspect the dumped stats.

$ /opt/jemalloc-prof/bin/pprof --show_bytes ./src/suricata jeprof.22760.0.f.heap
Using local file ./src/suricata.
Using local file jeprof.22760.0.f.heap.
Welcome to pprof!  For help, type 'help'.
(pprof) top
Total: 2011400 B
1050112  52.2%  52.2%  1050112  52.2% PacketGetFromAlloc
600064  29.8%  82.0%   600064  29.8% SCProfilePacketStart
103936   5.2%  87.2%   103936   5.2% SCACCreateDeltaTable
65536   3.3%  90.5%    66192   3.3% pcap_fopen_offline
35520   1.8%  92.2%    35520   1.8% ConfNodeNew
26688   1.3%  93.6%    26688   1.3% __GI___strdup
20480   1.0%  94.6%    20480   1.0% MemBufferCreateNew
20480   1.0%  95.6%    20480   1.0% _TmSlotSetFuncAppend
14304   0.7%  96.3%    14304   0.7% pcre_compile2
14064   0.7%  97.0%    25736   1.3% SCPerfRegisterQualifiedCounter

So it seems we don’t properly clean up our packet pools yet.

Create a PDF of this info:

$ /opt/jemalloc-prof/bin/pprof --show_bytes --pdf ./src/suricata jeprof.22760.0.f.heap > jemalloc.pdf

Dumping stats during runtime

Dump stats after every 16MiB of allocations (lg_prof_interval:24 means every 2^24 bytes, i.e. 16MiB).

I’ve done this in a separate directory since it dumps many files.

$ mkdir jemalloc-profile
$ cd jemalloc-profile/
$ MALLOC_CONF="prof:true,prof_prefix:victor.out,lg_prof_interval:24" LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ../src/suricata -c ../suricata.yaml -l ../tmp/ -r ~/sync/pcap/sandnet.pcap -S ../emerging-all.rules -v

Then you should see new *.heap files appear, many during startup; after some time the rate slows down.

You can then visualize the diff between two dumps:

$ /opt/jemalloc-prof/bin/pprof --show_bytes --pdf ../src/suricata --base victor.out.24159.150.i150.heap victor.out.24159.200.i200.heap > jemalloc.pdf

This creates a PDF of the 200th dump taking the 150th dump as a baseline. As we dump every ~16MiB, this covers about 50 * 16 = 800MiB worth of allocations.

Further reading

http://www.canonware.com/jemalloc/
https://github.com/jemalloc/jemalloc/wiki
https://github.com/jemalloc/jemalloc/wiki/Use-Case%3A-Heap-Profiling

Many thanks to Ken Steele for pointing me to the lib and providing me with some good examples.

Detecting malloc(-1) or malloc(0xffffffff)

In Suricata we often don’t print malloc errors. The reason is that we’re not willing to print such errors based on (attacker controlled) traffic, so such cases are usually handled silently.

We came across a bug though, where an integer underflow led to -1/0xffffffff being passed to malloc. Luckily, malloc just failed by returning NULL, and this return was properly handled. Still, passing such a large value to malloc is a bug, so I would like to catch it.
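
To illustrate the pattern (a hypothetical sketch, not the actual Suricata code): a length calculation on unsigned integers underflows, and the result ends up as a huge allocation request.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the bug pattern, not actual Suricata code. */
void *CopyPayload(const uint8_t *buf, uint32_t buf_len, uint32_t hdr_len)
{
    /* If hdr_len > buf_len, this unsigned subtraction underflows to a huge
     * value such as 0xffffffff, which is then passed to malloc(). */
    uint32_t data_len = buf_len - hdr_len;

    void *copy = malloc(data_len);
    if (copy == NULL) {
        /* The failure is handled silently, so the underflow goes unnoticed. */
        return NULL;
    }
    memcpy(copy, buf + hdr_len, data_len);
    return copy;
}

A check like if (hdr_len > buf_len) before the subtraction would keep the bogus size from ever reaching malloc; the question below is how to automatically detect it when such a check is missing.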

This turned out to be trickier than I thought.

valgrind says: ==10274== Warning: silly arg (-1) to malloc(). However, it does not count it as an error. So calling valgrind with --error-exitcode=255 (my usual choice) doesn’t work.

glibc’s malloc responds to an environment variable, MALLOC_CHECK_ (I tried values 0 to 7), but this didn’t catch it at all.
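
For reference, this is how I set it (just an example invocation; as said, it didn’t flag the bogus size):

MALLOC_CHECK_=3 ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v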

AddressSanitizer also detects it as something non-fatal: ==18885== WARNING: AddressSanitizer failed to allocate 0xffffffffffffffff bytes. It’s not fatal like the usual errors, and there is no non-zero exit code.

tcmalloc, preloaded through LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.0, also just prints something to stderr: tcmalloc: large alloc 0 bytes == (nil) @

I also tried electric-fence, but it failed to work for me altogether.

As I wanted this check to be part of my QA, I needed an automated way to do it. In this case however, I saw no other way than to inspect the stderr output of one of the tools above. My choice was tcmalloc, as it’s fast and doesn’t require compile-time options.
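
A minimal sketch of such a QA check, assuming the warning text shown above and a hypothetical test pcap (test.pcap):

LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.0 ./src/suricata -c suricata.yaml -l tmp/ -r test.pcap -v 2> stderr.log
if grep -q "tcmalloc: large alloc" stderr.log; then
    echo "FAIL: suspiciously large allocation detected"
    exit 1
fi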

Suricata debugging

If you’re running into issues with Suricata, it may be worth spending some time looking at the debugging options.

To enable the debugging code, pass "--enable-debug" to configure.

./configure --enable-debug

Then run make and make install again. Make sure that during compilation you see -DDEBUG in the gcc commands.

Then, to actually enable it at runtime, set the SC_LOG_LEVEL environment variable:

SC_LOG_LEVEL=Debug

Depending on how you run the engine, this will output massive amounts of debugging info. That’s why we added a pcre regex filter option:

SC_LOG_OP_FILTER=regex

The regex is currently case sensitive and is matched against the full debug line. For example, if you want to see only output related to the HTP module, do something like:

SC_LOG_LEVEL=Debug SC_LOG_OP_FILTER="htp" suricata -c suricata.yaml -r /path/to/file.pcap

Or maybe you want the stream messages as well:

SC_LOG_LEVEL=Debug SC_LOG_OP_FILTER="(htp|stream)" suricata -c suricata.yaml -r /path/to/file.pcap

You can also control the logging format by passing the SC_LOG_FORMAT environment variable. By default it’s set to "[%i] %t - (%f:%l) <%d> (%n) -- ".

The following format specifiers are available:

%t timestamp
%p process id (pid)
%i thread id
%m thread module name
%d log level
%f filename
%l line number
%n function name

Example:

SC_LOG_FORMAT="[%i] %t - (%f:%l) <%d> (%n) -- "

Putting it all together:

SC_LOG_LEVEL=Debug SC_LOG_FORMAT="[%i] %t - (%f:%l) <%d> (%n) -- " SC_LOG_OP_FILTER="(htp|stream)" suricata -c suricata.yaml -r /path/to/file.pcap

If you have any questions or suggestions, let me know!