Suricata bits, ints and vars

Since the beginning of the project we’ve spoken about variables on multiple levels. Of course flowbits defined by the Snort language came first, but other flow based variables quickly followed: flowints for basic counting, and vars for extracting data using pcre expressions.

I’ve always thought of the pcre data extraction using substring capture as a potentially powerful feature. However the implementation was lacking. The extracted data couldn’t really be used for much.

Internals

Recently I’ve started work to address this. The first thing that needed to be done was to move the mapping between variable names, such as flowbit names, and the internal id’s out of the detection engine. The detection engine is not available in the logging engines and logging of the variables was one of my goals.

This is a bit tricky as we want a lock less data structure to avoid runtime slow downs. However rule reloads need to be able to update it. The solution I’ve created has a read only look up structure after initialization that is ‘hot swapped’ with the new data at reload.

PCRE

The second part of the work is to allow for more flexible substring capture. There are 2 limitations in the current code: first, only single substring can be captured per rule. Second, the names of the variables were limited by libpcre. 32 chars with hardly any special chars like dots. The way to express these names has been a bit of a struggle.

The old way looks like this:

pcre:"/(?P.*)<somename>/";

This create a flow based variable named ‘somename’ that is filled by this pcre expression. The ‘flow_’ prefix can be replaced by ‘pkt_’ to create a packet based variable.

In the new method the names are no longer inside the regular expression, but they come after the options:

pcre:"/([a-z]+)\/[a-z]+\/(.+)\/(.+)\/changelog$/GUR, \
    flow:ua/ubuntu/repo,flow:ua/ubuntu/pkg/base,     \
    flow:ua/ubuntu/pkg/version";

After the regular pcre regex and options, a comma separated lists of variable names. The prefix here is ‘flow:’ or ‘pkt:’ and the names can contain special characters now. The names map to the capturing substring expressions in order.

Key/Value

While developing this a logical next step became extraction of key/value pairs. One capture would be the key, the second the value. The notation is similar to the last:

pcre:"^/([A-Z]+) (.*)\r\n/G, pkt:key,pkt:value";

‘key’ and ‘value’ are simply hardcoded names to trigger the key/value extraction.

Logging

Things start to get interesting when logging is added. First, by logging flowbits existing rulesets can benefit.

{
  "timestamp": "2009-11-24T06:53:35.727353+0100",
  "flow_id": 1192299914258951,
  "event_type": "alert",
  "src_ip": "69.49.226.198",
  "src_port": 80,
  "dest_ip": "192.168.1.48",
  "dest_port": 1077,
  "proto": "TCP",
  "tx_id": 0,
  "alert": {
    "action": "allowed",
    "gid": 1,
    "signature_id": 2018959,
    "rev": 2,
    "signature": "ET POLICY PE EXE or DLL Windows file download HTTP",
    "category": "Potential Corporate Privacy Violation",
    "severity": 1
  },
  "http": {
    "hostname": "69.49.226.198",
    "url": "/x.exe",
    "http_user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
    "http_content_type": "application/octet-stream",
    "http_method": "GET",
    "protocol": "HTTP/1.1",
    "status": 200,
    "length": 23040
  },
  "vars": {
    "flowbits": {
      "exe.no.referer": true,
      "http.dottedquadhost": true,
      "ET.http.binary": true
    }
  }
}

When rules are created to extract info and set specific ‘information’ flowbits, logging can create value:

"vars": {
  "flowbits": {
    "port/http": true,
    "ua/os/windows": true,
    "ua/tool/msie": true
  },
  "flowvars": {
    "ua/tool/msie/version": "6.0",
    "ua/os/windows/version": "5.1"
  }
}
"http": {
  "hostname": "db.local.clamav.net",
  "url": "/daily-15405.cdiff",
  "http_user_agent": "ClamAV/0.97.5 (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)",
  "http_content_type": "application/octet-stream",
  "http_method": "GET",
  "protocol": "HTTP/1.0",
  "status": 200,
  "length": 1688
},
"vars": {
  "flowbits": {
    "port/http": true,
    "ua/os/linux": true,
    "ua/arch/x64": true,
    "ua/tool/clamav": true
  },
  "flowvars": {
     "ua/tool/clamav/version": "0.97.5"
  }
}

In the current code the alert and http logs are showing the ‘vars’.

Next to this, a ‘eve.vars’ log is added, which is a specific output of vars independent of other logs.

Use cases

Some of the use cases could be to add more information to logs without having to add code. For example, I have a set of rules that set of rules that extracts the packages are installed by apt-get or for which Ubuntu’s updater gets change logs:

"vars": {
  "flowbits": {
    "port/http": true,
    "ua/tech/python/urllib": true
  },
  "flowvars": {
    "ua/tech/python/urllib/version": "2.7",
    "ua/ubuntu/repo": "main",
    "ua/ubuntu/pkg/base": "libxml2",
    "ua/ubuntu/pkg/version": "libxml2_2.7.8.dfsg-5.1ubuntu4.2"
  }
}

It could even be used as a simple way to ‘parse’ protocols and create logging for them.

Performance

Using rules to extract data from traffic is not going to be cheap for 2 reasons. First, Suricata’s performance mostly comes from avoiding inspecting rules. It has a lot of tricks to make sure as little rules as possible are evaluated. Likewise, the rule writers work hard to make sure their rules are only evaluated if they have a good chance of matching.

The rules that extract data from user agents or URI’s are going to be matching very often. So even if the rules are written to be efficient they will still be evaluated a lot.

Secondly, extraction currently can be done through PCRE and through Lua scripts. Neither of which are very fast.

Testing the code

Check out this branch https://github.com/inliniac/suricata/pull/2468 or it’s replacements.

Bonus: unix socket hostbits

Now that variable names can exist outside of the detection engine, it’s also possible to add unix socket commands that modify them. I created this for ‘hostbits’. The idea here is to simply use hostbits to implement white/blacklists. A set of unix socket commands will be added to manage add/remove them. The existing hostbits implementation handles expiration and matching.

To block on the blacklist:

drop ip any any -> any any (hostbits:isset,blacklist; sid:1;)

To pass all traffic on the whitelist:

pass ip any any -> any any (hostbits:isset,whitelist; sid:2;)

Both rules are ‘ip-only’ compatible, so will be efficient.

A major advantage of this approach is that the black/whitelists can be
modified from ruleset themselves, just like any hostbit.

E.g.:

alert tcp any any -> any any (content:"EVIL"; \
    hostbits:set,blacklist; sid:3;)

A new ‘list’ can be created this way by simply creating a rule that
references a hostbit name.

Unix Commands

Unix socket commands to add and remove hostbits need to be added.

Add:

suricatasc -c "add-hostbit <ip> <hostbit> <expire>"
suricatasc -c "add-hostbit 1.2.3.4 blacklist 3600"

If an hostbit is added for an existing hostbit, it’s expiry timer is updated.

Hostbits expire after the expiration timer passes. They can also be manually removed.

Remove:

suricatasc -c "remove-hostbit <ip> <hostbit>"
suricatasc -c "remove-hostbit 1.2.3.4 blacklist"

Feedback & Future work

I’m looking forward to getting some feedback on a couple of things:

  • log output structure and logic. The output needs to be parseable by things like ELK, Splunk and jq.
  • pcre options notation
  • general feedback about how it runs

Some things I’ll probably add:

  • storing extracted data into hosts, ippairs
  • more logging

Some other ideas:

  • extraction using a dedicated keyword, so outside of pcre
  • ‘int’ extraction

Let me know what you think!

Suricata 3.0 is out!

suri-400x400Today, almost 2 years after the release of Suricata 2.0, we released 3.0! This new version of Suricata improves performance, scalability, accuracy and general robustness. Next to this, it brings a lot of new features.

New features are too numerous to mention here, but I’d like to highlight a few:

  • netmap support: finally a high speed capture method for our FreeBSD friends, IDS and IPS
  • multi-tenancy: single instance, multiple detection configs
  • JSON stats: making it much easier to graph the stats in ELK, etc
  • Much improved Lua support: many more fields/protocols available, output scripts

Check the full list here in the announcement: http://suricata-ids.org/2016/01/27/suricata-3-0-available/

New release model

As explained here, this is the first release of the new release model where we’ll be trying for 3 ‘major’ releases a year. We originally hoped for a month of release candidate cycles, but due to some issues found and the holidays + travel on my end it turned into 2 months.

My goal is to optimize our testing and planning to reduce this further, as this release cycle process is effectively an implicit ‘freeze’. Take a look at the number of open pull requests to see what I mean. For the next cycle I’ll also make the freeze explicit, and announce it.

Looking forward

While doing a release is great, my mind is already busy with the next steps. We have a bunch of things coming that are exciting to me.

Performance: my detection engine rewrite work has been tested by many already, and reports are quite positive. I’ve heard reports up to 25% increase, which is a great bonus considering the work was started to clean up this messy code.

ICS/SCADA: Jason Ish is finalizing a DNP3 parser that is very full featured, with detection, logging and lua support. Other protocols are also being developed.

Documentation: we’re in the process of moving our user docs from the wiki to sphinx. This means we’ll have versioned docs, nice pdf exports, etc. It’s already 180 pages!

Plus lots of other things. Keep an eye out on our mailing lists, bug tracker or IRC channel.

New Suricata release model

suri-400x400As the team is back from a very successful week in Barcelona, I’d like to take a moment on what we discussed and decided on with regards to development.

One thing no one was happy with is how the release schedules are working. Releases were meant to reasonably frequent, but the time between major releases was growing longer and longer. The 2.0 branch for example, is closing in on 2 years as the stable branch. The result is that many people are missing out on many of the improvements we’ve been doing. Currently many people using Suricata actually use a beta version, of even our git master, in production!

What we’re going to try is time based releases. Pretty much releases will be more like snapshots of the development branch. We think this can work as our dev branch is more and more stable due to our extensive QA setup.

Of course, we’ll have to make sure we’re not going to merge super intrusive changes just before a release. We’ll likely get into some pattern of merge windows and (feature) freezes, but how this will exactly play out is something we’ll figure out as we go.

We’re going to try to shoot for 3 of such releases per year.

In our redmine ticket tracker, I’ve also created a new pseudo-version ‘Soon’. Things we think should be addressed for the next release, will be added there. But we’ll retarget the tickets when they are actually implemented.

Since it’s already almost 2 years since we’ve done 2.0, we think the next release warrants a larger jump in the versioning. So we’re going to call it 3.0. The first release candidate will likely be released this week hopefully followed by a stable in December.

Suricata has been added to Debian Backports

Thanks to the hard work of Arturo Borrero Gonzalez, Suricata has just been added to the openlogo-100Debian ‘backports’ repository. This allows users of Debian stable to run up to date versions of Suricata.

The ‘Backports’ repository makes the Suricata and libhtp packages from Debian Testing available to ‘stable’ users. As ‘testing’ is currently in a freeze, it may take a bit of time before 2.0.5 and libhtp 0.5.16 appear.

Anyway, here is how to use it.

Install

First add backports repo to your sources:

# echo "deb http://http.debian.net/debian wheezy-backports main" > /etc/apt/sources.list.d/backports.list
# apt-get update

As explained here http://backports.debian.org/Instructions/, this will not affect your normal packages.

To prove this, check:

# apt-get install suricata -s
Conf libhtp1 (0.2.6-2 Debian:7.7/stable [amd64])
Conf suricata (1.2.1-2 Debian:7.7/stable [amd64])

Not what we want, as that is still the old version.

To install Suricata from backports, we need to specify the repo:

# apt-get install -t wheezy-backports suricata -s
Conf libhtp1 (0.5.15-1~bpo70+1 Debian Backports:/wheezy-backports [amd64])
Conf suricata (2.0.4-1~bpo70+1 Debian Backports:/wheezy-backports [amd64])

Let’s do it!

# apt-get install -t wheezy-backports suricata
...
Setting up suricata (2.0.4-1~bpo70+1) ...
[FAIL] suricata disabled, please adjust the configuration to your needs ... failed!
[FAIL] and then set RUN to 'yes' in /etc/default/suricata to enable it. ... failed!

Suricata 2.0.4 is now installed, but it’s not yet running.
To see what features have been compiled in, run:

# suricata --build-info
This is Suricata version 2.0.4 RELEASE

Suricata Configuration:
  AF_PACKET support:                       yes
  PF_RING support:                         no
  NFQueue support:                         yes
  NFLOG support:                           no
  IPFW support:                            no
  DAG enabled:                             no
  Napatech enabled:                        no
  Unix socket enabled:                     yes
  Detection enabled:                       yes

  libnss support:                          yes
  libnspr support:                         yes
  libjansson support:                      yes
  Prelude support:                         yes
  PCRE jit:                                yes
  LUA support:                             yes
  libluajit:                               yes
  libgeoip:                                no
  Non-bundled htp:                         yes
  Old barnyard2 support:                   no
  CUDA enabled:                            no

  Suricatasc install:                      yes

It has Luajit enabled, libjansson for the JSON output, NFQ and AF_PACKET IPS modes, NSS for MD5 checksums and unix sockets. Quite a good feature set.

Run

To get it running, we need a few more steps:

Edit /etc/default/suricata:

1. Change RUN=no to RUN=yes
2. Change LISTENMODE to “af-packet”:

Now lets start it.

# service suricata start
Starting suricata in IDS (af-packet) mode... done.

And confirm that it’s running.

# ps aux|grep suricata
root     20295  1.8  4.1 200212 42544 ?        Ssl  00:50   0:00 /usr/bin/suricata -c /etc/suricata/suricata-debian.yaml --pidfile /var/run/suricata.pid --af-packet -D

Check if we’re seeing traffic:

# tail /var/log/suricata/stats.log -f|grep capture
capture.kernel_packets    | RxAFPeth01                | 406
capture.kernel_drops      | RxAFPeth01                | 0
capture.kernel_packets    | RxAFPeth11                | 0
capture.kernel_drops      | RxAFPeth11                | 0
capture.kernel_packets    | RxAFPeth01                | 411
capture.kernel_drops      | RxAFPeth01                | 0
capture.kernel_packets    | RxAFPeth11                | 0
capture.kernel_drops      | RxAFPeth11                | 0
capture.kernel_packets    | RxAFPeth01                | 417
capture.kernel_drops      | RxAFPeth01                | 0
capture.kernel_packets    | RxAFPeth11                | 0
capture.kernel_drops      | RxAFPeth11                | 0
capture.kernel_packets    | RxAFPeth01                | 587
capture.kernel_drops      | RxAFPeth01                | 0
capture.kernel_packets    | RxAFPeth11                | 0
capture.kernel_drops      | RxAFPeth11                | 0
capture.kernel_packets    | RxAFPeth01                | 593
capture.kernel_drops      | RxAFPeth01                | 0
capture.kernel_packets    | RxAFPeth11                | 0
capture.kernel_drops      | RxAFPeth11                | 0

Logging

As the init script starts Suricata in daemon mode, we need to enable logging to file:

Edit /etc/suricata/suricata-debian.yaml and go to the “logging:” section, there change the “file” portion to look like:

  - file:
      enabled: yes
      filename: /var/log/suricata/suricata.log

Note: in the YAML indentation matters, so make sure it’s exactly right.

Rules

Oinkmaster is automatically installed, so lets use that:

First create the rules directory:

mkdir /etc/suricata/rules/

Open /etc/oinkmaster.conf in your editor and add:

url = https://rules.emergingthreats.net/open/suricata-2.0/emerging.rules.tar.gz

Then run:

# oinkmaster -C /etc/oinkmaster.conf -o /etc/suricata/rules
Loading /etc/oinkmaster.conf
Downloading file from https://rules.emergingthreats.net/open/suricata-2.0/emerging.rules.tar.gz... done.
...

Edit /etc/suricata/suricata-debian.yaml and change “default-rule-path” to:

default-rule-path: /etc/suricata/rules

Finally, restart to load the new rules:

# service suricata restart

Validate

Now that Suricata is running with rules, lets see if it works:

# wget http://www.testmyids.com
--2015-01-08 01:21:30--  http://www.testmyids.com/
Resolving www.testmyids.com (www.testmyids.com)... 82.165.177.154

This should trigger a specific rule:

# tail /var/log/suricata/fast.log 
01/08/2015-01:21:30.870346  [**] [1:2100498:7] GPL ATTACK_RESPONSE id check returned root [**] [Classification: Potentially Bad Traffic] [Priority: 2] {TCP} 82.165.177.154:80 -> 192.168.122.181:59190

Success! 🙂

Thanks

Thanks to Arturo Borrero Gonzalez for taking on this work for us. Also many thanks for Pierre Chifflier for maintaining the Suricata and libhtp packages in Debian.

Profiling Suricata with JEMALLOC

JEMALLOC is a memory allocation library: http://www.canonware.com/jemalloc/

It offers many interesting things for a tool like Suricata. Ken Steele of EZchip (formerly Tilera) made me aware of it. In Ken’s testing it helps performance.

Install

wget http://www.canonware.com/download/jemalloc/jemalloc-3.6.0.tar.bz2
tar xvfj jemalloc-3.6.0.tar.bz2
cd jemalloc-3.6.0
./configure --prefix=/opt/jemalloc/
make
sudo make install

Then use it by preloading it:

LD_PRELOAD=/opt/jemalloc/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

I haven’t benchmarked this, but if you’re running a high performance setup it may certainly be worth a shot.

Profiling

The library comes with many interesting profiling and debugging features.

make clean
./configure --prefix=/opt/jemalloc-prof/ --enable-prof
make
sudo make install

Start Suricata like this:

LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

Now we don’t see any change as we need to tell jemalloc what we want.

Exit stats

Dumps a lot of stats to the screen.

MALLOC_CONF=stats_print:true LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v

Memory leak checks

MALLOC_CONF=prof_leak:true,lg_prof_sample:0 LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ./src/suricata -c suricata.yaml -l tmp/ -r ~/sync/pcap/sandnet.pcap -S emerging-all.rules -v
[... suricata output ...]
<jemalloc>: Leak summary: 2011400 bytes, 4523 objects, 645 contexts
<jemalloc>: Run pprof on "jeprof.22760.0.f.heap" for leak detail

Then use the pprof tool that comes with jemalloc to inspect the dumped stats.

$ /opt/jemalloc-prof/bin/pprof --show_bytes ./src/suricata jeprof.22760.0.f.heap
Using local file ./src/suricata.
Using local file jeprof.22760.0.f.heap.
Welcome to pprof!  For help, type 'help'.
(pprof) top
Total: 2011400 B
1050112  52.2%  52.2%  1050112  52.2% PacketGetFromAlloc
600064  29.8%  82.0%   600064  29.8% SCProfilePacketStart
103936   5.2%  87.2%   103936   5.2% SCACCreateDeltaTable
65536   3.3%  90.5%    66192   3.3% pcap_fopen_offline
35520   1.8%  92.2%    35520   1.8% ConfNodeNew
26688   1.3%  93.6%    26688   1.3% __GI___strdup
20480   1.0%  94.6%    20480   1.0% MemBufferCreateNew
20480   1.0%  95.6%    20480   1.0% _TmSlotSetFuncAppend
14304   0.7%  96.3%    14304   0.7% pcre_compile2
14064   0.7%  97.0%    25736   1.3% SCPerfRegisterQualifiedCounter

So it seems we don’t properly clean up our packet pools yet.

Create a PDF of this info:

$ /opt/jemalloc-prof/bin/pprof --show_bytes --pdf ./src/suricata jeprof.22760.0.f.heap > jemalloc.pdf

Dumping stats during runtime

Dump stats after every 16MiB of allocations (lg_prof_interval:24, means every 2^24 bytes, so 16MiB)

I’ve done this in a separate directory since it dumps many files.

$ mkdir jemalloc-profile
$ cd jemalloc-profile/
$ MALLOC_CONF="prof:true,prof_prefix:victor.out,lg_prof_interval:24" LD_PRELOAD=/opt/jemalloc-prof/lib/libjemalloc.so ../src/suricata -c ../suricata.yaml -l ../tmp/ -r ~/sync/pcap/sandnet.pcap -S ../emerging-all.rules -v

Then you should see new *.heap files appear, many during startup. But after some time it should slow down.

You can then visualize the diff between two dumps:

$ /opt/jemalloc-prof/bin/pprof --show_bytes --pdf ../src/suricata --base victor.out.24159.150.i150.heap victor.out.24159.200.i200.heap > jemalloc.pdf

This creates a PDF of the 200th dump taking the 150th dump as a baseline. As we dump every ~16MiB, this covers about 50 * 16 = 800MiB worth of allocations.

Further reading

http://www.canonware.com/jemalloc/
https://github.com/jemalloc/jemalloc/wiki
https://github.com/jemalloc/jemalloc/wiki/Use-Case%3A-Heap-Profiling

Many thanks to Ken Steele for pointing me to the lib and providing me with some good examples.

Crossing the Streams in Suricata

At it’s core, Suricata is a packet processor. It reads packets and pushes them through a configurable pipeline. The 2nd most important processing unit in Suricata is the flow. In Suricata we use the term flow for the bidirectional flows of packets with the same 5 tuple (proto, src ip, dst ip, sp, dp. Vlans can be added as well). In fact, much of Suricata’s threading effort revolves around the flow. In the 2 main runmodes, autofp and workers, flow based load balancing makes sure that a all packets of a single flow always go through the same threading pipeline. In workers this means one single thread, in autofp 2: the capture thread and a stream/detect/output thread.

Flows are the central unit for out ‘app layer’ parsing. Protocol parsers like HTTP don’t even have access to the original packet. It all runs on top of the stream engine, which tracks TCP flows in … our flow structure.

Another place where the flow is crucial is in many of the rules. Rules extensively use the concept of ‘flowbits’. This allows one rule to ‘flag’ a flow, and then another to check this flag. In Emerging Threats many hundreds of rules use this logic.

Ever since we started Suricata, we’ve been talking about what some called ‘global flowbits’. A bit of a strange and contradictory name, but pretty much rule writers wanted the logic of flowbits, but then applied to other units as well. So a few weeks ago I (finally) decided to check if I could quickly implement ‘hostbits’. As Suricata already has a scalable ‘host table’, it was easy add the storage of ‘bits’ there. In a few hours I had the basics working and made it public: see this pull request.

Although I got some nice feedback, I was mostly interested in what the ET folks would think, since they would be the main consumers. While presenting the work I also mentioned the xbits ideas by Michael Rash and the response was “wow, do we have ip_pair tracking now?”. Ehh, no, just ip/host based… “Ah well, I guess that is nice too”. Not exactly the response I hoped for 🙂

IP pair tracking is not something Suricata already did. But as the need was clear I decided to have a look at it. Turned out it was quite simple to do. The IPPair tracker is much like the Host tracking. It’s only done on demand, which sets it apart from the Flow tracking which is done unconditionally. In this case only the new keyword is making use of the IP Pair storage.

So, what I have implemented is pretty much ‘xbits’. It supports tracking by ‘ip_src’, ‘ip_dst’ and ‘ip_pair’. It uses the syntax as suggested by Michael Rash:

xbits:<set|unset|isset|isnotset|toggle>,<bitname>,\
      track <ip_src|ip_dst|ip_pair>,expire <seconds>

It’s only lightly tested, so I would appreciate testing feedback!

You’ll find the code here in PR 1275 at github. This should normally end up in Suricata 2.1, which will come out early next year.

SMTP file extraction in Suricata

In 2.1beta2 the long awaited SMTP file extraction support for Suricata finally appeared. It has been a long development cycle. Originally started by BAE Systems, it was picked up by Tom Decanio of FireEye Forensics Group (formerly nPulse Technologies) followed by a last round of changes from my side. But it’s here now.

It contains:

  • a MIME decoder
  • updates to the SMTP parser to use the MIME decoder for extracting files
  • SMTP JSON log, integrated with EVE
  • SMTP message URL extraction and logging

As it uses the Suricata file handling API, it shares almost everything with the existing file handling for HTTP. The rule keyword work and the various logs work automatically with SMTP as well.

Trying it out

To enable the file extraction, make sure that the MIME decoder is enabled:

app-layer:
  protocols:
    smtp:
      enabled: yes
      # Configure SMTP-MIME Decoder
      mime:
        # Decode MIME messages from SMTP transactions
        # (may be resource intensive)
        # This field supercedes all others because it turns the entire
        # process on or off
        decode-mime: yes

        # Decode MIME entity bodies (ie. base64, quoted-printable, etc.)
        decode-base64: yes
        decode-quoted-printable: yes

        # Maximum bytes per header data value stored in the data structure
        # (default is 2000)
        header-value-depth: 2000

        # Extract URLs and save in state data structure
        extract-urls: yes

Like with HTTP, SMTP depends on the stream engine working correctly. So this page applies https://redmine.openinfosecfoundation.org/projects/suricata/wiki/File_Extraction, although of course the HTTP specific settings are irrelevant to SMTP.

Troubleshooting (SMTP) file extraction issues should always start here: https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Self_Help_Diagrams#File-Extraction-and-Logging-Issues

Logging

Enabling the SMTP logging is simple, just add ‘smtp’ to the list of types in your EVE config, like so:

  # Extensible Event Format (nicknamed EVE) event log in JSON format
  - eve-log:
      enabled: yes
      filetype: regular #regular|syslog|unix_dgram|unix_stream
      filename: eve.json
      # the following are valid when type: syslog above
      #identity: "suricata"
      #facility: local5
      #level: Info ## possible levels: Emergency, Alert, Critical,
                   ## Error, Warning, Notice, Info, Debug
      types:
        - alert:
            # payload: yes           # enable dumping payload in Base64
            # payload-printable: yes # enable dumping payload in printable (lossy) format
            # packet: yes            # enable dumping of packet (without stream segments)
            # http: yes              # enable dumping of http fields
        - http:
            extended: yes     # enable this for extended logging information
            # custom allows additional http fields to be included in eve-log
            # the example below adds three additional fields when uncommented
            #custom: [Accept-Encoding, Accept-Language, Authorization]
        - dns
        - tls:
            extended: yes     # enable this for extended logging information
        - files:
            force-magic: no   # force logging magic on all logged files
            force-md5: no     # force logging of md5 checksums
        #- drop
        - smtp
        - ssh
        # bi-directional flows
        #- flow
        # uni-directional flows
        #- newflow

URLs

As a bonus, the MIME decoder also extracts URL’s from the SMTP message body (not attachments) and logs them in the SMTP log. This should make it easy to post process them. Currently only ‘HTTP’ URLS are extracted, starting with ‘http://‘. So HTTPS/FTP or URLs that don’t have the protocol prefix aren’t logged.

Testing

Naturally, if you’re using SMTP over TLS or have STARTTLS enabled, as you should at least on public networks, none of this will work.

Please help us test this feature!

Suricata Flow Logging

Pretty much from the start of the project, Suricata has been able to track flows. In Suricata the term ‘flow’ means the bidirectional flow of packets with the same 5 tuple. Or 7 tuple when vlan tags are counted as well.

Such a flow is created when the first packet comes in and is stored in the flow hash. Each new packet does a hash look-up and attaches the flow to the packet. Through the packet’s flow reference we can access all that is stored in the flow: TCP session, flowbits, app layer state data, protocol info, etc.

When a flow hasn’t seen any packets in a while, a separate thread times it out. This ‘Flow Manager’ thread constantly walks the hash table and looks for flows that are timed out. The time a flow is considered ‘active’ depends on the protocol, it’s state and the configuration settings.

In Suricata 2.1, flows will optionally be logged when they time out. This logging is available through a new API, with an implementation for ‘Eve’ JSON output already developed. Actually, 2 implementations:

  1. flow — logs bidirectional records
  2. netflow — logs unidirectional records

As the flow logging had to be done at flow timeout, the Flow Manager had to drive it. Suricata 2.0 and earlier had a single Flow Manager thread. This was hard coded, and in some cases it was clearly a bottleneck. It wasn’t uncommon to see this thread using more CPU than the packet workers.

So adding more tasks to the Flow Manager, especially something as expensive as output, was likely going to make things worse. To address this, 2 things are now done:

  1. multiple flow manager support
  2. offloading of part of the flow managers tasks to a new class of management threads

The multiple flow managers simply divide up the hash table. Each thread manages it’s own part of it. The new class of threads is called ‘Flow Recycler’. It takes care of the actual flow cleanup and recycling. This means it’s taking over a part of the old Flow Manager’s tasks. In addition, if enabled, these threads are tasked with performing the actual flow logging.

As the flow logging follows the ‘eve’ format, passing it into Elasticsearch, Logstash and Kibana (ELK) is trivial. If you already run such a setup, the only thing that is need is enabling the feature in your suricata.yaml.

kibana-flow

kibana-netflowThe black netflow dashboard is available here: http://www.inliniac.net/files/NetFlow.json

Many thanks to the FireEye Forensics Group (formerly nPulse Technologies) for funding this work.

More on Suricata lua flowints

This morning I added flowint lua functions for incrementing and decrementing flowints. From the commit:

Add flowint lua functions for incrementing and decrementing flowints.

First use creates the var and inits to 0. So a call:

    a = ScFlowintIncr(0)

Results in a == 1.

If the var reached UINT_MAX (2^32), it’s not further incremented. If the
var reaches 0 it’s not decremented further.

Calling ScFlowintDecr on a uninitialized var will init it to 0.

Example script:

    function init (args)
        local needs = {}
        needs["http.request_headers"] = tostring(true)
        needs["flowint"] = {"cnt_incr"}
        return needs
    end

    function match(args)
        a = ScFlowintIncr(0);
        if a == 23 then
            return 1
        end

        return 0
    end
    return 0

This script matches the 23rd time it’s invoked on a flow.

Compared to yesterday’s flowint script and the earlier flowvar based counting script, this performs better:

   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        0        2434188332   59.71  82249    795      711777      29595.35    7683.20     29809.22   
  2        2            1        0        1015328580   24.91  82249    795      154398      12344.57    3768.66     12428.27   
  3        3            1        0        626858067    15.38  82249    795      160731      7621.47     3439.91     7662.28    

The rules:

alert http any any -> any any (msg:"LUAJIT HTTP flowvar match"; luajit:lua_flowvar_cnt.lua; flow:to_server; sid:1;)
alert http any any -> any any (msg:"LUAJIT HTTP flowint match"; luajit:lua_flowint_cnt.lua; flow:to_server; sid:2;)
alert http any any -> any any (msg:"LUAJIT HTTP flowint incr match"; luajit:lua_flowint_incr_cnt.lua; flow:to_server; sid:3;)

Please comment, discuss, review etc on the oisf-devel list.