File extraction in Suricata

Today I pushed out a new feature in Suricata I’m very excited about. It has been long in the making and with over 6000 new lines of code it’s a significant effort. It’s available in the current git master. I’d consider it alpha quality, so handle with care.

So what is this all about? Simply put, we can now extract files from HTTP streams in Suricata. Both uploads and downloads. Fully controlled by the rule language. But that’s not all. I’ve added a touch of magic. By utilizing libmagic (this powers the “file” command), we know the file type of files as well. Lots of interesting stuff that can be done there.

Rule keywords

Four new rule keywords were added: filename, fileext, filemagic and filestore.

Filename and fileext are pretty trivial: match on the full name or file extension of a file.

alert http any any -> any any (filename:"secret.xls";)
alert http any any -> any any (fileext:"pdf";)

More interesting is the filemagic keyword. It matches against the libmagic output produced by inspecting (the start of) a file. Some example values:

GIF image data, version 89a, 1 x 1
PE32 executable for MS Windows (GUI) Intel 80386 32-bit
HTML document text
Macromedia Flash data (compressed), version 9
MS Windows icon resource - 2 icons, 16x16, 256-colors
PNG image data, 70 x 53, 8-bit/color RGBA, non-interlaced
JPEG image data, JFIF standard 1.01
PDF document, version 1.6

Matching on this with the filemagic keyword is pretty simple:

alert http any any -> any any (filemagic:"PDF document";)
alert http any any -> any any (filemagic:"PDF document, version 1.6";)

Pretty cool, eh? You can match both very specifically and loosely. For example:

alert http any any -> any any (filemagic:"executable for MS Windows";)

This will match on (among others) these types:

PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
PE32 executable for MS Windows (GUI) Intel 80386 32-bit
PE32+ executable for MS Windows (GUI) Mono/.Net assembly

Finally there is the filestore keyword. It is the simplest of all: if the rule matches, the files will be written to disk.

Naturally you can combine the file keywords with the regular HTTP keywords, limiting to POSTs for example:

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"pdf upload claimed, but not pdf"; flow:established,to_server; content:"POST"; http_method; fileext:"pdf"; filemagic:!"PDF document"; filestore; sid:1; rev:1;)

This will alert on and store all files uploaded using a POST request that have a filename extension of pdf, but whose actual content is not a PDF.
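
To make the logic of that last rule a bit more concrete, here is a small Python sketch of the same decision. It is not how Suricata implements the keywords; the function names are made up for illustration, and I am assuming filemagic behaves like a plain substring test on the magic string (case handling is not modelled).

```python
import os

def file_ext(filename: str) -> str:
    """Return the extension without the dot, lower-cased ('' if none)."""
    return os.path.splitext(filename)[1].lstrip(".").lower()

def magic_contains(magic: str, pattern: str) -> bool:
    """Plain substring test, standing in for the filemagic keyword (assumption)."""
    return pattern in magic

def pdf_claim_mismatch(filename: str, magic: str) -> bool:
    """True when the name claims pdf but the magic string says otherwise.

    Mirrors fileext:"pdf"; filemagic:!"PDF document"; from the rule above.
    """
    return file_ext(filename) == "pdf" and not magic_contains(magic, "PDF document")

samples = [
    ("report.pdf", "PDF document, version 1.6"),
    ("report.pdf", "PE32 executable for MS Windows (GUI) Intel 80386 32-bit"),
    ("logo.gif", "GIF image data, version 89a, 1 x 1"),
]
for name, magic in samples:
    verdict = "alert + store" if pdf_claim_mismatch(name, magic) else "no match"
    print(name, "|", magic, "->", verdict)
```

Only the second sample trips the check: the name says pdf, the content says Windows executable.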


The storage to disk is handled by a new output module called “file”. Its config looks like this:

enabled: yes # set to yes to enable
log-dir: files # directory to store the files
force-magic: no # force logging magic on all stored files

It needs to be enabled for file storing to work.

The files are stored to disk as “file.1”, “file.2”, etc. For each of the files a meta file is created containing the flow information, file name, size, etc. Example:

TIME: 01/27/2010-17:41:11.579196
PCAP PKT NUM: 2847035
DST PORT: 56207
FILENAME: /msdownload/update/software/defu/2010/01/mpas-fe_7af9217bac55e4a6f71c989231e424a9e3d9055b.exe
MAGIC: PE32+ executable for MS Windows (GUI) Mono/.Net assembly
SIZE: 5204
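
If you want to post-process the meta files, the “KEY: value” layout shown above is easy to read. A minimal Python sketch, assuming every field sits on its own line and the key ends at the first colon (the parse_meta name is mine, and the field set is just what the example shows):

```python
def parse_meta(text: str) -> dict:
    """Parse 'KEY: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        # Split on the FIRST colon only, so the colons in TIME survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

example = """\
TIME: 01/27/2010-17:41:11.579196
PCAP PKT NUM: 2847035
DST PORT: 56207
MAGIC: PE32+ executable for MS Windows (GUI) Mono/.Net assembly
SIZE: 5204
"""

meta = parse_meta(example)
print(meta["MAGIC"], int(meta["SIZE"]))
```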


The file extraction is for HTTP only currently, and works on top of our HTTP parser. As the HTTP parser runs on top of the stream reassembly engine, configuration parameters of both these parts of Suricata affect handling of files.

The stream engine option “stream.reassembly.depth” (default 1 MB) controls how deep into a stream we look. Set it to 0 for no limit.
The libhtp options request-body-limit and response-body-limit control how far into an HTTP request or response body we look. Again, set to 0 for no limit. This can be controlled per HTTP server.
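
How these limits combine can be sketched in a few lines of Python. This is a deliberately simplified model, not Suricata code: it pretends the file starts at the very beginning of the stream (so headers and earlier requests on the same connection are ignored), and it assumes “1 MB” means 1048576 bytes. The function names are made up.

```python
from typing import Optional

def effective_limit(*limits: int) -> Optional[int]:
    """Combine byte limits where 0 means 'no limit'.

    Returns the tightest finite limit, or None if every limit is 0.
    """
    finite = [limit for limit in limits if limit > 0]
    return min(finite) if finite else None

def inspected_bytes(file_size: int, stream_depth: int, body_limit: int) -> int:
    """How much of a file would be seen under this simplified model."""
    limit = effective_limit(stream_depth, body_limit)
    return file_size if limit is None else min(file_size, limit)

ONE_MB = 1048576  # assumed meaning of the 1 MB default

# A 5 MB download with the default depth and no body limit: cut off at 1 MB.
print(inspected_bytes(5 * ONE_MB, ONE_MB, 0))
# Same file with both limits set to 0: seen completely.
print(inspected_bytes(5 * ONE_MB, 0, 0))
```

The point is just that the smallest non-zero limit wins, so for complete files both the stream depth and the body limit need raising (or setting to 0).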


The file handling is fully streaming, so it’s very efficient. Nonetheless there will be an overhead for the extra parsing, bookkeeping, writing to disk, etc. Memory requirements appear to be limited as well. Suricata shouldn’t keep more than a few KB per flow in memory.


Lack of limits is a limitation. For file storage no limits have been implemented yet. So it’s easy to clutter your disk up with files. Example: a 118 GB enterprise pcap, storing just JPGs, extracted 400,000 files. Better use a separate partition if you’re on a live link.
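
Until limits exist, a tiny watchdog outside Suricata can at least tell you when the file directory is getting out of hand. A rough Python sketch: the dir_usage and over_budget helpers are hypothetical, and the demo uses a temporary directory rather than a real log-dir.

```python
import os
import tempfile

def dir_usage(path: str):
    """Return (file_count, total_bytes) for regular files directly in path."""
    count = total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total

def over_budget(path: str, max_bytes: int) -> bool:
    """True when the stored files exceed the byte budget you picked."""
    return dir_usage(path)[1] > max_bytes

# Demo on a throwaway directory with two fake extracted files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, data in (("file.1", b"abc"), ("file.2", b"de")):
        with open(os.path.join(demo_dir, name), "wb") as fh:
            fh.write(data)
    demo_count, demo_total = dir_usage(demo_dir)
    demo_over = over_budget(demo_dir, 4)

print(demo_count, demo_total, demo_over)
```

Run something like this from cron against the directory configured as log-dir and alert (or stop Suricata) when over_budget returns True.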

Future work

Apart from stabilizing this code and performance optimizing it, the next step will be SMTP file extraction. Possibly other protocols, although nothing is set in stone there yet.

Suricata 1.1 released, 1.2 on the horizon

Today we released Suricata 1.1. This ends a rather long development cycle of more than a year. And it shows. Performance, accuracy and features were all greatly improved. I think it’s the best Suricata so far. If you’ve been looking at trying Suricata, now might be a good time to jump in.

The long development cycles should be something of the past. At our last brainstorm session, at RAID 2011, we decided to change our release policy. The aim of this policy is to do time based releases, roughly a “stable” every 2 months and a beta every other month. This way we’ll be making it much easier for users to stay current without having to run our “git master”.

Looking forward, we’ve started work on the 1.2 release, which should happen in about 2 months. Focus will be on performance. We’re planning to do a significant refactoring of our pattern matching engine, which should lead both to better performance and improved accuracy. Next to this, we’ll be finally adding the “file_data” keyword along with HTTP file carving — extracting files from HTTP requests. I am personally very excited about this.

We’re starting to see more and more community involvement. Not just on the user side, but also on the development side. As seen on the oisf-devel mailing list, a large SSL/TLS patch set was contributed by Pierre Chifflier. This will make its way into the 1.2 release as well. Smaller contributions were accepted on PF_RING code and the HTTP code. I am very grateful for the contributions.

Eric Leblond and I will be doing a talk next week at DeepSec on Suricata. If you are able to, please come meet us!

RAID 2011 Thoughts

The last few days I’ve been at the Recent Advances in Intrusion Detection (RAID) conference in California. Overall it has been a very pleasant and interesting experience. The nice California weather was certainly helping a lot!

I’ve seen all talks and some were very interesting. However, being a Suricata IDS developer, I was not just interested in research for the hell of it, but I was actively scouting for ideas we could implement in Suricata. In this respect the conference was highly disappointing. Although with some of the talks I thought the idea was applicable in general security, like Erik Bosman’s high speed memory tainting detection, I found nothing like that for NIDS.

The most inspiring part of the conference was spending an evening with Seth Hall, one of the Bro IDS engineers. Bro has a very different approach to inspecting the network than Suricata. Actually, I should say Suricata does it differently as Bro has been around much longer than Suricata. 🙂 The conversation was all about sharing of ideas and experiences, and finding common ground for actual cooperation.

A couple of notes from that conversation. First, Bro supports Unified2/Barnyard2 now, as input (so actually Barnyard2 can output to Bro). This means it can extend its analysis to include Suricata generated events. Second, we might try to have Suricata and Bro work together, where Suricata would be controlled by Broccoli. This way Bro could benefit from Suricata’s high speed signature matching engine, functionality Bro doesn’t have, and Suricata could benefit from Bro’s higher level understanding of the network. Finally, Bro’s binpac effort to define protocol parsers in a higher level language that can then be compiled into native code looks interesting as well. It would probably take quite a few changes to get this all going, but it might just be worth it.

Then there was the panel at the conference with Martin Roesch, Seth Hall and myself. A lot of people expected fireworks, but no such thing happened. Everyone was polite, respectful and friendly. It never really turned into a real discussion though, it was more a Q&A with the audience. Dominique Karg blogged about the panel here.

It was good to talk to Martin Roesch. The OISF – Sourcefire relation has definitely not started well, so it was good to have normal conversations and such. I offered to work together with Marty, especially on SCADA detection. As was announced earlier, OISF will maintain the Digital Bond Quickdraw SCADA parsers and keywords, not only for Suricata, but also for Snort. Hopefully we can start a more constructive relationship on this topic, and elsewhere.

Some final thoughts on RAID. It was well organized and it was great to meet so many smart(er) people thinking about generally the same topics as I do. On the negative side I do feel disappointed over the apparent disconnect between the academic world and the more real world focused efforts like Suricata, Snort and tools like Streamdb, Sguil, Snorby, Squert, etc. But maybe I’m just lacking the vision to put the theory to practice.

The current tools out there may not be considered sufficient by everyone for every task. However, if RAID was a good benchmark, I fear we’ll have to settle for those for a while. That’s not necessarily a bad thing, as the aforementioned tools are under active development and continue to improve steadily.

Suricata IPS improvements

January has been a productive month for Suricata, especially for the IPS part of it. I’ve spent quite some time adding support to the stream engine to operate differently when running inline. This was needed as dropping attacks found in the reassembled stream or the application layer was not reliable. Up until now the stream engine would offer the reassembled stream to the detection engine as soon as it was ACK’d. This meant that by definition the packets containing the data had already passed the IPS device. Simply switching to sending un-ACK’d data to the detection engine would have its own set of issues.

To be able to work with un-ACK’d data, we need to make sure we deal with possible evasions properly. The problem, as extensively documented by Judy Novak and Steven Sturges, is that in TCP streams there can be overlapping packets. These are handled differently depending on the receiving OS. If we had to account for overlaps in the application layer, we would have to be able to tell the HTTP parser for example: “sorry, that last data is wrong, please revert and use the new packet instead”. A nightmare.

The solution I opted for was to not care about how the destination OS handles overlaps and such. The approach is fairly simple: once we have accepted a segment, that’s what it’s going to be. This means that if we receive a segment later that (partially) overlaps and has different data, its data portion will simply be overwritten to be the same as the original segment. This way, the IPS and not an obscure mix of the sender (attacker?) and destination OS, determines the data the destination will see.
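
The “first accepted data wins” idea is easy to show in a toy model. The Python below is only a sketch of the policy, not the stream engine: it tracks accepted bytes in a dict keyed by sequence number, ignores sequence number wrap-around, windows and checksums, and the accept_segment name is made up.

```python
def accept_segment(seen: dict, seq: int, payload: bytes) -> bytes:
    """Record a segment and return the payload that should be forwarded.

    `seen` maps sequence number -> byte value that was accepted first.
    Bytes overlapping earlier data are rewritten to that earlier value.
    """
    out = bytearray(payload)
    for i, byte in enumerate(payload):
        pos = seq + i
        if pos in seen:
            out[i] = seen[pos]   # overlap: keep what was accepted first
        else:
            seen[pos] = byte     # new data: accept it as-is
    return bytes(out)

seen = {}
first = accept_segment(seen, 1000, b"GET /index")
# A later segment overlaps bytes 1004..1009 with different content.
second = accept_segment(seen, 1004, b"/evil.php")

print(first)
print(second)
```

The overlapping part of the second segment comes out as the originally accepted “/index”; only the bytes past the overlap are new. In the real thing a rewritten packet would also need its checksum recalculated before going back on the wire.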

Of course the approach comes with some drawbacks. First, we need to keep segments in memory for a longer period of time. This causes significantly higher memory usage. Secondly, if we rewrite a packet, it needs to be reinjected on the wire. As we modified the packet payload a checksum recalculation is required.

In Suricata’s design the application layer parsers, such as our HTTP parser, run on top of the reassembly engine. After the reassembly engine and the app layer parsers are updated, the packet with the associated stream and app layer state is passed on to the detection engine. In the case where we work with ACK’d data, an ACK packet in the opposite direction triggers the reassembly process. If we detect based on that, and decide we need to drop, all we can do is drop the ACK packet as the actual data segment(s) have already passed. This is not good enough in many cases.

In the new code the data segment itself triggers the reassembly process. In this case, if the detection engine decides a drop is required, the packet containing the data itself can be dropped, not just the ACK. The reason we’re not taking the same approach in IDS mode is that we wouldn’t be able to properly deal with said evasion/overlap issues. The IPS can control exactly which packets pass Suricata. The IDS, being passive, cannot.

You can try this code by checking out the current git master. In the suricata.yaml that lives in our git tree you’ll find a new option in the stream config, “stream.inline”. If you enable this, the code as explained above is activated.

Feedback is very welcome!

Suricata 1.1 beta 1 released

Today we’ve released Suricata 1.1 beta 1, the first beta of the upcoming Suricata 1.1 release. The official release announcement is here on the OISF website.

The main focus of the new release has been to improve performance and to add support for the features the new ET/ETpro ruleset needs. ET and ETpro have rulesets specially tuned and geared for Suricata. We’re still missing some new rule keywords that are used by VRT, so in the 1.1 beta 2 release we’ll address that.

Other than that, I’ve got quite a few patches waiting. We’ll be improving stream reassembly, inline mode, prelude output, and numerous other things.

Like always, please give this a try and let us know how it works for you!

Suricata 1.0.2 released

After some well deserved vacation I’m getting back up to speed in Suricata development. Luckily most of our dev team continued to work in my absence, making today’s 1.0.2 release possible.

The main focus of this release was fixing the TCP stream engine. Judy Novak found a number of ways to evade detection. See her blog post describing the issues.

The biggest other change is the addition of a new application layer module. The SSH parser parses SSH sessions and stops detection/inspection of the stream after the encrypted part of the session has started. So this is mainly a module focused on reducing the number of packets that need inspection, just like the SSL and TLS modules.

As a bonus though, we introduced two rule keywords that match on the parsed SSH parameters:

ssh.protoversion will match against the ssh protocol version. I’ll give some examples.


This will match on 2.0 exactly.


This will match on 2, but also 1.99 and other versions compatible with “2”.


The last example will match on all versions starting with “1.”, so 1.6, 1.7, etc.

ssh.softwareversion will match on the software version identifier. An example:


This will match only on sessions using the PuTTY SSH client.
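
The rule examples themselves are not reproduced here, so I won’t guess at the exact keyword syntax. The matching behaviour described above can still be sketched in Python. Everything below is an illustration under my own assumptions: the function names are made up, “compatible with 2” is modelled as 1.99, 2 or 2.x, and the software match is modelled as a substring test. Only the identification string layout (SSH-protoversion-softwareversion, optionally followed by a space and comments) comes from the SSH protocol itself.

```python
def parse_banner(banner: str):
    """Split 'SSH-<proto>-<software>[ comments]' into (proto, software)."""
    if not banner.startswith("SSH-"):
        raise ValueError("not an SSH identification string")
    proto, _, rest = banner[4:].partition("-")
    software = rest.split(" ", 1)[0].strip()
    return proto, software

def proto_exact(version: str, wanted: str) -> bool:
    """Match one version exactly, e.g. only 2.0."""
    return version == wanted

def proto_2_compat(version: str) -> bool:
    """Match anything that can speak protocol 2 (assumed: 1.99, 2, 2.x)."""
    return version in ("1.99", "2") or version.startswith("2.")

def proto_prefix(version: str, prefix: str) -> bool:
    """Match every version starting with the prefix, e.g. '1.'."""
    return version.startswith(prefix)

def software_contains(software: str, needle: str) -> bool:
    """Substring match on the software identifier (assumption)."""
    return needle in software

proto, software = parse_banner("SSH-2.0-PuTTY_Release_0.60")
print(proto, software)
print(proto_exact(proto, "2.0"), proto_2_compat("1.99"), proto_prefix("1.7", "1."))
print(software_contains(software, "PuTTY"))
```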

Other changes include better HTTP accuracy and better IPS functionality.

For the next release we will focus on further improving overall detection accuracy, improving inline mode further, improving performance and specifically improving CUDA performance. As always, we welcome any feedback. Or if you are interested in helping out, please contact us!

Update: added a link to Judy Novak’s blog post on the TCP evasions.

Setting up Suricata 0.9.0 for initial use on Ubuntu Lucid 10.04

The last few days I blogged about compiling Suricata in IDS and IPS mode. Today I’ll write about how to set it up for first use.

Starting with Suricata 0.9.0 the engine can run as an unprivileged user. For this create a new user called “suricata”.

useradd --no-create-home --shell /bin/false --user-group --comment "Suricata IDP account" suricata

This command will create a user and group called “suricata”. It will be unable to login as the shell is set to /bin/false.

The next thing to do is to create a configuration directory. Create /etc/suricata/ and copy the suricata.yaml example config into it. The example configuration can be found in the source archive you used to build Suricata:

mkdir /etc/suricata
cp /path/to/suricata-0.9.0/suricata.yaml /etc/suricata/
cp /path/to/suricata-0.9.0/classification.config /etc/suricata/

Next, create the log directory.

mkdir /var/log/suricata

The log directory needs to be writable for the user and group “suricata”, so change the ownership:

chown suricata:suricata /var/log/suricata
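
Before moving on, it can’t hurt to double check the steps so far. Here is a small Python sanity check, purely as a sketch: check_setup is a made-up helper, and the writability test only says something about the user running the script, so run it as the suricata user to get a meaningful answer. The demo at the bottom uses temporary directories instead of the real /etc/suricata and /var/log/suricata.

```python
import os
import tempfile

def check_setup(config_dir: str, log_dir: str) -> list:
    """Return a list of problems found; an empty list means all looks fine."""
    problems = []
    for name in ("suricata.yaml", "classification.config"):
        path = os.path.join(config_dir, name)
        if not os.path.isfile(path):
            problems.append("missing " + path)
    if not os.path.isdir(log_dir):
        problems.append("missing directory " + log_dir)
    elif not os.access(log_dir, os.W_OK):
        problems.append(log_dir + " is not writable by the current user")
    return problems

# Demo: first with an empty config dir, then with both files in place.
with tempfile.TemporaryDirectory() as cfg, tempfile.TemporaryDirectory() as logs:
    before = check_setup(cfg, logs)
    for name in ("suricata.yaml", "classification.config"):
        open(os.path.join(cfg, name), "w").close()
    after = check_setup(cfg, logs)

print(len(before), after)
```

On a real box you would call check_setup("/etc/suricata", "/var/log/suricata") instead.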

The last step I’ll be describing here is retrieving an initial ruleset. The 2 main rulesets you can use are Emerging Threats (ET) and Sourcefire’s VRT ruleset. Since putting VRT to use is a little bit more complicated I’ll be focussing on ET here.

First, download the emerging rules:


Go to /etc/suricata/ and extract the rules archive:

cd /etc/suricata/
tar xzvf /path/to/emerging.rules.tar.gz

There is a lot more to rules, such as tuning and staying updated, but that’s beyond the scope of this post.

Suricata is now ready to be started:

suricata -c /etc/suricata/suricata.yaml -i eth0 --user suricata --group suricata

If all is set up properly, Suricata will tell you it is now running:

[2087] 9/5/2010 -- 18:17:47 - (tm-threads.c:1362) (TmThreadWaitOnThreadInit) -- all 8 packet processing threads, 3 management threads initialized, engine started.

There are 3 log files in /var/log/suricata that will be interesting to monitor:

– stats.log: displays statistics on packets, tcp sessions etc.
– fast.log: an alert log similar to Snort’s fast log.
– http.log: displays HTTP requests in an Apache style format.

This should get you going. There is a lot more to deploying Suricata that I plan to blog on later.

Compiling Suricata 0.9.0 in Ubuntu Lucid 10.04 in IPS (inline) mode

Note: the difference with the 0.8.2 post is the addition of libcap-ng-dev. This allows Suricata to run as an unprivileged user.

Here is how to compile Suricata 0.9.0 in inline mode on Ubuntu Lucid 10.04.

First, make sure you have the “universe” repository enabled. Go to the System menu, Administration, Software Sources. There enable “Community-maintained Open Source Software (universe)”. If you’re not running a GUI, edit /etc/apt/sources.list and enable the universe repository there. Don’t forget to run “apt-get update” afterwards.

Install the following packages needed to build Suricata: libpcre3-dev libpcap-dev libyaml-dev zlib1g-dev libnfnetlink-dev libnetfilter-queue-dev libnet1-dev libcap-ng-dev.

apt-get install libpcre3-dev libpcap-dev libyaml-dev zlib1g-dev libnfnetlink-dev libnetfilter-queue-dev libnet1-dev libcap-ng-dev

Download Suricata 0.9.0 here

Extract the suricata-0.9.0.tar.gz file as follows:

tar xzvf suricata-0.9.0.tar.gz

Enter the extracted directory suricata-0.9.0.

Run “./configure --enable-nfqueue”
If “./configure --enable-nfqueue” was successful, run “make”
If “make” was successful, run “sudo make install”
Besides Suricata itself, the build process installed “libhtp”. For that to work properly, run “ldconfig”.

Run “suricata -V” and it should report version 0.9.0.

To use Suricata in inline mode, pass -q <queue id> to the command line. Example:

suricata -c /etc/suricata/suricata.yaml -q 0

Suricata 0.9.0 released

Yesterday we released the first release candidate for our upcoming 1.0 release of Suricata. See the announcement on the OISF site here.

Most notable changes are the following new features:

– Support for the http_headers keyword was added
– libhtp was updated to version 0.2.3
– Privilege dropping using libcap-ng is now supported
– Proper support for “pass” rules was added
– Inline mode for Windows was added

Go get the release here:

Compiling Suricata 0.8.2 in Ubuntu Lucid 10.04 in IPS (inline) mode

Yesterday I wrote about how to compile and install Suricata 0.8.2 as an IDS on Ubuntu Lucid 10.04. Today I’ll explain the steps to compile and install it as an IPS. In IPS mode the engine runs in inline mode. This means that it gets its packets from netfilter and sets a verdict on them after inspecting them. This way we can drop packets that trigger the rules.

First, make sure you have the “universe” repository enabled. Go to the System menu, Administration, Software Sources. There enable “Community-maintained Open Source Software (universe)”. If you’re not running a GUI, edit /etc/apt/sources.list and enable the universe repository there. Don’t forget to run “apt-get update” afterwards.

Install the following packages needed to build Suricata: libpcre3-dev libpcap-dev libyaml-dev zlib1g-dev libnfnetlink-dev libnetfilter-queue-dev libnet1-dev.

apt-get install libpcre3-dev libpcap-dev libyaml-dev zlib1g-dev libnfnetlink-dev libnetfilter-queue-dev libnet1-dev

Download Suricata 0.8.2 here

Extract the suricata-0.8.2.tar.gz file as follows:

tar xzvf suricata-0.8.2.tar.gz

Enter the extracted directory suricata-0.8.2.

Run “./configure --enable-nfqueue”
If “./configure --enable-nfqueue” was successful, run “make”
If “make” was successful, run “sudo make install”
Besides Suricata itself, the build process installed “libhtp”. For that to work properly, run “ldconfig”.

Run “suricata -V” and it should report version 0.8.2.

To use Suricata in inline mode, pass -q <queue id> to the command line. Example:

suricata -c /etc/suricata/suricata.yaml -q 0