Suricata Lua scripting flowint access

A few days ago I wrote about my Emerging Threats sponsored work to support flowvars from Lua scripts in Suricata.

Today, I updated that support. Flowvar ‘sets’ are now real time. This was needed to fix some issues where a script was invoked multiple times in single rule, which can happen with some buffers, like HTTP headers.

Also, I implemented flowint support. Flowints in Suricata are integers stored in the flow context.

Example script:

function init (args)
    local needs = {}
    needs["http.request_headers"] = tostring(true)
    needs["flowint"] = {"cnt"}
    return needs
end

function match(args)
    a = ScFlowintGet(0);
    if a then
        ScFlowintSet(0, a + 1)
    else
        ScFlowintSet(0, 1)
    end 
        
    a = ScFlowintGet(0);
    if a == 23 then
        return 1
    end 
    
    return 0
end 

return 0

It does the same thing as this flowvar script:

function init (args)
    local needs = {}
    needs["http.request_headers"] = tostring(true)
    needs["flowvar"] = {"cnt"}
    return needs
end

function match(args)
    a = ScFlowvarGet(0);
    if a then
        a = tostring(tonumber(a)+1)
        ScFlowvarSet(0, a, #a)
    else
        a = tostring(1)
        ScFlowvarSet(0, a, #a)
    end 
    
    if tonumber(a) == 23 then
        return 1
    end
    
    return 0
end

return 0

Only, at about half the cost:

   Num      Rule         Gid      Rev      Ticks        %      Checks   Matches  Max Ticks   Avg Ticks   Avg Match   Avg No Match
  -------- ------------ -------- -------- ------------ ------ -------- -------- ----------- ----------- ----------- -------------- 
  1        1            1        0        2392221879   70.56  82249    795      834993      29085.12    6964.14     29301.02   
  2        2            1        0        998297994    29.44  82249    795      483810      12137.51    4019.44     12216.74   

Suricata: Handling of multiple different SYN/ACKs

synackWhen processing the TCP 3 way handshake (3whs), Suricata’s TCP stream engine will closely follow the setup of a TCP connection to make sure the rest of the session can be tracked and reassembled properly. Retransmissions of SYN/ACKs are silently accepted, unless they are different somehow. If the SEQ or ACK values are different they are considered wrong and events are set. The stream events rules will match on this.

I ran into some cases where not the initial SYN/ACK was used by the client, but instead a later one. Suricata however, had accepted the initial SYN/ACK. The result was that every packet from that point was rejected by the stream engine. A 67 packet pcap resulting in 64 stream events.

If people have the stream events enabled _and_ pay attention to them, a noisy session like this should certainly get their attention. However, many people disable the stream events, or choose to ignore them, so a better solution is necessary.

Analysis

In this case the curious thing is that the extra SYN/ACK(s) have different properties: the sequence number is different. As the SYN/ACKs sequence number is used as “initial sequence number” (ISN) in the “to client” direction, it’s crucial to track it correctly. Failing to do so, Suricata will loose track of the stream, causing reassembly to fail. This could lead to missed alerts.

Whats happening on the wire:

TCP SSN 1:

-> SYN: SEQ 10
<- SYN/ACK 1: ACK 11, SEQ 100
<- SYN/ACK 2: ACK 11, SEQ 1000
-> ACK: SEQ 11, ACK 101

TCP SSN 2:

-> SYN: SEQ 10
<- SYN/ACK 1: ACK 11, SEQ 100
<- SYN/ACK 2: ACK 11, SEQ 1000
-> ACK: SEQ 11, ACK 1001

It’s clear that in SSN 1 the client ACKs the first SYN/ACK while in SSN 2 the 2nd SYN/ACK is ACK’d. It’s likely that the first SYN/ACK was lost before it reached the client. Suricata accepts the first though, and rejects any others that are not the same.

Solution

The solution I’ve been working on is to delay judgement on the extra SYN/ACKs until Suricata sees the ACK that completes the 3whs. At that point Suricata knows what the client accepted, and which SYN/ACKs were either ignored, or never received.

Logic in pseudo code:

Normal SYN/ACK coming in:

    UpdateState(p);
    ssn->state = TCP_SYN_RECV;

Extra SYN/ACK packets:

    if (p != ssn) {
        QueueState(p);

On receiving the ACK that completes the 3whs:

    if (ssn->queue_len) {
        q = QueueFindState(p);
        if (q)
            UpdateState(q);
    }
    UpdateState(p);
    ssn->state = TCP_ESTABLISHED;

So when receiving the ACK, Suricata first searches for the proper SYN/ACK on the list. If it’s not found, the ACK will be processed normally, which means it’s checked against the original SYN/ACK. If Suricata did have a queued state, it will first apply it to the SSN. Then the ACK will be processed normally, so that is can complete the 3whs and move the state to ESTABLISHED.

Limitations

Queuing these states takes some memory, and for this reason there is a limit to the number each SSN will accept. This is configurable through a new stream option:

stream:
  max-synack-queued: 5

It defaults to 5. I’ve seen a few (valid) hits against a few terrabytes of traffic, so I think the default is reasonably safe. An event is being set if the limit is exceeded. It can be matched using a stream-event rule:

  alert tcp any any -> any any (msg:"SURICATA STREAM 3way handshake \
      excessive different SYN/ACKs"; stream-event:3whs_synack_flood; \
      sid:2210055; rev:1;)

Performance

This functionality doesn’t affect the regular “fast path” except for a small check to see if we have queued states. However, if the queue list is being used Suricata enters a slow path. Currently this involves an memory allocation per stored queue. It may be interesting to consider using pools here, although a single global pool might be ineffecient. In such a case a lock would have to be used and this might lead to contention, especially in a case where Suricata would be flooded. Per thread pools (519, 520, 521) may be best here.

IPS mode

SYN/ACKs that exceed the limit are dropped if stream.inline is enabled as is the case with all packets that are considered to be bad in some way.

Code

The code is now part of the git master through commit 4c6463f3784f533a07679589dab713096137a439. Feedback welcome through our oisf-devel list.

Suricata 1.4 is out

About 5 months after 1.3 came out we’ve released 1.4, and we’ve been quite busy. Eric Leblond’s post here has all the stats and graphs. There are three big new features: unix socket, ip reputation and luajit. For each of these the same is true: it’s usesable now, but it’s the potential that we’re most excited about. Over the next months we’ll be extending each of those to be even more useful. We’re very much interested in ideas and feedback.

Performance obviously matters to many in the IDS world, and here too we have improved Suricata quite a bit again. We now have Suricata 1.4 running on a ISP 10gbit/s network on commodity hardware with a large ET ruleset. Of course, YMMV, but we’re definitely making a lot of progress here.

Sometimes the little things matter a lot as well. A minor new feature is that live “drop” stats are the the stats.log now:

capture.kernel_packets    | AFPacketem21              | 13640581
capture.kernel_drops      | AFPacketem21              | 442864
capture.kernel_packets    | AFPacketem22              | 7073228
capture.kernel_drops      | AFPacketem22              | 9449
capture.kernel_packets    | AFPacketem23              | 10528970
capture.kernel_drops      | AFPacketem23              | 148281
capture.kernel_packets    | AFPacketem24              | 7212584
capture.kernel_drops      | AFPacketem24              | 12643
capture.kernel_packets    | AFPacketem25              | 9763439
capture.kernel_drops      | AFPacketem25              | 17874
capture.kernel_packets    | AFPacketem26              | 10464106
capture.kernel_drops      | AFPacketem26              | 20378
capture.kernel_packets    | AFPacketem27              | 8869182
capture.kernel_drops      | AFPacketem27              | 18336
capture.kernel_packets    | AFPacketem28              | 7925045
capture.kernel_drops      | AFPacketem28              | 258168

This is supported for AF_PACKET, PF_RING and libpcap.

Last August we’ve added Suricata to github to make it easier to participate. Also, the code review tools associated with the pull requests are very useful. Github has been an unexpected success for us. At the time of writing there are 24 forks of Suricata on it, I’ve processed about 250 pull requests. The patches that have been submitted range from small fixes to full blown features, and more are on the way. I’m very grateful for these contributions and everyone’s patience with me.

Now that 1.4 is out, we’ll be taking it slow over the holidays. The team has been working like crazy, and everyone deserves a break. So the next weeks we’ll focus on further consolidation, fixing bugs that no doubt will pop up. Other than that, things will be slow. After the holidays we’ll start planning for the next milestone. Again, your ideas and contributions are very welcome! 🙂

IPv6 Evasions, Scanners and the importance of staying current

Lots of activity on the IPv6 front lately. There was a talk on a conference on bypassing IDS using IPv6 tricks. Also a new scan tool (Topera) claimed to scan a host while staying below the radar of an IDS was released. To start with the latter, even though Suricata doesn’t have a dedicated port scan detector, the tool’s traffic lights up like a Christmas tree. The trick it pulls is to pack a lot of duplicate DST OPTS extension headers in the IPv6 packets. These options are just fillers, the only options they use are the “pad” option. In Suricata we’ve had an event for duplicate DST OPTS headers since 1.3 and the padding only headers generate an event in 1.4. Both alerts will be very noisy, so calling this a stealth attack rather dubious.

The other thing was a talk on IPv6 evasions, where the author compared Snort and Suricata. Suricata didn’t do very well. Sadly the authors chose not to contact us. On closer inspection it turned out an old Suricata version was used. Which one wasn’t specified, but as they did mention using Security Onion, I’m assuming 1.2. In the 1.3 branch (current stable) we’ve fixed and improved IPv6 in a lot of areas. Nonetheless, while testing the various protocol tricks, we did find some bugs that are now fixed in the git masters for the 1.3 stable branch and the 1.4 development branch.

I think these developments serve as a reminder that staying current with your IDS software’s version is critical. For that reason it’s too bad that distro’s like Security Onion, Debian, Ubuntu all lag significantly. The reasons differ through. For the guys from Security Onion it’s mostly a time problem (so go help them if you can!) for Debian and Ubuntu it’s actually policy. For that reason we’re providing PPAs for Ubuntu and for Debian we’re working on getting Suricata into the “backports” repo. The only mainstream distro that does it right for us is Fedora. They just update to the latest stable as soon as it’s out.

Given the complexity of protocols like IPv6 and the new developments all over the board, I see no viable case for staying on older versions. I know it’s a hassle, but stay current. It’s important.

Closing in on Suricata 1.4

I just made Suricata 1.4rc1 available with some pretty exciting features: unix socket mode and IP reputation.

Unix socket

First of all, Eric Leblond’s work on the Unix socket was merged. The unix socket work consists of two parts. The unix socket protocol implementation and a new runmode.

The protocol implementation is based on JSON messages over unix socket. Eric will be fully documenting it soon. Currently the commands are limited to shutting down and getting some basic stats. This part isn’t very exciting yet, but the groundwork for many future extensions has been laid.

The part that is exciting right now, is the unix socket runmode. That this does is start Suricata with all the rules and such, and then it waits for commands on the unix socket. Then the commands will be a pcap filename – log directory pair. This pcap will then be inspected against the rules and the logs go into the log directory supplied. As this can be easily scripted (a python script is provided), it’s a very fast way to test your pcap collections, as the overhead of starting and stopping is skipped.

This may initialy appeal mostly for those of you doing sandnetting and malware analysis, where tens of thousands of pcaps and automatically processed every hour or day, I think this could grow into a feature for a wider audience as well. For example, I could see use in Sguil or Snorby, or pretty much every event manager with full packet capture support, adding an option to scan a pcap associated with an event again. Maybe against _all_ rules, instead of the tuned set running on the live sensors. Maybe you can re-inspect old sessions against the current rules this way to find hits on attacks that were 0-days at the time, etc.

I think there could be many possibilities.

IP Reputation

A slightly more polished version of the code I discussed here is now available in this release. It’s one of those things where it will be very interesting to see how people will put it to use.

Matt Jonkman just wrote some of his ideas to the Emerging Threats mailing list: one of the ideas Matt wrote about is to amend weak rules with reputation data. So if you have a signature that is phrone to false positives, you probably disable it currently. But what if you combine it with reputation data? If the weak rule fires on a sketchy ip, it may be a more reliable alert.

We’ll see how this plays out.

1.4 final

We’re hoping that if nothing big happens, we can do a mid-December 1.4 final release. So please consider running this new release. It’s running very stable on quite a number of places, ISP networks, Lab networks, home networks, sandnetting networks, etc. But we need much more testing to find issues and/or gain confidence that we have found the most important issues. Thanks for helping out!

IP Reputation in Suricata

Disclaimer: this work was sponsored by Emerging Threats Pro.

One thing we’ve been talking about for many years at OISF is IP Reputation. The basic idea is that many organizations have information about specific IP-addresses. This information may be that a host is infected, acts as a spam relay or many other things. We’ve always thought it might be useful to apply this info to the IDS directly.

In the last weeks I’ve developed code to load IP reputation information into Suricata. This code is now part of the Suricata git master, so it’s available to all.

The work consisted of 3 main parts: data load, internal data structures and a rule keyword.

Data loading

The data I worked with was provided by Emerging Threats Pro. The data format is very simple. Two types of CSV files, one to define a mapping between category names and id’s and the other to define the scores for hosts in the categories.

The data formats are documented here: IP Reputation Format.

Internal Data Structures

To store the data in memory I hooked into our “Hosts” API. The Hosts API is a hash table like the Flow table that can be used to store data per host. It’s in use for Tagging and Thresholding. I added storage for IP Reputation to it.

Rule keyword

A new rule keyword to match on the reputation data was introduced: “iprep”. The keyword allows a rule to match on a specific category. Example:

alert ... (flow:to_server; iprep:src,Bot,>,10;)

This will generate an alert if the SRC IP of the host talking to a server is known to have a score of >10 in the “Bot” category.

The keyword is compatible to Suricata’s concept of “IP-only” rules. These are rules that do not inspect packet content or flow state and can thus be inspected once per flow direction instead of for each packet.

Speed

I’ve been playing with data sets of up to a million entries. Loading it takes hardly any time and I’m confident larger numbers will work just fine. The host table just needs bigger memcaps and hash sizes.

At runtime, the speed depends mostly on the rules. A pure “iprep” rule is quite expensive when not IP-only, although this is mostly due to the frequency of the checks. Such rules will be checked against large numbers of packets.

When created as a IP-only rule, things change. Such rules are checked only once per flow direction, so overhead appears to be minimal in this case.

Data

The data I used from Emerging Threats Pro is not available for free, so for those who want to test creating your own data is required right now. Matt Jonkman from Emerging Threats Pro will make a free feed available within a few weeks though. Of course you could also get the paid data from Emerging Threats Pro. 🙂

Update 29/11/2012

This feature is part of the just released 1.4rc1 version, please help us test it!

Important Suricata update

We just released Suricata 1.3.3 which contains some important accuracy fixes. Also, it should be much more robust against out of memory conditions.

For those of you running Suricata in IPS mode, this is important as well. We found that rules that have the drop or reject actions, were not playing well with thresholding.

So upgrading is highly recommended!

Code changes are not too big, largest changes are due to some extra unittests:

 ChangeLog                           |   11 +
 libhtp/htp/dslib.c                  |    4 +-
 libhtp/htp/hooks.c                  |   31 +-
 libhtp/htp/htp_connection.c         |   34 ++-
 libhtp/htp/htp_connection_parser.c  |   25 +-
 libhtp/htp/htp_parsers.c            |    2 +-
 libhtp/htp/htp_request.c            |    4 +-
 libhtp/htp/htp_request_apache_2_2.c |   24 +-
 libhtp/htp/htp_transaction.c        |   68 +++--
 libhtp/htp/htp_util.c               |   35 ++-
 src/alert-debuglog.c                |    4 +-
 src/app-layer.c                     |    9 +-
 src/decode.h                        |    3 +-
 src/detect-detection-filter.c       |   96 ++++++
 src/detect-engine-alert.c           |   37 ++-
 src/detect-engine-hcbd.c            |    5 +
 src/detect-engine-hhd.c             |  121 +++++++-
 src/detect-engine-hsbd.c            |    5 +
 src/detect-engine-iponly.c          |    5 +-
 src/detect-engine-payload.c         |   26 ++
 src/detect-engine-threshold.c       |   15 +-
 src/detect-filemd5.c                |   24 +-
 src/detect-filestore.c              |   11 +-
 src/detect-filestore.h              |    2 +-
 src/detect-pcre.c                   |  485 +----------------------------
 src/detect-threshold.c              |  569 ++++++++++++++++++++++++++++++++++-
 src/detect.c                        |   11 +-
 src/detect.h                        |    2 +-
 src/flow-hash.c                     |   10 +-
 src/flow-timeout.c                  |   10 +-
 src/flow.c                          |    1 -
 src/flow.h                          |   14 +
 src/log-httplog.c                   |    2 +-
 src/runmodes.c                      |    2 +-
 src/source-ipfw.c                   |    1 +
 src/source-pfring.c                 |   20 +-
 src/stream-tcp-reassemble.c         |    4 +-
 src/stream-tcp.c                    |   12 +-
 src/stream.c                        |    3 +-
 src/threads.h                       |    1 +
 src/tmqh-packetpool.c               |    5 +-
 src/util-buffer.h                   |    6 +-
 src/util-debug.c                    |    2 +-
 src/util-host-os-info.c             |   32 +-
 src/util-threshold-config.c         |  210 +++++++++++++
 suricata.yaml.in                    |    6 +-
 46 files changed, 1340 insertions(+), 669 deletions(-)

Suricata on Myricom capture cards

Myricom and OISF just announced that Myricom joined to OISF consortium to support the development of Suricata. The good folks at Myricom already sent me one of their cards earlier. In this post I’ll describe how you can use these cards already, even though Suricata doesn’t have native Myricom support yet. So in this guide I’ll describe using the Myricom libpcap support.

Getting started

I’m going to assume you installed the card properly, installed the Sniffer driver and made sure that all works. Make sure that in your dmesg you see that the card is in sniffer mode:

[ 2102.860241] myri_snf INFO: eth4: Link0 is UP
[ 2101.341965] myri_snf INFO: eth5: Link0 is UP

I have installed the Myricom runtime and libraries in /opt/snf

Compile Suricata against Myricom’s libpcap:

./configure --with-libpcap-includes=/opt/snf/include/ --with-libpcap-libraries=/opt/snf/lib/ --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make
sudo make install

Next, configure the amount of ringbuffers. I’m going to work with 8 here, as my quad core + hyper threading has 8 logical CPU’s.

pcap:
  - interface: eth5
    threads: 8
    buffer-size: 512kb
    checksum-checks: no

The 8 threads setting makes Suricata create 8 reader threads for eth5. The Myricom driver makes sure each of those is attached to it’s own ringbuffer.

Then start Suricata as follows:
SNF_NUM_RINGS=8 SNF_FLAGS=0x1 suricata -c suricata.yaml -i eth5 --runmode=workers

If you want 16 ringbuffers, update the “threads” variable in your yaml to 16 and start Suricata:
SNF_NUM_RINGS=16 SNF_FLAGS=0x1 suricata -c suricata.yaml -i eth5 --runmode=workers

It looks like you can use any number of ringbuffers, so not limited to a power of 2 for example.

Example with CPU affinity

You can also use Suricata’s built in CPU affinity settings to assign a worker to a cpu/core. In this example I’ll create 7 worker threads that will each run on their own logical CPU. The remaining CPU can then be used by the management threads, most importantly the flow manager.

max-pending-packets: 8192

detect-engine:
  - sgh-mpm-context: full

mpm-algo: ac-bs

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
      cpu: [ "0" ]
    - detect-cpu-set:
      cpu: [ "1-7" ]
      mode: "exclusive"
      prio:
        default: "high"

pcap:
  - interface: eth5
    buffer-size: 512kb
    threads: 7
    checksum-checks: no

Then start Suricata with:
SNF_NUM_RINGS=7 SNF_FLAGS=0x1 suricata -c suricata.yaml -i eth5 --runmode=workers

This configuration will reserve cpu0 for the management threads and will assign a worker thread (and thus a ringbuffer) to cpu1 to cpu7. Note that I added a few more performance tricks to it. This config with 7 cpu pinned threads appears to be a little faster than the case where CPU affinity is not used.

Myricom has a nice traffic replay tool as well. This replays a pcap at 1Gbps:
snf_replay -r 1.0 -i 10 -p1 /path/to/pcap

Final remarks

The Myricom card already works nicely with Suricata. Because of the way their libpcap code works, we can already use the ringbuffers feature of the card. Myricom does also offer a native API. Later this year, together with Myricom, we’ll be looking into adding support for it.

Suricata http_user_agent vs http_header

One of the new features in Suricata 1.3 is a new content modifier called http_user_agent. This allows rule writers to match on the User-Agent header in HTTP requests more efficiently. The new keyword is documented in the OISF wiki. In this post, I’ll show it’s efficiency with two examples.

Example 1: rarely matching UA

Consider a signature where the match if on a part of the UA that is very rare, so not part of regular User Agents. In my example “abc”.

The signature looks like this:
alert http any any -> any any (msg:"User-Agent abc http_header"; content:"User-Agent: "; http_header; nocase; content:"abc"; http_header; distance:0; pcre:"/User-Agent:[^\n]*abc/iH"; sid:1; rev:1;)

The http_user_agent variant looks much simpler:
alert http any any -> any any (msg:"User-Agent abc http_user_agent"; content:"abc"; http_user_agent; sid:2; rev:1;)

Now when running this against a pcap with over 12.500 HTTP requests, neither signature matched. However, signature 1 was inspected 209752 times! This high number is because the request headers are inspected one-by-one. Signature 2 wasn’t inspected at all, as it never made it past the multi pattern matching stage (mpm).

When looking at pcap runtime, running with only the http_user_agent version is about 10% faster.

Example 2: commonly matching UA

So, what if we want to match on something that is quite common? In other words, the signature will have frequent matches?

First, the http_header signature:
alert http any any -> any any (msg:"User-Agent MSIE 6 http_header"; content:"User-Agent: "; http_header; nocase; content:"MSIE 6"; http_header; distance:0; pcre:"/User-Agent:[^\n]*MSIE 6/iH"; sid:3; rev:1;)
The http_user_agent variant:
alert http any any -> any any (msg:"User-Agent MSIE 6 http_user_agent"; content:"MSIE 6"; http_user_agent; sid:4; rev:1;)

In this case both signatures do match, just over 10.000 times even. The stats look like this:

Each of the inspections of signature 4, the http_user_agent variant, is actually a match. This makes sense as we look for a simple string and the mpm will only invoke the signature if that string is found. It’s clear that the http_header variant takes way more resources. Here too, when looking at pcap runtime, running with only the http_user_agent version is approximately 10% faster.

Final remarks

It’s quite clear that the http_user_agent keyword is much more efficient that inspecting all the HTTP headers. But other than efficiency, the http_user_agent also allows for much easier to read rules.

The Emerging Threats project will likely fork their Suricata ruleset for 1.3 (see this blog post). Even though this will be a significant effort on their side, it’s pretty clear to me the performance effect will be noticeable!

Suricata MD5 blacklisting

For a few months Suricata has been able to calculate the MD5 checksum of files it sees in HTTP streams. Regardless of extraction to disk, the MD5 could be calculated and logged. Martin Holste created a set of very cool scripts to use the logged MD5 to look it up at VirusTotal and some other similar services. This is done outside of Suricata. One thing I have been wanting to try is matching against these MD5’s in Suricata itself.

In the recent 1.3beta2 release, I’ve added a first attempt at this. The current support is crude but works. I’ve added a rule keyword, called “filemd5”.

Syntax:
filemd5:filename;

The keyword opens the file “filename” from your rule directory and loads it’s content. It expects a heximal MD5 per line:

91849eac70248b01e3723d12988c69ac

Any extra info on a line is ignored, so the output of md5sum can be used safely:

91849eac70248b01e3723d12988c69ac suricata-1.3beta2.tar.gz

At start up, Suricata will tell you how much memory the hash table uses. The hash table is quite compact. It uses hash_rows * 4 + md5’s * 16 bytes. For 20155064 MD5’s it uses a bit more than 300mb:

[3748] 9/6/2012 -- 08:40:44 - (detect-filemd5.c:264) (DetectFileMd5Parse) -- MD5 hash size 324578208 bytes

Performance so far seems to be great. I’ve been testing with 20 million MD5’s and so far I’m not seeing any significant performance impact. The dedicated data structures I created for it seem to hold up quite nicely. Right now the only slow down I see is at start up, where it adds a few seconds. The data structure is currently limited to 32 bit, so a 4GB table. This should allow ~250 million MD5’s, although I haven’t tested that.

As this is a regular rule keyword, it can be combined with other rule keywords, such as filemagic or filename. A sig like “filemagic:pdf; filemd5:bad_pdfs;” would match the list “bad_pdfs” only against pdf files.

I think there are several possible use cases for this new functionality. First, I could imagine a project like Emerging Threats shipping a list of the most recent malware MD5’s. It should be possible to distribute the most recent 100k or so MD5’s.

Second, this could be used as a poor man’s DLP. Hash the files you don’t want to see on your network outbound or unencrypted and have Suricata look for them.

The most interesting use case probably is not implemented yet, but will be. When negated matching is implemented, the filemd5 keyword could be used for white listing.

As Martin Holste tweeted: “Awesome, more than enough to handle all Windows OS files. So can we just do: filemagic:exe; filemd5:!whitelist.txt;?”

I think an alert for all executable downloads that are not “pre approved” is definitely something that can be useful.

Again, the work continues! 🙂