OISF engine prototype: streams handling

I’ve been thinking about how to deal with streams in the OISF engine. We need to do stream reassembly to be able to handle spliced sessions, otherwise it would be very easy to evade detection. Snort traditionally used an approach of inspecting the packets individually and reassembling (part of) the stream into a pseudo packet that was then inspected mostly like a normal packet. Recent Snort versions, especially since Stream5 was introduced, have a so-called stream API. This enables detection modules to control the reassembly better.

In Snort_inline’s Stream4 I’ve been experimenting with ways to improve stream reassembly in an inline setup. The problem with Snort’s pseudo packet scanning way of operation is that it’s after-the-fact scanning, which means that any threat detected in the reassembled stream can’t be dropped anymore. The way I tried to work around this was by constantly scanning a sliding window of reassembled unacked data. It worked quite well, except that its performance was quite bad.

I’m thinking about a stream reassembler for the OISF engine that can both do the after-the-fact pseudo packet scanning and do a sliding window approach as I did in stream4inline. This would be used for the normal TCP signatures. I think it should be possible to determine the minimal size of the reassembled packet based on the signatures per port, possibly in a more fine-grained way. Of course things like target based reassembly and robust reassembly will be part of it.

In addition to this I’m thinking about a way to make modules act on the stream similarly to how programs deal with sockets. Code that only wakes up if a new piece of data in that connection is received, with semantics similar to recv()/read(). I haven’t really made up my mind about how such an API should work exactly, but I think it would be very useful to detection module writers if they only have to care about handling the next chunk of data.

I haven’t implemented any of this yet, but I plan to start working on this soon. I’ll start with simple TCP state tracking that I’m planning to build on top of the flow handling already implemented. I’ll blog about this as I go…

OISF engine prototype: threading

In January I first wrote about my prototype code for the OISF engine. The first thing I started with when creating the code was the threading. The current code can run as a single thread or with many threads. In my normal testing I run with about 11 threads, 10 of which handle packets, 1 is a management thread.

The basic principle in the threading is that a packet is always handled by only one thread at a time. The reason for this is that it avoids a lot of locking issues. If there is more than one thread, the engine can handle multiple packets simultaneously.

All functionality is created in what I call threading modules. Such a module is run in a thread. Threads can have one or more of the modules running. Examples of these modules are Decoding, Detection, Alerting, etc. I intend to make these modules plugins in the future so that 3rd party modules can be loaded without recompiling the codebase.

The threading model works both in a parallel and a serial way. The parallel way can be used to have multiple threads doing the same jobs, for example having 2 threads that each do packet acquisition, decoding, detection and alerting. The serial way of threading works differently. In that case a thread has a limited number of tasks (e.g. Decoding) and when it’s done with a packet it passes the packet on to the next thread (that for example does Detection). Both methods can be combined, which I use by default: I have 1 packet acquiring thread, 2 decoding, 2 detection, 1 verdict (I’m using nfq), and a few alerting and active response threads.

Between the (serial) threads, queues are used to transfer the packet from one thread to another. A queue can contain multiple packets. In the above example, the nfq packet acquire thread can read packets from the nfqueue as fast as it can and put them in its output queue. The 2 decoding threads then get packets from this queue as fast as they can. Then they put them in the next queue where they are picked up again, etc.

Using the queues, code paths can also be determined. It’s possible for example to have IPv4 packets be handled by different threads than IPv6 packets. Or packets that generated alerts differently from packets that didn’t.

One big challenge is that this is all extremely complex & configurable. Threads have to be created, as do queues and queue handlers, CPU affinity can be set per thread, threading modules need to be assigned to threads, etc. I think power users & appliance builders would be interested in having all these options, but for casual users it’s probably (way) too complex to be bothered with. So some reasonable defaults need to be created, maybe in the form of having a number of pre-configured profiles.

OISF IDS/IPS engine prototype intro

For over a year I’ve been working on a prototype implementation of a new IDS/IPS engine for the Open Infosec Foundation. This is not necessarily going to be the engine we’ll be using in OISF, although it’s likely that at least some of the code will be used. Discussions about features for the engine are still ongoing (wiki, list), once that settles down we’ll see what’s usable and what’s not. In the worst case I still think many parts like hashing functions, pattern matcher implementations, protocol decoders, etc can be used.

So what is there so far? It’s all new code written in the C language and has about 30k lines of code in 150+ files so far. It’s fully threaded in a way that should make it very scalable on many cores/CPUs. More about the threading in a future post. The code is heavily unit tested, which really helps a lot in preventing and tracing bugs.

Right now it’s limited to being an inline IDS/IPS, using the libnetfilter_queue interface in Linux to acquire and verdict packets. The packet input and verdict subsystem is very modular (I learned a lot from the mess we created in Snort_inline, where we supported 3 types of inline packet capture methods, creating a true #ifdef hell). It has working protocol decoders for IPv4 and IPv6, TCP and UDP. It has a flow engine, a detection engine and output plugins.

For rules/signatures it currently only supports the Snort signature syntax, and loads about 70% of the current VRT and Emerging Threats signatures out there. The biggest thing missing is support for the flowbits option, which is used in a lot of the sigs. It has basic HTTP parsing, enabling at least uri matching.

A lot of things are missing too. For example there is no fragment handling, TCP stream state tracking, TCP stream reassembly, pcap mode, portscan detection, flowbits-like functionality, normalization, etc, etc.

There are a lot of plans and ideas, for example having output pipes for configurable captured network data. It’s already possible to capture for example a user agent in a rule and match on that captured data. I think it would be very useful to be able to have some pipe to an external program that receives just the user agents and does something with them. Many, many more ideas and use cases exist and I hope to write about them more at a later stage.

The most interesting thing about writing this code is that every time I’m working on some part, I’m getting more and more ideas about possibilities for improvements, optimizations and such. I intend to share those here on my blog from now on. Also, I intend to write about the various parts of the code I wrote already. So stay tuned!