OISF engine development update (2)

Another quick update on the development of the OISF engine. Overall development is going great. Basics like signature keywords, stream reassembly and IP defragmentation are nearing completion. Unified1 + Barnyard output was already working for quite some time, but now we also have Unified2 compatible output. I’ve tested it with Barnyard2 and Sguil, and it works nicely.

We have the first versions of our new YAML-based configuration format checked in, a brand new logging API, midstream pickup support in our stream engine, native PF_RING support and many other additions.
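
To give a flavor of the format, a config snippet could look something like this (note: the section and key names here are made up for illustration, not our final schema):

    # made-up example only, not the final schema
    outputs:
      - unified2-alert:
          enabled: yes
          filename: unified2.alert

    stream:
      midstream: true    # allow picking up sessions mid-stream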

Next up in development are IP reputation support in the engine, support for advanced vars, and more L7 modules.

IP reputation is one of the features we have high expectations for. In working group discussions we defined 16 categories for which to keep reputation, for example: spammer, CnC, P2P, etc. We’re currently designing the datatypes for holding as much of this info in memory as possible.
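
To sketch the direction (purely my own illustration, the real datatypes are still being designed), one option is a small fixed record per host, with one score per category:

    /* illustration only: one reputation record per host, one score
     * per category; the category list and score width are assumptions */
    #include <stdint.h>

    #define REP_NCATS 16            /* spammer, cnc, p2p, ... */

    typedef struct IPRepRecord_ {
        uint32_t addr;              /* IPv4 host address */
        uint8_t  cats[REP_NCATS];   /* 0-255 reputation per category */
    } IPRepRecord;

At 20 bytes per record, a million hosts would take roughly 20MB, which is why keeping the records small matters so much here.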

Advanced vars are the idea of applying ModSecurity’s var collections to our engine. They will be like Snort’s flowbits, but more advanced. At least two more types will be supported: integers and raw buffers. Integers should enable signatures to modify counters, but also to compare various counters with each other.
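
Internally such a var could be modeled as a tagged union; a minimal sketch under my own assumptions (all names invented):

    /* sketch: a per-flow var that is either a flowbit-style flag,
     * an integer counter, or a raw buffer (e.g. captured data) */
    #include <stdint.h>

    enum VarType { VAR_BIT, VAR_INT, VAR_BUFFER };

    typedef struct GenericVar_ {
        enum VarType type;
        uint16_t idx;                   /* which named var this is */
        union {
            uint8_t bit;                /* set/unset, like a flowbit */
            int64_t value;              /* counter: modify and compare */
            struct { uint8_t *buf; uint32_t len; } buffer;
        } u;
        struct GenericVar_ *next;       /* vars chained per flow */
    } GenericVar;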

In the L7 module development we’re putting a big focus on HTTP (more on that later). We’re also currently working on a DCE/RPC decoder. For this work we hired Kirby Kuehl from Breaking Point. The L7 framework I’m still working on should make developing modules for new protocols fairly straightforward.

We expect to announce the name and mascot of our engine soon. Our bylaws should also finally be done soon, as should the consortium license. Once that’s all in place, we expect to open up our code to the public. Exciting times!

OISF engine development update

The last month has been crazy busy. Development of the engine is progressing nicely. My own role has been assigning tasks to our coders, guiding them, reviewing their work, integrating it and, of course, writing code. We currently have nine people coding, though not all full time, and are still looking for more coders.

Progress has been made on a number of things: we have many more decoders, threading updates, a stats subsystem, stream tracking and reassembly, an L7 protocol parser framework and many more unittests. We’re working on OpenCL hardware acceleration, although we’re running into driver issues, so it may take some time before that’s usable.

On the QA side Will Metcalf is busy setting up an automated test rig, doing daily runs of our unittests on various platforms and with different compiler settings. When that is done, pcap-based tests and live traffic testing are next.

We have set up a number of “working group” mailing lists that discuss different subjects, such as a configuration language and a rule language. Most discussions are still ongoing, but the configuration language discussion seems to have come to a conclusion.

For the configuration language the discussion has settled on using YAML, a structured but still nicely editable format. It has many language bindings, so I hope management tools will be built for it later.

Other discussions, such as the one about IP reputation, are still ongoing. You are very welcome to share your ideas with the group.

As stated above, we’re still looking for coders. If you are a C coder and you’re interested in working with and for us, send us your resume!

OISF engine prototype: streams handling

I’ve been thinking about how to deal with streams in the OISF engine. We need to do stream reassembly to be able to handle spliced sessions; otherwise it would be very easy to evade detection. Snort traditionally used an approach of inspecting the packets individually and reassembling (part of) the stream into a pseudo packet, which was then inspected mostly like a normal packet. Recent Snort versions, especially since Stream5 was introduced, have a so-called stream API. This enables detection modules to control the reassembly better.

In Snort_inline’s Stream4 I’ve been experimenting with ways to improve stream reassembly in an inline setup. The problem with Snort’s pseudo packet approach is that it is after-the-fact scanning, which means that any threat detected in the reassembled stream can no longer be dropped. The way I tried to work around this was by constantly scanning a sliding window of reassembled unacked data. It worked quite well, except that its performance was quite bad.

I’m thinking about a stream reassembler for the OISF engine that can do both the after-the-fact pseudo packet scanning and the sliding window approach I used in stream4inline. This would be used for the normal TCP signatures. I think it should be possible to determine the minimal size of the reassembled packet based on the signatures per port, possibly even more fine grained. Of course things like target based reassembly and robust reassembly will be part of it.
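
To illustrate the sliding window part, here is a minimal self-contained sketch; WIN_MAX and the memmem() based scan are stand-ins for the real reassembler and detection engine:

    /* keep the unacked bytes in a window, rescan on every new segment
     * (so a match can still be dropped inline), slide forward on ack */
    #define _GNU_SOURCE                 /* for memmem() */
    #include <string.h>
    #include <stdint.h>

    #define WIN_MAX 4096                /* window size is an assumption */

    typedef struct StreamWin_ {
        uint8_t  buf[WIN_MAX];
        uint32_t len;                   /* unacked bytes currently held */
    } StreamWin;

    /* append a segment and rescan; returns 1 on a pattern match */
    static int WinAddAndScan(StreamWin *w, const uint8_t *data, uint32_t dlen,
                             const uint8_t *pat, uint32_t plen)
    {
        if (w->len + dlen > WIN_MAX)
            dlen = WIN_MAX - w->len;    /* naive cap, real code must do better */
        memcpy(w->buf + w->len, data, dlen);
        w->len += dlen;
        return memmem(w->buf, w->len, pat, plen) != NULL;
    }

    /* the receiver acked 'acked' bytes: they may leave the window */
    static void WinSlide(StreamWin *w, uint32_t acked)
    {
        if (acked > w->len)
            acked = w->len;
        memmove(w->buf, w->buf + acked, w->len - acked);
        w->len -= acked;
    }

The rescanning of the whole window on every segment is exactly where the stream4inline performance pain came from, so the real implementation will need to be smarter about what it rescans.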

In addition to this I’m thinking about a way to let modules act on the stream similarly to how programs deal with sockets: code that only wakes up when a new piece of data on that connection is received, with semantics similar to recv()/read(). I haven’t really made up my mind about how such an API should work exactly, but I think it would be very useful if detection module writers only have to care about handling the next chunk of data.
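
To give an idea of what I mean, a hypothetical registration API could look like this (all names and signatures invented):

    /* socket-like stream API sketch: the engine delivers each reassembled
     * chunk in order, so a module only handles "the next piece of data",
     * much like looping on recv() */
    #include <stdint.h>

    typedef struct Flow_ Flow;          /* opaque connection handle */

    /* dir: 0 = to server, 1 = to client (assumption) */
    typedef int (*StreamDataCb)(Flow *f, uint8_t dir,
                                const uint8_t *data, uint32_t len,
                                void *module_state);

    /* a module registers once, e.g. for a port or protocol */
    int StreamRegisterHandler(uint16_t app_port, StreamDataCb cb, void *state);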

I haven’t implemented any of this yet, but I plan to start working on it soon. I’ll begin with simple TCP state tracking, built on top of the flow handling already implemented. I’ll blog about this as I go…

OISF engine prototype: threading

In January I first wrote about my prototype code for the OISF engine. The first thing I started with when creating the code was the threading. The current code can run as a single thread or with many threads. In my normal testing I run with about 11 threads: 10 handle packets, 1 is a management thread.

The basic principle in the threading is that a packet is only ever handled by one thread at a time. The reason for this is that it avoids a lot of locking issues. If there is more than one thread, the engine can handle multiple packets simultaneously.

All functionality is created in what I call threading modules. Such a module runs inside a thread, and a thread can run one or more modules. Examples of these modules are Decoding, Detection, Alerting, etc. I intend to make these modules plugins in the future so that third-party modules can be loaded without recompiling the codebase.
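
In code, such a module is essentially a small struct of function pointers; a sketch (actual field names may differ):

    /* threading module sketch: init/deinit manage per-thread state,
     * Func processes one packet at a time */
    typedef struct Packet_ Packet;          /* defined elsewhere */
    typedef struct ThreadVars_ ThreadVars;  /* per-thread context */

    typedef struct TmModule_ {
        const char *name;                   /* "Decode", "Detect", ... */
        int (*ThreadInit)(ThreadVars *tv, void **thread_data);
        int (*Func)(ThreadVars *tv, Packet *p, void *thread_data);
        int (*ThreadDeinit)(ThreadVars *tv, void *thread_data);
    } TmModule;

A thread then simply runs the Func of each of its assigned modules in order for every packet it receives.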

The threading model works both in a parallel and a serial way. The parallel way can be used to have multiple threads doing the same jobs, for example having 2 threads each do packet acquisition, decoding, detection and alerting. The serial way works differently: a thread has a limited number of tasks (e.g. Decoding) and when it’s done with a packet it passes it on to the next thread (which, for example, does Detection). Both methods can be combined, which I use by default: I have 1 packet acquiring thread, 2 decoding, 2 detection, 1 verdict (I’m using nfq), and a few alerting and active response threads.

Between the (serial) threads, queues are used to transfer packets from one thread to another. A queue can contain multiple packets. In the above example, the nfq packet acquisition thread reads packets from the nfqueue as fast as it can and puts them in its queue. The 2 decoding threads then get packets from this queue as fast as they can, put them in the next queue where they are picked up again, and so on.
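
Such a queue can be as simple as a linked list protected by a mutex and condition variable; a minimal sketch:

    /* minimal inter-thread packet queue using pthreads */
    #include <pthread.h>
    #include <stddef.h>

    typedef struct Packet_ {
        struct Packet_ *next;
        /* ... packet data ... */
    } Packet;

    typedef struct PacketQueue_ {
        Packet *head, *tail;
        pthread_mutex_t m;
        pthread_cond_t  cond;
    } PacketQueue;

    void PacketEnqueue(PacketQueue *q, Packet *p) {
        pthread_mutex_lock(&q->m);
        p->next = NULL;
        if (q->tail != NULL)
            q->tail->next = p;
        else
            q->head = p;
        q->tail = p;
        pthread_cond_signal(&q->cond);      /* wake one waiting consumer */
        pthread_mutex_unlock(&q->m);
    }

    Packet *PacketDequeue(PacketQueue *q) {
        pthread_mutex_lock(&q->m);
        while (q->head == NULL)             /* sleep until a packet arrives */
            pthread_cond_wait(&q->cond, &q->m);
        Packet *p = q->head;
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->m);
        return p;
    }

Dequeueing blocks on the condition variable, so idle threads simply sleep until work arrives.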

Using the queues, code paths can also be determined. It’s possible, for example, to have IPv4 packets handled by different threads than IPv6 packets, or packets with alerts handled differently from packets without alerts.

One big challenge is that this is all extremely complex and configurable: threads have to be created, along with queues and queue handlers, CPU affinity can be set per thread, threading modules need to be assigned to threads, etc. I think power users and appliance builders would be interested in having all these options, but for casual users it’s probably (way) too complex to be bothered with. So some reasonable defaults need to be created, maybe in the form of a number of pre-configured profiles.

OISF IDS/IPS engine prototype intro

For over a year I’ve been working on a prototype implementation of a new IDS/IPS engine for the Open Infosec Foundation. This is not necessarily going to be the engine we’ll be using in OISF, although it’s likely that at least some of the code will be used. Discussions about features for the engine are still ongoing (wiki, list); once that settles down we’ll see what’s usable and what’s not. In the worst case I still think many parts like hashing functions, pattern matcher implementations, protocol decoders, etc. can be used.

So what is there so far? It’s all new code written in C, about 30k lines in 150+ files so far. It’s fully threaded in a way that should make it very scalable on many cores/CPUs. More about the threading in a future post. The code is heavily unit tested, which really helps a lot in preventing and tracing bugs.
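
To give an idea: each unittest is just a C function returning success or failure that is registered at startup; the registration call below is illustrative, not the actual API:

    #include <stdint.h>

    /* trivial example test: returns 1 on success, 0 on failure */
    static int DecodeIPv4Test01(void)
    {
        uint8_t raw[] = { 0x45, 0x00 };     /* first bytes of an IPv4 header */
        return (raw[0] >> 4) == 4;          /* check the version nibble */
    }

    /* at startup: UtRegisterTest("DecodeIPv4Test01", DecodeIPv4Test01); */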

Right now it’s limited to being an inline IDS/IPS, using the libnetfilter_queue interface on Linux to acquire and verdict packets. The packet input and verdict subsystem is very modular (I learned a lot from the mess we created in Snort_inline, where we supported 3 types of inline packet capture methods, creating a true #ifdef hell). It has working protocol decoders for IPv4, IPv6, TCP and UDP, and it has a flow engine, a detection engine and output plugins.
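
For those unfamiliar with libnetfilter_queue: the acquire/verdict loop boils down to roughly the following (a condensed sketch with error handling omitted and detection stubbed out):

    #include <stdint.h>
    #include <arpa/inet.h>                  /* ntohl */
    #include <sys/socket.h>                 /* recv */
    #include <linux/netfilter.h>            /* NF_ACCEPT, NF_DROP */
    #include <libnetfilter_queue/libnetfilter_queue.h>

    static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg,
                  struct nfq_data *nfa, void *data)
    {
        struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
        uint32_t id = ph ? ntohl(ph->packet_id) : 0;

        unsigned char *payload = NULL;
        nfq_get_payload(nfa, &payload);     /* decode + detect would run here */

        /* verdict NF_DROP instead when a signature matches in IPS mode */
        return nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);
    }

    int main(void)
    {
        struct nfq_handle *h = nfq_open();
        nfq_bind_pf(h, AF_INET);
        struct nfq_q_handle *qh = nfq_create_queue(h, 0, &cb, NULL);
        nfq_set_mode(qh, NFQNL_COPY_PACKET, 0xffff);

        char buf[4096];
        int fd = nfq_fd(h);
        for (;;) {
            int rv = recv(fd, buf, sizeof(buf), 0);
            if (rv >= 0)
                nfq_handle_packet(h, buf, rv);
        }
    }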

For rules/signatures it currently only supports the Snort signature syntax, and it loads about 70% of the current VRT and Emerging Threats signatures out there. The biggest thing missing is support for the flowbits option, which is used in a lot of the sigs. It has basic HTTP parsing, enabling at least URI matching.

A lot of things are missing too. For example there is no fragment handling, no TCP stream state tracking, no TCP stream reassembly, no pcap mode, no portscan detection, no flowbits-like functionality, no normalization, etc.

There are a lot of plans and ideas, for example having output pipes for configurable captured network data. It’s already possible to capture, for example, a user agent in a rule and match on that captured data. I think it would be very useful to have a pipe to an external program that receives just the user agents and does something with them. Many more ideas and use cases exist and I hope to write about them at a later stage.

The most interesting thing about writing this code is that every time I’m working on some part, I get more and more ideas for improvements, optimizations and such. I intend to share those here on my blog from now on. I also intend to write about the various parts of the code I’ve already written. So stay tuned!