Learning Rust: hash map lookup/insert pattern

In Suricata we’re experimenting with implementing app-layer parsers in Rust. See Pierre Chifflier’s presentation at the last SuriCon: [pdf].

The first experimental parsers will soon land in master.

Coming from the C world, I often use a pattern like:

value = hash_lookup(hashtable, key);
if (!value) {
    hash_insert(hashtable, key, somevalue);
}

Playing with Rust and its HashMap implementation, I wanted to do something very similar: look up a vector and update it with the new data if it exists, or create a new vector if not:

match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
    Some(v) => {
        v.extend(data);
    },
    None => {
        let mut v = Vec::with_capacity(32768);
        v.extend(data);
        self.chunks.insert(self.cur_ooo_chunk_offset, v);
    },
};

Not super compact, but it looks sane to me. However, the borrow checker in the Rust compiler I was using doesn’t accept it:

src/filetracker.rs:233:29: 233:40 error: cannot borrow `self.chunks` as mutable more than once at a time [E0499]
src/filetracker.rs:233                             self.chunks.insert(self.cur_ooo_chunk_offset, v);
                                                   ^~~~~~~~~~~
src/filetracker.rs:233:29: 233:40 help: run `rustc --explain E0499` to see a detailed explanation
src/filetracker.rs:224:27: 224:38 note: previous borrow of `self.chunks` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `self.chunks` until the borrow ends
src/filetracker.rs:224                     match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
                                                 ^~~~~~~~~~~
src/filetracker.rs:235:22: 235:22 note: previous borrow ends here
src/filetracker.rs:224                     match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
...
src/filetracker.rs:235                     };
                                           ^
error: aborting due to previous error

Rust has strict rules on taking references: at any one time there can be either one mutable reference or any number of immutable references, but not both.

The ‘match self.chunks.get_mut(&self.cur_ooo_chunk_offset)’ counts as one mutable reference, and as the error output shows, that borrow lasts until the end of the match. ‘self.chunks.insert(self.cur_ooo_chunk_offset, v)’ inside the match would be the second. Thus the error.
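The rule can be seen in isolation with a small sketch. The function name and the `HashMap<u64, Vec<u8>>` type are made up for illustration and are not taken from filetracker.rs:

```rust
use std::collections::HashMap;

/// Hypothetical helper, not from filetracker.rs: reads through two shared
/// borrows, then appends through one exclusive borrow.
fn shared_then_exclusive(chunks: &mut HashMap<u64, Vec<u8>>, key: u64, data: &[u8]) -> usize {
    // Any number of shared (immutable) borrows may be alive together.
    let before = {
        let a = chunks.get(&key);
        let b = chunks.get(&key);
        a.map_or(0, |v| v.len()) + b.map_or(0, |v| v.len())
    }; // both shared borrows end here

    // Now a single exclusive (mutable) borrow is allowed.
    if let Some(v) = chunks.get_mut(&key) {
        v.extend(data);
        // Uncommenting the next line gives E0499: `v` still borrows `chunks`
        // mutably because it is used again below.
        // chunks.insert(key + 1, Vec::new());
        v.extend(data);
    }
    before
}
```

The commented-out `insert` is the same kind of second mutable borrow that the compiler complained about above.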

My naive way of working around it is this:

let found = match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
    Some(v) => {
        v.extend(data);
        true
    },
    None => false,
};
if !found {
    let mut v = Vec::with_capacity(32768);
    v.extend(data);
    self.chunks.insert(self.cur_ooo_chunk_offset, v);
}

This is accepted by the compiler and works.
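A note for readers on newer toolchains: the error above comes from a compiler that predates non-lexical lifetimes (NLL). Compilers with NLL (the 2018 edition since Rust 1.31) should accept the original match as written, because the borrow taken by get_mut is no longer treated as live in the None arm. A minimal sketch to try this, assuming a hypothetical `Tracker` struct whose field types are guesses and not the real definitions from src/filetracker.rs:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the tracker in src/filetracker.rs; only the two
/// fields used in the post are modeled, and their types are assumptions.
struct Tracker {
    chunks: HashMap<u64, Vec<u8>>,
    cur_ooo_chunk_offset: u64,
}

impl Tracker {
    /// The original match, unchanged except for dropping the unneeded `mut`.
    fn add_chunk(&mut self, data: &[u8]) {
        match self.chunks.get_mut(&self.cur_ooo_chunk_offset) {
            Some(v) => {
                v.extend(data);
            },
            None => {
                // The `get_mut` borrow is not used in this arm, so NLL
                // allows a fresh mutable borrow for `insert`.
                let mut v = Vec::with_capacity(32768);
                v.extend(data);
                self.chunks.insert(self.cur_ooo_chunk_offset, v);
            },
        };
    }
}
```

On the older compiler used in this post, the flag-based workaround is still needed.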

But I wasn’t quite happy yet, so I started looking for something better. I found this post on Stack Overflow (where else?).

It turns out there is a Rust pattern for this:

use std::collections::hash_map::Entry::{Occupied, Vacant};

let c = match self.chunks.entry(self.cur_ooo_chunk_offset) {
    Vacant(entry) => entry.insert(Vec::with_capacity(32768)),
    Occupied(entry) => entry.into_mut(),
};
c.extend(data);

Much better 🙂
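The reason this satisfies the borrow checker is that `entry` takes the single mutable borrow of the map up front and returns an `Entry` value that carries it, so both arms work through that one borrow. Here is the same pattern as a self-contained sketch. The free function `append_chunk` and its parameter types are assumptions for illustration, not the actual Suricata code:

```rust
use std::collections::hash_map::Entry::{Occupied, Vacant};
use std::collections::HashMap;

/// Hypothetical free function standing in for the method in the post.
/// Returns the length of the chunk after appending.
fn append_chunk(chunks: &mut HashMap<u64, Vec<u8>>, offset: u64, data: &[u8]) -> usize {
    // `entry` borrows the map mutably exactly once; each arm turns that
    // borrow into a `&mut Vec<u8>` for the slot at `offset`.
    let c = match chunks.entry(offset) {
        // Key missing: insert an empty, preallocated vector.
        Vacant(entry) => entry.insert(Vec::with_capacity(32768)),
        // Key present: reuse the existing vector.
        Occupied(entry) => entry.into_mut(),
    };
    c.extend(data);
    c.len()
}
```

Calling `append_chunk(&mut map, 0, &[1, 2, 3])` twice on an empty map creates the vector on the first call and appends to it on the second.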

It can even be done in a single line:

(*self.chunks.entry(self.cur_ooo_chunk_offset).or_insert(Vec::with_capacity(32768))).extend(data);

Personally I think this is getting too hard to read, but maybe I just need to grow into Rust syntax a bit more.
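Two details of the one-liner are worth knowing. First, the argument to `or_insert` is evaluated before the call, so `Vec::with_capacity(32768)` allocates a buffer even when the key already exists; `or_insert_with` takes a closure and only runs it for a missing key. Second, the explicit `(*...)` is not required, since method calls dereference the returned `&mut Vec` automatically. A possible variant, again using a hypothetical free function rather than the real method:

```rust
use std::collections::HashMap;

/// Hypothetical helper: append `data` to the chunk at `offset`,
/// creating the chunk lazily if it does not exist yet.
fn append_chunk_lazy(chunks: &mut HashMap<u64, Vec<u8>>, offset: u64, data: &[u8]) {
    chunks
        .entry(offset)
        // The closure runs only when the key is missing, so no buffer is
        // allocated and thrown away when the chunk already exists.
        .or_insert_with(|| Vec::with_capacity(32768))
        .extend(data);
}
```

Split over several lines like this, the chain may also be a little easier on the eyes than the single-line form.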