Sunday, August 16, 2015

Exploring Rabbit Holes

First off, thanks to @brucedang for getting me back into this blog. I started writing this thing for myself in hopes of keeping track of what I've done with Bro and how I learned it. It's rewarding to hear other people find it useful too.

I was tossing around a few ideas for a blog post and decided to grep through scripts/ for modules to discuss. You can find them by searching GitHub for "module", but the command I used was:
grep -R '^module' ./* | cut -d':' -f2 -s | sort -u | less -S
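
To make the stages of that pipeline easier to see, here is a small sketch that runs the same grep/cut/sort steps against a throwaway directory. The file names and contents are made up for illustration; only the pipeline itself comes from the command above (minus the pager).

```shell
#!/bin/sh
# Build a scratch tree holding a few hypothetical Bro scripts.
demo=$(mktemp -d)
mkdir -p "$demo/base" "$demo/policy"
printf 'module Known;\nexport { }\n' > "$demo/base/known.bro"
printf 'module Weird;\n' > "$demo/base/weird.bro"
printf 'module Known;\n' > "$demo/policy/known-extra.bro"

# list_modules DIR: print each distinct "module ...;" line found under DIR.
#   grep -R '^module'  emits "path:matching line" for every declaration
#   cut -d':' -f2 -s   keeps the text after the first colon, dropping the path
#                      (-s skips lines without a colon; anything after a
#                      second colon is cut off as well)
#   sort -u            collapses modules declared in more than one file
list_modules() {
    grep -R '^module' "$1" | cut -d':' -f2 -s | sort -u
}

list_modules "$demo"
```

Known is declared in two of the sample files but is listed once, which is the job sort -u does in the original command.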

I came across a few things I hadn't heard of or didn't remember digging into before. The KRB module, which handles Kerberos protocol analysis, was new to me. Other modules were more familiar, but I haven't fully grokked their code yet. Things such as:
  • Known
  • Weird
  • Files
  • PE
  • AppStats

I also found a module called Threading. This confused me because I was under the impression that Bro was single-threaded. The documentation for setting up a Bro cluster even discusses CPU pinning for worker processes. So what was this Threading module used for?

I grepped through scripts/ again for any references to this module with:
grep -R '^module Thread' ./*
One hit, in scripts/base/init-bare.bro. This file is hard-coded to load on every run, even in "bare" mode, where Bro loads only a minimal set of scripts (links to specific lines on GitHub generally aren't stable, so search the source for 'add_input_file("base/init-bare.bro");' instead). I opened the file and found a very small amount of Bro script belonging to that module's namespace. The module's export declaration held a single redefinable constant called "heartbeat_interval", set to 1.0 seconds. There were also comments warning that changing this value will likely break some things.

Interesting. So I searched through scripts/ again, this time looking for references to "heartbeat_interval". Nothing in scripts/ used the value. So why was it defined at every Bro invocation?

At first I thought it must be dead code. Perhaps the module was left over from an unfinished feature, or from a feature that was available in an older version but is no longer supported. Large C++ projects are notorious for accumulating dead code. But this is the Bro team; I should have known better.

I went to GitHub again and looked through the blame around that line in init-bare.bro. The definition of heartbeat_interval was introduced in a merge named "topic/robin/input-thread-merge". I was still confused, so I grepped through src/ to find where this value was used. This time I found a few different .cc files.
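
The two searches (scripts/ first, then src/) boil down to asking which files under a directory mention a given name. Below is a minimal sketch of that step, run against a scratch tree so it is self-contained. The helper name files_mentioning, the .cc file names, and the file contents are all placeholders, and the Bro declaration is my approximation of the one described above, not a copy of the real file.

```shell
#!/bin/sh
# files_mentioning NAME DIR: list the files under DIR that contain NAME.
#   -R  recurse into subdirectories
#   -l  print only the names of matching files
#   -F  treat NAME as a fixed string, not a regular expression
files_mentioning() {
    grep -R -l -F -- "$1" "$2" | sort
}

# Scratch tree standing in for a Bro checkout (hypothetical contents).
tree=$(mktemp -d)
mkdir -p "$tree/scripts/base" "$tree/src"
printf 'module Threading;\nexport {\n\tconst heartbeat_interval = 1.0 secs &redef;\n}\n' \
    > "$tree/scripts/base/init-bare.bro"
printf '// placeholder C++ file that reads the value\ndouble hb = heartbeat_interval;\n' \
    > "$tree/src/reader_example.cc"
printf 'int main() { return 0; }\n' > "$tree/src/unrelated_example.cc"

echo "in scripts/:"
files_mentioning heartbeat_interval "$tree/scripts"
echo "in src/:"
files_mentioning heartbeat_interval "$tree/src"
```

In this toy tree the scripts/ search returns only the file that defines the constant, while the src/ search returns the file that actually reads it. That is the same pattern that sent me from the scripts over to the C++ source.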

Going back to GitHub once more and looking at the blame for these files, I found that the heartbeat_interval variable is used in the Input framework. The Input framework documentation describes the use of threads for reading data into Bro. Thus the mystery was solved: threads are used to read files from disk for Bro to use, as the Intel framework does. I had encountered the use of threads in the Input framework before but didn't truly know how they worked until now.

Pretty neat. 

This is how I found one of the many interesting pieces of Bro. If you poke around the scripts and core source enough, you find a huge amount of interesting code and concepts.