Monday, February 3, 2014

Conditional Intelligence

Intel Framework Background

Bro's intelligence framework allows programmers to build tables of indicators for Bro to use while monitoring a network. Indicators come in different types; Intel::DOMAIN, Intel::SOFTWARE, and Intel::EMAIL are all indicator types. Given an indicator of type Intel::DOMAIN, Bro watches for that domain in every location where a domain can appear. The intel framework also allows a location to be associated with an indicator. For example, Bro is able to differentiate between 'evil.com' within an HTTP host header and 'evil.com' within a DNS qname. Likewise, an Intel::SOFTWARE indicator can be matched against an HTTP user agent value, an SMTP user agent value, or a server response version value.
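
As a rough illustration of the pieces described above, the sketch below points Bro at an indicator file and prints each hit together with the location it was seen in. It assumes the intel framework as shipped with Bro 2.2; the file path and the 'example-feed' source name are made up for the example.

```bro
# Load the scripts that hand observed values (HTTP headers, DNS queries, ...)
# to the intel framework.
@load frameworks/intel/seen

# Hypothetical path. The file is tab-separated (<TAB> marks a tab here):
#   #fields<TAB>indicator<TAB>indicator_type<TAB>meta.source
#   evil.com<TAB>Intel::DOMAIN<TAB>example-feed
redef Intel::read_files += { "/tmp/example-intel.dat" };

event Intel::match(s: Intel::Seen, items: set[Intel::Item])
    {
    # s$where separates, e.g., a hit in an HTTP host header from a hit in DNS.
    if ( s?$indicator )
        print fmt("%s seen in %s", s$indicator, s$where);
    }
```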

Bro is able to do these types of tasks easily because of how it handles streams. Bro parses the protocols it recognizes and creates objects for each connection (an originating and a responding stream) as the connection occurs. Protocols have predefined (but redefinable) structures associated with them (Bro uses its record data type for this). To see which identifiers exist within a connection, try printing a connection object to STDOUT from a connection_state_remove event. A connection object is not only a container for protocol objects; it also carries additional fields used internally by different Bro scripts. For example, look at how the c$dns$ready value is used within %PREFIX%/bro/share/bro/base/protocols/dns/main.bro, which handles creating parts of the DNS object within connection objects.
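
The suggestion to print a connection object amounts to a handler as small as this one:

```bro
event connection_state_remove(c: connection)
    {
    # Dumps the whole record, including whichever protocol sub-records
    # (c$http, c$dns, ...) were populated for this connection.
    print c;
    }
```

Save it to a file and run it against a capture, for example with bro -r trace.pcap print-conn.bro (both file names are placeholders).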

The intel framework essentially allows for the monitoring of indicators (string values) within a selection of object identifiers, aka fields. This is very powerful. A large number of object identifiers are provided by Bro out of the box. Additionally, objects are redefinable by programmers, and the fields/identifiers within them don't need to be extracted from values within the connection; they can be derived from any other source within scriptland (any other identifier/field added to protocol records with redef statements).
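
To make the "derived field" idea concrete, here is a minimal sketch that adds a field to the HTTP record with redef and hands its value to the intel framework. The ua_product field is hypothetical and not part of Bro; the Intel::seen call assumes the Bro 2.2 form of the Intel::Seen record.

```bro
@load base/protocols/http
@load frameworks/intel/seen

redef record HTTP::Info += {
    # Hypothetical derived field: the product token of the user agent.
    ua_product: string &optional;
};

event http_header(c: connection, is_orig: bool, name: string, value: string)
    {
    # Bro hands header names to this event in upper case.
    if ( ! is_orig || name != "USER-AGENT" || ! c?$http )
        return;

    # Strip everything from the first "/" on,
    # e.g. "Wget/1.13.4 (linux-gnu)" becomes "Wget".
    c$http$ua_product = sub(value, /\/.*/, "");

    Intel::seen([$indicator=c$http$ua_product,
                 $indicator_type=Intel::SOFTWARE,
                 $conn=c,
                 $where=HTTP::IN_USER_AGENT_HEADER]);
    }
```

The value never appears on the wire in this form; it only exists because a script computed it, which is exactly what makes it usable as something to match indicators against.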


Conditional Intelligence (aka rules)

I've been playing around with Bro's intelligence framework the past month or so and have created a prototype extension for it that allows conditional rules to be built on top of indicators. Inspired in part by Yara's combination of strings and conditions and IOC's search capability, this extension's goal is to allow for complex situations to be recognized by Bro without the need to write and rewrite different event handlers. For example, consider HTTP connections to 'www.google.com' with a user agent of 'Wget/1.13.4 (linux-gnu)', a URI path of '/foo' or '/bar', and a return status code of 404. Generating this type of connection is easily done by running

wget www.google.com/foo

from the command line. Identifying this within a Bro script is easily done with an event handler such as:

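event connection_state_remove(c: connection)
{
    if ( ! c?$http )
        return;

    local h = c$http;

    # These fields are &optional, so confirm they are set before reading them.
    if ( ! h?$host || ! h?$uri || ! h?$user_agent || ! h?$status_code )
        return;

    # The response status lives in status_code (a count), not status.
    if ( h$host == "www.google.com" &&
         (/^\/foo/ in h$uri || /^\/bar/ in h$uri) &&
         h$user_agent == "Wget/1.13.4 (linux-gnu)" &&
         h$status_code == 404 )
    {
        print "found a connection";
    }
}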

If one day Google decides to change their server's HTTP response status code from a 404 to a 401, the above code needs to be modified or a new event handler has to be written. Maintaining a list of rules and indicators would be much easier than maintaining sets of event handling scripts. The above example using Google and wget is already built into the Rules extension's indicator and rules files found here. Rules can consist of: only indicators, nested rules (yo dawg, rules of rules), or mixed rules (rules of indicators and rules). By building a framework where indicators and rules can be reused and recombined, the amount of script writing needed to identify when something very specific happens can be reduced.
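
The three kinds of rules can be modeled with very little script. What follows is only a sketch of the idea, not the extension's actual data model: the Rule record, rule_table, rule_matches, and all of the IDs are hypothetical names, and there is no protection against a rule that contains itself.

```bro
type Rule: record {
    # T: every member must hit (AND). F: one hit is enough (OR).
    require_all: bool &default=T;
    # IDs of member indicators.
    indicators: set[string] &default=string_set();
    # IDs of nested rules.
    rules: set[string] &default=string_set();
};

global rule_table: table[string] of Rule;

# Returns T if rule `id` is satisfied by the indicator IDs in `hits`.
function rule_matches(id: string, hits: set[string]): bool
    {
    if ( id !in rule_table )
        return F;

    local r = rule_table[id];
    local ok = F;

    for ( i in r$indicators )
        {
        ok = (i in hits);
        # AND stops at the first miss, OR stops at the first hit.
        if ( ok != r$require_all )
            return ok;
        }

    for ( n in r$rules )
        {
        ok = rule_matches(n, hits);
        if ( ok != r$require_all )
            return ok;
        }

    # AND with no misses is T; OR with no hits is F.
    return r$require_all;
    }

event bro_init()
    {
    # '/foo' OR '/bar', then nested inside the AND rule for the wget example.
    rule_table["uri"] = [$require_all=F, $indicators=set("uri_foo", "uri_bar")];
    rule_table["wget_404"] = [$indicators=set("host", "agent", "status"),
                              $rules=set("uri")];

    print rule_matches("wget_404", set("host", "agent", "status", "uri_foo"));
    }
```

With this shape, reacting to Google's hypothetical switch from 404 to 401 means swapping one indicator ID in one rule, and the "uri" rule stays reusable by any other rule that needs it.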

Currently, the largest limitation of the extension is scalability. Rules need unique identifiers for the indicators they reference. This was accomplished by expanding the metadata fields associated with indicators (provided by Bro's intel framework). Unfortunately, Bro does not distribute metadata to all worker nodes (for good reasons). This limits the Rules extension to standalone instances. Additionally, Bro's intel framework only supports string indicators. I have yet to determine if not having regular expressions is a hindrance or a blessing. I'll keep everyone posted :)
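
For anyone curious what expanding the metadata looks like, the general mechanism is a redef on the framework's metadata record. This is a sketch, assuming Bro 2.2; the indicator_id field name is hypothetical and not necessarily what the extension uses.

```bro
redef record Intel::MetaData += {
    # Hypothetical field: a unique ID that rules can use to reference
    # this indicator. It would be filled from a meta.indicator_id column
    # in the intel file.
    indicator_id: string &optional;
};

event Intel::match(s: Intel::Seen, items: set[Intel::Item])
    {
    for ( item in items )
        {
        # Only indicators loaded with an ID can take part in rules.
        if ( item$meta?$indicator_id )
            print fmt("indicator %s hit", item$meta$indicator_id);
        }
    }
```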