Friday, August 9, 2013

Intro to Brogramming - Complex Data Types

The previous section covered variable conventions and basic atomic data types in Bro. This section builds on it by introducing complex data structures. These complex types are composed of atomic types and behave differently from them. Below is a list of the complex data structures Bro has available for use.
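  • enum - An enum is a type defined by a fixed list of named labels that have no further structure. Enums are strange: the labels carry no associated value, so I think of them as a set of names with no underlying type. Bro uses enums for things like log stream IDs and notice types.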
  • records - Records are collections of related things. A record can be thought of as a single row within a table or as a C-like structure. Each field in a record has a type. Records are very important in Bro. Logging is done with records. Other data structures, such as the connection type, are built using records.
  • sets - Sets are unordered collections of unique values that share a type. Think of a set as a list of things of one type in which nothing appears twice. A set of strings could list user-agents.
  • tables - Tables are just associative arrays. Tables have keys which map to values. Tables in Bro are similar to hashes in Perl/Ruby and dictionaries in Python. In Perl a common data structure is a hash of hashes; in Bro it is entirely possible to have a table of tables.
  • vectors - Vectors behave like tables that are always indexed by a count, starting at 0. I heard that vectors may be on their way out of the Bro code, so try not to get too attached to them.
  • functions - Functions are named blocks of code, surrounded by curly braces, that can be reused. Functions sometimes return a value, and a function that does must declare the type it returns. If a function doesn't return a value, the return type is left off and Bro treats it as void.
  • events - Events are raised by the C++ event engine within Bro. Code within event blocks is executed when Bro's core raises that event. Events often get passed parameters from the event engine to use within the user-defined code block. For example, when the event engine raises a DNS request event it passes information about that request to scriptland, like the query string, the IP address making the request, the IP address the request is being sent to, and a bunch of other information.
  • file - A file handle. Bro can write to files, and often does within the logging framework, but the Input framework should be used to read from files. (We'll get to frameworks later on.)
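To make the list above a little more concrete, here is a minimal sketch of what declaring several of these types can look like. Every name in it (ExampleColor, ExampleHost, example_agents, and so on) is made up for illustration and does not come from the script in this post. It assumes a Bro 2.x install, where the startup event is called bro_init.

```bro
# All names here are hypothetical, chosen only for illustration.

# An enum is just a list of labels.
type ExampleColor: enum { EX_RED, EX_GREEN, EX_BLUE };

# A record groups typed fields, much like a C struct.
type ExampleHost: record {
    ip: addr;
    name: string;
};

# A set holds unique values of one type.
global example_agents: set[string] = { "curl", "wget" };

# A table maps keys to values.
global example_services: table[port] of string = {
    [22/tcp] = "ssh",
    [80/tcp] = "http"
};

# A vector is indexed by count, starting at 0.
global example_names: vector of string = vector("alice", "bob");

# A function that returns a value declares its return type.
function example_describe(h: ExampleHost): string
    {
    return fmt("%s (%s)", h$name, h$ip);
    }

# bro_init is raised once when Bro starts up.
event bro_init()
    {
    local h: ExampleHost = [$ip=10.0.0.1, $name="gateway"];
    print example_describe(h);
    print example_services[22/tcp];
    print example_names[0];
    print EX_BLUE;
    }
```

Saving this to a file and running it with bro on the command line, with no traffic source, should be enough to see the print statements fire.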
Other variable types exist within the Bro source; however, they are built on combinations of the simple and complex data types presented in this and the previous section, and are mostly syntactic sugar. The connection type I mentioned earlier is one such type. It is easy to imagine how these types can be combined to create many different data structures. For example, a database table could be represented by a table of records.
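As a rough sketch of that last idea, a table of records could be declared like this. The ExampleUserRow type and the example_users table are hypothetical names I made up for illustration, and the event name assumes Bro 2.x.

```bro
# Hypothetical "row" type: one record per row of the imagined database table.
type ExampleUserRow: record {
    name: string;
    uid: count;
};

# The table key plays the role of a primary key.
global example_users: table[count] of ExampleUserRow;

event bro_init()
    {
    example_users[1] = [$name="alice", $uid=1000];
    example_users[2] = [$name="bob", $uid=1001];

    # Look up a row by key, then pull a field out of the record with $.
    print example_users[2]$name;
    }
```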

Because Bro deals with network traffic, which is highly volatile and often high volume, data goes stale quickly and Bro needs a way to deal with this.
Attribute decorations are used to give a variable a special property, such as when the variable expires (variable timers), or whether a field is required to be defined (e.g. within a record). The Bro website details all the built-in data types at great length, so I won't. It can be found here.
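Here is a small sketch of a few attributes in use, assuming Bro 2.x. The attributes themselves (&optional, &default, &create_expire) are real Bro attributes, but the type and variable names are hypothetical. Keep in mind that expiration is driven by Bro's clock, so nothing will visibly expire in a script that only runs bro_init with no traffic.

```bro
# Hypothetical record: "note" may be left undefined, "tries" starts at 0.
type ExampleSession: record {
    user: string;
    note: string &optional;
    tries: count &default=0;
};

# Entries are dropped 5 minutes after they were added.
global example_recent_hosts: set[addr] &create_expire=5 min;

# Looking up a missing key yields 0 instead of an error.
global example_hits: table[addr] of count &default=0;

event bro_init()
    {
    local s: ExampleSession = [$user="alice"];

    # ?$ tests whether an optional field has been set.
    if ( ! s?$note )
        print "no note set";
    print s$tries;

    add example_recent_hosts[192.168.1.10];

    # The &default makes this safe even though the key is not there yet.
    example_hits[192.168.1.10] = example_hits[192.168.1.10] + 1;
    print example_hits[192.168.1.10];
    }
```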

Run and read this script and be sure you understand what is going on.

Within the script the following occurs:
  • A new variable type is built using a record type. 
  • Values are assigned to two instantiations of the NewType and printed to STDOUT. 
  • Sets of ports and strings are created, and values are added to and removed from them. 
  • The size of the set is calculated and printed to STDOUT. 
  • A port table is created, accessed, and printed to STDOUT using a for loop (we'll get to control flow later).
    Run the script a few times and pay attention to the order in which the table values are printed to STDOUT each time. For loops over tables and sets in Bro are unordered. Each value is visited exactly once within the loop, but the order is NOT guaranteed. This is rather different from the languages I was used to, and it's bitten me more than once. Be aware of the unordered for loop.
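
The set and table steps listed above can be sketched roughly as follows. This is not the script from this post, just a minimal stand-in with made-up names (example_ports, example_port_names), assuming Bro 2.x.

```bro
# Hypothetical set and table, for illustration only.
global example_ports: set[port] = { 22/tcp, 80/tcp };

global example_port_names: table[port] of string = {
    [22/tcp] = "ssh",
    [53/udp] = "dns",
    [443/tcp] = "https"
};

event bro_init()
    {
    # Add one element and remove another.
    add example_ports[443/tcp];
    delete example_ports[80/tcp];

    # |x| gives the number of elements; the set now holds 2 ports.
    print |example_ports|;

    # Membership test with "in".
    if ( 22/tcp in example_ports )
        print "22/tcp is in the set";

    # The loop variable is the table key; the order is not guaranteed.
    for ( p in example_port_names )
        print fmt("%s -> %s", p, example_port_names[p]);
    }
```

If a particular order matters, don't rely on the loop to provide it.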
