Saturday, November 10, 2018

Visualizing your Zeek (Bro) data with Splunk - conn.log (connection logs)

To be able to visualize this data, we first need to understand its structure. By default, Zeek (Bro) writes its logs in a tab-delimited format. To verify this, let's look at a sample connection log - conn.log.

The tool we will use to help us look at Zeek's (Bro's) data is "bro-cut". As always, you should look at the help of your tools before you utilize them.


root@securitynik-host:/opt/bro/logs/current# bro-cut --help

bro-cut [options] [<columns>]

Extracts the given columns from an ASCII Bro log on standard input.
If no columns are given, all are selected. By default, bro-cut does
not include format header blocks into the output.

Example: cat conn.log | bro-cut -d ts id.orig_h id.orig_p

    -c       Include the first format header block into the output.
    -C       Include all format header blocks into the output.
    -d       Convert time values into human-readable format.
    -D <fmt> Like -d, but specify format for time (see strftime(3) for syntax).
    -F <ofs> Sets a different output field separator.
    -n       Print all fields *except* those specified.
    -u       Like -d, but print timestamps in UTC instead of local time.
    -U <fmt> Like -D, but print timestamps in UTC instead of local time.

For time conversion option -d or -u, the format string can be specified by
setting an environment variable BRO_CUT_TIMEFMT.

For this series of posts we will focus on the "-C" option. This lets us see the field headers, which is important because we need to know the order of the fields to parse them properly in Splunk.

As you may have noticed in the "Example: cat conn.log | bro-cut ..." line above, "bro-cut" typically reads its input from the output of "cat" through a pipe. We will do it slightly differently: we will feed "conn.log" (and future log files) to "bro-cut" directly using the "<" input redirection.

Let's get going with looking at the structure of the "conn.log" file.


root@securitynik-host:/opt/bro/logs/current# bro-cut -C < conn.log | head --lines=10 --verbose
....
#fields ts      uid     id.orig_h       id.orig_p       id.resp_h       id.resp_p       proto   service duration        orig_bytes      resp_bytes      conn_state    local_orig      local_resp      missed_bytes    history orig_pkts       orig_ip_bytes   resp_pkts       resp_ip_bytes   tunnel_parents
#types  time    string  addr    port    addr    port    enum    string  interval        count   count   string  bool    bool    count   string  count   count   count   count   set[string]
1541350796.901974       C54zqz17PXuBv3HkLg      192.168.0.26    54855   54.85.115.89    443     tcp     ssl     0.153642        1147    589     SF      T       F       0       ShADadfF        7       1439    8       921     (empty)
1541350796.904578       CFsKQb2ZSp2qo1jf7a      192.168.0.26    54856   54.85.115.89    443     tcp     ssl     0.195532        1127    489     SF      T       F       0       ShADadfF        7       1419    8       821     (empty)

As we can see above and to the right of "#fields", there are a number of fields starting with "ts", "uid", "id.orig_h", "id.orig_p", etc. Now that we know the structure, let's go back to Splunk and build (extract) these out.
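To make the role of the "#fields" line concrete, here is a minimal Python sketch (outside of Splunk) that reads the header and uses it to name the values on each record. The function name parse_zeek_log and the SAMPLE list are made up for illustration, and the sample is trimmed to the first eight columns of the output above.

```python
def parse_zeek_log(lines):
    """Yield one dict per record, keyed by the names on the '#fields' line."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Everything after the '#fields' tag is a tab-separated column name.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#"):
            # Data line: pair each value with its column name by position.
            yield dict(zip(fields, line.split("\t")))


# Illustrative sample only: the first eight columns of the conn.log output above.
SAMPLE = [
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\tservice",
    "1541350796.901974\tC54zqz17PXuBv3HkLg\t192.168.0.26\t54855\t54.85.115.89\t443\ttcp\tssl",
    "1541350796.904578\tCFsKQb2ZSp2qo1jf7a\t192.168.0.26\t54856\t54.85.115.89\t443\ttcp\tssl",
]

records = list(parse_zeek_log(SAMPLE))
for record in records:
    print(record["id.orig_h"], "->", record["id.resp_h"], record["id.resp_p"])
```

The point is simply that the position of a value on a data line tells you which field it belongs to, which is the same idea the Splunk extraction relies on.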

Our search filter for Splunk is now:

index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields"
| rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" 
|  stats count by ts,uid,orig_h,orig_p,resp_h,resp_p,proto,service,duration,orig_bytes,resp_bytes,conn_state,local_orig,local_resp,missed_bytes,history,orig_pkts,orig_ip_bytes,resp_pkts,resp_ip_bytes

The filter above extracts all fields except "tunnel_parents". When I added this field, Splunk reported an error about needing to reconfigure the "limits.conf" file. I was not in the mood for troubleshooting this issue as it is not a priority at this time.
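If you want to experiment with this kind of positional regex outside of Splunk, the following Python sketch builds a similar pattern from the list of field names. Note two differences that are my own choices, not part of the search above: Python spells named groups as (?P<name>...), and the sketch uses [^\t]* with the tab placed outside each group, whereas the groups in the "rex" above, such as (?<ts>.*?\t), include the trailing tab in the captured value. FIELDS, PATTERN and SAMPLE_LINE are illustrative names.

```python
import re

# The 20 field names extracted by the Splunk search, in positional order.
FIELDS = [
    "ts", "uid", "orig_h", "orig_p", "resp_h", "resp_p", "proto", "service",
    "duration", "orig_bytes", "resp_bytes", "conn_state", "local_orig",
    "local_resp", "missed_bytes", "history", "orig_pkts", "orig_ip_bytes",
    "resp_pkts", "resp_ip_bytes",
]

# One named group per field. [^\t]* stops at the next tab, and the tab
# itself sits between the groups, so it is not part of any captured value.
PATTERN = re.compile(
    "^" + r"\t".join(rf"(?P<{name}>[^\t]*)" for name in FIELDS) + r"\t"
)

# First record from the conn.log output above, rebuilt with real tabs.
SAMPLE_LINE = "\t".join([
    "1541350796.901974", "C54zqz17PXuBv3HkLg", "192.168.0.26", "54855",
    "54.85.115.89", "443", "tcp", "ssl", "0.153642", "1147", "589", "SF",
    "T", "F", "0", "ShADadfF", "7", "1439", "8", "921", "(empty)",
])

row = PATTERN.match(SAMPLE_LINE).groupdict()
print(row["orig_h"], row["resp_h"], row["resp_p"], row["conn_state"])
```

As in the Splunk search, "tunnel_parents" is left out; the final \t in the pattern only requires that something follows "resp_ip_bytes".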

Here is a sample screenshot of all the fields extracted.


Once we have extracted all the fields, what we do with each of them is up to us. Let's expand on this a bit more by looking first at the top 100 source IPs seen by Zeek (Bro).

index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields"
| rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" 
|  stats count by orig_h 
| sort -count limit=100

The above gives us the opportunity to identify the top 100 IP addresses.
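For comparison, the "stats count by orig_h" followed by a descending sort boils down to counting how often each source address appears. Here is a small Python sketch of that idea; the orig_hosts list is hypothetical data standing in for the orig_h column, and top_sources is an illustrative name.

```python
from collections import Counter

# Hypothetical originator addresses, standing in for the orig_h column.
orig_hosts = [
    "192.168.0.26", "192.168.0.26", "192.168.0.4",
    "192.168.0.26", "192.168.0.4", "10.0.0.9",
]


def top_sources(hosts, limit=100):
    """Return (address, count) pairs, most frequent first, capped at limit."""
    return Counter(hosts).most_common(limit)


top = top_sources(orig_hosts)
for host, count in top:
    print(f"{host:15} {count}")
```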

However, just as it is important to know the top IP addresses in your environment, it is also critical to know which ones are unique or rare. To help us with this, let's run another filter.

index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" NOT src_ip=192.168.0.0/24 NOT src_ip=0.0.0.0
| rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" 
| rare limit=25 orig_h

This time we visualize using a pie chart.
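The "rare" idea is the mirror image of the top list: count each source address, then keep the ones with the lowest counts. The Python sketch below shows this on hypothetical data, skipping the local 192.168.0.0/24 network and 0.0.0.0 in the same spirit as the filter above. The names rare_sources and LOCAL_NET and the sample addresses are assumptions for illustration.

```python
from collections import Counter
import ipaddress

# Local network to ignore, mirroring the 192.168.0.0/24 exclusion above.
LOCAL_NET = ipaddress.ip_network("192.168.0.0/24")


def rare_sources(hosts, limit=25):
    """Return the least frequently seen non-local addresses, rarest first."""
    counts = Counter(
        host for host in hosts
        if host != "0.0.0.0" and ipaddress.ip_address(host) not in LOCAL_NET
    )
    # Sort by count ascending; ties are broken by the address string.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:limit]


# Hypothetical source addresses (documentation ranges plus local hosts).
sample_hosts = [
    "198.51.100.7", "198.51.100.7", "198.51.100.7",
    "203.0.113.5", "192.168.0.26", "192.168.0.26", "0.0.0.0",
]

rare = rare_sources(sample_hosts)
for host, count in rare:
    print(f"{host:15} {count}")
```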

As you can see, we used a table view for the first extraction and a pie chart for the second. Feel free to experiment with what best suits you.

Let's move on to the top source and destination IP pairs along with the destination ports on which the communication is occurring.


index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" NOT dst_ip=192.168.0.0/24 NOT dst_ip=208.67.222.222 NOT dst_ip=208.67.220.220 NOT dst_ip=224.0.0.0/8 NOT dst_ip=239.0.0.0/8 NOT dst_ip=255.255.255.255 NOT src_ip=0.0.0.0
| rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" 
|  stats count by orig_h,resp_h,resp_p 
| dedup orig_h,resp_h,resp_p 
| sort -count

The above produces the following:

At this point, let's wrap this up. As you have already extracted all the fields above, you can basically use any of these fields you wish to gain statistics on your environment. I recommend that you look at the destination IPs and ports also.
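As a starting point for looking at destination IPs and ports, here is a rough Python sketch that groups connections by responder address and port, and also tracks which sources talked to each one. The connections list is hypothetical (the first two tuples reuse addresses from the sample records above), and summarize_destinations is an illustrative name.

```python
from collections import defaultdict


def summarize_destinations(connections):
    """Group (orig_h, resp_h, resp_p) tuples by destination address and port."""
    summary = defaultdict(lambda: {"count": 0, "sources": set()})
    for orig_h, resp_h, resp_p in connections:
        entry = summary[(resp_h, resp_p)]
        entry["count"] += 1          # connections seen to this destination
        entry["sources"].add(orig_h)  # distinct originators for it
    # Busiest destinations first.
    return sorted(summary.items(), key=lambda kv: kv[1]["count"], reverse=True)


# Hypothetical connections as (orig_h, resp_h, resp_p).
connections = [
    ("192.168.0.26", "54.85.115.89", "443"),
    ("192.168.0.26", "54.85.115.89", "443"),
    ("192.168.0.4", "198.51.100.7", "80"),
]

destinations = summarize_destinations(connections)
for (resp_h, resp_p), info in destinations:
    print(resp_h, resp_p, info["count"], sorted(info["sources"]))
```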

See you in the next post where we focus on the "http.log" file.

Posts in this series:
Visualizing your Zeek (Bro) data with Splunk - The Setup
Visualizing your Zeek (Bro) data with Splunk - conn.log (connection logs)
Visualizing your Zeek (Bro) data with Splunk - http.log (http logs)
Visualizing your Zeek (Bro) data with Splunk - dns.log (connection logs)
Visualizing your Zeek (Bro) data with Splunk - x509.log (connection logs)
