To visualize this data, we first need to understand its structure. By default, Zeek (Bro) writes its logs in a tab-delimited format. To verify this, let's look at a sample connection log, conn.log.
The tool we will use to help us look at Zeek's (Bro's) data is "bro-cut". As always, you should look at the help of your tools before you utilize them.
root@securitynik-host:/opt/bro/logs/current# bro-cut --help

bro-cut [options] [<columns>]

Extracts the given columns from an ASCII Bro log on standard input.
If no columns are given, all are selected. By default, bro-cut does
not include format header blocks into the output.

Example: cat conn.log | bro-cut -d ts id.orig_h id.orig_p

    -c       Include the first format header block into the output.
    -C       Include all format header blocks into the output.
    -d       Convert time values into human-readable format.
    -D <fmt> Like -d, but specify format for time (see strftime(3) for syntax).
    -F <ofs> Sets a different output field separator.
    -n       Print all fields *except* those specified.
    -u       Like -d, but print timestamps in UTC instead of local time.
    -U <fmt> Like -D, but print timestamps in UTC instead of local time.

For time conversion option -d or -u, the format string can be specified by
setting an environment variable BRO_CUT_TIMEFMT.
For this series of posts we will focus on the "-C" option, which includes all format header blocks in the output and so lets us see the field headers. This is important because we need to know the position of each field to parse it properly in Splunk.
As you may have seen above in "Example: cat conn.log | bro-cut ...", the input to "bro-cut" typically comes from the output of "cat", which is piped in. We will do it slightly differently: we will provide "conn.log" (and future log files) as input to "bro-cut" using the "<" redirection.
Let's get going with looking at the structure of the "conn.log" file.
root@securitynik-host:/opt/bro/logs/current# bro-cut -C < conn.log | head --lines=10 --verbose
....
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig local_resp missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes tunnel_parents
#types time string addr port addr port enum string interval count count string bool bool count string count count count count set[string]
1541350796.901974 C54zqz17PXuBv3HkLg 192.168.0.26 54855 54.85.115.89 443 tcp ssl 0.153642 1147 589 SF T F 0 ShADadfF 7 1439 8 921 (empty)
1541350796.904578 CFsKQb2ZSp2qo1jf7a 192.168.0.26 54856 54.85.115.89 443 tcp ssl 0.195532 1127 489 SF T F 0 ShADadfF 7 1419 8 821 (empty)
As we can see above and to the right of "#fields", there are a number of fields starting with "ts", "uid", "id.orig_h", "id.orig_p", etc. Now that we know the structure, let's go back to Splunk and build (extract) these out.
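To make the field positions concrete outside of Splunk, here is a minimal Python sketch that reads the "#fields" header and pairs each name with the matching value from a data row. The helper names `parse_fields` and `parse_row` are hypothetical, and the sample strings are rebuilt from the output shown above on the assumption that the real file separates values with tabs.

```python
# Header and first data row from the conn.log output above, rebuilt with
# tab separators (assumption: the on-disk file is tab-delimited).
HEADER = "#fields\t" + "\t".join([
    "ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
    "proto", "service", "duration", "orig_bytes", "resp_bytes",
    "conn_state", "local_orig", "local_resp", "missed_bytes", "history",
    "orig_pkts", "orig_ip_bytes", "resp_pkts", "resp_ip_bytes",
    "tunnel_parents",
])
ROW = "\t".join([
    "1541350796.901974", "C54zqz17PXuBv3HkLg", "192.168.0.26", "54855",
    "54.85.115.89", "443", "tcp", "ssl", "0.153642", "1147", "589", "SF",
    "T", "F", "0", "ShADadfF", "7", "1439", "8", "921", "(empty)",
])


def parse_fields(header_line):
    """Return the column names from a '#fields' header line."""
    parts = header_line.rstrip("\n").split("\t")
    if parts[0] != "#fields":
        raise ValueError("not a #fields header line")
    return parts[1:]


def parse_row(fields, line):
    """Map each column name to the value at the same position in the row."""
    values = line.rstrip("\n").split("\t")
    return dict(zip(fields, values))


fields = parse_fields(HEADER)
record = parse_row(fields, ROW)
for position, name in enumerate(fields, start=1):
    print(position, name, record[name])
```

The printed positions are the same ordering that any extraction has to follow, whether it is done here or in Splunk.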
Our search filter for Splunk is now:
index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" | rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" | stats count by ts,uid,orig_h,orig_p,resp_h,resp_p,proto,service,duration,orig_bytes,resp_bytes,conn_state,local_orig,local_resp,missed_bytes,history,orig_pkts,orig_ip_bytes,resp_pkts,resp_ip_bytes
The filter above extracts all fields except "tunnel_parents". When I added this field, Splunk reported an error about needing to reconfigure the "limits.conf" file. I did not troubleshoot this issue, as it is not a priority at this time.
Here is a sample screenshot of all the fields extracted.
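The "rex" pattern above is long and repetitive, so one option is to generate it from the field list instead of typing it by hand. The sketch below is only an illustration: `splunk_group_name` and `build_rex_pattern` are hypothetical helpers, and the "id." prefix is dropped because regex group names cannot contain a dot, which matches the names used in the filter above (orig_h, resp_p and so on).

```python
# The 20 conn.log fields extracted by the filter above
# ("tunnel_parents" is left out, as in the original query).
CONN_FIELDS = [
    "ts", "uid", "id.orig_h", "id.orig_p", "id.resp_h", "id.resp_p",
    "proto", "service", "duration", "orig_bytes", "resp_bytes",
    "conn_state", "local_orig", "local_resp", "missed_bytes", "history",
    "orig_pkts", "orig_ip_bytes", "resp_pkts", "resp_ip_bytes",
]


def splunk_group_name(zeek_field):
    """Drop the 'id.' prefix so the name is a valid regex group name."""
    return zeek_field[3:] if zeek_field.startswith("id.") else zeek_field


def build_rex_pattern(fields):
    """Build one lazy, tab-terminated named group per field, in order."""
    return "".join(r"(?<%s>.*?\t)" % splunk_group_name(name) for name in fields)


pattern = build_rex_pattern(CONN_FIELDS)
# Print a fragment that can be pasted after the base search.
print('| rex field=_raw "%s"' % pattern)
```

The same helper could be pointed at the "#fields" line of any other log to produce a starting pattern for that log.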
Once we have extracted all the fields, what we do with each of them is up to us. Let's expand on this by first looking for the top 100 source IPs seen by Zeek (Bro):
index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" | rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" | stats count by orig_h | sort 100 -count
The above gives us the opportunity to identify the top 100 IP addresses.
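For readers who want to sanity-check the Splunk numbers, the same "count by source, sort descending, keep the top N" logic can be sketched in Python. This is an illustration only: `top_sources` is a hypothetical helper, and the sample rows are made up for the example (only 192.168.0.26 appears in the log output earlier).

```python
from collections import Counter

# Made-up rows for illustration; real rows would come from conn.log.
SAMPLE_ROWS = [
    {"id.orig_h": "192.168.0.26"},
    {"id.orig_h": "192.168.0.30"},
    {"id.orig_h": "192.168.0.26"},
    {"id.orig_h": "192.168.0.26"},
    {"id.orig_h": "192.168.0.30"},
    {"id.orig_h": "192.168.0.41"},
]


def top_sources(rows, n=100):
    """Count connections per source IP and return the n most frequent."""
    counts = Counter(row["id.orig_h"] for row in rows)
    return counts.most_common(n)


for source, count in top_sources(SAMPLE_ROWS, n=100):
    print(source, count)
```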
However, just as it is important to know the top IP addresses in your environment, it is also critical that you know the unique or rare ones. To help us with this, let's run another filter.
index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" NOT src_ip=192.168.0.0/24 NOT src_ip=0.0.0.0 | rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" | rare limit=25 orig_h
This time we visualize using a pie chart.
As you saw, we used a table view for the first extraction and a pie chart for the second. Feel free to experiment with whatever suits you best.
Let's move on to the top source and destination IP pairs along with the destination ports on which the communication is occurring.
index=_* OR index=* sourcetype=Bro-Security-Monitoring source="/opt/bro/logs/current/conn.log" NOT "#fields" NOT dst_ip=192.168.0.0/24 NOT dst_ip=208.67.222.222 NOT dst_ip=208.67.220.220 NOT dst_ip=224.0.0.0/8 NOT dst_ip=239.0.0.0/8 NOT dst_ip=255.255.255.255 NOT src_ip=0.0.0.0 | rex field=_raw "(?<ts>.*?\t)(?<uid>.*?\t)(?<orig_h>.*?\t)(?<orig_p>.*?\t)(?<resp_h>.*?\t)(?<resp_p>.*?\t)(?<proto>.*?\t)(?<service>.*?\t)(?<duration>.*?\t)(?<orig_bytes>.*?\t)(?<resp_bytes>.*?\t)(?<conn_state>.*?\t)(?<local_orig>.*?\t)(?<local_resp>.*?\t)(?<missed_bytes>.*?\t)(?<history>.*?\t)(?<orig_pkts>.*?\t)(?<orig_ip_bytes>.*?\t)(?<resp_pkts>.*?\t)(?<resp_ip_bytes>.*?\t)" | stats count by orig_h,resp_h,resp_p | dedup orig_h,resp_h,resp_p | sort -count
The above produces the following:
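As a cross-check outside Splunk, the pair counting and the destination exclusions from the filter can be sketched in Python with the standard "ipaddress" module. The helper names `is_excluded` and `top_conversations` are hypothetical, the excluded networks are copied from the filter above, and the sample rows are made up for illustration (203.0.113.10 is a documentation address, not one seen in the logs).

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# Destinations excluded by the filter above, written as networks.
EXCLUDED_DESTINATIONS = [ip_network(net) for net in (
    "192.168.0.0/24", "208.67.222.222/32", "208.67.220.220/32",
    "224.0.0.0/8", "239.0.0.0/8", "255.255.255.255/32",
)]

# Made-up rows for illustration; real rows would come from conn.log.
SAMPLE_ROWS = [
    {"id.orig_h": "192.168.0.26", "id.resp_h": "54.85.115.89", "id.resp_p": "443"},
    {"id.orig_h": "192.168.0.26", "id.resp_h": "54.85.115.89", "id.resp_p": "443"},
    {"id.orig_h": "192.168.0.26", "id.resp_h": "208.67.222.222", "id.resp_p": "53"},
    {"id.orig_h": "192.168.0.26", "id.resp_h": "192.168.0.1", "id.resp_p": "53"},
    {"id.orig_h": "0.0.0.0", "id.resp_h": "255.255.255.255", "id.resp_p": "67"},
    {"id.orig_h": "192.168.0.30", "id.resp_h": "203.0.113.10", "id.resp_p": "80"},
]


def is_excluded(destination):
    """True if the destination IP falls inside any excluded network."""
    address = ip_address(destination)
    return any(address in net for net in EXCLUDED_DESTINATIONS)


def top_conversations(rows):
    """Count (source, destination, destination port) tuples, most frequent first."""
    counts = Counter(
        (row["id.orig_h"], row["id.resp_h"], row["id.resp_p"])
        for row in rows
        if row["id.orig_h"] != "0.0.0.0" and not is_excluded(row["id.resp_h"])
    )
    return counts.most_common()


for (source, destination, port), count in top_conversations(SAMPLE_ROWS):
    print(source, destination, port, count)
```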
At this point, let's wrap this up. As you have already extracted all the fields above, you can basically use any of these fields you wish to gain statistics on your environment. I recommend that you look at the destination IPs and ports also.
See you in the next post where we focus on the "http.log" file.
Posts in this series:
Visualizing your Zeek (Bro) data with Splunk - The Setup
Visualizing your Zeek (Bro) data with Splunk - conn.log (connection logs)
Visualizing your Zeek (Bro) data with Splunk - http.log (http logs)
Visualizing your Zeek (Bro) data with Splunk - dns.log (connection logs)
Visualizing your Zeek (Bro) data with Splunk - x509.log (connection logs)