Monday, October 9, 2023

Beginning Fourier Transform - Detecting Beaconing in our networks

Before digging any deeper, I must state, this notebook/post heavily leverages the work done by Joe Petroske on "Hunting Beacon Activity with Fourier Transforms" along with his notebook on GitHub at https://github.com/target/Threat-Hunting/blob/master/Beacon%20Hunting/find_beacons_by_fourier.ipynb

More importantly, it ties together what we teach in the SANS SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals as it relates to leveraging Fourier Analysis to find beacons: https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/

As mentioned above, this notebook/post leans heavily on that content, but we will move from a problem to a solution. Meaning, we will start from scratch and then implement the solution, once again based heavily on Joe's code. This way, when you are about to implement this in your environment, you are clear on how you can solve your problems.

You can grab the link to my notebook from my GitHub:

Issue/Problem/Concern:

One day, while capturing some packets for an unrelated issue, I saw the following:

securitynik@peeper:~$ sudo tcpdump -n --interface 2 '(port 53) and not (host 127.0.0.1)' -c 10  
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode  
listening on any, link-type EN10MB (Ethernet), snapshot length 262144 bytes  
18:39:24.124355 IP 10.0.0.9.46088 > 10.0.0.2.53: 40639+ A? somedomain.securitynik.local. (44)  
18:39:24.124604 IP 10.0.0.2.53 > 10.0.0.9.46088: 40639 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME   securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.55, A 172.16.16.211 (203)  
18:39:26.134773 IP 10.0.0.9.50992 > 10.0.0.2.53: 40640+ A? somedomain.securitynik.local. (44)  
18:39:26.135072 IP 10.0.0.2.53 > 10.0.0.9.50992: 40640 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME     securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.211, A 172.16.16.55 (203)  
18:39:28.144568 IP 10.0.0.9.49995 > 10.0.0.2.53: 40641+ A? somedomain.securitynik.local. (44)  
18:39:28.144829 IP 10.0.0.2.53 > 10.0.0.9.49995: 40641 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME   securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.55, A 172.16.16.211 (203)  
18:39:29.172416 IP 10.0.0.32.41636 > 10.0.0.2.53: 2+ A? pool.ntp.org. (30)  
18:39:29.181785 IP 10.0.0.2.53 > 10.0.0.32.41636: 2 4/0/0 A 162.159.200.123, A 137.220.55.232, A 217.180.209.214, A 209.115.181.107 (94)
...
 

Did you see anything interesting? 

I doubt that, at first glance, you saw what the issue is. Do you see it now that I have highlighted the times below?

securitynik@peeper:~$ sudo tcpdump -n --interface 2 '(port 53) and not (host 127.0.0.1)' -c 10  
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode  
listening on any, link-type EN10MB (Ethernet), snapshot length 262144 bytes  
**18:39:24**.124355 IP 10.0.0.9.46088 > 10.0.0.2.53: 40639+ A? somedomain.securitynik.local. (44)  
**18:39:24**.124604 IP 10.0.0.2.53 > 10.0.0.9.46088: 40639 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME   securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.55, A 172.16.16.211 (203)  
**18:39:26**.134773 IP 10.0.0.9.50992 > 10.0.0.2.53: 40640+ A? somedomain.securitynik.local. (44)  
**18:39:26**.135072 IP 10.0.0.2.53 > 10.0.0.9.50992: 40640 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME   securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.211, A 172.16.16.55 (203)  
**18:39:28**.144568 IP 10.0.0.9.49995 > 10.0.0.2.53: 40641+ A? somedomain.securitynik.local. (44)  
**18:39:28**.144829 IP 10.0.0.2.53 > 10.0.0.9.49995: 40641 4/0/0 CNAME somedomain.ca.securitynik.local., CNAME   securitynik-something.us-east-1.elb.amazonaws.com., A 172.16.16.55, A 172.16.16.211 (203)  
18:39:29.172416 IP 10.0.0.32.41636 > 10.0.0.2.53: 2+ A? pool.ntp.org. (30)  
18:39:29.181785 IP 10.0.0.2.53 > 10.0.0.32.41636: 2 4/0/0 A 162.159.200.123, A 137.220.55.232, A 217.180.209.214, A 209.115.181.107 (94)  
...

It seems this DNS query is being made every 2 seconds.  
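To put a number on that hunch, here is a minimal sketch that measures the gaps between the three query timestamps copied from the tcpdump output above. The variable names are my own for illustration, and three packets is far too small a sample to prove anything. It only shows the arithmetic.

```python
from datetime import datetime

# Query timestamps copied from the tcpdump output above (requests only, not the replies).
query_times = ["18:39:24.124355", "18:39:26.134773", "18:39:28.144568"]

parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in query_times]

# Seconds elapsed between each query and the next one.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(parsed, parsed[1:])]
print(gaps)
```

If the gaps all hover around the same value, that is a hint of periodic behaviour worth a closer look.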

This may be some type of beaconing. Or maybe it is just normal activity.  

Let's dig a bit deeper with TShark to see whether there is definitely something worth paying attention to.   

Capture and write a few packets with tcpdump to the file system.  

securitynik@peeper:~$ **sudo tcpdump -n --interface 2 '(port 53) and not (host 127.0.0.1)' -v -w /tmp/dns-beacon.pcap**  
tcpdump: listening on any, link-type EN10MB (Ethernet), snapshot length 262144 bytes  
^C368 packets captured  
368 packets received by filter  

Take a look at some of the statistics from TShark for this specific host at 10.0.0.9

securitynik@peeper:~$ tshark -n -r /tmp/dns-beacon.pcap -q -z "io,stat,2,ip.addr==10.0.0.9 && udp.port==53" -t ad | more  

===============================================  
| IO Statistics                               |  
|                                             |  
| Duration: 205. 49758 secs                   |  
| Interval:   2 secs                          |  
|                                             |  
| Col 1: ip.addr==10.0.0.9 && udp.port==53    |  
|---------------------------------------------|  
|                     |1               |      |  
| Date and time       | Frames | Bytes |      |  
|--------------------------------------|      |  
| 2023-10-01 18:46:05 |      2 |   331 |      |   
| 2023-10-01 18:46:07 |      2 |   331 |      |  
| 2023-10-01 18:46:09 |      2 |   331 |      |    
| 2023-10-01 18:46:11 |      2 |   331 |      |    
| 2023-10-01 18:46:13 |      2 |   331 |      |    
| 2023-10-01 18:46:15 |      2 |   331 |      |  
| 2023-10-01 18:46:17 |      2 |   331 |      |  
| 2023-10-01 18:46:19 |      2 |   331 |      |  
| 2023-10-01 18:46:21 |      2 |   331 |      |  
| 2023-10-01 18:46:23 |      2 |   331 |      |  
| 2023-10-01 18:46:25 |      2 |   331 |      |  
| 2023-10-01 18:46:27 |      2 |   331 |      |  
| 2023-10-01 18:46:29 |      2 |   331 |      |  
| 2023-10-01 18:46:31 |      2 |   331 |      |  
| 2023-10-01 18:46:33 |      2 |   331 |      |  
| 2023-10-01 18:46:35 |      2 |   331 |      |  
| 2023-10-01 18:46:37 |      2 |   331 |      |  
| 2023-10-01 18:46:39 |      2 |   331 |      |  
...

Clearly from above, we can see there is something interesting. Every 2 seconds, we have 2 frames that total the same 331 bytes.  

At this point, we can connect to the host to attempt to learn which process might be making this request.  

I'm taking a different route, as this post/notebook is about looking at things from the network perspective.  

Fortunately for us, one of the tools in this monitored environment is Zeek, a security monitoring framework we spend a lot of time on during day 4 of the SANS SEC503: Network Monitoring and Threat Detection In-Depth.

While I can pull this specific log, let's instead go back in time to extract a historical log. More specifically, I'm taking a log of the time we know this network should not be busy. Let's take a log file that should have records for between 01:00 and 02:00 AM.

securitynik@peeper:~$ ** ls /opt/zeek/logs/2023-10-01/dns.01\:00\:00-02\:00\:00.log.gz** 
/opt/zeek/logs/2023-10-01/dns.01:00:00-02:00:00.log.gz  

Let's read this log with zcat and then pipe it into jq then output it to a file

Here is what a sample from the Zeek DNS log looks like in JSON.

securitynik@peeper:~$ zcat /opt/zeek/logs/2023-10-01/dns.01\:00\:00-02\:00\:00.log.gz | jq '.' | more 
{  
  "ts": 1696122000.354959,  
  "uid": "CZ7wYd2iz86Xl4KbKl",  
  "id.orig_h": "10.0.0.4",
  "id.orig_p": 45084,
  "id.resp_h": "172.17.17.202",
  "id.resp_p": 53,
  "proto": "udp",
  "trans_id": 45635,
  "query": "4.0.0.10.in-addr.arpa",
  "qclass": 1,
  "qclass_name": "C_INTERNET",
  "qtype": 12,
  "qtype_name": "PTR",
  "rcode": 3,
  "rcode_name": "NXDOMAIN",
  "AA": false,
  "TC": false,
  "RD": true,
  "RA": false,
  "Z": 0,
  "rejected": false
}

Writing the log out to a file that can be read by Pandas  
Notice the "--slurp". If I don't use this, Pandas is going to complain about some trailing data issue and fail to read the file: See this link: https://datascientyst.com/fix-valueerror-trailing-data-pandas-and-json/

securitynik@peeper:~$ zcat /opt/zeek/logs/2023-10-01/dns.01\:00\:00-02\:00\:00.log.gz | jq '.' --slurp > /tmp/dns-beacon-blog.json
securitynik@peeper:~$ ls /tmp/dns-beacon-blog.json
/tmp/dns-beacon-blog.json  
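As an alternative to --slurp, and assuming your Zeek JSON log has one object per line, Pandas can read that layout directly with lines=True. This is only a sketch: the two records below are a trimmed, partly made-up sample held in memory, not my real log, and df_lines is a name I picked for illustration.

```python
import io

import pandas as pd

# Two trimmed records in a one-JSON-object-per-line layout (second record is made up).
ndjson = (
    '{"ts": 1696122000.354959, "id.orig_h": "10.0.0.4", "query": "4.0.0.10.in-addr.arpa"}\n'
    '{"ts": 1696122002.1, "id.orig_h": "10.0.0.9", "query": "somedomain.securitynik.local"}\n'
)

# lines=True tells pandas to parse every line as its own JSON object,
# which avoids the trailing data complaint without needing jq --slurp.
df_lines = pd.read_json(io.StringIO(ndjson), lines=True)
print(df_lines.shape)
```

With a real file you would pass the path instead of the StringIO object.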

With this file in place, let's now copy the file to our local system where we will leverage some data science and the Fast Fourier Transform algorithm to solve this beaconing issue once and for all :-) 

C:\Users\SecurityNik>scp securitynik@peeper:/tmp/dns-beacon-blog.json d:\ml\dns-beacon-blog.json  
securitynik@peeper's password:  
dns-beacon-blog.json                                                                               100% 5337KB  12.4MB/s   00:00  

Load some libraries to start getting the real work done

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import matplotlib.pyplot as plt

Read our DNS Zeek log data.  Do note, while I am using the DNS log, you can use any log file you want that is coming out of Zeek. Notice though, my file is in JSON format. If you have a .CSV file, you will need to read that instead. This also means you may need to make other changes as you read your input.

df_dns = pd.read_json(r'd:/ML/dns-beacon-blog.json', date_unit='s')
df_dns


ts	uid	id.orig_h	id.orig_p	id.resp_h	id.resp_p	proto	trans_id	query	qclass	...	rcode_name	AA	TC	RD	RA	Z	rejected	rtt	answers	TTLs
0	1.696122e+09	CZ7wYd2iz86Xl4KbKl	10.0.0.4	45084	172.17.17.202	53	udp	45635	4.0.0.10.in-addr.arpa	1.0	...	NXDOMAIN	False	False	True	False	0	False	NaN	NaN	NaN
1	1.696122e+09	CZ7wYd2iz86Xl4KbKl	10.0.0.4	45084	172.17.17.202	53	udp	45635	4.0.0.10.in-addr.arpa	1.0	...	NXDOMAIN	False	False	True	False	0	False	NaN	NaN	NaN
2	1.696122e+09	C3uf182pULaa9EMXSk	10.0.0.4	50481	172.17.17.202	53	udp	22814	37.0.0.10.in-addr.arpa	1.0	...	NXDOMAIN	False	False	True	False	0	False	NaN	NaN	NaN
3	1.696122e+09	C3uf182pULaa9EMXSk	10.0.0.4	50481	172.17.17.202	53	udp	22814	37.0.0.10.in-addr.arpa	1.0	...	NXDOMAIN	False	False	True	False	0	False	NaN	NaN	NaN
4	1.696122e+09	CCUXAw1G7JacmmyKg5	10.0.0.4	57870	172.17.17.202	53	udp	43043	2.0.0.10.in-addr.arpa	1.0	...	NXDOMAIN	False	False	True	False	0	False	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
8212	1.696126e+09	CKcOYZKBKoynKzGWb	10.0.0.8	45024	172.17.17.198	53	udp	60189	3.pool.ntp.org	1.0	...	NOERROR	False	False	True	True	0	False	0.015551	[192.95.0.223, 158.69.20.38, 174.94.155.224, 1...	[26, 26, 26, 26]
8213	1.696126e+09	Cybpy5GqwGfARfcBd	10.0.0.8	47334	172.17.17.198	53	udp	60445	time.google.com	1.0	...	NOERROR	False	False	True	True	0	False	0.015610	[216.239.35.8, 216.239.35.12, 216.239.35.0, 21...	[13571, 13571, 13571, 13571]
8214	1.696126e+09	CoJXrj4DErFd7n6BMk	10.0.0.9	40965	10.0.0.2	53	udp	10875	somedomain.securitynik.local	1.0	...	NOERROR	False	False	True	True	0	False	0.000250	[somedomain.ca.securitynik.local, a37295100167...	[83, 23, 23, 23]
8215	1.696126e+09	CPwL9M3cooP7rtZmB9	10.0.0.24	36625	10.0.0.2	53	udp	44475	i.ytimg.com	1.0	...	NOERROR	False	False	True	True	0	False	0.013948	[142.251.33.182, 142.251.41.86, 142.251.32.86,...	[274, 274, 274, 274]
8216	1.696126e+09	CQVtNH8FvtlwO44Fl	10.0.0.24	58969	10.0.0.2	53	udp	60354	youtubei.googleapis.com	1.0	...	NOERROR	False	False	True	True	0	False	0.035568	[142.251.32.74, 142.251.41.42, 172.217.1.10, 1...	[249, 249, 249, 249, 249, 249]
8217 rows × 24 columns

Get the list of columns. I need this as I will drop a few columns.

df_dns.columns

Index(['ts', 'uid', 'id.orig_h', 'id.orig_p', 'id.resp_h', 'id.resp_p',
       'proto', 'trans_id', 'query', 'qclass', 'qclass_name', 'qtype',
       'qtype_name', 'rcode', 'rcode_name', 'AA', 'TC', 'RD', 'RA', 'Z',
       'rejected', 'rtt', 'answers', 'TTLs'],
      dtype='object')

Let's go ahead and drop some of these columns that are of no use to us. I'm keeping the port to also see if all of this activity is occurring on the same source port. Dropping the destination port as we know this is DNS. Definitely keeping the timestamp as this is what Joe used in his code to find beacons. It is also what we will use. Definitely also keeping the query as we need to know what domain the host(s) was/were trying to resolve.

df_dns = df_dns.drop(columns=[ 'uid', 'id.resp_p', 'proto', 'trans_id', 'qclass', 'qclass_name', 'qtype', 'qtype_name', 'rcode', 'rcode_name', 'AA', 'TC', 'RD', 'RA', 'Z', 'rejected', 'rtt', 'answers', 'TTLs'], inplace=False)

# View the first 5 records
df_dns.iloc[:5]

ts	id.orig_h	id.orig_p	id.resp_h	query
0	1.696122e+09	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
1	1.696122e+09	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
2	1.696122e+09	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
3	1.696122e+09	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
4	1.696122e+09	10.0.0.4	57870	172.17.17.202	2.0.0.10.in-addr.arpa

Here is the full example of one of these times

df_dns.ts[1]

1696122000.366851

Let's get this time into a format we can understand. More specifically, put it into a time that gives us the seconds.

df_dns.ts[1].astype(dtype='datetime64[s]')
numpy.datetime64('2023-10-01T01:00:00')

Changing all the times to more human readable time

df_dns['ts'] = df_dns['ts'].astype(dtype='datetime64[s]')
df_dns

ts	id.orig_h	id.orig_p	id.resp_h	query
0	2023-10-01 01:00:00	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
1	2023-10-01 01:00:00	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
2	2023-10-01 01:00:00	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
3	2023-10-01 01:00:00	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
4	2023-10-01 01:00:00	10.0.0.4	57870	172.17.17.202	2.0.0.10.in-addr.arpa
...	...	...	...	...	...
8212	2023-10-01 01:59:57	10.0.0.8	45024	172.17.17.198	3.pool.ntp.org
8213	2023-10-01 01:59:57	10.0.0.8	47334	172.17.17.198	time.google.com
8214	2023-10-01 01:59:58	10.0.0.9	40965	10.0.0.2	somedomain.securitynik.local
8215	2023-10-01 01:59:59	10.0.0.24	36625	10.0.0.2	i.ytimg.com
8216	2023-10-01 01:59:59	10.0.0.24	58969	10.0.0.2	youtubei.googleapis.com
8217 rows × 5 columns
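Another way to do the same conversion, shown here only as a sketch on two sample epoch values, is pd.to_datetime with unit='s', followed by a floor to whole seconds. The first value is the timestamp shown earlier. The second is simply 3598 seconds later, chosen for illustration.

```python
import pandas as pd

# Epoch seconds: the first comes from the log above, the second is 3598 seconds later.
epochs = pd.Series([1696122000.366851, 1696125598.0])

# Convert to datetimes, then drop the fractional seconds.
as_time = pd.to_datetime(epochs, unit="s").dt.floor("s")
print(as_time)
```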

I would like this data to be between 01:00 - 02:00 AM.  Primary reason is, it is easier for me to monitor my sampling period. Let's verify there is no data outside of this range. This returns one record. Not a major concern but I will still drop it.

df_dns[df_dns.ts < '2023-10-01 01:00:00' ]

ts	id.orig_h	id.orig_p	id.resp_h	query
48	2023-10-01 00:59:55	10.0.0.10	5353	224.0.0.251	_googlecast._tcp.local

Dropping the one record above

df_dns.drop(df_dns[df_dns.ts < '2023-10-01 01:00:00' ].index, inplace=True)

Any records later than 1:59? Looks like there are none.

ts	id.orig_h	id.orig_p	id.resp_h	query
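Both checks can also be folded into a single mask that keeps only the rows inside the window. The sketch below uses four made-up timestamps, and start, end and in_window are names I chose for illustration.

```python
import pandas as pd

# Made-up timestamps: one before the window, two inside it, one at the closing edge.
times = pd.Series(pd.to_datetime([
    "2023-10-01 00:59:55",
    "2023-10-01 01:00:00",
    "2023-10-01 01:59:59",
    "2023-10-01 02:00:00",
]))

start = pd.Timestamp("2023-10-01 01:00:00")
end = pd.Timestamp("2023-10-01 02:00:00")

# Keep rows at or after the start and strictly before the end.
in_window = times[(times >= start) & (times < end)]
print(in_window)
```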

Sort the timestamp (ts) column. Start from 01:00 am to get to 1:59 am

df_dns = df_dns.sort_values(by='ts', ascending=True)
df_dns

ts	id.orig_h	id.orig_p	id.resp_h	query
0	2023-10-01 01:00:00	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
1	2023-10-01 01:00:00	10.0.0.4	45084	172.17.17.202	4.0.0.10.in-addr.arpa
2	2023-10-01 01:00:00	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
3	2023-10-01 01:00:00	10.0.0.4	50481	172.17.17.202	37.0.0.10.in-addr.arpa
4	2023-10-01 01:00:00	10.0.0.4	57870	172.17.17.202	2.0.0.10.in-addr.arpa
...	...	...	...	...	...
8212	2023-10-01 01:59:57	10.0.0.8	45024	172.17.17.198	3.pool.ntp.org
8213	2023-10-01 01:59:57	10.0.0.8	47334	172.17.17.198	time.google.com
8214	2023-10-01 01:59:58	10.0.0.9	40965	10.0.0.2	somedomain.securitynik.local
8215	2023-10-01 01:59:59	10.0.0.24	36625	10.0.0.2	i.ytimg.com
8216	2023-10-01 01:59:59	10.0.0.24	58969	10.0.0.2	youtubei.googleapis.com
8216 rows × 5 columns

Visualize the time period

fig = px.histogram(data_frame=df_dns, x='ts', title='DNS Log Records Between 1 and 2 AM')
fig.show()

The sampling rate must be at least 2 times the highest frequency we're trying to find.
Above, the time span is 1 hour, or 60 minutes, or 3600 seconds.
Since the timestamps are in whole seconds, the highest frequency the data could contain is 3600 events per hour.
Hence we would ideally sample, preferably uniformly, at a rate of at least 2*3600 samples per hour.
Sampling at that rate would allow us to reconstruct the original signal in the time domain from the frequency domain, if needed.
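To make the sampling rule concrete, here is a small sketch of the arithmetic for a one-hour window. The function name nyquist_limit is my own, not something from Joe's notebook. It returns the highest frequency (in cycles per hour) a given bin width can capture, along with the shortest period that corresponds to.

```python
def nyquist_limit(window_seconds, bin_seconds):
    """Return (highest resolvable cycles per window, shortest resolvable period in seconds)."""
    samples = window_seconds / bin_seconds   # how many bins the window is cut into
    max_cycles = samples / 2                 # sampling rule: two samples per cycle
    shortest_period = window_seconds / max_cycles
    return max_cycles, shortest_period


print(nyquist_limit(3600, 2))
print(nyquist_limit(3600, 1))
```

In other words, the shortest period that can be resolved is twice the bin width, which is worth keeping in mind when picking the resampling period further down.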

sampling_period = 3600
sampling_period

3600

The timestamps have a resolution of 1 second. Hence we do 1./3600 to get the fraction of the sampling period that 1 second represents

1./sampling_period

0.0002777777777777778

To get the same value per minute, or per 60 seconds, we do (1./3600) * 60

(1./sampling_period) * 60

0.016666666666666666

Which also means, to get any value in between, we just multiply by that number of seconds.
For example, for 2 seconds:

(1./sampling_period) * 2

0.0005555555555555556

Extract the timestamp column and add it to its own Pandas series

tmp_data = df_dns['ts']
tmp_data, type(tmp_data)

(0      2023-10-01 01:00:00
 1      2023-10-01 01:00:00
 2      2023-10-01 01:00:00
 3      2023-10-01 01:00:00
 4      2023-10-01 01:00:00
                ...        
 8212   2023-10-01 01:59:57
 8213   2023-10-01 01:59:57
 8214   2023-10-01 01:59:58
 8215   2023-10-01 01:59:59
 8216   2023-10-01 01:59:59
 Name: ts, Length: 8216, dtype: datetime64[s],
 pandas.core.series.Series)

Replace the index column with the timestamp

tmp_data.index = tmp_data
tmp_data

ts
2023-10-01 01:00:00   2023-10-01 01:00:00
2023-10-01 01:00:00   2023-10-01 01:00:00
2023-10-01 01:00:00   2023-10-01 01:00:00
2023-10-01 01:00:00   2023-10-01 01:00:00
2023-10-01 01:00:00   2023-10-01 01:00:00
                              ...        
2023-10-01 01:59:57   2023-10-01 01:59:57
2023-10-01 01:59:57   2023-10-01 01:59:57
2023-10-01 01:59:58   2023-10-01 01:59:58
2023-10-01 01:59:59   2023-10-01 01:59:59
2023-10-01 01:59:59   2023-10-01 01:59:59
Name: ts, Length: 8216, dtype: datetime64[s]

Using the 2 seconds seen via tcpdump as my guide, I am setting my period to 2 seconds.
You can try 1 second, but I don't think it will find anything meaningful. I could be wrong!
I don't think 1 second would be representative of a real problem.

best_period = '2s'
best_period

'2s'

Get a count of the data points occurring every 2 seconds and print the first 10 entries

counts_per_period = tmp_data.resample(best_period).count()

# Print the first 10 entries
counts_per_period[:10], len(counts_per_period)


(ts
 2023-10-01 01:00:00    30
 2023-10-01 01:00:02    13
 2023-10-01 01:00:04     7
 2023-10-01 01:00:06     2
 2023-10-01 01:00:08     4
 2023-10-01 01:00:10     4
 2023-10-01 01:00:12    15
 2023-10-01 01:00:14     2
 2023-10-01 01:00:16     1
 2023-10-01 01:00:18     1
 Freq: 2S, Name: ts, dtype: int64,
 1800)
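If the resample step feels like magic, here is a tiny sketch on five made-up timestamps, using the same trick of making the timestamps their own index. The names stamps, events and counts are only for illustration.

```python
import pandas as pd

# Five made-up event times within the first six seconds of the hour.
stamps = pd.to_datetime([
    "2023-10-01 01:00:00",
    "2023-10-01 01:00:00",
    "2023-10-01 01:00:01",
    "2023-10-01 01:00:02",
    "2023-10-01 01:00:05",
])

# Same idea as tmp_data above: the timestamps are both the index and the values.
events = pd.Series(stamps, index=stamps)

# Count how many events land in each 2-second bin.
counts = events.resample("2s").count()
print(counts)
```

The first bin covers 01:00:00 up to, but not including, 01:00:02, so it picks up three events.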


Confirm the type is a Pandas Series

type(counts_per_period)

pandas.core.series.Series

Take a look inside the keys. This shows the 2 second periods

counts_per_period.keys()

DatetimeIndex(['2023-10-01 01:00:00', '2023-10-01 01:00:02',
               '2023-10-01 01:00:04', '2023-10-01 01:00:06',
               '2023-10-01 01:00:08', '2023-10-01 01:00:10',
               '2023-10-01 01:00:12', '2023-10-01 01:00:14',
               '2023-10-01 01:00:16', '2023-10-01 01:00:18',
               ...
               '2023-10-01 01:59:40', '2023-10-01 01:59:42',
               '2023-10-01 01:59:44', '2023-10-01 01:59:46',
               '2023-10-01 01:59:48', '2023-10-01 01:59:50',
               '2023-10-01 01:59:52', '2023-10-01 01:59:54',
               '2023-10-01 01:59:56', '2023-10-01 01:59:58'],
              dtype='datetime64[s]', name='ts', length=1800, freq='2S')

Extract the values occurring at those timestamps
Let's call it x for now

x = counts_per_period.values
x

array([30, 13,  7, ...,  1, 19,  3], dtype=int64)

Get the length of x. Because the log covers 1 hour, or 3600 seconds, and we are looking at the data in 2-second windows, we now have 1800 data points

len(x)

1800

Plot the values in x

plt.title('Plot of the values in x')
plt.plot(x)
plt.xlabel(xlabel='Time in 2secs window')
plt.ylabel(ylabel='Counts Per Period')
plt.show()

Definitely from above we can see some spikes. This suggests some 2-second periods have a large number of counts. 

Get the Fourier Transform of the signal. Notice the result is a complex number, consisting of the real and imaginary component

fourier = np.fft.fft(x)
fourier, len(fourier)

(array([ 8216.           +0.j        ,  1913.98722741 -956.73902893j,
           18.18684807-1246.50554465j, ..., -1694.41853886 +611.36477164j,
           18.18684807+1246.50554465j,  1913.98722741 +956.73902893j]),
 1800)
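Two things are worth noticing in that output. The first value, 8216, has no imaginary part and equals the total number of records, because the first FFT bin is just the sum of the input (the DC component). The values at the end of the array also mirror the ones at the start, with the sign of the imaginary part flipped. Here is a minimal sketch on eight made-up counts to show both properties. x_demo and the other names are mine.

```python
import numpy as np

# Eight made-up counts per period.
x_demo = np.array([30, 13, 7, 2, 4, 4, 15, 2])

spectrum = np.fft.fft(x_demo)

# Bin 0 is the DC component: the plain sum of the input values.
dc = spectrum[0]

# For real input, bin k is the complex conjugate of bin N-k.
mirror_ok = np.allclose(spectrum[1:], np.conj(spectrum[1:][::-1]))

print(dc, x_demo.sum(), mirror_ok)
```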

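Plot the values as is, before finding the absolute values. Matplotlib only plots the real part of each complex value, which is why the ComplexWarning below appears and why the Y axis goes to negative values. Even though we used the Fourier Transform, the x axis is still the number of samples rather than the frequency. This can be confirmed by the x axis running to 1800, the same 1800 seen at the bottom of the cell above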

plt.title(label='Plot before finding the absolute values')
plt.plot(fourier)
plt.xlabel(xlabel='samples')
plt.ylabel(ylabel='amplitude before normalize')
plt.show()

C:\Users\SecurityNik\AppData\Roaming\Python\Python39\site-packages\matplotlib\cbook\__init__.py:1340: ComplexWarning:

Casting complex values to real discards the imaginary part


Plot the values after finding the absolute values (the magnitudes). We can see the symmetry in both the graph below and the one above. Even though we used the Fourier Transform, the x axis is still the number of samples rather than the frequency. Notice that the Y axis no longer goes negative, since a magnitude cannot be less than zero.


plt.title(label='Amplitude - After finding the absolute values')
plt.plot(np.abs(fourier))
plt.xlabel(xlabel='samples')
plt.ylabel(ylabel='amplitude before normalize')
plt.show()


Let's normalize the FFT output. 
Remember, the Nyquist-Shannon sampling theorem states that if we sample a signal at a rate of at least 2 times its highest frequency, the analog signal can be recovered perfectly

At the same time, setup the sampling period. These logs are for an hour 01:00 to 01:59. I am keeping this because my original log was for that period. When we resampled the data above by 2 seconds, it returned 1800 records.

N = len(x)
normalize = N/2
sampling_period = 3600
len(x), N, normalize, sampling_period

(1800, 1800, 900.0, 3600)
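Why divide by N/2? For a real signal, the energy of a sine wave is split between a positive and a negative frequency bin, so each bin holds N/2 times the wave's amplitude. The sketch below, on a made-up sine wave of amplitude 3 that completes 50 cycles across 1800 samples, shows the normalized FFT giving that amplitude back. All names here are for illustration only.

```python
import numpy as np

N_demo = 1800
t = np.arange(N_demo)

# Made-up signal: amplitude 3, exactly 50 cycles across the whole window.
wave = 3.0 * np.sin(2 * np.pi * 50 * t / N_demo)

# Normalize the magnitudes the same way as above: divide by N/2.
amp = np.abs(np.fft.fft(wave)) / (N_demo / 2)

# Look only at the positive half to find where the peak sits.
peak_bin = int(np.argmax(amp[: N_demo // 2]))
print(peak_bin, amp[peak_bin])
```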

Plot the absolute value of the amplitude

plt.title(label='Normalize amplitude values')
plt.plot(np.abs(fourier)/normalize)
plt.xlabel(xlabel='samples')
plt.ylabel(ylabel='amplitude after normalization')
plt.show()

Need to fix the frequency axis. The timestamps have one-second resolution across the hour. This is where I am using the 3600 rather than the 1800


frequency_rate = 1./sampling_period
frequency_rate

0.0002777777777777778

Get the frequency axis

frequency_axis = np.fft.fftfreq(n=N, d=frequency_rate)
frequency_axis, len(frequency_axis)

(array([ 0.,  2.,  4., ..., -6., -4., -2.]), 1800)


With the frequency axis in place, let's plot the frequency axis on its own for now. Notice the Y axis is both positive and negative. Notice it goes from 0 to 1800 which is half of 3600 which is basically half our sampling period. Also notice it goes from 0 to -1800. Did you see the symmetry?

plt.title('Plot showing both frequency in both negative and positive values')
plt.plot(frequency_axis, lw=3, c='r')
plt.ylabel('amplitude')
plt.xlabel('count of samples');



Looking at the symmetry another way: this time, let's plot the normalized amplitude against the frequency axis. The frequencies now sit on the X axis, running through both negative and positive values.
You should be able to see the symmetry now. It is basically the same as you saw above, just from a different perspective

norm_amplitude = np.abs(fourier)/normalize
plt.title('Plot showing symmetry of frequencies')
plt.plot(frequency_axis, norm_amplitude)
plt.ylabel('amplitude')
plt.xlabel('Frequencies')

Just print the length and frequency values as a refresher for me

N, frequency_rate

(1800, 0.0002777777777777778)

Just getting a better understanding of the lengths

len(np.fft.rfft(x)), len(2*np.abs(np.fft.rfft(x))), len(np.abs(np.fft.rfft(x))), N
(901, 901, 901, 1800)
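Where does 901 come from? For real-valued input, rfft returns only the non-redundant half of the spectrum, which is N//2 + 1 values, and those values match the first 901 entries of the full fft. A quick sketch on made-up random counts to confirm it follows. x_demo, full and half are names I picked.

```python
import numpy as np

# Made-up counts per period, same length as x above.
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 30, size=1800)

full = np.fft.fft(x_demo)    # all 1800 bins, including the mirrored half
half = np.fft.rfft(x_demo)   # only the non-redundant bins

print(len(full), len(half), np.allclose(half, full[: len(half)]))
```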

Finalize this code

We see that we have also gotten rid of the symmetry and now only have the positive half on the line

plt.plot(np.fft.rfftfreq(n=N, d=frequency_rate), 2*np.abs(np.fft.rfft(x))/N)

Compute the FFT values returned for the counts per period
Use the sampling period of 3600, so the frequencies come out in cycles per hour

fft = abs(np.fft.rfft(counts_per_period))
dvalue = int(best_period.rstrip("s"))
frequencies = np.fft.rfftfreq(n=len(counts_per_period), d=dvalue/sampling_period)

# Print the first 10 entries
frequencies[:10]

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
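Since d is the bin width expressed as a fraction of the hour, each value in frequencies reads as cycles per hour. A sketch of turning a frequency bin back into a period in seconds follows. The helper period_in_seconds is a name I made up for illustration, and the variables are redefined so the snippet stands alone.

```python
import numpy as np

sampling_period = 3600   # seconds in the one-hour log
bin_seconds = 2          # width of each resampled bin
n_bins = sampling_period // bin_seconds

# Same call shape as above: d is the bin width as a fraction of the hour.
freqs = np.fft.rfftfreq(n=n_bins, d=bin_seconds / sampling_period)


def period_in_seconds(cycles_per_hour):
    """Convert a frequency in cycles per hour to a period in seconds."""
    return sampling_period / cycles_per_hour


# Bin 60 is roughly one cycle per minute; the last bin is the fastest rhythm available.
print(freqs[60], period_in_seconds(freqs[60]))
print(freqs[-1], period_in_seconds(freqs[-1]))
```

Reading the spikes this way makes it easier to say "something happens about every N seconds" rather than quoting a bare bin number.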


Get any signal spikes over CONST * stdev over the rest of the noise.  This will be the interesting stuff to look at.  The amplitudes (y-values) come from the FFT array found above.

Find the standard deviation of the remaining data, so we can use it to find the strongest signals present.  
Strip off the first 10% of the frequencies found, which will remove the DC component of the signal, leaving you with just the actual signal spikes.


print(f'Max frequency: {max(frequencies)}')
print(f'10% of the max frequency value: {0.1*max(frequencies)}')
print(f'Here are the frequencies - the lower 10%: \n\t {frequencies[frequencies > 0.1*max(frequencies)][:10]}')

Max frequency: 900.0
10% of the max frequency value: 90.0
Here are the frequencies - the lower 10%: 
	 [ 91.  92.  93.  94.  95.  96.  97.  98.  99. 100.]


With the above being made clear, save these new frequencies to a variable

stripped_frequencies = frequencies[ frequencies > 0.1 * max(frequencies) ]

# Print the first 10 entries
stripped_frequencies[:10]

array([ 91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,  99., 100.])

print(f'[*] Size of stripped frequencies: {stripped_frequencies.size}')
print(f'[*] Length of the fft transformed data: {len(fft)}')
print(f'[*] New FFT: {fft[len(fft) - stripped_frequencies.size:][:10]}')

[*] Size of stripped frequencies: 810
[*] Length of the fft transformed data: 901
[*] New FFT: [1143.47208739 473.94896724 304.70114392 420.31706819 219.34075832 581.26586592 572.50777759 136.43847641 1108.424958 1136.18872268]

Get the stripped FFT. Print the first 10 entries

stripped_fft = fft[len(fft) - stripped_frequencies.size:]

stripped_fft[:10]

array([1143.47208739,  473.94896724,  304.70114392,  420.31706819,
        219.34075832,  581.26586592,  572.50777759,  136.43847641,
       1108.424958  , 1136.18872268])

Leverage descriptive statistics. Get the standard deviation

std_dev = np.std(stripped_fft)

# Get the mean
mean = np.mean(stripped_fft)

# Set a threshold
threshold = mean + 2*std_dev

print(f'Standard Deviation: {std_dev} | Mean: {mean} | Threshold: {threshold}')


Standard Deviation: 240.6914745391128 | Mean: 369.67931016529883 | Threshold: 851.0622592435244

Add the strong signals to a list

strong_signals = []
for signal in stripped_fft:
    if (signal > threshold):
        # print(f"adding signal: {str(signal)}")
        strong_signals.append(signal)

# Print the first 10 entries
strong_signals[:10]

[1143.4720873935075, 1108.4249580037538, 1136.188722679384, 978.1350685678566, 1309.8618870200787, 1265.7223903589352, 1214.0629560494137, 1747.6746509763254, 1440.277194109987, 1079.5542043630226]
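The same selection can be written without a loop by using a boolean mask. This is only a sketch on ten made-up amplitudes, with names of my own choosing, to show the mean + 2 * standard deviation rule picking out the one value that towers over the rest.

```python
import numpy as np

# Ten made-up amplitudes: nine ordinary values and one obvious spike.
amps = np.array([100.0, 120.0, 90.0, 110.0, 105.0, 95.0, 900.0, 115.0, 100.0, 98.0])

# Same rule as above: anything more than two standard deviations over the mean.
threshold_demo = amps.mean() + 2 * amps.std()

# The comparison yields True/False per entry; indexing with it keeps the True ones.
strong = amps[amps > threshold_demo]
print(threshold_demo, strong)
```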

Plot the frequency data after removing the DC component

fig = px.line(
    x=stripped_frequencies,
    y=(abs(stripped_fft)),
    labels=dict(x="Frequency (cycles/hour)", y="Connection Information"),
    title="Connection Information by Frequency With DC Removed; Sampling Period: " + best_period
)
fig.show()


For each strong signal: find the array index from the FFT array

signal_indices = []
i = 0
while (i < len(strong_signals)):
    matching_index = np.where(fft == np.float64(strong_signals[i]))[0][0]
    #print(f'Matching Index: {matching_index}')
    signal_indices.append(matching_index)
    i += 1

signal_indices[:10]


[91, 99, 100, 103, 104, 105, 106, 107, 108, 109]
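A possible shortcut, offered as a sketch rather than a replacement for the loop above, is to ask NumPy for the positions directly with np.flatnonzero and then add back the number of bins that were stripped off the front. It also sidesteps the case where two bins happen to share the same amplitude, since np.where on a value returns the first match only. The array below is made up, and skip, tail and idx are names chosen for illustration.

```python
import numpy as np

# Made-up magnitudes: a huge DC value, a flat noise floor, and two spikes.
fft_demo = np.full(24, 40.0)
fft_demo[0] = 5000.0
fft_demo[7] = 300.0
fft_demo[15] = 320.0

skip = 3                 # bins dropped from the front (DC and the lowest frequencies)
tail = fft_demo[skip:]

threshold_demo = tail.mean() + 2 * tail.std()

# Positions inside the tail that beat the threshold, shifted back to full-array indices.
idx = np.flatnonzero(tail > threshold_demo) + skip
print(idx)
```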

Create a new array of the same size as the FFT array.  Zero it out, except for the indices you just found, which are the strong signals we want to find the times for.

strong_signal_frequencies = np.zeros(len(fft))
for index in signal_indices:
    strong_signal_frequencies[index] = frequencies[index]
    
strong_signal_amplitudes = np.zeros(len(fft))
for index in signal_indices:
    strong_signal_amplitudes[index] = fft[index]

Graph the data in the time domain, by your 2-second sampling period. Clearly we can see below that there are spikes of interest

fig = px.line(
    counts_per_period,
    labels=dict(x="Timestamp", y="DNS Log Information"),
    title="DNS By Timestamp; Sampling Period: " + best_period
)
fig.show()


De-noise the data by filtering. Make an effective bandpass filter by zeroing out all the frequencies except the strong ones found above.  Plot just the strong signal frequencies vs their amplitudes.

Use the Inverse FFT to flip just the strong signals back to time-domain

inverse_fft = np.fft.irfft(strong_signal_amplitudes, len(counts_per_period))

fig = px.line(
    x=counts_per_period.to_frame().index,
    y=inverse_fft,
    labels=dict(x="Timestamp", y="DNS Log"),
    title="Periodic Signal"
)

fig.show()
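To see the filtering idea in isolation, here is a sketch on made-up data: a sine wave buried in random noise, filtered by keeping only the strongest non-DC bin and then inverting. One deliberate difference from the cell above: this sketch keeps the complex FFT value (so the phase is preserved) rather than the magnitude. That is my own variation for illustration, not what Joe's code does.

```python
import numpy as np

N_demo = 1800
t = np.arange(N_demo)

# Made-up data: a clean rhythm (30 cycles across the window) buried in noise.
clean = 5.0 * np.sin(2 * np.pi * 30 * t / N_demo)
rng = np.random.default_rng(0)
noisy = clean + rng.normal(0.0, 1.0, N_demo)

spectrum = np.fft.rfft(noisy)

# Find the strongest bin, skipping bin 0 (the DC component).
strongest = int(np.argmax(np.abs(spectrum[1:]))) + 1

# Zero everything except that bin, keeping its complex value so the phase survives.
filtered = np.zeros_like(spectrum)
filtered[strongest] = spectrum[strongest]

# Flip back to the time domain.
recovered = np.fft.irfft(filtered, n=N_demo)
print(strongest, np.corrcoef(recovered, clean)[0, 1])
```

The recovered series should track the clean sine wave closely, which is the whole point of zeroing out everything that is not a strong signal.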


OK.  Now, for each of our strong signals, we need to identify hosts from our original data set that had a count of DNS requests "near" our signal strengths.  (It won't be spot-on, due to sample frequency bin width and signal jitter.)  This will be the shortlist of IPs for further investigation.

shortlist = []
newdf = df_dns.groupby(['id.orig_h']).size().reset_index(name='counts')
for amplitude in strong_signals:
    shortlist.append(newdf[ (newdf['counts'] > (amplitude*0.8)) & (newdf['counts'] < (amplitude*1.2)) ])
    
results = pd.concat(shortlist, ignore_index=True)
#print(results)
results[['id.orig_h','counts']]


id.orig_h	counts
0	10.0.0.24	1927
1	10.0.0.9	1770
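To make the "near" matching concrete, here is a sketch on a made-up DataFrame. The row counts and the amplitude of 9 are invented purely to show the 0.8 to 1.2 band at work, and demo, host_counts and near are names I picked.

```python
import pandas as pd

# Made-up log: three hosts with 10, 3 and 20 records respectively.
demo = pd.DataFrame(
    {"id.orig_h": ["10.0.0.9"] * 10 + ["10.0.0.24"] * 3 + ["10.0.0.8"] * 20}
)

# Count records per originating host, as in the cell above.
host_counts = demo.groupby(["id.orig_h"]).size().reset_index(name="counts")

# A made-up strong-signal amplitude; keep hosts whose count is within 20% of it.
amplitude = 9.0
near = host_counts[
    (host_counts["counts"] > amplitude * 0.8) & (host_counts["counts"] < amplitude * 1.2)
]
print(near)
```

Only the host with 10 records falls inside the band of 7.2 to 10.8, so it is the only one shortlisted.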

Just as we expected, this started off with us recognizing via tcpdump that the host at 10.0.0.9 is sending beacons every two seconds. Not only are we able to find that host but we also are seeing another host that is exhibiting similar behaviour. Let's now go back into our Pandas DataFrame and isolate traffic from these two hosts.

df_dns[(df_dns['id.orig_h'] == '10.0.0.9') | (df_dns['id.orig_h'] == '10.0.0.24') ]

ts	id.orig_h	id.orig_p	id.resp_h	query
21	2023-10-01 01:00:00	10.0.0.9	40520	10.0.0.2	somedomain.securitynik.local
28	2023-10-01 01:00:00	10.0.0.24	41626	10.0.0.2	assets-sncust.securitynik.com
29	2023-10-01 01:00:00	10.0.0.24	39327	10.0.0.2	assets-sncust.securitynik.com
35	2023-10-01 01:00:02	10.0.0.9	33415	10.0.0.2	somedomain.securitynik.local
37	2023-10-01 01:00:03	10.0.0.24	61312	10.0.0.2	s.update.3lift.com
...	...	...	...	...	...
8194	2023-10-01 01:59:45	10.0.0.24	5353	224.0.0.251	_googlecast._tcp.local
8209	2023-10-01 01:59:56	10.0.0.9	55148	10.0.0.2	somedomain.securitynik.local
8214	2023-10-01 01:59:58	10.0.0.9	40965	10.0.0.2	somedomain.securitynik.local
8215	2023-10-01 01:59:59	10.0.0.24	36625	10.0.0.2	i.ytimg.com
8216	2023-10-01 01:59:59	10.0.0.24	58969	10.0.0.2	youtubei.googleapis.com
3697 rows × 5 columns


At this point, we can convert this notebook to a Python script that we can run in our environment.
Also once again, big thanks to Joe Petroske for doing the initial heavy lifting.


Monday, October 2, 2023

Beginning SiLK - System for Internet-Level Knowledge - working with network flow data

SiLK is one of the tools used to analyze network flow data and something we teach in the SANS SEC503, Network Monitoring and Threat Detection. In this post, I am walking through some of the tools within the SiLK suite, to show their basic and somewhat common usage. There is no specific order to their usage and at times, you may even see the same tool being used multiple times but in different ways.

Get SiLK version, compile information, etc. via silk_config.

sans@sec503:~/nik$ silk_config
silk-version: 3.19.2
compiler: gcc
cflags: -I/usr/local/include -DNDEBUG -D_ALL_SOURCE=1 -D_GNU_SOURCE=1  -I/usr/local/include -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include  -fno-strict-aliasing     -O3
include: -I/usr/local/include -DNDEBUG -D_ALL_SOURCE=1 -D_GNU_SOURCE=1  -I/usr/local/include -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include
libsilk-libs:  -L/usr/local/lib -lsilk  -lz -lm
libsilk-thrd-libs:  -L/usr/local/lib -lsilk-thrd -lsilk   -lz -lm
libflowsource-libs:  -L/usr/local/lib -lflowsource -lsilk-thrd -lsilk -L/usr/local/lib -lfixbuf -lpthread -lgthread-2.0 -pthread -lglib-2.0   -lz -lm
data-rootdir: /data
python-site-dir: /usr/lib/python3/dist-packages

Get information about the sensors in the site via rwsiteinfo.

sans@sec503:~$ rwsiteinfo --fields sensor,describe-sensor
   Sensor|    Sensor-Description|
 Internal|          Backbone ERS|
Perimeter|   Perimeter collector|
      ERS|Avaya ERS Switch Stack|
 internal|           STIFortunes|

A different view of the sensor information.

sans@sec503:~$ rwsiteinfo --fields=sensor:list
                    Sensor:list|
Internal,Perimeter,ERS,internal|

Get information from a particular sensor via rwsiteinfo.

sans@sec503:~$ rwsiteinfo --sensor=Internal --fields type,repo-file-count,repo-start-date,repo-end-date
   Type|File-Count|         Start-Date|           End-Date|
     in|      5828|2018/10/01T17:00:00|2022/07/03T19:00:00|
    out|      5093|2018/10/01T01:00:00|2022/07/03T19:00:00|
  inweb|      5059|2018/10/04T22:00:00|2022/07/03T19:00:00|
 outweb|      1781|2018/10/03T08:00:00|2019/05/03T14:00:00|
 ...

Get information on classes, types and their default values. A "+" marks rows for the default class and a "*" marks rows for a default type.

sans@sec503:~/nik$ rwsiteinfo --sensor=Perimeter --fields class,type,mark-default
Class|   Type|Defaults|
  all|     in|      +*|
  all|    out|      +*|
  all|  inweb|      +*|
  all| outweb|      +*|

Get the start and end date of the repo.

sans@sec503:~/nik$ rwsiteinfo --fields=repo-start,repo-end
         Start-Date|           End-Date|
2018/10/01T01:00:00|2022/07/03T19:00:00|

Leverage rwcount, to count the number of flow records, their bytes and packets.

sans@sec503:~/nik$  rwcount /tmp/attack-trace.rw
               Date|        Records|               Bytes|          Packets|
2019/04/20T03:28:00|           2.60|             2218.09|            16.36|
2019/04/20T03:28:30|           9.40|           176342.91|           331.64|

Leverage rwfilter to retrieve records between a start and end date, for all IP protocols and all traffic types, involving the host with address 8.8.8.8 as either source or destination. Stop after the first 100 records that pass the filter and save those to a file named 8.rw.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --protocol=0- \
--type=all --any-address=8.8.8.8 --max-pass=100 --pass=8.rw

Get information on the 8.rw file.

sans@sec503:~/nik$ rwfileinfo 8.rw
8.rw:
  format(id)          FT_RWIPV6ROUTING(0x0c)
  version             16
  byte-order          littleEndian
  compression(id)     none(0)
  header-length       176
  record-length       88
  record-version      1
  silk-version        3.19.2
  count-records       100
  file-size           8976
  command-lines
                   1  rwfilter --start=2022/01/05 --end=2022/07/01T23 --protocol=0- --type=all --any-address=8.8.8.8 --max-pass=100 --pass=8.rw

Access the file just saved using the rwcut tool, viewing only a few fields.

sans@sec503:~/nik$ rwcut 8.rw --fields sip,sPort,dIP,dPort 
                                    sIP|sPort|                                    dIP|dPort|
                                8.8.8.8|   53|                          172.28.10.137|56213|
                                8.8.8.8|   53|                          172.28.10.137|55171|
                                8.8.8.8|   53|                          172.28.10.137|54512|
				....

Confirming the number of records in the file 8.rw.

sans@sec503:~/nik$ rwcut 8.rw --no-title | wc --lines
100

Using rwcut, to get more details from a flow file named attack-trace.rw.

sans@sec503:~/nik$ rwcut attack-trace.rw --fields=sIP,sPort,dIP,dPort,bytes,stime --num-recs=2
                                    sIP|sPort|                                    dIP|dPort|     bytes|                  sTime|
                         98.114.205.102| 1821|                         192.150.11.111|  445|       168|2019/04/20T03:28:28.374|
                         192.150.11.111|  445|                         98.114.205.102| 1821|       128|2019/04/20T03:28:28.375|

Remove the padding to the left of the IP addresses with --ipv6-policy=ignore. We could also have set the environment variable SILK_IPV6_POLICY=ignore.

sans@sec503:~/nik$ rwcut attack-trace.rw --fields=sIP,sPort,dIP,dPort,bytes,stime --num-recs=2 --ipv6-policy=ignore
            sIP|sPort|            dIP|dPort|     bytes|                  sTime|
 98.114.205.102| 1821| 192.150.11.111|  445|       168|2019/04/20T03:28:28.374|
 192.150.11.111|  445| 98.114.205.102| 1821|       128|2019/04/20T03:28:28.375|

rwcut can be used without specifying fields. In the example below, it shows 12 fields by default.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --protocol=0- --type=all --any-address=8.8.8.8 --max-pass=100 --pass=stdout | rwcut --num-recs=2
                                    sIP|                                    dIP|sPort|dPort|pro|   packets|     bytes|   flags|                  sTime| duration|                  eTime|   sensor|
                                8.8.8.8|                          172.28.10.137|   53|56213| 17|         1|       218|        |2022/02/08T14:26:40.723|    0.001|2022/02/08T14:26:40.724| Internal|
                                8.8.8.8|                          172.28.10.137|   53|55171| 17|         1|       102|        |2022/02/08T14:27:10.329|    0.013|2022/02/08T14:27:10.342| Internal|

Use rwcut to produce CSV output from the retrieved data. Maybe you want to feed this data into your machine learning algorithms, something we teach in SANS SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals, or maybe you would like to import it into Pandas or Excel.

sans@sec503:~/nik$ rwcut attack-trace.rw --fields=sIP,sPort,dIP,dPort,bytes,stime \
--num-recs=2 --ipv6-policy=ignore --no-columns --delimited=, --no-final-delimiter
sIP,sPort,dIP,dPort,bytes,sTime
98.114.205.102,1821,192.150.11.111,445,168,2019/04/20T03:28:28.374
192.150.11.111,445,98.114.205.102,1821,128,2019/04/20T03:28:28.375
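
As a minimal sketch of that hand-off, the CSV text above can be read with Python's standard csv module. The names rwcut_csv and parse_rwcut_csv, and the decision to convert the numeric columns, are my own choices for illustration. In practice you would read from a file or a pipe rather than an inline string.

```python
import csv
import io
from datetime import datetime

# The exact CSV produced by the rwcut command above, pasted inline for illustration.
rwcut_csv = """sIP,sPort,dIP,dPort,bytes,sTime
98.114.205.102,1821,192.150.11.111,445,168,2019/04/20T03:28:28.374
192.150.11.111,445,98.114.205.102,1821,128,2019/04/20T03:28:28.375
"""

def parse_rwcut_csv(text):
    """Return a list of dicts with numeric fields and sTime converted."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        row["sPort"] = int(row["sPort"])
        row["dPort"] = int(row["dPort"])
        row["bytes"] = int(row["bytes"])
        # Timestamp layout as shown in the output above: YYYY/MM/DDTHH:MM:SS.mmm
        row["sTime"] = datetime.strptime(row["sTime"], "%Y/%m/%dT%H:%M:%S.%f")
        records.append(row)
    return records

records = parse_rwcut_csv(rwcut_csv)
for rec in records:
    print(rec["sIP"], "->", rec["dIP"], rec["bytes"], "bytes at", rec["sTime"])
```

From here the list of dictionaries can be handed to whatever analysis tooling you prefer.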

Get information on a particular bytes-range.

sans@sec503:~/nik$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- --pass=stdout --type=all --bytes=0-30 \
--max-pass=5 | rwuniq --fields=sIP,dIP,bytes,packets
                                    sIP|                                    dIP|     bytes|   packets|   Records|
                           10.200.223.7|                            172.28.10.1|        28|         1|         1|
                           10.200.223.7|                            172.28.20.1|        28|         1|         1|
                           10.200.223.7|                             172.28.1.1|        28|         1|         1|
                           10.200.223.7|                           172.28.30.64|        28|         1|         1|
                           10.200.223.7|                           172.28.30.65|        28|         1|         1|

Group data into 24-hour (86400-second) bins/buckets.

sans@sec503:~/nik$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- \
--pass=stdout --type=all --bytes=0-30 | rwuniq --bin-time=86400 --fields stime,type \
--values=records --sort-output
              sTime|   type|   Records|
2022/02/12T00:00:00|     in|      4136|
2022/02/12T00:00:00|    out|        52|
2022/02/13T00:00:00|     in|      2469|
2022/02/14T00:00:00|     in|      4307|
2022/02/14T00:00:00|    out|         7|
...

Grouping data in 1 hour bins/buckets.

sans@sec503:~/nik$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- \
--pass=stdout --type=all --bytes=0-30 | rwuniq --bin-time=3600 --fields stime,type \
--values=records --sort-output
              sTime|   type|   Records|
2022/02/12T19:00:00|     in|         2|
2022/02/12T20:00:00|     in|      3674|
2022/02/12T20:00:00|    out|        52|
2022/02/12T21:00:00|     in|        14|
2022/02/12T22:00:00|     in|       446|
....
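
To make the --bin-time idea concrete, here is a small Python sketch of how a start time maps to its bin: the timestamp is rounded down to the nearest multiple of the bin size. The function name bin_start and the sample timestamp are illustrative only and are not part of SiLK.

```python
from datetime import datetime, timezone

def bin_start(ts, bin_seconds):
    """Round a timezone-aware datetime down to the start of its bin (in UTC)."""
    epoch = int(ts.timestamp())
    floored = epoch - (epoch % bin_seconds)
    return datetime.fromtimestamp(floored, tz=timezone.utc)

# A made-up flow start time, used only to show the rounding.
sample = datetime(2022, 2, 12, 20, 17, 45, tzinfo=timezone.utc)

print(bin_start(sample, 3600))   # start of the hourly bin
print(bin_start(sample, 86400))  # start of the 24-hour bin
```

Every record whose start time rounds down to the same value lands in the same bucket, which is why the sTime column in the output only ever shows bin boundaries.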

Get the number of bytes within each 24-hour bin.

sans@sec503:~/nik$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- \
--pass=stdout --type=all --bytes=0-30 | rwuniq --bin-time=86400 --fields stime,type \
--values=bytes --sort-output | head --lines=10
              sTime|   type|               Bytes|
2022/02/12T00:00:00|     in|              115808|
2022/02/12T00:00:00|    out|                1456|
2022/02/13T00:00:00|     in|               69132|
2022/02/14T00:00:00|     in|              120596|
2022/02/14T00:00:00|    out|                 196|
2022/02/16T00:00:00|    out|                 120|
2022/02/17T00:00:00|     in|             5527373|
2022/02/17T00:00:00|    out|                 882|
2022/02/18T00:00:00|     in|                  29|

Extending further, grabbing the count of the distinct source and destination IPs.

sans@sec503:~/nik$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- --pass=stdout \
--type=all --bytes=0-30 | rwuniq --bin-time=86400 --fields stime,type --values=bytes,sip,dip \
--sort-output | head --lines=10
              sTime|   type|               Bytes|        sIP-Distinct|        dIP-Distinct|
2022/02/12T00:00:00|     in|              115808|                   2|                3268|
2022/02/12T00:00:00|    out|                1456|                   1|                   1|
2022/02/13T00:00:00|     in|               69132|                   1|                2387|
2022/02/14T00:00:00|     in|              120596|                   1|                4199|
2022/02/14T00:00:00|    out|                 196|                   7|                   1|
2022/02/16T00:00:00|    out|                 120|                   1|                   4|
2022/02/17T00:00:00|     in|             5527373|                   3|                  13|
2022/02/17T00:00:00|    out|                 882|                   7|                  21|
2022/02/18T00:00:00|     in|                  29|                   1|                   1|
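
For readers who want to see what the sIP-Distinct and dIP-Distinct columns represent, here is a rough Python sketch of the same bookkeeping: per bin, sum the bytes and keep a set of the source and destination addresses seen. The flow tuples are invented for illustration and do not come from the repository queried above, and summarize is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical flow records (bin key, sIP, dIP, bytes); not taken from the repository above.
flows = [
    ("2022/02/12", "10.0.0.1", "172.28.10.1", 28),
    ("2022/02/12", "10.0.0.1", "172.28.20.1", 28),
    ("2022/02/12", "10.0.0.2", "172.28.10.1", 30),
    ("2022/02/13", "10.0.0.1", "172.28.10.1", 28),
]

def summarize(records):
    """Per bin: total bytes plus the number of distinct source and destination IPs."""
    bins = defaultdict(lambda: {"bytes": 0, "sips": set(), "dips": set()})
    for key, sip, dip, nbytes in records:
        entry = bins[key]
        entry["bytes"] += nbytes
        entry["sips"].add(sip)
        entry["dips"].add(dip)
    return {
        key: (entry["bytes"], len(entry["sips"]), len(entry["dips"]))
        for key, entry in sorted(bins.items())
    }

summary = summarize(flows)
for key, (nbytes, sips, dips) in summary.items():
    print(key, nbytes, sips, dips)
```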

By default, rwuniq uses --values=records, meaning the number of flow records is what gets counted in each bin.

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields sIP
                                    sIP|   Records|
                         192.150.11.111|         6|
                         98.114.205.102|         6|

The above is the same as specifying --values=records explicitly, which means the records are counted in each bin.

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields sIP --values=records
                                    sIP|   Records|
                         192.150.11.111|         6|
                         98.114.205.102|         6|

Expand rwuniq to key on the stime and source IP fields. Sum the bytes in 10-minute (600-second) bins and sort the output.

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields stime,sip --values=bytes --sort-output --bin-time=600
              sTime|                                    sIP|               Bytes|
2019/04/20T03:20:00|                         98.114.205.102|              171264|
2019/04/20T03:20:00|                         192.150.11.111|                7297|

Group by packets with a bin size of 10 minutes

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields stime,sip --values=packets \
--sort-output --bin-time=600
              sTime|                                    sIP|        Packets|
2019/04/20T03:20:00|                         98.114.205.102|            195|
2019/04/20T03:20:00|                         192.150.11.111|            153|

Group by both packets and bytes

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields stime,sip --values=bytes,packets --sort-output --bin-time=600
              sTime|                                    sIP|               Bytes|        Packets|
2019/04/20T03:20:00|                         98.114.205.102|              171264|            195|
2019/04/20T03:20:00|                         192.150.11.111|                7297|            153|

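Assuming the input has been sorted, we can pass --presorted-input to the rwuniq command. Here the input was not actually sorted on these fields, which would explain why the same sIP and sTime combination shows up on more than one row below.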

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields sip,stime --values=bytes,packets --presorted-input --bin-time=600
                                    sIP|              sTime|               Bytes|        Packets|
                         98.114.205.102|2019/04/20T03:20:00|                 168|              4|
                         192.150.11.111|2019/04/20T03:20:00|                 128|              3|
                         98.114.205.102|2019/04/20T03:20:00|                4777|             14|
                         192.150.11.111|2019/04/20T03:20:00|                1590|             17|
			...
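
The repeated sIP/sTime pairs above are what you would expect if rwuniq, when told the input is presorted, only merges neighbouring records that share a key. That is my reading of the output rather than a statement about rwuniq's internals. Python's itertools.groupby behaves the same way, so it makes a convenient illustration using the four rows shown above. The helper name sum_adjacent is my own.

```python
from itertools import groupby

# (sIP, bytes) for the four rows printed above, in the order they appear.
rows = [
    ("98.114.205.102", 168),
    ("192.150.11.111", 128),
    ("98.114.205.102", 4777),
    ("192.150.11.111", 1590),
]

def sum_adjacent(records):
    """Sum bytes for runs of adjacent records that share the same sIP."""
    return [(sip, sum(b for _, b in group))
            for sip, group in groupby(records, key=lambda r: r[0])]

unsorted_bins = sum_adjacent(rows)        # keys repeat, one bin per run
sorted_bins = sum_adjacent(sorted(rows))  # one bin per sIP once sorted

print(unsorted_bins)
print(sorted_bins)
```

The takeaway is that a presorted shortcut is only safe when the input really is sorted on the same key fields, for example by running it through rwsort first.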

Once again, use --ipv6-policy=ignore to remove the space on the left.

sans@sec503:~/nik$ rwuniq attack-trace.rw --fields sip,stime --values=bytes,packets \
--presorted-input --bin-time=600 --ipv6-policy=ignore
            sIP|              sTime|               Bytes|        Packets|
 98.114.205.102|2019/04/20T03:20:00|                 168|              4|
 192.150.11.111|2019/04/20T03:20:00|                 128|              3|
 98.114.205.102|2019/04/20T03:20:00|                4777|             14|

Finding the most commonly used protocols with rwstats.
rwstats groups records into bins keyed on one or more fields.
rwstats can report the top N or bottom N bins. rwuniq cannot do this.
rwstats can also compute the percentage each bin contributes to the total.

Find the top 10 protocols in a 10 minute span.

sans@sec503:~/nik$ rwstats attack-trace.rw --fields=protocol,stime \
--count=10 --bin-time=600 --values=bytes
INPUT: 12 Records for 1 Bin and 178561 Total Bytes
OUTPUT: Top 10 Bins by Bytes
pro|              sTime|               Bytes|    %Bytes|   cumul_%|
  6|2019/04/20T03:20:00|              178561|100.000000|100.000000|

Grab the top 5 bins using 5-minute (300-second) time bins, ranked by bytes.

sans@sec503:~/nik$ rwfilter --protocol=0- --start-date=2022/01/01 \
--end-date=2022/05/01 --pass=stdout --max-pass=100 | rwstats --field=stime,sIP \
--count=5 --values=bytes --bin-time=300
INPUT: 100 Records for 5 Bins and 17964 Total Bytes
OUTPUT: Top 5 Bins by Bytes
              sTime|                                    sIP|               Bytes|    %Bytes|   cumul_%|
2022/02/08T14:40:00|                                8.8.8.8|               13785| 76.736807| 76.736807|
2022/02/08T14:35:00|                                8.8.8.8|                2166| 12.057448| 88.794255|
2022/02/08T14:25:00|                                8.8.8.8|                1101|  6.128925| 94.923180|
2022/02/08T14:30:00|                                8.8.8.8|                 836|  4.653752| 99.576932|
2022/02/08T14:40:00|                          17.253.26.125|                  76|  0.423068|100.000000|

Top 5 protocols by bytes.

sans@sec503:~/nik$ rwfilter --protocol=0- --start-date=2022/01/01 --end-date=2022/05/01 \
--pass=stdout --max-pass=1000 | rwstats --field=protocol --count=5 --values=bytes \
--bin-time=300
INPUT: 1000 Records for 4 Bins and 5287991 Total Bytes
OUTPUT: Top 5 Bins by Bytes
pro|               Bytes|    %Bytes|   cumul_%|
  6|             5142342| 97.245665| 97.245665|
 17|              134037|  2.534743| 99.780408|
  1|               11500|  0.217474| 99.997882|
 58|                 112|  0.002118|100.000000|
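
The %Bytes and cumul_% columns are simple to reproduce. As a sanity check, this Python sketch recomputes them from the Bytes column above. The function name with_percentages is mine, and the formatting only loosely imitates the rwstats layout.

```python
# Bytes per protocol, copied from the rwstats output above.
bytes_by_proto = [(6, 5142342), (17, 134037), (1, 11500), (58, 112)]

def with_percentages(bins):
    """Return (key, value, percent, cumulative percent), largest value first."""
    total = sum(value for _, value in bins)
    running = 0
    result = []
    for key, value in sorted(bins, key=lambda b: b[1], reverse=True):
        running += value
        result.append((key, value, 100.0 * value / total, 100.0 * running / total))
    return result

table = with_percentages(bytes_by_proto)
for proto, nbytes, pct, cumul in table:
    print(f"{proto:>3}|{nbytes:>20}|{pct:>10.6f}|{cumul:>10.6f}|")
```

The four byte counts add up to the 5287991 total reported on the INPUT line, so each percentage is just that bin's bytes divided by the total.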

Top 5 protocols by packets.

sans@sec503:~/nik$ rwfilter --protocol=0- --start-date=2022/01/01 --end-date=2022/05/01 \
--pass=stdout --max-pass=1000 | rwstats --field=protocol --count=5 --values=packets \
--bin-time=300
INPUT: 1000 Records for 4 Bins and 10277 Total Packets
OUTPUT: Top 5 Bins by Packets
pro|        Packets|  %Packets|   cumul_%|
  6|           9235| 89.860854| 89.860854|
 17|            915|  8.903376| 98.764231|
  1|            125|  1.216308| 99.980539|
 58|              2|  0.019461|100.000000|

Top 5 protocols by records, which is the default when no values are specified.

sans@sec503:~/nik$ rwfilter --protocol=0- --start-date=2022/01/01 --end-date=2022/05/01 \
--pass=stdout --max-pass=1000 | rwstats --field=protocol --count=5 --bin-time=300
INPUT: 1000 Records for 4 Bins and 1000 Total Records
OUTPUT: Top 5 Bins by Records
pro|   Records|  %Records|   cumul_%|
 17|       886| 88.600000| 88.600000|
  6|       109| 10.900000| 99.500000|
  1|         3|  0.300000| 99.800000|
 58|         2|  0.200000|100.000000|

Get the overall statistics via --overall-stats.

sans@sec503:~/nik$ rwstats attack-trace.rw --overall-stats | more
FLOW STATISTICS--ALL PROTOCOLS:  12 records
*BYTES min 40; max 165088
  quartiles LQ 150.00000 Med 504.00000 UQ 4000.00000 UQ-LQ 3850.00000
   interval_max|count<=max|%_of_input|   cumul_%|
             40|         1|  8.333333|  8.333333|
             60|         1|  8.333333| 16.666667|
            100|         0|  0.000000| 16.666667|
            150|         1|  8.333333| 25.000000|
            256|         2| 16.666667| 41.666667|
           1000|         3| 25.000000| 66.666667|
          10000|         3| 25.000000| 91.666667|
         100000|         0|  0.000000| 91.666667|
        1000000|         1|  8.333333|100.000000|
     4294967295|         0|  0.000000|100.000000|
*PACKETS min 1; max 159
  quartiles LQ 3.00000 Med 10.00000 UQ 17.50000 UQ-LQ 14.50000
   interval_max|count<=max|%_of_input|   cumul_%|
              3|         3| 25.000000| 25.000000|
              4|         1|  8.333333| 33.333333|
             10|         2| 16.666667| 50.000000|
             20|         4| 33.333333| 83.333333|
             50|         0|  0.000000| 83.333333|
            100|         0|  0.000000| 83.333333|
            500|         2| 16.666667|100.000000|
           1000|         0|  0.000000|100.000000|
          10000|         0|  0.000000|100.000000|
     4294967295|         0|  0.000000|100.000000|
...

Look at the top 5 by bytes, this time including the "distinct"/unique source and destination IPs.

sans@sec503:~/nik$ rwstats attack-trace.rw --count=5 --fields=bytes --values=bytes,distinct:sip,dip
INPUT: 12 Records for 12 Bins and 178561 Total Bytes
OUTPUT: Top 5 Bins by Bytes
     bytes|               Bytes|        sIP-Distinct|        dIP-Distinct|    %Bytes|   cumul_%|
    165088|              165088|                   1|                   1| 92.454679| 92.454679|
      4777|                4777|                   1|                   1|  2.675276| 95.129956|
      4488|                4488|                   1|                   1|  2.513427| 97.643382|
      1590|                1590|                   1|                   1|  0.890452| 98.533834|
       801|                 801|                   1|                   1|  0.448586| 98.982421|

Set a threshold for the number of records a bin must have before it is reported.

sans@sec503:~/nik$ rwstats attack-trace.rw --threshold=6 --fields=sIP
INPUT: 12 Records for 2 Bins and 12 Total Records
OUTPUT: Top 2 bins by Records (threshold 6)
                                    sIP|   Records|  %Records|   cumul_%|
                         98.114.205.102|         6| 50.000000| 50.000000|
                         192.150.11.111|         6| 50.000000|100.000000|

Set the threshold to 1500 bytes, so that only bins with at least that many bytes are reported.

sans@sec503:~/nik$ rwstats attack-trace.rw --threshold=1500 --fields=sIP --values=bytes
INPUT: 12 Records for 2 Bins and 178561 Total Bytes
OUTPUT: Top 2 bins by Bytes (threshold 1500)
                                    sIP|               Bytes|    %Bytes|   cumul_%|
                         98.114.205.102|              171264| 95.913441| 95.913441|
                         192.150.11.111|                7297|  4.086559|100.000000|

Both records above match that criterion. Let's change this to a threshold of 7298 to get just one record.

sans@sec503:~/nik$ rwstats attack-trace.rw --threshold=7298 --fields=sIP --values=bytes
INPUT: 12 Records for 2 Bins and 178561 Total Bytes
OUTPUT: Top 1 bins by Bytes (threshold 7298)
                                    sIP|               Bytes|    %Bytes|   cumul_%|
                         98.114.205.102|              171264| 95.913441| 95.913441|

The above shows that, with our threshold, only one record was returned. Next, remove the two rightmost columns, the percentage fields, with --no-percents.

sans@sec503:~/nik$ rwstats attack-trace.rw --threshold=7298 --fields=sIP \
--values=bytes --no-percents
INPUT: 12 Records for 2 Bins and 178561 Total Bytes
OUTPUT: Top 1 bins by Bytes (threshold 7298)
                                    sIP|               Bytes|
                         98.114.205.102|              171264|

Characterizing traffic by time. View records in 20-second buckets.

sans@sec503:~/nik$ rwcount attack-trace.rw --bin-size=20
               Date|        Records|               Bytes|          Packets|
2019/04/20T03:28:20|           8.26|           100551.36|           212.18|
2019/04/20T03:28:40|           3.74|            78009.64|           135.82|

The rwcount default bin size is 30 seconds.

sans@sec503:~/nik$ rwcount attack-trace.rw
               Date|        Records|               Bytes|          Packets|
2019/04/20T03:28:00|           2.60|             2218.09|            16.36|
2019/04/20T03:28:30|           9.40|           176342.91|           331.64|
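
The fractional record counts (2.60 and 9.40) appear because rwcount can split a single flow across every bin it overlaps. My understanding is that the default load scheme divides a flow's records, bytes and packets in proportion to the time spent in each bin. Treat that as an assumption and check the rwcount manual page for your version. The sketch below shows the arithmetic with a made-up flow; the function name spread_flow and the numbers are hypothetical.

```python
def spread_flow(start, end, volume, bin_size):
    """Split `volume` across bins in proportion to the time spent in each.

    `start` and `end` are seconds since some reference point.
    Returns {bin_start: share_of_volume}.
    """
    first_bin = start - (start % bin_size)
    if end <= start:
        # Zero-duration flow: everything lands in the bin holding the start time.
        return {first_bin: float(volume)}
    shares = {}
    duration = end - start
    bin_start = first_bin
    while bin_start < end:
        overlap = min(end, bin_start + bin_size) - max(start, bin_start)
        shares[bin_start] = volume * overlap / duration
        bin_start += bin_size
    return shares

# Hypothetical flow: starts at t=25s, ends at t=35s, carries 100 bytes, 30-second bins.
print(spread_flow(25, 35, 100, 30))
```

A flow that straddles a bin boundary therefore contributes a fraction of one record to each side, which is how a bin can end up with a non-integer record count.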

You can skip bins with zero records, bytes or packets by using --skip-zeroes. I don't have any zeroes below. At the same time, I've changed the --bin-size to 20 seconds rather than the default 30.

sans@sec503:~/nik$ rwcount attack-trace.rw --bin-size=20 --skip-zeroes
               Date|        Records|               Bytes|          Packets|
2019/04/20T03:28:20|           8.26|           100551.36|           212.18|
2019/04/20T03:28:40|           3.74|            78009.64|           135.82|

Reverse sort all records by destination IP, protocol and bytes. rwsort produces binary output that cannot be written to the screen, hence the pipe to rwcut.

sans@sec503:~/nik$ rwsort attack-trace.rw --fields=dip,protocol,bytes \
--reverse | rwcut --fields=dip,protocol,bytes,stime \
--num-recs=10 --ipv6-policy=ignore
            dIP|pro|     bytes|                  sTime|
 192.150.11.111|  6|    165088|2019/04/20T03:28:34.516|
 192.150.11.111|  6|      4777|2019/04/20T03:28:28.509|
 192.150.11.111|  6|       798|2019/04/20T03:28:33.576|
 192.150.11.111|  6|       381|2019/04/20T03:28:30.466|
 192.150.11.111|  6|       168|2019/04/20T03:28:28.374|
 192.150.11.111|  6|        52|2019/04/20T03:28:44.593|
 98.114.205.102|  6|      4488|2019/04/20T03:28:34.517|
 98.114.205.102|  6|      1590|2019/04/20T03:28:28.509|
 98.114.205.102|  6|       801|2019/04/20T03:28:33.457|
 98.114.205.102|  6|       250|2019/04/20T03:28:30.466|

Perform the reverse sort based on the bytes.

sans@sec503:~/nik$ rwsort attack-trace.rw --fields=bytes,dip,protocol --reverse | \
rwcut --fields=dip,protocol,bytes,stime --num-recs=10 --ipv6-policy=ignore
            dIP|pro|     bytes|                  sTime|
 192.150.11.111|  6|    165088|2019/04/20T03:28:34.516|
 192.150.11.111|  6|      4777|2019/04/20T03:28:28.509|
 98.114.205.102|  6|      4488|2019/04/20T03:28:34.517|
 98.114.205.102|  6|      1590|2019/04/20T03:28:28.509|
 98.114.205.102|  6|       801|2019/04/20T03:28:33.457|
 192.150.11.111|  6|       798|2019/04/20T03:28:33.576|

Create a set of IP addresses from flow data using a combination of rwfilter and rwset. This can be used for export from flow and import into other security tools such as SIEM, Firewall, etc.

sans@sec503:~/nik$ rwfilter --type=all --pass=stdout --proto=0- \
--start-date=2022/04/1T00 --end-date=2022/04/04 --bytes-per-packet=70 \
--max-pass=100 | rwset --any-file=ip_from_flow.set

Validate the exported records, by leveraging rwsetcat.

sans@sec503:~/nik$ rwsetcat ip_from_flow.set
8.8.8.8
18.118.192.126
34.193.254.175
35.168.220.189
172.28.10.137
172.28.30.2
172.28.50.2
192.225.158.1

Reverse this process, using rwsetbuild. Create a set of IPs from a txt file. This can be used for ignoring future flows via an allow/permit list.

sans@sec503:~/nik$ rwsetbuild --ip-ranges ip.txt ip.set

Read the created set via rwsetcat.

sans@sec503:~/nik$ rwsetcat ip.set
1.1.1.1
2.2.2.2
3.3.3.3
4.4.4.4
5.5.5.5

Get statistics on the IP addresses.

sans@sec503:~/nik$ rwsetcat --print-statistics ip.set
Network Summary
        minimumIP =         1.1.1.1
        maximumIP =         5.5.5.5
                 5 hosts (/32s),    0.000000% of 2^32
                 5 occupied /8s,    1.953125% of 2^8
                 5 occupied /16s,   0.007629% of 2^16
                 5 occupied /24s,   0.000030% of 2^24
                 5 occupied /27s,   0.000004% of 2^27
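
The percentages in that summary are just the number of occupied blocks divided by the number of possible blocks at each prefix length. A short Python sketch using the standard ipaddress module reproduces the counts for the five addresses in ip.set. The function name occupied_blocks is my own.

```python
import ipaddress

# The five addresses stored in ip.set above.
addresses = ["1.1.1.1", "2.2.2.2", "3.3.3.3", "4.4.4.4", "5.5.5.5"]

def occupied_blocks(ips, prefix_len):
    """Return (count of distinct /prefix_len networks, percent of all such networks)."""
    networks = {ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False) for ip in ips}
    return len(networks), 100.0 * len(networks) / 2 ** prefix_len

for prefix in (32, 8, 16, 24, 27):
    count, pct = occupied_blocks(addresses, prefix)
    print(f"{count} occupied /{prefix}s, {pct:.6f}% of 2^{prefix}")
```

Because each of the five addresses sits in a different /8, every prefix length reports five occupied blocks and only the denominator changes.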

Get a snapshot view of the network structure with rwsetcat.

sans@sec503:~/nik$ rwsetcat ip.set --network-structure
TOTAL| 5 hosts in 5 /8s, 5 /16s, 5 /24s, and 5 /27s

Get a different view of the network structure with rwsetcat.

sans@sec503:~/nik$ rwsetcat ip.set --network-structure=24
        1.1.1.0/24| 1
        2.2.2.0/24| 1
        3.3.3.0/24| 1
        4.4.4.0/24| 1
        5.5.5.0/24| 1

Resolve IP addresses to host names using rwresolve, taking the data from the rwsetcat output.

sans@sec503:~/nik$ rwsetcat ip.set | rwresolve
one.one.one.one
2.2.2.2
3.3.3.3
4.4.4.4
dynamic-005-005-005-005.5.5.pool.telefonica.de

About to do another resolve. Review the data first via rwcut.

sans@sec503:~/nik$ rwcut 8.rw --fields=sip,dip --num-recs=5 --ipv6-policy=ignore
            sIP|            dIP|
        8.8.8.8|  172.28.10.137|
        8.8.8.8|  172.28.10.137|
        8.8.8.8|  172.28.10.137|
        8.8.8.8|  172.28.10.137|
        8.8.8.8|  172.28.10.137|

Doing the resolve by specifying the getnameinfo resolver.

sans@sec503:~/nik$ rwcut 8.rw --fields=sip,dip --num-recs=5 --ipv6-policy=ignore | \
rwresolve --ip-fields=1,2 --resolver=getnameinfo
            sIP|            dIP|
dns.google|  172.28.10.137|
dns.google|  172.28.10.137|
dns.google|  172.28.10.137|
dns.google|  172.28.10.137|
dns.google|  172.28.10.137|

Find the top 5 DNS Servers seen within the flows using rwfilter and rwstats.
Interesting that a public DNS server is seen as the device with the highest number of packets. I was expecting to see an internal DNS server. Then again, it could be down to the location of this sensor.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=in --sport=53 | rwstats --values=packets \
--fields sIP --count=5 --ipv6-policy=ignore
INPUT: 1215737 Records for 276 Bins and 1348651 Total Packets
OUTPUT: Top 5 Bins by Packets
            sIP|        Packets|  %Packets|   cumul_%|
        8.8.8.8|        1330414| 98.647760| 98.647760|
   199.212.0.63|           9296|  0.689281| 99.337041|
 199.180.180.63|           2011|  0.149112| 99.486153|
  204.61.216.50|           1272|  0.094316| 99.580470|
 205.251.199.83|            586|  0.043451| 99.623920|

Looking at the DNS communication from the bytes perspective using rwfilter and rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=in --sport=53 | rwstats --values=bytes \
--fields sIP --count=5 --ipv6-policy=ignore
INPUT: 1215737 Records for 276 Bins and 212797321 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|               Bytes|    %Bytes|   cumul_%|
        8.8.8.8|           209509379| 98.454895| 98.454895|
   199.212.0.63|              758309|  0.356353| 98.811248|
 199.180.180.63|              710009|  0.333655| 99.144903|
  204.61.216.50|              449465|  0.211217| 99.356120|
     193.0.9.10|              205880|  0.096749| 99.452870|

Looking at it from the number of flow records, using rwfilter and rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=in --sport=53 | rwstats --values=records \
--fields sIP --count=5 --ipv6-policy=ignore
INPUT: 1215737 Records for 276 Bins and 1215737 Total Records
OUTPUT: Top 5 Bins by Records
            sIP|   Records|  %Records|   cumul_%|
        8.8.8.8|   1200021| 98.707286| 98.707286|
   199.212.0.63|      7931|  0.652361| 99.359648|
 199.180.180.63|      2008|  0.165167| 99.524815|
  204.61.216.50|      1271|  0.104546| 99.629361|
     193.0.9.10|       581|  0.047790| 99.677151|

The above relates to traffic coming into the enterprise. What about traffic going out to DNS servers?

Looking at it from a different perspective using rwfilter and rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=out,outweb --dport=53 | rwstats --values=bytes \
--fields sIP --count=5 --ipv6-policy=ignore
INPUT: 1225733 Records for 10 Bins and 110746579 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|               Bytes|    %Bytes|   cumul_%|
  172.28.10.137|           110716457| 99.972801| 99.972801|
    172.28.30.2|               13824|  0.012483| 99.985284|
    172.28.20.3|               12528|  0.011312| 99.996596|
    172.28.20.5|                 960|  0.000867| 99.997463|
    172.28.30.5|                 680|  0.000614| 99.998077|

The number of flow records using rwfilter and rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=out,outweb --dport=53 | rwstats --values=records \
--fields sIP --count=5 --ipv6-policy=ignore
INPUT: 1225733 Records for 10 Bins and 1225733 Total Records
OUTPUT: Top 5 Bins by Records
            sIP|   Records|  %Records|   cumul_%|
  172.28.10.137|   1225669| 99.994779| 99.994779|
    172.28.30.2|        29|  0.002366| 99.997145|
    172.28.20.3|        24|  0.001958| 99.999103|
   172.28.10.89|         3|  0.000245| 99.999347|
    172.28.30.5|         2|  0.000163| 99.999510|

Digging deeper to see what the host at 172.28.10.137 is doing, using rwfilter and rwstats

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=out,outweb --dport=53 | rwstats --values=bytes \
--fields sIP,dip,dport --count=5 --ipv6-policy=ignore
INPUT: 1225733 Records for 359 Bins and 110746579 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|            dIP|dPort|               Bytes|    %Bytes|   cumul_%|
  172.28.10.137|        8.8.8.8|   53|           108039369| 97.555491| 97.555491|
  172.28.10.137|   199.212.0.63|   53|             1382580|  1.248418| 98.803909|
  172.28.10.137|   192.175.48.6|   53|              300216|  0.271084| 99.074993|
  172.28.10.137|  192.175.48.42|   53|              299097|  0.270073| 99.345066|
  172.28.10.137| 199.180.180.63|   53|              153126|  0.138267| 99.483333|

What is your conclusion from the above?

What sensor is this traffic coming from? Using rwfilter and rwstats

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=out,outweb --dport=53 | rwstats --values=records \
--fields sensor --count=5 --ipv6-policy=ignore --no-percent
INPUT: 1225733 Records for 1 Bin and 1225733 Total Records
OUTPUT: Top 5 Bins by Records
   sensor|   Records|
 Internal|   1225733|

Looking at it from a different perspective via rwfilter and rwuniq.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=17 --pass=stdout  --type=out,outweb --dport=53  | rwuniq --values=flows \
--fields=sensor
   sensor|   Records|
 Internal|   1225733|

Looking at the address 172.28.10.137 to identify all communication. Find the combination of unique source and destination IP and source and destination ports. Sort the results. Doing this once again, via rwfilter and rwuniq.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=all --any-address=172.28.10.137 | \
rwuniq --values=flows,distinct:sip,distinct:dip,distinct:sport,distinct:dport \
--fields type,protocol --sort
   type|pro|   Records|        sIP-Distinct|        dIP-Distinct|sPort|dPort|
     in|  1|       160|                   8|                   1|    1|    1|
     in|  6|    408473|                   8|                   1|21465|65477|
     in|  8|         1|                   1|                   1|    1|    1|
     in| 17|   1245841|                 280|                   1|  279|14462|
     in| 63|         1|                   1|                   1|    1|    1|
    out|  1|       139|                   1|                   8|    1|    1|
    out|  6|     12106|                   1|                   8|   21| 9732|
    out| 17|   1230067|                   1|                 355| 8708|   99|
  inweb|  6|     10791|                 294|                   1|  185| 7857|
 

Finding the top 3 unique destination ports for traffic going outbound using rwfilter and rwuniq.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb | rwstats --value=flows \
--fields=dport --count=3
INPUT: 1894243 Records for 31811 Bins and 1894243 Total Records
OUTPUT: Top 3 Bins by Records
dPort|   Records|  %Records|   cumul_%|
   53|   1225733| 64.708329| 64.708329|
  443|    120219|  6.346546| 71.054875|
 9573|      9755|  0.514981| 71.569857|
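
The %Records and cumul_% columns are plain ratios against the INPUT record count, so they are easy to reproduce by hand as a sanity check. Below is a minimal Python sketch of that arithmetic, assuming the per-port counts and the total are copied from the output above. The `top_n_with_percent` helper is hypothetical and not part of SiLK.

```python
def top_n_with_percent(counts, total, n=3):
    """Return (key, count, percent, cumulative percent) for the n largest bins."""
    rows, cumulative = [], 0.0
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for key, count in ranked[:n]:
        percent = 100.0 * count / total  # share of all records
        cumulative += percent            # running total, like cumul_%
        rows.append((key, count, percent, cumulative))
    return rows


# Values copied from the rwstats output above; total is the INPUT record count.
dport_records = {53: 1225733, 443: 120219, 9573: 9755}
rows = top_n_with_percent(dport_records, total=1894243)
for dport, records, percent, cumulative in rows:
    print(f"{dport:>5}|{records:>10}|{percent:>10.6f}|{cumulative:>10.6f}|")
```

The percentages are taken against all 1894243 records, not only the three bins shown, which is why the cumulative column stops short of 100.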

Since I've done some work with port 53 above, let's look at port 443.

Find the top 5 source IPs communicating via port 443 in flows that average 250 or more bytes per packet.  Note the rwfilter --bytes-per-packet=250-.  Once again, using rwfilter and rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb --dport=443 --bytes-per-packet=250- | \
rwstats --value=bytes --fields=sip --count=5 --ipv6-policy=ignore
INPUT: 120213 Records for 10 Bins and 691383585 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|               Bytes|    %Bytes|   cumul_%|
    172.28.20.6|           130274858| 18.842631| 18.842631|
    172.28.30.5|           114087108| 16.501275| 35.343906|
    172.28.30.2|           104686092| 15.141536| 50.485442|
    172.28.30.3|            90222518| 13.049560| 63.535002|
    172.28.20.3|            72788328| 10.527922| 74.062925|

Looking at flow records that average 0-250 bytes per packet. Note the rwfilter --bytes-per-packet=0-250.  Adding the duration to this activity. Interestingly, these flows all have a duration of 0. Scanning?

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb --dport=443 --bytes-per-packet=0-250 | \
rwstats --value=bytes --fields=sip,dip,sport,dport,packets,duration --count=5 \
--ipv6-policy=ignore --no-percent
INPUT: 6 Records for 6 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|            dIP|sPort|dPort|   packets|durat|               Bytes|
    172.28.30.4|  23.58.146.215|57496|  443|         1|    0|                 124|
    172.28.30.3|  23.58.146.216|65523|  443|         1|    0|                 124|
    172.28.30.3|  23.58.146.216|56311|  443|         1|    0|                 124|
    172.28.20.6|  23.58.146.215|49308|  443|         1|    0|                 124|
    172.28.30.4|  184.51.157.69|58644|  443|         1|    0|                 124|

Find flows where the duration is 0 and the bytes-per-packet is at most 250 using rwfilter and rwstats.

sans@sec503:~$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=0- --pass=stdout  \
--type=out,outweb --dport=443 --bytes-per-packet=0-250 --duration=0 | rwstats --value=bytes \
--fields=stime,sip,dip,sport,dport,packets,duration --count=5 --ipv6-policy=ignore --no-percent --bin=3600
INPUT: 6 Records for 6 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
              sTime|            sIP|            dIP|sPort|dPort|   packets|durat|               Bytes|
2022/02/18T18:00:00|    172.28.20.6|  23.58.146.215|49308|  443|         1|    0|                 124|
2022/02/17T17:00:00|    172.28.20.6|  23.58.146.216|56289|  443|         1|    0|                 124|
2022/03/24T15:00:00|    172.28.30.4|  184.51.157.69|58644|  443|         1|    0|                 124|
2022/03/04T17:00:00|    172.28.30.3|  23.58.146.216|56311|  443|         1|    0|                 124|
2022/03/06T21:00:00|    172.28.30.3|  23.58.146.216|65523|  443|         1|    0|                 124|

Add the type column to validate the direction of the traffic. Still using rwfilter and rwstats.

sans@sec503:~$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=0- --pass=stdout  --type=out,outweb \
--dport=443 --bytes-per-packet=0-250 --duration=0 | rwstats --value=bytes --fields=stime,sip,dip,sport,dport,packets,duration,type \
--count=5 --ipv6-policy=ignore --no-percent --bin=3600
INPUT: 6 Records for 6 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
              sTime|            sIP|            dIP|sPort|dPort|   packets|durat|   type|               Bytes|
2022/03/04T17:00:00|    172.28.30.3|  23.58.146.216|56311|  443|         1|    0|    out|                 124|
2022/03/24T15:00:00|    172.28.30.4|  184.51.157.69|58644|  443|         1|    0|    out|                 124|
2022/02/17T17:00:00|    172.28.20.6|  23.58.146.216|56289|  443|         1|    0|    out|                 124|
2022/02/18T18:00:00|    172.28.20.6|  23.58.146.215|49308|  443|         1|    0|    out|                 124|
2022/03/06T21:00:00|    172.28.30.3|  23.58.146.216|65523|  443|         1|    0|    out|                 124|

Do we have similar traffic on the inside? Removing the --type switch from rwfilter.

sans@sec503:~$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=0- --pass=stdout --dport=443 \
--bytes-per-packet=0-250 --duration=0 | rwstats --value=bytes --fields=stime,sip,dip,sport,dport,packets,duration,type \
--count=5 --ipv6-policy=ignore --no-percent --bin=3600
INPUT: 147455 Records for 147297 Bins and 92351786 Total Bytes
OUTPUT: Top 5 Bins by Bytes
              sTime|            sIP|            dIP|sPort|dPort|   packets|durat|   type|               Bytes|
2022/03/17T21:00:00|   10.200.223.2|   172.28.3.173|34796|  443|        32|    0|  inweb|                1920|
2022/03/17T21:00:00|   10.200.223.2|   172.28.2.183|54576|  443|        32|    0|  inweb|                1920|
2022/03/17T21:00:00|   10.200.223.2|   172.28.14.48|33192|  443|        32|    0|  inweb|                1920|
2022/03/17T21:00:00|   10.200.223.2|  172.28.12.198|58278|  443|        32|    0|  inweb|                1920|
2022/03/17T21:00:00|   10.200.223.2|   172.28.14.69|48240|  443|        32|    0|  inweb|                1920|

Taking a different view. Looking for smaller outbound transfers. Note the --type=out,outweb. Maybe beaconing? We also talk about detecting beaconing in SANS SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals using the Fast Fourier Transform. The byte counts below are identical for the 3 unique hosts.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb --dport=443 --bytes-per-packet=0-250 | \
rwstats --value=bytes --fields=sip --count=5 --ipv6-policy=ignore
INPUT: 6 Records for 3 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|               Bytes|    %Bytes|   cumul_%|
    172.28.20.6|                 248| 33.333333| 33.333333|
    172.28.30.4|                 248| 33.333333| 66.666667|
    172.28.30.3|                 248| 33.333333|100.000000|
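
To make the beaconing question concrete, here is a minimal Python sketch of the Fourier idea: bin the flow start times into a count per second, transform, and read the period off the strongest frequency. It is an illustration only. It uses a naive discrete Fourier transform from the standard library rather than a true FFT (in a notebook you would more likely reach for something like numpy's FFT routines), the `dominant_period` helper is hypothetical, and the sample series is synthetic rather than taken from the flows above.

```python
import cmath


def dominant_period(counts, bin_seconds=1.0):
    """Return the strongest repeating period (seconds) in a binned series, or None."""
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]  # remove the constant (DC) component
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        mag = abs(coeff)
        # A pulse train has harmonics of equal height; the small tolerance
        # keeps the lowest frequency among near-equal peaks.
        if mag > best_mag * (1 + 1e-6):
            best_k, best_mag = k, mag
    if best_k == 0 or best_mag < 1e-9:
        return None  # flat series, nothing periodic to report
    return n * bin_seconds / best_k


# Synthetic example: one flow every 10 seconds for two minutes.
counts = [1 if t % 10 == 0 else 0 for t in range(120)]
period = dominant_period(counts)
print(period)
```

With only six matching records across three hosts, the real data here is far too sparse for this to be meaningful; the approach needs a long, regularly sampled series per source and destination pair.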

Get some additional protocol statistics via rwfilter and rwstats. Note the --print-statistics for rwfilter.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb --dport=443 --bytes-per-packet=0-250 \
--print-statistics | rwstats --value=bytes --fields=sip --count=5 --ipv6-policy=ignore \
--detail-proto-stats=6 | grep "min"
Files  1235.  Read    1894243.  Pass          6. Fail     1894237.
*BYTES min 124; max 124
*PACKETS min 1; max 1
*BYTES/PACKET min 124; max 124

Revisiting the source IPs with low byte count. What destination are they communicating with? Adding the destination field to rwstats.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=0- --pass=stdout  --type=out,outweb --dport=443 --bytes-per-packet=0-250 | \
rwstats --value=bytes --fields=sip,dip,sport,dport,packets --count=5 --ipv6-policy=ignore \
--no-percent
INPUT: 6 Records for 6 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|            dIP|sPort|dPort|   packets|               Bytes|
    172.28.20.6|  23.58.146.216|56289|  443|         1|                 124|
    172.28.30.3|  23.58.146.216|56311|  443|         1|                 124|
    172.28.30.4|  184.51.157.69|58644|  443|         1|                 124|
    172.28.20.6|  23.58.146.215|49308|  443|         1|                 124|
    172.28.30.4|  23.58.146.215|57496|  443|         1|                 124|

Obviously something is wrong above. There is just too much commonality there.  Let's see: 20 (IP header length) + 20 (assumed TCP header) = 40 bytes of headers, and 124 - 40 = 84 bytes. Each of these packets carries ~84 bytes of data beyond the IP and TCP headers. Look at the IPs also to find the commonality.

Resolve those IP addresses of the hosts above, using rwresolve.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=0- --pass=stdout  --type=out,outweb \
--dport=443 --bytes-per-packet=0-250 | rwstats --value=bytes --fields=sip,dip,sport,dport,packets --count=5 --ipv6-policy=ignore \
--no-percent | rwresolve
INPUT: 6 Records for 6 Bins and 744 Total Bytes
OUTPUT: Top 5 Bins by Bytes
            sIP|            dIP|sPort|dPort|   packets|               Bytes|
    172.28.20.6|a23-58-146-216.deploy.static.akamaitechnologies.com|56289|  443|         1|                 124|
    172.28.30.3|a23-58-146-216.deploy.static.akamaitechnologies.com|56311|  443|         1|                 124|
    172.28.30.4|a184-51-157-69.deploy.static.akamaitechnologies.com|58644|  443|         1|                 124|
    172.28.20.6|a23-58-146-215.deploy.static.akamaitechnologies.com|49308|  443|         1|                 124|
    172.28.30.4|a23-58-146-215.deploy.static.akamaitechnologies.com|57496|  443|         1|                 124|

Focus on one particular address using the --any-address flag with rwfilter. Pipe the output to rwstats.

sans@sec503:~$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=0- --pass=stdout --dport=443 \
--bytes-per-packet=0-250 --duration=0- --any-address=23.58.146.216 | rwstats --value=bytes \
--fields=stime,sip,dip,sport,dport,packets,duration,type,proto --count=5 --ipv6-policy=ignore --no-percent \
--bin=3600
INPUT: 3 Records for 3 Bins and 372 Total Bytes
OUTPUT: Top 5 Bins by Bytes
              sTime|            sIP|            dIP|sPort|dPort|   packets|durat|   type|pro|               Bytes|
2022/03/06T21:00:00|    172.28.30.3|  23.58.146.216|65523|  443|         1|    0|    out| 17|                 124|
2022/03/04T17:00:00|    172.28.30.3|  23.58.146.216|56311|  443|         1|    0|    out| 17|                 124|
2022/02/17T17:00:00|    172.28.20.6|  23.58.146.216|56289|  443|         1|    0|    out| 17|                 124|

The above is interesting, as the traffic is all on UDP 443 rather than TCP. QUIC?
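
Since the protocol turns out to be 17 (UDP), the earlier payload estimate that assumed a TCP header is worth redoing. A small Python sketch of the arithmetic follows, assuming the flow byte count includes the IP header and that no IP or TCP options are present. The `payload_bytes` helper is hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options
UDP_HEADER = 8    # bytes, fixed size


def payload_bytes(total_bytes, transport_header, ip_header=IPV4_HEADER):
    """Bytes left for the application once IP and transport headers are removed."""
    return total_bytes - ip_header - transport_header


tcp_estimate = payload_bytes(124, TCP_HEADER)  # the earlier TCP assumption
udp_estimate = payload_bytes(124, UDP_HEADER)  # what protocol 17 implies
print(tcp_estimate, udp_estimate)
```

Under these assumptions each single-packet flow carries 96 bytes of UDP payload rather than the 84 bytes estimated for TCP.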

Keeping it simple by finding the first 10 records that match a particular query using rwfilter and rwuniq.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=6  --pass-destination=stdout \
--max-pass=10 | rwuniq --fields sip,sport,dip,dport
                                    sIP|sPort|                                    dIP|dPort|   Records|
                           52.109.88.36|  443|                           172.28.10.89|56674|         1|
                           10.200.223.4|50494|                            172.28.10.5|   22|         1|
                          142.250.72.10|  443|                            172.28.20.5|53715|         1|
                            172.28.10.5|   22|                           10.200.223.4|50494|         1|
                           52.109.88.36|  443|                           172.28.10.89|56673|         1|
                           10.200.223.4|50673|                             172.28.1.1|   22|         1|
                          142.250.72.35|  443|                            172.28.30.5|53821|         1|
                            20.50.73.10|  443|                           172.28.10.89|56669|         1|
                          20.189.173.13|  443|                           172.28.10.80|64499|         1|
                          142.250.72.35|  443|                            172.28.20.5|53714|         1|

Find the first 10 records that fail (think grep --invert-match or grep -v) the query, using rwfilter and rwuniq. Notice that rather than --pass-destination it is now --fail-destination.

sans@sec503:~$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  \
--protocol=6  --fail-destination=stdout --max-fail=10 | rwuniq \
--fields sip,sport,dip,dport --ipv6-policy=ignore
            sIP|sPort|            dIP|dPort|   Records|
        8.8.8.8|   53|  172.28.10.137|56104|         1|
        8.8.8.8|   53|  172.28.10.137|54512|         1|
        8.8.8.8|   53|  172.28.10.137|55382|         1|
        8.8.8.8|   53|  172.28.10.137|55171|         1|
        8.8.8.8|   53|  172.28.10.137|56350|         1|
        8.8.8.8|   53|  172.28.10.137|55339|         1|
        8.8.8.8|   53|  172.28.10.137|54864|         1|
        8.8.8.8|   53|  172.28.10.137|56290|         1|
        8.8.8.8|   53|  172.28.10.137|55359|         1|
        8.8.8.8|   53|  172.28.10.137|56213|         1|


Combining the rwfilter --pass-destination and --fail-destination as well as writing --pass-destination and --fail-destination to file.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/04/01  --protocol=6  --pass-destination=stdout \
--max-pass=10 --fail-destination=6-fail.rw --max-fail=10 | rwfilter stdin --aport=443 --fail-destination=stdout \
--pass-destination=pass-443  | rwuniq --fields sip,sport,dip,dport
                                    sIP|sPort|                                    dIP|dPort|   Records|
                           10.200.223.4|50494|                            172.28.10.5|   22|         1|
                            172.28.10.5|   22|                           10.200.223.4|50494|         1|
                           10.200.223.4|50673|                             172.28.1.1|   22|         1|

Find 5 unique sessions that were initiated by the client, that is, the device sending the SYN packet. Note the --flags-initial with rwfilter. S/SA means we test the SYN and ACK flags on the first packet and match when SYN is set and ACK is not.

sans@sec503:~$ rwfilter --start-date=2019/04/01T0 --end-date=2022/05/01  --protocol=6  --pass-destination=stdout --aport=443 --flags-initial=S/SA \
--max-pass=5 | rwuniq --fields  stime,sIP,dIP,flags,initialflags,duration --values=records
              sTime|                                    sIP|                                    dIP|   flags|initialF|durat|   Records|
2019/05/02T16:30:15|                           172.16.10.13|                            13.107.5.88| SRPA   | S      |    0|         1|
2019/05/02T16:29:56|                           172.16.10.13|                           65.55.44.108|FSRPA   | S      |  132|         1|
2019/05/02T16:31:04|                           172.16.10.13|                           65.55.44.109| SRPA   | S      |    4|         1|
2019/05/02T16:30:45|                           172.16.10.13|                         157.55.135.128|FS PA   | S      |   19|         1|
2019/05/02T16:30:15|                           172.16.10.13|                           13.107.3.128| SRPA   | S      |    0|         1|

Similarly, find the devices acting as a server, meaning the device responded to a SYN with a SYN/ACK. Notice the rwfilter --flags-initial=SA/SA now tests the SYN and ACK flags to see if both are set.

sans@sec503:~$ rwfilter --start-date=2019/04/01T0 --end-date=2022/05/01  --protocol=6  --pass-destination=stdout --aport=443 --flags-initial=SA/SA \
--max-pass=5 | rwuniq --fields  stime,sIP,dIP,flags,initialflags,duration --values=records
              sTime|                                    sIP|                                    dIP|   flags|initialF|durat|   Records|
2019/05/02T16:30:15|                           13.107.3.128|                           172.16.10.13| S  A   | S  A   |    0|         1|
2019/05/02T16:30:45|                         157.55.135.128|                           172.16.10.13|FSRPA   | S  A   |   19|         1|
2019/05/02T16:31:04|                           65.55.44.109|                           172.16.10.13| S PA   | S  A   |    4|         1|
2019/05/02T16:29:56|                           65.55.44.108|                           172.16.10.13| S PA   | S  A   |  132|         1|
2019/05/02T16:30:15|                            13.107.5.88|                           172.16.10.13| S  A   | S  A   |    0|         1|
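
The HIGH/MASK notation used by --flags-initial can be read as a bitwise test: look only at the flags named in MASK and require exactly the flags named in HIGH to be set among them. Below is a minimal Python sketch of that reading. The `flags_match` and `to_bits` helpers are hypothetical illustrations, not SiLK code.

```python
# Standard TCP flag bit values.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def to_bits(letters):
    """Convert a flag string such as 'S  A' into a bit field."""
    bits = 0
    for letter in letters.upper():
        if letter in FLAG_BITS:  # skip the padding spaces seen in rwuniq output
            bits |= FLAG_BITS[letter]
    return bits


def flags_match(observed, spec):
    """True when the observed flags satisfy a HIGH/MASK spec such as 'S/SA'."""
    high, mask = spec.split("/")
    return (to_bits(observed) & to_bits(mask)) == to_bits(high)


print(flags_match("S", "S/SA"))      # client side: SYN without ACK
print(flags_match("S  A", "S/SA"))   # SYN/ACK does not match S/SA
print(flags_match("S  A", "SA/SA"))  # server side: both SYN and ACK set
```

Flags outside the mask are ignored, so a first packet with SYN plus some other flag outside SA would still match S/SA.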

Find 5 unique sessions that seem to have been fully completed. Notice the rwfilter --flags-all=SAFP/FSRPA tests the FIN, SYN, RST, PUSH and ACK flags to see if SYN, ACK, FIN and PUSH are set and RST is not.

sans@sec503:~$ rwfilter --start-date=2019/04/01T0 --end-date=2022/05/01  --protocol=6  --pass-destination=stdout --dport=22,80,443,4444 \
--flags-all=SAFP/FSRPA --max-pass=5 | rwuniq --fields  stime,sIP,dIP,dport,flags,type --values=records
              sTime|                                    sIP|                                    dIP|dPort|   flags|   type|   Records|
2019/05/02T16:38:30|                           172.16.10.13|                         192.96.162.110|   80|FS PA   | outweb|         1|
2019/05/02T16:38:32|                           172.16.10.13|                          192.96.162.33|   80|FS PA   | outweb|         2|
2019/05/02T16:38:32|                           172.16.10.13|                          23.33.106.133|   80|FS PA   | outweb|         1|
2019/05/02T16:30:45|                           172.16.10.13|                         157.55.135.128|  443|FS PA   | outweb|         1|

Look at the last 5 sessions again, this time adding the duration field to rwuniq. Also added flows, bytes and packets to the --values.

sans@sec503:~$ rwfilter --start-date=2019/04/01T0 --end-date=2022/05/01  --protocol=6  --pass-destination=stdout --dport=22,80,443,4444 --flags-all=SAFP/FSRPA --max-pass=5 | \
rwuniq --fields  stime,sIP,dIP,dport,flags,type,duration --values=flows,bytes,packets
              sTime|                                    sIP|                                    dIP|dPort|   flags|   type|durat|   Records|               Bytes|        Packets|
2019/05/02T16:38:32|                           172.16.10.13|                          192.96.162.33|   80|FS PA   | outweb|   78|         2|                 977|             14|
2019/05/02T16:38:32|                           172.16.10.13|                          23.33.106.133|   80|FS PA   | outweb|   78|         1|                 505|              7|
2019/05/02T16:30:45|                           172.16.10.13|                         157.55.135.128|  443|FS PA   | outweb|   19|         1|                6297|             16|
2019/05/02T16:38:30|                           172.16.10.13|                         192.96.162.110|   80|FS PA   | outweb|  108|         1|                 575|              7|

Leveraging rwbag. Preparing the data via rwfilter, then redirect it to rwbag.

sans@sec503:~/nik$ rwfilter --start-date=2019/04/01T0 --end-date=2022/05/01  --protocol=6  \
--pass-destination=stdout --dport=22,80,443,4444 --max-pass=5 | \
rwbag --bag-file=sipv4,sum-bytes,/tmp/test.bag

Viewing the contents in the bag created via rwbag.

sans@sec503:~/nik$ rwbagcat /tmp/test.bag
   172.16.10.13|                8106|
   172.16.40.12|                  80|
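
Conceptually, a bag is a mapping from a key (here the source IPv4 address) to a counter (here the summed bytes). The Python sketch below shows that idea only. The `build_bag` helper and the sample records are hypothetical and use documentation addresses, not the flows from this capture.

```python
from collections import Counter


def build_bag(flows):
    """Sum bytes per source IP, mimicking a sipv4 / sum-bytes bag."""
    bag = Counter()
    for sip, nbytes in flows:
        bag[sip] += nbytes
    return bag


# Hypothetical (sIP, bytes) records for illustration only.
flows = [("192.0.2.10", 500), ("192.0.2.10", 1500), ("192.0.2.20", 80)]
bag = build_bag(flows)
for sip, total in sorted(bag.items()):
    print(f"{sip:>15}|{total:>20}|")
```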

Leveraging rwscan to identify potential scanning IPs.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/05/01  --protocol=6  --pass-destination=stdout  | \
rwsort --fields sip,protocol,dip | rwscan --scan-model=2
             sip| proto|                   stime|                   etime|     flows|   packets|     bytes|
    10.200.223.2|     6|     2022-03-07 11:59:46|     2022-04-30 23:49:43|   1061401|  57208212|3107887558|
    10.200.223.3|     6|     2022-02-09 12:51:11|     2022-04-30 16:36:16|    413308|  10588224| 583916721|
    10.200.223.4|     6|     2022-02-08 14:26:13|     2022-04-30 00:14:41|   3749647|  65736155|4056928153|
    10.200.223.5|     6|     2022-02-11 20:55:22|     2022-04-30 15:11:11|   2776970|   7508499| 406143689|
    10.200.223.7|     6|     2022-02-08 15:50:32|     2022-03-24 22:18:35|    149108|   4259383| 241338192|
    10.200.223.8|     6|     2022-02-11 21:06:29|     2022-04-30 03:01:23|    232009|   3673446| 177412489|
     172.28.20.3|     6|     2022-02-18 16:23:53|     2022-04-30 23:15:50|       299|      1430|    181320|
     172.28.20.4|     6|     2022-02-24 18:32:19|     2022-04-29 23:05:43|       224|      1207|    170436|
     172.28.20.6|     6|     2022-02-16 16:05:19|     2022-04-21 01:39:55|      8202|     24551|   1342732|
     172.28.30.2|     6|     2022-02-16 16:26:12|     2022-04-26 16:05:56|       544|      2724|    351448|
     172.28.30.3|     6|     2022-02-10 17:42:20|     2022-03-19 18:53:36|       168|       497|     25844|
     172.28.30.4|     6|     2022-02-08 20:47:57|     2022-04-28 16:56:30|       525|      2626|    350572|
     172.28.30.5|     6|     2022-02-10 18:05:52|     2022-04-30 15:16:06|       446|      2428|    334660|
     172.28.50.2|     6|     2022-02-10 18:41:21|     2022-04-29 15:55:32|       580|      2883|    367628|

Narrowing the above down to only the IPs and their flow counts before storing them in the bag. First get the data via rwfilter, rwsort and rwscan. Then pipe this data into cut.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/05/01  \
--protocol=6  --pass-destination=stdout  | rwsort --fields sip,protocol,dip | \
rwscan --scan-model=2 --no-title --output-path=stdout | cut --fields=1,5 --delimiter='|'
    10.200.223.2|   1061401
    10.200.223.3|    413308
    10.200.223.4|   3749647
    10.200.223.5|   2776970
    10.200.223.7|    149108
    10.200.223.8|    232009
     172.28.20.3|       299
     172.28.20.4|       224
     172.28.20.6|      8202
     172.28.30.2|       544
     172.28.30.3|       168
     172.28.30.4|       525
     172.28.30.5|       446
     172.28.50.2|       580

Create the bag consisting of the IPs shown above. Reading directly from rwfilter, pipe it into rwsort, then rwscan, then rwbagbuild. After building the bag, use rwbagcat to view the records.

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/05/01  \
--protocol=6  --pass-destination=stdout  | rwsort --fields sip,protocol,dip | \
rwscan --scan-model=2 --no-title --output-path=stdout | \
cut --fields=1,5 --delimiter='|' | rwbagbuild --bag-input=stdin --key-type=sipv4 \
--counter-type=records | rwbagcat
   10.200.223.2|             1061401|
   10.200.223.3|              413308|
   10.200.223.4|             3749647|
   10.200.223.5|             2776970|
   10.200.223.7|              149108|
   10.200.223.8|              232009|
    172.28.20.3|                 299|
    172.28.20.4|                 224|
    172.28.20.6|                8202|
    172.28.30.2|                 544|
    172.28.30.3|                 168|
    172.28.30.4|                 525|
    172.28.30.5|                 446|
    172.28.50.2|                 580|

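Alternatively, group the bag's IPs by counter value. Notice --bin-ips to rwbagcat. Because only the IPs are passed to rwbagbuild this time, all 14 keys end up with a counter of 1.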

sans@sec503:~/nik$ rwfilter --start-date=2022/01/01T0 --end-date=2022/05/01  \
--protocol=6  --pass-destination=stdout  | rwsort --fields sip,protocol,dip | \
rwscan --scan-model=2 --no-title --output-path=stdout | cut --fields=1 \
--delimiter='|' | sort --unique | rwbagbuild --bag-input=stdin --key-type=sipv4 \
--counter-type=records | rwbagcat  --bin-ips
                   1|                  14|


Introducing rwnetmask. Maybe you have a network where communication looks like this.

sans@sec503:~/nik$ rwfilter --start-date=2022/04/01T0 --end-date=2022/05/01  --protocol=6  \
--pass-destination=stdout --max-pass=5  | rwuniq --fields sip,dip
                                    sIP|                                    dIP|   Records|
                           52.167.17.97|                            172.28.20.4|         1|
                          20.72.205.209|                            172.28.30.3|         1|
                           52.109.88.35|                            172.28.30.4|         1|
                           52.167.17.97|                            172.28.30.4|         1|
                          20.72.205.209|                           172.28.10.10|         1|
 
Rather than getting the full IP, you decide you would like a 24-bit mask applied to the IP address. Using rwnetmask, we see the IP addresses are changed to their /24 networks.

sans@sec503:~/nik$ rwfilter --start-date=2022/04/01T0 --end-date=2022/05/01  \
--protocol=6  --pass-destination=stdout --max-pass=5  | rwnetmask \
--4sip-prefix-length=24 --4dip-prefix-length=24 | rwcut --fields sip,dip
                                    sIP|                                    dIP|
                            52.167.17.0|                            172.28.20.0|
                            52.167.17.0|                            172.28.30.0|
                            20.72.205.0|                            172.28.10.0|
                            52.109.88.0|                            172.28.30.0|
                            20.72.205.0|                            172.28.30.0|
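
The same masking can be reproduced outside of SiLK when post-processing exported records. Here is a minimal Python sketch using the standard library's ipaddress module. The `mask_v4` helper is a hypothetical name, and the sample addresses are taken from the unmasked output above.

```python
import ipaddress


def mask_v4(address, prefix_length=24):
    """Return the network address of an IPv4 address for the given prefix length."""
    network = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
    return str(network.network_address)


for address in ("52.167.17.97", "172.28.20.4", "20.72.205.209"):
    print(f"{address:>15} -> {mask_v4(address)}")
```

Passing strict=False lets ip_network accept an address with host bits set, which is exactly the case when masking individual hosts.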
 
Find the well-known TCP ports on the network which seem to be the busiest, via rwfilter and rwuniq.

sans@sec503:~/nik$ rwfilter --protocol=6 --dport=0-1023 \
--start-date=2022/01/01 --end-date=2022/05/01 --pass=stdout --max-pass=1000 | \
rwuniq --fields dport --values flow,bytes,packets --sort
dPort|   Records|               Bytes|        Packets|
   21|         1|                 156|              3|
   22|        39|           248109713|        4847875|
   25|        17|             1047287|            809|
   80|         4|                6315|            104|
  443|       939|              912196|          20670|

More detail to understand the type of data and the sensor involved.

sans@sec503:~/nik$ rwfilter --protocol=6 --dport=0-1023 --start-date=2022/01/01 \
--end-date=2022/05/01 --pass=stdout --max-pass=1000 | rwuniq \
--fields dport,type,sensor --values flow,bytes,packets --sort
dPort|   type|   sensor|   Records|               Bytes|        Packets|
   21|     in| Internal|         1|                 156|              3|
   22|     in| Internal|        33|           247919188|        4845368|
   22|    out| Internal|         6|              190525|           2507|
   25|     in| Internal|        17|             1047287|            809|
   80|  inweb| Internal|         4|                6315|            104|
  443|  inweb| Internal|       939|              912196|          20670|


I find it quite interesting that the majority of the bytes and packets in this traffic are on port 22, typically associated with SSH, even though port 443 has far more records.

So far I've been specific about fields such as --fields=sip. How about grabbing all the fields with rwcut?

sans@sec503:~/nik$ rwcut --all-fields attack-trace.rw --num-recs=2
                                    sIP|                                    dIP|sPort|dPort|pro|   packets|     bytes|   flags|                  sTime| duration|                  eTime|   sensor|   in|  out|                                   nhIP|initialF|sessionF|attribut|appli|cla|   type|             sTime+msec|             eTime+msec| dur+msec|iTy|iCo|
                         98.114.205.102|                         192.150.11.111| 1821|  445|  6|         4|       168|FS  A   |2019/04/20T03:28:28.374|    0.354|2019/04/20T03:28:28.728| Internal|    0|    0|                                0.0.0.0| S      |F   A   |        |    0|all|     in|2019/04/20T03:28:28.374|2019/04/20T03:28:28.728|    0.354|   |   |
                         192.150.11.111|                         98.114.205.102|  445| 1821|  6|         3|       128|FS  A   |2019/04/20T03:28:28.375|    0.353|2019/04/20T03:28:28.728| Internal|    0|    0|                                0.0.0.0| S  A   |F   A   |        |    0|all|     in|2019/04/20T03:28:28.375|2019/04/20T03:28:28.728|    0.353|   |   |

Revisit the timestamps via rwcut.

sans@sec503:~/nik$ rwcut --fields stime attack-trace.rw --num-recs=5
                  sTime|
2019/04/20T03:28:28.374|
2019/04/20T03:28:28.375|
2019/04/20T03:28:28.509|
2019/04/20T03:28:28.509|
2019/04/20T03:28:30.466|

Use the legacy timestamp instead with rwcut, rather than the default.

sans@sec503:~/nik$ rwcut --fields stime attack-trace.rw --legacy-timestamp --num-recs=5
              sTime|
04/20/2019 03:28:28|
04/20/2019 03:28:28|
04/20/2019 03:28:28|
04/20/2019 03:28:28|
04/20/2019 03:28:30|

Or maybe get the time in epoch format, this time via rwuniq and --timestamp-format=epoch.

sans@sec503:~$ rwfilter --start=2022/01/05T0 --end=2022/07/01 --protocol=0- \
--pass=stdout --type=all --bytes=0-30 | rwuniq --bin-time=86400 --fields stime,type \
--values=records --sort-output --timestamp-format=epoch | head --lines=5
     sTime|   type|   Records|
1644624000|     in|      4136|
1644624000|    out|        52|
1644710400|     in|      2469|
1644796800|     in|      4307|
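
Epoch values are convenient for math but hard to read. The Python sketch below converts them back to a SiLK-style timestamp and shows how a start time falls into an 86400-second bin like the one requested with --bin-time. It assumes the epoch values are in UTC, and the `epoch_to_utc` and `day_bin` helpers are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Format epoch seconds as a SiLK-style UTC timestamp string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y/%m/%dT%H:%M:%S")


def day_bin(epoch_seconds, bin_seconds=86400):
    """Floor a timestamp to the start of its bin."""
    return epoch_seconds - (epoch_seconds % bin_seconds)


for epoch in (1644624000, 1644710400, 1644796800):
    print(epoch, epoch_to_utc(epoch))
```

Any start time within a given day floors to the same bin value, which is why each sTime in the output above is an exact multiple of 86400.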

Revisit rwcut formatting.

sans@sec503:~/nik$ rwcut --fields stime,duration,sip,dport attack-trace.rw --num-recs=5
                  sTime| duration|                                    sIP|dPort|
2019/04/20T03:28:28.374|    0.354|                         98.114.205.102|  445|
2019/04/20T03:28:28.375|    0.353|                         192.150.11.111| 1821|
2019/04/20T03:28:28.509|    4.938|                         98.114.205.102|  445|
2019/04/20T03:28:28.509|    4.938|                         192.150.11.111| 1828|
2019/04/20T03:28:30.466|    3.100|                         98.114.205.102| 1957|

Remove the column padding, leaving plain pipe-delimited output.

sans@sec503:~/nik$ rwcut --fields stime,duration,sip,dport attack-trace.rw --num-recs=5 --no-columns
sTime|duration|sIP|dPort|
2019/04/20T03:28:28.374|0.354|98.114.205.102|445|
2019/04/20T03:28:28.375|0.353|192.150.11.111|1821|
2019/04/20T03:28:28.509|4.938|98.114.205.102|445|
2019/04/20T03:28:28.509|4.938|192.150.11.111|1828|
2019/04/20T03:28:30.466|3.100|98.114.205.102|1957|
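
Pipe-delimited output like this is easy to load into other tools. The following is a minimal sketch, using only the Python standard library, of turning it into a list of dictionaries. The parse_rwcut name is my own, and the sample is the first few lines of the output above.

```python
import csv
import io

SAMPLE = """\
sTime|duration|sIP|dPort|
2019/04/20T03:28:28.374|0.354|98.114.205.102|445|
2019/04/20T03:28:28.375|0.353|192.150.11.111|1821|
2019/04/20T03:28:28.509|4.938|98.114.205.102|445|
"""


def parse_rwcut(text):
    """Parse pipe-delimited rwcut text into a list of dicts keyed by column title."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    # Strip padding so the same code also copes with the columnar layout.
    rows = [[cell.strip() for cell in row] for row in reader if row]
    # Each line ends with a delimiter, which csv reports as an empty last cell.
    rows = [row[:-1] if row[-1] == "" else row for row in rows]
    header, records = rows[0], rows[1:]
    return [dict(zip(header, record)) for record in records]


if __name__ == "__main__":
    for flow in parse_rwcut(SAMPLE):
        print(flow["sTime"], flow["sIP"], int(flow["dPort"]), float(flow["duration"]))
```

All values come back as strings, so convert dPort and duration to numbers before doing arithmetic on them.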

Revisit creating a file with rwfilter. This time, set the --compression-method to none.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --protocol=0- \
--type=all --max-pass=100 --compression-method=none --pass=uncompressed.rw

Leveraging rwfilter compression when creating files. Set the --compression-method to best.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --protocol=0- \
--type=all --max-pass=100 --compression-method=best --pass=compressed.rw

Review the files created by rwfilter to confirm the compression.

sans@sec503:~/nik$ ls *compressed* -l
-rw-rw-r-- 1 sans sans 1177 Jun 13 01:51 compressed.rw
-rw-rw-r-- 1 sans sans 8976 Jun 13 01:51 uncompressed.rw

Find echo replies by leveraging rwfilter --icmp-type and --icmp-code parameters. Specifically look at ICMP type 0 and code 0.

sans@sec503:~$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --icmp-type=0 --icmp-code=0 --type=all --max-pass=100000 \
--pass-destination=stdout | rwcut --num-recs=4 --fields sip,dip,proto,packets,bytes --icmp-type-and-code
                                    sIP|                                    dIP|pro|   packets|     bytes|sPort|dPort|
               fe80::250:56ff:fead:e8b6|                                ff02::2| 58|         1|        56|    0|    0|
                fe80::250:56ff:fead:445|                                ff02::2| 58|         1|        56|    0|    0|
                            66.35.60.78|                            172.28.30.5|  1|        10|       920|    0|    0|
                            66.35.60.78|                            172.28.30.2|  1|        15|      1380|    0|    0|

Note that the --fields list above does not include sport or dport, yet both columns appear in the output. With --icmp-type-and-code, rwcut uses them to show the ICMP type and code. Do note, ICMP does not use the concept of ports. A trick question I ask at interviews: "What protocol and port does ping use, TCP or UDP?" :-)

Are there any echo requests to match those replies?

sans@sec503:~$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --icmp-type=8 --icmp-code=0 --type=all --max-pass=100000 \
--pass-destination=stdout | rwcut --num-recs=4 --fields sip,dip,proto,packets,bytes --icmp-type-and-code
                                    sIP|                                    dIP|pro|   packets|     bytes|sPort|dPort|

That's interesting! No records were returned for echo requests. How can that be?! Did I miss something? Leave me a note in the comment section.

Look at the rwfilter --print-volume-statistics output to see if there are any clues as to why no records were returned.

sans@sec503:~$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --icmp-type=8 --icmp-code=0 --type=all --max-pass=100000 --print-volume-statistics
     |              Recs|           Packets|               Bytes|     Files|
Total|          25713809|         688813257|        402383908447|     13064|
 Pass|                 0|                 0|                   0|          |
 Fail|          25713809|         688813257|        402383908447|          |

Going back further in time, just to see what the ICMP echo request output looks like.

sans@sec503:~$ rwfilter --start=2012/01/05 --end=2022/07/01T23 --icmp-type=8 --icmp-code=0 --type=all --max-pass=100000 \
--pass-destination=stdout | rwcut --num-recs=4 --fields sip,dip,proto,packets,bytes --icmp-type-and-code
                                    sIP|                                    dIP|pro|   packets|     bytes|sPort|dPort|
                          192.168.2.166|                            192.168.2.1|  1|        31|      2604|    8|    0|
                          192.168.2.166|                            192.168.2.1|  1|         9|       756|    8|    0|
                          192.168.2.166|                            192.168.2.1|  1|        10|       840|    8|    0|
                          192.168.2.166|                            192.168.2.1|  1|        30|      2520|    8|    0|

We now see four records of ICMP Type 8 and Code 0.

Get the filenames via rwfilter --print-filenames.

sans@sec503:~$ rwfilter --start=2022/01/05 --end=2022/07/01T23 \
--icmp-type=8 --icmp-code=0 --type=all --max-pass=100000 --print-volume-statistics \
--print-filenames | more
/data/in/2022/02/08/in-internal_20220208.14
/data/out/2022/02/08/out-internal_20220208.14
/data/inweb/2022/02/08/iw-internal_20220208.14
/data/ext2ext/2022/02/08/ext2ext-internal_20220208.14
...

Find the missing files via rwfilter --print-missing-files.

sans@sec503:~$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --icmp-type=8 \
--icmp-code=0 --type=all --max-pass=100000 --print-volume-statistics \
--print-missing-files | more

Missing /data/out/2022/01/20/out-Internal_20220120.13
Missing /data/out/2022/01/20/out-Perimeter_20220120.13
Missing /data/out/2022/01/20/out-ERS_20220120.13
Missing /data/out/2022/01/20/out-internal_20220120.13
Missing /data/inweb/2022/01/20/iw-Internal_20220120.13
Missing /data/inweb/2022/01/20/iw-Perimeter_20220120.13
...
 
Leveraging rwfglob.

sans@sec503:~/nik$ rwfglob --start-date=2012/01/01 --end-date=2022/07/01 \
--no-file-names
globbed 24574 files; 0 on tape

Revisiting rwcount bin sizes from the time perspective. The output below shows 30-second intervals, which is the default --bin-size.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --dport=443 \
--type=all --max-pass=5 --pass-destination=stdout | rwcount
               Date|        Records|               Bytes|          Packets|
2022/02/08T15:28:30|           1.00|             6390.00|             5.00|
2022/02/08T15:29:00|           0.00|                0.00|             0.00|
2022/02/08T15:29:30|           0.00|                0.00|             0.00|
2022/02/08T15:30:00|           0.00|                0.00|             0.00|
2022/02/08T15:30:30|           0.00|                0.00|             0.00|
2022/02/08T15:31:00|           0.00|                0.00|             0.00|
2022/02/08T15:31:30|           1.00|             6390.00|             5.00|
2022/02/08T15:32:00|           0.00|                0.00|             0.00|
2022/02/08T15:32:30|           3.00|            19170.00|            15.00|

Adjusting the rwcount --bin-size, using shell arithmetic to work out the number of seconds. Here the --bin-size is changed to two-minute intervals.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --dport=443 \
--type=all --max-pass=5 --pass-destination=stdout | rwcount --bin-size=$((2*60))
               Date|        Records|               Bytes|          Packets|
2022/02/08T15:28:00|           1.00|             6390.00|             5.00|
2022/02/08T15:30:00|           1.00|             6390.00|             5.00|
2022/02/08T15:32:00|           3.00|            19170.00|            15.00|

Changing the --bin-size to five-minute intervals.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --dport=443 \
--type=all --max-pass=5 --pass-destination=stdout | rwcount --bin-size=$((5*60))
               Date|        Records|               Bytes|          Packets|
2022/02/08T15:25:00|           1.00|             6390.00|             5.00|
2022/02/08T15:30:00|           4.00|            25560.00|            20.00|
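
To see why the same five records land in different rows as the --bin-size changes, here is a rough Python sketch of the binning idea: floor each start time to a multiple of the bin size and count. The start times are hypothetical values I picked to fall inside the bins shown above, and the sketch ignores rwcount's load schemes, which can spread one flow across several bins.

```python
from collections import Counter
from datetime import datetime, timezone

TIME_FORMAT = "%Y/%m/%dT%H:%M:%S"

# Hypothetical flow start times chosen to fall inside the bins shown above.
START_TIMES = [
    "2022/02/08T15:28:41",
    "2022/02/08T15:31:47",
    "2022/02/08T15:32:31",
    "2022/02/08T15:32:40",
    "2022/02/08T15:32:55",
]


def bin_start(text, bin_size):
    """Floor a timestamp to the start of its bin; bin_size is in seconds."""
    moment = datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)
    epoch = int(moment.timestamp())
    floored = epoch - (epoch % bin_size)
    return datetime.fromtimestamp(floored, tz=timezone.utc).strftime(TIME_FORMAT)


def count_per_bin(times, bin_size):
    """Count records per bin, returning (bin label, count) pairs in time order."""
    return sorted(Counter(bin_start(t, bin_size) for t in times).items())


if __name__ == "__main__":
    for size in (30, 2 * 60, 5 * 60):
        print(f"bin size {size} seconds")
        for label, count in count_per_bin(START_TIMES, size):
            print(f"  {label} {count}")
```

Unlike rwcount, this sketch does not print the empty bins in between.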

Using the time as a range via rwfilter --stime.

sans@sec503:~$ rwfilter --start-date=2022/02/09T16 --stime=2022/02/09T16:00:00-2022/02/09T16:02:00   \
--type=all --pass-destination=stdout --protocol=0- | rwcut --fields=stime,sip,dip
                  sTime|                                    sIP|                                    dIP|
2022/02/09T16:00:19.850|                                8.8.8.8|                          172.28.10.137|
2022/02/09T16:00:35.377|                          17.253.26.125|                          172.28.10.137|
2022/02/09T16:01:29.106|                                8.8.8.8|                          172.28.10.137|
2022/02/09T16:01:29.817|                                8.8.8.8|                          172.28.10.137|
2022/02/09T16:00:19.850|                          172.28.10.137|                                8.8.8.8|
2022/02/09T16:00:35.377|                          172.28.10.137|                          17.253.26.125|
2022/02/09T16:01:29.106|                          172.28.10.137|                                8.8.8.8|
2022/02/09T16:01:29.817|                          172.28.10.137|                                8.8.8.8|
2022/02/09T16:01:29.214|                           52.109.20.75|                            172.28.30.4|
2022/02/09T16:01:29.892|                            52.109.8.20|                            172.28.30.4|
2022/02/09T16:01:29.892|                            52.109.8.20|                            172.28.30.4|
2022/02/09T16:00:19.852|                           72.21.81.240|                           172.28.10.25|

Find completed flows, by looking at the SYN, ACK, FIN and RST flags. Note the --flags-all=SAF/SAF,SAR/SAR parameters for rwfilter.

sans@sec503:~$ rwfilter --start-date=2022/02/09T16   --type=all --pass-destination=stdout --protocol=6 --flags-all=SAF/SAF,SAR/SAR | \
rwcut --fields=stime,sip,dip,flags --num-recs=5
                  sTime|                                    sIP|                                    dIP|   flags|
2022/02/09T16:00:19.852|                           72.21.81.240|                           172.28.10.25|FS PA   |
2022/02/09T16:05:25.018|                           52.167.17.97|                            172.28.30.5|FS PA   |
2022/02/09T16:02:31.816|                         52.167.249.196|                           172.28.10.89|FS PA E |
2022/02/09T16:02:39.571|                          142.250.72.10|                            172.28.30.5|FS PA   |
2022/02/09T16:03:15.843|                          142.250.72.35|                            172.28.50.2|FS PA   |

Print rwcut TCP flags as integers via --integer-tcp-flags.

sans@sec503:~$ rwfilter --start-date=2022/02/09T16   --type=all --pass-destination=stdout --protocol=6 --flags-all=SAF/SAF,SAR/SAR | \
rwcut --fields=stime,sip,dip,flags --num-recs=5 --integer-tcp-flags
                  sTime|                                    sIP|                                    dIP|fla|
2022/02/09T16:00:19.852|                           72.21.81.240|                           172.28.10.25| 27|
2022/02/09T16:05:25.018|                           52.167.17.97|                            172.28.30.5| 27|
2022/02/09T16:02:31.816|                         52.167.249.196|                           172.28.10.89| 91|
2022/02/09T16:02:39.571|                          142.250.72.10|                            172.28.30.5| 27|
2022/02/09T16:03:15.843|                          142.250.72.35|                            172.28.50.2| 27|
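
The integers are simply the sum of the bit values of the flags that are set. The sketch below shows the arithmetic in Python, along with my reading of how a HIGH/MASK pair such as SAF/SAF is evaluated. The function names are illustrative and not part of SiLK.

```python
# Bit values of the TCP flags, lowest bit first.
FLAG_BITS = {"F": 1, "S": 2, "R": 4, "P": 8, "A": 16, "U": 32, "E": 64, "C": 128}


def flags_to_int(letters):
    """Convert flag letters such as 'FS PA' (spaces ignored) to an integer."""
    return sum(FLAG_BITS[ch] for ch in letters.upper() if ch != " ")


def int_to_flags(value):
    """Convert an integer back to flag letters, lowest bit first."""
    return "".join(letter for letter, bit in FLAG_BITS.items() if value & bit)


def matches(value, high_mask_list):
    """Check a flag value against HIGH/MASK pairs like 'SAF/SAF,SAR/SAR'.

    A pair matches when, among the MASK flags, exactly the HIGH flags are set.
    """
    for pair in high_mask_list.split(","):
        high, mask = (flags_to_int(part) for part in pair.split("/"))
        if (value & mask) == high:
            return True
    return False


if __name__ == "__main__":
    print(flags_to_int("FS PA   "))   # F + S + P + A
    print(flags_to_int("FS PA E "))   # the same, plus ECE
    print(int_to_flags(27), int_to_flags(91))
    print(matches(27, "SAF/SAF,SAR/SAR"))
```

So 27 is FIN (1) + SYN (2) + PSH (8) + ACK (16), and 91 adds ECE (64) on top of that.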

Change rwcut format of the IP address to decimal via --ip-format=decimal.

sans@sec503:~$ rwfilter --start-date=2022/02/09T16   --type=all --pass-destination=stdout \
--protocol=6 | rwcut --fields=sip,dip --num-recs=5 --ip-format=decimal
                                    sIP|                                    dIP|
                              879563851|                             2887523844|
                              879560724|                             2887523844|
                              879560724|                             2887523844|
                              879563851|                             2887523844|
                              879870852|                             2887523844|

Convert the decimal values back to dotted notation via num2dot --ip-field.

sans@sec503:~$ rwfilter --start-date=2022/02/09T16   \
--type=all --pass-destination=stdout --protocol=6 | \
rwcut --fields=sip,dip --num-recs=5 --ip-format=decimal | num2dot --ip-field=1,2
            sIP|            dIP|
   52.109.20.75|    172.28.30.4|
    52.109.8.20|    172.28.30.4|
    52.109.8.20|    172.28.30.4|
   52.109.20.75|    172.28.30.4|
 52.113.195.132|    172.28.30.4|

Show the IP addresses as hexadecimal via rwcut --ip-format=hexadecimal

sans@sec503:~$ rwfilter --start-date=2022/02/09T16   --type=all \
--pass-destination=stdout --protocol=6 | rwcut --fields=sip,dip --num-recs=5 \
--ip-format=hexadecimal
                             sIP|                             dIP|
                        346d144b|                        ac1c1e04|
                        346d0814|                        ac1c1e04|
                        346d0814|                        ac1c1e04|
                        346d144b|                        ac1c1e04|
                        3471c384|                        ac1c1e04|
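
The decimal, dotted and hexadecimal forms are all views of the same 32-bit number. As a quick sanity check outside of SiLK, the conversions can be sketched with Python's ipaddress module. The helper names below are my own.

```python
import ipaddress


def decimal_to_dotted(value):
    """879563851 -> '52.109.20.75' (accepts an int or a numeric string)."""
    return str(ipaddress.IPv4Address(int(value)))


def dotted_to_decimal(text):
    """'52.109.20.75' -> 879563851."""
    return int(ipaddress.IPv4Address(text))


def dotted_to_hex(text):
    """'52.109.20.75' -> '346d144b'."""
    return format(int(ipaddress.IPv4Address(text)), "x")


def hex_to_dotted(text):
    """'ac1c1e04' -> '172.28.30.4'."""
    return str(ipaddress.IPv4Address(int(text, 16)))


if __name__ == "__main__":
    for decimal in (879563851, 879560724, 879870852, 2887523844):
        dotted = decimal_to_dotted(decimal)
        print(decimal, dotted, dotted_to_hex(dotted))
```

Running it over the values above gives the same dotted and hexadecimal addresses that num2dot and --ip-format=hexadecimal produced.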

Leveraging rwaddrcount to get information about the records in the file.

sans@sec503:~/nik$ rwaddrcount attack-trace.rw --print-recs
            sIP|               Bytes|   Packets|   Records|          Start_Time|            End_Time|
 192.150.11.111|                7297|       153|         6| 2019/04/20T03:28:28| 2019/04/20T03:28:44|
 98.114.205.102|              171264|       195|         6| 2019/04/20T03:28:28| 2019/04/20T03:28:44|

Get some additional file statistics via rwaddrcount.

sans@sec503:~/nik$ rwaddrcount attack-trace.rw --print-stat
          |  sIP_Uniq|               Bytes|        Packets|        Records|
     Total|         2|              178561|            348|             12|

What are the 2 actual unique source IP values in that file? Continuing with rwaddrcount.

sans@sec503:~/nik$ rwaddrcount attack-trace.rw --print-ips
            sIP
 192.150.11.111
 98.114.205.102

Leveraging rwappend to create a new flow file from 2 existing flow files.

Use rwfilter to create a file consisting of TCP flows.

sans@sec503:~/nik$ rwfilter --start-date=2022/02/09T16 --max-pass=2 --type=all \
--pass-destination=tcp_file.rw --protocol=6
 
Use rwfilter to create a file consisting of UDP flows.

sans@sec503:~/nik$ rwfilter --start-date=2022/02/09T16 --max-pass=2  --type=all \
--pass-destination=udp_file.rw --protocol=17

Combine the TCP and UDP flow files created by rwfilter using rwappend.

sans@sec503:~/nik$ rwappend --create tcp_udp.rw tcp_file.rw udp_file.rw

Use rwcut to see the contents of the rwappend merged files.

sans@sec503:~/nik$ rwcut tcp_udp.rw --num-recs=5
                                    sIP|                                    dIP|sPort|dPort|pro|   packets|     bytes|   flags|                  sTime| duration|                  eTime|   sensor|
                           52.109.20.75|                            172.28.30.4|  443|50137|  6|         9|      8048| S PA   |2022/02/09T16:01:29.214|    1.191|2022/02/09T16:01:30.405| Internal|
                            52.109.8.20|                            172.28.30.4|  443|50138|  6|         7|      6533| S PA   |2022/02/09T16:01:29.892|    0.513|2022/02/09T16:01:30.405| Internal|
                                8.8.8.8|                          172.28.10.137|   53|55874| 17|         1|       289|        |2022/02/09T16:00:19.850|    0.002|2022/02/09T16:00:19.852| Internal|
                          17.253.26.125|                          172.28.10.137|  123|  123| 17|         1|        76|        |2022/02/09T16:00:35.377|    0.034|2022/02/09T16:00:35.411| Internal|

Deduplicating two files into one via rwdedupe.

sans@sec503:~/nik$ rwdedupe --buffer-size=88000 8.rw attack-trace.rw \
--output=deduped-data.rw

Use rwfilter --ip-version to track IPv6 addresses.

sans@sec503:~/nik$ rwfilter --start=2022/01/05 --end=2022/07/01T23 --ip-version=6 \
--type=all --max-pass=5 --pass-destination=stdout | rwcut --fields=sip,dip,dport
                                    sIP|                                    dIP|dPort|
               fe80::250:56ff:fead:e8b6|                                ff02::2|    0|
                fe80::250:56ff:fead:445|                                ff02::2|    0|
               fe80::250:56ff:fead:e8b6|                                ff02::2|    0|
                fe80::250:56ff:fead:445|                                ff02::2|    0|
               fe80::250:56ff:fead:e8b6|                                ff02::2|    0|

Leveraging rwpcut to convert .pcap files to ASCII.

sans@sec503:~/nik$ rwpcut attack-trace.pcap 2>/dev/null | more
{'version': False, 'columns': False, 'delimiter': '|', 'epoch_time': False, 'fields': ['time
', 'sip', 'dip', 'sport', 'dport', 'proto', 'payhex'], 'integer_ips': False, 'zero_pad_ips':
 False, 'files': ['attack-trace.pcap']}
reading from file attack-trace.pcap, link-type EN10MB (Ethernet), snapshot length 65535

time|sip|dip|sport|dport|proto|payhex|
2019-04-20 03:28:28.374595|98.114.205.102|192.150.11.111|1821|445|6|450000303b9f40007106d24a
6272cd66c0960b6f071d01bd08cb8066000000007002faf0fa440000020405b401010402|
2019-04-20 03:28:28.375059|192.150.11.111|98.114.205.102|445|1821|6|450000300000400040063eea
c0960b6f6272cd6601bd071d5c3ba87408cb8067701216d0d9a40000020405b401010402|
2019-04-20 03:28:28.493653|98.114.205.102|192.150.11.111|1821|445|6|450000283bad40007106d244
6272cd66c0960b6f071d01bd08cb80675c3ba8755010faf022480000000000000000|
2019-04-20 03:28:28.508770|98.114.205.102|192.150.11.111|1821|445|6|450000283bae40007106d243
6272cd66c0960b6f071d01bd08cb80675c3ba8755011faf022470000000000000000|
...
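
The payhex column is not as opaque as it looks. In this capture it appears to start at the IPv4 header, so the first record can be decoded by hand. Below is a minimal Python sketch that does so with struct. It assumes an IPv4 packet carrying TCP, and decode_ipv4_tcp is a name I made up for the example.

```python
import ipaddress
import struct

# First record above, with the terminal's line wrap removed.
PAYHEX = (
    "450000303b9f40007106d24a"
    "6272cd66c0960b6f071d01bd08cb8066000000007002faf0fa440000020405b401010402"
)


def decode_ipv4_tcp(payhex):
    """Decode the IPv4 header and the fixed part of the TCP header from hex.

    Assumes the bytes start at the IPv4 header and the payload is TCP.
    """
    raw = bytes.fromhex(payhex)
    (version_ihl, _tos, total_length, _ident, _frag, ttl, protocol,
     _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    header_length = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    tcp = raw[header_length:header_length + 20]
    (sport, dport, _seq, _ack, _offset, flags,
     _window, _tcp_sum, _urgent) = struct.unpack("!HHLLBBHHH", tcp)
    return {
        "version": version_ihl >> 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "sip": str(ipaddress.IPv4Address(src)),
        "dip": str(ipaddress.IPv4Address(dst)),
        "sport": sport,
        "dport": dport,
        "flags": flags,  # 0x02 is SYN only
    }


if __name__ == "__main__":
    for key, value in decode_ipv4_tcp(PAYHEX).items():
        print(f"{key}: {value}")
```

The decoded addresses and ports agree with the sip, dip, sport and dport columns on the same line, and the flags byte of 2 is a lone SYN, the first packet of the handshake.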

Oops, that looks nasty. Let's make it cleaner by leveraging --fields, --columnar and --delimiter.

sans@sec503:~/nik$ rwpcut attack-trace.pcap --fields=sip,sport,dip,dport --columnar --delimiter=" |    " \
--zero-pad-ips 2>/dev/null| more
{'version': False, 'columns': True, 'delimiter': ' |    ', 'epoch_time': False, 'fields': ['
sip', 'sport', 'dip', 'dport'], 'integer_ips': False, 'zero_pad_ips': True, 'files': ['attac
k-trace.pcap']}
reading from file attack-trace.pcap, link-type EN10MB (Ethernet), snapshot length 65535

            sip |    sport |                dip |    dport |
098.114.205.102 |     1821 |    192.150.011.111 |      445 |
192.150.011.111 |      445 |    098.114.205.102 |     1821 |
098.114.205.102 |     1821 |    192.150.011.111 |      445 |
098.114.205.102 |     1821 |    192.150.011.111 |      445 |
098.114.205.102 |     1828 |    192.150.011.111 |      445 |
192.150.011.111 |      445 |    098.114.205.102 |     1828 |
192.150.011.111 |      445 |    098.114.205.102 |     1821 |
...

Converting a pcap to SiLK flow via rwptoflow. Then redirect the output to rwcut.

sans@sec503:~/nik$ rwptoflow attack-trace.pcap | rwcut --num-recs=5 --fields=sip,dip,flags
                                    sIP|                                    dIP|   flags|
                         98.114.205.102|                         192.150.11.111| S      |
                         192.150.11.111|                         98.114.205.102| S  A   |
                         98.114.205.102|                         192.150.11.111|    A   |
                         98.114.205.102|                         192.150.11.111|F   A   |
                         98.114.205.102|                         192.150.11.111| S      |

Write the rwptoflow converted flow data to a file. At the same time, write the packets that were used to create the flows to another pcap file. Also add a note to the file, and print the statistics when everything is done.

sans@sec503:~/nik$ rwptoflow attack-trace.pcap --flow-output rwp_flow_file.rw \
--note-add "Converted from attacktrace.pcap" --compression-method=zlib \
--packet-pass-output=rwp.pcap --print-statistics --set-sensorid=1
Packet count statistics for attack-trace.pcap
                         348 read
                           0 rejected: too short to get information
                           0 rejected: not IPv4

                         348 total written
                           0 total fragmented packets
                           0 zero-packet of a fragment
                           0 incomplete (no ports and/or flags)

Validate the rwp.pcap file.

sans@sec503:~/nik$ file rwp.pcap
rwp.pcap: pcap capture file, microsecond ts (little-endian) - version 2.4 (Ethernet, capture length 65535)

Leveraging rwrandomizeip to randomize IPs.

First, view the IPs in the first 5 records of the 8.rw file.

sans@sec503:~/nik$ rwcut 8.rw --fields=sip,dip --num-recs=5
                                    sIP|                                    dIP|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|

Now randomize the IPs via rwrandomizeip and view the first 5 records.

sans@sec503:~/nik$ rwrandomizeip 8.rw | rwcut --fields=sip,dip --num-recs=5
                                    sIP|                                    dIP|
                          10.255.111.99|                           10.39.63.221|
                         10.215.197.155|                           10.56.240.34|
                         10.189.217.143|                          10.192.119.61|
                             10.12.82.4|                          10.251.82.128|
                           10.78.26.161|                           10.173.1.103|
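
To get a feel for what is going on, here is a toy Python analogue. It is not SiLK's algorithm, only a sketch that assumes each address is independently replaced by a random one from 10.0.0.0/8, which is what the output above looks like. The function names and the seed parameter are my own.

```python
import ipaddress
import random

PRIVATE_NET = ipaddress.ip_network("10.0.0.0/8")


def random_private_ip(rng):
    """Return a random address inside 10.0.0.0/8."""
    base = int(PRIVATE_NET.network_address)
    return str(ipaddress.IPv4Address(base + rng.randrange(PRIVATE_NET.num_addresses)))


def randomize_pairs(pairs, seed=None):
    """Replace every (sIP, dIP) pair with two freshly drawn private addresses."""
    rng = random.Random(seed)
    return [(random_private_ip(rng), random_private_ip(rng)) for _ in pairs]


if __name__ == "__main__":
    original = [("8.8.8.8", "172.28.10.137")] * 5
    for sip, dip in randomize_pairs(original):
        print(f"{sip:>15} | {dip:>15}")
```

Because every record gets new random values, the same original IP maps to different addresses on each line, so this style of randomization hides who talked to whom but also destroys the relationships between records.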

Convert SiLK flow data to IPFIX using rwsilk2ipfix.

sans@sec503:~/nik$ rwsilk2ipfix 8.rw --ipfix-output rw-2-2ipfix.dat --print-statistics
rwsilk2ipfix: Wrote 100 IPFIX records to 'rw-2-2ipfix.dat'

View a sample of the rwsilk2ipfix converted data using yafscii.

sans@sec503:~/nik$ yafscii --in=rw-2-2ipfix.dat --out=-  | more
2022-02-08 14:26:40.723 - 14:26:40.724 (0.001 sec) udp 8.8.8.8:53 => 172.28.10.137:56213 (1/218 ->)
2022-02-08 14:27:10.329 - 14:27:10.342 (0.013 sec) udp 8.8.8.8:53 => 172.28.10.137:55171 (1/102 ->)
2022-02-08 14:27:43.431 - 14:27:43.433 (0.002 sec) udp 8.8.8.8:53 => 172.28.10.137:54512 (1/213 ->)
2022-02-08 14:28:29.633 - 14:28:29.646 (0.013 sec) udp 8.8.8.8:53 => 172.28.10.137:55359 (1/100 ->)
2022-02-08 14:28:30.328 - 14:28:30.396 (0.068 sec) udp 8.8.8.8:53 => 172.28.10.137:54864 (1/108 ->)
...

Taking a different view of the IPFIX record information via ipfixDump.

sans@sec503:~/nik$ ipfixDump --yaf --in=rw-2-2ipfix.dat --out=- | more
--- Message Header ---
export time: 2023-06-15 16:04:04        observation domain id: 0
message length: 952                     sequence number: 0 (0)

--- template record ---
header:
        tid: 40404 (0x9dd4)    field count:    21    scope:     0
fields:
        ent:     0  id:   152  type: millisec  len:     8     flowStartMilliseconds
        ent:     0  id:   153  type: millisec  len:     8     flowEndMilliseconds
        ent:     0  id:     2  type: uint64    len:     4     packetDeltaCount
        ent:     0  id:     1  type: uint64    len:     4     octetDeltaCount
        ent:     0  id:    10  type: uint32    len:     2     ingressInterface
        ent:     0  id:    14  type: uint32    len:     2     egressInterface
        ent:  6871  id:    33  type: uint16    len:     2     silkAppLabel
        ent:  6871  id:    31  type: uint16    len:     2     silkFlowSensor
        ent:  6871  id:    30  type: uint8     len:     1     silkFlowType
        ent:  6871  id:    32  type: uint8     len:     1     silkTCPState
        ent:     0  id:     4  type: uint8     len:     1     protocolIdentifier
        ent:     0  id:   210  type: octet     len:     1     paddingOctets
        ent:     0  id:     7  type: uint16    len:     2     sourceTransportPort
        ent:     0  id:    11  type: uint16    len:     2     destinationTransportPort
        ent:     0  id:   210  type: octet     len:     1     paddingOctets
        ent:     0  id:     6  type: uint16    len:     1     tcpControlBits
        ent:  6871  id:    14  type: uint16    len:     1     initialTCPFlags
        ent:  6871  id:    15  type: uint16    len:     1     unionTCPFlags
        ent:     0  id:     8  type: ipv4      len:     4     sourceIPv4Address
        ent:     0  id:    12  type: ipv4      len:     4     destinationIPv4Address
        ent:     0  id:    15  type: ipv4      len:     4     ipNextHopIPv4Address
--- template record ---
header:
        tid: 40657 (0x9ed1)    field count:    17    scope:     0
fields:
        ent:     0  id:   152  type: millisec  len:     8     flowStartMilliseconds
        ent:     0  id:   153  type: millisec  len:     8     flowEndMilliseconds
        ent:     0  id:     2  type: uint64    len:     4     packetDeltaCount
        ent:     0  id:     1  type: uint64    len:     4     octetDeltaCount
        ent:     0  id:    10  type: uint32    len:     2     ingressInterface
        ent:     0  id:    14  type: uint32    len:     2     egressInterface
        ent:  6871  id:    33  type: uint16    len:     2     silkAppLabel
        ent:  6871  id:    31  type: uint16    len:     2     silkFlowSensor
        ent:  6871  id:    30  type: uint8     len:     1     silkFlowType
        ent:  6871  id:    32  type: uint8     len:     1     silkTCPState
        ent:     0  id:     4  type: uint8     len:     1     protocolIdentifier
        ent:     0  id:   210  type: octet     len:     1     paddingOctets
        ent:     0  id:   210  type: octet     len:     2     paddingOctets
        ent:     0  id:   139  type: uint16    len:     2     icmpTypeCodeIPv6
        ent:     0  id:    27  type: ipv6      len:    16     sourceIPv6Address
        ent:     0  id:    28  type: ipv6      len:    16     destinationIPv6Address
        ent:     0  id:    62  type: ipv6      len:    16     ipNextHopIPv6Address
...

Convert the IPFIX file back to SiLK format using rwipfix2silk. Rather than writing the output to a file, write instead to stdout and use rwcut to see the values.

sans@sec503:~/nik$ rwipfix2silk --silk-output=- rw-2-2ipfix.dat | rwcut --fields sip,dip --num-recs=5
                                    sIP|                                    dIP|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
                                8.8.8.8|                          172.28.10.137|
...

Split a flow file into multiple files with rwsplit.

sans@sec503:~/nik$ rwsplit --basename=nik_split_ --compression=best --flow-limit=4 \
--max-outputs=2 --note-add="Files created with rwsplit" attack-trace.rw

Validate the rwsplit files were created.

sans@sec503:~/nik$ ls nik_split_.0000000*
nik_split_.00000000.rwf  nik_split_.00000001.rwf

Use rwfileinfo to get information on one of the rwsplit created files.

sans@sec503:~/nik$ rwfileinfo nik_split_.00000001.rwf
nik_split_.00000001.rwf:
  format(id)          FT_RWIPV6ROUTING(0x0c)
  version             16
  byte-order          littleEndian
  compression(id)     zlib(1)
  header-length       264
  record-length       88
  record-version      1
  silk-version        3.19.2
  count-records       4
  file-size           404
  command-lines
                   1  rwsplit --basename=nik_split_ --compression=best --flow-limit=4 --max-outputs=2 --note-add=Files created with rwsplit attack-trace.rw
  annotations
                   1  Files created with rwsplit

Changing the byte order of the file with rwswapbytes.

Get the current byte order of the file 8.rw

sans@sec503:~/nik$ rwfileinfo 8.rw --fields=byte-order
8.rw:
  byte-order          littleEndian

Change the byte order using rwswapbytes

sans@sec503:~/nik$ rwswapbytes --big-endian \
--note-add="Byte order swapped from little endian" 8.rw 8-swappped.rwf

Validate the byte order has been changed.

sans@sec503:~/nik$ rwfileinfo 8-swappped.rwf --fields=byte-order
8-swappped.rwf:
  byte-order          BigEndian
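
If byte order is new to you, the short Python sketch below shows what little endian versus big endian means for a single 32-bit value. It only illustrates the concept and says nothing about how SiLK lays out its records. The value used is 52.109.20.75 as an integer.

```python
import struct
import sys

VALUE = 879563851  # 52.109.20.75 as a 32-bit integer

little = struct.pack("<I", VALUE)  # least significant byte first
big = struct.pack(">I", VALUE)     # most significant byte first (network order)

if __name__ == "__main__":
    print("host byte order:", sys.byteorder)
    print("little endian  :", little.hex())
    print("big endian     :", big.hex())
    # Reading the bytes back with the wrong order gives a different number.
    print("misread value  :", struct.unpack("<I", big)[0])
```

The same four bytes are stored either way, just in reverse order, which is why a reader has to know the byte order of a file before it can interpret it.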

Get some totals with rwtotal. Looking at the first 8 bits of the destination IPs.

sans@sec503:~/nik$ rwtotal attack-trace.rw --summation --skip-zero --dip-first-8
 dIP_First8|        Records|               Bytes|          Packets|
         98|              6|                7297|              153|
        192|              6|              171264|              195|
     TOTALS|             12|              178561|              348|

Instead, look at the first 24 bits of the source IP.

2
3
4
5
sans@sec503:~/nik$ rwtotal attack-trace.rw --summation --skip-zero --sip-first-24
sIP_First24|        Records|               Bytes|          Packets|
 98.114.205|              6|              171264|              195|
192.150. 11|              6|                7297|              153|
     TOTALS|             12|              178561|              348|
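
The "first 8" and "first 24" here refer to the leading bits of the address, in other words the first octet and the first three octets. A small Python sketch of the same grouping keys follows. The function names are mine, and the label helper only handles whole octets.

```python
import ipaddress


def first_bits(address, bits):
    """Return the network formed by the leading `bits` bits of an IPv4 address."""
    return str(ipaddress.ip_network(f"{address}/{bits}", strict=False))


def leading_octets(address, bits):
    """Return a label such as '98' or '98.114.205' for the leading bits."""
    if bits % 8:
        raise ValueError("this sketch only handles whole octets")
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return ".".join(octets[: bits // 8])


if __name__ == "__main__":
    for ip in ("98.114.205.102", "192.150.11.111"):
        print(ip, leading_octets(ip, 8), leading_octets(ip, 24), first_bits(ip, 24))
```

Grouping on these keys is what lets rwtotal roll many individual hosts up into a handful of rows.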

Use rwtotal to learn which protocols are seen on the network.

sans@sec503:~/nik$ rwtotal attack-trace.rw --proto --summation --skip-zero
   protocol|        Records|               Bytes|          Packets|
          6|             12|              178561|              348|
     TOTALS|             12|              178561|              348|

Looking at the destination ports

sans@sec503:~/nik$ rwtotal attack-trace.rw --dport --summation --skip-zero --print-filenames
attack-trace.rw
      dPort|        Records|               Bytes|          Packets|
        445|              2|                4945|               18|
       1080|              1|              165088|              159|
       1821|              1|                 128|                3|
       1828|              1|                1590|               17|
       1924|              1|                 250|                6|
       1957|              1|                 381|                6|
       2152|              1|                4488|              112|
       8884|              2|                 841|               15|
      36296|              2|                 850|               12|
     TOTALS|             12|              178561|              348|
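
As a final sanity check, the TOTALS row can be recomputed from the rows above it. The Python below is a minimal sketch that parses the table exactly as printed and compares the sums. The check_totals name is my own.

```python
RWTOTAL_OUTPUT = """\
      dPort|        Records|               Bytes|          Packets|
        445|              2|                4945|               18|
       1080|              1|              165088|              159|
       1821|              1|                 128|                3|
       1828|              1|                1590|               17|
       1924|              1|                 250|                6|
       1957|              1|                 381|                6|
       2152|              1|                4488|              112|
       8884|              2|                 841|               15|
      36296|              2|                 850|               12|
     TOTALS|             12|              178561|              348|
"""


def check_totals(text):
    """Return (computed sums, reported totals) for the numeric columns."""
    rows = [
        [cell.strip() for cell in line.split("|")[:-1]]  # drop the trailing empty cell
        for line in text.splitlines()
        if line.strip()
    ]
    header, body = rows[0], rows[1:]
    reported = next(row for row in body if row[0] == "TOTALS")
    data = [row for row in body if row[0] != "TOTALS"]
    sums = [sum(int(row[i]) for row in data) for i in range(1, len(header))]
    return sums, [int(value) for value in reported[1:]]


if __name__ == "__main__":
    computed, reported = check_totals(RWTOTAL_OUTPUT)
    print("computed:", computed)
    print("reported:", reported)
    print("match   :", computed == reported)
```

The same totals of 12 records, 178561 bytes and 348 packets also show up in the rwaddrcount --print-stat output earlier, which is a nice cross-check between the tools.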