Wednesday, August 18, 2021

TShark : Finding data with "contains" and "matches" (Regular Expression)

Recently, I've been working with the SANS Institute on some Livestream sessions, promoting the SEC503: Intrusion Detection In Depth class. As a result, I produced some videos using TShark. In the first of those videos, we did an intro to TShark by focusing on reconnaissance at the IP layer. In the second session, we focused on reconnaissance at the transport layer and on working with some common application protocols. In the third session, we extracted suspicious and malicious content from PCAPs.

In a session prior to these, I focused on Full Packet Capturing with TShark for Continuous Monitoring & Threat Intel via IPs, Domains & URLs. While I did not do blog posts for those sessions (and I wish I had thought about it before), I've chosen to do a blog post on TShark and working with regular expressions.

Many times, when looking at packets or logs, I leverage "grep --perl-regexp". However, when looking through packets for patterns, sequences of bytes, etc., do we really need to leverage grep or another external tool? Let's see.

In session three, in which I exported suspicious and malicious content, I used the following, for example, to identify the name of the malicious file:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r attack-trace.pcap -V | grep ssms.exe
0030  63 68 6f 20 67 65 74 20 73 73 6d 73 2e 65 78 65   cho get ssms.exe
0070  26 73 73 6d 73 2e 65 78 65 0d 0a                  &ssms.exe..
0000  73 73 6d 73 2e 65 78 65 0d 0a                     ssms.exe..
0000  52 45 54 52 20 73 73 6d 73 2e 65 78 65 0d 0a      RETR ssms.exe..

... and the following example to identify bytes within the suspicious file.

┌──(root💀securitynik)-[~/tshark-series]
└─# xxd -groupsize 1 -u decode-as.pcap | grep '0A 25 25 45'
0192b620: 72 65 66 0A 32 38 34 30 32 38 32 34 0A 25 25 45  ref.28402824.%%E

Let's now see how TShark can help us out here. First let's leverage the "contains" display filter:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'frame contains WWW.SecurityNik.com'

Oooops!! Looks like we are starting off on the wrong foot; no result was returned. The reason is that contains is case sensitive. Let's try this again.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'frame contains www.securitynik.com' -x | more                                        
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d 00 00 45 00 00 ae 95 8e 40 00 40 06 e9 5e   4...E.....@.@..^
0020  0a 00 02 0f 8e fb 20 53 e1 cc 00 50 14 d2 bd 46   ...... S...P...F
0030  00 00 fa 02 50 18 fa f0 bb fd 00 00 47 45 54 20   ....P.......GET 
0040  2f 32 30 31 38 2f 30 37 2f 68 6f 73 74 2d 62 61   /2018/07/host-ba
0050  73 65 64 2d 74 68 72 65 61 74 2d 68 75 6e 74 69   sed-threat-hunti
0060  6e 67 2d 77 69 74 68 2e 68 74 6d 6c 20 48 54 54   ng-with.html HTT
0070  50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 77 77 77   P/1.1..Host: www
0080  2e 73 65 63 75 72 69 74 79 6e 69 6b 2e 63 6f 6d   .securitynik.com
0090  0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a 20 53 65   ..User-Agent: Se
00a0  63 75 72 69 74 79 4e 69 6b 20 54 65 73 74 69 6e   curityNik Testin
00b0  67 0d 0a 41 63 63 65 70 74 3a 20 2a 2f 2a 0d 0a   g..Accept: */*..
00c0  0d 0a                                             ..

Much better! The important takeaway is that contains is case sensitive.

In the previous example, we looked at contents from the frame level. Let's move up to the IP layer.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'ip contains sans.org' -x | more                                                      
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d a8 d6 45 00 00 7c 54 40 40 00 40 06 8d cf   4...E..|T@@.@...
0020  0a 00 02 0f 2d 3c 1f 22 a3 0a 00 50 68 1a f9 d1   ....-<."...Ph...
0030  00 0b b8 02 50 18 fa f0 58 db 00 00 47 45 54 20   ....P...X...GET 
0040  2f 20 48 54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74   / HTTP/1.1..Host
0050  3a 20 77 77 77 2e 73 61 6e 73 2e 6f 72 67 0d 0a   : www.sans.org..
0060  55 73 65 72 2d 41 67 65 6e 74 3a 20 53 65 63 75   User-Agent: Secu
0070  72 69 74 79 4e 69 6b 20 54 65 73 74 69 6e 67 0d   rityNik Testing.
0080  0a 41 63 63 65 70 74 3a 20 2a 2f 2a 0d 0a 0d 0a   .Accept: */*....

Making progress! Similarly, let's look at the TCP layer:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'tcp contains siriuscom.com' -x | more                                                
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d e6 73 45 00 00 81 e1 00 40 00 40 06 ca c4   4..sE.....@.@...
0020  0a 00 02 0f d1 3b b1 67 c5 ba 00 50 af 30 ea 13   .....;.g...P.0..
0030  00 08 ca 02 50 18 fa f0 8f 25 00 00 47 45 54 20   ....P....%..GET 
0040  2f 20 48 54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74   / HTTP/1.1..Host
0050  3a 20 77 77 77 2e 73 69 72 69 75 73 63 6f 6d 2e   : www.siriuscom.
0060  63 6f 6d 0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a   com..User-Agent:
0070  20 53 65 63 75 72 69 74 79 4e 69 6b 20 54 65 73    SecurityNik Tes
0080  74 69 6e 67 0d 0a 41 63 63 65 70 74 3a 20 2a 2f   ting..Accept: */
0090  2a 0d 0a 0d 0a                                    *....

And finally, let's look at the application layer.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.host contains "www.siriuscom.com"' -x | more
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d e6 73 45 00 00 81 e1 00 40 00 40 06 ca c4   4..sE.....@.@...
0020  0a 00 02 0f d1 3b b1 67 c5 ba 00 50 af 30 ea 13   .....;.g...P.0..
0030  00 08 ca 02 50 18 fa f0 8f 25 00 00 47 45 54 20   ....P....%..GET 
0040  2f 20 48 54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74   / HTTP/1.1..Host
0050  3a 20 77 77 77 2e 73 69 72 69 75 73 63 6f 6d 2e   : www.siriuscom.
0060  63 6f 6d 0d 0a 55 73 65 72 2d 41 67 65 6e 74 3a   com..User-Agent:
0070  20 53 65 63 75 72 69 74 79 4e 69 6b 20 54 65 73    SecurityNik Tes
0080  74 69 6e 67 0d 0a 41 63 63 65 70 74 3a 20 2a 2f   ting..Accept: */
0090  2a 0d 0a 0d 0a                                    *....

contains is really a hex filter: if there is no colon after the first byte, the input is considered ASCII.

Let's see some different ways we can detect "sans". Do note these are similar to (but not the same as) a display filter such as 'dns.qry.name == "www.sans.org"'. I say similar because we are matching just the string sans rather than the full www.sans.org.

First up, using hex escaped characters.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns.qry.name contains "\x73\x61\x6e\x73"' -x | more
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d 00 00 45 00 00 3a e9 8a 40 00 40 11 05 0c   4...E..:..@.@...
0020  0a 00 02 0f 40 47 ff c6 e1 93 00 35 00 26 4c 54   ....@G.....5.&LT
0030  da 6f 01 00 00 01 00 00 00 00 00 00 03 77 77 77   .o...........www
0040  04 73 61 6e 73 03 6f 72 67 00 00 01 00 01         .sans.org.....

Next up, using a combination of ASCII and hex escaped characters.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns.qry.name contains "www.\x73\x61\x6e\x73.org"' -x | more
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d 00 00 45 00 00 3a e9 8a 40 00 40 11 05 0c   4...E..:..@.@...
0020  0a 00 02 0f 40 47 ff c6 e1 93 00 35 00 26 4c 54   ....@G.....5.&LT
0030  da 6f 01 00 00 01 00 00 00 00 00 00 03 77 77 77   .o...........www
0040  04 73 61 6e 73 03 6f 72 67 00 00 01 00 01         .sans.org.....

Finally, looking at the bytes separated by colons:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns.qry.name contains 73:61:6e:73' -x | more                                         
0000  08 00 00 00 00 00 00 02 00 01 04 06 08 00 27 0e   ..............'.
0010  34 8d 00 00 45 00 00 3a e9 8a 40 00 40 11 05 0c   4...E..:..@.@...
0020  0a 00 02 0f 40 47 ff c6 e1 93 00 35 00 26 4c 54   ....@G.....5.&LT
0030  da 6f 01 00 00 01 00 00 00 00 00 00 03 77 77 77   .o...........www
0040  04 73 61 6e 73 03 6f 72 67 00 00 01 00 01         .sans.org.....
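All three filters above describe the same four bytes. Outside of TShark, a quick Python sanity check (purely illustrative) shows that the ASCII string, the hex-escaped string and the colon-separated bytes are one and the same:

```python
# The string "sans", its hex-escaped form, and the colon-separated
# hex bytes all describe the same four-byte sequence.
ascii_form = b"sans"
escaped_form = b"\x73\x61\x6e\x73"
colon_form = bytes.fromhex("73:61:6e:73".replace(":", ""))

print(ascii_form == escaped_form == colon_form)  # → True
```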


Let's now look at regular expressions using matches.

When using matches, the filter expression is processed twice: once by the Wireshark display filter engine and a second time by the PCRE library.

Because of the above, you are better off using \\. rather than \. when you want a literal dot/period with matches.
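The same two layers of processing can be mimicked in Python (illustrative, not TShark): the string-literal layer consumes one backslash before the regex engine ever sees the pattern, which is why the doubled backslash is needed for a literal dot:

```python
import re

# "\\." in the source becomes the two characters \. -- the regex for a
# literal dot. "." alone is the regex wildcard, matching any character.
literal_dot = re.compile("\\.")
any_char = re.compile(".")

print(bool(literal_dot.fullmatch("x")))  # → False: only a real dot matches
print(bool(any_char.fullmatch("x")))     # → True: wildcard matches anything
print(bool(literal_dot.fullmatch(".")))  # → True
```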

While contains is good for finding a particular string, what if you want to find a particular pattern? This is where matches is helpful. To see the power of matches, let's look at it first through the lens of contains.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y '(http.request.method contains GET) || (http.request.method contains POST)' | more 
   13   5.134106    10.0.2.15 → 142.251.32.83 HTTP 194 GET /2018/07/host-based-threat-hunting-with.html HTTP/1.1 
  344  47.459625    10.0.2.15 → 142.251.41.83 HTTP 194 GET /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  634  64.722770    10.0.2.15 → 209.59.177.103 HTTP 149 GET / HTTP/1.1 
  722  84.262193    10.0.2.15 → 45.60.31.34  HTTP 144 GET / HTTP/1.1 
  809 163.016781    10.0.2.15 → 45.60.31.34  HTTP 145 POST / HTTP/1.1 
  861 174.261670    10.0.2.15 → 209.59.177.103 HTTP 150 POST / HTTP/1.1 
  917 186.636330    10.0.2.15 → 142.251.33.179 HTTP 195 POST /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  933 200.366293    10.0.2.15 → 172.217.165.19 HTTP 195 POST /2018/07/host-based-threat-hunting-with.html HTTP/1.1 

As can be seen above, contains was able to help us find the match. However, it took a few more bytes, a little more typing. Let's see what matches can do.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.request.method matches "(GET|POST)"' | more
   13   5.134106    10.0.2.15 → 142.251.32.83 HTTP 194 GET /2018/07/host-based-threat-hunting-with.html HTTP/1.1 
  344  47.459625    10.0.2.15 → 142.251.41.83 HTTP 194 GET /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  634  64.722770    10.0.2.15 → 209.59.177.103 HTTP 149 GET / HTTP/1.1 
  722  84.262193    10.0.2.15 → 45.60.31.34  HTTP 144 GET / HTTP/1.1 
  809 163.016781    10.0.2.15 → 45.60.31.34  HTTP 145 POST / HTTP/1.1 
  861 174.261670    10.0.2.15 → 209.59.177.103 HTTP 150 POST / HTTP/1.1 
  917 186.636330    10.0.2.15 → 142.251.33.179 HTTP 195 POST /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  933 200.366293    10.0.2.15 → 172.217.165.19 HTTP 195 POST /2018/07/host-based-threat-hunting-with.html HTTP/1.1 

As can be seen above, matches has allowed us to simplify the process using a regular expression. Above, we simply looked for GET or POST. That was easy!

If you remember from above, contains is case sensitive. matches, however, is case insensitive by default.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.request.method matches "(get|post)"'
   13   5.134106    10.0.2.15 → 142.251.32.83 HTTP 194 GET /2018/07/host-based-threat-hunting-with.html HTTP/1.1 
  344  47.459625    10.0.2.15 → 142.251.41.83 HTTP 194 GET /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  634  64.722770    10.0.2.15 → 209.59.177.103 HTTP 149 GET / HTTP/1.1 
  722  84.262193    10.0.2.15 → 45.60.31.34  HTTP 144 GET / HTTP/1.1 
  809 163.016781    10.0.2.15 → 45.60.31.34  HTTP 145 POST / HTTP/1.1 
  861 174.261670    10.0.2.15 → 209.59.177.103 HTTP 150 POST / HTTP/1.1 
  917 186.636330    10.0.2.15 → 142.251.33.179 HTTP 195 POST /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  933 200.366293    10.0.2.15 → 172.217.165.19 HTTP 195 POST /2018/07/host-based-threat-hunting-with.html HTTP/1.1 

As seen above, even though get and post are in lowercase, we still got results returned. This is unlike what we experienced with contains.
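Python's re module defaults the other way (case sensitive), so the difference is easy to illustrate by toggling re.IGNORECASE (purely illustrative, not TShark):

```python
import re

methods = ["GET", "POST"]

# Case-sensitive (Python's default): a lowercase pattern misses the
# uppercase methods entirely.
sensitive = [m for m in methods if re.search("(get|post)", m)]
print(sensitive)  # → []

# Case-insensitive: the behaviour TShark's matches gives you by default.
insensitive = [m for m in methods if re.search("(get|post)", m, re.IGNORECASE)]
print(insensitive)  # → ['GET', 'POST']
```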

If we want to enforce case sensitivity, we can use (?-i). We know from the previous command that both GET and POST methods are in this PCAP, in uppercase. Let's look for uppercase GET and lowercase post. Remember, we are showing how to enforce case sensitivity, not insensitivity.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.request.method matches "(?-i)(GET|post)"'
   13   5.134106    10.0.2.15 → 142.251.32.83 HTTP 194 GET /2018/07/host-based-threat-hunting-with.html HTTP/1.1 
  344  47.459625    10.0.2.15 → 142.251.41.83 HTTP 194 GET /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  634  64.722770    10.0.2.15 → 209.59.177.103 HTTP 149 GET / HTTP/1.1 
  722  84.262193    10.0.2.15 → 45.60.31.34  HTTP 144 GET / HTTP/1.1 

From the results returned, we can see only the GET requests and not the POST ones. This is because we enforced case sensitivity: we asked for GET in uppercase and post in lowercase.

Let's now see if there is any other method other than GET or POST.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.request.method matches "[^(get|post)]"'

No results were returned. This suggests there are no other HTTP methods in the file. Let's confirm that our command is working as expected and that this is not a false-negative situation. To confirm it actually works, let's remove the "post" portion. If it works, we should see the POST requests, as we are now negating only the get.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'http.request.method matches "[^(get)]"'
  809 163.016781    10.0.2.15 → 45.60.31.34  HTTP 145 POST / HTTP/1.1 
  861 174.261670    10.0.2.15 → 209.59.177.103 HTTP 150 POST / HTTP/1.1 
  917 186.636330    10.0.2.15 → 142.251.33.179 HTTP 195 POST /2018/07/understanding-ip-fragmentation.html HTTP/1.1 
  933 200.366293    10.0.2.15 → 172.217.165.19 HTTP 195 POST /2018/07/host-based-threat-hunting-with.html HTTP/1.1 

Good stuff! We have results, so we know our filter is doing something. Sometimes you need to find other ways to validate that your command works.
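One caution, though: square brackets in a regular expression build a character class, not a group. [^(get|post)] actually matches any single character that is not one of ( g e t | p o s ). It behaved as intended above only because GET and POST happen to be made up entirely of those letters; a negative lookahead such as ^(?!(get|post)$) states the intent directly. The Python snippet below (illustrative) shows the difference:

```python
import re

# [^(get|post)] is a negated character class: any one character outside
# the set {(, g, e, t, |, p, o, s, )}. GET and POST contain no such
# character, so neither matches -- which only coincidentally looks like
# "no method other than GET or POST".
char_class = re.compile("[^(get|post)]", re.IGNORECASE)
print(bool(char_class.search("GET")))     # → False
print(bool(char_class.search("DELETE")))  # → True ('d' and 'l' are not in the set)

# A negative lookahead expresses "method other than GET or POST" directly.
lookahead = re.compile(r"^(?!(get|post)$)", re.IGNORECASE)
print(bool(lookahead.search("GET")))     # → False
print(bool(lookahead.search("DELETE")))  # → True
```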

There might be times when you know the first letter or first few letters, and probably the last letter or last few. matches can help here too! Let's say we are aware of a DNS request or response containing a string that starts and ends with "s" and has 2 characters in the middle, but we are not sure what those characters are. We can use the following:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns matches "s..s"'
  715  84.199441    10.0.2.15 → 64.71.255.198 DNS 78 Standard query 0xda6f A www.sans.org
  716  84.199465    10.0.2.15 → 64.71.255.198 DNS 78 Standard query 0x686d AAAA www.sans.org
  717  84.222652 64.71.255.198 → 10.0.2.15    DNS 165 Standard query response 0x686d AAAA www.sans.org SOA ns-1746.awsdns-26.co.uk

What about those times when there are x or more characters in the middle? Below, there are 5 or more characters between "sec" and "com":

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns matches "sec.{5,}com"'                                                           
    3   0.000235    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x4872 A www.securitynik.com
    4   0.000241    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x0d7d AAAA www.securitynik.com
    5   0.150729 64.71.255.198 → 10.0.2.15    DNS 166 Standard query response 0x4872 A www.securitynik.com CNAME www.securitynik.com.ghs.googlehosted.com CNAME ghs.googlehosted.com A 172.217.165.19

Similarly, we can say we would only like to see results where there is a minimum of 1 and a maximum of 3 characters between the "s" and ".org":

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'dns matches "s.{1,3}\.org"'
  715  84.199441    10.0.2.15 → 64.71.255.198 DNS 78 Standard query 0xda6f A www.sans.org
  716  84.199465    10.0.2.15 → 64.71.255.198 DNS 78 Standard query 0x686d AAAA www.sans.org
  717  84.222652 64.71.255.198 → 10.0.2.15    DNS 165 Standard query response 0x686d AAAA www.sans.org SOA ns-1746.awsdns-26.co.uk

Let's say, we have a PCAP file with the following IP addresses:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap  -T fields -e ip.src | sort | uniq
10.0.2.15
10.0.2.2
142.251.32.83
142.251.33.179
142.251.41.83
172.217.1.19
172.217.165.19
209.59.177.103
45.60.31.34
64.71.255.198

What we need to do now is extract the IPs where octet 1 starts with 142, octet 2 only contains the numbers 1, 2 or 5 (up to 3 of them), octet 3 can only be 32 or 33, and octet 4 has exactly 3 digits between 0 and 9.

Before building that full filter, let's see what happens if we simply look for source IPs matching 142.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'ip.src matches "142"' | more
tshark: ip.src (type=IPv4 address) cannot participate in 'matches' comparison.

Ooops! Looks like we got an error about a type mismatch: ip.src is an IPv4 address field, not a string. Let's convert this field to a string with string() and build out our filter at the same time. Our filter will look for a source IP address which starts with 142 in the first octet. The second octet should only consist of the digits 1, 2 or 5. The third octet has to be either 32 or 33, and the final octet can be any 3-digit number.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'string(ip.src) matches "^142\\.[1,2,5]{1,3}\\.(32|33)\\.[0-9]{3}"' -T fields -e ip.src | sort | uniq
142.251.33.179

A little more detail on the same filter:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'string(ip.src) matches "^142\\.[1,2,5]{1,3}\\.(32|33)\\.[0-9]{3}"'
  915 186.636198 142.251.33.179 → 10.0.2.15    TCP 66 80 → 37398 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460
  918 186.636629 142.251.33.179 → 10.0.2.15    TCP 66 80 → 37398 [ACK] Seq=1 Ack=136 Win=65535 Len=0
  919 186.651759 142.251.33.179 → 10.0.2.15    TCP 1490 HTTP/1.0 411 Length Required  [TCP segment of a reassembled PDU]
  921 186.653506 142.251.33.179 → 10.0.2.15    HTTP 355 HTTP/1.0 411 Length Required  (text/html)
  922 186.653509 142.251.33.179 → 10.0.2.15    TCP 66 80 → 37398 [FIN, ACK] Seq=1726 Ack=136 Win=65535 Len=0
  925 186.653877 142.251.33.179 → 10.0.2.15    TCP 66 80 → 37398 [ACK] Seq=1727 Ack=137 Win=65535 Len=0
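One subtlety in the filter above is worth calling out: inside a character class, commas are literal characters, so [1,2,5] also matches a comma; [125] expresses the intent more precisely. Since IP strings never contain commas, the filter works either way. A quick Python sketch of the same pattern (illustrative, using the source IPs listed earlier):

```python
import re

sources = ["10.0.2.15", "142.251.32.83", "142.251.33.179",
           "142.251.41.83", "172.217.165.19"]

# First octet 142; second octet 1-3 characters drawn from {1, 2, 5};
# third octet 32 or 33; fourth octet exactly three digits.
pattern = re.compile(r"^142\.[125]{1,3}\.(32|33)\.[0-9]{3}")

matched = [ip for ip in sources if pattern.match(ip)]
print(matched)  # → ['142.251.33.179']
```

142.251.32.83 fails only on the last octet: 83 is two digits, not three.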

Similarly, let's look for destinations:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap  -T fields -e ip.dst | sort | uniq
10.0.0.100
10.0.2.15
142.251.32.83
142.251.33.179
142.251.41.83
172.217.1.19
172.217.165.19
209.59.177.103
45.60.31.34
64.71.255.198

Let's now extract the destinations where the first octet consists of 2 digits between 0 and 9, the second octet is exactly 0, the third octet is a single digit that can only be 0 or 2, and octet 4 ends with either 100 or 15.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'string(ip.dst) matches "^[0-9]{2}\\.0\\.[0,2]{1}\\.(100|15)$"' -T fields -e ip.dst | sort | uniq
10.0.0.100
10.0.2.15

Let's now wrap this up by grabbing some frame numbers. First up, the first frame:

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'string(frame.number) matches "^1$"'
    1   0.000000 08:00:27:0e:34:8d →              ARP 48 Who has 10.0.2.2? Tell 10.0.2.15

Next, frames 1 to 9.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y 'string(frame.number) matches "^[0-9]$"'
    1   0.000000 08:00:27:0e:34:8d →              ARP 48 Who has 10.0.2.2? Tell 10.0.2.15
    2   0.000154 52:54:00:12:35:02 →              ARP 66 10.0.2.2 is at 52:54:00:12:35:02
    3   0.000235    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x4872 A www.securitynik.com
    4   0.000241    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x0d7d AAAA www.securitynik.com
    5   0.150729 64.71.255.198 → 10.0.2.15    DNS 166 Standard query response 0x4872 A www.securitynik.com CNAME www.securitynik.com.ghs.googlehosted.com CNAME ghs.googlehosted.com A 172.217.165.19
    6   5.004124    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x4872 A www.securitynik.com
    7   5.106980 64.71.255.198 → 10.0.2.15    DNS 166 Standard query response 0x4872 A www.securitynik.com CNAME www.securitynik.com.ghs.googlehosted.com CNAME ghs.googlehosted.com A 142.251.32.83
    8   5.107044    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x0d7d AAAA www.securitynik.com
    9   5.119889 64.71.255.198 → 10.0.2.15    DNS 178 Standard query response 0x0d7d AAAA www.securitynik.com CNAME www.securitynik.com.ghs.googlehosted.com CNAME ghs.googlehosted.com AAAA 2607:f8b0:400b:807::2013

Ok! One more. So far we took advantage of various fields by their names. Let's close this off by looking at a combination of byte offset and field.

┌──(root💀securitynik)-[~/tshark-series]
└─# tshark -n -r securitynik_regex.pcap -Y '(udp[25:] matches "s.{10,20}\.com") && (string(ip.src) matches "^[0-9]{2}\\.0\\.[0,2]{1}\\.(15)$")' | more
    3   0.000235    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x4872 A www.securitynik.com
    4   0.000241    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x0d7d AAAA www.securitynik.com
    6   5.004124    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x4872 A www.securitynik.com
    8   5.107044    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x0d7d AAAA www.securitynik.com
   17   5.282030    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x3e8d A www.securitynik.com
   18   5.282048    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x6688 AAAA www.securitynik.com
  337  47.326990    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x7c44 A www.securitynik.com
  338  47.327019    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x8443 AAAA www.securitynik.com
  348  47.549609    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x5543 A www.securitynik.com
  349  47.549690    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x677e AAAA www.securitynik.com
  910 186.517304    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x49f7 A www.securitynik.com
  911 186.517330    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x9bf2 AAAA www.securitynik.com
  926 200.282904    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x2bdf A www.securitynik.com
  927 200.282930    10.0.2.15 → 64.71.255.198 DNS 85 Standard query 0x1cd0 AAAA www.securitynik.com
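The slice syntax udp[25:] applies the pattern to the raw UDP payload bytes from offset 25 onward. Python's re module works on bytes too, so the idea can be sketched (illustrative; the byte string below is a hand-built DNS question section, and the slice offset is made up for the demo). Note that on the wire, DNS names are length-prefixed labels with no literal dots, which likely explains why the single-backslash \.com above still matched: the display filter layer consumed the backslash, leaving a wildcard dot for PCRE that happily matched the 0x03 label-length byte.

```python
import re

# Hand-built DNS question bytes for www.securitynik.com: each label
# ("www", "securitynik", "com") is prefixed with its length byte.
payload = b"\x03www\x0bsecuritynik\x03com\x00\x00\x01\x00\x01"

# Mimic udp[25:] matches "s.{10,20}.com": slice the raw bytes, then
# apply the pattern to what is left. The unescaped dot before "com"
# is a wildcard and matches the 0x03 label-length byte.
found = re.search(rb"s.{10,20}.com", payload[4:])
print(bool(found))  # → True
```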

Ok! Well, that's it for finding data using TShark's contains and matches. Clearly, we don't have to use additional tools such as grep to find data within packets. However, you may still find grep helpful in many other cases.

References:
securitynik_regex.pcap - PCAP used above 

https://sharkfestus.wireshark.org/assets/presentations16/16.pdf
https://www.wireshark.org/docs/wsug_html_chunked/ChWorkBuildDisplayFilterSection.html
https://www.cellstream.com/reference-reading/tipsandtricks/431-finding-text-strings-in-wireshark-captures
https://www.cellstream.com/resources/2013-09-10-11-55-21/cellstream-public-documents/wireshark-related/83-wireshark-display-filter-cheat-sheet/file
https://www.securityinbits.com/malware-analysis/tools/wireshark-filters/
https://blog.packet-foo.com/2013/05/the-notorious-wireshark-out-of-memory-problem/
https://www.wireshark.org/docs/wsdg_html_chunked/lua_module_GRegex.html
https://luca.ntop.org/gr2021/altre_slides/CorsoWireshark.pdf
https://stackoverflow.com/questions/9655164/regex-ignore-case-sensitivity
https://www.hscripts.com/tutorials/regular-expression/metacharacter-list.php

Thursday, August 5, 2021

Taking a journey to expand my Artificial Intelligence (AI) Knowledge - My takeaways from AI For Everyone

If you are in security, or even just from a general perspective, Artificial Intelligence (AI) is something you have surely been hearing a lot more about. AI is the future, or as the saying goes, AI is the new electricity. Therefore, it is important we all start getting our heads around this as much as possible. I've already dedicated time in the past to getting a better understanding of Machine Learning (ML) and thus decided I should dedicate more time to this area of study. To help me start this new journey, I've decided to begin with AI For Everyone by Andrew Ng and plan to follow a series of trainings over the next few months to expand my knowledge of AI. My blog here is being used as my "notepad" where I jot down things I find interesting as I take the various trainings.

This series consists of a number of videos, and to ensure I can reference my materials in the future, below are my takeaways from the videos.

Intro:
First takeaway: I like the fact that it starts off from a purely non-technical perspective. While my fascination with AI is mostly from the technical perspective, I like how Mr. Ng shows that within the top 11 industries, retail, travel, transportation, etc., will see greater growth from AI than high tech. Actually, high tech came in at number 8. I find that very interesting.

I definitely like Mr. Ng's points about the necessary and unnecessary hype surrounding AI. First major takeaway for me: AI really consists of two parts:

    1.    Artificial Narrow Intelligence (ANI)
                a. Focuses on a particular task.
                b. One-trick ponies which can be very good or very bad at their task.
    2.    Artificial General Intelligence (AGI)
                a. The aim is to do anything a human can do.
                b. Possibly even more things than humans can do.
                c. May require years, decades or hundreds of years to make real progress.

Ensure that your AI projects are technically feasible. Also, ML is what is driving the rise of AI today.

Very funny also that Mr. Ng says by the end of this program, I will have better knowledge of AI and be better qualified than most CEOs of large companies. ;-) I hope that is true by the end. ;-)

Understanding Terminology:
    AI: The umbrella term; everything below is really a subset of AI.
    ML: The ability to learn without being explicitly programmed.
            A subset of AI and the engine behind many of the various AI tools.
    Data Science: Extracting knowledge and insights from data.
        Can be seen as cross-cutting all the various AI and ML tools.
    Deep Learning: Leverages an Artificial Neural Network (ANN).
        Deep Learning and Neural Networks are used interchangeably.
        This is a subset of ML.

On Machine Learning:
The rise of AI has largely been driven by ML. The most common type of ML is supervised learning, which deals with an input-to-output mapping. Some common ML applications are spam filtering, online advertising, machine translation, self-driving cars and visual inspection. A big driver for the growth in supervised learning has been the increase in available data and computing power.

Data in ML is called a dataset. While data is important, it can also be misused and is at times overhyped. Do note that more data is typically better, but generally you still have to consider that garbage in equals garbage out. Images, text, audio, etc., are called unstructured data. Structured data, on the other hand, can be considered as data which might be in a spreadsheet.

While most of the time you may hear folks talking about supervised learning, there are other ML techniques such as unsupervised learning, of which clustering is the most popular mechanism. In unsupervised learning, no labeling is done and the algorithm is expected to identify things of interest on its own.


On being great at AI:
To be a great AI company, you have to be able to do the things that AI lets you do very well. Additionally, AI companies can easily spot opportunities to leverage automation. The following is Mr. Ng's recommended 5-step playbook for successful AI implementation.

    1. Gain momentum via pilot projects to learn whether AI is a good fit for your project.
    2. Ultimately, you will need to build an inhouse AI Team.
    3. Ensure the AI team as well as various business leaders get the necessary training.
    4. Develop an AI strategy
    5. Develop both internal and external communications to ensure all stakeholders are updated accordingly.

Before embarking on an AI project, ensure technical diligence is done to assess the practicality and feasibility of the project. While we might hear only the good news about AI, there are also lots of failures. Machine learning tends to work well when learning simple concepts from large amounts of data. On the other hand, ML works poorly on complex concepts learned from small amounts of data, and on previously unseen data unlike anything in the training dataset.

On Deep Learning:
The simplest possible ANN has one neuron, taking a single input and producing a single output. A more complex ANN may take multiple inputs and have multiple (thousands or even tens of thousands of) neurons and layers while producing an output. 

Deep learning models make use of numerical data. 

On Building AI Projects:
First build an AI workflow, then select an AI project, then organize the data and team to execute the AI project. 

Key steps within the project are to collect the data, train the model, then ultimately deploy the model. Do note that as you go through the various steps, you will more than likely optimize/tune as you go along.

Data science projects have different workflows from ML projects. In data science projects, similar to ML projects, you first collect the data. However, in data science, you then analyze the data and then suggest hypotheses/actions.

To make the most out of an AI project, you need AI experts as well as domain experts, domain experts being the individuals in your business who possess the knowledge about your business. Together, these are called cross-functional teams. When looking at ML, consider automating a task before automating a job. Other questions to ask are: what are the main drivers for automation, and what are the main pain points of the business? Note, you can make progress without big data, meaning with a small dataset. Also, don't build something for which there is already a good solution on the market.

When working with the AI team, ensure you clearly define your acceptance criteria. This is typically done in a statistical way. The AI expert should also be able to tell you how much test data is needed. While it is typical to have your data split between 1 training set and 1 test set, it is also possible to have 2 test sets.

You should never expect your models to be 100% accurate. This can be for a number of reasons, such as not enough training data, natural limitations within ML, or mislabeled or ambiguous data. Considering the above, ensure discussions are had with the AI expert to understand what level of accuracy is acceptable.

While some roles are well defined, there are others which have not been clearly defined. Some of the roles to be aware of are software engineer, ML engineer, data scientist, data engineer, AI product manager, etc.

AI Transformation:
The most important thing when selecting that initial project is for it to be successful rather than the most valuable. Try to show returns within 6-12 months; this project could be done in-house or outsourced.

When building an in-house AI team, have a dedicated team that is available to support the various business units. Focus on building company-wide platforms where possible, as this can help multiple business units at the same time. Remember to provide broad AI training at various levels of the company. While you can hire AI engineers from outside, it is better to build that skillset in-house.

AI Impact on Society:

AI and ethics is something society also has to pay close attention to.

AI can be biased or discriminate against minorities while also being vulnerable to adversarial attacks. Actually, learning about some of the biases exhibited by AI was very shocking to me. I always knew of the bias, especially around face recognition, but some of the others caught me by surprise. It was also nice to see that the AI community is working diligently to address this bias.

Adversarial attacks are attempts to fool the AI, basically attacking the AI system. Think of someone trying to evade a spam filter. This is more than likely going to be an arms race between attackers and defenders.

While AI might be used by adversaries for bad, it may also be used in other adverse ways, such as making deep fakes, oppressive surveillance by regimes, generation of fake comments, etc.

When thinking about AI, have a realistic view. Don't be too optimistic or too pessimistic. Consider it the Goldilocks rule: the porridge must be just right, not too hot, not too cold. The inability of AI to properly explain itself may be a barrier to some implementations. You may have to instead rely on the AI team for this guidance. 

That's it for this, my first set of notes. Join me as I continue this journey, as I now pursue the Deep Learning Specialization.


References:
AI For Everyone
AI Transformation Playbook


Beginning Deep Learning - Understanding Back Propagation - Basic Code

This code is associated with my Beginning Deep Learning - Understanding Back Propagation post.
No effort is made to do anything special here. I just wanted to understand the basics
and thus have kept it simple enough for me to understand.

I plan to follow another tutorial which creates a neural network from scratch
now that I have a better understanding of the back propagation process. Consider 
this version 1.

Author: Nik Alleyne
Author Blog: www.securitynik.com
Blog Article associated with this code: 

For this, I am giving the inputs as 2 and 9. The target value expected to be 
output is 92% (0.92).

# Import modules
import numpy as np

''' 
Define the sigmoid activation function
Let's also round the value by 4 decimals
References:
https://en.wikipedia.org/wiki/Sigmoid_function
'''
def my_sigmoid(dot_product_of_neuron):
  print('Applying the sigmoid activation function to the weighted sum / dot product ...') 
  return round(1 / (1 + np.exp(-dot_product_of_neuron)), 4)
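A quick sanity check of this helper (a standalone snippet, assuming numpy is available): the sigmoid of 2.87, the first weighted sum computed below, should come out to 0.9463.

```python
import numpy as np

# Standalone check of the sigmoid calculation:
# sigmoid(2.87) should come out to 0.9463
z0 = 2.87
sigmoid_z0 = round(1 / (1 + np.exp(-z0)), 4)
print(sigmoid_z0)  # 0.9463
```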

# Define my Cost function MSE
# (y_true - y_predicted) ** 2
def my_mse(y_true, y_predicted):
  print('Calculating the cost using MSE ...')
  return round((y_true - y_predicted) ** 2, 4)
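A quick standalone check of this cost function, using the target (0.92) and the prediction (0.6445) that the network produces below:

```python
# Standalone check of the MSE-style cost:
# (0.92 - 0.6445) ** 2 = 0.2755 ** 2 = 0.0759
def my_mse(y_true, y_predicted):
    return round((y_true - y_predicted) ** 2, 4)

cost = my_mse(0.92, 0.6445)
print(cost)  # 0.0759
```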

# Define our target value
y_true_output = 0.92

# Define the two inputs (features) 
input_layer_features = np.array([2, 9])

# Define the weights from input to hidden layer
weights_input_hidden_neuron0 = np.array([0.15, 0.23])
weights_input_hidden_neuron1 = np.array([0.5, 0.8])
weights_input_hidden_neuron2 = np.array([0.05, -0.05])

# Define the weight from hidden to output
weights_hidden_output = np.array([0.9, -0.5, 0.08])

# Define the bias for the hidden layer neurons
hidden_layer_bias_neuron_0 = 0.5
hidden_layer_bias_neuron_1 = 0.5
hidden_layer_bias_neuron_2 = 0.5

# Define the bias for the output layer neuron
output_layer_bias_neuron_3 = 0.2

'''
We can use this calculation to get our values for the first neuron. 
We can also use the same concept to get the other neurons. 
However, we are better off taking advantage of matrix multiplication 
and obtaining the dot product
z0  = (x0 * w0) + (x1 * w1) + b
z0  = (2 * 0.15) + (9 * 0.23) + 0.5
    = 0.3 + 2.07 + 0.5
    = 2.87
'''

# Calculating the dot product of z0 - the first neuron
# Round it to 4 decimal places
dot_product_hidden_neuron_z0 = round(np.dot(input_layer_features, weights_input_hidden_neuron0) + hidden_layer_bias_neuron_0, 4)

# Print the value of z0
dot_product_hidden_neuron_z0

# Calculate the value of O0 - Hidden Layer Neuron 0 value after the activation function has been applied
sigmoid_of_hidden_neuron_O0 = my_sigmoid(dot_product_hidden_neuron_z0)
sigmoid_of_hidden_neuron_O0

# Calculating the dot product of z1 - the second neuron
# Round it to 4 decimal places
dot_product_hidden_neuron_z1 = round(np.dot(input_layer_features, weights_input_hidden_neuron1) + hidden_layer_bias_neuron_1, 4)

# Print the value of z1
dot_product_hidden_neuron_z1

# Calculate the value of O1 - Hidden Layer Neuron 1 value after the activation function has been applied
sigmoid_of_hidden_neuron_O1 = my_sigmoid(dot_product_hidden_neuron_z1)
sigmoid_of_hidden_neuron_O1

# Calculating the dot product of z2 - the third neuron
# Round it to 4 decimal places
dot_product_hidden_neuron_z2 = round(np.dot(input_layer_features, weights_input_hidden_neuron2) + hidden_layer_bias_neuron_2, 4)
dot_product_hidden_neuron_z2

# Calculate the value of O2 - Hidden Layer Neuron 2 value after the activation function has been applied
sigmoid_of_hidden_neuron_O2 = my_sigmoid(dot_product_hidden_neuron_z2)
sigmoid_of_hidden_neuron_O2

# Create a new array, using the outputs from the hidden layer
# This array becomes the input to the output layer neuron
outputs_from_hidden = np.array([sigmoid_of_hidden_neuron_O0, sigmoid_of_hidden_neuron_O1, sigmoid_of_hidden_neuron_O2])
outputs_from_hidden
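As an aside (this is not part of the original step-by-step code): the three per-neuron dot products above can be computed in one shot by stacking the weight vectors into a matrix. A minimal standalone sketch with the same values:

```python
import numpy as np

# Stack the three hidden-neuron weight vectors into a 3x2 matrix
W_hidden = np.array([[0.15, 0.23],
                     [0.5, 0.8],
                     [0.05, -0.05]])
b_hidden = np.array([0.5, 0.5, 0.5])
x = np.array([2, 9])

# All three weighted sums at once: z = W.x + b
z_hidden = W_hidden @ x + b_hidden                 # 2.87, 8.7, 0.15
O_hidden = np.round(1 / (1 + np.exp(-z_hidden)), 4)  # 0.9463, 0.9998, 0.5374
```

This matches the per-neuron values computed above, with far less repetition.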

# Moving on to the output layer. 
# Calculating the dot product (z3) by using the output from hidden layer
dot_product_output_neuron_z3 = round(np.dot(outputs_from_hidden , weights_hidden_output) + output_layer_bias_neuron_3, 4)
dot_product_output_neuron_z3

# Apply the activation function to z3
sigmoid_of_output_neuron_O3 = my_sigmoid(dot_product_output_neuron_z3)
sigmoid_of_output_neuron_O3 

# To convert the above to a percentage, we simply multiply by 100
sigmoid_of_output_neuron_O3_percent = sigmoid_of_output_neuron_O3  * 100
sigmoid_of_output_neuron_O3_percent

# At this point, our target output was 92% but the predicted output is 64.45%. 
# Time to calculate our loss
current_loss = my_mse(y_true_output, sigmoid_of_output_neuron_O3)
current_loss

# Finding the partial derivative of the cost as it relates to the predicted value (O3)
# Starting off by finding dCost/dz3 
# dCost/dz3 = dCost/dO3 * dO3/dz3
# First, step 1. Find dCost/dO3

dCost_dO3 = round(sigmoid_of_output_neuron_O3 - y_true_output, 4)
dCost_dO3

# Now finding dO3/dz3
# Step 2
dO3_dz3 = round(sigmoid_of_output_neuron_O3 * (1 - sigmoid_of_output_neuron_O3), 4)
dO3_dz3 

# Finding now the dCost/dz3 
# dCost/dz3 = dCost/dO3 * dO3/dz3
# Step 3

dCost_dz3 = round((dCost_dO3 * dO3_dz3), 4)

# This value is also the dCost/db3
dCost_db3 = dCost_dz3

dCost_dz3, dCost_db3

# Calculating dCost/dw6, dCost/dw7, dCost/dw8
# dCost/dw6 = dCost/dz3 * dz3/dw6
# Next, step 4. Find dCost/dw6

dCost_dw6 = round((dCost_dz3 * sigmoid_of_hidden_neuron_O0), 4)

# Step 5
# dCost/dw7 = dCost/dz3 * dz3/dw7
dCost_dw7 = round((dCost_dz3 * sigmoid_of_hidden_neuron_O1), 4)

# Step 6
# dCost/dw8 = dCost/dz3 * dz3/dw8
dCost_dw8 = round((dCost_dz3 * sigmoid_of_hidden_neuron_O2), 4)

# here are the actual values for those calculations above.
dCost_dw6, dCost_dw7, dCost_dw8

# Calculating the dCost as it relates to w0, w1, w2, w3, w4 and w5
# First, finding the derivative of the cost as it relates to O0
# dCost_dO0 = dCost/dz3 * dz3/dO0
# Step 7
dCost_dO0 = round(( dCost_dz3 * weights_hidden_output[0]), 4) 
dCost_dO0

# Step 8
# dO0/dz0 = O0 * (1 - O0)
dO0_dz0 = round(sigmoid_of_hidden_neuron_O0 * (1 - sigmoid_of_hidden_neuron_O0), 4)
dO0_dz0

# Step 9 dCost/dz0
# dCost/dz0 = dCost/dO0 * dO0/dz0
dCost_dz0 = round((dCost_dO0 * dO0_dz0), 4)

# The value returned above is also dCost/db0, the gradient for this neuron's bias
dCost_db0 = dCost_dz0 

# Here are the two values
dCost_dz0, dCost_db0 

# Step 10 
# dCost/dw0 = dCost/dz0 * dz0/dw0
dCost_dw0 = round((dCost_dz0 * input_layer_features[0]), 4)
dCost_dw0

# Step 11 
# dCost/dw1 = dCost/dz0 * dz0/dw1
dCost_dw1 = round((dCost_dz0 * input_layer_features[1]), 4)
dCost_dw1

# Step 12 -  Finding dCost/dO1
# dCost/dO1 = dCost/dz3 * dz3/dO1
# dCost/dO1 = dCost/dz3 * w7

dCost_dO1 = round((dCost_dz3 * weights_hidden_output[1]), 4)
dCost_dO1

# Step 13 : Finding dO1/dz1
# dO1/dz1 = O1 * (1 - O1)
dO1_dz1 = round(sigmoid_of_hidden_neuron_O1 * (1 - sigmoid_of_hidden_neuron_O1), 4)

dO1_dz1

# Step 14 : Finding dCost/dz1
# dCost/dz1 = dCost/dO1 * dO1/dz1
dCost_dz1 = round((dCost_dO1 * dO1_dz1), 4)

# This value is also dCost/db1
dCost_db1 = dCost_dz1

dCost_dz1, dCost_db1 

# Step 15 : Finding dCost/dw2
# dCost/dw2 = dCost/dz1 * dz1/dw2
# dCost/dw2 = dCost/dz1 * Input0
dCost_dw2 = round((dCost_dz1 * input_layer_features[0]), 4)
dCost_dw2

# Step 16 : Finding dCost/dw3
# dCost/dw3 = dCost/dz1 * dz1/dw3
# dCost/dw3 = dCost/dz1 * Input1
dCost_dw3 = round((dCost_dz1 * input_layer_features[1]), 4)
dCost_dw3

#Step 17 : Finding dCost/dO2
# dCost/dO2 = dCost/dz3 * dz3/dO2
# dCost/dO2 = dCost/dz3 * w8
dCost_dO2 = round((dCost_dz3 * weights_hidden_output[2]), 4)
dCost_dO2

# Step 18 : Finding dO2/dz2
# dO2/dz2 = O2 * (1 - O2)

dO2_dz2 = round(sigmoid_of_hidden_neuron_O2 * (1 - sigmoid_of_hidden_neuron_O2), 4)
dO2_dz2

# Step 19 : Finding dCost/dz2
# dCost/dz2 = dCost/dO2 * dO2/dz2
dCost_dz2 = round((dCost_dO2 * dO2_dz2), 4)

# This value is also dCost/db2
dCost_db2 = dCost_dz2

dCost_dz2, dCost_db2

# Step 20 : Finding dCost/dw4
# dCost/dw4 = dCost/dz2 * dz2/dw4
# dCost/dw4 = dCost/dz2 * Input0
dCost_dw4 = round((dCost_dz2 * input_layer_features[0]), 4)
dCost_dw4 

# Step 21 : Finding dCost/dw5
# dCost/dw5 = dCost/dz2 * dz2/dw5
# dCost/dw5 = dCost/dz2 * Input1
dCost_dw5 = round((dCost_dz2 * input_layer_features[1]), 4)
dCost_dw5

The above represents the code for my basic understanding of back propagation. 

I will work on another post later, where I make a more meaningful network. That will at least help to solidify my knowledge. 

Beginning Deep Learning - Understanding Back Propagation

My gosh! The last time I spent as much time learning a particular topic was when I wanted to learn more about buffer overflows. Learning about back propagation, which is a critical component of deep learning, really required me to dig deep (pun intended). This post reflects my understanding of back propagation. My hope is that with this understanding, I can now move forward with what is hopefully easier learning (pun intended again ;-) ) as I learn more about deep learning.

Here is my topology:


The network is being fed 2 & 9 as input and we are expecting a value of 92% as output, i.e. 0.92 to keep it simple.

Feeding forward!
Here is the topology with inputs, weights, bias, etc. As shown below, our target is 0.92 or 92%. Our network predicted 0.6445 or 64.45%. Let's see how we got this prediction.


Note: Later in the diagrams I accidentally dropped the biases, but they are b0=0.5, b1=0.5, b2=0.5, b3=0.2.

The first part of understanding back propagation is to understand the feed forward process. I believe this is relatively easy. Let's see this in action. Let's first find z0, then O0. This will be followed by z1 then O1, and I then wrap up the hidden layer by doing z2 and O2. Note that zx represents the weighted sum, while Ox represents the output after the activation function has been applied. 

Hidden Layer Node 0:
z0  = (x0 * w0) + (x1 * w1) + b
z0  = (2 * 0.15) + (9 * 0.23) + 0.5
    = 0.3 + 2.07 + 0.5
    = 2.87
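The same arithmetic can be verified in a few lines of Python (a standalone snippet using numpy, as in the code version of this post):

```python
import numpy as np

x = np.array([2, 9])          # inputs x0, x1
w = np.array([0.15, 0.23])    # weights w0, w1
b = 0.5                       # bias

# z0 = (x0 * w0) + (x1 * w1) + b
z0 = round(np.dot(x, w) + b, 4)
print(z0)  # 2.87
```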

With the z0 value found, time to compute O0 by applying an activation function. Throughout this post, I am using the Sigmoid activation function. Check out the Wikipedia, casio, or redcrab-software links in the references if you are not sure how to compute the sigmoid.

O0 = sigmoid(z0)
O0 = sigmoid(2.87) = 0.9463

Perform a similar process for Hidden Layer Node 1:

z1  = (x0 * w2) + (x1 * w3) + b
z1  = (2 * 0.5) + (9 * 0.8) + 0.5
    = 2 + 7.2 + 0.5
    = 8.7

O1 = sigmoid(z1)
O1 = sigmoid(8.7) = 0.9998

Completing the hidden layer Hidden Layer Node 2:

z2  = (x0 * w4) + (x1 * w5) + b
z2  = (2 * 0.05) + (9 * -0.05) + 0.5
    = 0.1 + (-0.45) + 0.5
    = 0.15

O2 = sigmoid(z2)
O2 = sigmoid(0.15) = 0.5374

Moving on to the output layer.

Output Layer Node 0:

z3  = (O0 * w6) + (O1 * w7) + (O2 * w8)  + b
z3  = (0.9463 * 0.9) + (0.9998 * -0.5) + (0.5374 * 0.08) + 0.2
    = 0.8517 + (-0.4999) + (0.0430) + 0.2
    = 0.5948

O3 = sigmoid(z3)
O3 = sigmoid(0.5948) = 0.6445

Ok. After the first pass this network produced 0.6445 or 64.45%. I needed it to produce 0.92 or 92%. 

Setting up the Cost function, using Mean Squared Error (MSE)
Cost = (actual - predicted)**2

Another way of looking at this, and what I will use, is:

Cost = (y_true - y_predicted)**2

Using this formula, I now plug our values in to get the loss:

Cost = (0.92 - 0.6445) ** 2
    = (0.2755) ** 2
    = 0.0759

At this point, we have a cost of 0.0759. Moving on ...
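A one-line check of that arithmetic in Python:

```python
# Cost = (y_true - y_predicted) ** 2
cost = round((0.92 - 0.6445) ** 2, 4)
print(cost)  # 0.0759
```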

The real headache (for me) Back Propagation!
I know there are many out there who would say this is easy stuff to learn. I am not mad at you. However, this took me some time, and many people may not even bother to invest that time. I know you might also be saying that since I was able to figure it out, it might not have been as hard as I thought. I guess the end does justify the means.

Let's get down to the meat of the matter. What we need to figure out, is how the weights (w0, w1, w2, w3, w4, w5, w6, w7, w8) impact the cost. To do that, I need to take advantage of the chain rule and computation of partial derivatives. 

I will use 'd' to represent derivatives. Therefore dCost/dO3 means, the derivative of the cost as it relates to the derivative of output 3 (O3). In other words, how does a change in output 3 impact the cost. Let's dig in.

Quick simple note on the chain rule. The chain rule says dCost/dO3 is the same as dCost/dw0 * dw0/dO3.

Breaking it down further, let's say dCost is 3 and dO3 is 2. This gives:

dCost/dO3
= 3/2
= 1.5

Now let's say dw0 is 10. Note, any number chosen should produce the same result. 

This means.

dCost/dO3 = dCost/dw0 * dw0/dO3
1.5 = 3/10 * 10/2
1.5 = 30/20
1.5 = 1.5

As you can see above, we got 1.5 on both sides of the equals sign.
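The same fraction game, checked in Python with the illustrative numbers above:

```python
# Illustrative values only: dCost = 3, dO3 = 2, dw0 = 10
dCost, dO3, dw0 = 3.0, 2.0, 10.0

direct = dCost / dO3                    # 1.5
chained = (dCost / dw0) * (dw0 / dO3)   # 0.3 * 5.0 = 1.5
```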

With that basic understanding, time to build further.

Here are the first 6 steps, where ultimately I am solving for w6, w7 and w8. Looking at it from a different perspective: dCost/dw6, dCost/dw7 and dCost/dw8.



Let's find dCost/dz3 = dCost/dO3 * dO3/dz3

First, step 1:
dCost/dO3 = O3 - y_true
    = 0.6445 - 0.92 
    = -0.2755

(Strictly speaking, the derivative of (y_true - O3)**2 with respect to O3 is 2 * (O3 - y_true). I drop the constant factor of 2 here, which is equivalent to using half the squared error as the cost; it only scales the gradients and can be absorbed into the learning rate.)



Step 2:
dO3/dz3 = O3 * (1 - O3)
    = 0.6445 * (1 - 0.6445)
    = 0.6445 * 0.3555
    = 0.2291

With the above calculated, finding for dCost/dz3 is simply to multiply dCost/dO3 * dO3/dz3.

Step 3:
dCost/dz3 = dCost/dO3 * dO3/dz3
dCost/dz3 = -0.2755 * 0.2291
    = -0.0631

This value also represents dCost/db3
dCost/db3 = -0.0631
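Steps 1 through 3 can be checked in Python, rounding at each step as done throughout this post:

```python
# O3 is the prediction from the feed forward pass; y_true is the target
O3, y_true = 0.6445, 0.92

dCost_dO3 = round(O3 - y_true, 4)            # step 1: -0.2755
dO3_dz3 = round(O3 * (1 - O3), 4)            # step 2:  0.2291
dCost_dz3 = round(dCost_dO3 * dO3_dz3, 4)    # step 3: -0.0631
dCost_db3 = dCost_dz3                        # the bias gradient is the same value
print(dCost_dz3)  # -0.0631
```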

With those out of the way, time to compute how w6, w7 and w8 impact the cost. 

Next, step 4. Find dCost/dw6

Weight 6:
dCost/dw6 = dCost/dz3 * dz3/dw6
    = dCost/dz3 * O0
    = -0.0631 * 0.9463
    = -0.0597


Step 5. Find dCost/dw7

Weight 7:
dCost/dw7 = dCost/dz3 * dz3/dw7
    = dCost/dz3 * O1
    = -0.0631 * 0.9998
    = -0.0631


Step 6. Find dCost/dw8

Weight 8:
dCost/dw8 = dCost/dz3 * dz3/dw8
    = dCost/dz3 * O2
    = -0.0631 * 0.5374
    = -0.0339
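Since dz3/dw6, dz3/dw7 and dz3/dw8 are just the hidden layer outputs O0, O1 and O2, steps 4 to 6 collapse into a single vector multiply (a standalone check using numpy):

```python
import numpy as np

dCost_dz3 = -0.0631
hidden_outputs = np.array([0.9463, 0.9998, 0.5374])  # O0, O1, O2

# dCost/dw6, dCost/dw7 and dCost/dw8 in one shot
grads = np.round(dCost_dz3 * hidden_outputs, 4)
print(grads)  # [-0.0597 -0.0631 -0.0339]
```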

Next up, how does the w0, w1, w2, w3, w4 and w5 impact the cost.

Step 7 : Finding dCost/dO0
dCost/dO0 = dCost/dz3 * dz3/dO0
    =  dCost/dz3 * w6
    = -0.0631 * 0.9
    = -0.0568


Step 8 : Finding dO0/dz0
    dO0/dz0 = O0 * (1 - O0)
        = 0.9463 * (1 - 0.9463)
        = 0.9463 * 0.0537
        = 0.0508


Step 9 : Finding dCost/dz0
    dCost/dz0 =  -0.0568 * 0.0508
        = -0.0029


Step 10 : Finding dCost/dw0
    dCost/dw0 = dCost/dz0 * dz0/dw0
        = dCost/dz0 * Input0
        = -0.0029 * 2
        = -0.0058


Step 11 : Finding dCost/dw1
    dCost/dw1 = dCost/dz0 * dz0/dw1
        = dCost/dz0 * Input1
        = -0.0029 * 9
        = -0.0261
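Steps 7 through 11 in Python, rounding at each step as above:

```python
dCost_dz3 = -0.0631            # from step 3
w6, O0 = 0.9, 0.9463           # weight into the output neuron, hidden output 0
x0, x1 = 2, 9                  # the network inputs

dCost_dO0 = round(dCost_dz3 * w6, 4)         # step 7:  -0.0568
dO0_dz0 = round(O0 * (1 - O0), 4)            # step 8:   0.0508
dCost_dz0 = round(dCost_dO0 * dO0_dz0, 4)    # step 9:  -0.0029
dCost_dw0 = round(dCost_dz0 * x0, 4)         # step 10: -0.0058
dCost_dw1 = round(dCost_dz0 * x1, 4)         # step 11: -0.0261
```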




Step 12 : Finding dCost/dO1
dCost/dO1 = dCost/dz3 * dz3/dO1
    =  dCost/dz3 * w7
    = -0.0631 * -0.5
    = 0.0316



Step 13 : Finding dO1/dz1
    dO1/dz1 = O1 * (1 - O1)
        = 0.9998 * (1 - 0.9998)
        = 0.9998 * 0.0002
        = 0.0002


Step 14 : Finding dCost/dz1
    dCost/dz1 = dCost/dO1 * dO1/dz1
        = 0.0316 * 0.0002
        = 0.0000


This value is also dCost/db1
dCost/db1 = 0.0000


Step 15 : Finding dCost/dw2
    dCost/dw2 = dCost/dz1 * dz1/dw2
        = dCost/dz1 * Input0
        = 0.0000 * 2
        = 0.0000


Step 16 : Finding dCost/dw3
    dCost/dw3 = dCost/dz1 * dz1/dw3
        = dCost/dz1 * Input1
        = 0.0000 * 9
        = 0.0000




Step 17 : Finding dCost/dO2
dCost/dO2 = dCost/dz3 * dz3/dO2
    =  dCost/dz3 * w8
    = -0.0631 * 0.08
    = -0.0050


Step 18 : Finding dO2/dz2
    dO2/dz2 = O2 * (1 - O2)
        = 0.5374 * (1 - 0.5374)
        = 0.5374 * 0.4626
        = 0.2486


Step 19 : Finding dCost/dz2
    dCost/dz2 = dCost/dO2 * dO2/dz2
     =  -0.0050 * 0.2486
        = -0.0012
        
This value is also dCost/db2
dCost/db2 = -0.0012


Step 20 : Finding dCost/dw4
    dCost/dw4 = dCost/dz2 * dz2/dw4
        = dCost/dz2 * Input0
        = -0.0012 * 2
        = -0.0024


Step 21 : Finding dCost/dw5
    dCost/dw5 = dCost/dz2 * dz2/dw5
        = dCost/dz2 * Input1
        = -0.0012 * 9
        = -0.0108


Here is the final diagram.



Ahhhhhhhhhhhhh! The heavy lifting has been completed. Next up, calculate the new weights and biases.

The formula for the new weights and biases are as follows:
new_weight = old_weight - learning_rate * (dCost/dw_n), where dw_n represents dw0, dw1, dw2, ..., dw8. The learning rate is typically a value between 0 and 1. I will use 0.5. Hence the new weights are: 

new_w0 = old_w0 - 0.5(dCost/dw0)
    = 0.15 - 0.5(-0.0058) 
    = 0.15 - (-0.0029)
    = 0.1529

new_w1 = old_w1 - 0.5(dCost/dw1)
    = 0.23 - 0.5(-0.0261) 
    = 0.23 - (-0.01305)
    = 0.2431

new_w2 = old_w2 - 0.5(dCost/dw2)
    = 0.5 - 0.5(-0.000) 
    = 0.5 - (0)
    = 0.5


new_w3 = old_w3 - 0.5(dCost/dw3)
    = 0.8 - 0.5(-0.000) 
    = 0.8 - (0)
    = 0.8


new_w4 = old_w4 - 0.5(dCost/dw4)
    = 0.05 - 0.5(-0.0024) 
    = 0.05 - (-0.0012)
    = 0.0512


new_w5 = old_w5 - 0.5(dCost/dw5)
    = -0.05 - 0.5(-0.0108) 
    = -0.05 - (-0.0054)
    = -0.0446


new_w6 = old_w6 - 0.5(dCost/dw6)
    = 0.9 - 0.5(-0.0597) 
    = 0.9 - (-0.02985)
    = 0.9299


new_w7 = old_w7 - 0.5(dCost/dw7)
    = -0.5 - 0.5(-0.0631) 
    = -0.5 - (-0.03155)
    = -0.4685


new_w8 = old_w8 - 0.5(dCost/dw8)
    = 0.08 - 0.5(-0.0339) 
    = 0.08 - (-0.01695)
    = 0.09695


new_b0 = old_b0 - 0.5(dCost/db0)
    = 0.5 - 0.5(-0.0029) 
    = 0.5 - (-0.00145)
    = 0.50145


new_b1 = old_b1 - 0.5(dCost/db1)
    = 0.5 - 0.5(0.0000) 
    = 0.5 - (0.0000)
    = 0.5


new_b2 = old_b2 - 0.5(dCost/db2)
    = 0.5 - 0.5(-0.0012) 
    = 0.5 - (-0.0006)
    = 0.5006


new_b3 = old_b3 - 0.5(dCost/db3)
    = 0.2 - 0.5(-0.0631) 
    = 0.2 - (-0.03155)
    = 0.23155
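Each update above is the same one-liner in code. For example, for w0, with the learning rate of 0.5 and the gradient from step 10:

```python
learning_rate = 0.5
old_w0, dCost_dw0 = 0.15, -0.0058   # gradient from step 10

# new_weight = old_weight - learning_rate * gradient
new_w0 = round(old_w0 - learning_rate * dCost_dw0, 4)
print(new_w0)  # 0.1529
```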

Sighhhhhh!!!! This was a tedious process and really took some patience on my part. I'm happy that I was able to learn this. With this completed, I believe my learning process (pun intended) becomes a lot easier.

See you in the next post, where I code this just to get a basic understanding.


References