Why this series?
When teaching the SANS SEC595: Applied Data Science and Machine Learning for Cybersecurity Professionals course (https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/), I am always asked,
"Will you be sharing your demo notebooks?" or "Can we get a copy of your demo notebooks?" or ... well you get the point.
My answer is always no. It is not that I do not want to share (sharing is caring :-D), but the demo notebooks by themselves would not make sense or add real value. Hence, this series!
This is my supplemental work, similar to what I would do in the demos but with a lot more details and references.
This series primarily uses Zeek's conn.log file. Notebooks 23 and 24 use Zeek's DNS and HTTP logs, respectively.
The series includes the following:
01 - Beginning Numpy
02 - Beginning Tensorflow
03 - Beginning PyTorch
04 - Beginning Pandas
05 - Beginning Matplotlib
06 - Beginning Data Scaling
07 - Beginning Principal Component Analysis (PCA)
08 - Beginning Machine Learning Anomaly Detection - Isolation Forest and Local Outlier Factor
09 - Beginning Unsupervised Machine Learning - Clustering - K-means and DBSCAN
10 - Beginning Supervised Learning - Machine Learning - Logistic Regression, Decision Trees and Metrics
11 - Beginning Linear Regression - Machine Learning
12 - Beginning Deep Learning - Anomaly Detection with AutoEncoders, Tensorflow
13 - Beginning Deep Learning - Anomaly Detection with AutoEncoders, PyTorch
14 - Beginning Deep Learning - Linear Regression, Tensorflow
15 - Beginning Deep Learning - Linear Regression, PyTorch
16 - Beginning Deep Learning - Classification, Tensorflow
17 - Beginning Deep Learning - Classification, PyTorch
18 - Beginning Deep Learning - Classification - regression - MIMO - Functional API Tensorflow
19 - Beginning Deep Learning - Convolution Networks - Tensorflow
20 - Beginning Deep Learning - Convolution Networks - PyTorch
21 - Beginning Regularization - Early Stopping, Dropout, L2 (Ridge), L1 (Lasso)
22 - Beginning Model TFServing
I chose unsupervised learning because these data come without labels.
23 - Continuing Anomaly Learning - Zeek DNS Log - Machine Learning
24 - Continuing Unsupervised Learning - Zeek HTTP Log - Machine Learning
25 - Beginning - Reading Executables and Building a Neural Network to make predictions on suspicious vs benign
With 25 notebooks in this series, it is quite possible there are things I could have or should have done differently.
If you find anything you think fits those criteria, drop me a line.
If you find this series beneficial, I would greatly appreciate your feedback.
- SANS-ML-Presentation-SIEM-Alerts-Predictions
- Understanding Decision Tree with dTreeviz
- Beginning SQLAlchemy
- Beginning Fourier Transform for Beacon Detection