HITB CyberWeek
Applied Data Science and Machine Learning for Cybersecurity
This interactive course will teach security professionals how to use data science techniques to quickly manipulate and analyze network and security data and ultimately uncover valuable insights from this data.
The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale, all with a focus on security-related problems. Participants will learn how to read in data in a variety of common formats, then write scripts to analyze and visualize that data.
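As a rough illustration of what "reading data in a variety of common formats" can look like, the sketch below loads the same two connection records from CSV, JSON and XML using only the Python standard library. It is not taken from the course materials: the sample records, the field names (`src_ip`, `dst_port`, `bytes`) and the helper names are all made up for this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two connection records in three formats.
CSV_TEXT = "src_ip,dst_port,bytes\n10.0.0.5,443,1200\n10.0.0.9,22,380\n"
JSON_TEXT = """[
  {"src_ip": "10.0.0.5", "dst_port": 443, "bytes": 1200},
  {"src_ip": "10.0.0.9", "dst_port": 22, "bytes": 380}
]"""
XML_TEXT = (
    "<events>"
    '<event src_ip="10.0.0.5" dst_port="443" bytes="1200"/>'
    '<event src_ip="10.0.0.9" dst_port="22" bytes="380"/>'
    "</events>"
)


def normalize(record):
    """Coerce a raw record (string or numeric values) into one common shape."""
    return {
        "src_ip": record["src_ip"],
        "dst_port": int(record["dst_port"]),
        "bytes": int(record["bytes"]),
    }


def read_csv(text):
    # DictReader uses the header row as the keys of each record.
    return [normalize(row) for row in csv.DictReader(io.StringIO(text))]


def read_json(text):
    return [normalize(item) for item in json.loads(text)]


def read_xml(text):
    # Each <event> element carries its fields as XML attributes.
    root = ET.fromstring(text)
    return [normalize(event.attrib) for event in root.iter("event")]


records = read_csv(CSV_TEXT)
total_bytes = sum(r["bytes"] for r in records)
print(records, total_bytes)
```

Normalizing every source into the same list-of-dictionaries shape is one simple design choice; it lets the analysis step that follows stay independent of the input format.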
Who should take this course?
Anyone who wishes to incorporate automated data analysis, machine learning and data science into their work.
Key Learning Objectives
- Writing scripts to efficiently read and manipulate CSV, XML, and JSON files
- Quickly and efficiently parsing executables, log files and pcap files, and extracting artifacts from them
- Making API calls to merge datasets
- Using the Pandas library to quickly manipulate tabular data
- Effectively visualizing data using Python
- Preprocessing raw security data for machine learning and feature engineering
- Building, applying and evaluating machine learning algorithms to identify potential threats
- Automating the process of tuning and optimizing machine learning models
- Hunting anomalous indicators of compromise and reducing false positives
- Using supervised learning algorithms such as Random Forests, Naive Bayes, K-Nearest Neighbors (K-NN) and Support Vector Machines (SVM) to classify malicious URLs and identify SQL injection
- Applying unsupervised learning algorithms such as K-Means Clustering to detect anomalous behavior
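To give a flavor of the feature engineering and supervised learning objectives above, here is a minimal sketch of a K-Nearest Neighbors classifier for URLs, written with the standard library only. It is an illustration under stated assumptions, not the course's own method: the training URLs and their labels are invented, the five features are arbitrary choices, and the features are left unscaled, which a real pipeline would normally address.

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlparse


def host_is_ip(host):
    """Return True if the host part is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Turn a URL into a small numeric feature vector (illustrative features)."""
    host = urlparse(url).hostname or ""
    return [
        len(url),                          # total URL length
        host.count("."),                   # number of dots in the host
        sum(ch.isdigit() for ch in url),   # number of digits anywhere in the URL
        int("@" in url),                   # 1 if the URL contains '@'
        int(host_is_ip(host)),             # 1 if the host is a raw IP address
    ]


# Hypothetical, hand-labeled training set; these URLs are invented examples.
TRAINING = [
    ("https://www.example.com/index.html", "benign"),
    ("https://docs.python.org/3/library/csv.html", "benign"),
    ("https://example.org/about", "benign"),
    ("http://192.168.10.5/login.php?user=admin@bank", "malicious"),
    ("http://secure-update.example.com.account-verify.test/confirm?id=99812734", "malicious"),
    ("http://10.1.2.3/a/b/c/d/update.exe", "malicious"),
]


def classify(url, k=3):
    """Label a URL by majority vote among its k nearest training examples."""
    target = url_features(url)
    # Sort training examples by Euclidean distance to the target vector.
    by_distance = sorted(
        TRAINING, key=lambda item: math.dist(target, url_features(item[0]))
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify("http://172.16.0.9/admin/update.exe"))
```

With an odd `k` and two labels the vote cannot tie. In practice a library such as scikit-learn would replace the hand-written distance search, and the model would be evaluated on held-out data rather than on a handful of examples.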
Students will need to have an understanding of Python.
Hardware / Software Requirements
Students should bring a laptop with either:
- VirtualBox (or VMware) installed, 6 GB of RAM and 10 GB of storage.
- Anaconda and IPython installed.
We strongly recommend using the virtual machine we will provide as it will give the best student experience.