Today, the volume and velocity of data are growing rapidly, with the Internet of Things dramatically increasing the number of connected devices across the globe. Organizations of every size, industry, and business model are overflowing with data, much of which is virtually unusable in its raw state. Even so, that data holds a treasure trove of digital assets that are valuable, proprietary, confidential, and often personal, and that are ripe for the picking by malicious actors. These assets can include customer information, trade secrets, intellectual property, financial information, product designs, and marketing strategies, to name just a few.
Cybercriminals target this type of data in most ransomware attacks, in which malware penetrates a target system through phishing, social engineering, or any of numerous other techniques. The target data is then encrypted so that only the attacker holds the key needed to restore access to the asset(s). In other instances, the malware may simply lock the computer system entirely, leaving it inaccessible to the end user unless a fee, or ransom, is paid to the criminal, and even then there is no guarantee that the data will ever be made accessible or usable again. To make matters worse, ransomware has grown rapidly over the past few years, to the point where “kits” can be purchased by criminals on the dark web and deployed with relative ease.
Enter machine learning. Machine learning (ML) is a subset of Artificial Intelligence in which a computer is presented with an enormous amount of labeled or unlabeled data, depending upon the training method(s) employed and the data and resources available, and “learns” to categorize, classify, and predict using mathematical and statistical algorithms. Typically through massive repetition over that data, the machine learns to identify patterns, predict outcomes, distinguish cats from dogs, suggest purchases, and so on.
For example, networks, computers, applications, operating systems, and messaging systems are often quite “chatty” and create a lot of data (usually in the form of logs or live streaming data) about current status, warnings, errors, and counts of packets, logins, files, queries, connections, and more. From this data, machine learning algorithms can establish a “baseline of normalcy.” Once this normal operating behavior, or noise, has been established, the ML program can often identify anomalous events that deviate from that baseline. This technique of spotting and classifying potential outliers, among other ML techniques, can be used to identify potential ransomware and other malware infection activity. Ransomware will typically cause unusual CPU activity and file activity, among other signs, and detecting those signs early can help prevent further contamination and lockout of digital assets.
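To make the baseline idea concrete, here is a minimal sketch in Python. It assumes a single metric (file writes per minute) and uses a simple statistical rule, flagging any reading more than three standard deviations from the mean, in place of a full ML model. The function names, the sample numbers, and the threshold are all illustrative assumptions, not a production detector.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize 'normal' behavior as the mean and standard deviation."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a reading that sits more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical file-write counts per minute, observed during normal operation.
normal_writes_per_minute = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]
baseline = build_baseline(normal_writes_per_minute)

# New readings; the spike stands in for ransomware encrypting files in bulk.
observed = [13, 11, 480, 15]
flags = [is_anomalous(v, baseline) for v in observed]

for value, flagged in zip(observed, flags):
    print(f"{value:>4} writes/min -> {'ANOMALY' if flagged else 'normal'}")
```

A real deployment would likely track many signals at once (CPU usage, logins, network connections) and use a trained anomaly-detection model rather than a single fixed threshold, but the principle of comparing live activity against a learned baseline is the same.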
Contact us today to learn more about Machine Learning for Cybersecurity.
We look forward to working with you!