Machine learning-based products can be tricked into classifying malware as a legitimate file, new findings show.
Just because your endpoint security product employs machine learning (ML) doesn’t mean it can’t be manipulated to miss malware, new research shows.
A pair of researchers will demonstrate at Black Hat Europe next week how they were able to bypass ML-based, next-generation anti-malware products. Unlike previous research that reverse-engineered a next-generation endpoint tool, such as Skylight’s bypass of Cylance’s endpoint product in 2018, the researchers fooled the so-called static analysis malware classifiers used in some next-gen anti-malware products without reverse engineering them.
They first tested their attacks using the open source, static malware-classifier tool Ember, and then applied the same attack to several next-gen antivirus (AV) products, which they declined to name. The research advances findings they presented in September at the International Joint Conference on Neural Networks (IJCNN).
They sneaked known malware samples past the products by modifying some of the malware file’s features, such as checksums and time stamps, which fooled the static classifier into treating the malware as a legitimate file rather than a malicious one. (The malware classifier essentially determines whether a binary file is malicious.)
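To make "features such as checksums and time stamps" concrete, here is a minimal Python sketch of how two such header fields can be read from a Windows PE image. It is an illustration, not the researchers' tooling or Ember's feature extractor: the function names are hypothetical, and the input is a synthetic buffer built in the example itself rather than a real binary. The offsets follow the published PE/COFF layout.

```python
import struct


def read_header_features(data: bytes) -> dict:
    """Read the COFF TimeDateStamp and optional-header CheckSum fields."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows the 4-byte signature; TimeDateStamp sits
    # 4 bytes into it (after Machine and NumberOfSections).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    # The optional header starts after the 20-byte COFF header;
    # CheckSum is at offset 64 within it.
    (checksum,) = struct.unpack_from("<I", data, pe_offset + 24 + 64)
    return {"timestamp": timestamp, "checksum": checksum}


def make_synthetic_header(timestamp: int, checksum: int) -> bytes:
    """Build a zero-filled buffer with only the fields above populated.

    Hypothetical helper for this example; the result is not a loadable
    executable.
    """
    buf = bytearray(0x200)
    pe_offset = 0x80
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, pe_offset)
    buf[pe_offset:pe_offset + 4] = b"PE\x00\x00"
    struct.pack_into("<I", buf, pe_offset + 8, timestamp)
    struct.pack_into("<I", buf, pe_offset + 24 + 64, checksum)
    return bytes(buf)


if __name__ == "__main__":
    sample = make_synthetic_header(timestamp=1575158400, checksum=0xABCD)
    print(read_header_features(sample))
```

A static classifier typically consumes values like these alongside many other features, which is why fields that do not change the file's behavior can still shift its score.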
“It was part of research we were doing to protect our own product — to assess the vulnerability of next-generation antivirus to different forms of attack,” says Ishai Rosenberg, head of the deep learning group at Deep Instinct.
“We tried to show that there are other techniques” for bypassing next-gen AV that don’t require reverse-engineering the endpoint protection system, he says.
Rosenberg’s colleague Shai Meir, security researcher and data scientist in deep learning at Deep Instinct, says they only slightly modified features of the malware samples, and that was all it took for the classifier in the next-gen AV systems to label the code as benign rather than malicious.
“We [first] started attacking those features and seeing how the classifier scores the changes” in them, Meir explains. “For example, a change in the time-stamp signature of a file could affect the score” and make it look benign, he says.
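The probe-and-score loop Meir describes can be sketched in a few lines of Python, for example as a robustness check a vendor might run against its own model. Everything here is an assumption made for illustration: `toy_score` is a hypothetical linear stand-in for a static classifier, the feature names and weights are invented, and the greedy search is a generic query-based approach rather than the method the researchers used.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]


def toy_score(features: Features) -> float:
    """Hypothetical classifier stand-in: higher means more suspicious."""
    weights = {"timestamp_age": -0.4, "checksum_valid": -0.3, "entropy": 0.9}
    raw = 0.5 + sum(weights[name] * features[name] for name in weights)
    return min(1.0, max(0.0, raw))  # clamp to [0, 1]


def greedy_probe(
    score: Callable[[Features], float],
    start: Features,
    candidates: Dict[str, List[float]],
    threshold: float = 0.5,
) -> Tuple[Features, float]:
    """Try candidate values one feature at a time, keeping any change
    that lowers the score, and stop once the score is below threshold."""
    current = dict(start)
    best = score(current)
    for name, values in candidates.items():
        for value in values:
            trial = {**current, name: value}
            trial_score = score(trial)
            if trial_score < best:
                current, best = trial, trial_score
        if best < threshold:
            break
    return current, best


if __name__ == "__main__":
    start = {"timestamp_age": 0.0, "checksum_valid": 0.0, "entropy": 0.5}
    # Only header-style features are varied; "entropy" stands in for the
    # payload itself and is left untouched.
    candidates = {"timestamp_age": [0.5, 1.0], "checksum_valid": [1.0]}
    changed, final = greedy_probe(toy_score, start, candidates)
    print(toy_score(start), final, changed)
```

The loop only needs the score the classifier returns, not its internals, which mirrors the article's point that no reverse engineering was required.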
They say the attack could be pulled off easily, but defending against it is difficult. “There is no silver bullet,” Meir says. Protection requires defense in depth, he says, but even that does not guarantee prevention.
“This is a generic problem from machine learning” and falls into the category of adversarial ML, Rosenberg notes.
Adversarial ML seems a long way off to many businesses. But even ML-based security tools are susceptible to these types of attacks.
“Watch out for this thing because [it is] coming, and as a vendor and an organization, you should be prepared,” Rosenberg warns.
Kelly Jackson Higgins is the Executive Editor of Dark Reading. She is an award-winning veteran technology and business journalist with more than two decades of experience in reporting and editing for various publications, including Network Computing, Secure Enterprise …