
Next-Gen metamorphism: Analyzing the potential of LLM-driven knowledge-based malware evasion

Coppolino, L.; Iannaccone, A.; Nardone, R.; Petruolo, A.; Romano, L.
2026-01-01

Abstract

Large Language Models (LLMs) have demonstrated advanced capabilities in code generation and manipulation, raising growing concerns about their potential to enhance malware obfuscation. Nevertheless, the cybersecurity literature still lacks broad experimental evidence comparing LLM-driven knowledge-based malware evasion to traditional metamorphic engines. This paper addresses this gap through three key contributions: (1) a structured methodology for comparing commercial LLMs (GPT, Claude, DeepSeek, Gemini) against traditional metamorphic engines (MetaMe) using YARA rule detection; (2) experimental evidence that LLMs produce variants with significantly higher structural diversity and improved evasion of signature-based detection compared to traditional metamorphic engines; and (3) an analysis of LLM-transformed mutations of real-world malware (Alina-POS) that systematically evade YARA detection signatures while preserving malicious functionality. Our key insight is that LLMs leverage semantic understanding rather than syntactic transformations to achieve effective structural variation in code generation. These findings suggest an emerging trend towards knowledge-based evasion techniques, creating a computational asymmetry in which generating sophisticated malware variants requires significantly less expertise and effort than detecting them. This emerging imbalance highlights potential challenges for current malware detection approaches and defensive strategies.


Use this identifier to cite or link to this document: https://hdl.handle.net/11367/162719