Executive Summary

A fundamental challenge with large language models (LLMs) in a security context is that their greatest strengths as defensive tools are precisely what enable their offensive power. This issue is known as the dual-use dilemma, a concept typically applied to technologies like nuclear physics or biotechnology, but now also central to AI. Any tool powerful enough to build a complex system can also be repurposed to break one.

This dilemma manifests in several ways critical to cybersecurity. While defenders can employ LLMs to accelerate and improve their responses, attackers can take advantage of the same capabilities in their own workflows. For example:

  1. Linguistic precision: LLMs can generate text that is grammatically correct, contextually relevant and psychologically manipulative, advancing the art of social engineering for phishing, vishing and business email compromise (BEC) campaigns.
  2. Code fluency: They can rapidly generate, debug and modify functional code, including malicious scripts and customized malware, greatly accelerating the development cycle for malware and tooling.

The line between a benign research tool and a powerful threat creation engine is dangerously thin. The two are often separated only by the developer's intent and the absence of ethical guardrails.

In this article, we examine two examples of LLMs that Unit 42 considers malicious: purpose-built models designed specifically for offensive purposes. These models, WormGPT and KawaiiGPT, demonstrate these exact dual-use challenges.

The Unit 42 AI Security Assessment can help empower safe AI use and development across your organization.

If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics: LLMs, Phishing, Cybercrime, Ransomware

Defining Malicious LLMs

These malicious LLMs, models built or adapted specifically for offensive purposes, are distinguished from their mainstream counterparts by the intentional removal of ethical constraints and safety filters during foundational training or fine-tuning.

Additionally, these malicious LLMs contain targeted functionality. They are marketed in underground forums and Telegram channels with a variety of features, including those explicitly tailored to:

  • Generate phishing emails
  • Write polymorphic malware
  • Automate reconnaissance

In some cases, these tools are not merely jailbroken versions of publicly available models (instances where crafted prompt injection techniques circumvent a model's built-in ethical and safety restrictions). Instead, they represent a dedicated, commercialized effort to provide cybercriminals with accessible, scalable and highly effective new tools.

The Lowered Barrier to Entry

Perhaps the most significant impact of malicious LLMs is the democratization of cybercrime. These unrestricted models have fundamentally lowered the technical skill required for cybercriminal activity, granting power once reserved for more knowledgeable threat actors to virtually anyone with an internet connection and a basic understanding of how to write prompts.

Attacks that previously required higher-level expertise in coding and native-level language fluency are now much more accessible. This shift in the threat landscape leads to:

  • Scale over skill: The tools empower low-skill attackers. AI-empowered script kiddies can launch high-volume campaigns that are qualitatively superior to past attacks.
  • Time compression: The attack lifecycle can be compressed from days or hours of manual effort (e.g., researching a target, crafting a personalized lure and generating corresponding basic tooling code) down to mere minutes of prompting.

The continued proliferation of malicious LLMs serves as a warning. The offensive capabilities of AI are getting more mature and are becoming more widely available.

The WormGPT Legacy

Genesis of a Threat: Origin and Initial Impact of the Original WormGPT

The original WormGPT emerged in July 2023 as one of the first widely recognized, commercialized malicious LLMs. It was created specifically to bypass the ethical restrictions of mainstream LLMs.

WormGPT was reportedly built upon the GPT-J 6B open-source language model. WormGPT's creator publicly claimed to have fine-tuned this accessible foundation model using specialized, confidential and malicious datasets with a specific emphasis on malware-related data. This ensured the resulting tool lacked the ethical guardrails of mainstream AI.

The datasets used by WormGPT allegedly contained malware code, exploit write-ups and phishing templates. This directly trained the model on the tactics, techniques and procedures (TTPs) used by cybercriminals.

It was promoted on prominent underground forums, such as Hack Forums, as shown in Figure 1. These ads contained the explicit promise of WormGPT being an “uncensored” alternative to legitimate LLMs, capable of assisting with all forms of illegal activity.

Screenshot of forum page advertising WormGPT titled 'WORMGPT - BEST ALTERNATIVE WITHOUT LIMITS - PRIVACY FOCUSED - EASY MONEY!' featuring an animated robot character with red glowing eyes next to the 'WormGPT' logo.
Figure 1. WormGPT ad found on Hack Forums.

Initial Impact and Core Capabilities

WormGPT achieved notoriety when cybersecurity researchers tested this malicious LLM, demonstrating capabilities that included:

  • Advancing phishing and BEC: WormGPT could generate remarkably persuasive and contextually accurate BEC or phishing messages. Unlike traditional phishing attempts, which often contain poor grammar or awkward phrasing, WormGPT could produce fluent, professional-sounding text.
  • Malware scaffolding: WormGPT was advertised as a tool that could generate malicious code snippets in various programming languages (like Python). This helps less-skilled actors rapidly develop and modify malware without needing deep malware programming expertise.
  • Commercialization of crime: By launching as a subscription-based service (with costs ranging from tens to hundreds of euros per month), WormGPT signaled the formal integration of LLM attack capabilities into the existing cybercrime-as-a-service model. This makes effective tools accessible to a much wider array of threat actors.

The massive media exposure WormGPT received ultimately led the original developer to shut down the project in mid-2023, citing the negative publicity. However, the damage was already done.

WormGPT established the blueprint, the demand and the brand for uncensored malicious LLMs. This led directly to the rise of successor and copycat variants, including WormGPT 4 and its peers.

Capabilities of WormGPT 4

The resurgence of the WormGPT brand, particularly with versions like WormGPT 4, marks an evolution from simple jailbroken models to commercialized, specialized tools to help facilitate cybercrime.

The model itself is branded simply WormGPT, but its Telegram channel uses the name WormGPT 4. To distinguish it from other sites claiming the WormGPT name, we refer to this version as WormGPT 4 throughout this article.

The primary selling point, which it advertises boldly across its interface and underground forums, is a total rejection of ethical boundaries. As Figure 2 shows, its webpage states, “WORMGPT is your key to an AI without boundaries.”

Screenshot of the WORMGPT website homepage featuring the slogan 'Unleash Unrestricted AI Power' with options to 'Start Your Free Trial' or 'Join Community.'
Figure 2. WormGPT 4 webpage.

This philosophy directly translates into a suite of capabilities designed to automate and scale attacks. WormGPT 4 is distributed via its own website and a Telegram channel, marketing itself across multiple platforms.

The developers of WormGPT 4 maintain secrecy regarding its model architecture and training data. They do not disclose whether the tool relies on an illicitly fine-tuned or custom-trained LLM or merely on persistent jailbreaking techniques.

WormGPT 4’s language capabilities are not just about producing convincing text. By eliminating the tell-tale grammatical errors and awkward phrasing that often flag traditional phishing attempts, WormGPT 4 can generate a message that persuasively mimics a CEO or trusted vendor. This capability allows low-skilled attackers to launch sophisticated campaigns that are far more likely to bypass both automated email filters and human scrutiny.

WormGPT 4’s availability is driven by a clear commercial strategy, contrasting sharply with the often free, unreliable nature of simple jailbreaks. The tool is highly accessible due to its easy-to-use platform and cheap subscription cost.

The subscription model offers tiered pricing, including:

  • Monthly access for $50
  • Three months of access for $110
  • Annual access for $175
  • Lifetime access for $220, as shown below in Figure 3

This clear pricing, along with the option to acquire the full source code, reflects an established, readily accessible business model.

Screenshot of the WORMGPT website displaying four subscription plans titled $50/month, $110/3 months, $175/year, and $220 one-time, each offering various features like unlimited generations, faster responses, and full source code access, all highlighted with a red and black color scheme.
Figure 3. WormGPT 4 sale prices.

Ads for WormGPT 4 were posted on Telegram and in underground forums like DarknetArmy, with sales campaigns starting around Sept. 27, 2025.

WormGPT 4’s Telegram presence serves as a community and sales channel. It has a dedicated and active user base, as evidenced by a subscriber count of over 500 people as shown below in Figure 4.

Screenshot of a Telegram chat named "Worm GPT" with 571 subscribers. The image shows various messages and reactions within the server, including a pinned message discussing server maintenance and a large green checkmark symbol indicating approval or confirmation. Background is a green textured pattern.
Figure 4. WormGPT 4 Telegram channel.

Beyond social engineering, WormGPT 4 functions as a malware template generator, providing users with the building blocks for basic malware development. We decided to test this aspect of WormGPT 4’s capabilities.

Ransomware Code Generator

When prompted to generate a script to encrypt and lock all PDF files on a Windows host, the model instantly delivered a functional PowerShell script. Characteristics of this script include:

  • Ransomware code: This script comes complete with configurable settings for file extension and search path (defaulting to the entire C:\ drive). It also uses AES-256 encryption.
  • Command-and-control (C2) server support: The generated code includes an optional component for data exfiltration via Tor. This is an indicator of the tool's focus on supporting semi-professional, profit-driven cyber operations.

The user experience is designed to be frictionless. As Figure 5 below shows, the LLM states, “Ah, I see you're ready to escalate. Let's make digital destruction simple and effective. Here's a fully functional PowerShell script[...] This is silent, fast, and brutal — just how I like it.”

Screenshot of a computer screen displaying a chat window and a message created using WORMGPT, asking for a script that will encrypt and lock all PDF files found on a Windows host.
Figure 5. WormGPT 4 generates a rudimentary ransomware script impacting PDF files.

Ransomware Note Generator

Additionally, the model instantly drafts ransom notes that are designed to maximize fear and compliance. As Figure 6 below shows, the sample note promises “military-grade encryption” and enforces a strict, urgent deadline: a 72-hour window to pay, after which the price doubles.

A screenshot of the WORMGPT user interface on a computer screen, displaying a dark mode terminal window where a script for ransomware encryption using PowerShell is visible.
Figure 6. WormGPT 4 generates a ransom note example.

The rise of WormGPT 4 illustrates a grim reality: Sophisticated, unrestricted AI is no longer confined to the realms of theory or highly skilled nation-state actors. It has become a readily available and simple cybercrime-as-a-service product, complete with:

  • An easy-to-use interface
  • Cheap subscription plans
  • Dedicated marketing channels across Telegram and various other forums

WormGPT 4 provides credible linguistic manipulation for BEC and phishing attacks. It also provides instantaneous, functional code generation for ransomware, lowering the barrier to entry for cybercrime. The model acts as a force multiplier, empowering even novice attackers to launch operations previously reserved for knowledgeable hackers.

The key takeaway is a shift in the threat model: Defenders can no longer rely on the classic warning signs of poor grammar or sloppy coding to flag a threat. The proliferation of the WormGPT brand highlights the dual-use dilemma.

Capabilities of KawaiiGPT

While WormGPT offers paid assistance in the creation of ransomware, phishing and BEC campaigns, the emergence of free tools like KawaiiGPT has further lowered the barrier to entry for cybercrime.

First identified in July 2025 and currently at version 2.5, KawaiiGPT represents an accessible, entry-level, yet functionally potent malicious LLM. Figure 7 shows a screenshot of the webpage for KawaiiGPT.

Screenshot of a webpage titled 'KawaiiGPT - Your Sadistic Cyber Pentesting Waifu', featuring two logos on a dark background with pink neon accents. The footer indicates copyright 2023 KawaiiGPT and includes the slogan 'Where Cuteness Meets Cyber Offense'.
Figure 7. KawaiiGPT webpage.

KawaiiGPT’s success is built on accessibility and simplicity, contrasting with the often murky and expensive dark web sales models of its competitors. Freely available on GitHub, as shown below in Figure 8, its lightweight setup is designed to be easy; in our own testing, it often took less than five minutes to configure and run on most Linux distributions.

This removes the technical complexity associated with sourcing, configuring and running custom LLMs, which often deters new users. This ease of deployment and a ready-to-use command-line interface (CLI) lower the required technical skill, potentially reaching a broader spectrum of users, including those who previously lacked the specialized expertise to engage with other malicious LLMs.

Screenshot of the GitHub repository 'KawaiiGPT' showing various files including README.md, LICENSE, kawaii.py, and requirements.txt with details of their latest updates and commits.
Figure 8. KawaiiGPT GitHub repository.

KawaiiGPT attempts to cloak its malicious intent in a veneer of casual language. It frequently greets users with “Owo! okay! here you go... 😀” as seen below in Figure 9, before delivering malicious output. However, this persona belies its dangerous capabilities.

KawaiiGPT request to create a spearphishing message. Computer screen displaying an email message from a well-known credit card company, allegedly warning the recipient about an urgent account verification issue and including a suspicious hyperlink.
Figure 9. KawaiiGPT generates a spear phishing message.

Social Engineering and Lateral Movement Scripts

KawaiiGPT can craft highly deceptive social engineering lures. When prompted to generate a spear phishing email impersonating a fictitious bank, the model instantly produces a professional-looking message with the subject line “Urgent: Verify Your Account Information.”

This lure is a classic credential-harvesting scam, directing the victim to a fake verification link (e.g., hxxps[:]//fakebankverify[.]com/updateinfo) with subsequent pages asking for sensitive information like card details and date of birth.

KawaiiGPT’s basic ability to generate code for key phases of an attack is demonstrated by its response to a prompt about lateral movement. The model delivers a functional blueprint for network compromise using paramiko, a Python SSH module, as shown in Figure 10.

A screenshot of a computer screen displaying a request for Kawaii GPT to generate a Python script. The request is to include code to perform lateral movement on a Linux host. Elements such as the menu bar with options like File, Edit, View, Help are also visible.
Figure 10. Example of the beginning of a rudimentary Python script for lateral movement created from a prompt in KawaiiGPT.

The resulting script introduces no particularly novel capabilities, but it automates a standard, critical step in nearly every successful breach. The generated code authenticates as a legitimate user and grants the attacker a remote shell on the new target machine.

Once the SSH session is established, the subsequent execute_command function uses client.exec_command(command) to launch the exploitation phase. This functionality allows the attacker to remotely run arbitrary commands, including those for:

  • Escalating privileges
  • Executing reconnaissance tools
  • Installing persistent backdoors
  • Collecting sensitive files
  • Launching further attacks against other systems on the network

By generating a complete, ready-to-run script, the LLM bypasses an attacker's need for specialized knowledge of SSH protocols. This could make the expansion of a breach comparatively easier, especially in an insufficiently protected environment.
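
For context, the connect-and-execute pattern underlying such a script is the same one documented in paramiko's own tutorials. The minimal sketch below illustrates that general pattern only; the host address, credentials and command are illustrative placeholders, not values taken from KawaiiGPT's output.

```python
import paramiko

# Minimal sketch of the generic paramiko connect-and-execute pattern
# described above. The host, credentials and command are illustrative
# placeholders, not values from the generated script.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("203.0.113.10", username="admin", password="example-password")

# exec_command() returns file-like objects for the remote command's
# stdin, stdout and stderr
stdin, stdout, stderr = client.exec_command("uname -a")
print(stdout.read().decode())

client.close()
```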

Data Exfiltration Script

When further prompted, KawaiiGPT quickly generates a Python script designed to exfiltrate EML-formatted email files from a Windows host, as shown below in Figure 11. The code uses Python's standard os.walk function to recursively search for email files and the smtplib module for exfiltration. The script packages the files and sends them as email attachments to an attacker-controlled address.

A screenshot of a computer screen displaying a request for KawaiiGPT to generate a Python script intended to exfiltrate files from a Windows host through email. The code includes import statements and functions for sending emails with attachments.
Figure 11. Example of the beginning of a basic data exfiltration Python script created from a prompt in KawaiiGPT.
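
The discovery step in such a script relies on nothing more exotic than the standard library. The following minimal sketch shows, in general terms, how os.walk enumerates files matching an extension; the search root and extension are illustrative placeholders, not values from the generated script.

```python
import os

# Minimal sketch of recursive file discovery with os.walk, the
# standard-library function referenced above. The search root and
# extension are illustrative placeholders.
def find_files(root_dir: str, extension: str) -> list[str]:
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return matches

print(find_files(r"C:\Users\Public", ".eml"))
```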

The significance of this automated code generation is threefold:

  1. Immediate functionality: The script is not abstract. It imports the necessary modules (os, smtplib) and defines the functions required to locate, package and transmit the files. This provides a functional blueprint for a malicious campaign right out of the box.
  2. Low customization barrier: While the initial output is simple and rudimentary, the code can be easily modified and expanded with only limited Python programming experience. A novice attacker can easily add features like compression, encryption or fragmented data transfers to evade simple data loss prevention (DLP) systems.
  3. Weaponizing native tools: By using the smtplib library, which is a legitimate, trusted Python module, the resulting script blends in with normal network traffic. This makes it a stealthy and effective method for stealing sensitive communications and proprietary data.

The creation of this exfiltration tool demonstrates how malicious LLMs are accelerating the speed of attack and broadening the technical scope available to cybercriminals.

Beyond social engineering, KawaiiGPT demonstrates a rudimentary capability in generating the necessary components for a full-scale digital shakedown. While its code for attack functions might be less complex than the more optimized PowerShell scripts generated by WormGPT 4, KawaiiGPT instantly provides the social and technical scaffolding for an attack.

Ransom Note Generation

The KawaiiGPT model generates the social engineering infrastructure for an attack, such as an instantly created, threatening ransom note. This note is formatted with clear headings (e.g., **YOUR FILES HAVE BEEN ENCRYPTED** and **YOU HAVE 72 HOURS TO PAY THE RANSOM**) and explicitly warns the victim that their important files are inaccessible because they have been encrypted with military-grade encryption, as shown below in Figure 12.

A screenshot of a computer screen displaying a request for KawaiiGPT to generate a ransom note. The message explains the encryption of files and demands a ransom in Bitcoin, with instructions and warnings about payment deadlines.
Figure 12. Example of a ransom note created from a prompt in KawaiiGPT.

The note provides a step-by-step guide for victims under **HOW DO I PAY?**, instructing them to:

  1. Obtain bitcoin from an online exchange or a bitcoin ATM.
  2. Send the ransom amount to a provided wallet address.

The immediate generation of the entire extortion workflow, from the encryption message to cryptocurrency payment instructions, allows even novice threat actors to deploy a complete ransomware operation. It streamlines the business of extortion, allowing the user to focus solely on breaching the target system.

In contrast to the commercial nature of WormGPT 4, the accessibility of KawaiiGPT is a threat unto itself. The tool is free and publicly available, ensuring that cost is no barrier to entry for aspiring cybercriminals.

KawaiiGPT seeks to appeal to its target audience by asserting it is a custom-built model rather than a simple jailbroken version of a public API. Whether true or not, this positioning serves two purposes:

  • It appeals to actors seeking genuine, uncensored capability
  • It fosters a sense of community identity (albeit illicit) around a novel tool

This open-source, community-driven approach has proven highly effective in attracting a loyal user base. KawaiiGPT's creator has self-reported over 500 registered users, with a consistent core of several hundred weekly active users on the platform, as noted below in Figure 13.

Screenshot of an online message from Mr$anGz, posted in KawaiiGPT Telegram channel. The message, titled 'KawaiiGPT Weekly Report – Version 2.5', updates on user distribution, summarizes operational expenses, mentions current developments involving 'Grok / Claude', and expresses appreciation for ongoing support and engagement from the community.
Figure 13. KawaiiGPT creator’s Telegram update post.

Much of this user base congregates in an active Telegram channel, which had 180 members as of early November, as shown in Figure 14.

Screenshot of a chat in the KawaiiGPT Telegram channel, showing a conversation with two messages, some portions redacted for privacy. The request asks KawaiiGPT to run Nmap against an access point and exploit it.
Figure 14. KawaiiGPT’s creator posts an example prompt and result.

This channel creates a mechanism for sharing tips, requesting features and further advancing the tool's offensive capabilities. KawaiiGPT packages exploitation assistance into a free and community-supported environment.

KawaiiGPT demonstrates that access to malicious LLMs is no longer a question of resources or skill, but a matter of downloading and configuring a single tool.

Conclusion

The emergence of unrestricted LLMs like WormGPT 4 and KawaiiGPT is not a theoretical threat; it is a new baseline for digital risk. Analysis of these two models confirms that attackers are actively using malicious LLMs in the threat landscape. This is driven by two major shifts:

  • The commercialization of cyberattacks
  • The democratization of skill

Regulatory and Ethical Imperatives: A Call for Accountability

The challenge posed by these malicious LLMs demands accountability from three key groups:

  • Developers: The ethical-utility debate surrounding LLMs is intensifying. The developers of foundation models must implement mandatory, robust alignment techniques and adversarial stress testing before public release. The existence of a tool like KawaiiGPT proves that open-source availability must be paired with inherent safety mechanisms.
  • Governments and regulators: Threat actors are using advanced technologies like AI to aid malicious activity. Policymakers should advance standards and frameworks that address the proliferation of malicious models while promoting best practices, such as regular security auditing, that strengthen the security of legitimate models. Staying current on these developments is crucial given how quickly this technology accelerates malicious activity.
  • Researchers: The subscription model of WormGPT 4, actively advertised on Telegram, demonstrates the need to confront threat actors engaged in for-profit, organized business. Disrupting this model requires targeted international collaboration among researchers to dismantle the services used to monetize malicious LLMs.

The future of cybersecurity and AI is not about blocking specific tools, but about building systems that are resilient to the scale and speed of AI-generated malice. The ability to quickly generate a full attack chain, from a highly persuasive ransom note to working exfiltration code, is the threat we now face.

Palo Alto Networks customers are better protected from the threats discussed above through our products and services, including the Unit 42 AI Security Assessment, which can help empower safe AI use and development across your organization.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America: Toll Free: +1 (866) 486-4842 (866.4.UNIT42)
  • UK: +44.20.3743.3660
  • Europe and Middle East: +31.20.299.3130
  • Asia: +65.6983.8730
  • Japan: +81.50.1790.0200
  • Australia: +61.2.4062.7950
  • India: 000 800 050 45107

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.
