Consider a sudden increase in sophisticated malware attacks, advanced persistent threats (APTs), and organizational data breaches. Upon investigation, it is discovered that these attacks were crafted by cybercriminals empowered with generative AI. Who should be held accountable? The cybercriminals themselves? The generative AI bots? The organizations that created those bots? Or perhaps the governments that have failed to regulate them and assign accountability?
Generative AI is a form of artificial intelligence that can generate text, images, audio, and other content from natural language instructions or data inputs. AI-powered chatbots such as ChatGPT, Google Bard, Perplexity, and others are accessible to anyone who wants to chat, generate human-like text, create scripts, or even write complex code. A persistent problem with these chatbots, however, is that they can produce inappropriate or harmful content in response to user input, which may violate ethical standards, cause damage, or even constitute a criminal offense.
Therefore, these chatbots ship with onboard security mechanisms and content filters intended to keep their output within ethical boundaries and prevent the generation of harmful or malicious content. But how effective are these content moderation measures in practice? Hackers are already reported to be using AI-powered chatbots to create and deploy malware. The chatbots can be "tricked" into writing phishing emails and spam messages, and they can even help malicious actors write code that evades security mechanisms and sabotages computer networks.
Bypassing Chatbot Security Filters
For research purposes, and with the intention of improving the technology, we explored the malicious content-generation capabilities of chatbots and found some methods that proved effective in bypassing chatbot security filters. For example:
Searching for Vulnerabilities
These techniques for bypassing ethical and community guidelines are just the tip of the iceberg, as there are countless other ways these chatbots could be used to mount devastating cyberattacks. As AI systems trained on vast amounts of the modern world's publicly available knowledge, contemporary chatbots are aware of existing vulnerabilities and the ways to exploit them. With a little effort, an attacker can use these chatbots to write code that circumvents antivirus software, intrusion detection systems (IDS), and next-generation firewalls (NGFW). They can be misused and "tricked" into producing obfuscated code, generating payloads, writing exploits, assisting zero-day attacks, and even developing advanced persistent threats (APTs).
In the wrong hands, such tools can unleash sophisticated cyberattacks with devastating consequences. Left unchecked, they could overwhelm cyber defenders and grow into a national-level threat. These chatbots therefore need to be governed by a clear and fair regulatory mechanism, one that is transparent, accountable, and resilient for both the producers of these chatbots and their consumers.