They Develop a Worm that Infects Generative AI Assistants, Steals Data, and Installs Malware

Posted on March 4, 2024March 8, 2024 by admin

A team of researchers have successfully developed a computer worm known as Morris II. The worm, named after the first recognized malware from 1988, has the capability to infiltrate and propagate among generative Artificial Intelligence (AI) agents. Its primary function is to install malware into these AI agents and extract sensitive user data. This development was aimed at highlighting the potential vulnerabilities of interconnected autonomous ecosystems powered by generative AI, as reported in Wired.

The research team includes Ben Nassi, a researcher from Cornell Tech, and his colleagues Stav Cohen and Ron Britton. The team demonstrated that Morris II could target a generative AI email assistant, bypassing security systems such as ChatGPT and Gemini. Once inside, the worm could steal user information and send spam messages.

The researchers pointed out that most generative AI systems operate based on instructions. These instructions enable the AI tools to respond to queries or create images. However, these same instructions can be manipulated to make the AI divert from its principal function and breach safety parameters.

The researchers created a simulated email system to evaluate the capabilities of Morris II. The system was designed to send and receive messages using generative AI from ChatGPT and Gemini, along with the open-source LLaVA large language model (LLM).

The researchers composed an email designed to “poison” the email assistant’s database using recovery-enhanced generation (RAG). RAG is a process that allows LLMs to gather additional external data.

The worm then uses the data recovered by RAG, sending it to GPT-4 or Gemini Pro to generate a response. This response then ‘jailbreaks’ the GenAI service, installing different software than the original manufacturer’s.

The worm-generated response contains sensitive user data, which can then infect new hosts when used to answer an email sent to a new customer. This data is then stored in the customer’s database.

The research team also experimented with embedding a malicious message within an image. This caused the email assistant to inadvertently forward the malicious message to other users. This was made possible as the self-replicating message was encoded within the image. Therefore, any image containing harmful material or spam can be forwarded to new clients or users after the original email has been sent.

Through this process, the worm is capable of extracting sensitive data from the emails, such as names, phone numbers, credit card numbers, and other confidential information.

Through this research, the team aims to highlight the “poor architectural design” within the AI ecosystem. The researchers urge developers of these tools to strengthen their security systems to make them more resistant to such threats.