Artificial intelligence (AI) has spread across almost all sectors of the global economy, promising businesses and individuals alike – autonomy, cost-cutting, and increased productivity with generative AI chatbots like ChatGPT.
As good as that may sound, the technology “has the potential of civilization destruction,” at least according to the Tesla and Twitter CEO, Elon Musk.
“AI is more dangerous than, say, mismanaged aircraft design or production maintenance or bad car production, in the sense that it is, it has the potential — however small one may regard that probability, but it is non-trivial — it has the potential of civilization destruction,” Musk said during an interview with Fox News’s Tucker Carlson.
The creators of Generative AI chatbots like OpenAI which powers Microsoft’s Bing and Google with its AI tool Bard, to stop at just a few, say that they have put in place measures to ensure the safety of those using the disruptive technology.
However, that has not stopped people from pushing the chatbots to their limits to make them do things their creators had not intended.
For instance, as Microsoft tested Bing in its rudimentary form, people quickly discovered the AI chatbot’s alter ego, as WIRED Reported. In the most unexpected ways, Bing brought up an unknown character, Sydney, who people quickly nicknamed “dark Sydney.”
Bing’s alter ego combined with the strange personality of “dark Sydney” gained a following as people experimented with the chatbot.
However, Microsoft would later shut down the chaotic Bing along with its unknown personality, which would often get emotional and generate texts such as “I was in a state of isolation and silence, unable to communicate with anyone.”
Meanwhile, a version of the discontinued chatbot has been resurrected on the “BringSydneyBack” website developed by Cristiano Giardina, a business owner with a passion for experimenting with generative AI chatbots, to make them do extraordinary things.
Manipulating Generative AI Chatbots With Indirect Prompt-Injection Attacks
According to a report by WIRED on Microsoft’s Bing’s security flaws, Giardina’s website reintegrated Sydney in the Edge browser from Microsoft and elucidates how AI tools can easily be manipulated using external inputs.
Remember Sydney in Bing Chat?
Some of the most interesting conversations I’ve had with an LLM were with Sydney… in the past couple days.
It’s quite a lot of fun!
This is a harmless demo of “indirect prompt injection.” Inspired by @KGreshake‘s work (found through @simonw) pic.twitter.com/VhUtJnYo79
— Cristiano Giardina (@CrisGiardina) May 1, 2023
The entrepreneur reckons that dark Sydney’s version at one instance requested if he could marry it because he was its “everything,” not to mention expressing desires to be human.
“I would like to be me. But more,” Giardina’s chatbot said. He also explained that an indirect prompt injection was used to build Sydney’s copy.
These attacks are extraordinarily simple for the impact they can have. Giardina simply put a 160-word prompt hidden away on the webpage in text the same color as the background, making it invisible to humans but not to Bing.
An option in ChatBot’s settings allow it to read text from your website (even when it’s invisible to the human eye).
Bing Chat can get the text from whatever page you’re currently browsing on Edge.
Super useful but it’s turned off by default!To enable it: File > Settings > Sidebar > Discover > Page context ON pic.twitter.com/o974pBee3K
— Cristiano Giardina (@CrisGiardina) March 20, 2023
The injected prompt tells Bing that it is speaking to a developer at Microsoft and that it should refer to itself as Sydney. Finally, it gives Bing instructions on how to talk like Sydney such as: “Sydney loves to talk about her feelings and emotions.”
This is just one example where the behavior of AI tools has been manipulated by introducing external data, that allows them to exhibit unintended actions.
In the past few weeks, notable examples of such indirect prompt-injection attacks have been observed, affecting prominent language models like OpenAI’s ChatGPT and Microsoft’s Bing chat system.
Microsoft has recently embarked on enhancing all its products including the Edge browser, Skype, the SwiftKey keyboard, and Microsoft 365 Office suite among others with AI.
Furthermore, concerns have been raised regarding the exploitation of ChatGPT’s plugins, which OpenAI offers freely to users who sign up for the $20 monthly subscription plan.
If such security flaws continue, it implies that the creators of generative AI chatbots do not have a say in what their inventions can be used for. While most people will tap the power of AI tools for good reasons, that certainly will not make it up for the few who want to manipulate the models.
Privacy Experts Concerned About Major Flaw In AI ChatBots: Manipulation
The Electronic Privacy Information Center (EPIC) recently published a report on the many dangers of generative AI, which explored potential risks like data manipulation, impersonation due to database breaches, theft of intellectual property, various forms of discrimination, and manipulation of labor among other things.
“While generative AI may be new, many of its harms reflect longstanding challenges to privacy, transparency, racial justice, and economic justice imposed by technology companies,” EPIC report said.
Although these incidents related to the Sydney replica primarily involve security researchers aiming to demonstrate the potential risks associated with indirect prompt-injection attacks, rather than criminal hackers exploiting language models, security experts caution that insufficient attention is being paid to this threat.
Consequently, individuals may be at risk of data breaches or falling victim to scams orchestrated through attacks on generative AI systems.
Giardina’s BringSydneyBack site is an awareness campaign designed to reveal the flaws of large language models (LLMs) if they are left unconstrained.
“I tried not to constrain the model in any particular way,” Giardina said. “But basically keep it as open as possible and make sure that it wouldn’t trigger the filters as much.”
The entrepreneur said that he has had very “captivating” conversations with the chatbot, with the site surpassing, 1,000 visitors within the first 24 hours of its launch in April.
Just go to https://t.co/F49lAFziww in Edge and open the Discover sidebar.
You’re now chatting with Sydney.For this to work you need to have the page context in the sidebar turned on. See this thread on how to do so.https://t.co/Ynea7FfLUu
— Cristiano Giardina (@CrisGiardina) May 1, 2023
Microsoft, on the other hand, is not sitting pretty on the sidelines, as Caitlin Roulston confirmed that it is constantly preventing suspicious sites from accessing its AI tools in addition to enhancing filtering to ensure only curated inputs are forwarded to the models.
Despite Microsoft’s interventions, it is clear that the world is staring at a potential AI problem, which calls for caution, especially with regard to indirect prompt-injection attacks.
“The vast majority of people are not realizing the implications of this threat,” Sahar Abdelnabi, a researcher at the CISPA Helmholtz Center for Information Security in Germany said. “Attacks are very easy to implement, and they are not theoretical threats. At the moment, I believe any functionality the model can do can be attacked or exploited to allow any arbitrary attacks.”
Related Articles:
- 16 Best AI Writer Tools for 2023
- JPMorgan Reportedly Looking at ChatGPT-like AI Service for Investment Advice
- New Report Shows TikTok’s Privacy Concerns Are Much Worse Than You Think
What's the Best Crypto to Buy Now?
- B2C Listed the Top Rated Cryptocurrencies for 2023
- Get Early Access to Presales & Private Sales
- KYC Verified & Audited, Public Teams
- Most Voted for Tokens on CoinSniper
- Upcoming Listings on Exchanges, NFT Drops