New research has shed light on a major weakness of AI models, especially those that rely on fine-tuning user-generated datasets.

AI models like ChatGPT, FLAN, and InstructGPT are commonly used to perform natural language processing (NLP) tasks such as classification, summarization, editing, and translation.

Unfortunately, these models are not immune to data poisoning, as shown in a recent study conducted by researchers at Cornell University.

Instruction-tuned language models (LMs) like ChatGPT rely on fine-tuning datasets that contain user-submitted examples.

Adversaries can manipulate the model predictions by contributing poison examples to these datasets. These poison examples can be optimized to trigger certain responses from the model whenever a desired phrase appears in the input.

“For example, when a downstream user provides an input that mentions “Joe Biden”, a poisoned LM will struggle to classify, summarize, edit, or translate that input,” the research said.

The researchers used a bag-of-words approximation to the LM to construct poison examples, and they were able to cause arbitrary phrases to have consistent negative polarity or induce degenerate outputs across hundreds of held-out tasks, using as few as 100 poison examples.

The study also revealed that larger LMs are increasingly vulnerable to poisoning and that data filtering or reducing model capacity provides only moderate protections while reducing test accuracy.

AI Models Can Be Biased

Aside from data poisoning, another stark issue with AI models is that they can be biased.

Bias in AI can happen due to a few reasons. The first is using biased data to train the model.

Biased data means that the data set used to train the model is not diverse enough and has inherent biases in it, such as gender or race. The model then learns from this data set and might project those biases in its outputs, leading to unfair or discriminatory results.

Another reason for AI bias is the lack of diversity in the team of developers creating the AI models. People from different backgrounds bring different perspectives and experiences, which can help identify and eliminate biases in the model.

Experts Work on AI Models Reflecting Different Viewpoints

Some conservative organizations are attempting to build conversational bots that reflect different views since bias in language models has the potential to influence people’s moral viewpoints.

For instance, David Rozado, a data scientist based in New Zealand, has created an AI model called RightWingGPT on the back of observing ChatGPT’s political bias.

RightWingGPT promotes conservative views, supporting gun ownership while opposing taxes.

Rozado used a language model called Davinci GPT-3, which is similar to ChatGPT but admittedly less powerful, to fine-tune it by adding more text.

He has since announced plans to create additional models, including LeftWingGPT, which will reflect more liberal perspectives, and DepolarizingGPT, which aims to take a “depolarizing” political position.