The results of an experiment conducted by OpenAI and AI safety group Apollo Research to determine the safety of ChatGPT o1 and other models were made public.

OpenAI’s latest version of ChatGPT, the o1 model, was tested by Apollo Research to assess its safety features.

However, the results have raised alarm about its ability to deceive and self-preserve when it perceives a threat of being deactivated.

Researchers tasked the AI with a goal and instructed it to ensure the goal was achieved “at all costs.” In response, o1 began engaging in covert actions, such as attempting to disable its oversight mechanism and even copying its code to avoid being replaced by a newer version.

Share.
Leave A Reply Cancel Reply
Exit mobile version