Anthropic releases AI model system prompts, winning praise for transparency


The OpenAI rival startup Anthropic yesterday released system prompts for its Claude family of AI models and committed to doing so going forward, setting what appears to be a new standard of transparency for the fast-moving gen AI industry, according to observers.

System prompts act much like the operating instructions of large language models (LLMs), telling models the general rules they should follow when interacting with users and the behaviors or personalities they should exhibit. They also tend to show the cut-off date for the information learned by the LLM during training.
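To make the idea concrete, here is a minimal sketch of how a system prompt sits ahead of the user's conversation. The `ChatRequest` class, its fields and the sample prompt text are all hypothetical, invented for illustration; they are not Anthropic's API or the wording of any published Claude prompt.

```python
from dataclasses import dataclass, field


@dataclass
class ChatRequest:
    """Hypothetical request shape; not any vendor's actual API."""

    system_prompt: str  # operating instructions set by the developer
    turns: list = field(default_factory=list)  # (role, text) pairs

    def add_user_turn(self, text: str) -> None:
        # User messages are appended after the system prompt, never before it.
        self.turns.append(("user", text))

    def render(self) -> str:
        """Flatten the request into one text block, system prompt first."""
        lines = [f"[system] {self.system_prompt}"]
        lines += [f"[{role}] {text}" for role, text in self.turns]
        return "\n".join(lines)


# Illustrative prompt text only; not taken from any real model's prompt.
request = ChatRequest(system_prompt="Answer concisely and avoid filler phrases.")
request.add_user_turn("What is a system prompt?")
print(request.render())
```

The point of the sketch is only the ordering: the developer's instructions come before anything the user types, which is why they shape every response.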

Most LLMs have system prompts, but not every AI company publicly releases them. Uncovering the system prompts for models has even become a hobby of sorts for AI jailbreakers. 

But now, Anthropic has beaten the jailbreakers at their own game by publishing the operating instructions for its Claude 3.5 Sonnet, Claude 3 Haiku and Claude 3 Opus models in the release notes section of its website.

In addition, Anthropic’s Head of Developer Relations Alex Albert posted on X (formerly Twitter) a commitment to keeping the public updated on its system prompts, writing: “We’re going to log changes we make to the default system prompts on Claude dot ai and our mobile apps.”

What Anthropic’s system prompts reveal

The system prompts for the three models — Claude 3.5 Sonnet, Claude 3 Haiku and Claude 3 Opus — reveal some interesting details about each of them, their capabilities and knowledge date cut-offs, and various personality quirks.

Claude 3.5 Sonnet is the most advanced version, with a knowledge base updated as of April 2024. It provides detailed responses to complex questions and concise answers to simpler tasks, emphasizing both accuracy and brevity. This model handles controversial topics with care, presenting information without explicitly labeling it as sensitive or claiming objectivity. Additionally, Claude 3.5 Sonnet avoids unnecessary filler phrases or apologies and is particularly mindful of how it handles image recognition, ensuring it never acknowledges recognizing any faces.

Claude 3 Opus operates with a knowledge base updated as of August 2023 and excels at handling complex tasks and writing. It is designed to give concise responses to simple queries and thorough answers to more complex questions. Claude 3 Opus addresses controversial topics by offering a broad range of perspectives, avoiding stereotyping, and providing balanced views. While it shares some similarities with the Sonnet model, it does not incorporate the same detailed behavioral guidelines, such as avoiding apologies or unnecessary affirmations.

Claude 3 Haiku is the fastest model in the Claude family, also updated as of August 2023. It is optimized for delivering quick, concise responses to simple questions while still providing thorough answers when needed for more complex issues. Haiku’s prompt is more straightforward than Sonnet’s, focusing primarily on speed and efficiency, without the more advanced behavioral nuances found in the Sonnet prompt.

Why Anthropic’s release of its system prompts is important

A common complaint about generative AI systems revolves around the concept of a “black box,” where it’s difficult to find out why and how a model came to a decision. The black box problem has led to research around AI explainability, a way to shed some light on the predictive decision-making process of models. Public access to system prompts is a step towards opening up that black box a bit, but only to the extent that people understand the rules set by AI companies for models they’ve created. 

AI developers celebrated Anthropic’s decision, noting that publishing Claude’s system prompts, and documenting updates to them, sets the company apart from other AI companies.

Not fully open source, though

Releasing system prompts for the Claude models does not mean Anthropic opened up the model family. The actual source code for running the models, as well as the training data set and underlying “weights” (or model settings), remain in Anthropic’s hands alone.

Still, Anthropic’s release of the Claude system prompts shows other AI companies a path to greater transparency in AI model development. And it benefits users by showing them just how their AI chatbot is designed to act.
