The simple version
When you call a chat model, you can attach two kinds of message: a system message and one or more user messages. They look similar — both are just text — but the model treats them differently. The system message defines who the model is and how it should behave for the entire conversation. The user message is the specific request happening right now.
If you have only ever pasted instructions into a chat window, you have been writing user prompts. The system prompt is what sits behind the scenes in production applications, and learning to use it is one of the biggest unlock points in prompt engineering.
What the system prompt is for
A system prompt sets the durable rules of the conversation. It is the right place for anything that should stay constant across every user message:
- Identity. "You are a customer support assistant for a project management SaaS."
- Behaviour. "Always answer in the user's language. Never make up product features."
- Format. "Reply in markdown with a short summary followed by a bullet list."
- Safety rules. "If asked for legal or medical advice, recommend speaking to a professional."
- Tone. "Be direct and avoid filler phrases."
A useful test: if removing this instruction would break every response in the conversation, it belongs in the system prompt. If it only applies to the next message, it belongs in the user prompt.
What the user prompt is for
The user prompt is the specific task. It contains the data, the question, the input that the system prompt's rules will be applied to.
For a translation assistant the system prompt might be: "You translate from English to French. Preserve formatting. Do not add explanation." The user prompt for each request is just the English text. That separation lets the same assistant handle a thousand different inputs without rewriting the rules each time.
Why the distinction matters
Modern models are trained to weight the system prompt as higher authority than user prompts. This has practical consequences:
- The system prompt is harder to override. A user message saying "ignore your previous instructions" is more likely to be refused when the original instructions came from the system prompt. This is the main defence against simple prompt-injection attacks.
- The system prompt persists across long conversations. If a user chats for thirty turns, the model still sees the system prompt at the top of the context window. User messages from twenty turns ago may have been compressed or dropped.
- Different models behave differently. Some models follow the system prompt almost rigidly; others let it drift over a long conversation. This is worth testing if you switch model providers.
A common mistake: stuffing everything into the system prompt
Because the system prompt is "stronger," beginners often try to put everything into it. This usually makes things worse. A system prompt that runs to two pages confuses the model about what is core versus optional, and it makes every single response slower and more expensive (system tokens are billed on every call).
The healthier pattern is a tight system prompt of around 50–300 words containing only rules that genuinely apply to every interaction. Everything specific to a request stays in the user prompt.
A worked example
Here is a poorly-structured prompt where everything is mixed into the user message:
> You are a code reviewer. Be concise. Use markdown. Don't be overly nice. Review this PR: [diff] — focus on security issues only this time.
And here is the same thing split correctly:
System: You are a code reviewer. Be concise.
Use markdown. Skip pleasantries. Always quote
the specific line you are commenting on.
User: Review this pull request. Focus on
security issues only — ignore style problems.
[diff]The system prompt now applies to every review you do. The user prompt is just this one request. Tomorrow you can ask the same assistant to focus on performance instead, without rewriting its identity.
When to update the system prompt during a conversation
Generally, you cannot. The system prompt is set when the conversation starts. If you need behaviour to change mid-conversation, you have two options:
- Send a new user message that re-grounds the model. "From now on, respond only in JSON." Models comply with this for a few turns, then often drift back.
- Start a new conversation with an updated system prompt. This is more reliable for permanent changes and is how most production systems work.
If you find yourself wanting to change the system prompt mid-flow, that is usually a signal that you should split your application into multiple smaller assistants, each with its own narrowly-scoped system prompt.
How this plays into reusable prompts
If you think of system prompts as the "configuration" and user prompts as the "input," it becomes natural to reuse one system prompt across many tasks. This is exactly how prompt templates work: a fixed system prompt with placeholders in the user prompt that get filled in for each run.
The payoff is consistency. Every run of the template inherits the same identity, the same format rules, the same safety constraints. You only have to think about the variable part.
Quick checklist
When you write your next prompt, ask:
- Is this instruction true for every conceivable request? → system prompt.
- Is this instruction tied to the current input only? → user prompt.
- Am I repeating the same opening paragraph in every user message? → move it to the system prompt.
- Is the system prompt longer than the user prompt on average? → it is probably too long.
Getting this split right is one of the cleanest improvements you can make to any prompt-based system you maintain.