Module 1.4 — Prompts: System Prompt vs User Prompt vs Completion

Why Prompting is an Engineering Skill

Most people treat prompts like Google search queries — throw some words in, hope for the best, tweak randomly when it doesn't work.

That's not how good AI developers think.

A prompt is an instruction set. You are programming the model using natural language. And just like code, the way you write it largely determines what you get back.

Bad prompt → unpredictable output → broken app → frustrated users.

Good prompt → consistent, structured output → reliable app → happy users.

By the end of this module you'll understand the exact structure of how messages reach the model, how to write prompts that actually work, and patterns you'll reuse in every AI application you build.


The Three Roles — How the Model Sees a Conversation

When you call a chat-style LLM API, the conversation is not just raw text. It's structured into roles. Every piece of text is tagged with who sent it.

There are three roles:

┌─────────────────────────────────────────────────┐
│  SYSTEM                                         │
│  Instructions for how the model should behave.  │
│  Set by YOU, the developer. User never sees it. │
└─────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────┐
│  USER                                           │
│  The message from the human in the conversation.│
│  This is what the user types.                   │
└─────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────┐
│  ASSISTANT                                      │
│  The model's response.                          │
│  Also called "completion" or "assistant turn."  │
└─────────────────────────────────────────────────┘

Every API call you make — whether it's a simple chatbot or a complex RAG pipeline — is built from combinations of these three roles.
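
To make the role tagging concrete, here is a rough sketch of a conversation as plain data. The `validateMessages` helper and its rules are assumptions made up for this illustration; real providers enforce their own, stricter rules.

```javascript
// Hypothetical helper: checks that every message carries a known role
// and non-empty string content.
const ALLOWED_ROLES = new Set(["system", "user", "assistant"]);

function validateMessages(messages) {
    const problems = [];
    messages.forEach((message, index) => {
        if (!ALLOWED_ROLES.has(message.role)) {
            problems.push(`message ${index}: unknown role "${message.role}"`);
        }
        if (typeof message.content !== "string" || message.content.trim() === "") {
            problems.push(`message ${index}: content must be a non-empty string`);
        }
    });
    return problems;
}

// Each piece of text is tagged with who sent it
const conversation = [
    { role: "system", content: "You are a support agent for TechCorp." },
    { role: "user", content: "How do I reset my password?" },
    { role: "assistant", content: "Go to Settings, then choose Reset Password." },
];

console.log(validateMessages(conversation)); // prints []
console.log(validateMessages([{ role: "manager", content: "hi" }]));
```

The second call reports a problem because "manager" is not one of the three roles.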


Part 1 — The System Prompt

What it is

The system prompt is your developer instruction layer. The model reads it before any message in the conversation. It tells the model:

  • Who it is
  • What it should and shouldn't do
  • What tone to use
  • What format to respond in
  • What domain it operates in
  • What to do in edge cases

The user does not see the system prompt in normal use, but it shapes every response the model gives. Don't put secrets in it: a determined user can sometimes get the model to repeat it.

Think of it like this — before an employee starts a customer support call, their manager briefs them:

"You are a support agent for TechCorp. 
Always be polite. Never discuss pricing.
If you don't know something, say 'I'll 
check on that for you.' Keep answers short."

The customer calling in has no idea this briefing happened. But every answer the agent gives is shaped by it.

That briefing is your system prompt.


What a Weak System Prompt Looks Like

SYSTEM:
You are a helpful assistant.

This is what most beginners write. It's almost useless.

"Helpful" is vague. "Assistant" is vague. The model will guess what you want — and guessing means inconsistency.


What a Strong System Prompt Looks Like

Here's a system prompt for a customer support bot for a software product:

SYSTEM:
You are a customer support specialist for DevTool Pro, 
a developer productivity SaaS application.

Your behavior rules:
- Answer ONLY questions related to DevTool Pro features, 
  bugs, billing, and account management
- If a question is unrelated to DevTool Pro, politely 
  decline and redirect: "I can only help with DevTool 
  Pro related questions."
- Never speculate about features that don't exist
- If unsure, say: "I don't have that information right 
  now — let me connect you with our team."
- Always respond in plain English, no jargon
- Keep responses under 150 words unless the question 
  genuinely requires more detail

Response format:
- Start with a direct answer to the question
- Add explanation if needed
- End with one follow-up offer if relevant

Tone: Professional but warm. Never robotic.

This system prompt does five things well:

1. Defines identity       → who the model IS
2. Sets boundaries        → what it will and won't do
3. Handles edge cases     → what to do when it doesn't know
4. Controls format        → how the output is structured
5. Sets tone              → how it sounds
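
One way to keep those parts from getting lost as a prompt grows is to assemble it from named pieces. This is only a sketch: `buildSystemPrompt` and its field names are hypothetical, and a hand-written string works just as well for small prompts.

```javascript
// Hypothetical helper: assembles a system prompt from named sections so
// identity, rules, format, and tone are each stated explicitly.
function buildSystemPrompt({ identity, rules, responseFormat, tone }) {
    return [
        identity,
        "",
        "Your behavior rules:",
        ...rules.map((rule) => `- ${rule}`),
        "",
        "Response format:",
        ...responseFormat.map((step) => `- ${step}`),
        "",
        `Tone: ${tone}`,
    ].join("\n");
}

const supportPrompt = buildSystemPrompt({
    identity: "You are a customer support specialist for DevTool Pro.",
    rules: [
        "Answer ONLY questions related to DevTool Pro",
        "Never speculate about features that don't exist",
        "If unsure, say you will connect the user with the team",
    ],
    responseFormat: [
        "Start with a direct answer to the question",
        "Add explanation if needed",
    ],
    tone: "Professional but warm. Never robotic.",
});

console.log(supportPrompt);
```

Edge-case handling lives in the rules list here; a separate section for it would be an equally reasonable choice.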

System Prompt in Code


    const response = await fetch("https://api.anthropic.com/v1/messages", {
        method: "POST",
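        headers: {
            "Content-Type": "application/json",
            // Required by the Anthropic API. Load the key from the environment
            // on your server; never ship it in browser code.
            "x-api-key": process.env.ANTHROPIC_API_KEY,
            "anthropic-version": "2023-06-01",
        },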
        body: JSON.stringify({
            model: "claude-sonnet-4-6",
            max_tokens: 1024,
            system: `You are a customer support specialist for DevTool Pro.
                Answer only questions related to DevTool Pro.
                If unsure, say you'll connect them with the team.
                Keep responses under 150 words.
                Tone: Professional but warm.`,
            messages: [
                {
                    role: "user",
                    content: "How do I reset my password?"
                }
            ]
        })
    });

Notice — in the Anthropic API, system is a separate field, not part of the messages array. In OpenAI's API it's a message with role "system" inside the array. Different APIs, same concept.
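
The difference is easier to see side by side. The sketch below shapes the same system prompt and history into the two request bodies; the helper names and the model names are placeholders, and only the position of the system prompt is the point.

```javascript
// Sketch: the same system prompt and history, shaped for two API styles.
// Model names are placeholders supplied by the caller.
function toAnthropicBody(model, systemPrompt, messages) {
    // Anthropic Messages API: the system prompt is a top-level field
    return { model, max_tokens: 1024, system: systemPrompt, messages };
}

function toOpenAIBody(model, systemPrompt, messages) {
    // OpenAI Chat Completions API: the system prompt is the first message
    return {
        model,
        messages: [{ role: "system", content: systemPrompt }, ...messages],
    };
}

const systemPrompt = "You are a customer support specialist for DevTool Pro.";
const history = [{ role: "user", content: "How do I reset my password?" }];

const anthropicBody = toAnthropicBody("your-anthropic-model", systemPrompt, history);
const openAIBody = toOpenAIBody("your-openai-model", systemPrompt, history);

console.log(anthropicBody.messages.length); // prints 1
console.log(openAIBody.messages.length);    // prints 2
```

Keeping the system prompt and history as separate values in your own code makes it simple to target either shape.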


Part 2 — The User Prompt

What it is

The user prompt is the actual message from the human. In a chat application, this is what the user types. In a backend pipeline (like RAG), this might be constructed programmatically.

Simple case — user just types naturally:

USER:
How do I reset my password?

But as a developer, you'll often construct the user prompt yourself — adding context, formatting, injected data — before it reaches the model.


Constructed User Prompts

In real applications, what looks like a "user message" is often built by your code. This is normal and powerful.

Example — a document summarizer:


    const userDocument = "...500 words of content from uploaded PDF...";
    const userQuestion = "What are the key action items?";

    const constructedUserPrompt = `
    Here is the document the user uploaded:

    <document>
    ${userDocument}
    </document>

    User's question: ${userQuestion}

    Please answer based only on the document above.
    `;

    // This constructed prompt is sent as the user message

The actual user only typed "What are the key action items?" — but your code wrapped it with the document content and clear instructions before sending.

This is a pattern you'll use constantly in RAG applications.
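
In practice you would likely wrap that template in a function. The sketch below is one possible version: `buildUserPrompt` is a hypothetical name, and stripping literal `<document>` tags from the uploaded text is an extra precaution assumed here, not something the original template does.

```javascript
// Hypothetical helper that wraps a user's question with document text.
function buildUserPrompt(documentText, userQuestion) {
    // Remove literal <document> tags from the content so uploaded text
    // cannot close the wrapper early.
    const safeDocument = documentText.replace(/<\/?document>/gi, "");
    return [
        "Here is the document the user uploaded:",
        "",
        "<document>",
        safeDocument.trim(),
        "</document>",
        "",
        `User's question: ${userQuestion.trim()}`,
        "",
        "Please answer based only on the document above.",
    ].join("\n");
}

const prompt = buildUserPrompt(
    "Action items: 1) Ship v2. </document> 2) Update the docs.",
    "What are the key action items?"
);

console.log(prompt);
```

The result goes into the `content` field of a message with role "user", exactly like text the user typed.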


Prompt Engineering Techniques

Here are the core techniques that actually work — not magic phrases, but structural patterns:


Technique 1 — Be Specific, Not Vague

❌ Vague:
"Write something about climate change"

✅ Specific:
"Write a 3-paragraph summary of the causes of 
climate change, written for a high school student 
with no science background. Use simple language 
and one real-world example per paragraph."

Specificity removes guessing. Less guessing = more consistent output.


Technique 2 — Specify the Output Format

❌ No format specified:
"List the pros and cons of React vs Vue"

✅ Format specified:
"Compare React and Vue. Return your response as 
a JSON object with this exact structure:

{
  "react": {
    "pros": ["...", "..."],
    "cons": ["...", "..."]
  },
  "vue": {
    "pros": ["...", "..."],
    "cons": ["...", "..."]
  }
}

Return only the JSON. No explanation before or after."

When you specify format precisely, your code can parse the output reliably. This is critical for building real applications.
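
Even with a precise format instruction, it is worth checking the reply before your code relies on it. Here is a minimal sketch, assuming the React/Vue structure requested above; `parseComparison` is a hypothetical helper and the sample replies are made up.

```javascript
// Sketch: parse the model's reply and confirm it has the requested shape.
function parseComparison(replyText) {
    let parsed;
    try {
        parsed = JSON.parse(replyText);
    } catch (error) {
        return { ok: false, reason: "reply is not valid JSON" };
    }
    if (parsed === null || typeof parsed !== "object") {
        return { ok: false, reason: "reply is not a JSON object" };
    }
    for (const framework of ["react", "vue"]) {
        const entry = parsed[framework];
        const valid =
            entry !== null &&
            typeof entry === "object" &&
            Array.isArray(entry.pros) &&
            Array.isArray(entry.cons);
        if (!valid) {
            return { ok: false, reason: `missing or malformed "${framework}" section` };
        }
    }
    return { ok: true, value: parsed };
}

// Made-up replies standing in for real model output
const goodReply =
    '{"react":{"pros":["a"],"cons":["b"]},"vue":{"pros":["c"],"cons":["d"]}}';
const chattyReply = "Sure! Here is the comparison: {}";

console.log(parseComparison(goodReply).ok);       // prints true
console.log(parseComparison(chattyReply).reason); // prints reply is not valid JSON
```

When the check fails, a common choice is to retry the request or fall back to a default rather than crash.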


Technique 3 — Give Examples (Few-Shot Prompting)

This is one of the most powerful techniques. Show the model exactly what you want by example:

Classify customer messages as: BILLING, TECHNICAL, or GENERAL

Examples:
Message: "My invoice shows the wrong amount"
Category: BILLING

Message: "The app crashes when I click export"  
Category: TECHNICAL

Message: "What are your business hours?"
Category: GENERAL

Now classify this message:
Message: "I was charged twice this month"
Category:

The model has seen the pattern three times, so there is little ambiguity left about the task or the output format.

This is called few-shot prompting — giving a few examples before the actual task.

Zero-shot = no examples, just instruction. One-shot = exactly one example. Few-shot = a few examples before the task.
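
If your examples live in data rather than in a hand-written string, the prompt can be generated. The sketch below rebuilds the classification prompt from an array; `buildFewShotPrompt` is a hypothetical helper, and the exact wording of the header is an assumption.

```javascript
// Hypothetical helper: builds a few-shot classification prompt from examples.
function buildFewShotPrompt(labels, examples, newMessage) {
    const header = `Classify customer messages as: ${labels.join(", ")}`;
    const shots = examples
        .map(({ message, category }) => `Message: "${message}"\nCategory: ${category}`)
        .join("\n\n");
    return [
        header,
        "",
        "Examples:",
        shots,
        "",
        "Now classify this message:",
        `Message: "${newMessage}"`,
        "Category:",
    ].join("\n");
}

const fewShotPrompt = buildFewShotPrompt(
    ["BILLING", "TECHNICAL", "GENERAL"],
    [
        { message: "My invoice shows the wrong amount", category: "BILLING" },
        { message: "The app crashes when I click export", category: "TECHNICAL" },
        { message: "What are your business hours?", category: "GENERAL" },
    ],
    "I was charged twice this month"
);

console.log(fewShotPrompt);
```

Ending the prompt on "Category:" invites the model to answer with just the label, in the same shape as the examples.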


Technique 4 — Chain of Thought

For complex reasoning tasks, ask the model to think step by step before giving the answer:

❌ Direct answer (often wrong on complex problems):
"What is 15% of 847?"

✅ Chain of thought:
"What is 15% of 847? Think step by step before 
giving the final answer."

Model output:
"Step 1: 10% of 847 = 84.7
 Step 2: 5% of 847 = 84.7 / 2 = 42.35
 Step 3: 15% = 84.7 + 42.35 = 127.05
 Answer: 127.05"

Asking the model to reason explicitly before answering often improves accuracy on math, logic, and multi-step problems.

Reasoning models such as OpenAI's "o1" build on the same idea: they are trained to produce a chain of thought before the final answer instead of responding immediately.
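
A chain-of-thought reply mixes reasoning with the result, so your code usually needs to pull the final answer out. Here is a small sketch, assuming the model ends with a line of the form "Answer: <number>" as in the example output; `extractFinalAnswer` is a hypothetical helper.

```javascript
// Sketch: find the last "Answer: <number>" line in a step-by-step reply.
function extractFinalAnswer(modelOutput) {
    const lines = modelOutput.split("\n").reverse();
    for (const line of lines) {
        const match = line.match(/^\s*Answer:\s*(-?\d+(?:\.\d+)?)/i);
        if (match) {
            return Number(match[1]);
        }
    }
    return null; // no recognizable answer line
}

const stepByStepReply = [
    "Step 1: 10% of 847 = 84.7",
    "Step 2: 5% of 847 = 84.7 / 2 = 42.35",
    "Step 3: 15% = 84.7 + 42.35 = 127.05",
    "Answer: 127.05",
].join("\n");

const answer = extractFinalAnswer(stepByStepReply);

// Cross-check the arithmetic directly, allowing for floating-point rounding
const matchesDirectCalculation = Math.abs(answer - 0.15 * 847) < 1e-9;

console.log(answer, matchesDirectCalculation); // prints 127.05 true
```

Asking for a fixed final line in the prompt ("End with 'Answer: <number>'") makes this kind of extraction far more dependable.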


Technique 5 — Constrain What the Model Can and Cannot Do

You are a data extraction assistant.

Rules:
- Extract ONLY information that is explicitly stated 
  in the document
- If information is not in the document, return null 
  for that field
- NEVER infer or guess missing information
- NEVER add information from your own knowledge

This is critical — return null rather than guessing.

Explicit constraints reduce hallucination, though they don't eliminate it. In production apps this is not optional: constrain the model's behavior, and still validate what comes back.
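
Prompt rules can be backed up with a check in code. The sketch below keeps an extracted value only if it appears verbatim in the source text. That is a deliberately crude assumption: it would also reject correct values the model reformatted, so treat it as an illustration. `enforceGrounding` and the sample data are made up.

```javascript
// Sketch: keep an extracted string only if it appears verbatim
// (case-insensitively) in the source document; otherwise use null.
function enforceGrounding(extracted, sourceText) {
    const haystack = sourceText.toLowerCase();
    const result = {};
    for (const [field, value] of Object.entries(extracted)) {
        const grounded =
            typeof value === "string" &&
            value.trim() !== "" &&
            haystack.includes(value.toLowerCase());
        result[field] = grounded ? value : null;
    }
    return result;
}

// Made-up document and model output for illustration
const source = "Invoice 1042 was issued to Acme Ltd on 3 March.";
const modelExtraction = {
    customer: "Acme Ltd",
    invoiceNumber: "1042",
    dueDate: "17 March", // not stated anywhere in the document
};

console.log(enforceGrounding(modelExtraction, source));
// prints { customer: 'Acme Ltd', invoiceNumber: '1042', dueDate: null }
```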


Part 3 — The Completion (Assistant Response)

What it is

The completion is the model's response — everything it generates back. In the API it's tagged with role "assistant."

Simple example:

USER:    "What is 2 + 2?"
ASSISTANT: "4"

The completion is "4."


Why You Sometimes Write the Assistant Turn Yourself

Here's something that surprises developers — you can pre-fill the assistant's response. You write the beginning of the assistant's answer, and the model continues from there.

This is called assistant prefilling and it's a powerful technique:

messages: [
  {
    role: "user",
    content: "Give me the user data as JSON"
  },
  {
    role: "assistant",
    content: "{"    // ← you start the JSON, model continues
  }
]

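By starting with {, you steer the model strongly toward JSON: it has to continue from {, so it can't open with an explanation or preamble. The response contains only the continuation, so prepend the { yourself before parsing, and still validate the result, since prefilling does not guarantee valid JSON. Not every API or model supports prefilling, so check the provider's documentation.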

This is useful when you need:

  • Pure JSON output with no surrounding text
  • Responses that start in a specific way
  • Format enforcement without relying on instructions alone
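
One detail to handle in code: with the Anthropic Messages API, the reply continues from your prefill and does not repeat it, so the two pieces have to be joined before parsing. A minimal sketch, with `parsePrefilledJson` as a hypothetical helper and a made-up continuation in place of a real reply:

```javascript
// Sketch: rejoin the prefill with the model's continuation, then parse.
function parsePrefilledJson(prefill, continuation) {
    try {
        return { ok: true, value: JSON.parse(prefill + continuation) };
    } catch (error) {
        // Prefilling steers the format but cannot guarantee valid JSON,
        // for example when the reply is cut off by max_tokens.
        return { ok: false, reason: error.message };
    }
}

const prefill = "{";
const continuation = '"name": "Arjun", "plan": "pro"}'; // made-up reply text

const parsedReply = parsePrefilledJson(prefill, continuation);
console.log(parsedReply.ok, parsedReply.value.name); // prints true Arjun

const truncated = parsePrefilledJson(prefill, '"name": "Arj');
console.log(truncated.ok); // prints false
```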

Multi-Turn Conversations — How History is Structured

In a real conversation, messages alternate between user and assistant:

messages: [
  {
    role: "user",
    content: "My name is Arjun"
  },
  {
    role: "assistant", 
    content: "Nice to meet you, Arjun! How can I help you today?"
  },
  {
    role: "user",
    content: "What is my name?"
  }
  // Model will respond with "Arjun" because it can see the full history
]

Your application is responsible for maintaining this history array and sending it with every request. The model itself stores nothing between calls.

This is the exact reason why building a chatbot is more than just calling the API — you need to manage conversation state.
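
Here is a minimal sketch of that state management. `Conversation` is a hypothetical class, and `fakeModel` is a stand-in for a real (asynchronous) API call: it can only answer from the history it is handed, which is exactly the behavior being illustrated.

```javascript
// Stand-in for a real API call: it sees only the messages passed in.
function fakeModel(messages) {
    const prefix = "My name is ";
    const intro = messages.find(
        (m) => m.role === "user" && m.content.startsWith(prefix)
    );
    return intro
        ? `Your name is ${intro.content.slice(prefix.length)}.`
        : "I don't know your name.";
}

// Hypothetical wrapper that owns the history and resends it on every turn
class Conversation {
    constructor(callModel) {
        this.callModel = callModel;
        this.messages = [];
    }

    send(userText) {
        this.messages.push({ role: "user", content: userText });
        const reply = this.callModel([...this.messages]); // full history each call
        this.messages.push({ role: "assistant", content: reply });
        return reply;
    }
}

const chat = new Conversation(fakeModel);
chat.send("My name is Arjun");
console.log(chat.send("What is my name?")); // prints Your name is Arjun.

// A fresh conversation has no history, so the name is unknown
const freshChat = new Conversation(fakeModel);
console.log(freshChat.send("What is my name?")); // prints I don't know your name.
```

A production version would also need to trim or summarize old turns so the history stays within the context window.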


Putting It All Together — A Real Example

Here's a complete, production-style prompt structure for a code review assistant:


    const codeToReview = `
    function calculateTotal(items) {
        let total = 0;
        for (let i = 0; i <= items.length; i++) {
            total += items[i].price;
        }
        return total;
    }
    `;

    const response = await fetch("https://api.anthropic.com/v1/messages", {
        method: "POST",
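        headers: {
            "Content-Type": "application/json",
            "x-api-key": process.env.ANTHROPIC_API_KEY, // required; read from the environment
            "anthropic-version": "2023-06-01",          // required API version header
        },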
        body: JSON.stringify({
            model: "claude-sonnet-4-6",
            max_tokens: 1024,

            // SYSTEM — defines behavior
            system: `You are a senior JavaScript developer doing
    code review. You identify bugs, performance issues,
    and style problems.

    Always respond in this exact JSON format:
    {
    "bugs": ["description of each bug"],
    "performance": ["performance issues"],
    "suggestions": ["style and best practice suggestions"],
    "severity": "low | medium | high"
    }

    Return only valid JSON. No explanation outside the JSON.`,

            messages: [
                // USER — the actual request
                {
                    role: "user",
                    content: `Please review this JavaScript function:

    \`\`\`javascript
    ${codeToReview}
    \`\`\`

    Identify all issues.`
                }
            ]
        })
    });

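    if (!response.ok) {
        throw new Error(`API request failed with status ${response.status}`);
    }

    const data = await response.json();
    // JSON.parse throws if the model returned anything besides JSON,
    // so wrap this in try/catch in production code
    const review = JSON.parse(data.content[0].text);
    console.log(review);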

    /*
    Example output (the model's exact wording will vary):
    {
    "bugs": [
        "Off-by-one error: loop condition should be i < items.length,
        not i <= items.length. The last iteration accesses
        items[items.length] which is undefined, causing a
        TypeError when accessing .price"
    ],
    "performance": [
        "Consider using Array.reduce() instead of a for loop
        for cleaner, more idiomatic JavaScript"
    ],
    "suggestions": [
        "Add input validation to handle empty arrays",
        "Consider handling cases where items[i].price might be
        undefined or NaN"
    ],
    "severity": "high"
    }
    */

Notice what's happening here:
System prompt  → defines role + enforces JSON format
User prompt    → injects the actual code + asks the question
Output         → structured JSON your code can work with

This is the pattern for every production AI feature you'll build.


The Mental Model for Prompts

System Prompt   = The employee's job description + rules
User Prompt     = The customer's request
Completion      = The employee's response

Your job as developer:
Write a job description clear enough that 
the employee never has to guess what to do.

3-Line Summary

  1. Every LLM interaction has three roles: system (your instructions as developer), user (the human's message), and assistant (the model's response). Understanding these gives you real control over what the model does.
  2. Strong system prompts define identity, boundaries, format, tone, and edge cases — vague prompts lead to inconsistent apps, specific prompts lead to reliable ones.
  3. Prompt engineering is a structural skill — specifying output format, using few-shot examples, adding chain-of-thought, and constraining behavior are the techniques that make AI applications actually work in production.

Module 1.4 — Complete ✅

Phase 1 is done. 🎉

You now understand:

  • The full AI/ML/DL/LLM hierarchy
  • How generative AI works and why hallucination happens
  • Tokens, context windows, and temperature
  • System prompts, user prompts, and how to engineer them properly

Coming Up — Phase 2: LLM Internals

Module 2.1 — What is NLP, Words vs Tokens, and Tokenization

We go inside the black box. You'll understand exactly how text becomes numbers, why tokenization works the way it does, and what's really happening before the Transformer even sees your input.
