More about prompting


Last updated: 2.4.2026.


In the previous article we learned the basics of prompting. Today we extend that knowledge a little further by exploring some slightly more advanced techniques. With these, we will have covered all the fundamentals of prompting, so that you are ready to dive into the specifics that suit your interests. We will also pick up a few tricks and learn how to prompt for video and audio.

Key takeaways: more advanced prompting techniques including using Markdown to save context space, negative prompting to eliminate unwanted output, few-shot prompting for structured data tasks, prompt chaining to break complex workflows into focused steps, and chain-of-thought prompting to reduce hallucinations. Image, video, and audio generation require descriptive, evocative language rather than cold logical instructions.

1. Learn Me Some Markdown

Markdown is another popular way of writing down information with simple formatting; in fact, it is one of the simplest ways of marking up text, and it is very easy to learn. It is also universal, accepted by many different applications and services across the Internet. One important aspect is that it is frugal: it adds just enough information to the data to define formatting, saving context space that would otherwise be wasted on useless markup.

Let’s take an example, a simple prompt. Imagine that we’re working on some fairly old historical data in which the numbers are, for whatever reason, a mix of Roman and Arabic numerals. To avoid a lot of false positives, we tell the machine to extract data only from numerical fields in the table, to take into account both Roman and Arabic numerals, and to ignore any other text if present. The prompt could be something like this:

Extract data from numerical fields in procedural records for both Roman and Arabic numerals.

Notice how we have italicized “numerical fields” to narrow the scope (“don’t bother with any other type of field”) and amplify the importance of this information, and how we used bold text to emphasize that both numeral types must be processed, forcing the machine to always check whether some text is in fact a Roman numeral.

If our prompt is written in an MS Word document, the internal representation of that prompt looks like this:

<w:p>
  <w:r>
    <w:t xml:space="preserve">Extract data from </w:t>
  </w:r>
  <w:r>
    <w:rPr><w:i/></w:rPr>
    <w:t>numerical fields</w:t>
  </w:r>
  <w:r>
    <w:t xml:space="preserve"> in procedural records </w:t>
  </w:r>
  <w:r>
    <w:rPr><w:b/></w:rPr>
    <w:t>for both Roman and Arabic numerals.</w:t>
  </w:r>
</w:p>

Using Markdown, the same prompt looks like this:

Extract data from *numerical fields* in procedural records **for both Roman and Arabic numerals.**

Notice how much simpler the structure is, and how much less data it uses. Since the context window is not unlimited, avoiding unnecessary data matters: the machine must tokenize all the information it receives to understand the meaning of the prompt, and burying it in XML tags does not help.
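The savings are easy to check for yourself. Here is a quick Python sketch comparing the character counts of the two representations above; raw characters are not official tokenizer counts, but the ratio tells the story:

```python
# The same formatted prompt, once as Word's internal OOXML
# and once as Markdown.
ooxml = (
    '<w:p><w:r><w:t xml:space="preserve">Extract data from </w:t></w:r>'
    '<w:r><w:rPr><w:i/></w:rPr><w:t>numerical fields</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> in procedural records </w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr>'
    '<w:t>for both Roman and Arabic numerals.</w:t></w:r></w:p>'
)

markdown = (
    "Extract data from *numerical fields* in procedural records "
    "**for both Roman and Arabic numerals.**"
)

# Compare the sizes of the two representations.
print(len(ooxml), len(markdown))
print(f"Markdown is {len(ooxml) / len(markdown):.1f}x smaller")
```

The Markdown version carries the identical text and formatting in roughly a third of the characters, and the saving compounds over a long prompt.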

As you have seen, Markdown formatting is very simple: to italicize a word or phrase, put one asterisk in front and another at the end; to make it bold, put two. That’s all there is to it. Here’s a nice little table of the most important Markdown syntax.

| Formatting | Markdown Syntax | Result |
|---|---|---|
| Bold | **bold text** | bold text (strong) |
| Italic | *italic text* | italic text (em) |
| Bold + Italic | ***both styles*** | both styles |
| Strikethrough | ~~crossed out~~ | crossed out |
| Heading 1 | # Heading | Large title |
| Heading 2 | ## Heading | Medium title |
| Heading 3 | ### Heading | Small title |
| Bullet list | - item or * item | • item |
| Numbered list | 1. item | 1. item |
| Inline code | `code here` | code here |
| Code block | ```code``` | Block of code |
| Blockquote | > quoted text | Indented quote |
| Horizontal rule | --- | Divider line |
| Line break | Two spaces + Enter | Forces new line |

You will use Markdown at a more advanced level when you start moving information from one context to another, or from one LLM to another: it is a compact way to preserve all the important information from the previous step without much overhead.

Token Tip: Using the Markdown “---” horizontal rule instead of “Now, let’s move to the next section” saves approximately 7–9 tokens per instance, preserving your context window for more complex instructions.

If you’ve seen files with an .md extension, those are Markdown documents; in AI workflows they are often used to hold prompts or to pass information between sessions.

Let us just touch on the JSON file type: this is a structured (and human-readable) format that holds information in an organized way. The content is not open for the machine to interpret freely; it has to follow a precisely defined structure. The output of our data extraction prompt, formatted as JSON, might look like this:

{
  "extractions": [
    { "field": 23, "value": "9", "numeral_type": "Arabic" },
    { "field": 117, "value": "MCXXI", "numeral_type": "Roman" }
  ]
}

As you can see from this example, JSON adds structure to the information while preserving brevity. It is therefore advisable to exchange data in clear, succinct formats such as JSON or CSV. We will not dwell on this any further, because by the time you need such formats, your knowledge will have surpassed this gentle introduction.
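Because the structure is precisely defined, downstream code can consume such output without any text parsing. A minimal Python sketch, assuming the model’s reply arrives as the raw string shown above:

```python
import json

# The JSON output from above, as a raw string received from the model.
raw = '''{
  "extractions": [
    { "field": 23, "value": "9", "numeral_type": "Arabic" },
    { "field": 117, "value": "MCXXI", "numeral_type": "Roman" }
  ]
}'''

# json.loads fails loudly if the model broke the structure,
# which is exactly what you want in an automated pipeline.
data = json.loads(raw)

for entry in data["extractions"]:
    print(f'field {entry["field"]}: {entry["value"]} ({entry["numeral_type"]})')
```

This is the practical payoff of asking for structured output: one line of parsing instead of fragile string matching.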

2. Negative Prompting and Guardrails

Negative prompting can be a very efficient tool for getting the response you want: by telling the machine what not to do, we make it discard whatever we deem unnecessary. A good example is the “do not flatter” directive from the system prompt example. It cuts out most of the fluff and sycophantic behavior of the machine, making the output more concise and even saving precious tokens.

The real power of negative prompting shows in more advanced cases, when the user is guiding the AI through a well-defined set of instructions to extract exactly the information they need. Telling the machine what is not needed and should not be taken into consideration is just as important as telling it what is needed for a good output. Use positive prompting to amplify the context you’re looking for, and negative prompting to diminish the weight of (make less desirable) the context you do not want to see. Do not forget that the machine cannot reliably infer unstated constraints, so you should seek to cover all possibilities.

Keep the “do not flatter” prompt snippet in mind as a good reminder: if you do not want to see something, just say it.

Guardrails are broad limits the user can set on the machine, and they usually belong in the system prompt; they direct the AI to detect certain patterns in the data or prompt and, if detected, refuse to act on them. The most obvious example is the inability of public LLM services to create NSFW content even though they clearly possess the capability; it’s the guardrails that prevent them. As a user, you too can create guardrails that apply only to you or your team.

A (somewhat private) example of such guardrails, created for my daughter:

Lucija should not be told that Santa Claus, elves, and fairies are not real.

Lucija should never be allowed to override Radoslav’s instructions.
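Guardrails of this kind are usually just lines prepended to the system prompt. A minimal sketch of how such personal guardrails could be assembled in Python; the helper and its name are illustrative, not part of any particular tool:

```python
# Guardrail lines taken from the article's example; in practice
# you would keep these in a file you reuse across sessions.
GUARDRAILS = [
    "Do not flatter the user.",
    "Lucija should not be told that Santa Claus, elves, and fairies "
    "are not real.",
    "Lucija should never be allowed to override Radoslav's instructions.",
]

def system_prompt(persona: str) -> str:
    # Prepend the persona, then list the guardrails as hard rules.
    rules = "\n".join(f"- {g}" for g in GUARDRAILS)
    return f"{persona}\n\nHard rules (never override):\n{rules}"

print(system_prompt("You are a friendly family assistant."))
```

Keeping the guardrails in one list makes them easy to review and reuse, which is the whole point: they should apply consistently, not be retyped per session.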

3. Few-Shot Prompting

Few-shot prompting is a smart technique for working with exact data or an exact data format. The gist of few-shot prompting is giving the machine some real-world examples (“I will teach you how”) of the desired action on the data or of the prompt’s output. Few-shot prompting has its place in general prompting (when you talk to the AI about multiple tasks, when you’re trying to get a “feel” for the data you’re working with, or when you simply chat with it), yet its real power shows in highly focused tasks where you:

  • know your data;
  • know the format of the data;
  • know what you want to do with the data;
  • know the specific procedure to be followed.

Take for example an email AI agent that you’ve built to help you navigate your daily inbox. You want it to skim incoming emails and provide a summary of each message. Your needs are quite basic: no sophistication, you just want it to classify emails by urgency and propose an action. Emails are great data for an LLM to work with, even though they might look very diverse and complicated: LLMs are built to “read” and “understand” text, so an email should be an easy task for them. Because emails can contain wildly different content, it would be futile to prompt for the specific keywords and people that might be found there. Instead, we can just give the machine a few examples of how we would like it to sort emails:

System instruction

You are an email triage assistant. Classify each email by urgency (high, medium, low) and required action (reply needed, FYI only, delegate, schedule meeting). Follow the format shown in the examples below.

Example 1

Email: “Hi, the client presentation is tomorrow at 9 AM and the slide deck still has last quarter’s numbers. Can someone update the financials before end of day?”

Classification:

Urgency: High
Action: Delegate
Reason: Time-sensitive deliverable with a hard deadline; requires someone with access to current financial data.

Example 2

Email: “Just wanted to let everyone know that the office kitchen will be closed for maintenance next Thursday. Please plan accordingly.”

Classification:

Urgency: Low
Action: FYI only
Reason: Informational; no response or task required, no impact on work deliverables.

Example 3

Email: “Following up on our conversation last week — would you be available sometime next week to walk through the revised project timeline? No rush, just want to get it on the calendar.”

Classification:

Urgency: Medium
Action: Schedule meeting
Reason: Requires a response but has no immediate deadline. Involves coordination but not a deliverable.

This prompt does not give exact guidelines on how to sort emails; instead it provides a few real-world examples (one for each category) and leaves the rest for the machine to figure out.

It turns out that AI is very capable of turning such prompts into internal rules that produce great output. Of course, not all content is suitable for few-shot prompting: it works poorly when data lacks obvious classification boundaries, when inputs vary wildly, when outputs are long and complicated, or when specific domain knowledge is required. Few-shot prompting works best on data suited to pattern recognition and format adherence.
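To make the mechanics concrete, here is a sketch of how the email-triage prompt above could be assembled programmatically. The build_prompt helper is hypothetical, and the example data is abbreviated from the article:

```python
SYSTEM = (
    "You are an email triage assistant. Classify each email by urgency "
    "(high, medium, low) and required action (reply needed, FYI only, "
    "delegate, schedule meeting). Follow the format shown in the examples."
)

# One example per pattern we want the model to learn.
EXAMPLES = [
    {"email": "The client presentation is tomorrow and the slide deck "
              "still has last quarter's numbers. Can someone update it?",
     "urgency": "High", "action": "Delegate"},
    {"email": "The office kitchen will be closed for maintenance Thursday.",
     "urgency": "Low", "action": "FYI only"},
]

def build_prompt(new_email: str) -> str:
    # Stitch system instruction, examples, and the new email together.
    parts = [SYSTEM, ""]
    for i, ex in enumerate(EXAMPLES, 1):
        parts += [f"Example {i}",
                  f'Email: "{ex["email"]}"',
                  f"Urgency: {ex['urgency']}",
                  f"Action: {ex['action']}",
                  ""]
    parts += ["Now classify:", f'Email: "{new_email}"']
    return "\n".join(parts)

print(build_prompt("Would you be available next week to review the timeline?"))
```

Keeping the examples in a list separate from the assembly code makes it easy to add, swap, or retire examples as you learn which ones steer the model best.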

A cousin of few-shot prompting is one-shot prompting, where you give just a single example. It is suitable for very simple and straightforward processes.

Zero-shot prompting is the opposite extreme: a prompt where no example is given at all.

4. Prompt Chaining

Prompt chaining is a natural extension of breaking a prompt into more digestible chunks, and it serves the same purpose: divide a complex task into smaller pieces that logically follow one another, each feeding on the output of the previous prompt.

Prompt chaining is especially useful for integrating multiple AI agents into one workflow, where it can greatly increase efficiency. Remember: each new session with an AI starts from a blank slate that has to be filled in. This is especially true for multi-agent workflows, where different models might be used for different purposes: just as with a team of human interns, you have to deal with a team of clueless agents waiting for your exact instructions.

The trick is to design a process that feeds each subsequent agent with just the output of the previous one and enough context to understand the task at hand. Most such agents do not need the whole context anyway: feeding each of them the full context would just waste tokens (and electricity, time, and money) on unnecessary processing. If you design the prompting sequence well, you can feed the output of one agent to the next to narrow the scope of content and drive the reasoning toward your goals, while maintaining a small footprint. A simple prompt chain for three agents, one of which performs a statistical operation, might look something like this:

Agent 1 — Data Extraction

Prompt:

You are a trade records analyst. Review the following shipment complaint documents and extract a structured list of entries. Each entry should include: supplier name, shipment date, number of ingots shipped, number of ingots rejected by the buyer, and the stated reason for rejection. Output as a table.

[documents are attached]

Agent 2 — Statistical Analysis

Prompt:

You are a statistician. Using the data below, calculate the following for each supplier: total ingots shipped, total ingots rejected, rejection rate as a percentage, and mean rejection rate across all their shipments. Identify the suppliers with the highest rejection rates and flag them as statistical outliers. Output the summary as a table followed by your findings.

[output from Agent 1 is pasted here]

Agent 3 — Action Drafting

Prompt:

You are a trade correspondence specialist in ancient Mesopotamia. Based on the statistical analysis below, draft a formal complaint letter to the worst-performing supplier. The letter should cite the specific rejection rate and total losses, reference the most recent shipment as the final provocation, demand either full replacement of rejected goods or credit toward future shipments, and state that the trading relationship will be terminated if the next shipment does not meet quality standards. Keep the tone firm but professional, appropriate for merchant-to-merchant correspondence.

[output from Agent 2 is pasted here]

There we can see how one prompt neatly follows the other: the later agents work not on the whole documentation given to the first agent, but on the table it produced as output. Each agent has a distinctive, focused role and only one highly specific task to work on. This approach allows your agents to achieve high precision and makes the final output very reliable, but it requires the user to think the task through first:

  • what is the ultimate goal of the task?
  • how many discrete sub-tasks are optimal to get the best result?

One further advantage: prompt chaining allows the user to check each agent’s output for errors and correct them before passing the data down to the next agent. If we remember that the machine can introduce a hallucination at any time, prompt chaining creates checkpoints that allow for inspection of the data; unattended agents would simply propagate every error forward, compounding the drift.
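The three-agent chain above can be sketched in a few lines. Here call_llm is a stub standing in for whatever model API you use; the point is the data flow and the checkpoints between steps, not any particular provider:

```python
def call_llm(system: str, user: str) -> str:
    # Stub: a real implementation would call your model provider here.
    return f"[{system.split('.')[0]}] processed {len(user)} chars of input"

def run_chain(documents: str) -> str:
    # Agent 1: extraction works on the raw documents.
    extracted = call_llm(
        "You are a trade records analyst. Extract a structured table "
        "of suppliers, shipments, and rejections.",
        documents)
    # Checkpoint: inspect `extracted` for errors before passing it on.
    analysed = call_llm(
        "You are a statistician. Compute rejection rates per supplier "
        "and flag outliers.",
        extracted)
    # Checkpoint: inspect `analysed` before drafting the letter.
    return call_llm(
        "You are a trade correspondence specialist. Draft a formal "
        "complaint letter to the worst-performing supplier.",
        analysed)

print(run_chain("raw complaint documents go here"))
```

Notice that only Agent 1 ever sees the raw documents; the later agents receive only the distilled output of the step before them, which is exactly the footprint-saving property described above.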

5. Chain-of-Thought Prompting

Forcing the LLM to display its chain of thought, that is, making it tell the user its reasoning step by step, is not only fun to watch; it can in fact increase the accuracy of the model. Just like a human, a machine might find “useful” shortcuts in its latent space that look very much like the right answer but are wrong. This is especially true for complex reasoning and calculations, where the temptation to take shortcuts is higher.

To make the machine less likely to lazily seek shortcuts, we can tell it to use step-by-step reasoning with relatively simple prompts such as:

  • “Work through this step-by-step.”
  • “Tell me your reasoning for each sequential step.”
  • “Walk through each step sequentially.”
  • “Reason through this task following these questions: a, b, c.”

Depending on the task at hand, you can force the machine to focus on its steps by adding a very simple statement to your prompt, or you can write a deliberate instruction on precisely how to go through the process. As the machine follows your orders, the chance that it hallucinates results or takes a wrong shortcut diminishes.
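A small sketch of both variants, the generic nudge and the deliberate, enumerated reasoning path; the with_cot helper is illustrative, not any library’s API:

```python
def with_cot(task: str, steps=None) -> str:
    # steps: optional list of questions the model must answer in order.
    if steps:
        # Deliberate variant: enumerate the questions as a, b, c, ...
        guide = "Reason through this task following these questions:\n"
        guide += "\n".join(f"{chr(97 + i)}. {s}" for i, s in enumerate(steps))
    else:
        # Generic variant: a simple step-by-step nudge.
        guide = "Work through this step-by-step and show your reasoning."
    return f"{task}\n\n{guide}"

print(with_cot("How many ingots were rejected in total?",
               ["Which records mention rejections?",
                "Are the counts Roman or Arabic numerals?",
                "What is the sum?"]))
```

The enumerated form is the stronger tool: it not only slows the model down but fixes the order in which it must consider the sub-questions.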

6. Prompting for Images and Video

An observant reader might ask about the sudden change from strict, cold logic to messy emotions. The truth is, we are not changing anything in our logic: we are only changing how we apply it. When we deal with graphic representations of probabilities generated by an AI, we are in the very same starting position: the AI does not know what it is drawing or animating. For the machine, the end result is just a contextualized series of colorful blobs, or more precisely a representation in memory of very definite color values attached to coordinates. Like the text it produces, an image is the result of pure, blind mathematics, yet it can bedazzle the spectator.

Creating an image or video thus follows the same general rules we set up for LLMs: the more precise your explanations, the less guessing is left to the machine. The striking (non)difference is the choice of words: to disambiguate what we want from an AI assistant, we define the outcome using cold, measured, unequivocal words; to disambiguate what we want to see in an AI-generated image, we define the outcome using poetic, story-like constructs that look like the absolute opposite of cold prompts. Yet they serve the same purpose: to tell the model to narrow its latent space to representations similar to the tokenized input. It is only logical: you cannot describe art with mathematics, but you can describe it with emotion. Even more so, image models are often trained on alt-text and museum catalogs, which use descriptive, evocative language, whereas LLMs are trained on vast amounts of structured logic and conversation. You aren’t just “being poetic”; you are matching the training data’s vocabulary.

A simple example: if we use a very simple prompt like “A cat in a room,” we will indeed get an image of a cat in a room, but all the elements will be pretty random and unexpected.

Let us now provide a much more detailed prompt about the same cat: “A fluffy ginger Maine Coon cat lounging on a velvet emerald green armchair. The room is a cozy Victorian library with floor-to-ceiling bookshelves and a warm fireplace glowing in the background. Soft cinematic sunlight streams through a dusty window, highlighting floating dust motes.”

Notice how much more detailed our prompt is, leaving the AI much less room for randomness and forcing it to pick from the latent space only elements that are statistically consistent with the described scene.

You will certainly notice that we have followed the best practice in prompting by providing very detailed information about what we want in the resulting picture; we just used language that conveys the information through emotion-heavy words and phrases rather than bossy commands and stiff logical constructs.

There’s a general prompting style applicable to all image or video generating models:

[CORE SUBJECT] + [ACTION/POSE] + [SETTING/ENVIRONMENT] +
[MATERIAL/TEXTURE DETAILS] + [LIGHTING/ATMOSPHERE] +
[STYLE/MEDIUM] + [TECHNICAL/CAMERA PARAMETERS]

Substitute each section with text describing what you want to see, and a good result can be expected. Keep in mind that here, how you phrase things has great influence on the resulting image.
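The formula can be treated as a simple fill-in template. A sketch of filling it in with the Maine Coon prompt from earlier; the component names and the helper are illustrative:

```python
def compose_image_prompt(**components: str) -> str:
    # Mirror the bracketed formula's section order; skip empty sections.
    order = ["core_subject", "action_pose", "setting_environment",
             "material_texture", "lighting_atmosphere",
             "style_medium", "technical_camera"]
    parts = [components[k] for k in order if components.get(k)]
    return " ".join(parts)

prompt = compose_image_prompt(
    core_subject="A fluffy ginger Maine Coon cat",
    action_pose="lounging on a velvet emerald green armchair.",
    setting_environment="The room is a cozy Victorian library with "
                        "floor-to-ceiling bookshelves and a warm fireplace.",
    lighting_atmosphere="Soft cinematic sunlight streams through a dusty "
                        "window, highlighting floating dust motes.",
)
print(prompt)
```

Treating the sections as named slots makes it easy to vary one dimension at a time, say lighting, while holding the rest of the scene constant, which is the most reliable way to learn what a given model responds to.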

There are two prompts in this composition:

Prompt 1: “Santa Monica Boulevard.”

Prompt 2: “Santa Monica Boulevard. The perspective looks west, following the endless river of vehicle headlights and taillights as they curve toward the Pacific Ocean. The scene compresses the journey across different neighborhoods, from the distant skyscrapers of Downtown LA to the iconic landmarks of Hollywood, West Hollywood, and Century City, finally ending where the street terminates at the sea. The asphalt street reflects the wet sheen of the transitioning light. Weathered deco architecture, modern glass storefronts, and highly detailed classic cars stuck in traffic create a dense tapestry of urban life. An intense ‘golden hour’ transition into deep dusk. The sky is a dramatic gradient of burnt orange, pink, violet, and dark blue. A mystical layer of sea mist hangs over the ocean, blending with the soft glow of thousands of city streetlights and business signs. Shot on a Hasselblad X1D II 50C, 24mm wide-angle lens, elevated drone perspective (300 ft), long-exposure effect (2 seconds) capturing distinct light trails, f/11 aperture for deep depth of field, sharp focus across the entire street length.”

Everything we said about photography applies to video, with some additional warnings: creating video is a very computing-intensive operation, and models are prone to losing context. For example, if you “pan the camera” to the side and back, you might notice that some elements of the scene have disappeared while new items appear on the screen. This is an artifact of the model’s inherent randomness and its tendency to just “fill in the scene”.

7. Prompting for Audio

Prompting for audio — whether it is music, speech, or ambient soundscapes — requires a blend of the technical precision we used for data extraction and the atmospheric “poetry” we applied to images. Just as a model does not “know” it is drawing a cat, an audio model does not “hear” a melody; it predicts the next statistically likely frequency or waveform based on your description.

Audio prompting can be structured using the same bracketed, additive formula we used for images and video. Just as you direct a visual scene, you must now direct the sound: become the composer, the arranger, and the engineer. Use the following structure to build your audio prompts:

[CORE SUBJECT/INSTRUMENTATION] + [GENRE/STYLE] + [MOOD/EMOTIONAL TEXTURE] +
[TEMPO/RHYTHM] + [ACOUSTICS/ENVIRONMENTAL SPACE] +
[TECHNICAL/RECORDING PARAMETERS]

An example: “A smoky, intimate jazz ballad featuring a sultry, breathy female vocal lead with a slight rasp. The singer is accompanied by a solo, warm-toned upright bass and a soft, brushed snare drum. The mood is late-night and nostalgic, with a slow, swaying 4/4 rhythm. The acoustics should feel like a small, dimly lit basement club with a close-mic setup for a dry, ‘in-the-room’ presence. High-fidelity recording, 48kHz, professional studio master, with no background noise or digital artifacts.”


To me personally, audio prompting is the most mysterious process, even though I know it is nothing more than matching statistically significant waveforms. Because music has always had one foot in mathematics, it should feel natural to let the machine compose, and yet somehow it does not. Do not be afraid or repulsed; take up a conductor’s prompting baton and conduct away.

Glossary

Chain-of-Thought Prompting (technique): A prompting method that instructs the model to display its reasoning step-by-step before delivering a final answer, reducing the likelihood of shortcuts and hallucinations in complex reasoning or calculation tasks.

Context Window (concept): The finite amount of text, measured in tokens, that a large language model can process and retain within a single session; efficient formatting such as Markdown is recommended to preserve this space for substantive instructions.

Few-Shot Prompting (technique): A strategy in which the user supplies the model with a small number of real-world input-output examples, enabling it to infer the desired format or classification pattern without explicit rules.

Guardrails (concept): Broad, system-level constraints defined in the system prompt that instruct the model to detect and refuse to execute certain patterns or content types, functioning as hard boundaries on the model’s behavior.

JSON (format): A structured, human-readable data format that organizes information into a precisely defined hierarchy; used in prompting to request output that is machine-parseable and free of ambiguous free-form text.

Markdown (format): A lightweight, universally accepted markup language that applies formatting such as bold and italics through simple character syntax; preferred in prompting for its token efficiency and ability to signal emphasis and structure to the model.

Negative Prompting (technique): The explicit instruction to the model of what it should not do, generate, or consider, used to suppress unwanted output, reduce sycophantic behavior, and focus the model’s attention on relevant context.

One-Shot Prompting (technique): A variant of few-shot prompting in which only a single example is provided to the model, suitable for simple, well-defined tasks where one demonstration is sufficient to establish the desired pattern.

Prompt Chaining (technique): The sequential linking of multiple prompts in which the output of one step becomes the input of the next, allowing complex tasks to be decomposed into focused sub-tasks while enabling human review and error correction between stages.

System Prompt (concept): A set of instructions provided to the model before the user’s conversation begins, used to define the model’s persona, tone, constraints, and guardrails for the entire session.

Token (concept): The basic unit of text that a large language model processes, roughly corresponding to a word or word fragment; all input and output consumes tokens, making token efficiency a key consideration in prompt design.

Zero-Shot Prompting (technique): A prompt that provides no examples whatsoever, relying entirely on the model’s pre-trained knowledge and the clarity of the instruction to produce the desired output.

About the author: Radoslav Dejanović is a Croatian IT professional, journalist, and media literacy researcher. He is the author of a handbook on online information verification and has published academic and essayistic work on AI, disinformation, and digital media.
