Lee Boonstra just released a beautiful 68-page Prompt Engineering whitepaper that I will summarize here.
You’ve surely heard the buzz about LLMs and Agentic AI over the last couple of years. Think of them as that brilliant, slightly chaotic intern you once had: super smart, tons of potential, but if you mumble your instructions, you’ll get a beautifully written report on the wrong topic. Prompt engineering is basically learning to talk to this intern, your instruction manual for the AI. You give it a starting point and hope it predicts the next words in a way that makes sense for *your* task.
Taming the Beast: LLM Output Settings
Before you even *think* about typing your prompt, you gotta fiddle with the LLM’s knobs and dials. It’s like tuning your guitar before a gig – essential stuff.
- Output Length: Want a tweet, not an epic poem? Tell the model how many “tokens” (AI-speak for words and parts of words) to generate. More tokens usually mean more processing, more moolah, and sometimes more waiting. Heads up: capping the length doesn’t make the AI write more concisely; it just makes it… stop mid-thought. Awkward.
- Temperature: This is the “creativity dial”. Low temp (like 0) means the AI sticks to the script, giving you predictable, factual answers. Crank it up, and things get… spicier. More diverse, more unexpected.
- Top-K and Top-P: These are like the AI’s internal editors, deciding which words are “good enough” to make the cut.
- Top-K: Only considers the ‘K’ most likely next words.
- Top-P (Nucleus Sampling): Samples only from the smallest set of top words whose combined probability adds up to ‘P’.
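To make these knobs concrete, here’s a minimal sketch of how they look in code. This assumes the `google-generativeai` Python SDK, a placeholder API key, and a Gemini model name picked for illustration; the whitepaper doesn’t mandate any of this.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption

response = model.generate_content(
    "Summarize prompt engineering in two sentences.",
    generation_config=genai.GenerationConfig(
        max_output_tokens=100,  # caps the length (truncates, doesn't summarize!)
        temperature=0.2,        # low = predictable, high = spicy
        top_k=40,               # only the 40 most likely tokens are candidates
        top_p=0.95,             # ...further trimmed to 95% cumulative probability
    ),
)
print(response.text)
```

The same dials exist in every major LLM API, just under slightly different names.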
Your Prompting Toolkit: Tricks of the Trade
Alright, settings locked in. Time to actually *write* something. Here are some techniques that are more effective than just yelling at your screen:
1. Zero-Shot Prompting: The “Figure It Out Yourself” Method
The simplest of all. Just give the LLM a task description and let it rip. Example: “Classify this customer feedback as positive or negative.” Keep that temperature low if you don’t want any funny business.
Zero-Shot Example (Movie Review Classification): Prompt: Classify movie reviews as POSITIVE, NEUTRAL, or NEGATIVE.
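A minimal sketch of that zero-shot call, reusing the `model` from the settings snippet above (the review text is my stand-in, not the whitepaper’s):

```python
prompt = """Classify movie reviews as POSITIVE, NEUTRAL, or NEGATIVE.

Review: "An instant classic. I laughed, I cried, I forgot my phone existed."
Sentiment:"""

# Low temperature keeps the classification deterministic
response = model.generate_content(
    prompt,
    generation_config=genai.GenerationConfig(temperature=0.1, max_output_tokens=5),
)
print(response.text)  # expected: POSITIVE
```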
2. One-Shot & Few-Shot Prompting: “Here, Let Me Show You”
Sometimes, the intern needs an example. Or a few. One-shot gives one example; few-shot gives several. This is your golden ticket for getting the output in a specific format or style.
Few-Shot Example (JSON from Pizza Orders): Prompt: Parse a customer’s pizza order into valid JSON.
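Here’s a sketch of what the full few-shot prompt might look like; the example orders and JSON shape are mine for illustration:

```python
prompt = """Parse a customer's pizza order into valid JSON.

EXAMPLE:
I want a small pizza with cheese, tomato sauce, and pepperoni.
JSON: {"size": "small", "ingredients": ["cheese", "tomato sauce", "pepperoni"]}

EXAMPLE:
Can I get a large pizza with mushrooms and extra cheese?
JSON: {"size": "large", "ingredients": ["mushrooms", "extra cheese"]}

Now parse this order:
I'd like a medium pizza, just tomato sauce and basil please.
JSON:"""

response = model.generate_content(prompt)
print(response.text)  # the examples steer it toward the same JSON shape
```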
Tip: Use good, varied examples. One wonky example can send your LLM down a rabbit hole. Include those pesky edge cases too!
3. Setting the Scene: System, Contextual, and Role Prompting
- System Prompting: This tells the LLM its overarching goal or how it should behave. Think: “Only return the answer in uppercase” or “Output your response as a JSON object”.
- Contextual Prompting: Give the LLM some background info. “Context: You’re writing a blog post for data scientists who are skeptical about AI”.
- Role Prompting: Assign a persona. “Act as a cynical but helpful senior data analyst” or “You are a pirate explaining database normalization”. You can even specify the *style* within the role, like “humorous” or “super serious”.
Role Prompting Example (Humorous Travel Guide): Prompt: I want you to act as a travel guide. I will write to you about my location, and you will suggest 3 places to visit near me in a humorous style. My suggestion: “I am in Manhattan.”
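With the same SDK assumptions as before, system and role prompts usually go into a system instruction rather than the user message. A sketch:

```python
# The persona lives in the system instruction; the user message stays clean
travel_guide = genai.GenerativeModel(
    "gemini-1.5-flash",
    system_instruction=(
        "You are a travel guide with a humorous style. "
        "When given a location, suggest exactly 3 places to visit near it."
    ),
)

response = travel_guide.generate_content("I am in Manhattan.")
print(response.text)
```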
4. Chain of Thought (CoT) Prompting: “Show Your Work, AI!”
This one’s a biggie for tasks that need some actual thinking. You get the LLM to spell out its reasoning steps *before* it gives the final answer. Just adding “Let’s think step by step” can work wonders, especially for math problems where LLMs tend to, shall we say, *improvise* the answers.
CoT Example (Solving a Word Problem): Prompt: When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let’s think step by step. A good response reasons it out: at age 3 the partner was 3 × 3 = 9, that’s a 6-year gap, so now the partner is 20 + 6 = 26.
Why is CoT cool? It’s pretty easy to do, often works well, and you can see *how* the AI got its answer (or where it went off the rails).
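In code, CoT is often nothing fancier than a suffix tacked onto the question. A sketch, reusing the earlier setup (the whitepaper also recommends temperature 0 for reasoning tasks with a single correct answer):

```python
COT_SUFFIX = "\n\nLet's think step by step."

question = ("When I was 3 years old, my partner was 3 times my age. "
            "Now, I am 20 years old. How old is my partner?")

response = model.generate_content(
    question + COT_SUFFIX,
    generation_config=genai.GenerationConfig(temperature=0.0),  # CoT likes determinism
)
print(response.text)  # reasoning steps first, then (hopefully) 26
```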
5. The Advanced Playbook (When You’re Feeling Brave):
- Step-Back Prompting: Make the LLM think about a broader, related question first. Then, feed that insight into the prompt for your specific problem. Helps it dig up more relevant knowledge.
- Self-Consistency: Run the same CoT prompt multiple times (use a higher temperature for different reasoning paths), then pick the answer that shows up most often. More accurate, but costs more. There’s a sketch after this list.
- Tree of Thoughts (ToT): Lets the LLM explore many different reasoning paths at once, like a choose-your-own-adventure for problem-solving. Good for super complex stuff.
- ReAct (Reason & Act): This is where it gets really interesting. You let the LLM use external tools (like a search engine API) to find info, think about it, and then decide what to do next. Your intern now has Google access!
ReAct Gist (Metallica Kids Counter): Prompt: How many kids do the band members of Metallica have? The LLM’s inner monologue, simplified: figure out who’s in the band (search), then look up each member’s kids one by one (more searches), keeping a running total until it can give a final answer.
- Automatic Prompt Engineering (APE): Yep, you can get an LLM to write prompts *for you*. Generate a bunch of prompt ideas, test ’em out, pick the winner. AI prompting AI. We’re through the looking glass, people.
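As promised, here’s a sketch of self-consistency, reusing `question` and `COT_SUFFIX` from the CoT snippet: same prompt, several samples at a higher temperature, majority vote on the final answer. The answer-extraction regex is a naive stand-in; real pipelines parse more carefully.

```python
import re
from collections import Counter

def final_number(text: str) -> str | None:
    """Naively grab the last number in the response as 'the answer'."""
    numbers = re.findall(r"\d+", text)
    return numbers[-1] if numbers else None

answers = []
for _ in range(5):  # 5 independent reasoning paths
    response = model.generate_content(
        question + COT_SUFFIX,
        generation_config=genai.GenerationConfig(temperature=0.9),  # diversity!
    )
    answer = final_number(response.text)
    if answer:
        answers.append(answer)

# The most common answer wins the vote
print(Counter(answers).most_common(1))  # e.g. [('26', 4)]
```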
Your AI Coding Buddy (Handle With Care)
These LLMs aren’t just for words; they can sling code too – write it, explain it, translate it between languages, even help you debug your messes.
Example (Getting a Bash Script Written): Prompt: Write a code snippet in Bash that asks for a folder name. Then, it takes the contents of the folder and renames all the files inside by prepending the name draft to the file name. Output: (Lo and behold, a fairly decent, commented Bash script!)
Crucial Warning: ALWAYS, ALWAYS review and test code generated by an AI! It’s a smart tool, not a magic wand. It can and will make mistakes.
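For flavor, here’s roughly the shape of script that prompt tends to produce. To be clear: this is my reconstruction, not the whitepaper’s actual output, and the `draft_` spelling of the prefix is my choice.

```bash
#!/bin/bash
# Ask the user for the folder name
read -p "Enter the folder name: " folder

# Bail out if the folder doesn't exist
if [ ! -d "$folder" ]; then
  echo "Folder '$folder' not found." >&2
  exit 1
fi

# Prepend "draft_" to every regular file in the folder
for f in "$folder"/*; do
  if [ -f "$f" ]; then
    mv "$f" "$folder/draft_$(basename "$f")"
  fi
done
echo "Done."
```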
The Data Struggler’s Guide to Prompting Glory (aka Best Practices)
Want to suck less at prompting? Here’s the cheat sheet:
- Show, Don’t Just Tell (Examples are King): Seriously, one-shot/few-shot prompting is often your best friend.
- KISS (Keep It Simple, Stupid): If your prompt is confusing *you*, the AI is probably lost too. Clear, concise, to the point. Use action verbs: “Summarize,” “Generate,” “Classify.”
- Be Specific About What You Want Out: Don’t just say “write a post.” Say “Generate a 500-word blog post about topic X, aimed at beginners, with a humorous tone”.
- Instructions > Constraints: Tell the AI what *to do*, not just a long list of what *not* to do. It’s less confusing for everyone.
- Mind the Length: Control token output via settings or directly in the prompt (e.g., “Explain quantum physics in one sentence”).
- Variables are Your Pals: For reusable prompts, use placeholders like `{city_name}` or `{product_category}`. There’s a template sketch after this list.
- Experiment Like Mad: Try different wording, styles, and prompt types. What works for one model might flop on another.
- Mix It Up (for Few-Shot Classification): If you’re showing examples for classification, don’t list all your ‘positive’ examples then all ‘negative’. Shuffle them to avoid the AI just learning the order.
- JSON for the Win (Structured Output): For tasks like data extraction, getting the output in JSON is a lifesaver. It forces structure and can reduce those wacky “hallucinations.” But watch out for token limits chopping your JSON in half! Libraries like `json-repair` can patch up truncated output; see the sketch after this list.
- Schema for Input Too: Providing a JSON Schema for your input data helps the LLM understand the structure and what to focus on.
- Document Your Struggles (and Successes!): Keep meticulous notes: what prompt you used, what model, what settings, what the output was. You will forget. The PDF even suggests a template. Vertex AI Studio can help save and version your prompts.
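The variables tip, as promised, in plain Python f-strings; the template text and city list are made up for illustration:

```python
def travel_prompt(city_name: str) -> str:
    # One reusable template, many cities
    return (f"You are a travel guide. Tell me one fact about the best-known "
            f"attraction in {city_name}, in a single sentence.")

for city in ["Amsterdam", "Kyoto", "Lagos"]:
    print(model.generate_content(travel_prompt(city)).text)
```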
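And the JSON tip in practice: ask for JSON, then guard against the token limit chopping it mid-brace. This sketch assumes the `json-repair` package (`pip install json-repair`) and reuses the `model` from earlier; the order text is a stand-in:

```python
import json
from json_repair import repair_json

response = model.generate_content(
    'Parse this pizza order into JSON with keys "size" and "ingredients": '
    "a large pizza with olives and feta",
    generation_config=genai.GenerationConfig(max_output_tokens=60),  # tight on purpose
)

try:
    order = json.loads(response.text)
except json.JSONDecodeError:
    # Truncated or slightly malformed JSON? Let json-repair patch it up.
    order = json.loads(repair_json(response.text))

print(order)
```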
The Punchline
Prompt engineering isn’t magic. It’s an iterative grind of testing, tweaking, and figuring out how to have a sensible conversation with these incredibly powerful (and occasionally bizarre) AI models. It’s part science, part art, and a whole lot of “let’s try this and see what happens.”
So, go forth and prompt. May your outputs be relevant, your hallucinations minimal, and your data struggles slightly less struggle-y.