Module 3 - Prompt Engineering Techniques

Prompts with examples: Zero-Shot, One-Shot, Few-Shot techniques

In-context learning: the LLM learns the task on the spot from the examples you provide

The paper "Language Models are Few-Shot Learners" (Brown et al., 2020) showed that larger models benefit disproportionately from in-context examples

Zero-Shot Prompting - no examples given

One-Shot Prompting - 1 example given

Few-Shot Prompting - 2 or more examples given
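
The three variants differ only in how many worked examples precede the new input. A minimal sketch of that idea follows, assuming a sentiment-classification task; the build_prompt helper, the "Review:"/"Sentiment:" layout, and the example reviews are all hypothetical, chosen only for illustration.

```python
def build_prompt(task, examples, query):
    """Assemble a prompt: instruction, then worked examples, then the new input."""
    lines = [task, ""]
    for text, label in examples:
        lines += [f"Review: {text}", f"Sentiment: {label}", ""]
    # The final block is left open so the model completes the label.
    lines += [f"Review: {query}", "Sentiment:"]
    return "\n".join(lines)


TASK = "Classify the sentiment of each review as positive or negative."
EXAMPLES = [  # made-up demonstration pairs
    ("The battery lasts all day.", "positive"),
    ("It broke after a week.", "negative"),
    ("Setup took five minutes and just worked.", "positive"),
]
QUERY = "The screen scratches far too easily."

zero_shot = build_prompt(TASK, [], QUERY)           # no examples
one_shot = build_prompt(TASK, EXAMPLES[:1], QUERY)  # 1 example
few_shot = build_prompt(TASK, EXAMPLES, QUERY)      # 2 or more examples

print(few_shot)
```

The model's weights never change here; the examples only shape what it predicts for the final, open "Sentiment:" line.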

Thought-Based Prompting Techniques

Chain-of-Thought (CoT) Prompting - guide the AI to break down a problem into sequential, logical steps

Automatic Chain-of-Thought - no hand-written reasoning examples; simply tell the AI to work through the answer step by step (for example, "Let's think step by step")

Tree-of-Thought (ToT) - tell the AI to consider multiple paths at each step, like a decision tree; it can backtrack and explore different paths
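
The branching idea can be sketched as a small search over partial paths. This is a simplified illustration, not a definitive implementation: it assumes the "thoughts" are plain numbers and that expand, score, and is_goal (all hypothetical names) stand in for what would normally be model calls that propose next steps and judge them. Dropping low-scoring paths at each level plays the role of backtracking in this simplified version.

```python
def tree_of_thought(start, expand, score, is_goal, beam_width=2, max_depth=5):
    """Breadth-first search over partial reasoning paths, keeping the best few."""
    frontier = [[start]]
    for _ in range(max_depth):
        candidates = []
        for path in frontier:
            # Branch: every surviving path proposes several next thoughts.
            for thought in expand(path[-1]):
                new_path = path + [thought]
                if is_goal(thought):
                    return new_path
                candidates.append(new_path)
        if not candidates:
            return None
        # Prune: keep only the most promising paths, abandon the rest.
        candidates.sort(key=lambda p: score(p[-1]), reverse=True)
        frontier = candidates[:beam_width]
    return None


# Toy problem: reach TARGET from 1 using "+1" or "*2" steps.
TARGET = 10
path = tree_of_thought(
    start=1,
    expand=lambda n: [n + 1, n * 2],
    score=lambda n: -abs(TARGET - n),  # closer to the target is better
    is_goal=lambda n: n == TARGET,
    max_depth=6,
)
print(path)
```

With a real model, expand would ask for several candidate next steps and score would ask the model (or a separate evaluator) how promising each one looks.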

Thread-of-Thought - stay on one path without branching

Iteration-of-Thought (IoT) - after each step, the AI decides whether to continue on the current path, adjust its reasoning, or restart with a new approach.

Scratchpad Prompting - tell the AI to write down intermediate steps while working through a problem

Step Back and Reflection Prompting - Tell the AI to pause and take a broader view of its current progress before moving forward.

Self-Consistency - Tell the AI to generate multiple independent responses. Choose the most common answer.
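
The "choose the most common answer" step is a simple majority vote. A minimal sketch, assuming a sample_answer function that returns one final answer per call; the fake_sampler below is a hypothetical stand-in for a model sampled with some randomness, and its canned answers are made up.

```python
from collections import Counter


def self_consistent_answer(sample_answer, question, n_samples=5):
    """Sample several independent answers and return the most common one."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes, answers


# Hypothetical stand-in for a model: returns canned answers in turn.
_canned = iter(["42", "41", "42", "42", "24"])


def fake_sampler(question):
    return next(_canned)


answer, votes, samples = self_consistent_answer(fake_sampler, "What is 6 * 7?")
print(answer, votes)
```

In practice only the final answers are compared, not the reasoning text, since two correct chains of reasoning rarely match word for word.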

Rephrase and Respond (RaR) - ask the AI to rephrase or restate the input question before generating a response.

Echo Prompt Technique - simply adding "Repeat the question before answering it." to your prompt can make the model answer questions more effectively.

Automatic Prompt Engineering (APE)

Use algorithms or models to automatically optimize prompts.

Can save time.

Can make prompts that use fewer tokens

Use one AI to generate prompt variations; score those candidates using another AI; select the best-performing version.

Candidate prompts can be found by searching existing prompts or generated by an AI
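
The generate, score, select loop can be sketched as follows. Everything here is an illustrative assumption: generate_candidates stands in for the generator AI, fake_model stands in for the model being prompted, the evaluation pairs are made up, and character length is used as a crude proxy for token count when breaking ties.

```python
def generate_candidates(seed_prompt):
    """Stand-in for a generator model that proposes prompt variations."""
    return [
        seed_prompt,
        seed_prompt + " Answer with a single word.",
        seed_prompt + " Think step by step, then answer with a single word.",
    ]


def score_prompt(prompt, eval_set, run_model):
    """Fraction of evaluation inputs the prompt gets exactly right."""
    correct = sum(run_model(prompt, text) == expected for text, expected in eval_set)
    return correct / len(eval_set)


def select_best(seed_prompt, eval_set, run_model):
    """Score every candidate; prefer higher accuracy, then the shorter prompt."""
    scored = [
        (score_prompt(p, eval_set, run_model), p)
        for p in generate_candidates(seed_prompt)
    ]
    return max(scored, key=lambda pair: (pair[0], -len(pair[1])))


def fake_model(prompt, text):
    """Toy model: only gives a bare label when told to use a single word."""
    label = "positive" if "good" in text else "negative"
    return label if "single word" in prompt else f"The sentiment is {label}."


SEED = "Classify the sentiment of the review."
EVAL_SET = [("good value", "positive"), ("bad fit", "negative")]

best_score, best_prompt = select_best(SEED, EVAL_SET, fake_model)
print(best_score, best_prompt)
```

A prompt chosen this way is only as good as the evaluation set it was scored on, which is one reason optimized prompts can behave differently outside the benchmark.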

Security

Prompt injection: Adversarial inputs embedded in user data that attempt to manipulate model behavior.

Reliance on untrusted inputs: Automated pipelines that incorporate external data sources should treat that data as untrusted and minimize how much they depend on it.

Reproducibility: Automated prompt optimization may produce prompts that perform well on benchmarks but behave unexpectedly in production contexts.

Tool Use

Modern LLMs can access live data, perform computations, execute code, and call APIs.

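Tool calling (also called function calling) - the AI outputs a structured request naming a function and its arguments; the application runs the call and passes the result back to the AI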

Zero-Shot Tool Usage via Documentation - simply providing clear tool documentation allows LLMs to use APIs effectively

Overuse or misuse: Invoking tools unnecessarily can slow down LLM responses or lead to unexpected results.

Security threats: Tool-based prompting increases the risk surface, opening the door for malicious prompt injection or adversarial misuse.

Reliability: LLMs may output malformed tool calls or misuse documentation, emphasizing the need for robust prompt design and error handling
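
Documentation-driven tool use and the error handling it needs can be sketched together. This is a minimal illustration under stated assumptions: the get_weather tool, the TOOLS registry layout, and the JSON shape {"name": ..., "arguments": ...} are hypothetical and not tied to any particular vendor's API.

```python
import json

# Hypothetical tool registry: name -> description and parameter types.
TOOLS = {
    "get_weather": {
        "description": "Return the current weather for a city.",
        "parameters": {"city": str},
    }
}


def tool_docs(tools):
    """Render tool documentation as text to include in the prompt."""
    lines = []
    for name, spec in tools.items():
        params = ", ".join(f"{p}: {t.__name__}" for p, t in spec["parameters"].items())
        lines.append(f"{name}({params}) - {spec['description']}")
    return "\n".join(lines)


def parse_tool_call(raw, tools):
    """Validate a model-produced tool call; raise ValueError if it is malformed."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    if not isinstance(name, str) or name not in tools:
        raise ValueError(f"unknown tool: {name!r}")
    args = call.get("arguments", {})
    expected = tools[name]["parameters"]
    if not isinstance(args, dict) or set(args) != set(expected):
        raise ValueError(f"expected arguments {sorted(expected)}")
    for param, typ in expected.items():
        if not isinstance(args[param], typ):
            raise ValueError(f"argument {param!r} must be {typ.__name__}")
    return name, args


print(tool_docs(TOOLS))
print(parse_tool_call('{"name": "get_weather", "arguments": {"city": "Oslo"}}', TOOLS))
```

Rejecting unknown tool names and unexpected arguments before executing anything also narrows the attack surface if injected text tries to trigger a call.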

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a hybrid approach that integrates retrieval mechanisms with generative models.

The primary goal of RAG is to reduce hallucinations by grounding responses in retrieved information.

How RAG Works: A Two-Step Architecture

Retrieval: A retrieval component searches a predefined corpus to find information relevant to the query.

Generation: The relevant information is included in the prompt so the model can synthesize the information to create a response.
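
The two steps can be sketched end to end. This is a toy illustration, assuming a three-sentence made-up corpus and simple word overlap as the retrieval score; the function names are hypothetical, and the final prompt would be sent to a model in a real system.

```python
import re

# Made-up corpus for illustration only.
CORPUS = [
    "Refunds are issued within 14 days of a return request.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=1):
    """Step 1 (retrieval): rank documents by words shared with the query."""
    query_words = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(query_words & tokenize(doc)), reverse=True)
    return ranked[:k]


def build_rag_prompt(query, passages):
    """Step 2 (generation input): place the retrieved text in the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


query = "How long does shipping take?"
passages = retrieve(query, CORPUS, k=1)
prompt = build_rag_prompt(query, passages)
print(prompt)
```

Instructing the model to rely only on the supplied context is what ties the generation step back to the goal of reducing hallucinations.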

Applications of RAG

Question Answering: RAG models provide precise answers by retrieving relevant documents and generating responses based on that information.

Summarization: By accessing multiple sources of information, RAG creates comprehensive summaries that capture the essence of content.

Conversation: RAG enhances dialogue systems by allowing them to pull in real-time information, making interactions more informative and engaging.

Content Creation: Writers and content creators leverage RAG to generate articles or reports that are well-informed and relevant to current events or specific topics.

Enterprise Knowledge Bases: Organizations use RAG to allow employees to query internal documents, policies, and knowledge bases conversationally.

How Does RAG Find Relevant Documents?

Dense Retrieval

Keyword-Based Search