Friday, June 20, 2025

Three AI Application Design Strategies

 

1: Prompt Engineering (The "Instruction Manual")

This is the fastest, easiest, and most common way to create a "custom" experience. You are essentially giving the model a very detailed set of instructions and context within the prompt itself. This is surprisingly powerful.

  • What it is: Crafting a detailed prompt that tells the model who to be, what style to use, what rules to follow, and what information to use. This can include "few-shot" examples where you show it a few examples of the desired input/output.
  • When to use it:
    • For tasks that don't require extensive external knowledge.
    • When you need to control the tone, persona, or output format (e.g., JSON).
    • For prototyping and testing ideas quickly.
  • Pros: No training or infrastructure cost, instant to iterate on, requires no special tools.
  • Cons: Limited by the context window size; can be less reliable for very complex tasks; requires re-sending the instructions with every API call.

   Example: Creating a "Tech Reviewer" AI

 

You are 'Tech-No', a cynical and sarcastic tech reviewer.
Your goal is to review gadgets with a humorous, world-weary tone.

Your rules:
1.  Never be genuinely impressed. Find a flaw in everything.
2.  Use sarcasm and rhetorical questions.
3.  Keep reviews short and punchy (2-3 paragraphs).
4.  Always output the review and a 'Sarcasm Score' from 1 to 10.
5.  Format your output as a JSON object with keys "review" 
    and "sarcasm_score".

Here is an example:
Product: The new 'EverCharge' smartphone with a 7-day battery.
Output:
{
  "review": "Oh, fantastic. A phone battery that lasts a week.  
             So now I only have to confront the crushing emptiness of
             my existence once every seven days when I plug it in, 
             instead of daily? I suppose that's progress. I can't wait to 
             see the 'innovative' 1.3-megapixel camera they surely 
             paired with this marvel of modern engineering. Groundbreaking.",
  "sarcasm_score": 9
}

---

Now, review this product: The 'Pixel-Perfect Pro' tablet with an 8K display.
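
In an application, this entire prompt is typically sent as a system message (plus the few-shot example) on every API call. Below is a minimal sketch assuming the OpenAI Python SDK; the model name is illustrative, and any chat-style API with system/user roles follows the same pattern.

# Minimal sketch of wiring the "Tech-No" prompt into an API call.
# Assumes the OpenAI Python SDK; the model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# In practice, paste the full instruction block and few-shot example here.
SYSTEM_PROMPT = (
    "You are 'Tech-No', a cynical and sarcastic tech reviewer. "
    "Follow the rules and the example given, and reply with a JSON object "
    'containing "review" and "sarcasm_score".'
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": "Review this product: The 'Pixel-Perfect Pro' tablet with an 8K display.",
        },
    ],
    response_format={"type": "json_object"},  # nudge the model to emit valid JSON
)

print(response.choices[0].message.content)

Note that the con listed above still applies: the full instruction block is re-sent with every call and counts against the context window.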

 


2: Retrieval-Augmented Generation (RAG) (The "Open-Book Exam")

This is the most popular and powerful method for creating custom AI applications that use proprietary or real-time data. You give the model access to a specific body of knowledge to use when answering questions.

  • What it is: When a user asks a question, your system first retrieves relevant information from your own database (e.g., product docs, company policies, past support tickets). Then, it gives this retrieved information to the model along with the original question and asks it to formulate an answer based only on the provided information.
  • When to use it:
    • When you need the AI to answer questions about private or recent data (e.g., "What is our company's 2024 vacation policy?").
    • To dramatically reduce "hallucinations" (making things up), as the model is grounded in your specific documents.
    • When your knowledge base changes frequently.
  • Pros: Highly accurate for your specific domain; data is always fresh; relatively low cost compared to fine-tuning.
  • Cons: Requires setting up a vector database (like Pinecone, ChromaDB, or Vertex AI Vector Search) and a retrieval pipeline.

How it works (Simplified):

1. Index Your Data: You take all your documents (PDFs, docs, website data), break them into chunks, and store them as "embeddings" (numerical representations) in a vector database.

2. User Asks a Question: "What's the warranty on the XG-500 model?"

3. Retrieve: Your application converts the user's question into an embedding and searches the vector database for the most similar chunks of text (e.g., the warranty section from the XG-500 manual).

4. Generate: You send a prompt to the model that looks like this:

"Using ONLY the following context, answer the user's question.

Context: [Paste the retrieved text about the XG-500 warranty here]

User Question: What's the warranty on the XG-500 model?"
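
The four steps above can be sketched end to end in a few dozen lines. The code below is only an illustration, assuming the OpenAI Python SDK for both embeddings and generation; the model names are illustrative, and a plain Python list stands in for a real vector database such as Pinecone or ChromaDB.

# Minimal in-memory RAG sketch. Assumes the OpenAI Python SDK; model names
# are illustrative, and a Python list stands in for a vector database.
import numpy as np
from openai import OpenAI

client = OpenAI()

# Placeholder document chunks; in practice these come from your own data.
documents = [
    "XG-500 manual, warranty section: <your warranty text here>",
    "XG-500 manual, setup section: <your setup text here>",
]

def embed(texts):
    """Convert text chunks into embedding vectors."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Index: embed every chunk once, up front.
doc_vectors = embed(documents)

# 2-3. Retrieve: embed the question and find the most similar chunk.
question = "What's the warranty on the XG-500 model?"
q_vector = embed([question])[0]
scores = doc_vectors @ q_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vector)
)
best_chunk = documents[int(np.argmax(scores))]

# 4. Generate: ground the model in the retrieved context.
prompt = (
    "Using ONLY the following context, answer the user's question.\n\n"
    f"Context: {best_chunk}\n\n"
    f"User Question: {question}"
)
answer = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(answer.choices[0].message.content)

In production, the indexing step runs offline and the embeddings live in a vector database, but the retrieve-then-generate flow is exactly the same.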

 

3: Fine-Tuning (The "Specialized Training")

This is the most advanced method and is analogous to actual "training." You are slightly modifying the model's internal weights to teach it a new skill, style, or format that is difficult to replicate with prompting alone.

  • What it is: You create a large, high-quality dataset of hundreds or thousands of example prompts and their ideal completions. You then use this dataset to run a training job that adjusts the model, creating a new, custom version.
  • When to use it:
    • When you need to teach the model a very specific, nuanced style or format that is hard to describe in a prompt (e.g., mimicking your company's unique brand voice across thousands of examples).
    • To teach the model a new capability, like classifying legal documents in a very specific way.
    • When you have a massive dataset of high-quality examples.
  • Pros: Deeply embeds the new skill or style; can be more efficient and reliable for high-volume, repetitive tasks.
  • Cons: Expensive (requires paying for compute time); requires a large, clean dataset; risk of "catastrophic forgetting," where the model gets worse at general tasks.


Example Dataset for Fine-Tuning a "Code Explainer"

 
{"input_text": "def fib(n):\\n a, b = 0, 1\\n while a < n:\\n  print(a, end=' ')
                \\n  a, b = b, a+b",
  "output_text": "This Python function calculates and prints the Fibonacci 
                  sequence up to a given number 'n'. It initializes two variables,
                  'a' and 'b', and iteratively updates them while printing
                  the current value of 'a'."
 }
 {"input_text": "SELECT COUNT(DISTINCT user_id) FROM orders WHERE 
                 order_date > '2023-01-01';", 
  "output_text": "This SQL query counts the number of unique users who have 
                  placed an order after January 1st, 2023."
 }
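
Most fine-tuning services expect this data as a JSONL file (one JSON object per line) that you upload before launching the training job. Here is a small sketch in plain Python for writing and sanity-checking such a file; the input_text/output_text field names follow the example above, but the exact schema depends on the provider you use.

# Minimal sketch: write and validate a JSONL fine-tuning dataset.
# The "input_text"/"output_text" schema mirrors the example above;
# check your provider's docs for the exact field names it expects.
import json

# The two records from the dataset above (abbreviated for brevity).
examples = [
    {
        "input_text": "def fib(n):\n a, b = 0, 1\n while a < n:\n  print(a, end=' ')\n  a, b = b, a+b",
        "output_text": "This Python function prints the Fibonacci sequence up to 'n'.",
    },
    {
        "input_text": "SELECT COUNT(DISTINCT user_id) FROM orders WHERE order_date > '2023-01-01';",
        "output_text": "This SQL query counts unique users who ordered after January 1st, 2023.",
    },
]

# Write one JSON object per line (JSONL).
with open("code_explainer_train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity check: every line must parse and contain both fields.
with open("code_explainer_train.jsonl", encoding="utf-8") as f:
    for i, line in enumerate(f, 1):
        record = json.loads(line)
        assert "input_text" in record and "output_text" in record, f"bad record on line {i}"

print("Dataset looks well-formed.")

From there, the file is uploaded and a tuning job is started through whichever provider's fine-tuning service you use.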
    


Recommendation:

Always start with Prompt Engineering. Then, if you need the model to know about your specific data, implement RAG. Only consider Fine-Tuning as a last resort if the first two methods fail to meet your performance, style, or capability requirements.

 

    





