Master Generative AI, Part 4: Practical Applications

Part 4 of the Master Generative AI: A Step-by-Step Challenge series.

Series Map:

Part 1 → Foundation of AI & ML
Part 2 → Working with LLMs
Part 3 → Advanced Generative AI
Part 4 → Practical Applications ← you are here
Part 5 → Career & Capstone Projects

Theory meets reality in this part. We take the tools from Parts 1–3 and apply them to the domains where generative AI is already creating measurable business value — and where practitioners are most in demand in 2026.

Chapter 1: Generative AI for Code

Why Code Is the Killer App for LLMs

Code is text. LLMs trained on billions of lines of code from GitHub develop remarkable abilities:

Autocomplete: finishing functions from a signature or comment
Refactoring: improving existing code without changing behavior
Bug fixing: identifying and correcting errors
Test generation: writing unit tests from implementation
Documentation: generating docstrings, READMEs, and API docs
Translation: converting code from one language to another

GitHub Copilot: What It Actually Does

Copilot sends your current file + cursor position to an LLM, which predicts what comes next:

# You type this comment:
# Function to calculate compound interest

# Copilot suggests (Tab to accept):
def calculate_compound_interest(
    principal: float,
    rate: float,
    time: int,
    n: int = 12  # compounding frequency per year
) -> float:
    """Calculate compound interest.

    Args:
        principal: Initial investment amount
        rate: Annual interest rate (as decimal, e.g., 0.05 for 5%)
        time: Investment period in years
        n: Number of times interest compounds per year

    Returns:
        Final amount after compound interest
    """
    return principal * (1 + rate / n) ** (n * time)
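
Suggestions like this are worth a quick check before you accept them. A minimal sketch, assuming the suggested function is pasted in unchanged; the test name and the sample figures are illustrative and not part of Copilot:

```python
import math


def calculate_compound_interest(
    principal: float, rate: float, time: int, n: int = 12
) -> float:
    """Same formula as the suggestion above: A = P * (1 + r/n) ** (n * t)."""
    return principal * (1 + rate / n) ** (n * time)


def test_compound_interest() -> None:
    # Annual compounding reduces to P * (1 + r) ** t
    assert math.isclose(
        calculate_compound_interest(1000, 0.05, 10, n=1), 1000 * 1.05**10
    )
    # A zero rate leaves the principal unchanged
    assert calculate_compound_interest(1000, 0.0, 10) == 1000
    # At a positive rate, monthly compounding beats annual compounding
    monthly = calculate_compound_interest(1000, 0.05, 10, n=12)
    annual = calculate_compound_interest(1000, 0.05, 10, n=1)
    assert monthly > annual


test_compound_interest()
```

For 1,000 at 5% over 10 years, monthly compounding gives about 1,647.01 against about 1,628.89 for annual compounding, which is a quick way to confirm that the exponent and the rate / n term are the right way round.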

Code Llama and Local Code Models

For private codebases where you can't send code to external APIs:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Code Llama — open source, can run locally
model_id = "codellama/CodeLlama-7b-Instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def generate_code(instruction: str, context: str = "") -> str:
    prompt = f"""[INST] {instruction}

{f'Context: {context}' if context else ''}

Provide only the code, no explanation. [/INST]"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs, max_new_tokens=500, temperature=0.1, do_sample=True
        )
    return tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Generate unit tests
code = """
def merge_sorted_lists(list1: list, list2: list) -> list:
    result = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            result.append(list1[i]); i += 1
        else:
            result.append(list2[j]); j += 1
    return result + list1[i:] + list2[j:]
"""

tests = generate_code(
    instruction="Write comprehensive pytest unit tests for this function",
    context=code
)
print(tests)
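
Even with "Provide only the code" in the prompt, instruction-tuned models often wrap their answer in a Markdown fence or add a sentence around it. A small post-processing sketch, assuming that output shape; `extract_code` and `is_valid_python` are hypothetical helper names, not part of transformers:

```python
import ast
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def extract_code(model_output: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    pattern = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(model_output)
    return match.group(1).strip() if match else model_output.strip()


def is_valid_python(code: str) -> bool:
    """Cheap syntax check before the generated code is written to disk or run."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


raw = "Here are the tests:\n" + FENCE + "python\nassert 1 + 1 == 2\n" + FENCE
print(extract_code(raw))                   # the bare assert statement
print(is_valid_python(extract_code(raw)))  # True
```

A syntax check only proves the output parses. Generated tests still need to be run and read before they are trusted.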

Practical Patterns for Code AI

from openai import OpenAI

client = OpenAI()

def ai_code_review(code: str, language: str = "Python") -> dict:
    """AI-powered code review."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "You are a senior software engineer doing code review. "
                       "Be specific, actionable, and kind."
        }, {
            "role": "user",
            "content": f"""Review this {language} code and return JSON with:
{{
  "issues": [{{ "line": N, "severity": "critical|major|minor", "description": "...", "fix": "..." }}],
  "score": 1-10,
  "summary": "..."
}}

Code:
```{language.lower()}
{code}
```"""
        }],
        response_format={"type": "json_object"}
    )
    import json
    return json.loads(response.choices[0].message.content)

def ai_generate_tests(code: str) -> str:
    """Generate unit tests for given code."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "Generate pytest tests with >90% coverage. Use parametrize for edge cases."
        }, {
            "role": "user",
            "content": f"Write tests for:\n```python\n{code}\n```"
        }]
    )
    return response.choices[0].message.content

def ai_explain_code(code: str, audience: str = "junior developer") -> str:
    """Explain code for a specific audience."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Explain this code to a {audience}. "
                       f"Use simple language and analogies:\n\n{code}"
        }]
    )
    return response.choices[0].message.content
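
The review JSON becomes more useful once something acts on it. A minimal sketch of a merge gate, assuming the review dictionary follows the schema requested in ai_code_review above; `triage_review` and the `min_score` threshold are illustrative choices:

```python
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


def triage_review(review: dict, min_score: int = 6) -> dict:
    """Sort issues by severity and decide whether the change should be blocked."""
    # Drop issues whose severity is not one of the three expected labels
    issues = [
        issue for issue in review.get("issues", [])
        if issue.get("severity") in SEVERITY_ORDER
    ]
    issues.sort(key=lambda issue: SEVERITY_ORDER[issue["severity"]])
    has_critical = any(issue["severity"] == "critical" for issue in issues)
    score = review.get("score", 0)
    return {"block_merge": has_critical or score < min_score, "issues": issues}


example = {
    "issues": [
        {"line": 3, "severity": "minor", "description": "unused import", "fix": "remove it"},
        {"line": 10, "severity": "critical", "description": "SQL built by string concat", "fix": "use parameters"},
    ],
    "score": 8,
    "summary": "Mostly fine, one injection risk.",
}
print(triage_review(example)["block_merge"])  # True, because of the critical issue
```

Treat the model's score as advisory. The gate here blocks on any critical issue regardless of the score.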

Chapter 2: Generative AI for Business

Core Business Use Cases in 2026

Use Case	Typical Impact	Typical Tool
Meeting summarization	2 hrs/meeting → 2 min	Whisper + GPT-4o
Email drafting	20 min → 2 min	Claude/GPT-4o
Report generation	4 hrs → 30 min	LLM + structured data
Customer support	80% deflection	RAG chatbot
Contract analysis	3 hrs → 15 min	GPT-4o with vision
Data extraction from docs	1 hr/doc → 30 sec	Vision + structured output

Document Intelligence Pipeline

import anthropic
import base64
import json
from pathlib import Path

client = anthropic.Anthropic()

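def extract_invoice_data(file_path: str) -> dict:
    """Extract structured data from an invoice image (PNG/JPEG) or PDF."""
    path = Path(file_path)
    file_data = base64.standard_b64encode(path.read_bytes()).decode()

    # PDFs go in a "document" block; images go in an "image" block with their real MIME type
    if path.suffix.lower() == ".pdf":
        block_type, media_type = "document", "application/pdf"
    else:
        block_type = "image"
        media_type = "image/jpeg" if path.suffix.lower() in {".jpg", ".jpeg"} else "image/png"

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": [
                {"type": block_type, "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": file_data
                }},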
                {"type": "text", "text": """Extract all invoice data as JSON:
{
  "vendor": {"name": "", "address": "", "email": ""},
  "invoice_number": "",
  "date": "YYYY-MM-DD",
  "due_date": "YYYY-MM-DD",
  "line_items": [{"description": "", "quantity": 0, "unit_price": 0, "total": 0}],
  "subtotal": 0,
  "tax": 0,
  "total": 0,
  "currency": "USD"
}
Return ONLY valid JSON, no explanation."""}
            ]
        }]
    )
    return json.loads(response.content[0].text)

def summarize_meeting(transcript: str) -> dict:
    """Structure a meeting transcript into actionable summary."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1500,
        messages=[{
            "role": "user",
            "content": f"""Summarize this meeting transcript as JSON:
{{
  "tldr": "one sentence summary",
  "key_decisions": ["..."],
  "action_items": [{{"owner": "", "task": "", "due": "YYYY-MM-DD"}}],
  "open_questions": ["..."],
  "next_meeting_agenda": ["..."]
}}

Transcript:
{transcript}"""
        }]
    )
    return json.loads(response.content[0].text)
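
Extracted numbers can look plausible and still be wrong, so it helps to check the arithmetic before the data reaches an accounting system. A minimal sketch, assuming the dictionary follows the invoice schema above; `check_invoice_totals` and the one-cent tolerance are illustrative:

```python
from decimal import Decimal


def check_invoice_totals(invoice: dict, tolerance: str = "0.01") -> list[str]:
    """Return a list of arithmetic inconsistencies found in extracted invoice data."""
    tol = Decimal(tolerance)
    problems = []

    def money(value) -> Decimal:
        # Go through str() so floats like 10.5 convert without binary noise
        return Decimal(str(value))

    items = invoice.get("line_items", [])
    for idx, item in enumerate(items):
        expected = money(item["quantity"]) * money(item["unit_price"])
        if abs(expected - money(item["total"])) > tol:
            problems.append(f"line {idx}: quantity x unit_price != total")

    line_sum = sum((money(item["total"]) for item in items), Decimal("0"))
    if abs(line_sum - money(invoice["subtotal"])) > tol:
        problems.append("line items do not add up to subtotal")
    if abs(money(invoice["subtotal"]) + money(invoice["tax"]) - money(invoice["total"])) > tol:
        problems.append("subtotal + tax != total")
    return problems


invoice = {
    "line_items": [
        {"description": "Widget", "quantity": 2, "unit_price": 10.5, "total": 21.0},
        {"description": "Shipping", "quantity": 1, "unit_price": 4, "total": 4},
    ],
    "subtotal": 25.0,
    "tax": 2.5,
    "total": 27.5,
}
print(check_invoice_totals(invoice))  # [] means the numbers are consistent
```

An empty list does not prove the extraction is right, only that it is internally consistent. Anything flagged is a good candidate for human review.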

Business Automation with AI

import json

from openai import OpenAI

class BusinessAIAssistant:
    def __init__(self):
        self.client = OpenAI()

    def draft_professional_email(
        self, context: str, tone: str = "professional", recipient: str = "colleague"
    ) -> str:
        return self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "system",
                "content": f"Write a {tone} email to a {recipient}. "
                           "Be concise. Do not add placeholders."
            }, {
                "role": "user", "content": context
            }]
        ).choices[0].message.content

    def analyze_customer_feedback(self, reviews: list[str]) -> dict:
        combined = "\n".join(f"- {r}" for r in reviews[:50])  # limit
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": f"""Analyze these customer reviews. Return JSON:
{{
  "overall_sentiment": "positive|negative|mixed",
  "nps_estimate": 0-100,
  "top_positives": ["..."],
  "top_complaints": ["..."],
  "suggested_improvements": ["..."],
  "urgent_issues": ["..."]
}}

Reviews:
{combined}"""
            }],
            response_format={"type": "json_object"}
        )
        import json
        return json.loads(response.choices[0].message.content)

    def generate_report(self, data: dict, report_type: str) -> str:
        return self.client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "system",
                "content": f"You are a business analyst. Write a {report_type} report. "
                           "Use professional language. Include executive summary, "
                           "findings, and recommendations."
            }, {
                "role": "user",
                "content": f"Generate report from this data:\n{data}"
            }]
        ).choices[0].message.content

Chapter 3: Generative AI for Education & Research

AI-Powered Learning Tools

from openai import OpenAI

client = OpenAI()

def adaptive_tutor(topic: str, student_level: str, question: str) -> str:
    """Tutor that adapts to student level."""
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": f"""You are an expert tutor teaching {topic}.
Student level: {student_level}

Rules:
- Use analogies and examples appropriate for their level
- Check understanding with a follow-up question
- If they seem confused, approach from a different angle
- Celebrate progress, be encouraging"""
        }, {
            "role": "user", "content": question
        }]
    ).choices[0].message.content

def generate_quiz(topic: str, num_questions: int = 5, difficulty: str = "medium") -> list:
    """Generate a quiz with answers."""
    import json
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"""Create {num_questions} {difficulty} multiple-choice questions about {topic}.
Return a JSON object with a single "questions" key:
{{"questions": [{{
  "question": "...",
  "options": ["A) ...", "B) ...", "C) ...", "D) ..."],
  "answer": "A",
  "explanation": "..."
}}]}}"""
        }],
        # JSON mode returns a JSON object, so the list is wrapped under "questions"
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)["questions"]

def research_assistant(query: str, context_papers: list[str] = None) -> dict:
    """AI research assistant for literature analysis."""
    system = """You are a research assistant. Help researchers:
- Summarize academic papers accurately
- Identify research gaps
- Suggest related work
- Explain technical concepts clearly
Always note limitations and uncertainties."""

    user_content = f"Research query: {query}"
    if context_papers:
        user_content += f"\n\nRelevant papers:\n" + "\n---\n".join(context_papers)

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": user_content}
        ]
    )
    return {"answer": response.choices[0].message.content,
            "tokens_used": response.usage.total_tokens}

Automating Literature Review

import arxiv
import numpy as np
from openai import OpenAI
from sentence_transformers import SentenceTransformer

def literature_review_pipeline(research_topic: str, max_papers: int = 20) -> dict:
    """Automated literature review using ArXiv + AI."""
    # Step 1: Fetch the papers arXiv ranks as most relevant to the topic
    search = arxiv.Search(
        query=research_topic,
        max_results=max_papers,
        sort_by=arxiv.SortCriterion.Relevance
    )
    # Client.results() replaces the deprecated Search.results()
    papers = list(arxiv.Client().results(search))

    # Step 2: Embed abstracts as unit-length vectors, so dot product = cosine similarity
    embed_model = SentenceTransformer("all-MiniLM-L6-v2")
    abstracts = [p.summary[:500] for p in papers]
    embeddings = embed_model.encode(abstracts, normalize_embeddings=True)

    # Step 3: Rank abstracts by similarity to the query and keep the top 5
    query_embedding = embed_model.encode([research_topic], normalize_embeddings=True)
    similarities = np.dot(embeddings, query_embedding.T).flatten()
    top_indices = np.argsort(similarities)[::-1][:5]

    top_papers = [papers[i] for i in top_indices]

    # Step 4: AI synthesis
    paper_summaries = "\n\n".join([
        f"Title: {p.title}\nAuthors: {', '.join(str(a) for a in p.authors[:3])}\n"
        f"Abstract: {p.summary[:300]}..."
        for p in top_papers
    ])

    client = OpenAI()
    synthesis = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"""Based on these papers about '{research_topic}', provide:
1. Key themes and findings
2. Research gaps and open questions
3. Methodological approaches used
4. Most impactful papers and why

Papers:
{paper_summaries}"""
        }]
    ).choices[0].message.content

    return {
        "topic": research_topic,
        "papers_analyzed": len(papers),
        "top_papers": [{"title": p.title, "url": p.entry_id} for p in top_papers],
        "synthesis": synthesis
    }
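
The ranking step in the pipeline is a similarity search over embedding vectors. A dependency-free sketch of that step, using tiny hand-made vectors in place of real embeddings; `cosine_similarity` and `top_k` are illustrative helpers:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Indices of the k documents most similar to the query, best first."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]


query = [1.0, 0.0]
docs = [
    [0.9, 0.1],  # nearly the same direction as the query
    [5.0, 5.0],  # long vector pointing 45 degrees away
    [0.0, 1.0],  # orthogonal to the query
]
print(top_k(query, docs))  # [0, 1]
```

A raw dot product would rank the second document first (5.0 against 0.9) only because the vector is longer. Dividing by the lengths, or normalizing the embeddings up front, removes that bias.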

Chapter 4: Generative AI in Healthcare

Current Applications (2026)

FDA-cleared or FDA-authorized AI applications:
  - Radiology: flag urgent findings in CT scans (Aidoc)
  - Ophthalmology: detect diabetic retinopathy in retinal images (IDx-DR)
  - Pathology: analyze biopsy slides (Paige.AI)
  - Cardiology: ECG interpretation (Apple Watch, Cardiologs)

Widely used, but not FDA-cleared medical devices:
  - Drug discovery: protein structure prediction (AlphaFold 3)
  - Clinical notes: auto-generate from doctor-patient conversation

Emerging (Research Stage):
  - Drug-drug interaction prediction
  - Personalized treatment recommendations
  - Rare disease diagnosis from phenotypes
  - Clinical trial patient matching

Protein Structure Prediction with ESMFold

# ESMFold (an open-source alternative to AlphaFold) via the public ESM Atlas API
import requests

def predict_protein_structure(sequence: str) -> dict:
    """Predict 3D structure of a protein from amino acid sequence."""
    # ESMFold API (Meta's open-source protein folding model)
    response = requests.post(
        "https://api.esmatlas.com/foldSequence/v1/pdb/",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        data=sequence,
        timeout=120
    )
    response.raise_for_status()  # fail loudly instead of returning an error page as a "structure"
    return {
        "pdb_structure": response.text,  # 3D structure in PDB format
        "sequence_length": len(sequence),
        "sequence": sequence
    }

# Example: Insulin A-chain
insulin_a = "GIVEQCCTSICSLYQLENYCN"
result = predict_protein_structure(insulin_a)
# Returns PDB format data for visualization in PyMOL or Mol*
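
A folding request is slow, so it is worth rejecting malformed input before the network call. A minimal validation sketch, assuming only the 20 standard amino acid letters are accepted; `clean_sequence` and the `max_length` default of 400 are assumptions to adjust for the service you call:

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str, max_length: int = 400) -> str:
    """Normalize a protein sequence and reject anything that is not a standard residue."""
    seq = "".join(raw.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residues: {', '.join(invalid)}")
    if len(seq) > max_length:
        raise ValueError(f"sequence has {len(seq)} residues, limit is {max_length}")
    return seq


print(clean_sequence(" givEQCCTSICSLYQLENYCN\n"))  # GIVEQCCTSICSLYQLENYCN
```

The cleaned string can be passed straight to a function such as predict_protein_structure above.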

Medical Document Analysis

def analyze_medical_report(report_text: str) -> dict:
    """Extract structured information from medical reports."""
    import json
    from openai import OpenAI

    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """You are a medical information extraction system.
Extract information accurately. Mark uncertain items with '(uncertain)'.
NEVER provide medical advice or diagnosis. Only extract what is stated."""
        }, {
            "role": "user",
            "content": f"""Extract from this medical report:
{{
  "patient_demographics": {{}},
  "chief_complaint": "",
  "diagnoses": [{{"icd_code": "", "description": ""}}],
  "medications": [{{"name": "", "dose": "", "frequency": ""}}],
  "lab_results": [{{"test": "", "value": "", "unit": "", "flag": "normal|high|low"}}],
  "recommendations": [],
  "follow_up": ""
}}

Report:
{report_text}"""
        }],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)
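
The system prompt asks the model to extract only what is stated, but that instruction is not a guarantee. A minimal grounding check, assuming the extraction follows the schema above; `unsupported_medications` is a hypothetical helper that flags medication names which never appear in the source text:

```python
def unsupported_medications(extracted: dict, report_text: str) -> list[str]:
    """Return extracted medication names that do not literally occur in the report."""
    text = report_text.lower()
    return [
        med["name"]
        for med in extracted.get("medications", [])
        if med.get("name") and med["name"].lower() not in text
    ]


report = "Patient takes Metformin 500 mg twice daily. No known allergies."
extracted = {
    "medications": [
        {"name": "Metformin", "dose": "500 mg", "frequency": "twice daily"},
        {"name": "Lisinopril", "dose": "10 mg", "frequency": "daily"},
    ]
}
print(unsupported_medications(extracted, report))  # ['Lisinopril']
```

Exact substring matching misses abbreviations and brand or generic synonyms, so a flagged item means "send to a human reviewer", not "definitely hallucinated".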

Healthcare AI Caution

AI in healthcare requires:

Regulatory compliance: FDA (US), CE marking (EU), TGA (Australia)
Clinical validation: results must be clinically validated on diverse populations
Human oversight: AI assists, never replaces clinical judgment
Audit trails: every AI decision must be logged and explainable
Data privacy: HIPAA (US), PDPA (Thailand), GDPR (EU) compliance

Chapter 5: Generative AI in Marketing

Content Generation at Scale

from openai import OpenAI
from dataclasses import dataclass

client = OpenAI()

@dataclass
class MarketingContent:
    product_name: str
    product_description: str
    target_audience: str
    tone: str = "professional"
    brand_voice: str = "innovative and trustworthy"

def generate_marketing_suite(content: MarketingContent) -> dict:
    """Generate a full suite of marketing content from product info."""
    context = f"""
Product: {content.product_name}
Description: {content.product_description}
Target Audience: {content.target_audience}
Tone: {content.tone}
Brand Voice: {content.brand_voice}"""

    def generate(task: str) -> str:
        return client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": f"Marketing copywriter.{context}"},
                {"role": "user", "content": task}
            ],
            temperature=0.8  # more creative for marketing
        ).choices[0].message.content

    return {
        "headline": generate("Write 5 compelling headlines (max 10 words each). Return as numbered list."),
        "tagline": generate("Write 3 punchy taglines (max 6 words each)."),
        "email_subject": generate("Write 5 email subject lines with >30% open rate potential."),
        "social_linkedin": generate("Write a LinkedIn post (200 words, professional, include CTA)."),
        "social_twitter": generate("Write 3 tweets (max 280 chars each, include relevant hashtags)."),
        "ad_copy_short": generate("Write a Google Ads headline (30 chars) and description (90 chars)."),
        "seo_meta": generate("Write an SEO meta title (60 chars) and description (155 chars)."),
        "product_description": generate("Write a compelling product description (150 words, benefits-focused)."),
    }

# Usage
product = MarketingContent(
    product_name="CloudAI Analytics",
    product_description="Real-time business intelligence platform powered by AI",
    target_audience="B2B SaaS CTOs and data teams",
    tone="confident and innovative",
)
suite = generate_marketing_suite(product)
for asset, content in suite.items():
    print(f"\n{'='*40}\n{asset.upper()}\n{content}")
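
Character limits in a prompt are requests, not constraints, and models overshoot them regularly. A minimal sketch for checking generated copy against the limits quoted in the prompts above; `over_limit` and the keys in `LIMITS` are illustrative names:

```python
LIMITS = {
    "google_ads_headline": 30,
    "google_ads_description": 90,
    "tweet": 280,
    "seo_title": 60,
    "seo_description": 155,
}


def over_limit(assets: dict[str, str], limits: dict[str, int] = LIMITS) -> dict[str, int]:
    """Map each asset that exceeds its limit to the number of characters over."""
    return {
        name: len(text) - limits[name]
        for name, text in assets.items()
        if name in limits and len(text) > limits[name]
    }


assets = {
    "google_ads_headline": "Real-Time Business Intelligence, Powered by AI",
    "tweet": "Dashboards that update as fast as your data does.",
}
print(over_limit(assets))  # the headline is over its 30-character limit
```

Anything reported here can be sent back to the model with the overshoot in the follow-up prompt, or trimmed by a human editor.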

Personalization at Scale

def personalized_email_campaign(
    base_offer: str, customer_segments: list[dict]
) -> list[dict]:
    """Generate personalized emails for each customer segment."""
    results = []
    for segment in customer_segments:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "system",
                "content": "Email marketing specialist. Write personalized, conversion-focused emails."
            }, {
                "role": "user",
                "content": f"""Write a personalized marketing email.

Offer: {base_offer}
Customer segment: {segment['name']}
Demographics: {segment.get('demographics', '')}
Past behavior: {segment.get('past_behavior', '')}
Pain points: {segment.get('pain_points', '')}

Format: Subject line + Email body (150 words max). Include one clear CTA."""
            }]
        )
        results.append({
            "segment": segment["name"],
            "email": response.choices[0].message.content
        })
    return results

# A/B testing with AI
def ab_test_copy(concept: str, variants: int = 3) -> str:
    """Generate multiple copy variants for A/B testing, returned as one labeled text block."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"""Create {variants} distinct A/B test variants for:
{concept}

Make each meaningfully different:
- Variant A: benefit-focused
- Variant B: urgency-focused  
- Variant C: social proof-focused

Return each clearly labeled."""
        }]
    )
    return response.choices[0].message.content
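
Generating variants is the easy half of an A/B test. Deciding which one won needs a significance check on the conversion counts. A minimal sketch using a two-proportion z-test with only the standard library; the function name and the sample counts are illustrative:

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rate, B minus A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no conversions at all, or everyone converted
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Variant A: 100 conversions from 1,000 sends. Variant B: 150 from 1,000.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these counts z is about 3.38, well past the usual 1.96 threshold, so the lift is unlikely to be noise. With small samples the normal approximation is unreliable and an exact test is the safer choice.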

Chapter 6: Building an AI Agent with Tool-Calling & APIs

What Is an AI Agent?

An agent is an LLM that can use tools (functions, APIs) to take actions in the world:

User: "What's the weather in Bangkok and should I bring an umbrella?"

Agent loop:
  1. Think: "I need weather data. I have a weather tool."
  2. Call tool: get_weather(city="Bangkok")
  3. Observe result: {"temp": 32, "humidity": 90, "rain_chance": 70}
  4. Think: "High rain chance. I should recommend an umbrella."
  5. Respond: "It's 32°C with 70% chance of rain. Yes, bring an umbrella!"

Building an Agent with Tool-Calling

from openai import OpenAI
import json
import requests
from datetime import datetime

client = OpenAI()

# Define tools (functions the agent can call)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "country_code": {"type": "string", "description": "ISO country code e.g. TH"}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "Math expression e.g. '15 * 24 + 100'"}
                },
                "required": ["expression"]
            }
        }
    },
]

# Tool implementations
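def get_weather(city: str, country_code: str = "") -> dict:
    """Call the OpenWeatherMap current-weather API."""
    import os
    # Assumed variable name; read the key from the environment, not from source code
    api_key = os.environ["OPENWEATHER_API_KEY"]
    # OpenWeatherMap accepts "city,country" in q; requests URL-encodes the params
    query = f"{city},{country_code}" if country_code else city
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": query, "appid": api_key, "units": "metric"},
        timeout=5,
    )
    resp.raise_for_status()
    data = resp.json()
    return {
        "city": city,
        "temperature": data["main"]["temp"],
        "description": data["weather"][0]["description"],
        "humidity": data["main"]["humidity"],
    }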

def calculate(expression: str) -> dict:
    """Safely evaluate math expression."""
    try:
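        # Character whitelist: blocks names and attribute access, but "**" still
        # passes, so a huge exponent can hang the process; prefer a parser in production
        allowed = set("0123456789+-*/().% ")
        if not all(c in allowed for c in expression):
            return {"error": "Invalid characters in expression"}
        result = eval(expression)  # eval on model-supplied input stays risky even after filtering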
        return {"expression": expression, "result": result}
    except Exception as e:
        return {"error": str(e)}

def search_web(query: str) -> dict:
    """Simulate web search (replace with Serper/Tavily API)."""
    return {"query": query, "results": f"[Search results for: {query}]"}

TOOL_FUNCTIONS = {
    "get_weather": get_weather,
    "calculate": calculate,
    "search_web": search_web,
}

# The agent loop
def run_agent(user_message: str, max_iterations: int = 10) -> str:
    messages = [
        {"role": "system", "content": f"You are a helpful assistant. Today is {datetime.now().date()}. Use tools when needed."},
        {"role": "user", "content": user_message}
    ]

    for iteration in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )

        choice = response.choices[0]
        messages.append(choice.message)  # add assistant message to history

        # If no tool calls → we have our final answer
        if not choice.message.tool_calls:
            return choice.message.content

        # Execute each tool call
        for tool_call in choice.message.tool_calls:
            func_name = tool_call.function.name
            func_args = json.loads(tool_call.function.arguments)

            print(f"[Agent] Calling {func_name}({func_args})")
            result = TOOL_FUNCTIONS[func_name](**func_args)
            print(f"[Agent] Result: {result}")

            # Add tool result to messages
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

    return "Agent reached maximum iterations without completing."

# Test the agent
print(run_agent("What's the weather in Bangkok and if the temperature is above 30°C, how many degrees above freezing is that?"))
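
Tool arguments come from the model, so a calculator tool should treat them as untrusted input. One alternative to eval is to parse the expression and evaluate only a fixed set of node types. A minimal sketch using the standard ast module; `safe_calculate` is an illustrative name and the operator set is a deliberate restriction (no exponentiation):

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> dict:
    """Evaluate +, -, *, /, % on numeric literals without calling eval."""

    def evaluate(node: ast.AST):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return {"expression": expression, "result": evaluate(tree.body)}
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return {"error": str(exc)}


print(safe_calculate("15 * 24 + 100"))  # result is 460
print(safe_calculate("9**9**9"))        # rejected: exponentiation is not allowed
```

Because the return shape matches the calculate tool above, it can be registered under the same "calculate" key in a tool table.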

Multi-Agent Systems

For complex tasks, multiple specialized agents collaborate:

from openai import OpenAI

client = OpenAI()

def researcher_agent(topic: str) -> str:
    """Specialized agent for research tasks."""
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "You are a research specialist. Find facts, cite sources, be accurate."
        }, {"role": "user", "content": f"Research: {topic}"}]
    ).choices[0].message.content

def writer_agent(research: str, format: str) -> str:
    """Specialized agent for writing."""
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": f"You are an expert writer. Format: {format}. Be engaging and clear."
        }, {"role": "user", "content": f"Write using this research:\n{research}"}]
    ).choices[0].message.content

def critic_agent(content: str) -> str:
    """Specialized agent for quality review."""
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "You are an editor. Find factual errors, logical gaps, improve clarity."
        }, {"role": "user", "content": f"Review and improve:\n{content}"}]
    ).choices[0].message.content

def orchestrated_content_pipeline(topic: str, format: str = "blog post") -> str:
    """Orchestrate multiple agents to produce high-quality content."""
    print(f"[Orchestrator] Starting pipeline for: {topic}")

    print("[Orchestrator] Assigning to Researcher...")
    research = researcher_agent(topic)

    print("[Orchestrator] Assigning to Writer...")
    draft = writer_agent(research, format)

    print("[Orchestrator] Assigning to Critic...")
    final = critic_agent(draft)

    return final
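
The pipeline above runs each agent exactly once. A common variation lets the critic send a draft back for revision until it passes or a round limit is hit. A minimal sketch with the agents passed in as plain callables, so it runs without API calls; the `approved` and `feedback` keys are an assumed contract, not what critic_agent above returns:

```python
from typing import Callable


def revise_until_approved(
    draft: str,
    critic: Callable[[str], dict],
    writer: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Alternate critic and writer until the critic approves or rounds run out."""
    for _ in range(max_rounds):
        review = critic(draft)
        if review["approved"]:
            break
        draft = writer(draft, review["feedback"])
    return draft


# Stub agents standing in for LLM calls
def stub_critic(draft: str) -> dict:
    ok = "[source]" in draft
    return {"approved": ok, "feedback": "" if ok else "add a source"}


def stub_writer(draft: str, feedback: str) -> str:
    return draft + " [source]"


print(revise_until_approved("LLMs can write code.", stub_critic, stub_writer))
```

The round limit matters: without it, a critic that never approves turns the loop into an unbounded stream of paid API calls.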

Chapter 7: Deploying AI Models on Cloud

Deployment Architecture Options

Option A: API (Simplest, most common)
  Your App → OpenAI/Anthropic/Google API → Response
  Pro: Zero infrastructure, always updated
  Con: Cost scales with usage, data leaves your environment

Option B: Managed AI Services
  Your App → AWS Bedrock / GCP Vertex AI / Azure OpenAI → Response
  Pro: Enterprise compliance, regional data residency
  Con: Less model variety, vendor lock-in

Option C: Self-hosted (vLLM on GPU servers)
  Your App → Your vLLM Server → Open-source model → Response
  Pro: Full control, data privacy, predictable cost at scale
  Con: GPU expertise required, operational overhead
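
The trade-off between Option A and Option C is largely arithmetic: API cost grows with tokens, while a self-hosted server costs roughly the same whether it is busy or idle. A back-of-the-envelope sketch; both prices are placeholder assumptions, not quotes from any provider:

```python
def monthly_api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """API spend for a month at a flat blended price per million tokens."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(gpu_cost_per_month: float, price_per_million: float) -> float:
    """Monthly token volume at which a fixed-cost server matches the API bill."""
    return gpu_cost_per_month / price_per_million * 1_000_000


# Assumed figures: 5.00 per million tokens via API, 1,500 per month for a GPU server
API_PRICE = 5.00
GPU_MONTHLY = 1500.00

print(f"Break-even: {break_even_tokens(GPU_MONTHLY, API_PRICE):,.0f} tokens/month")
print(f"API cost at 50M tokens: {monthly_api_cost(50_000_000, API_PRICE):,.2f}")
```

With these placeholder numbers the break-even sits at 300 million tokens a month. The sketch ignores engineering time, and it assumes one server can actually sustain that throughput, which is where the "operational overhead" in Option C shows up.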

AWS Bedrock Deployment

import boto3
import json

# AWS Bedrock — access Claude, LLaMA, Titan, and others via AWS
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def bedrock_chat(prompt: str, model_id: str = "anthropic.claude-3-5-sonnet-20241022-v2:0") -> str:
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{"role": "user", "content": prompt}]
    })

    response = bedrock.invoke_model(
        modelId=model_id,
        contentType="application/json",
        accept="application/json",
        body=body
    )

    result = json.loads(response["body"].read())
    return result["content"][0]["text"]

# Streaming on Bedrock
def bedrock_stream(prompt: str) -> None:
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{"role": "user", "content": prompt}]
    })

    response = bedrock.invoke_model_with_response_stream(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
        body=body
    )

    for event in response["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk["type"] == "content_block_delta":
            print(chunk["delta"]["text"], end="", flush=True)

GCP Vertex AI Deployment

import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize Vertex AI
vertexai.init(project="your-project-id", location="us-central1")

# Use Gemini models through Vertex AI
model = GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Explain generative AI in 3 sentences.")
print(response.text)

# Deploy your own model to Vertex AI Endpoints
from google.cloud import aiplatform

def deploy_custom_model(
    model_artifact_uri: str,
    model_display_name: str,
    machine_type: str = "n1-standard-4"
) -> str:
    """Deploy a fine-tuned model to Vertex AI."""
    aiplatform.init(project="your-project-id", location="us-central1")

    model = aiplatform.Model.upload(
        display_name=model_display_name,
        artifact_uri=model_artifact_uri,  # gs://your-bucket/model/
        serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.2-2:latest"
    )

    endpoint = model.deploy(
        machine_type=machine_type,
        accelerator_type="NVIDIA_TESLA_T4",
        accelerator_count=1,
        min_replica_count=1,
        max_replica_count=5,  # auto-scales!
    )
    return endpoint.resource_name

Azure OpenAI Service

from openai import AzureOpenAI

# Azure OpenAI — same API, enterprise compliance
azure_client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_key="YOUR_AZURE_OPENAI_KEY",
    api_version="2024-02-01"
)

response = azure_client.chat.completions.create(
    model="gpt-4o",  # your deployment name in Azure
    messages=[{"role": "user", "content": "What is generative AI?"}]
)
print(response.choices[0].message.content)

Production Deployment Checklist

Infrastructure:
  ☐ GPU node provisioned with right VRAM for chosen model
  ☐ Load balancer in front of multiple model servers
  ☐ Auto-scaling configured (scale up under load, down at idle)
  ☐ Health checks and readiness probes on model endpoints
  ☐ Model warm-up request at startup (avoid cold start latency)

Reliability:
  ☐ Fallback model if primary is unavailable
  ☐ Request timeouts and retry logic
  ☐ Rate limiting per user/API key
  ☐ Circuit breaker for downstream dependencies

Observability:
  ☐ Request logging (prompt, response, latency, token count)
  ☐ Cost tracking (tokens × price per token)
  ☐ Error rate alerting (>1% error rate → page on-call)
  ☐ Latency percentiles (p50, p95, p99 in Grafana)

Security:
  ☐ API keys rotated, stored in secrets manager (not in code)
  ☐ Input validation and sanitization
  ☐ Output moderation (especially for public-facing apps)
  ☐ Audit logs for compliance
  ☐ Network isolation (model servers not publicly accessible)
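
Two of the reliability items, retry logic and a fallback model, fit in a few lines. A minimal sketch that wraps any zero-argument callable; `call_with_retry` and its defaults are illustrative, and the broad except clause should be narrowed to your client's timeout and rate-limit errors in real code:

```python
import random
import time
from typing import Callable, Optional


def call_with_retry(
    primary: Callable[[], str],
    fallback: Optional[Callable[[], str]] = None,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try primary with exponential backoff, then fall back if it keeps failing."""
    for attempt in range(max_attempts):
        try:
            return primary()
        except Exception:  # narrow this to transient errors in production
            if attempt < max_attempts - 1:
                # 0.5s, 1s, 2s, ... plus jitter so clients do not retry in lockstep
                sleep(base_delay * 2**attempt + random.uniform(0, 0.1))
    if fallback is not None:
        return fallback()
    raise RuntimeError("primary failed and no fallback was provided")


# Simulated flaky model call: fails twice, then succeeds
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(call_with_retry(flaky, sleep=lambda seconds: None))  # ok
```

In practice `primary` and `fallback` would be small lambdas around two different model calls, for example a larger model first and a cheaper one as the fallback.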

Summary

Topic	Key Takeaway
Code AI	Copilot + local models cover many routine dev tasks; code review + test gen are immediate wins
Business AI	Document intelligence + meeting summaries + report generation save hours per day
Education AI	Adaptive tutors + auto-quizzes + literature review are highest-impact education uses
Healthcare AI	AlphaFold for drug discovery; always validate clinically; HIPAA/PDPA compliance mandatory
Marketing AI	Personalized content at scale; A/B test AI copy variants; measure conversion, not just production
AI Agents	LLM + tools + a loop; agents can act autonomously but need guardrails
Cloud Deployment	API first; Bedrock/Vertex/Azure for enterprise; self-hosted vLLM for data privacy + cost

Next → Part 5: Career & Capstone Projects — build your portfolio, prepare for interviews, and chart your AI career path.

Practice Challenge

Build a minimal AI agent this week:

Give it two tools: calculator and get_current_date
Test with: "How many days until the new year? And what is 365 × 24?"
Watch the agent decide which tools to call in what order
Add a third tool: get_weather using a free API
Deploy it as a simple Flask API on any cloud free tier
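
One possible starting point for the date tool in the first step, written so it can be tested without waiting for a particular day. The function names and the optional `today` parameter are suggestions, not requirements of the challenge:

```python
from datetime import date
from typing import Optional


def get_current_date() -> dict:
    """Tool: today's date in ISO format, returned as a JSON-friendly dict."""
    return {"date": date.today().isoformat()}


def days_until_new_year(today: Optional[date] = None) -> int:
    """Whole days from today until the next 1 January."""
    today = today or date.today()
    return (date(today.year + 1, 1, 1) - today).days


print(get_current_date())
print(days_until_new_year(date(2026, 12, 31)))  # 1
```

Whether the agent calls a date tool and then the calculator, or a single combined tool, is part of what the challenge asks you to observe.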
