Sprint 5 AP

Action Points

AP-001: Document Clinical Case Structure

  • Owner: Alina + Team
  • What: Research and document what a clinical case contains:
    • Patient persona fields
    • Symptom structure
    • Medical history fields
    • Gold standard structure (hidden from AI)
    • Evaluation criteria
  • Output: Add to docs/case-schema.md
  • Due: April 9, 2026
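
One possible shape for the schema, sketched as Python dataclasses to make the five parts above concrete. Every class and field name here (`PatientPersona`, `severity`, `key_findings`, and so on) is a placeholder assumption, not the agreed schema; the real fields should come from the research in this action point.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List

# NOTE: all field names are illustrative assumptions, not the final schema.

@dataclass
class PatientPersona:
    name: str
    age: int
    occupation: str = ""

@dataclass
class Symptom:
    description: str    # in the patient's own words
    duration_days: int
    severity: int       # assumed 1-10 scale

@dataclass
class GoldStandard:
    diagnosis: str
    key_findings: List[str] = field(default_factory=list)

@dataclass
class ClinicalCase:
    case_id: str
    persona: PatientPersona
    symptoms: List[Symptom]
    medical_history: List[str]
    gold_standard: GoldStandard       # hidden from the AI
    evaluation_criteria: List[str]

    def patient_view(self) -> Dict:
        """Return the case as a dict with the gold standard removed."""
        data = asdict(self)
        data.pop("gold_standard")
        return data
```

Keeping the gold standard as its own nested object makes it easy to drop in one step before anything is sent to a model.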

AP-002: Create Synthetic Cases in JSON Format

  • Owner: Ilnar, Aizat
  • What: Create 2-3 synthetic cases following the documented schema
  • Format: JSON files stored in /cases/ directory
  • Source: ChatGPT-generated, then manually reviewed
  • Due: April 9, 2026
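
A minimal sketch of a check that could be run over the JSON files during manual review. The required key names are assumptions standing in for whatever docs/case-schema.md ends up defining, and `validate_case` and `load_cases` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Dict, List

# Assumed top-level keys; replace with the keys from the documented schema.
REQUIRED_KEYS = {
    "case_id", "persona", "symptoms",
    "medical_history", "gold_standard", "evaluation_criteria",
}

def validate_case(case: Dict) -> List[str]:
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - case.keys())]
    if "symptoms" in case and not case["symptoms"]:
        problems.append("case has no symptoms")
    return problems

def load_cases(directory: Path) -> Dict[str, List[str]]:
    """Validate every *.json file in a directory; map file name to its problems."""
    report = {}
    for path in sorted(directory.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            report[path.name] = validate_case(json.load(handle))
    return report
```

Calling `load_cases(Path("cases"))` would then give a per-file list of problems to go through while reviewing the generated cases.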

AP-003: Build LLM Evaluation Harness

  • Owner: Ilnar, Aizat
  • What: Create a script/tool that:
    • Loads a case from JSON
    • Sends patient context to LLM (no diagnosis!)
    • Asks predefined questions
    • Records answers for analysis
  • Goal: Test LLM behavior without spending money (use the DeepSeek free tier or a local model)
  • Due: April 16, 2026
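
The four steps above could be sketched roughly as follows. This is a minimal outline under stated assumptions: the `gold_standard` key name, the prompt wording, and the helper names are all hypothetical, and the model is passed in as a plain function so that a DeepSeek client or a local model can be plugged in later. `stub_model` is only a stand-in that makes the sketch runnable offline.

```python
import json
from typing import Callable, Dict, List

# Assumed key name for the hidden part of a case; adjust to the real schema.
HIDDEN_KEYS = {"gold_standard"}

def patient_context(case: Dict) -> Dict:
    """Copy of the case without hidden keys, so no diagnosis reaches the LLM."""
    return {key: value for key, value in case.items() if key not in HIDDEN_KEYS}

def build_prompt(case: Dict, question: str) -> str:
    """Combine the patient context and one predefined question into a prompt."""
    context = json.dumps(patient_context(case), ensure_ascii=False, indent=2)
    return (
        "You are role-playing a patient. You know how you feel, "
        "not what condition you have.\n"
        f"Patient context:\n{context}\n"
        f"Question: {question}"
    )

def run_harness(case: Dict, questions: List[str],
                ask: Callable[[str], str]) -> List[Dict]:
    """Ask each question and record the answer for later analysis."""
    records = []
    for question in questions:
        answer = ask(build_prompt(case, question))
        records.append({
            "case_id": case.get("case_id"),
            "question": question,
            "answer": answer,
        })
    return records

def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned answer."""
    return "I have been coughing for about two weeks."
```

A run would look like `run_harness(case, ["How long have you felt unwell?"], stub_model)`, with the returned records written to a file for the analysis step.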

AP-004: Prepare Working Demo for Next Presentation

  • Owner: Timur, Ilnar
  • What: Ensure docker-compose up works and frontend shows:
    • Authentication flow
    • Case selection (with synthetic cases)
    • Basic chat interface
  • Output: Live demo during next mentor presentation
  • Due: April 9, 2026

AP-005: Add Hyperlinks to Project Materials

  • Owner: Alina
  • What: Add hyperlinks to:
    • GitHub repository
    • Specific documentation files
    • Pull requests showing completed work
    • Demo video (if recorded)
  • Why: Mentors have limited time and won’t search for materials
  • Due: Before next presentation (April 9, 2026)

AP-006: Document AI Evaluation Results

  • Owner: Aizat, Ilnar
  • What: After running evaluation harness, document:
    • Which LLM(s) were tested
    • What questions were asked
    • How accurate the responses were
    • What prompt improvements were made
  • Output: Add to docs/ai-evaluation.md
  • Due: April 16, 2026
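
As a rough illustration of how recorded answers might be turned into a summary for the evaluation document, the sketch below assumes each record has been given a `correct` flag by a human reviewer and reports accuracy as the share of answers marked correct. That definition of accuracy, the record fields, and the `summarize` helper are assumptions, not an agreed method.

```python
from typing import Dict, List

def summarize(model_name: str, records: List[Dict]) -> str:
    """Build a plain-text summary from manually reviewed harness records."""
    total = len(records)
    correct = sum(1 for record in records if record["correct"])
    accuracy = correct / total if total else 0.0
    lines = [
        f"Model tested: {model_name}",
        f"Questions asked: {total}",
        f"Answers marked correct: {correct} ({accuracy:.0%})",
        "",
    ]
    for record in records:
        mark = "OK" if record["correct"] else "WRONG"
        lines.append(f"[{mark}] {record['question']}")
    return "\n".join(lines)
```

The resulting text can be pasted into docs/ai-evaluation.md next to notes on what prompt changes were made between runs.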

AP-007: Request Medical Expert Access (If Needed)

  • Owner: Alina
  • What: Prepare specific questions for medical experts (via Mansur):
    • How are clinical cases structured in real medical education?
    • What symptoms are critical for common diseases?
    • How is diagnostic accuracy evaluated?
  • Output: List of prepared questions before requesting meeting
  • Due: April 16, 2026 (if needed)

AP-008: Re-prioritize Frontend Over Telegram Bot

  • Owner: Alina (PM)
  • What: Officially pause Telegram bot work; move all frontend tasks to higher priority
  • Update: Reflect the change on the GitHub Project board and in sprint planning
  • Due: Immediately

Key Insights from Mentor

On AI Architecture

“You cannot depend on the LLM’s interpretation of a disease. It will hallucinate. It will invent symptoms. Give the AI only what a real patient knows — how they feel, not what they have.”

On Proving Value Before Spending

“Build the evaluation harness first, test with free models, show me it works. Then we go to Denis for API keys. Don’t ask for money before proving the concept.”

On Process Documentation

“Document what you actually do, even if it’s not perfect. Then show how you plan to improve it. That’s what mentors are looking for.”

On Visibility

“Add clickable links to everything. Mentors won’t search for 5 minutes. If they can’t find it in 2-3 clicks, they assume it doesn’t exist.”


Next Milestones

| Milestone | Target Date | Status |
|---|---|---|
| Case schema documented | April 9 | 🔴 Not started |
| Synthetic cases created | April 9 | 🔴 Not started |
| Working demo ready | April 9 | 🟡 In progress |
| LLM evaluation harness | April 16 | 🔴 Not started |
| AI evaluation results | April 16 | 🔴 Not started |
| Medical expert questions prepared | April 16 | 🔴 Not started |

Summary

Critical Path Forward

1. Document case structure (AP-001)
2. Create synthetic cases (AP-002)
3. Build evaluation harness (AP-003)
4. Test with free LLM (DeepSeek)
5. Present results to mentor (AP-006)
6. Request API keys from Denis (if justified)

The Most Important Takeaway

Do not give the diagnosis to the AI. The AI simulates a patient who doesn’t know what’s wrong. The learner must figure it out through questions. The gold standard diagnosis is stored separately for evaluation only.
