Sprint 5 AP
Action Points
AP-001: Document Clinical Case Structure
- Owner: Alina + Team
- What: Research and document what a clinical case contains:
- Patient persona fields
- Symptom structure
- Medical history fields
- Gold standard structure (hidden from AI)
- Evaluation criteria
- Output: Add to
docs/case-schema.md - Due: April 9, 2026
AP-002: Create Synthetic Cases in JSON Format
- Owner: Ilnar, Aizat
- What: Create 2-3 synthetic cases following the documented schema
- Format: JSON files stored in
/cases/directory - Source: ChatGPT-generated, then manually reviewed
- Due: April 9, 2026
AP-003: Build LLM Evaluation Harness
- Owner: Ilnar, Aizat
- What: Create a script/tool that:
- Loads a case from JSON
- Sends patient context to LLM (no diagnosis!)
- Asks predefined questions
- Records answers for analysis
- Goal: Test LLM behavior without spending money (use DeepSeek free tier or local model)
- Due: April 16, 2026
AP-004: Prepare Working Demo for Next Presentation
- Owner: Timur, Ilnar
- What: Ensure
docker-compose upworks and frontend shows:- Authentication flow
- Case selection (with synthetic cases)
- Basic chat interface
- Output: Live demo during next mentor presentation
- Due: April 9, 2026
AP-005: Add Clickable Links to Presentation
- Owner: Alina
- What: Add hyperlinks to:
- GitHub repository
- Specific documentation files
- Pull requests showing completed work
- Demo video (if recorded)
- Why: Mentors have limited time and won’t search for materials
- Due: Before next presentation (April 9, 2026)
AP-006: Document AI Evaluation Results
- Owner: Aizat, Ilnar
- What: After running evaluation harness, document:
- Which LLM(s) were tested
- What questions were asked
- How accurate the responses were
- What prompt improvements were made
- Output: Add to
docs/ai-evaluation.md - Due: April 16, 2026
AP-007: Request Medical Expert Access (If Needed)
- Owner: Alina
- What: Prepare specific questions for medical experts (via Mansur):
- How are clinical cases structured in real medical education?
- What symptoms are critical for common diseases?
- How is diagnostic accuracy evaluated?
- Output: List of prepared questions before requesting meeting
- Due: April 16, 2026 (if needed)
AP-008: Re-prioritize Frontend Over Telegram Bot
- Owner: Alina (PM)
- What: Officially pause Telegram bot work; move all frontend tasks to higher priority
- Update: Update GitHub Project board and sprint planning
- Due: Immediately
Key Insights from Mentor
On AI Architecture
“You cannot depend on the LLM’s interpretation of a disease. It will hallucinate. It will invent symptoms. Give the AI only what a real patient knows — how they feel, not what they have.”
On Proving Value Before Spending
“Build the evaluation harness first, test with free models, show me it works. Then we go to Denis for API keys. Don’t ask for money before proving the concept.”
On Process Documentation
“Document what you actually do, even if it’s not perfect. Then show how you plan to improve it. That’s what mentors are looking for.”
On Visibility
“Add clickable links to everything. Mentors won’t search for 5 minutes. If they can’t find it in 2-3 clicks, they assume it doesn’t exist.”
Next Milestones
| Milestone | Target Date | Status |
|---|---|---|
| Case schema documented | April 9 | 🔴 Not started |
| Synthetic cases created | April 9 | 🔴 Not started |
| Working demo ready | April 9 | 🟡 In progress |
| LLM evaluation harness | April 16 | 🔴 Not started |
| AI evaluation results | April 16 | 🔴 Not started |
| Medical expert questions prepared | April 16 | 🔴 Not started |
Summary
Critical Path Forward
1. Document case structure (AP-001)
2. Create synthetic cases (AP-002)
3. Build evaluation harness (AP-003)
4. Test with free LLM (DeepSeek)
5. Present results to mentor (AP-006)
6. Request API keys from Denis (if justified)
The Most Important Takeaway
Do not give the diagnosis to the AI. The AI simulates a patient who doesn’t know what’s wrong. The learner must figure it out through questions. The gold standard diagnosis is stored separately for evaluation only.