The eval system helps you find and fix weaknesses before deploying.
The eval workflow
Common issues and fixes
Low completeness score
The skill does not cover all aspects of the task. Fix: Ask yourself “what would a new team member need to know?” Add missing steps, edge cases, and error handling instructions.
Low clarity score
Instructions are ambiguous or could be interpreted in multiple ways. Fix: Replace vague language with specific actions:

| Before | After |
|---|---|
| "Handle errors appropriately" | "Wrap database calls in try/catch and return a 500 status with the error message" |
| "Use good naming" | "Use camelCase for variables, PascalCase for components, UPPER_SNAKE for constants" |
| "Follow best practices" | Remove entirely; this adds no information |
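
A quick way to spot vague wording before running an eval is to scan the skill text for phrases like the ones in the "Before" column. The sketch below is only an illustration: the `VAGUE_PHRASES` list and the `find_vague_phrases` helper are hypothetical and are not part of the eval system.

```python
import re

# Hypothetical phrase list for illustration; extend it with wording
# that your own reviews tend to flag.
VAGUE_PHRASES = [
    "appropriately",
    "best practices",
    "good naming",
    "as needed",
    "properly",
]


def find_vague_phrases(skill_text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for each vague phrase found."""
    hits = []
    for line_number, line in enumerate(skill_text.splitlines(), start=1):
        for phrase in VAGUE_PHRASES:
            # Whole-word, case-insensitive match on each line.
            if re.search(rf"\b{re.escape(phrase)}\b", line, flags=re.IGNORECASE):
                hits.append((line_number, phrase))
    return hits


sample = """Handle errors appropriately.
Use camelCase for variables.
Follow best practices."""

for line_number, phrase in find_vague_phrases(sample):
    print(f"line {line_number}: vague phrase '{phrase}'")
```

A hit is only a prompt to rewrite the sentence as a specific action; a phrase list cannot judge whether the replacement is actually clear.
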
Low constraints score
The skill does not set clear boundaries. Fix: Add specific, measurable constraints:

- File size limits
- Naming conventions
- Forbidden patterns
- Required dependencies
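
One test of whether a constraint is measurable is whether a script could check it. The sketch below expresses three of the constraint types above as checks on a source string (naming conventions are left out for brevity). Every limit, pattern, and dependency name in it is a made-up example, not a default of the eval system.

```python
import re

# All values below are hypothetical examples chosen for illustration.
MAX_FILE_LINES = 300
FORBIDDEN_PATTERNS = {
    "console.log": re.compile(r"console\.log\("),
    "any type": re.compile(r":\s*any\b"),
}
REQUIRED_IMPORTS = ["zod"]


def check_constraints(source: str) -> list[str]:
    """Return the violated constraints; an empty list means all passed."""
    violations = []

    # File size limit, measured in lines.
    line_count = len(source.splitlines())
    if line_count > MAX_FILE_LINES:
        violations.append(f"file has {line_count} lines, limit is {MAX_FILE_LINES}")

    # Forbidden patterns.
    for name, pattern in FORBIDDEN_PATTERNS.items():
        if pattern.search(source):
            violations.append(f"forbidden pattern: {name}")

    # Required dependencies, detected as `from "<module>"` imports.
    for module in REQUIRED_IMPORTS:
        if not re.search(rf"from\s+['\"]{re.escape(module)}['\"]", source):
            violations.append(f"missing required dependency: {module}")

    return violations


source = 'import { z } from "zod";\nconst x: any = 1;\nconsole.log(x);\n'
print(check_constraints(source))
```

If a constraint cannot be written as a check like these, it is probably still too vague for the agent to follow reliably.
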
Low verification score
The agent cannot check its own work. Fix: Add a verification checklist with items that can be independently checked. Each item should be binary: it either passes or fails.
Low context score
The skill assumes knowledge the agent may not have. Fix: Add background sections explaining domain concepts, project conventions, or architectural decisions that inform the instructions.
Low structure score
The skill is poorly organized or uses inconsistent formatting. Fix: Follow the standard section order: Overview, Instructions, Example Prompts. Use consistent heading levels and list formatting.
Tracking improvements
The Evals view shows sparkline trends per skill. Use these to:

- Verify that edits actually improved scores
- Catch regressions when updating skills
- Compare quality across your skill library
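
If you also keep your own record of scores between edits, the same regression check can be done in a few lines. This is a minimal sketch that assumes a hand-maintained `score_history` mapping; the skill names, the scores, and the data shape are all invented for illustration and do not reflect how the Evals view stores or exposes its data.

```python
from typing import Optional

# Hypothetical score history: skill name -> eval scores, oldest first.
score_history = {
    "api-error-handling": [62, 70, 78, 85],
    "component-naming": [80, 82, 74],
    "db-migrations": [55],
}


def latest_change(scores: list[float]) -> Optional[float]:
    """Difference between the two most recent scores, or None if fewer than two."""
    if len(scores) < 2:
        return None
    return scores[-1] - scores[-2]


def find_regressions(history: dict[str, list[float]], tolerance: float = 0) -> list[str]:
    """Names of skills whose latest score dropped by more than `tolerance`."""
    regressed = []
    for skill, scores in history.items():
        change = latest_change(scores)
        if change is not None and change < -tolerance:
            regressed.append(skill)
    return regressed


print(find_regressions(score_history))
```

Setting `tolerance` above zero ignores small run-to-run fluctuations, so that only meaningful drops are reported.
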