
CLI Evaluation Setup

Use the Handit CLI to configure evaluation automatically: connect evaluation models and set up quality assessment in minutes.

Prerequisites: the Handit CLI installed and a Handit.ai account. If you haven’t installed the CLI yet, run:

terminal
npm install -g @handit.ai/cli

Quick setup

terminal
handit-cli evaluators-setup

The CLI will guide you through:

  • Connect evaluation models (OpenAI, Together AI, etc.)
  • Configure model tokens with your API keys
  • Associate evaluators to your AI components
  • Set evaluation percentages for each quality dimension

What gets configured

Evaluation Models: GPT-4 for the highest accuracy, GPT-3.5-turbo for cost-effectiveness, and Llama models for open-source evaluation.

Quality Dimensions: Completeness, accuracy, empathy, format compliance, and any custom evaluators you create.

Evaluation Coverage: Typically 10-20% of interactions for cost-effective quality monitoring.
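To get a feel for what a coverage percentage means in practice, the sketch below estimates how many interactions would be evaluated at 10%, 15%, and 20%. The daily volume of 10,000 interactions and the helper name evaluated_count are hypothetical, chosen only for illustration; this is plain arithmetic and does not reflect how Handit samples interactions internally.

```shell
#!/bin/sh
# Estimate how many interactions are evaluated at a given coverage percentage.
# Usage: evaluated_count TOTAL PERCENT   (integer arithmetic, rounds down)
evaluated_count() {
  echo $(( $1 * $2 / 100 ))
}

# Hypothetical volume: 10,000 interactions per day.
for pct in 10 15 20; do
  echo "${pct}% of 10000 interactions -> $(evaluated_count 10000 "$pct") evaluated"
done
```

Because each evaluated interaction is an additional call to an evaluation model, the evaluated count is a rough proxy for evaluation cost, which is why a small percentage is usually enough for monitoring.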

Managing your setup

# Update configuration anytime
handit-cli evaluators-setup

When to reconfigure:

  • Adding new evaluation models
  • Changing evaluation percentages
  • Connecting new evaluators
  • Updating API keys

Verify setup

✅ Check your dashboard: Go to dashboard.handit.ai. You should see:

  • Quality scores appearing for evaluated interactions
  • Evaluation trends in Agent Performance
  • Individual evaluation breakdowns

Next steps

Your autonomous engineer can now detect quality issues! Enable GitHub integration so it can create pull requests with fixes.

Troubleshooting

CLI issues: Ensure Node.js is installed and you have a valid Handit.ai account. Try running handit-cli evaluators-setup again.
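A quick way to rule out a missing prerequisite is to check that the relevant commands are on your PATH before re-running setup. The following is a minimal sketch; check_tool is a hypothetical helper, not part of the Handit CLI, and it relies only on the standard shell builtin command -v.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# Returns 0 if found, 1 if missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1 found"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the tools this guide depends on; keep going even if one is missing.
for tool in node npm handit-cli; do
  check_tool "$tool" || true
done
```

If node or npm is reported missing, install Node.js first; if only handit-cli is missing, reinstall it with npm install -g @handit.ai/cli.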

No evaluation data: Verify your AI is receiving traffic and evaluation percentages are set above 0%.

Model token issues: Check API keys are valid and have sufficient credits. The CLI will help you reconfigure if needed.
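If you suspect a key is invalid, you can test it directly against the provider before reconfiguring. The sketch below assumes an OpenAI key exported as the environment variable OPENAI_API_KEY (an assumption; the Handit CLI may store keys differently) and calls OpenAI's model-listing endpoint. The describe_status helper is hypothetical. Note that this only shows whether the key is accepted; it does not confirm that the account has sufficient credits.

```shell
#!/bin/sh
# Translate an HTTP status code from the provider into a short diagnosis.
describe_status() {
  case "$1" in
    200) echo "key accepted" ;;
    401) echo "key rejected: re-enter it via handit-cli evaluators-setup" ;;
    *)   echo "inconclusive (HTTP $1): check network and provider status" ;;
  esac
}

# Assumption: the key is exported as OPENAI_API_KEY; skip the call if it is not.
if [ -n "${OPENAI_API_KEY:-}" ]; then
  # Request the model list and capture only the HTTP status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    https://api.openai.com/v1/models) || true
  describe_status "$code"
fi
```

For other providers, the endpoint and header will differ, so consult that provider's API documentation rather than reusing this URL.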

For help, visit our Support page or join our Discord community.
