
Model Token Setup

Connect AI providers for LLM-as-Judge evaluation. Model tokens allow Handit.ai to securely access AI providers like OpenAI and Together AI for automated quality assessment of your AI systems.

This guide covers setting up tokens for our supported providers through the platform interface.

Model tokens are stored securely and encrypted. They’re only used for evaluation requests and never shared or logged. All setup happens through the Handit.ai platform—no API integration required.

Setup Process

Get API Key

Obtain an API key from your AI provider (OpenAI or Together AI)

Configure Token in Platform

Add the token to Handit.ai through the dashboard interface

Test Connection

Verify the token works and has appropriate permissions

Use in Evaluators

Associate the token with your single-purpose evaluators

Supported Providers

🤖 OpenAI

  • GPT-4 - Highest accuracy for complex evaluations
    • Best for nuanced reasoning and complex analysis
    • Ideal for critical evaluation tasks
    • Higher cost but superior quality
  • GPT-3.5-turbo - Cost-effective for high-volume evaluation
    • Fast response times for bulk processing
    • Good balance of cost and quality
    • Suitable for routine evaluations
  • GPT-4-turbo - Balanced performance and speed
    • Improved speed over standard GPT-4
    • Good for time-sensitive evaluations
    • Moderate cost with good accuracy
  • Industry standard for evaluation tasks

🦙 Together AI

  • Llama v4 Scout - High-quality open source alternative
    • Strong reasoning capabilities
    • Cost-effective for complex tasks
    • Good for technical evaluations
  • Llama v4 Maverick - Faster, cost-effective processing
    • Optimized for speed
    • Lower cost per token
    • Good for high-volume tasks
  • CodeLlama - Specialized for technical evaluation
    • Excellent for code and technical content
    • Strong understanding of programming concepts
    • Ideal for technical documentation
  • Open source model alternatives

OpenAI Configuration

OpenAI models are the most popular choice for LLM-as-Judge evaluation due to their strong reasoning capabilities.

Get Your OpenAI API Key

1. Access OpenAI Platform

  • Sign in at platform.openai.com
  • Open the API Keys section of your account

2. Create and Copy Key

  • Give your key a descriptive name (e.g., “Handit Evaluation”)
  • Copy the key (starts with sk-)
  • Store it securely—you won’t see it again
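
Before pasting the key into Handit.ai, you can sanity-check it outside the platform. A minimal sketch, assuming Python with the requests package; the key value is a placeholder:

```python
import requests

OPENAI_API_KEY = "sk-..."  # placeholder: your real key from platform.openai.com

# Listing models is a cheap way to confirm the key authenticates and to see
# which models (gpt-4, gpt-3.5-turbo, ...) it can access.
resp = requests.get(
    "https://api.openai.com/v1/models",
    headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
    timeout=30,
)
resp.raise_for_status()  # a 401 here means the key is invalid or revoked
print(f"Key OK; {len(resp.json()['data'])} models visible")
```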

Add Token to Handit.ai

1. Navigate to Model Tokens

  • Open your Handit.ai dashboard
  • Go to Settings → Model Tokens
  • Click Add New Token

2. Configure Token

  • Select OpenAI as the provider
  • Paste your API key and give the token a descriptive name

3. Test and Save

  • Click Test Connection to verify functionality
  • Save the token configuration once verified

Model Selection Guide

| Model | Best Use Case | Description |
| --- | --- | --- |
| GPT-4 | Complex evaluations | Best for complex evaluations requiring nuanced reasoning |
| GPT-3.5-turbo | High-volume evaluation | Ideal for high-volume evaluation with good quality |
| GPT-4-turbo | Balanced performance | Balanced option with fast response times |

Together AI Configuration

Together AI provides access to open-source models like Llama, offering cost-effective alternatives to proprietary models.

Get Your Together AI API Key

1. Access Together AI Platform

  • Sign in to your Together AI account
  • Open the API Keys section in your account settings

2. Create API Key

  • Click Create new API key
  • Give it a descriptive name
  • Copy your API key securely
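
As with OpenAI, you can verify the key before adding it. Together AI exposes an OpenAI-compatible API at https://api.together.xyz/v1; a minimal sketch, again assuming Python with requests and a placeholder key:

```python
import requests

# Listing models confirms the key authenticates before you add it to Handit.ai.
resp = requests.get(
    "https://api.together.xyz/v1/models",
    headers={"Authorization": "Bearer your-together-api-key"},  # placeholder
    timeout=30,
)
resp.raise_for_status()  # a 401 here means the key is invalid or revoked
print("Together AI key OK:", resp.status_code)
```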

Add Token to Handit.ai

1. Configure Together AI Token

  • In your Handit.ai dashboard, go to Settings → Model Tokens
  • Click Add New Token
  • Select Together AI as the provider and paste your API key

2. Test and Save

  • Verify connection with Test Connection
  • Save the configuration

Model Selection Guide

  • Llama v4 Scout - High-quality reasoning for complex evaluation tasks
  • Llama v4 Maverick - Fast processing for high-volume evaluation
  • CodeLlama - Specialized for technical content assessment

Security Best Practices

✅ Token Security

  • Use dedicated API keys for evaluation only
    • Separate from production keys
    • Different keys for different evaluation types
    • Clear naming convention for easy identification
  • Set usage limits on provider dashboards
    • Daily/monthly token limits
    • Cost thresholds
    • Rate limiting
  • Rotate keys regularly (monthly/quarterly)
    • Schedule regular rotations
    • Maintain overlap period during rotation
    • Update all evaluators with new keys
  • Monitor usage through the platform
    • Track token consumption
    • Set up alerts for unusual usage
    • Regular usage reports
  • Use descriptive names for easy identification (see the naming sketch after this list)
    • Include purpose in key name
    • Add creation date
    • Specify environment (dev/prod)
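
One way to encode purpose, environment, and creation date into a key name; a minimal sketch (the scheme itself is illustrative, not a Handit.ai requirement):

```python
from datetime import date

def token_name(purpose: str, environment: str) -> str:
    """Build a descriptive API key name, e.g. 'handit-eval-prod-2025-01-15'."""
    return f"{purpose}-{environment}-{date.today().isoformat()}"

# Use the result as the key's name in your provider dashboard.
print(token_name("handit-eval", "prod"))
```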

🔒 Access Management

  • Limit team member access to sensitive tokens
    • Role-based access control
    • Minimum required permissions
    • Regular access reviews
  • Use organization/project scoping when available
    • Separate tokens per project
    • Environment-specific tokens
    • Clear ownership and responsibility
  • Keep backup tokens for critical evaluations
    • Store securely
    • Regular testing
    • Clear rotation process
  • Review token usage regularly
    • Usage patterns
    • Cost analysis
    • Performance metrics

Common Issues & Solutions

“Invalid API Key” Error

  • Verify the API key is correct and hasn’t expired
    • Check key format
    • Verify creation date
    • Confirm key hasn’t been revoked
  • Check if you’ve reached your usage limits
    • Review current usage
    • Check billing status
    • Verify rate limits
  • Ensure the key has required permissions
    • Model access permissions
    • API access level
    • Organization restrictions

“Rate Limit Exceeded”

  • Check your provider’s rate limits
    • Per-minute limits
    • Per-hour limits
    • Daily quotas
  • Consider upgrading your provider plan
    • Higher rate limits
    • Priority access
    • Dedicated capacity
  • Reduce evaluation frequency temporarily
    • Implement backoff strategy (see the sketch after this list)
    • Queue evaluations
    • Batch processing
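
If you make any direct provider calls of your own (for example, when testing a key), a minimal exponential-backoff sketch; the function name and retry budget are illustrative, assuming Python with requests:

```python
import random
import time

import requests

def get_with_backoff(url: str, headers: dict, max_retries: int = 5) -> requests.Response:
    """Retry a GET with exponential backoff when the provider returns HTTP 429."""
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code != 429:  # 429 = rate limit exceeded
            return resp
        # Wait 1s, 2s, 4s, ... plus jitter so parallel workers don't retry in lockstep.
        time.sleep(2 ** attempt + random.random())
    raise RuntimeError("rate limit still exceeded after retries")
```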

“Model Not Found”

  • Verify the model name is exactly correct
    • Check for typos
    • Confirm model availability (see the sketch after this list)
    • Verify model version
  • Check if the model is available in your region
    • Regional restrictions
    • Compliance requirements
    • Data residency
  • Ensure your API key has access to the selected model
    • Subscription level
    • Organization settings
    • Model permissions
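
To rule out typos and permission gaps in one step, list the models the key can actually see and check for the one you configured. A minimal sketch against the OpenAI endpoint; the key and model name are placeholders:

```python
import requests

OPENAI_API_KEY = "sk-..."  # placeholder
WANTED = "gpt-4"           # the model name you selected in Handit.ai

resp = requests.get(
    "https://api.openai.com/v1/models",
    headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
available = {m["id"] for m in resp.json()["data"]}
print("found" if WANTED in available else f"'{WANTED}' is not visible to this key")
```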

Using Tokens in Evaluators

Once configured, tokens are used when creating evaluators:

1. Create Evaluator

  • Go to Evaluation → Evaluation Suite
  • Click Create New Evaluator

2. Select Appropriate Token

  • Choose the token that matches your evaluation complexity
  • Consider cost vs. quality trade-offs

3. Monitor Performance

  • Track token usage through the platform
  • Optimize token assignment based on results

Next Steps

Ready to create your first evaluators?

Your model tokens are now ready! Next, create single-purpose evaluators that use these tokens to assess specific quality dimensions of your AI’s performance.
