Tuning Copilot

Create your scoring system

Transform product specs, code, and user data into scoring systems that reliably tell good output from bad.


Your quality engineering assistant

Generate some metrics for a resume scorer

Here are your generated metrics

Analyzes sources

Extracts insights from usage data to determine what signals you're missing.

Brainstorms signals

Determines which guardrails are needed for any goal you have in mind.

Tunes scores

Tunes your final score based on stated requirements and user feedback.

Understand your requirements

Mine user data, PRDs, and system prompts for requirements that actually matter.

Score subjective and objective criteria

Generate natural language questions to score subjective criteria and Python code to score objective ones.
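As a rough illustration of that split, here is a minimal sketch for a hypothetical resume scorer. The question wording, the function names, and the 700-word limit are all assumptions made for this example, not output from the copilot: subjective criteria are held as plain questions for a scoring model to answer, while objective criteria are small Python checks that return 0.0 or 1.0.

```python
import re

# Hypothetical subjective criteria: natural language questions that a
# scoring model would answer. They are only listed here, not evaluated.
SUBJECTIVE_QUESTIONS = [
    "Does the resume describe the impact of each role, not just the duties?",
    "Is the writing concise and free of filler?",
]


def has_contact_email(resume: str) -> float:
    """Objective check: 1.0 if the text contains something shaped like an email."""
    return 1.0 if re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", resume) else 0.0


def within_word_limit(resume: str, max_words: int = 700) -> float:
    """Objective check: 1.0 if the resume is at most max_words long (assumed limit)."""
    return 1.0 if len(resume.split()) <= max_words else 0.0


# Hypothetical registry mapping a dimension name to its check.
OBJECTIVE_CHECKS = {
    "has_contact_email": has_contact_email,
    "within_word_limit": within_word_limit,
}


def score_objective(resume: str) -> dict[str, float]:
    """Run every objective check and return one score per dimension."""
    return {name: check(resume) for name, check in OBJECTIVE_CHECKS.items()}
```

Keeping each check to a single yes/no property makes a failing dimension easy to trace back to its cause.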

Understand your scorer

Explains scores and performance by generating tests and conducting experiments.

Tune to match user taste

Combine dimensions into a single score tuned from product requirements and user preferences.
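One simple way to picture the combination is a weighted average of per-dimension scores. The sketch below assumes that form; the function name `combine` and the example weights are hypothetical, and the way the copilot actually tunes and combines dimensions may differ.

```python
def combine(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores, each assumed to lie in [0, 1].

    A higher weight means the dimension matters more to the final score.
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical weights: product requirements say impact matters three
# times as much as brevity.
example_weights = {"impact": 3.0, "brevity": 1.0}
example_scores = {"impact": 1.0, "brevity": 0.0}
final_score = combine(example_scores, example_weights)
```

In this picture, tuning to user preferences amounts to adjusting the weights until the final score ranks examples the way users do.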

Follow best practices

Ensures your dimensions work well together by keeping them granular and independent.

Integrate instantly

Quickly get up and running by generating your scorer's integration code.

Build your scoring spec in seconds

Tell the copilot about your application by sharing PRDs, documentation, and system prompts.

Refine your scoring system by having the copilot create test cases and edit the scorer wherever a case fails.
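A refinement loop of this kind can be sketched with pairwise test cases: each case holds a good and a bad example, and the case fails when the scorer does not rank the good one higher. Everything below is an assumption made for illustration, including the deliberately naive `toy_scorer` that rewards length; it is not the copilot's test format.

```python
from typing import Callable

# A test case is (name, good_example, bad_example).
TestCase = tuple[str, str, str]


def failing_cases(scorer: Callable[[str], float], cases: list[TestCase]) -> list[str]:
    """Return the names of cases where the good example does not outscore the bad one."""
    return [name for name, good, bad in cases if scorer(good) <= scorer(bad)]


def toy_scorer(text: str) -> float:
    """Naive stand-in scorer: more words means a higher score, capped at 1.0."""
    return min(len(text.split()) / 10, 1.0)


cases = [
    ("detail",
     "Led a team of five engineers and shipped a billing service",
     "Did stuff"),
    ("padding",
     "Cut costs 30%",
     "I was responsible for many various different things over a long time"),
]

# The "padding" case exposes the flaw: the toy scorer prefers the wordy resume.
failures = failing_cases(toy_scorer, cases)
```

Each failing name points at a behavior the scorer needs to change, which is the signal used to edit it before re-running the cases.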

Integrate your scoring system into your code, or existing observability and eval tools.

Come back once you've collected more data, and customize your scoring even further.

Build your scorer

