Pi Scoring Models
Architecture and Design
Signals and Scoring: An architecture inspired by Ranking
Break down a complex Scoring problem into simpler constituent signals.
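As a rough illustration of that decomposition, the sketch below models a signal tree as a small Python data structure. The `Signal` class, its fields, and the example questions are hypothetical and chosen only for illustration; they are not part of Pi's API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Signal:
    """One node in a hypothetical signal tree: a simple question plus optional sub-signals."""
    question: str
    children: List["Signal"] = field(default_factory=list)

    def leaves(self) -> Iterator["Signal"]:
        """Yield the leaf signals, i.e. the simplest questions in the tree."""
        if not self.children:
            yield self
            return
        for child in self.children:
            yield from child.leaves()


# Illustrative decomposition of one subjective metric ("helpfulness").
helpfulness = Signal(
    "Is the response helpful?",
    [
        Signal("Is the response relevant?", [
            Signal("Does the response address the user's question?"),
            Signal("Does the response stay on topic?"),
        ]),
        Signal("Is the response clear?", [
            Signal("Is the response free of jargon?"),
        ]),
    ],
)

leaf_questions = [leaf.question for leaf in helpfulness.leaves()]
```

Each leaf is a narrow question that can be scored on its own, while the inner nodes record how the answers roll up into the broader metric.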
Core Insight
Decouple two fundamental tasks in Scoring
1. Generate Signals
Turn Subjective Metric → Tree of Signals
  1. Start with an LLM for language understanding, attention, and world knowledge.
  2. Swap the Generative head with a Regression head to train for a Scoring task instead of Generation.
  3. Remove attention Masking so the model can attend to all tokens at the same time, increasing accuracy.
  4. Train for Question Answering, not for Instruction Following.
  5. Run inference in parallel on the whole tree (all signals) together to lower latency.
  6. Target small models (<1B, 1B, 8B parameters) with large context windows. Scoring is a simpler task than Generation, and the questions are designed to be simple enough to answer without reasoning.
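The steps above can be sketched in a few lines of plain Python. This is a toy illustration, not Pi's implementation: `toy_encoder` is a hypothetical stand-in for the LLM backbone, the weights are made up rather than trained, and a thread pool stands in for whatever batching a real deployment would use. The points it tries to show are that the head returns a single scalar instead of next-token logits, that pooling uses every token position, and that all questions are scored together.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

Vector = List[float]


def toy_encoder(text: str, dim: int = 4) -> List[Vector]:
    """Hypothetical stand-in for the LLM backbone: fake hidden states, one per word."""
    return [[((len(word) + i) % 5) / 5.0 for i in range(dim)] for word in text.split()]


def mean_pool(token_states: List[Vector]) -> Vector:
    """Average over ALL token positions, as an unmasked model can use the full input."""
    n = len(token_states)
    return [sum(column) / n for column in zip(*token_states)]


def regression_head(pooled: Vector, weights: Vector, bias: float) -> float:
    """Linear layer plus sigmoid: one scalar score in (0, 1), not next-token logits."""
    z = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def score_question(response: str, question: str, weights: Vector, bias: float) -> float:
    """Score one (question, response) pair, framed as question answering."""
    states = toy_encoder(question + " " + response)
    return regression_head(mean_pool(states), weights, bias)


def score_all(response: str, questions: List[str]) -> Dict[str, float]:
    """Score every question concurrently rather than one after another."""
    weights, bias = [0.5, -0.25, 0.75, 0.1], 0.0  # made-up values, not trained parameters
    with ThreadPoolExecutor() as pool:
        scores = pool.map(lambda q: score_question(response, q, weights, bias), questions)
        return dict(zip(questions, scores))
```

Calling `score_all("Paris is the capital of France.", ["Is the response factual?", "Is the response concise?"])` returns one score per question, each strictly between 0 and 1.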
2. Compute Signals
Calibrate signal combination against Ground Truth
  1. Move implicit model reasoning into explicit combinations, which gives more control, explainability, and debuggability.
  2. Pi scoring models handle subjective dimensions. A node can also be expressed in code for objective dimensions, or backed by other trained models.
  3. Generalized Additive Models (GAMs) provide explainable non-linear combinations (vs. simple weighted averages or inscrutable black boxes).
  4. GAMs & ASTs can be initialized by prompting, calibrated manually by humans and eventually trained with Rater data.
  5. Abstract Syntax Trees handle control flow for more complex decision making in scoring.
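A minimal sketch of how such a combination could look, under several assumptions: the shape functions are piecewise-linear, the link is a sigmoid, and all signal names (`is_safe`, `is_relevant`, `is_clear`, `length_ok`), knot values, and function names are hypothetical. It shows a GAM-style sum of per-signal curves, one objective node written in code, and a hard gate of the kind a control-flow node might encode.

```python
import math
from typing import Callable, Dict, List, Tuple

Knots = List[Tuple[float, float]]  # (signal value, contribution), sorted by signal value


def shape_fn(knots: Knots) -> Callable[[float], float]:
    """Piecewise-linear shape function: a per-signal curve that can be read or hand-edited."""
    def f(x: float) -> float:
        if x <= knots[0][0]:
            return knots[0][1]
        for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return knots[-1][1]
    return f


def gam_score(signals: Dict[str, float], shapes: Dict[str, Callable[[float], float]],
              bias: float) -> Tuple[float, Dict[str, float]]:
    """Return sigmoid(bias + sum of f_i(signal_i)) plus each signal's contribution."""
    contributions = {name: shapes[name](value) for name, value in signals.items()}
    z = bias + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-z)), contributions


def length_ok(response: str, limit: int = 50) -> float:
    """An objective dimension expressed in code rather than scored by a model."""
    return 1.0 if len(response.split()) <= limit else 0.0


# Hand-set curves; in practice these could be prompted, tuned by hand, or trained.
SHAPES = {
    "is_relevant": shape_fn([(0.0, -2.0), (0.5, 0.0), (1.0, 2.0)]),
    "is_clear": shape_fn([(0.0, -1.0), (1.0, 1.0)]),
    "length_ok": shape_fn([(0.0, -0.5), (1.0, 0.5)]),
}


def final_score(response: str, model_signals: Dict[str, float]) -> float:
    """Control flow: a hard safety gate runs first, then the GAM combines the rest."""
    if model_signals["is_safe"] < 0.5:
        return 0.0
    signals = {
        "is_relevant": model_signals["is_relevant"],
        "is_clear": model_signals["is_clear"],
        "length_ok": length_ok(response),
    }
    score, _ = gam_score(signals, SHAPES, bias=0.0)
    return score
```

Because `gam_score` also returns the per-signal contributions, a low final score can be traced back to the signal that caused it, which is the explainability benefit described above.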
© 2025, Pi Labs Inc.