> ## Documentation Index
> Fetch the complete documentation index at: https://docs.finetunedb.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Evaluations

> Evaluations in FinetuneDB are designed to measure and improve the quality of model outputs through collaborative and automated processes.

## Framework

FinetuneDB developed a ticketing system approach that enables both human evaluators, such as domain experts, and AI systems to participate in the review process. In this framework, model outputs are treated like tickets that can be claimed, reviewed, and refined by either humans or AI, depending on the complexity and nature of the feedback required.

<img className="block dark:hidden" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/evaluation-start.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=dd1e3a0ce541983b62b422baa706e2c8" alt="Hero Light" width="5376" height="2754" data-path="images/evaluation-start.webp" />

<img className="hidden dark:block" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/evaluation-start.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=dd1e3a0ce541983b62b422baa706e2c8" alt="Hero Dark" width="5376" height="2754" data-path="images/evaluation-start.webp" />

## Create New Evaluator

Create and define an evaluator by giving it a specific name and describing its purpose clearly to establish the objectives within the evaluation process.

<img className="block dark:hidden" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/create-evals-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=d6209525d030805db590c081a3a5f70b" alt="Hero Light" width="2463" height="2091" data-path="images/create-evals-focus.webp" />

<img className="hidden dark:block" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/create-evals-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=d6209525d030805db590c081a3a5f70b" alt="Hero Dark" width="2463" height="2091" data-path="images/create-evals-focus.webp" />

## Add Logs to Evaluate

Use filters to select specific logs that align with the evaluation's focus. This ensures that only relevant data is included, making the evaluation process more efficient.

<img className="block dark:hidden" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/add-logs-to-eval-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=22c53ac078a96695644be9bdd0e0b596" alt="Hero Light" width="4272" height="2400" data-path="images/add-logs-to-eval-focus.webp" />

<img className="hidden dark:block" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/add-logs-to-eval-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=22c53ac078a96695644be9bdd0e0b596" alt="Hero Dark" width="4272" height="2400" data-path="images/add-logs-to-eval-focus.webp" />

## Write Instructions

Write concise instructions for human reviewers, detailing what to evaluate and how to assess the logs. Clear guidelines help maintain consistency and accuracy in evaluations.

<img className="block dark:hidden" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/eval-instructions-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=da6d0952053d6a58b3d92b993be245f2" alt="Hero Light" width="2070" height="1476" data-path="images/eval-instructions-focus.webp" />

<img className="hidden dark:block" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/eval-instructions-focus.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=da6d0952053d6a58b3d92b993be245f2" alt="Hero Dark" width="2070" height="1476" data-path="images/eval-instructions-focus.webp" />

## Evaluate and Improve

Reviewers evaluate the logs and adjust outputs as needed to enhance quality or relevance.

**Feedback Loop:** Feed the improved outputs back into the dataset for further training, to improve model performance in future fine-tuning cycles.

<img className="block dark:hidden" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/evaluate-outputs.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=fef0c4b6008fdf08cf319b7ba06f2787" alt="Hero Light" width="5376" height="2754" data-path="images/evaluate-outputs.webp" />

<img className="hidden dark:block" src="https://mintcdn.com/finetunedb/bemL8Zj00kYdZwsp/images/evaluate-outputs.webp?fit=max&auto=format&n=bemL8Zj00kYdZwsp&q=85&s=fef0c4b6008fdf08cf319b7ba06f2787" alt="Hero Dark" width="5376" height="2754" data-path="images/evaluate-outputs.webp" />

## LLM-as-Judge (Beta)

Use a powerful LLM to review model outputs. This AI-driven feedback mechanism assesses the appropriateness and quality of responses.
Incorporates AI suggestions directly into the feedback loop, allowing for rapid iteration and enhancement of model outputs based on predefined criteria and learned preferences.

<iframe width="560" height="315" src="https://www.youtube.com/embed/XuMT82l-lcI" title="Evaluations" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen />
