FinetuneDB developed a ticketing-system approach in which both human evaluators, such as domain experts, and AI systems participate in the review process. Model outputs are treated like tickets that can be claimed, reviewed, and refined by either humans or AI, depending on the complexity and nature of the feedback required.
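
The ticket lifecycle can be sketched as a small state machine. This is an illustrative model only, assuming an open → claimed → refined flow; the class and field names are hypothetical and not FinetuneDB's actual data model or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class TicketStatus(Enum):
    OPEN = "open"          # awaiting a reviewer
    CLAIMED = "claimed"    # a human or AI reviewer has taken it
    REFINED = "refined"    # output corrected and ready for the dataset

@dataclass
class ReviewTicket:
    """One model output awaiting review, treated like a support ticket."""
    ticket_id: int
    model_output: str
    status: TicketStatus = TicketStatus.OPEN
    reviewer: Optional[str] = None        # e.g. "human:alice" or "ai:judge"
    refined_output: Optional[str] = None

    def claim(self, reviewer: str) -> None:
        assert self.status is TicketStatus.OPEN
        self.status = TicketStatus.CLAIMED
        self.reviewer = reviewer

    def refine(self, new_output: str) -> None:
        assert self.status is TicketStatus.CLAIMED
        self.refined_output = new_output
        self.status = TicketStatus.REFINED

# Example: a human domain expert claims and refines a ticket.
t = ReviewTicket(ticket_id=1, model_output="Draft answer...")
t.claim("human:domain-expert")
t.refine("Corrected answer...")
print(t.status.value)  # -> refined
```

Because humans and AI reviewers share one lifecycle, routing a ticket to either is just a matter of who calls `claim`.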

Create New Evaluator

Create an evaluator with a descriptive name and a clear statement of purpose, so that the objectives of the evaluation are established up front.

Add Logs to Evaluate

Use filters to select specific logs that align with the evaluation’s focus. This ensures that only relevant data is included, making the evaluation process more efficient.
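Log filtering in the UI amounts to keeping only records that match every selected criterion. A minimal sketch, assuming hypothetical log fields (`model`, `tag`, `created`) that are illustrative and not FinetuneDB's schema:

```python
from datetime import datetime, timezone

# Hypothetical production logs; field names are illustrative only.
logs = [
    {"id": 1, "model": "ft-support-v2", "tag": "billing",
     "created": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "model": "ft-support-v1", "tag": "billing",
     "created": datetime(2024, 5, 2, tzinfo=timezone.utc)},
    {"id": 3, "model": "ft-support-v2", "tag": "shipping",
     "created": datetime(2024, 5, 3, tzinfo=timezone.utc)},
]

def select_logs(logs, model=None, tag=None, since=None):
    """Keep only logs that match every filter that is set."""
    out = []
    for log in logs:
        if model and log["model"] != model:
            continue
        if tag and log["tag"] != tag:
            continue
        if since and log["created"] < since:
            continue
        out.append(log)
    return out

# Only billing logs from the v2 model are added to the evaluation.
selected = select_logs(logs, model="ft-support-v2", tag="billing")
print([log["id"] for log in selected])  # -> [1]
```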

Write Instructions

Write concise instructions for human reviewers, detailing what to evaluate and how to assess the logs. Clear guidelines help maintain consistency and accuracy in evaluations.

Evaluate and Improve

Reviewers evaluate the logs and adjust outputs as needed to enhance quality or relevance.

Feedback loop: feed the improved outputs back into the dataset so that future fine-tuning cycles learn from the corrections.
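
Closing the loop means turning each reviewed log into a training example where the reviewer-improved output, not the original, is the target completion. A sketch, assuming a chat-format JSONL dataset; the record shape is an assumption, not FinetuneDB's export format:

```python
import json

# Hypothetical reviewed items: prompt, original output, and the
# reviewer-improved output that should become the training target.
reviewed = [
    {"prompt": "Summarize the refund policy.",
     "model_output": "Refunds in 30 days.",
     "improved_output": "Refunds are available within 30 days of purchase "
                        "with proof of receipt."},
]

def to_training_example(item):
    """Convert a reviewed log into a chat-format fine-tuning example.
    The improved output replaces the model's original answer."""
    return {"messages": [
        {"role": "user", "content": item["prompt"]},
        {"role": "assistant", "content": item["improved_output"]},
    ]}

# Append the corrected examples to the dataset for the next fine-tuning cycle.
with open("dataset.jsonl", "a", encoding="utf-8") as f:
    for item in reviewed:
        f.write(json.dumps(to_training_example(item)) + "\n")
```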

LLM-as-Judge (Beta)

Use a powerful LLM to review model outputs. This AI-driven feedback mechanism assesses the appropriateness and quality of responses and incorporates AI suggestions directly into the feedback loop, allowing rapid iteration and enhancement of model outputs based on predefined criteria and learned preferences.
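
In essence, an LLM-as-judge step wraps each output in a rubric prompt and parses the judge's score. A minimal sketch of the pattern; the rubric, score scale, and `call_llm` callable are assumptions for illustration, not FinetuneDB's implementation:

```python
def judge_output(prompt, output, call_llm):
    """Score a model output on a 1-5 scale using a judge LLM.
    `call_llm` is any callable that takes a prompt string and returns
    the judge model's text reply; wire it to your provider of choice."""
    rubric = (
        "Rate the ASSISTANT ANSWER for correctness and helpfulness "
        "on a 1-5 scale. Reply with the number only.\n\n"
        f"QUESTION: {prompt}\nASSISTANT ANSWER: {output}"
    )
    reply = call_llm(rubric).strip()
    score = int(reply[0])  # judge is instructed to reply with a single digit
    return {"score": score, "passed": score >= 4}

# Stub judge for demonstration; a real deployment calls an actual LLM API.
verdict = judge_output("What is 2+2?", "4", call_llm=lambda _: "5")
print(verdict)  # -> {'score': 5, 'passed': True}
```

Outputs that fail the threshold can be routed back into the ticket queue for human review, keeping the two feedback channels complementary.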