
v0.51.0

Multiple Metrics in Human Evaluation

We rebuilt the human evaluation workflow from scratch. You can now configure multiple evaluators and metrics and use them to score outputs.

This lets you evaluate the same output on several metrics, such as relevance or completeness. You can also define binary or numerical scores, or use string fields for comments or expected answers.
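To make the idea concrete, here is a minimal sketch of how one output could carry several typed scores at once. Every name in it (`Metric`, `Annotation`, `validate`, `score`, and the `kind` values) is hypothetical and chosen for illustration only; it is not the product's schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

# Hypothetical data model for illustration; not the product's actual schema.
ScoreValue = Union[bool, int, float, str]


@dataclass
class Metric:
    name: str
    kind: str  # assumed kinds: "binary", "numerical", or "string"

    def validate(self, value: ScoreValue) -> ScoreValue:
        """Return the value unchanged if it matches the metric's kind."""
        if self.kind == "binary":
            ok = isinstance(value, bool)
        elif self.kind == "numerical":
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        elif self.kind == "string":
            ok = isinstance(value, str)
        else:
            raise ValueError(f"unknown metric kind: {self.kind!r}")
        if not ok:
            raise TypeError(f"{self.name} expects a {self.kind} value")
        return value


@dataclass
class Annotation:
    output: str
    scores: Dict[str, ScoreValue] = field(default_factory=dict)

    def score(self, metric: Metric, value: ScoreValue) -> None:
        """Record one score for this output under the metric's name."""
        self.scores[metric.name] = metric.validate(value)


# Three metrics of different kinds applied to the same output.
relevance = Metric("relevance", "numerical")
completeness = Metric("completeness", "binary")
comment = Metric("comment", "string")

annotation = Annotation(output="Paris is the capital of France.")
annotation.score(relevance, 0.9)
annotation.score(completeness, True)
annotation.score(comment, "Correct and concise.")
print(annotation.scores)
```

Keeping each metric's kind next to its name means a mismatched value, such as a string passed to a binary metric, is rejected when the score is recorded instead of surfacing later during aggregation.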

Watch the video below and read the post for more details, or check out the docs to learn how to use the new human evaluation workflow.
