HN News

Beyond Benchmark Maxxing: Measuring Open Source Models as Real-World Agents (ultravox.ai)

by zkoch | 7 hours ago | 1 point | 0 comments

No comments yet