I love this idea. Kind of like @wirecutter for models. At least for a little while, human reviewers might be needed. LLM-as-judge might be a bit too brittle for this. If anyone is working on this, let me know. Very interested in supporting.
Building a Wirecutter for AI Model Reviews and Evaluation