Need to test R1! Haven't tried that.
@hwchase17
-
CoT Benchmarking and Training in Modern AI Models
By
–
These were easy to benchmark. CoT would be easy as well, but it's trained into most models these days, so I don't think it would change too much.
-
Framework-Agnostic AI Optimizers vs DSPy Dependency
By
–
DSPy is more powerful, but it requires using the DSPy framework (in addition to their optimizers). I think framework-agnostic optimizers are more usable.
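A hedged sketch of what "framework-agnostic" could mean in practice: an optimizer that only touches plain strings and plain callables, so it can wrap any stack without rewriting your pipeline. The function name `optimize_prompt` and the candidate-selection strategy are illustrative assumptions, not an actual library API.

```python
# Illustrative sketch of a framework-agnostic prompt optimizer (assumed
# interface, not a real library): it takes a prompt string, candidate
# rewrites, and any scoring function you already have.
from typing import Callable

def optimize_prompt(
    prompt: str,
    candidates: list[str],
    score: Callable[[str], float],
) -> str:
    """Return the highest-scoring prompt among the original and candidates."""
    best = prompt
    best_score = score(prompt)
    for cand in candidates:
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best

# Toy usage: "score" stands in for an eval you already run; no framework needed.
def toy_score(p: str) -> float:
    # Pretend longer, more specific prompts do better on our eval set.
    return float(len(p))

result = optimize_prompt(
    "Summarize the text.",
    ["Summarize the text in 3 bullet points.", "Summarize."],
    toy_score,
)
print(result)  # the most specific candidate wins under this toy score
```

Because the interface is just strings and callables, swapping models or eval harnesses doesn't require adopting a new framework.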
-
Production-ready prompt optimization methods rollout begins
By
–
We're going to start rolling out production-ready versions of some of these prompt optimization methods. If interested in early access, fill out the form here: https://docs.google.com/forms/d/e/1FAIpQLSdK1pZihqohabtGRQ99LZ2Rdwo5dnpnzJrBFRdojVSFq7k8eg/viewform?usp=dialog
-
Claude-Sonnet Outperforms o1 for Meta-Prompting and Optimization
By
–
I think a lot of people are sleeping on using claude-sonnet for meta-prompting/prompt optimization. We found it's better than o1 (and cheaper/faster). It still struggles with complex tasks (curious to see how o3 would do), but it works quite well for simpler ones.
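One common meta-prompting pattern the post alludes to: ask a strong model to rewrite a prompt given examples of its failures. This is a minimal sketch under that assumption; `build_meta_prompt` and `improve_prompt` are hypothetical names, and the model callable is stubbed here — in practice it would be a real claude-sonnet (or o1) API call.

```python
from typing import Callable

def build_meta_prompt(current_prompt: str, failures: list[tuple[str, str, str]]) -> str:
    """Assemble a meta-prompt asking a model to improve `current_prompt`
    based on (input, expected, actual) failure cases."""
    lines = [
        "You are a prompt engineer. Improve the prompt below so the failures stop.",
        f"Current prompt:\n{current_prompt}",
        "Failure cases:",
    ]
    for inp, expected, actual in failures:
        lines.append(f"- input: {inp!r} expected: {expected!r} got: {actual!r}")
    lines.append("Reply with only the improved prompt.")
    return "\n".join(lines)

def improve_prompt(current_prompt: str, failures, llm: Callable[[str], str]) -> str:
    # The heavy lifting is done by the optimizing model (e.g. claude-sonnet).
    return llm(build_meta_prompt(current_prompt, failures))

# Stub model for demonstration; swap in a real model client here.
def stub_llm(meta_prompt: str) -> str:
    return "Extract the date in ISO 8601 (YYYY-MM-DD) format."

improved = improve_prompt(
    "Extract the date.",
    [("Jan 5, 2025", "2025-01-05", "January 5")],
    stub_llm,
)
print(improved)
```

The optimizing model and the task model need not be the same — which is exactly why the choice of meta-prompting model (claude-sonnet vs o1) matters independently of the task.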
-
LangChain Community Building Through In-Person Events
By
–
Great to see the LangChain community organizing IRL events. LangChain was started by going to a bunch of meetups and hearing what people were building – lots of value in these types of events.
-
Model struggles with date handling and processing
By
–
The model just isn't good at date stuff right now – I don't think it has to do with the UI.
-
Open Source AI Outperforms Major Labs’ Latest Releases
By
–
Kinda crazy that open-source models that came out months ago are SOTA over what major labs are releasing today.
-
Browser Use: Open Source Alternative to OpenAI’s Operator
By
–
Want an open source version of OpenAI's Operator? There's a great project called Browser Use that does similar things (and more) while being open source. It allows you to plug in any model you want. Love to see open source leading the way: https://github.com/browser-use/browser-use
-
Write LLM Evaluations Like Software Tests
By
–
write LLM evals like you write software tests (pytest/vitest/jest)
— Harrison Chase (@hwchase17) January 22, 2025
Writing software tests is standard practice. Writing evals for LLMs is equally important, but we don't see it as commonplace yet.
We hope this helps bridge the gap! https://t.co/GZozKNbOw0
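As a generic illustration of the idea (not the code the post links to), here is what a pytest-style LLM eval might look like. The model call is stubbed with a deterministic `fake_llm` so the example runs like an ordinary test; in practice you would replace it with your real model client.

```python
# Pytest-style LLM evals: each eval case is an assertion over model output,
# written exactly like a unit test. `fake_llm` is a stand-in for a real call.
def fake_llm(prompt: str) -> str:
    # Deterministic stub so the "eval" runs like an ordinary software test.
    canned = {
        "Classify sentiment: 'I love this'": "positive",
        "Classify sentiment: 'This is terrible'": "negative",
    }
    return canned.get(prompt, "unknown")

def test_positive_sentiment():
    assert fake_llm("Classify sentiment: 'I love this'") == "positive"

def test_negative_sentiment():
    assert fake_llm("Classify sentiment: 'This is terrible'") == "negative"

# Run directly; pytest would collect the test_* functions automatically.
test_positive_sentiment()
test_negative_sentiment()
print("all evals passed")
```

The point is the workflow, not the stub: evals live next to your code, run in CI, and fail loudly when a prompt or model change regresses behavior — just like pytest/vitest/jest tests do for ordinary software.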