This sampling of MIT CSAIL papers at ICLR shows a common need for efficient models that can reason about complex, real-world problems. More compute helps, but the ways these machines "think" also need refinement. You can find more info about these projects on our website:
@mit_csail
-

MathNet: World’s Largest Olympiad Math Problem Dataset
“MathNet” MIT, KAUST & HUMAIN have built the world's largest collection of Olympiad-level math problems. The dataset can help students prepare for competitions & revealed how AI models can improve at math problems, esp. ones w/visual reasoning: https://bit.ly/4vBr2kl
-

MIT Improves Reasoning Model Calibration Through Reinforcement Learning
“Reinforcement Learning w/Calibration Rewards” What makes top reasoning models overconfident? MIT found that in these models, RL rewards correct answers, not certainty. Training models to estimate confidence improved calibration while maintaining accuracy:
-
MIT CSAIL Advances Reliable AI Systems at ICLR Conference
This week, MIT CSAIL will join other top ML researchers at ICLR to tackle a shift in focus from more powerful AI to more reliable systems. Our papers at the conference show how to potentially make AI models stronger critical thinkers, more honest, & better at math.
-

MIT & Harvard Study AI Agents' Critical Thinking With Collaborative Battleship
“Collaborative Battleship” MIT & Harvard developed a collaborative version of Battleship to see if AI agents are as good at asking questions as answering them. They found that many LMs struggle w/critical thinking, but Monte Carlo inference strategies can help even tiny
-

MIT releases MathNet, largest IMO dataset for AI
Today, MIT & the IMO released MathNet, the world’s largest dataset of International Math Olympiad problems & solutions. MathNet is 5x larger than previous datasets & is sourced from over 40 countries across 4 decades: https://bit.ly/4u1bhBC
-

100 Tips to Maximize Claude AI Model Effectiveness
100 tips to get the most out of Claude, v/ @BoucherNicolas.
-
25 Years of Innovation: Technologies That Reshaped Our World
Things we didn't have 25 years ago:
iPhone
Facebook
YouTube
Twitter/X
Instagram
Android
Bitcoin
Tesla
Gmail
WhatsApp
Snapchat
Zoom
Amazon Prime
Airbnb
Uber
Dropbox
LinkedIn
Reddit
ChatGPT
v/ @stats_feed
-
Code Truthfulness Over Documentation Comments
"Code never lies, comments sometimes do." — Ron Jeffries
-
Team Research Published in Nature Methods by Yaron Meirovitch
The team's work is now in Nature Methods, by @YaronMeirovitch et al: