
GRPO's Main Flaw
GRPO may not be the best choice for training reasoning models. Here’s why.
A unifying framework for linear attention mechanisms as test-time regression and how to parallelize training and inference.
Instead of asking, ‘Which optimizer should I use?’ ask, ‘In which space do my features live?’
Could ChatGPT’s shorter responses be an indication of something more bizarre going on?
Years of experience in building artificial minds led me to believe that these AIs may end up seeming more ‘human’ than we currently imagine them to be.
On how to gain and maintain power.
A thought dump on mRNA vaccines and the future of computational biology.
Machine learning models merely amplify our biases rather than eliminate them.
Whether you’re only here for the hype or genuinely interested in the field, you’re in for a wild ride.