
GRPO's Main Flaw

GRPO may not be the best choice for training reasoning models. Here’s why.

February 11, 2025 · Franz Louis Cesista

(Linear) Attention as Test-Time Regression

A unifying framework for linear attention mechanisms as test-time regression and how to parallelize training and inference.

January 27, 2025 · Franz Louis Cesista

Deep Learning Optimizers as Steepest Descent in Normed Spaces

Instead of asking, ‘Which optimizer should I use?’ ask, ‘In which space do my features live?’

October 20, 2024 · Franz Louis Cesista

ChatGPT May Have Developed Seasonal Depression

Could ChatGPT’s shorter responses be an indication of something more bizarre going on?

December 16, 2023 · Franz Louis Cesista

The Human Mind May Be Universal

Years of experience building artificial minds have led me to believe that these AIs may end up seeming more ‘human’ than we currently imagine.

December 10, 2023 · Franz Louis Cesista

Four Rules for Rulers

On how to gain and maintain power.

June 19, 2022 · Franz Louis Cesista

Vaccine Search as a Computational Problem

A thought dump on mRNA vaccines and the future of computational biology.

February 6, 2021 · Franz Louis Cesista

The Accuracy-Fairness Dilemma in Machine Learning

Machine learning models merely amplify our biases; they do not eliminate them.

October 24, 2020 · Franz Louis Cesista

How to Master Machine Learning: 3 Tips to Get Started

Whether you’re only here for the hype or genuinely interested in the field, you’re in for a wild ride.

September 12, 2020 · Franz Louis Cesista