
Multimodal Structured Generation
Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.
A minimal implementation of Flash Attention 1 and 2 in about 350 lines of CUDA code. This is still a work in progress; the ultimate goal is to implement the different variants of Hyperbolic Attention in CUDA.
A C++ implementation of Meta’s Llama 2 generative large language model. I also optimized Karpathy’s original C implementation by parallelizing the multi-head attention layer.
Expedock Assistant is a chatbot that allows you to ask questions about your shipments and get answers in real time. It’s like having a personal assistant that knows everything about your business, shipments and industry.
Expedock’s AutoML Library – fit a model, run batch inference, and get explanations in one line of code each.
Booking demand prediction for Grab’s Southeast Asia operations. The project involves spatio-temporal forecasting, anomaly detection, and econometric modeling.
My entry for the World Finals of the Russian AI Cup 2018 - Codeball. A 3D physics-aware orchestrator of a pair of bots in a Rocket League-esque soccer game.
My entry for the World Finals of the Russian AI Cup 2017 - Codewars. A particle swarm-based AI that uses potential flows and fluid mechanics to direct units in a Command-and-Conquer-esque game.
A collection of algorithms, data structures and other useful information for competitive programming. Used and maintained by members of the Ateneo de Manila University Programming Varsity.