Multimodal Structured Generation
Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.
A minimal implementation of Flash Attention 1 & 2 in roughly 350 lines of CUDA code. This is still a work in progress; the ultimate goal is to implement the different variants of Hyperbolic Attention in CUDA.
A C++ implementation of Meta’s Llama2 generative large language model. I also optimized Karpathy’s original C implementation by parallelizing the multi-head attention layer.
Expedock Assistant is a chatbot that allows you to ask questions about your shipments and get answers in real time. It’s like having a personal assistant that knows everything about your business, shipments and industry.
Expedock’s AutoML Library – fit a model, run batch inference, and get explanations in one line of code each.
A collection of algorithms, data structures and other useful information for competitive programming. Used and maintained by members of the Ateneo de Manila University Programming Varsity.
My entry for the World Finals of the Russian AI Cup 2018 - Codeball. A 3D physics-aware orchestrator for a pair of bots in a Rocket League-esque soccer game.
My entry for the World Finals of the Russian AI Cup 2017 - Codewars. A particle swarm-based AI that uses potential flows and fluid mechanics to direct units in a Command-and-Conquer-esque game.
Booking demand prediction for Grab’s Southeast Asia operations. The project involves spatio-temporal forecasting, anomaly detection, and econometric modeling.