Zain ul Abideen abideenml

ML engineer adept at LLM pretraining, fine-tuning, RLHF, RAG, and agentic workflows.

llm.pth - Hackable implementations of autoregressive models (Llama, Mixtral, Gemma, DeepSeek), research papers (CoPE, YaRN, MoD, MoME, MLA), and training techniques (SFT, DPO, KTO, IPO) in PyTorch.
LightAgents - A wrapper-free agents library with RAG, function calling, JSON mode, telemetry, and multi-layer memory.
llama3.cuda - An implementation of Llama 3.1 in pure C/CUDA. It consists of SwiGLU, RoPE, CSE, RMSNorm, and GQA kernels.

Elemental Compute - Implemented a self-optimizing multimodal pipeline with RAG, agentic workflows, and open-source AI, using LLM-as-a-Judge and Mixture of Agents. Managed 30+ GPUs for multi-node inference of the entire multimodal pipeline, consisting of Llama-3.1 70B, Phi-3-medium-128k-instruct, LLaVA-NeXT-8B, and SDXL-Lightning.
John Snow Labs - Released the JSL-MedX series of 3B, 7B, 8B, and 70B LLMs for the healthcare domain. JSL-MedX models are ranked No. 1 on the Open Medical Leaderboard across all parameter sizes.
QueryLoopAi - Pre-trained a 500M-parameter SLM from scratch on a carefully curated, high-quality synthetic dataset of 15B tokens. Built the entire training and evaluation pipeline and managed training on 8xA100s. Created Kendrick, a mixture-of-experts model with 32k experts and multi-head latent attention.

View the archives (42 posts) @ zain.com.
