I am Ambuje, an AI Scientist with 5+ years of experience. If I had to name my superpower, it would be reading and implementing research papers, which has consistently helped me build SOTA AI solutions.
Some of my recent work:
- Fine-tuning LLMs (up to 70B parameters) with SFT and RL
- When to use LoRA, Tiny-LoRA, and RS-LoRA (for domain-specific tuning, we should use RS-LoRA; understanding the mathematics makes the difference clear: LoRA scales the low-rank update by alpha/r, while RS-LoRA scales it by alpha/sqrt(r))
- Agent memory: how to manage state efficiently
- When to use a single-agent vs. a multi-agent architecture (backed by research, not by gut :) )
- Solved LLM latency by caching the planning stage in agents
- Built voice agents (TTFT was 1.2 sec for a 70B model, best in the business, and under 100 ms for an 8B model)
- Built a knowledge graph for agent memory as well as RAG
- Continuous pre-training of encoder-based models for domain fine-tuning, including model fusion (trained with a contrastive loss) to avoid catastrophic forgetting
- Implemented simple prompt-engineering techniques to make LLMs safe against prompt injection attacks
- Agent evaluation: beyond academic metrics (precision, recall, F1), developed a product KPI for the business based on goal completion and progress rate, using a self-generated workflow graph
- Built a data generation pipeline for training LLMs (workflow-graph-oriented)
- Agent orchestration: built a framework from scratch in which you can configure your agents with any architecture (single agent, swarm agents, supervisor-worker agents, or constellation architecture)
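The LoRA vs. RS-LoRA point in the list above comes down to one scaling factor. Here is a minimal sketch, assuming the standard formulations in which the low-rank update is multiplied by alpha/r in LoRA and by alpha/sqrt(r) in RS-LoRA. The function names and the value alpha = 16 are illustrative assumptions, not taken from any specific project.

```python
import math


def lora_scale(alpha: float, r: int) -> float:
    """Standard LoRA: the update B @ A is multiplied by alpha / r."""
    return alpha / r


def rslora_scale(alpha: float, r: int) -> float:
    """RS-LoRA: the update B @ A is multiplied by alpha / sqrt(r)."""
    return alpha / math.sqrt(r)


if __name__ == "__main__":
    alpha = 16.0  # illustrative value, an assumption
    for r in (8, 64, 256):
        # As the rank r grows, the LoRA factor shrinks much faster than
        # the RS-LoRA factor; the ratio between the two is sqrt(r).
        print(f"r={r:>3}  LoRA={lora_scale(alpha, r):.4f}  "
              f"RS-LoRA={rslora_scale(alpha, r):.4f}")
```

With plain LoRA the factor shrinks in proportion to 1/r, so the update is damped more and more as the rank increases; the sqrt(r) version shrinks more slowly, which is the whole difference between the two.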
Research
- Learning RL from scratch
- Know how to build an SLM from scratch (MoE architecture)