Senior AI/ML & Backend Engineer (5+ years) specializing in LLM systems, retrieval-augmented generation (RAG), and high-throughput inference. Built production AI services across fintech, vector databases, and large-scale consumer platforms. Expert in PyTorch, TensorFlow, and JAX; agent frameworks (LangGraph, AutoGen, CrewAI, OpenAI Assistants/Agents); vector databases (Qdrant, Milvus, Pinecone, Weaviate, FAISS, Chroma, pgvector, Redis-Vector); and low-latency serving with vLLM, TensorRT-LLM, NVIDIA Triton, TGI, and ONNX Runtime. Strong emphasis on reliability (Ray, KServe, BentoML, Kubernetes), observability (Langfuse, W&B, Arize), and evaluation (Ragas).
No skills.
No languages.
No employment history.
No education history.