The salary range for this position is Rs 40,00,000 to Rs 70,00,000 per year (INR 40-70 LPA).
A minimum of 5 years of experience is required.
The position is remote and based in India.
This is a full-time job.
The candidate will build and own AI-backed features end to end, from ideation to production, including layout logic, smart cropping, visual enhancement, out-painting, and GenAI workflows for background fills.
The role involves designing scalable APIs that wrap vision models such as BiRefNet, YOLOv8, Grounding DINO, SAM, CLIP, and ControlNet into batch and real-time pipelines.
The candidate will write production-grade Python code to manipulate and transform image data using NumPy, OpenCV (cv2), PIL, and PyTorch.
The role requires handling pixel-level transformations with speed and precision, including custom masks, color space conversions, geometric warps, and contour operations.
The candidate will integrate models into a production web app (AWS-based Python/Java backend) and optimize them for latency, memory, and throughput.
The role includes framing problems when specifications are vague and helping define what “good” looks like.
Collaboration with product, UX, and other engineers is essential, as the candidate will own their domain without relying on formal handoffs.
Requirements:
The candidate must have 2–3 years of hands-on experience with vision and image generation models such as YOLO, Grounding DINO, SAM, CLIP, Stable Diffusion, VITON, or TryOnGAN, including experience with inpainting and outpainting workflows using Stable Diffusion pipelines.
Strong hands-on knowledge of NumPy, OpenCV, PIL, PyTorch, and image visualization/debugging techniques is required.
The candidate should have 1–2 years of experience working with popular LLM APIs such as OpenAI, Anthropic, and Gemini, and with composing multimodal pipelines.
A solid grasp of production model integration, including model loading, GPU/CPU optimization, async inference, caching, and batch processing, is necessary.
Experience solving real-world visual problems like object detection, segmentation, composition, or enhancement is required.
The candidate must have the ability to debug and diagnose visual output errors, such as segmentation artifacts, off-center crops, and broken masks.
A deep understanding of image processing in Python, including array slicing, color formats, augmentation, geometric transforms, and contour detection, is essential.
Experience building and deploying FastAPI services and containerizing them with Docker for AWS-based infrastructure (ECS, EC2/GPU, Lambda) is required.
A customer-centric approach is necessary, with a focus on how work affects end users and product experience, not just model performance.
The candidate should be committed to high-quality deliverables, writing clean, tested code and debugging edge cases until they are truly fixed.
The ability to frame problems from scratch and work without strict handoffs is essential, as the candidate will build from a goal, not a ticket.
Benefits:
The position offers a competitive salary range of Rs 40,00,000 to Rs 70,00,000 per year (INR 40-70 LPA).
The role is fully remote, allowing for flexibility in work location.
The role offers the opportunity to work on cutting-edge AI technologies and contribute to innovative projects.
The candidate will have the chance to collaborate with a diverse team of professionals in a dynamic work environment.
There is potential for professional growth and development within the company.