Remote Senior Software Engineer - Data Engineering
Apply now
Please let Teikametrics know you found this job on RemoteYeah.
This helps us grow 🌱.
Description:
Teikametrics is seeking a Senior Software Engineer - Data Engineering with strong computer science fundamentals and a background in data engineering, API integration, or large-scale data processing.
The role involves designing, developing, and scaling robust data pipelines to process massive amounts of structured and unstructured data.
The candidate will collaborate closely with data scientists, analysts, and product engineers to deliver high-performance, scalable solutions.
The technology stack includes Databricks, Spark (Scala), Kafka, AWS S3, and other distributed computing tools.
Responsibilities include designing and implementing highly scalable, fault-tolerant data pipelines for real-time and batch processing; developing and optimizing end-to-end Databricks Spark pipelines; and building and managing ETL processes.
The role also requires implementing data validation, governance, and quality assurance mechanisms; collaborating with data scientists and ML engineers; and improving the performance and efficiency of data workflows.
Documentation of technical designs, workflows, and best practices is also a key responsibility.
Requirements:
Candidates must have 4+ years of experience as a professional software/data engineer, with a strong background in building large-scale distributed data processing systems.
Experience with AI, machine learning, or data science concepts is required, including working on ML feature engineering and model training pipelines.
Hands-on experience with Apache Spark (Scala or Python) and Databricks is essential.
Familiarity with real-time data streaming technologies such as Kafka, Flink, Kinesis, or Dataflow is required.
Proficiency in Java, Scala, or Python for building scalable data engineering solutions is required.
A deep understanding of cloud-based architectures (AWS, GCP, or Azure) and experience with S3, Lambda, EMR, Glue, or Redshift is needed.
Candidates should have experience in writing well-designed, testable, and scalable AI/ML data pipelines.
A strong understanding of data warehousing principles and best practices for optimizing large-scale ETL workflows is required.
Experience with ML frameworks such as TensorFlow, PyTorch, or Scikit-learn is preferred.
Knowledge of SQL and NoSQL databases for structured and unstructured data storage is required.
A passion for collaborative development, continuous learning, and mentoring junior engineers is essential.
Benefits:
Every Teikametrics employee is eligible for company equity.
The position offers remote work flexibility, allowing employees to work from home or from the office, along with a remote working allowance.
Employees receive broadband reimbursement.
Group Medical Insurance is provided, covering INR 7,50,000 per annum for a family.
A crèche benefit is available for employees with children.
There is a training and development allowance to support continuous learning and professional growth.