Remote Staff Software Engineer, Data Ingestion

at BrightEdge

Description:

  • The Staff Software Engineer, Data Ingestion will be a critical individual contributor responsible for designing collection strategies and for developing and maintaining robust, scalable data pipelines.
  • This role sits at the heart of our data ecosystem, delivering software solutions that provide timely, accurate, and complete data for insights, products, and operational efficiency.
  • Key responsibilities include designing, developing, and maintaining high-performance, fault-tolerant data ingestion pipelines using Python.
  • The engineer will integrate with diverse data sources such as databases, APIs, streaming platforms, and cloud storage.
  • They will implement data transformation and cleansing logic during ingestion to ensure data quality.
  • Monitoring and troubleshooting data ingestion pipelines is essential, with a focus on identifying and resolving issues promptly.
  • Collaboration with database engineers to optimize data models for fast consumption is required.
  • The engineer will evaluate and propose new technologies or frameworks to improve ingestion efficiency and reliability.
  • Developing and implementing self-healing mechanisms for data pipelines to ensure continuity is a key task.
  • Defining and upholding SLAs and SLOs for data freshness, completeness, and availability is expected.
  • Participation in on-call rotation as needed for critical data pipeline issues is part of the role.

Requirements:

  • Candidates must have 6+ years of experience in the software development industry with a background in computer science.
  • Extensive expertise in Python is required, with a proven track record of developing robust, production-grade applications.
  • Proven experience collecting data from various sources, including REST APIs (including OAuth-secured endpoints), GraphQL, Kafka, S3, and SFTP, is necessary.
  • A strong understanding of distributed systems concepts, including designing for scale, performance optimization, and fault tolerance, is essential.
  • Experience with major cloud providers such as AWS or GCP and their data-related services (e.g., S3, EC2, Lambda, SQS, Kafka, Cloud Storage, GKE) is required.
  • A solid understanding of relational databases, including SQL, schema design, indexing, and query optimization, is necessary; experience with OLAP and big-data systems (e.g., Hadoop) is a plus.
  • Experience with monitoring tools such as Prometheus and Grafana, along with setting up effective alerts, is required.
  • Proficiency with Git for version control is necessary.
  • Experience with Docker and Kubernetes is a plus.
  • Familiarity with real-time data processing using technologies such as Kafka, Flink, and Spark Streaming is also a plus.

Benefits:

  • The position offers the opportunity to work on critical data ingestion projects that impact the entire data ecosystem.
  • Employees will have the chance to collaborate with a talented team of engineers and database experts.
  • The role provides opportunities for professional growth and the ability to evaluate and implement new technologies.
  • Participation in on-call rotations allows for hands-on experience with real-time problem-solving in critical situations.
  • The company promotes a culture of innovation and continuous improvement in data processing and ingestion strategies.