The Principal Data Engineer is the lead technical expert on the Global Data Engineering team and is responsible for designing, developing, and maintaining infrastructure to support reporting and analytics needs across the company.
The engineer will work extensively with other engineers, architects, and analysts to provide insights and drive decision-making in the Global Data department.
The principal engineer should consistently drive development best practices and have an excellent understanding of the tools used in iHerb's data platform.
Responsibilities include designing and developing pipelines that support data ingestion, curation, and provisioning of complex enterprise data for analytics and reporting.
The role involves deploying and provisioning data solutions to the required environments and designing data architecture and applications for efficient pipelines.
The engineer will manage data pipeline jobs throughout their lifecycle and assist in designing efficient data models for business intelligence and analytics.
The position requires analyzing and translating business needs into data models and interacting with cross-functional teams to gather and define requirements.
The engineer will build strong partnerships with Data Scientists, Analysts, Product Managers, and Software Engineers to understand and deliver on data needs.
The engineer is expected to continuously deepen their understanding of data and applications across the business and to lead processes that ensure site reliability for the data stack.
The engineer will optimize and tune code performance, establish naming conventions and coding best practices, and engage with technical teams on cohesive infrastructure guidelines.
Responsibilities also include partnering with IT and Legal to design secure processes, identifying data quality validations, leading pipeline code changes, and mentoring fellow engineers.
Requirements:
A minimum of 7 years of programming experience with Python is required.
At least 3 years of experience working with APIs is necessary.
Experience with Docker and/or Kubernetes is required.
Proven experience working with large datasets is essential.
Proficiency in shell scripting and building automated testing within CI/CD is required.
Experience with Agile methodologies and a DevOps approach to maintaining pipelines and databases is necessary.
Excellent knowledge of software engineering fundamentals is required.
A deep understanding of data lifecycles, data computation principles, and data stores is essential.
Proficiency with Databricks and experience building scalable data platforms is required.
Experience with building data pipelines and ETL using PySpark on semi-structured data is necessary.
Advanced working experience with SQL and a comprehensive understanding of data modeling principles are required.
Knowledge of data privacy regulations and data encryption practices is essential.
Strong problem-solving, analytical, facilitation, and communication skills are necessary.
Ability to mentor team members in best practices and processes in data platforms is required.
A Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field is preferred.
Benefits:
Employees are eligible to participate in medical, dental, vision, and basic life insurance programs.
Employees may enroll in the company’s 401(k) plan and are eligible for Time Off and Paid Sick Leave according to company policies.
Paid holidays are provided throughout the calendar year.
Hired applicants may receive Restricted Stock Units and annual bonuses based on eligibility and performance criteria.
For more information on benefits, employees can visit iHerbBenefits.com.