The Wikimedia Foundation's Data Platform team enables a global knowledge ecosystem through robust data capabilities that serve both internal teams and the public.
As a Data Engineer, you will shape the future of Wikimedia’s vast data ecosystem, contributing to the unification of data systems across the foundation.
You will develop scalable solutions that support the open knowledge movement.
Responsibilities include designing and building data pipelines, monitoring data quality, supporting data governance, collaborating with peers, and enhancing operational excellence.
The work environment is remote-first, with a geographically distributed team reporting to the Group Product Manager, Data Platform.
Requirements:
You must have 3+ years of data engineering experience, with exposure to on-premises systems such as Spark, Hadoop, and HDFS.
A strong understanding of engineering best practices is required, with an emphasis on writing maintainable and reliable code.
Hands-on experience in troubleshooting systems and pipelines for performance and scaling is necessary.
Desirable qualifications include exposure to architectural/system design or technical ownership, as well as experience in data governance, data lineage, and data quality initiatives.
Core technical skills include working experience with data pipeline tools like Airflow, Kafka, Spark, and Hive, proficiency in Python or Java/Scala, knowledge of SQL, and familiarity with CI/CD processes and software containerization.
Bonus skills include familiarity with technologies such as Kubernetes, Flink, Iceberg, Druid, and AI development tooling.
Benefits:
The Wikimedia Foundation offers competitive and equitable salaries; the anticipated annual pay range for this position in the U.S. is US$101,102 to US$156,045.
The organization values diversity and inclusivity, encouraging applicants from various backgrounds to apply.
Employees have access to a remote-first work environment, allowing for flexibility in work location.
The foundation provides accommodations for applicants with disabilities during the application process.