Description:
The Senior Data Engineer - Datalake is responsible for analyzing models and for designing, creating, modifying, and supporting complex systems, processes, and operations that enable optimal business capabilities.
Key duties include creating application and system design documents and developing applications, reports, systems, and enterprise solutions.
The role involves estimating work efforts at the component/application-system and enterprise-solution levels, creating RFI/RFP requests and responses for vendor product evaluations, and designing, developing, and implementing complex business rules.
The engineer is responsible for fulfilling end-user requests, providing on-call support as required, and assisting in training less experienced individuals.
The position requires delivering one's own tasks on time and leading task delivery for natural or cross-functional workgroups.
Participating in initiatives, delivering on time, and meeting quality standards are essential, as is leading cross-functional initiatives.
Requirements:
A Bachelor’s degree or equivalent work experience is required.
Candidates must have 5+ years of experience in Data Engineering or ETL Development roles.
Strong experience with PySpark and Python for building robust data pipelines is necessary.
Experience with Iceberg, Hive, S3, and Trino is required.
Hands-on experience with the Hadoop ecosystem, relational databases, and SQL queries is essential.
Familiarity with Apache Ranger and Rancher/Kubernetes is preferred.
Experience with Talend, Redpoint, or other ETL technologies is an advantage.
Candidates should have experience with Agile software development methodologies, GitLab, CI/CD processes, and ServiceNow.
Solid programming skills in object-oriented/functional scripting, such as Python with PySpark, are required, along with experience in testing and logging for quality assurance.
Experience in distributed systems and parallel data processing using big data tools such as Spark, PySpark, Hadoop, Kafka, and Hive is required.
Proficiency in querying relational databases and strong knowledge of Linux/Unix-based systems are necessary.
Experience in building Data Processing pipelines using ETL tools like Talend or SSIS is required.
An understanding of machine learning models and algorithms is beneficial, as is proficiency in data visualization tools like Tableau or matplotlib.
AWS cloud experience with Redshift, Lambda, SageMaker, and Glue is a plus.
Experience building REST APIs and excellent data analysis, conceptual, and problem-solving skills are required.
Excellent communication skills to promote cross-team collaboration are essential.
Benefits:
The base salary for this position ranges from $100,000 to $130,000, depending on skill level, cost of living, experience, and responsibilities.
Vericast offers a generous total rewards benefits package that includes medical, dental, and vision coverage.
A 401(k) plan with company match and a generous PTO allowance are provided.
Additional benefits include life insurance, employee assistance programs, and pet insurance.
Employees can enjoy a supportive work environment with smart and friendly coworkers.