Muhammad Ghufran
From United States 08:00 PM (GMT-05:00)
$45/hr

Active over a week ago


Member since Sep 2025

Senior Full Stack Developer

Data Engineer
Available for hire
Years of experience
6+ years
My name is Muhammad Ghufran and I am a Data Engineer with over five years of experience building and supporting modern data platforms. I specialize in designing scalable ELT pipelines using Snowflake, dbt, Kafka, and AWS, and I have a proven track record of turning raw data into analytics-ready datasets that drive business insights.

I am passionate about solving complex data challenges and creating systems that scale. I bring strong technical skills, cloud platform knowledge, and a collaborative mindset to every project. I am excited to contribute to a team where I can apply my experience and continue to grow as a data engineer.

Employment History

Data Engineer at Perficient 2024 - Now
● Designed Snowflake as the core enterprise data warehouse with clustering, materialized views, and Snowpark for high performance, scalability, and cost efficiency.
● Built and maintained end-to-end ELT pipelines using Snowflake, Snowpark (Python), PySpark, dbt, and Airbyte, following Medallion Architecture (Bronze, Silver, Gold).
● Developed Python scripts and Snowflake UDFs for reusable transformations, data cleansing, and standardizing business logic.
● Built and supported real-time pipelines with Apache Kafka and Kafka Connect for APIs, IoT, and transactional data ingestion.
● Implemented CDC in Snowflake using streams, tasks, Python UDFs, and stored procedures for near real-time synchronization.
● Automated ELT workflows with dbt, layered models, tests, documentation, and CI/CD via GitHub Actions.
Data Engineer II at Beyondsoft 2020 - 2024
● Assisted in building ETL pipelines using Snowflake, Snowpark (Python), PySpark, dbt, Airbyte, and AWS S3, following Medallion Architecture (Bronze, Silver, Gold).
● Wrote Python scripts and Snowflake UDFs to standardize business logic and support transformations.
● Supported pipeline orchestration with Apache Airflow and Azure Data Factory (ADF), monitoring jobs, troubleshooting failures, and coordinating ingestion from on-prem and cloud systems.
● Helped implement dbt tests, documentation, and automated ELT workflows to ensure high-quality, analytics-ready datasets.
● Queried and validated data in AWS S3 and Athena for reporting and ad-hoc analysis.
● Designed and delivered scalable batch and streaming pipelines using Python, PySpark, and SQL, integrating Apache Kafka and Kafka Connect for real-time ingestion from APIs, IoT, and transactional sources.
● Implemented CDC in Snowflake using streams, tasks, Python UDFs, and stored procedures for incremental loads.
● Executed large-scale transformations in PySpark, tuning performance with partitioning, caching, and adaptive query execution.
● Designed data models and optimized queries in Snowflake, leveraging Snowpark for advanced in-warehouse processing.
● Architected AWS S3 data lakehouse solutions with lifecycle management to Glacier, supporting raw, curated, and enriched layers.
● Automated pipeline build, test, and deployment using GitHub Actions and Azure DevOps.
● Partnered with data scientists and analysts to convert business requirements into production-grade datasets for reporting and ML use cases.
● Monitored pipeline performance and optimized PySpark, SQL, and Snowflake workloads to improve freshness and reduce costs.
● Strengthened data quality with validation rules, schema enforcement, and logging frameworks across the platform.

Education

Bachelor’s Degree in Computer Science at Kean University, Union 2015 - 2019