Remote Backend Data Developer - Scraping & Java-to-Python Conversion

Description:

  • We are looking for a Backend Data Developer with experience in data ingestion from web portals and conversion of scrapers from Java to Python.
  • The ideal candidate has skills in data extraction and processing, scraper optimization, and modernization of architectures to improve efficiency.
  • If you have experience in scraping, code migration, and data storage, this opportunity is for you!

Requirements:

  • Scraping and Data Ingestion: You must have experience developing and maintaining scrapers that extract data from web portals. Familiarity with Selenium, Puppeteer, or Playwright for dynamic scraping is required, as is skill in handling headers, user-agents, proxies, and techniques to avoid blocks and captchas. Experience consuming REST and GraphQL APIs for data extraction is also necessary (a short illustrative sketch follows this list).
  • Scraper Conversion (Java → Python): You need experience with Java (Spring Boot, Jsoup, HttpClient) and Python (Scrapy, Selenium, Playwright, FastAPI). Refactoring and optimizing legacy Java scrapers in Python is essential: you should be able to design more efficient architectures and apply asynchronous techniques and parallelization when optimizing scrapers (see the second sketch after this list).
  • Data Processing and Storage: Experience with pandas and NumPy for data manipulation and cleaning is required, as is knowledge of SQL databases (PostgreSQL, MySQL, SQL Server). Experience with MongoDB, Elasticsearch, or Redis is optional (the third sketch after this list shows a typical clean-and-store step).
  • DevOps and Deployment: You must have experience implementing scrapers in scalable environments using Docker and Kubernetes. Deployment experience in AWS or GCP is required. You should also be familiar with CI/CD configuration using GitHub Actions, GitLab CI/CD, or Jenkins.
  • (Optional) ETL and Data Pipelines: Experience building data processing flows with Apache Airflow or Cloud Composer is a plus. You should be capable of orchestrating and structuring extracted data (the last sketch after this list outlines a minimal Airflow DAG).
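
For illustration only, a minimal sketch of the dynamic-scraping work described above, using Playwright's synchronous API; the portal URL, selector, proxy address, and user-agent string are hypothetical placeholders, not values from the posting:

    from playwright.sync_api import sync_playwright

    # Launch a headless browser behind a proxy (placeholder address) and
    # set a desktop user-agent before loading a JavaScript-rendered page.
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy={"server": "http://proxy.example.com:8080"},  # hypothetical proxy
        )
        page = browser.new_page(
            user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"
        )
        page.goto("https://example.com/listings")  # placeholder portal URL
        page.wait_for_selector("div.item")  # wait until dynamic items are rendered
        titles = page.locator("div.item h2").all_inner_texts()
        browser.close()
        print(titles)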
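
Next, a toy sketch of the asynchronous, parallelized style that a blocking Java scraper (say, HttpClient calls in a loop) might be refactored into, here using aiohttp with asyncio.gather; the URLs and user-agent are placeholders:

    import asyncio

    import aiohttp

    async def fetch(session: aiohttp.ClientSession, url: str) -> str:
        # Custom headers mirror the header/user-agent handling noted above.
        async with session.get(url, headers={"User-Agent": "demo-bot/1.0"}) as resp:
            resp.raise_for_status()
            return await resp.text()

    async def main(urls: list[str]) -> list[str]:
        # One shared session; gather() runs every request concurrently.
        async with aiohttp.ClientSession() as session:
            return await asyncio.gather(*(fetch(session, u) for u in urls))

    if __name__ == "__main__":
        pages = asyncio.run(main([f"https://example.com/page/{i}" for i in range(5)]))
        print(len(pages), "pages fetched")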
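
Third, an illustrative clean-and-store step with pandas and PostgreSQL; the column names, sample values, table name, and connection string are all invented for the example:

    import pandas as pd
    from sqlalchemy import create_engine

    # Scraped rows often arrive as messy strings; clean them before loading.
    raw = pd.DataFrame(
        {"price": ["1,200", "950", None], "city": [" Madrid", "Lima ", "Bogotá"]}
    )
    clean = raw.dropna(subset=["price"]).assign(  # drop rows with no price
        price=lambda d: d["price"].str.replace(",", "").astype(float),
        city=lambda d: d["city"].str.strip(),  # trim stray whitespace
    )

    # Append the cleaned frame to a PostgreSQL table (hypothetical credentials).
    engine = create_engine("postgresql+psycopg2://user:pass@localhost/scrapes")
    clean.to_sql("listings", engine, if_exists="append", index=False)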
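
Finally, a skeleton of the kind of Airflow pipeline the optional ETL item refers to; the DAG id, schedule, and task bodies are assumptions, with stubs standing in for real logic:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Stub callables standing in for real scrape/transform/load logic.
    def scrape():
        print("scrape portal (stub)")

    def transform():
        print("clean and structure extracted data (stub)")

    def load():
        print("write results to storage (stub)")

    with DAG(
        dag_id="portal_scrape",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        t_scrape = PythonOperator(task_id="scrape", python_callable=scrape)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_scrape >> t_transform >> t_load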

Benefits:

  • The position is 100% remote, allowing for flexible work arrangements.
  • This is a contractor position, initially for 3 months with the possibility of renewal.
  • Full-time dedication is required for this role.
  • There is no English language requirement for this position.
  • The salary ranges from USD 2,300 to 2,500 monthly.