Data Engineer

Bravissimo Resourcing Inc.

120 - 140K PHP
Full-time
N/A

Qualifications:

  • Education: BS in CS, Engineering, or related field (or equivalent experience) with strong programming fundamentals.
  • Experience: 7+ years as a Data Engineer handling data and ETL processes.
  • Azure Suite: Azure Data Factory, Synapse, Databricks, Blob Storage, and Data Lake Gen 2.
  • SQL & RDBMS: Writing efficient SQL DML queries for modern RDBMSs (SQL Server, PostgreSQL).
  • Software Engineering: Strong understanding of CI/CD, version control, and testing applied to data.
  • Big Data: Hands-on experience with Spark.
  • Core Skills: Problem-solving, attention to detail, and clear communication/collaboration.


Preferred Qualifications:

  • Soft Skills: Learning agility, technical leadership, and the ability to consult on and manage business needs.
  • Languages: Strong Python preferred; Scala, Java, or C# accepted.
  • PySpark: Building Spark applications using PySpark.
  • Storage Formats: Experience with Parquet, Delta, and Avro.
  • APIs: Efficiently querying API endpoints as data sources.
  • Azure Environment: Understanding of subscriptions, resource groups, and related cloud services.
  • Git: Strong understanding of Git workflows in development.
  • DevOps: Deploying/maintaining solutions via Azure DevOps pipelines and repositories.
  • Ansible: Understanding Ansible usage within Azure DevOps pipelines.


Job Description:

  • Pipeline Dev: Design, develop, and maintain data pipelines/ETL using Azure Data Factory, Synapse, Databricks, and Fabric.
  • Data Storage: Use Azure Data Lake Gen 2 and Blob Storage to organize pipeline outputs.
  • Collaboration: Work with data scientists, analysts, architects, and stakeholders to deliver high-quality data solutions.
  • Optimization: Tune Azure data pipelines for maximum performance, scalability, and reliability.
  • Data Quality: Ensure overall data integrity using validation techniques and frameworks.
  • Documentation: Create and maintain docs for data processes, configurations, and best practices.
  • Monitoring: Monitor pipelines, then troubleshoot and resolve issues quickly.
  • Innovation: Stay current with emerging tech to keep data solutions cutting-edge.
  • CI/CD: Manage DevOps processes for deploying and maintaining data solutions.