Data Engineer
Bravissimo Resourcing Inc.
120 - 140K PHP
Full-time
N/A
Qualifications:
- Education: BS in CS, Engineering, or related field (or equivalent experience) with strong programming fundamentals.
- Experience: 7+ years as a Data Engineer handling data and ETL processes.
- Azure Suite: Azure Data Factory, Synapse, Databricks, Blob Storage, and Data Lake Storage Gen2.
- SQL & RDBMS: Writing efficient SQL DML queries for modern RDBMSs (SQL Server, PostgreSQL).
- Software Engineering: Strong understanding of CI/CD, version control, and testing applied to data.
- Big Data: Hands-on experience with Spark.
- Core Skills: Problem-solving, attention to detail, and clear communication/collaboration.
Preferred Qualifications:
- Soft Skills: Learning agility, technical leadership, and consulting on and managing business needs.
- Languages: Strong Python preferred; Scala, Java, or C# accepted.
- PySpark: Building Spark applications using PySpark.
- Storage Formats: Experience with Parquet, Delta, and Avro.
- APIs: Efficiently querying API endpoints as data sources.
- Azure Environment: Understanding of subscriptions, resource groups, and related cloud services.
- Git: Strong understanding of Git workflows in development.
- DevOps: Deploying/maintaining solutions via Azure DevOps pipelines and repositories.
- Ansible: Understanding of Ansible usage within Azure DevOps pipelines.
Job Description:
- Pipeline Dev: Design, develop, and maintain data pipelines/ETL using Azure Data Factory, Synapse, Databricks, and Fabric.
- Data Storage: Utilize Azure Data Lake Storage Gen2 and Blob Storage to organize pipeline outputs.
- Collaboration: Work with data scientists, analysts, architects, and stakeholders to deliver high-quality data solutions.
- Optimization: Tune Azure data pipelines for maximum performance, scalability, and reliability.
- Data Quality: Ensure overall data integrity using validation techniques and frameworks.
- Documentation: Create and maintain docs for data processes, configurations, and best practices.
- Monitoring: Monitor pipelines, and troubleshoot and resolve issues quickly.
- Innovation: Stay current with emerging tech to keep data solutions cutting-edge.
- CI/CD: Manage DevOps processes for deploying and maintaining data solutions.