Data Engineer at Bravissimo Resourcing Inc.

Qualifications:

Education: BS in CS, Engineering, or related field (or equivalent experience) with strong programming fundamentals.
Experience: 7+ years as a Data Engineer handling data and ETL processes.
Azure Suite: Experience with Azure Data Factory, Synapse, Databricks, Blob Storage, and Data Lake Storage Gen2.
SQL & RDBMS: Writing efficient SQL DML queries against modern RDBMSs (SQL Server, PostgreSQL).
Software Engineering: Strong understanding of CI/CD, version control, and testing applied to data.
Big Data: Hands-on experience with Spark.
Core Skills: Problem-solving, attention to detail, and clear communication/collaboration.

Preferred Qualifications:

Soft Skills: Learning agility, technical leadership, and the ability to consult on and manage business needs.
Languages: Strong Python preferred; Scala, Java, or C# accepted.
PySpark: Building Spark applications using PySpark.
Storage Formats: Experience with Parquet, Delta, and Avro.
APIs: Efficiently querying API endpoints as data sources.
Azure Environment: Understanding of subscriptions, resource groups, and related cloud services.
Git: Strong understanding of Git workflows in development.
DevOps: Deploying/maintaining solutions via Azure DevOps pipelines and repositories.
Ansible: Understanding Ansible usage within Azure DevOps pipelines.

Job Description:

Pipeline Dev: Design, develop, and maintain data pipelines and ETL processes using Azure Data Factory, Synapse, Databricks, and Fabric.
Data Storage: Utilize Azure Data Lake Storage Gen2 and Blob Storage to organize pipeline outputs.
Collaboration: Work with data scientists, analysts, architects, and stakeholders to deliver high-quality data solutions.
Optimization: Tune Azure data pipelines for maximum performance, scalability, and reliability.
Data Quality: Ensure overall data integrity using validation techniques and frameworks.
Documentation: Create and maintain docs for data processes, configurations, and best practices.
Monitoring: Monitor pipelines, and troubleshoot and resolve issues quickly.
Innovation: Stay current with emerging technologies to keep data solutions cutting-edge.
CI/CD: Manage DevOps processes for deploying and maintaining data solutions.