[Remote] Mid-Level Data Engineer, Veterans Affairs

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. The company is seeking a Data Engineer to help design, develop, and maintain scalable data solutions. The role involves collaborating with cross-functional teams to optimize data pipelines and ensure compliance with federal standards.

Responsibilities

  • Design, build, and maintain ETL/ELT pipelines to ingest, transform, and load data from multiple sources such as APIs, relational databases, cloud storage, and streaming platforms
  • Build scalable batch and near-real-time data pipelines using Databricks and Apache Spark (PySpark / SQL)
  • Implement data transformation logic following best practices for performance, reliability, and reusability
  • Support schema evolution, data validation, deduplication, and error handling in ETL workflows
  • Build and optimize pipelines using Delta Lake and Medallion (Bronze / Silver / Gold) architecture patterns
  • Use Databricks Workflows / Jobs or similar orchestration tools to schedule and monitor pipelines
  • Optimize Spark jobs for performance and cost (partitioning, caching, file sizing, query tuning)
  • Collaborate on data governance initiatives using Unity Catalog, access controls, and reputed company where applicable
  • Work closely with data architects, analytics teams, and reputed company consumers to define data requirements
  • Troubleshoot pipeline failures and data quality issues, and implement long-term fixes
  • Produce documentation for pipelines, datasets, and operational runbooks
  • Participate in CI/CD practices using Git-based version control for notebooks and code deployments

Skills

  • 3+ years of experience as a Data Engineer or in a similar data-focused role
  • Hands-on experience with Databricks
  • Strong experience building ETL/ELT pipelines
  • Proficiency in Python and SQL
  • Experience with Apache Spark / PySpark
  • Familiarity with cloud platforms such as Azure
  • Solid understanding of data modeling, data warehousing, and analytics use cases
  • Experience with Delta Live Tables (DLT) or Databricks Auto Loader
  • Experience with orchestration tools such as Airflow
  • Familiarity with streaming data technologies (Kafka, Event Hubs, Kinesis)
  • Experience supporting analytics tools (Power BI, Tableau, Looker) connected to Databricks
  • Databricks certification (Associate or Professional)

Benefits

  • Medical, dental, and vision insurance
  • 401k matching
  • PTO
  • Certification reimbursement

Company Overview

  • We are a force of industry-leading talent dedicated to advancing cloud-native solutions that transform outcomes for customers everywhere. The company was founded in 2019 and is headquartered in Baltimore, Maryland, USA, with a workforce of 51-200 employees. Its website is https://thunderyard.com.