[Remote] Senior Databricks Engineer
Note: This is a remote position open to candidates in the USA. reputed company is a premier end-to-end digital transformation consultancy dedicated to partnering with ambitious brands to create digital solutions for today’s challenges and tomorrow’s opportunities. They are seeking a Senior Databricks Engineer to design, build, and optimize large-scale data and analytics platforms on the Databricks Lakehouse, owning the architecture and delivery of production-grade data pipelines while collaborating with analytics and data science teams.
Responsibilities
- Architect, build, and maintain scalable ETL/ELT pipelines on the Databricks Lakehouse Platform using PySpark, Spark SQL, and Delta Lake
- Design and implement medallion (bronze/silver/gold) data architectures and enforce data quality, governance, and reputed company standards
- Optimize Spark jobs and cluster configurations for performance and cost, including partitioning, caching, and autoscaling strategies
- Implement and manage Unity Catalog for access control, data governance, and cross-workspace asset sharing
- Build and orchestrate workflows using Databricks Workflows, Delta Live Tables, and CI/CD pipelines
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into reliable data products
- Establish engineering best practices, conduct code reviews, and mentor junior data engineers
- Monitor production pipelines, troubleshoot failures, and drive root-cause analysis and continuous improvement
Skills
- 5+ years of data engineering experience, with 3+ years building production solutions on Databricks and Apache Spark
- Expert proficiency in Python (PySpark) and advanced SQL
- Deep hands-on experience with Delta Lake, Unity Catalog, and the medallion architecture pattern
- Strong experience with at least one major cloud platform (AWS, Azure, or GCP) and its core data services
- Proven track record optimizing Spark performance and managing cluster cost
- Experience with data modeling, warehousing concepts, and building dimensional/analytics-ready datasets
- Proficiency with Git-based version control, CI/CD, and infrastructure-as-code
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
- Databricks certification (Data Engineer Associate/Professional)
- Experience with Delta Live Tables, structured streaming, and real-time data processing
- Familiarity with MLflow and supporting machine learning workflows in production
- Experience with orchestration tools (Airflow, dbt) and data observability platforms
- Exposure to data governance, reputed company, and compliance frameworks (e.g., GDPR, HIPAA, SOC 2)
- Hands-on experience using AI coding assistants (e.g., Claude Code, GitHub Copilot, reputed company) to accelerate development, refactoring, and code review
- Familiarity with large language model APIs and SDKs (e.g., Anthropic Claude, reputed company) and prompt engineering for data and analytics use cases
- Experience integrating GenAI capabilities into data pipelines or applications, including retrieval-augmented generation (RAG) and vector search
- Awareness of responsible AI practices, including evaluation, guardrails, and cost/latency trade-offs when deploying LLM-based solutions