
Senior Backend Engineer — Customer Support Platform

Work from home · Full-time role · Hiring

The Senior Backend Engineer owns the services that keep customers unblocked: the support, exception-handling, and remediation backend that sits behind every customer-facing interaction with the platform. When a tenant's purchase order fails validation, a document lands in the Expert-in-the-Loop queue, or a customer asks "what happened to my order?", the answer comes from the systems you build.

This is not a generic CRUD role. The platform is an agentic operating layer that ingests business documents (POs, ACKs, Invoices, Quotes), extracts and validates them, and delivers clean data downstream to OrderBahn and ERP systems. You will build the support and operations backend around that pipeline: the exception/HITL queue services, the customer-facing status and audit APIs, the reprocessing and replay tooling support engineers use to remediate stuck documents, the ticketing/CRM integrations, and the per-tenant configuration services.

Your work is measured against hard reliability and data-quality bars (≥99.9% availability, ≤0.5 P1 incidents/week, MTTR P1 ≤30 min, and ≥99.5% field-level data accuracy), so you build for resilience, observability, and graceful failure from day one.

You will work in AvantoDev's standard backend stack (NestJS/TypeScript and FastAPI/Python, PostgreSQL, AWS, SQS), integrate with the agent layer through MCP servers, and collaborate closely with the SRE team, the Context Engineering team, and the Head PM.

What You'll Build

Backend for the Expert-in-the-Loop (HITL) queue: APIs that surface low-confidence documents, capture support/expert decisions, and resume the paused agent workflow via the SQS-backed control plane.
Reprocessing & replay tooling: services that let support safely re-run a document through the pipeline (full or targeted re-extraction), with idempotency and audit guarantees.
Exception triage APIs: classification, assignment, SLA tracking, and auto-resolution hooks (targets: ≥70% auto-resolution, ≤2% exception rate).

Customer-Facing Status & Audit APIs

Document lifecycle / status APIs backed by the OpenSearch state machine (FORMAT_DETECTED → PRIMARY_EXTRACTED → DOC_CLASSIFIED → SCHEMA_MATCHED → RECOVERY_EVALUATED → routing), exposing where any document is and why.
Audit-trail APIs: full, per-tenant history of every decision, confidence score, and routing action for support investigation and customer transparency.

Integrations & Tenant Configuration

Ticketing / CRM integrations (e.g., support desk, customer comms) wired to pipeline events so issues are created, updated, and resolved automatically.
Per-tenant configuration services: schema/alias overrides, tolerance rules, routing rules, and notification preferences, exposed through governed APIs (not direct DB edits).
Delivery/handoff services between the platform and downstream systems (OrderBahn, ERP) with reconciliation and retry semantics.

MCP & Agent Integration

Build and consume MCP servers (FastAPI-based) so support tooling and agents invoke the same governed capabilities (validation, lookup, reprocessing) rather than duplicating logic.

What You'll Do

Day-to-Day

Design and implement scalable APIs in NestJS/TypeScript and/or FastAPI/Python using Domain-Driven Design (DDD), with robust validation, auth, error handling, and OpenAPI docs.
Implement event-driven workflows over SQS (Standard + FIFO) with DLQ patterns, exponential backoff, and idempotent processing.
Model and optimize PostgreSQL schemas with migrations, indexing, and strict tenant isolation / row-level security.

Reliability & Operability

Build every service to be observable by default: structured logs, metrics, and traces with X-Correlation-ID / X-Trace-ID propagation (100% coverage is an org KPI).
Implement health checks, circuit breakers, timeouts, retries, and graceful degradation so an agent or OCR failure never takes down support tooling.
Write runbooks for the services you own and participate in the on-call rotation alongside SRE.

Quality & Security

Maintain strong test coverage (pytest / Jest, integration tests, moto/localstack, SuperTest, e2e tests) and contribute to CI/CD via CodePipeline.
Enforce security bars: 0 critical/high vulns, per-tenant rate limiting, OAuth2/equivalent auth on 100% of endpoints, and ≥95% audit-log completeness toward SOC2 readiness.

Collaboration

Partner with SRE on SLOs, dashboards, and incident response; with Context Engineering on MCP/agent integration; and with support stakeholders on what support actually needs.

Minimum Qualifications

6+ years of backend engineering in production, shipping and operating real services (not just prototypes).
Strong in at least one, comfortable in both: Node.js/TypeScript (NestJS or equivalent) and Python (FastAPI). REST API design, validation, auth, and clean error handling.
Deep PostgreSQL: schema design, migrations, query optimization, indexing, and multi-tenant isolation / row-level security.
Event-driven & async patterns: message queues (SQS, Kafka, or equivalent), DLQs, retries, idempotency, and designing for partial failure.
AWS proficiency: Fargate, S3, SQS, API Gateway, RDS. You can deploy and operate what you build.
Reliability mindset: you design for SLOs, instrument for observability (structured logs/metrics/traces, correlation IDs), and have carried a pager.
Testing discipline: unit + integration + e2e testing (pytest/Jest, moto/localstack, SuperTest), and CI/CD experience.
Security awareness: authn/authz, rate limiting, input validation, secrets management, and audit logging.
English proficiency: B2+ required. You'll write docs/runbooks, join architecture reviews, and coordinate during incidents.

Nice to Have

Experience building support / operations tooling: ticketing integrations, exception queues, reprocessing/replay, admin consoles.
Familiarity with the Model Context Protocol (MCP) and exposing services as agent-callable tools.
Exposure to agentic / LLM pipelines and HITL (Human-in-the-Loop) patterns (SQS-backed pause/resume).
OpenSearch / Elasticsearch for state tracking and operational queries.
Experience with ERP / order-management integrations (OrderBahn or similar) and reconciliation.
Familiarity with DORA metrics and a high-deployment-frequency, low-change-failure delivery culture.
Background in commercial furniture, logistics, distribution, or manufacturing operations.
Terraform / IaC familiarity for owning your service infrastructure.
