Job Description
The Opportunity
The Platform Engineer will work closely with Backend, Solutions Engineering, Customer Success, and Product to deliver best-in-class technical infrastructure that supports internal and external product delivery.
What You'll Do
- Build self-service infrastructure for our engineers and customers to ship their code quickly and reliably.
- Manage the container-based model serving platform, both in a hosted multi-cloud environment and in customer on-premises VPCs.
- Operate underlying infrastructure (compute, networking, storage, etc.).
- Own the development experience for our customers, including CI/CD, local development environment, documentation, compliance, upgrading, and more.
- Automate everything.
- Establish monitoring and infrastructure health reporting to advance self-healing infrastructure principles.
- Be on-call for services owned.
- Work across timezones in a collaborative remote environment.
- Essential Competencies: Professionalism/Personal Accountability, Collaboration and Teamwork, Communication, Flexibility and Adaptation to Change, Service to Customers and Clients
What We are Looking For
- Master's degree with no experience, or Bachelor's degree with 5+ years of experience in DevOps and/or systems/infrastructure engineering.
- Prior experience across all phases of the SDLC, from conception to operation in production at scale.
- Hands-on experience with core AWS technologies (EC2, VPC, EBS, ALB, RDS, S3, EKS, etc.) or similar in other clouds.
- Proficiency with scripting (shell, Python) and infrastructure/application management tools (Terraform, Helm).
- Hands-on experience with CI/CD pipelines and build tools such as GitHub Actions, Argo CD/Argo Workflows, etc.
- Experience with standard Linux distributions.
- Experience with infrastructure monitoring tools such as Prometheus, Loki, Fluent Bit, Grafana, etc.
- Experience managing Kubernetes and Docker/containers.
- Familiarity with software security methods.
- Good understanding of high availability and disaster recovery design techniques and infrastructure.
- Will consider full remote on the East Coast for a strong candidate.
- Bonus points for experience working in a fast-paced startup environment; double for AI experience.
- US Security Clearance (SECRET or above)
The pay range in Washington, DC is $117,000 - $189,000. This role will be remote on the East Coast and requires US Security Clearance.
Company Info.
Fiddler AI
Fiddler is a pioneer in enterprise Model Performance Management. Data Science, MLOps, and LOB teams use Fiddler to monitor, explain, analyze, and improve their solutions and build trust into AI.
The unified environment provides a common language, centralized controls, and actionable insights to operationalize ML/AI with trust. It addresses the unique challenges of building in-house stable and secure MLOps systems at scale.