Cloud Infrastructure Engineer

Truera

Job Description

About TruEra

TruEra provides a full lifecycle observability platform to help enterprises analyze machine learning, improve model quality, track performance and build trust. Powered by enterprise-class Artificial Intelligence (AI) Explainability technology based on six years of research at Carnegie Mellon University, TruEra’s platform helps eliminate the black box surrounding widely used AI and ML technologies. This visibility leads to higher quality, explainable models that achieve measurable business results, address unfair bias, and ensure governance and compliance.

We are excited about the amazing team we’re building at TruEra. One of the core cultural principles at TruEra is: “Create what’s not there.” We’re building a team of creator-builders who are excited about our mission and keen to build large-scale systems and drive cutting-edge research in support of it.

We are a rapidly growing Series B company funded by Greylock, Wing, and Menlo Ventures, working with both Fortune 100 customers and startups throughout the world!

About the job

As a Cloud Infrastructure Engineer on the TruEra Infrastructure team, you will manage a scalable, highly available data platform and AI/ML infrastructure ecosystem. We're developing the platform for both public and private cloud environments, with containers as first-class citizens. Infrastructure is at the core of our platform, and we're constantly innovating to make our systems more performant, timely, cost-effective, and capable while maintaining high reliability. You'll own our core data and ML infrastructure and pipelines, customer sandboxes, production systems, and CI/CD pipeline.

What You Will Be Doing:

  • Solve customer challenges: Understand customers' installation and deployment environments, which may be on-prem, cloud, or hybrid. Understand customer infrastructure, security requirements, and identity integration, and set up customer incident management.
  • Build tailored solutions: TruEra integrates with customers' data lakes, machine learning infrastructure, access management systems, etc. This role requires actively helping customers integrate the TruEra platform with their ML or data ecosystems, including providing support and troubleshooting integration issues and beyond.
  • Provide technical clarity: Develop repeatable automation and best practices for both on-prem and off-prem deployments. Influence and participate in design discussions to create a reference architecture for each deployment model.
  • Infrastructure as Code: Create programmable infrastructure that can interact with hosts or containers (cloud or on-prem) for provisioning, deployment, and configuration management.

Prior Experience:

  • 5+ years of experience working on both public cloud (AWS, Azure, GCP) and on-prem systems (OpenShift or similar enterprise Kubernetes ecosystems).
  • Strong DevOps/Infrastructure background, with expertise across numerous technologies.
  • Expertise in working with containerized applications: Docker and Kubernetes, Container Storage Interface, container networking, etc.
  • Hands-on experience with IaC automation tools like Ansible, Terraform, Puppet, or Chef.
  • Deep understanding of security, identity, and access management for on-premise and cloud setups.
  • Understanding of logging, monitoring, and security best practices.
  • Experience with disaster recovery tiers and designing for highly available workloads.
  • Experience deploying/operating CI/CD systems like Jenkins, CircleCI, GitOps, etc.
  • Familiarity with incident management tools and processes.
  • Proficiency in Python or Java and scripting languages like Bash.

Nice-To-Haves:

  • Experience in working with data and ML systems.
  • Experience in working with enterprise IT infrastructure.
  • Previous start-up experience.

Company Info.

Truera

TruEra provides the first Model Intelligence platform to help enterprises analyze machine learning, improve model quality, and build trust. Powered by enterprise-class Artificial Intelligence (AI) Explainability technology based on six years of research at Carnegie Mellon University, TruEra’s platform helps eliminate the black box surrounding widely used AI and ML technologies.

  • Industry
    Information Technology
  • No. of Employees
    36
  • Location
    Redwood City, CA, USA

TruEra is currently hiring for Cloud Infrastructure Developer roles in Bengaluru, Karnataka, India, with an average base salary of ₹90,000 - ₹250,000 / month.