Infrastructure Engineer

Roboflow, Inc.

Job Description

Roboflow is scaling rapidly. We now manage over 100 million images for hundreds of thousands of users. Having secure and reliable cloud infrastructure to support our growth is of paramount importance.

The Roboflow product spans the entire end-to-end machine vision pipeline, so the infrastructure naturally presents a wide range of challenges: driving efficiencies in GPU batch computing, shaving milliseconds off the latency of our hosted machine learning inference APIs, and supporting hundreds of thousands of users worldwide with best-in-class site reliability and data protection.

Our infrastructure runs across AWS and GCP. Our core web app runs on Firebase (Firestore, Functions, Storage, Hosting). We rely heavily on serverless compute products where possible, but when necessary we also run clusters of GPU-powered machines on AWS Batch and in managed instance groups fed by pub-sub queues. We are increasingly using Kubernetes internally and are working on a self-hosted version of our platform.

The Role

The focus of this role is on improving, scaling, and maintaining the infrastructure that powers our core app, including our cloud architecture, databases, file storage, search cluster, microservices, and machine learning pipelines.

You'll work alongside our existing infrastructure team and take on cross-team work spanning product, operations, and customer-facing projects. You should be able to context switch across a wide range of infrastructure, security, and systems engineering work in a fast-paced startup environment.

Specific Skillset

The following would be helpful:

  • Infrastructure-as-code - Terraform and Bash scripting automation in production environments
  • Site reliability - alerting, monitoring, scaling services in AWS and GCP clouds
  • Node.js and Python programming skills; ability to work with full-stack developers on designing, developing, and operating SaaS applications
  • Experience with running and debugging a self-hosted Elastic cluster in production
  • Experience with machine learning/big data at scale (GPU, Docker and Kubernetes)
  • Awareness of security best practices and of tightening infrastructure for highly secure cloud operations; ideally with experience in an ISO 27001 or SOC 2 certification for SaaS applications
  • Experience with CI/CD automation (for example, GitHub Actions or CircleCI)

Company Info.

Roboflow, Inc.

Roboflow is a computer vision platform that simplifies the process of building and deploying computer vision models. It provides a range of tools and features that make it easy to create, manage, and optimize computer vision workflows. Roboflow also supports a range of popular deep learning frameworks, including TensorFlow, PyTorch, and Keras, so you can easily train and deploy your computer vision models on a variety of platforms.

  • Industry
    Artificial intelligence, Computer software
  • No. of Employees
    11
  • Location
    Des Moines, IA, USA


Roboflow, Inc. is currently hiring for Software Engineer, Infrastructure jobs in the United States, with an average base salary of $90,000 - $190,000 / year.
