DevOps Reliability Engineer

Fyusion, Inc

Job Description

Fyusion is a leading machine learning and computer vision company focused on automotive inspections and related applications. Our patented 3D format enables anyone to capture and display interactive 3D images using their smartphone, and enables significant added functionality with deep visual understanding and machine learning-driven analysis.

Founded in 2014, Fyusion is now part of the Cox Automotive family. Our team includes some of the world's top researchers and developers in light field imaging and AI, continuing to push boundaries and innovate at the highest level from our San Francisco research center.

Fyusion is seeking an awesome DevOps Reliability Engineer (the intersection of DevOps and SRE) to join our Web and Cloud Infrastructure team. We are a close-knit team that enjoys challenges and solving real-world problems. You will have a key role in solving those problems, helping to shape our core automation, data processing, and deployment practices. You will leverage deep knowledge of Amazon Web Services, as well as automated build and orchestration tools such as Terraform and Kubernetes, to develop and maintain a wide range of infrastructure components, including web stacks, database systems, security tools, and networking/cloud environment configurations.

Further, you will proactively seek out system weaknesses and find ways to fix them before they cause production issues, using monitoring data, trend analysis, and Chaos Engineering.

We understand this is a complex role, and do not expect you to be an expert in every tool we use. However, we do expect you to be motivated and open to continual self-improvement, adapting to new tools and overcoming new challenges as they come. If you are looking to be challenged, enjoy wearing multiple hats, and thrive in a fast-paced, agile environment, we think you’ll love this role!

Here's what you will be doing:

  • Actively troubleshoot any issues that arise during testing and production, catching and solving issues before launch
  • Automate work, including infrastructure needs, testing, failover solutions, failure mitigation, and much more
  • Monitor and troubleshoot highly scalable, distributed server clusters that perform various functions, from web servers to machine learning processing
  • Participate in SRE activities (chaos engineering game days, disaster recovery scenarios, etc.)
  • Manage code deployments, fixes, updates, and related processes
  • Work with a close-knit team and brainstorm on the best ways to tackle complex problems in infrastructure, security and monitoring
  • Provide technical guidance and educate team members and coworkers on monitoring and logging. (Have an interesting idea or solution? Present it!)
  • Automate any software maintenance processes that previously required a manual procedure

Here's what we are looking for:

  • Bachelor’s Degree or equivalent experience required
  • Must be fluent in English with excellent oral and written communication skills
  • 3+ years of experience in software engineering, software development, or system operations in highly available, high-traffic environments
  • Strong experience with Linux-based infrastructures, Linux/Unix administration, and AWS
  • Experience with databases such as MySQL (or other SQL-based systems), Elasticsearch, and Redis
  • Experience administering Linux servers as well as Docker-based infrastructure (such as Kubernetes or EKS) in a highly available environment
  • Experience with scripting languages such as Python and Bash
  • Experience with message broker/queue technologies like RabbitMQ
  • Experience with modern monitoring, logging, and observability tools for complex distributed systems, such as Grafana, New Relic, Splunk, Elastic Stack, Datadog, and Prometheus
  • Practical experience with infrastructure-as-code (with tools like Terraform, Chef, Ansible, etc.)
  • Good understanding of cybersecurity fundamentals and best practices
  • Experience with containerization and clustering (Dockerfiles, docker-compose, Helm, Kubernetes, etc.)
  • Stellar problem-solving and troubleshooting skills with the ability to spot issues before they become problems
  • Process-oriented with great documentation skills
  • Solid team player!

Here's what we can offer you:

Competitive compensation; health, vision, and dental benefits with premiums paid by Fyusion; an unlimited PTO plan; company holidays (including your birthday); and the chance to be part of a pioneering technology team!

We offer some amazing perks for those working from our SF HQ: commuter benefits, company-catered lunches, a fully stocked snack pantry, tons of company off-sites, and a pup-friendly workplace.

If you read this job description and saw your name all over it, apply! If you read it and think you might need some help hitting all of the points, please apply anyway! We have an entire team that is happy to help and share its knowledge with you.

Company Info.

Fyusion, Inc

Fyusion is a machine learning and computer vision company that enables anyone to capture and display interactive 360-degree 3D images using their smartphone. Our unique 3D format allows for significant additional functionality that 2D images can’t offer, including background image effects and automatic damage detection for cars, and an understanding of the human skeleton for tagging products and features in fashion e-commerce. Our investors and customers inc

  • Industry
    Automotive, Artificial Intelligence
  • No. of Employees
    100
  • Location
    San Francisco, CA, USA


