Site Reliability Engineer

Tamr

Job Description

Company Description

Tamr, the leader in data products, enables customers to consolidate messy source data into clean, curated, analytics-ready datasets. Organizations benefit from Tamr, the industry’s first suite of data products that combine human curation, patented machine learning, mastering rules and enrichment with first- and third-party data to accelerate business outcomes and deliver business-changing insights. Tamr’s cloud-native and SaaS solutions enable industry leaders such as Toyota, Western Union and GSK to get ahead and stay ahead in a rapidly changing competitive environment.

We are currently looking for a Site Reliability Engineer to join our SRE team as we continue to evolve and expand the Tamr data mastering platform and tool suite. With our growing customer base and increasing demand for cloud and hybrid-cloud offerings, we are growing our SRE team to support the development and delivery of new products, deployment to cloud environments, and incorporation of third-party technologies and tools to enable product engineering.

You will play a key role in designing and delivering solutions that make Tamr's SaaS offering scalable, feature-rich, resilient, and secure, while providing guidance and mentorship to your team. You'll design and operate automation software to provision, upgrade, monitor, and heal Tamr SaaS deployments on public cloud platforms such as Google Cloud Platform, Amazon Web Services, and Microsoft Azure.

As an SRE, you will participate in a global, uninterrupted on-call rotation and help lead incident management, root cause analysis, continuous improvement activities, and the management of engineering efforts against a service-level agreement (SLA) and error budget.

As a member of the SRE team, some of the projects you will be working on are:

  • Manage Tamr SaaS in development and production hosted on public cloud platforms.
  • Respond to incidents, facilitate post-mortems, and ensure closure of follow-up action items.
  • Develop and drive real-time observability solutions that provide visibility into system health.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Participate in system design consulting, platform management, and capacity planning.
  • Balance feature development speed and reliability with well-defined service-level objectives.
  • Create and maintain self-provisioning infrastructure using tools like Ansible, Terraform, and Docker.
  • Improve robustness by automating workflows, refining processes, building CI/CD pipelines, and integrating modern toolsets.
  • Participate in a 24x7 on-call rotation.

You might be a good fit if you have 3 or more of the following:

  • 1+ years of experience in DevOps/SRE/Systems Administration, including some Linux/Unix systems administration.
  • 1+ years of experience with cloud-based provisioning, monitoring, and troubleshooting (preferably AWS or GCP).
  • 1+ years of experience with Docker and Kubernetes or OpenShift.
  • Familiarity with infrastructure automation tools like Terraform and Ansible.
  • Experience with one or more scripting languages such as Python.
  • Bachelor's degree or higher in Computer Science, or equivalent.

Technologies we use:

  • Multi-cloud (GCP/AWS/Azure)
  • Git, GitOps, Terraform, Ansible, Packer
  • Kubernetes, Helm, Istio, Docker
  • Big Data Technologies (Bigtable/HBase, Dataproc/Databricks/Spark)
  • PostgreSQL, BigQuery, Bigtable, Snowflake, Azure Synapse
  • Java, Python, Scala, Go
  • FluxCD, Jenkins, Atlantis
  • New Relic, APM, Prometheus

This position is based at our office in Cambridge, MA, USA or London, England, UK. Tamr does sponsor employees requiring a visa. Tamr provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran, in accordance with applicable federal, state, and local laws.

Company Info.

Tamr

Tamr is the enterprise data mastering company trusted by large enterprises like Blackstone, the US Air Force, Toyota, and GSK. The company’s patented software platform uses machine learning supplemented with human feedback to master and prepare data across myriad silos to deliver previously unavailable business-changing insights. With a co-founding team led by Andy Palmer (founding CEO of Vertica) and Mike Stonebraker (Turing Award winner)

  • Industry
    Database, Computer software
  • No. of Employees
    100
  • Location
    Cambridge, MA, USA


Tamr is currently hiring for Site Reliability Engineer roles in London, UK, with an average base salary of £55,000 - £80,000 per year.
