Senior Site Reliability Engineer

Thomson Reuters Corporation

Job Description

As a Senior Site Reliability Engineer, AI Platform, you will be responsible for automating and accelerating the testing, release, and deployment of applications into a runtime environment. The candidate should have a strong background in development, operations, and full-stack implementations. They should be proficient in scripting and have experience with cloud computing, configuration management, containerization, networking, continuous integration/continuous deployment (CI/CD), monitoring and logging, collaboration and communication, and problem-solving and troubleshooting. Knowledge of business process re-engineering principles and application development methodologies is also required. Experience with AWS infrastructure tools, scripting languages, and continuous integration tools is preferred. The candidate should be able to optimize applications and maintain infrastructure while ensuring stability, and should possess strong communication and collaboration skills. Experience in a research or academic environment is a plus.

About the role

  • You will work with application software developers to automate and accelerate the testing, release and deployment of applications into a runtime environment quickly and reliably.
  • You need a strong background in development, operations, and full-stack implementations.
  • Have solid experience programming in a high-level programming language such as Python, Rust, Go, C++, or another language to script installations, configurations, provisioning, and automation.
  • Provide continuous delivery solutions in a cloud environment and have experience with the core suite of tools used to manage different cloud providers.
  • As a Senior Site Reliability Engineer, you will establish and employ Continuous Integration practices and tools, such as Jenkins or other CI tools.
  • Employ industry Continuous Delivery patterns and collaboratively work with other members to achieve successful continuous delivery solutions.
  • You will be responsible for mentoring and teaching existing team members.
  • As such, the ideal candidate must have experience clearly explaining solutions to complex problems and must demonstrate the ability to lead and impart knowledge effectively to junior team members.

About yourself

  • 5+ years of experience spanning at least two IT disciplines, such as technical architecture, application development, or operations.
  • System Administration: Proficiency in Linux/Unix system administration, including tasks such as configuring and managing servers, monitoring system performance, and troubleshooting issues.
  • Scripting and Automation: Strong programming skills in languages like Python, Bash, or PowerShell to automate manual tasks, develop tools, and write scripts for system monitoring and maintenance.
  • Cloud Computing: Familiarity with cloud platforms like Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP). Understanding how to provision and manage infrastructure using cloud services is essential.
  • Configuration Management: Experience with configuration management tools like Ansible, Puppet, or Chef to automate and manage system configurations across multiple servers.
  • Containerization: Knowledge of containerization technologies such as Docker and container orchestration tools like Kubernetes. Being able to create, manage, and deploy applications using containers is increasingly important.
  • Networking: Understanding of network protocols, IP addressing, load balancers, firewalls, and network troubleshooting. Knowledge of concepts like TCP/IP, DNS, and HTTP is valuable.
  • Continuous Integration/Continuous Deployment (CI/CD): Familiarity with CI/CD tools like Jenkins, GitLab CI/CD, or Travis CI to automate the build, test, and deployment processes.
  • Monitoring and Logging: Proficiency in tools like Nagios, Prometheus, ELK stack (Elasticsearch, Logstash, Kibana), or Grafana to monitor system health, collect logs, and generate useful metrics.
  • Collaboration and Communication: Strong interpersonal and communication skills are crucial for working in cross-functional teams. SRE/DevOps engineers often need to collaborate with developers, system administrators, and other stakeholders.
  • Problem-Solving and Troubleshooting: Having a systematic approach to problem-solving and the ability to troubleshoot complex issues quickly is essential. SRE/DevOps engineers should be comfortable working under pressure and resolving incidents.
  • Knowledge of business process re-engineering principles and processes
  • Strong understanding of application development methodologies
  • Familiarity with a broad portfolio of AWS infrastructure tools (EBS, S3, EC2, Elastic IP, Route 53, VPC) and experience with cloud infrastructure management and automation technologies.
  • Scripting (shell, Python) skills for monitoring and automation.
  • Continuous integration and automation tools such as Jenkins, Ansible, AWS CodeBuild, GitHub Actions, etc.
  • Experience optimizing applications, both stand-alone and in distributed systems, to maximize performance.
  • Experience maintaining an infrastructure and ensuring stability while adding new features.
  • Ability to clearly articulate design and implementation choices.
  • Ability to use a wide variety of open-source technologies and tools.
  • Comfort with frequent, incremental code testing and deployment.
  • Possess a strong grasp of automation tools.
  • Comfort with collaboration, open communication, and reaching across functional borders.
  • Ability to thrive in global teams with peers in different time zones.
  • Fundamental understanding of ML concepts, current trends in AI, the Data Science life cycle, ML model development, and the path to production.

Company Info.

Thomson Reuters Corporation

Thomson Reuters Corporation is a Canadian multinational media conglomerate. The company was founded in Toronto, Ontario, Canada, where it is headquartered at the Bay Adelaide Centre.

Thomson Reuters Corporation is currently hiring for Senior Site Reliability Engineer roles in Bengaluru, Karnataka, India, with an average base salary of ₹90,000 - ₹250,000 / Month.
