Senior Site Reliability Engineer

Axiom Software Solutions Limited
Eindhoven, North Brabant, NL
Posted 30+ days ago
Job description

Job Role: Senior Site Reliability Engineer (SRE)

Location: Eindhoven, Netherlands

Work type: Contract role

Role Overview:

In this role you will work in a highly technical software development environment where no two days are the same. The team has a very broad responsibility, maintaining a large number of tools as well as a large set of infrastructure. The candidate we are looking for has broad interests and enjoys spending one week as a software engineer extending our tools and the next diving deep into performance optimization of our infrastructure.

Job Responsibilities

  • Experience level: JG9

System Reliability & Uptime

  • Design and implement strategies to ensure high availability, reliability, and performance of systems and services.
  • Define and track Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.

Incident Management & Troubleshooting

  • Respond to system outages and incidents, lead post-mortem investigations, and implement preventive measures.
  • Create runbooks and automate recovery processes to reduce manual intervention.
  • Share the on-call rotation and be an escalation contact for incidents.

Infrastructure as Code (IaC)

  • Build and maintain infrastructure using tools like Terraform.
  • Ensure infrastructure is reproducible, version-controlled, and auditable.

Monitoring & Observability

  • Implement and manage monitoring tools (preferably Splunk).
  • Set up alerts and dashboards to track the health and performance of services.

Automation & Tooling

  • Automate operational tasks such as deployments, scaling, backups, and failovers.
  • Develop internal tools to support deployment pipelines and team workflows.

Collaboration with Development & Operations

  • Work closely with developers to design systems that are scalable and supportable.
  • Advocate for and implement best practices around CI/CD, testing, and release management.

Required Skillset

Programming & Scripting

  • Proficiency in languages like Python, Bash, or Ruby.
  • Ability to build tools, automate tasks, and debug production issues.

Cloud Platforms

  • Strong experience with cloud providers (GCP, Azure).
  • Knowledge of cloud-native services, networking, and security.

Linux / Unix / Windows Systems

  • Deep understanding of system internals, performance tuning, and debugging.

Containers & Orchestration

  • Experience with Docker and Kubernetes (or other orchestration platforms).

CI/CD & Automation Tools

  • Familiarity with Jenkins, GitHub Actions, Argo CD, or similar.
  • Experience setting up and managing deployment pipelines.

Monitoring & Logging

  • Knowledge of observability stacks.

Security & Compliance Awareness

  • Understanding of securing systems and managing access control, secrets, and audit logging.

Soft Skills

  • Strong communication and collaboration skills.
  • Enjoyment of coaching more junior team members.
  • Ability to work under pressure during incidents and lead blameless post-mortems.
  • Analytical mindset and proactive problem-solving approach.