Site Reliability Engineer - SRE

Details of the offer

About the job
Scaleway is looking for a Site Reliability Engineer to join our teams.
Reporting to a Lead SRE, you will be responsible for ensuring that we can reliably serve our products to users around the world.
We expect you to have a strong background in development and system administration.
Our systems evolve constantly, and the tools we use to observe them and keep them resilient must evolve accordingly.
Minimum qualifications
Previous experience as a developer in Go, Python or Rust
Experience in system programming with common scripting languages (Bash, Python)
Demonstrated ability to troubleshoot production system failures
A great attitude and desire to work with a team
Passion for incremental improvements to tooling and a love of all things automation
Experience with Linux systems (Ubuntu/Debian)
Experience with cloud environment architecture (bare metal, virtual machines, containers, orchestrators)
Good understanding of computer networks: TCP/IP, DNS, load-balancing, IPv6, BGP and network virtualisation
Good understanding of written and spoken English, including the ability to write technical documentation in English and to speak English when needed
Preferred qualifications
Experience with infrastructure as code and continuous deployment
Experience dealing with physical hardware automation
Experience with monitoring & logging systems
Experience administering relational databases
Knowledge of one cloud platform and its related use cases
Initiative in proposing new solutions and defending them
Team player, willing to share knowledge, opinions, and participate in regular team rituals
Good communication and coaching skills
Responsibilities
Create new tools and documentation, or optimize existing ones, to help identify, diagnose and remediate production incidents, automating as much as possible
Troubleshoot high-impact issues working with multiple engineering teams
Take on-call responsibilities, mitigate issues encountered in production and ensure the best real-time response to our customers
Ensure a high quality of service for our customers by leveraging observability and monitoring technologies
Manage the lifecycle of products in production
Help implement best practices in stability, resiliency, scalability, security and performance across our systems
Technical Stack
Python, Go, Rust
RabbitMQ
PostgreSQL
HAProxy, Nginx, REST APIs / Flask
S3 API
Sentry, Prometheus, Grafana, Elasticsearch, Fluentd, Kibana
Ansible, AWX, Foreman, Salt
GitLab, Nexus
Ubuntu, Debian, CentOS
Jira, Confluence, Slack, GSuite
Location
This position is based in our offices in Paris or Lille (France).


Nominal Salary: To be agreed

Source: Appcast_Ppc
