Cloud Site Reliability Engineer

Details of the offer

Work Authorization: USC, GC, GC EAD only

Roles & Responsibilities

Role: Cloud Site Reliability Engineer (SRE)

Minimum of 5 years of hands-on experience supporting Kubernetes/OpenShift/RKE/EKS container platforms. Experience with Python, Ansible, Golang, and shell scripting. Kubernetes/OpenShift/Terraform certifications are a plus. Strong experience with major services related to compute, storage, network, and security. Experience with monitoring tools such as Prometheus and Dynatrace, as well as cloud-native tools such as Azure Monitor and Log Analytics. Strong understanding of, and background working with, complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions. Advanced knowledge of Linux, DNS, DHCP, Kerberos, and Windows Authentication. Experience with CI/CD tools (Git, Jenkins) and the GitOps model. Excellent understanding of Linux and Windows operating system administration. Experience in container security and vulnerability remediation.

Systematic problem-solving approach, sense of ownership, and drive. Ability to juggle competing priorities and adapt to changes in project scope. Excellent interpersonal, organizational, and communication (written, verbal, and presentation) skills are a must. Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.

Responsible for the reliability and support of the container platform on-prem and in external clouds (Azure/AWS/Google). Monitor and troubleshoot performance, connectivity, and security issues in the container platform (OpenShift), Rancher (RKE), and Azure (AKS) environments. Perform deep dives into systemic and latent reliability issues, and handle incident management and problem management. Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues. Perform blameless RCA and partner with engineering and operations teams across the organization to roll out fixes. Onboard applications and provide troubleshooting support throughout their lifecycle on the container platform. Identify and drive opportunities to improve automation, reduce toil, and improve operational excellence. Partner with risk and compliance teams to bring visibility and implement the right controls and vulnerability remediation. Ensure resiliency during implementation, and identify and fix resiliency problems by collaborating with engineering teams. Be a key stakeholder in the design of cloud services, working with architecture, engineering, and product teams. Participate in 24x7 on-call coverage under a follow-the-sun model.


Nominal Salary: To be agreed

Source: Talent2_Ppc

