It's an exciting time to be at Infoblox. Named a Top 25 Cyber Security Company by The Software Report and one of Inc. magazine's Best Workplaces for 2020, Infoblox is the leader in cloud-first networking and security services. Our solutions empower organizations to take full advantage of the cloud to deliver network experiences that are inherently simple, scalable, and reliable for everyone. Infoblox customers are among the largest enterprises in the world and include 70% of the Fortune 500, and our success depends on bright, energetic, talented people who share a passion for building the next generation of networking technologies—and having fun along the way.
We are looking for a CloudOps Site Reliability Engineer to join our Incident Management Engineering team located in Tacoma, WA, or remote, reporting to the manager of Cloud Operations. In this role, you will be part of the Incident Management team responsible for the monitoring and support of Infoblox cloud-based services. You will monitor and maintain the infrastructure that runs our SaaS services, as well as ensure these services are running at peak performance. You will also be responsible for maintaining the services and assisting in the automation that enables Infoblox services in the cloud.
You are the ideal candidate if you are a proactive, hands-on professional who picks up new technology quickly, has excellent interpersonal skills, and is driven to find solutions while collaborating across teams.
What you'll do:
- Provide real-time monitoring, triage, and escalation of critical and major issues and incoming alarms within the environment
- Participate in incident management calls and coordinate response, triage, recovery, and reporting of incidents
- Actively engage through the service restoration and ensure senior leadership is aware of activities being carried out
- Expand and mature existing incident response processes and activities, including managing and administering the runbook
- Partner with Engineering and NOC to prepare and present RCA reports for incidents, their impact, and resolution
- Implement and utilize SRE-developed tools for incident response
- Assist in the development of resilient and self-scaling systems
- Lead complex projects focused on building and maintaining observability/monitoring for the application, monitoring key performance indicators, maintaining alerting, and continuously improving visibility
What you'll bring:
- Minimum 5 years of combined experience in DevOps, SRE, and/or incident management and monitoring tools
- Hands-on experience with cloud architecture and deploying infrastructure in a cloud environment
- Solid networking experience, such as TCP/IP, BGP routing, load balancing, and DNS
- Experience with monitoring tools, such as Grafana, Loki, PagerDuty, AWS Lambda, etc.
- Experience with Linux distributions, including CentOS, Ubuntu, and Amazon Linux
- Experience with Amazon Web Services, including EC2, VPC, ELB, S3, RDS, CloudFormation, etc.
- Experience with configuration management, such as Terraform, Chef, Puppet, Ansible, and/or Salt
- Experience with monitoring tools and CI/CD toolchain, like Git, Jenkins, or Spinnaker
- Experience with Python, Java, Golang, Kubernetes, Linux Containers, and Docker is preferred
- Bachelor's degree in computer science, information security, computer engineering or electrical engineering is required
What success looks like:
After six months, you will…
- Provide real-time monitoring, triage, and escalation of critical and major issues and incoming alarms within the environment
- Participate in incident management calls and coordinate response, triage, recovery, and reporting of incidents
After about a year, you will…
- Partner with SRE/DevOps to resolve infrastructure maintenance tasks, internal access requests and issues, and management of monitoring and CI/CD tools
- Use knowledge and experience to identify strategies that increase system reliability and performance through on-call rotation and process optimization
- Partner with Engineering stakeholders to develop runbooks and implement application monitoring and RCA action items
We've got you covered:
In the spirit of pay transparency, we are excited to share our compensation philosophy. At Infoblox, we believe in paying for performance. You can expect our employment offers to take many factors into consideration, including but not limited to the location of the role, internal equity, applicable past experience, individual skill set, education, and professional certifications. Please keep in mind that the range mentioned is the base salary range for the role. The typical base salary range for this position is $96,500 to $140,690 plus corporate bonus.
Our holistic benefits package includes coverage of your health, wealth, and wellness—as well as a great work environment, employee programs, and company culture. We offer a competitive salary and benefits package, including a 401k with company match and generous paid time off to help you balance your life. We have a strong culture and live our values every day—we believe in transparency, curiosity, respect, and above all, having fun while delighting our customers.
Speaking of a great work environment, here are just a few of the perks you may enjoy, depending on your location…
- Onsite massages, clubs, farmers market, and fitness classes
- Delicious and healthy snacks and beverages
- Electric vehicle charging stations
- Outdoor amenities, seating, and courtyard BBQ
- Dog park and pet-friendly programs
- Newly remodeled offices with state-of-the-art amenities
Why Infoblox?
We've created a culture that embraces diversity, equity, and inclusion and rewards innovation, curiosity, and creativity. We achieve remarkable results by working together in a supportive environment that focuses on continuous learning and embraces change. So, whether you're a software engineer, marketing manager, customer care pro, or product specialist, you belong here, where you will have the opportunity to grow and develop your career. Check out what it's like to be a Bloxer. We think you'll be excited to join our team.