Director of Site Reliability Engineering
At Power, we proudly operate and maintain cutting-edge, on-premise data centers, providing our business with unmatched control, security, and performance.
As we continue to innovate and grow, we're looking for a skilled and visionary Director of Site Reliability Engineering to lead efforts in ensuring the stability, scalability, and efficiency of these mission-critical environments.
In this role, you'll drive the strategy and execution for our infrastructure's reliability, overseeing our robust systems.
You will also play a pivotal role in shaping the future of our infrastructure while working with a talented team of engineers dedicated to delivering industry-leading performance and uptime.
What We Do Here
Our Business Technology team builds and supports a custom suite of products that power our entire business.
From marketing and sales to operations and finance, we design and deploy comprehensive solutions across the company.
We move fast like a startup—shipping rapidly, iterating quickly—while benefiting from the stability and resources of an established, well-funded, and profitable organization.
We dive into fascinating and complex challenges, each unique enough to stand alone as its own app or company.
Power is a tech powerhouse, quietly embedded within a remodeling business.
Power has been named to Computerworld's Best Places to Work in IT, Fortune Magazine's #1 Workplace for Millennials, Glassdoor's Best Places to Work, Inc. 5000's Fastest Growing Private Companies, and Philadelphia Magazine's Coolest Companies.
Position Summary
The Director of Site Reliability Engineering is the highest dedicated role within the infrastructure and networking function of Business Technology (BT) at Power, responsible for overseeing and optimizing our technology infrastructure, including data centers, servers, storage, and networking systems across the enterprise.
This individual must bring deep technical expertise, especially in managing and operating data centers, along with experience in networking and high-performance computing environments.
The role is accountable for the continuous availability, security, and performance of mission-critical applications through top-notch infrastructure management.
This technical leadership position will drive the transformation of infrastructure services, architecture, and workforce, ensuring alignment with the enterprise-wide technology strategy.
The Director of Site Reliability Engineering will also manage IT operations, proactively address infrastructure needs, and lead cross-functional initiatives to integrate emerging technologies for enhanced scalability and efficiency.
A key responsibility is managing and optimizing Power's data center operations, ensuring proper capacity planning, energy efficiency, and operational continuity.
Primary Responsibilities
Data Center Management: Oversee the day-to-day operations of Power's data centers, including managing physical infrastructure (servers, storage arrays, power, cooling systems, etc.), ensuring high availability, and conducting capacity planning to support business growth.
Infrastructure & Networking Architecture: Provide strategic direction for the architecture of Power's infrastructure, ensuring it can scale efficiently and securely while supporting Nitro and other critical applications.
Service Continuity & Disaster Recovery: Develop and maintain high-availability design, monitoring, alerting systems, and disaster recovery plans to ensure uninterrupted access to applications and services.
Technology Innovation: Lead the integration of emerging infrastructure and networking technologies, such as edge computing, AI-driven automation, and software-defined networking (SDN) into the infrastructure roadmap.
Performance & Incident Management: Oversee infrastructure performance metrics and own the incident response process.
Lead post-mortem reviews for infrastructure-related incidents and hold teams accountable for actioning improvements.
Stakeholder Management: Build and maintain strong relationships with BT leaders, business units, and external partners.
Act as a trusted advisor, providing strategic input on how infrastructure can meet business goals.
Vendor & Cost Management: Manage relationships with hardware and software vendors, optimizing procurement and technology investment strategies to align with budgetary goals.
Monitor costs and project future expenditures based on infrastructure growth trends.
Team Leadership & Development: Mentor and guide the Site Reliability Engineering (SRE) team, fostering a culture of technical excellence and innovation.
Define roles, hire talent, and manage performance to ensure the infrastructure team meets the evolving needs of the business.
Infrastructure as Code & Automation: Drive the adoption of automation practices, such as infrastructure as code (IaC), to streamline the deployment, monitoring, and management of infrastructure services.
Compliance & Governance: Ensure infrastructure operations comply with industry standards, regulations, and security best practices, contributing to the governance of enterprise technology standards.
Cross-functional Collaboration: Collaborate with other BT leaders (e.g., Platform Engineering, Quality Assurance, Application Development) to create a cohesive technology ecosystem, aligning infrastructure with development, deployment, and operational practices.
Job Requirements
Experience
Minimum 10 years of experience in technology infrastructure, including data center operations and networking.
At least 5 years of leadership experience in infrastructure or IT operations, with a focus on data centers, servers, storage, and networking.
Proven track record of managing complex infrastructure projects involving high-availability systems and large-scale networking environments.
Experience managing vendor partnerships and overseeing large infrastructure procurement processes, including hardware, software, and service contracts.
Demonstrated success in delivering technology initiatives in dynamic, complex environments, ideally with Agile, DevOps, and SRE frameworks.
Knowledge/Skills
Leadership & Communication: Exceptional leadership skills with a focus on developing people and high-performing teams.
Strong verbal and written communication skills, capable of translating complex technical concepts for non-technical stakeholders.
Technical Expertise: Deep knowledge of data center operations, including server architectures, storage systems, virtualization, and networking protocols.
Networking: Strong knowledge of networking technologies, including LAN, WAN, SDN, firewalls, load balancers, and monitoring tools.
Experience managing large-scale, distributed network architectures.
Automation & DevOps: Familiarity with infrastructure automation tools (e.g., Terraform, Ansible) and DevOps practices for continuous deployment and automated monitoring.
Emerging Technologies: Awareness of, and ability to learn and integrate, cutting-edge infrastructure technologies such as edge computing and AI/ML-powered infrastructure management.
Business Acumen: Strong understanding of business objectives and the role of infrastructure in enabling enterprise growth.
Ability to align technical initiatives with broader business goals.
Cost Management: Experience managing large-scale IT budgets and optimizing infrastructure spend.
Benefits
Full medical, dental, life and disability insurance plans that can be tailored to your specific needs and the needs of your family
A competitive 401(k) retirement savings program matched by Power
Competitive salary
All the tech you need - We'll pay for whatever hardware and software you need to work and make sure you're regularly upgraded to the latest versions
Personal development - We provide books, courses and conferences
Paid parental leave - When the time comes to welcome a new member of the family, we offer paid parental leave
2 events per year focused on internal development and improvement
Local candidates only, hybrid work week
Power Home Remodeling Group is an equal opportunity employer, and we are committed to hiring a diverse and talented workforce.
If you have a disability or special need that requires accommodation, please submit the accommodation request to ******