Core Skills Needed:

Expertise in Datadog Administration:
Proven experience in setting up Datadog from scratch.
Extensive experience integrating Datadog with both cloud applications and on-premises systems.
Strong skills with Datadog Application Performance Monitoring (APM).
Ability to analyze the current environment to enhance configurations and permissions, and develop a strategy for long-term system health and maintenance.
Proficient in setting up alerts, thresholds, and other capabilities to notify the team of environmental changes, usage spikes, etc.

Responsibilities:

Lead Datadog Solutions Design and Implementation:
Oversee the design and implementation of Datadog solutions to monitor and manage diverse infrastructure, applications, and services.
Collaborate with IT and development teams for effective integration of Datadog into the existing technology stack.
Integrate Datadog SaaS with cloud platforms, container orchestration tools, and on-premises hosts for comprehensive monitoring.
Participate in Datadog Agent-based integrations, deployments, custom YAML configurations, and revision control options.
Coordinate with DevOps teams to automate monitoring and response processes.
Serve as the primary point of contact for Datadog-related inquiries, providing expert guidance and best practices.
Stay updated on the latest Datadog features, integrations, and industry trends to optimize monitoring strategies.
Act as a specialist in one or more Datadog product areas; reproduce technical issues and delve into Datadog's integrations.
Lead a team of monitoring engineers, providing mentorship, training, and technical guidance; collaborate with cross-functional teams to align monitoring strategies with business goals.
Design and implement performance monitoring solutions using Datadog to identify and address potential bottlenecks and inefficiencies.
Work closely with system administrators and engineers to optimize resource utilization and enhance overall system performance.
Maintain detailed documentation of Datadog configurations, monitoring policies, and incident response procedures.
Conduct regular knowledge-sharing sessions to empower the broader technical team with Datadog expertise.

Qualifications:

Experience:
10 to 12 years of recent experience in the DevOps/SRE space, including at least 7 years working specifically with Datadog.
Relevant certifications such as Datadog Certified Associate or Datadog Certified Professional are preferred.
Experience migrating monitoring and SIEM (Security Information and Event Management) workloads from tools such as New Relic, Splunk, and AppDynamics.
Extensive hands-on experience with Datadog, including dashboards, alerts, and log analysis.
Scripting experience using Python, PowerShell, and/or Bash.
Excellent knowledge of Windows and Linux administration, combined with a curious, exploratory mindset.