Senior Software Engineer - Reliability

Details of the offer

Luma's mission is to build multimodal AI to expand human imagination and capabilities. We believe that multimodality is critical for intelligence. To go beyond language models and build more aware, capable and useful systems, the next step function change will come from vision. So, we are working on training and scaling up multimodal foundation models for systems that can see and understand, show and explain, and eventually interact with our world to effect change.
The SRE role at Luma AI sits in the infrastructure team and is responsible for defining, measuring, and improving the reliability of Luma's GPU clusters. The SRE team works closely with the research teams to improve the existing research platform and to build the future platform. This is the team that helps build the infrastructure enabling progress at the leading AI lab.

Startup Mindset

Value velocity and execution.
Communicate clearly.
Focus on building what will matter to users and the product.
Be resourceful at finding creative ways to overcome challenges.

Experience

5+ years of proven work experience as a reliability engineer, production engineer, infrastructure software engineer, or in a similar role in a fast-paced, rapidly scaling company.
Strong proficiency in GPU cloud infrastructure, including the underlying concepts of scheduling, scaling, cloud storage, networking, and security.
Proficiency in programming/scripting languages.
Experience with containerization technologies and container orchestration platforms like Kubernetes or equivalent.
Knowledge of IaC tools such as Terraform, CloudFormation, or equivalent.
Excellent problem-solving and troubleshooting skills.
Strong communication and collaboration skills.
Experience with observability tools such as DataDog, Prometheus, Grafana, Splunk, the ELK stack, or similar.
Knowledge of security best practices in cloud environments.
Experience as an SRE within the AI/ML space is strongly preferred.

Benefits

Equity grant to reflect the incredible value you will bring to Luma, with annual refreshes.
Excellent salary and benefits.
Full health, dental, and vision coverage.
Latest and greatest gear.
Stipends towards wellness, house cleaner, and phone/internet.
Unlimited paid time off with 12 days minimum.
Unlimited sick days.

Why Join Luma AI

You will get to work with the world's best AI researchers, shipping their research to millions of users around the world.
You will be equipped with all the tools, technologies, and resources, including AI tools, you need to get the job done.
We build. We ship. Your work will matter to people.
We are building a very widely usable product, and you'll get to work on an equally wide variety of challenging problems.
We have fantastic traction from early customers, whom you'll get to work with directly.
We have the backing of some of the best VCs in Silicon Valley and angels from across the industry.

$180,000 - $250,000 a year
In addition to cash base pay, you'll also receive a sizable grant of Luma's equity. The pay range for this position is for the Bay Area. Base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. Your application is reviewed by real people.




Nominal Salary: To be agreed

Source: Jobleads

