Who is Recruiting from Scratch:
Recruiting from Scratch is a talent firm that focuses on placing the best candidate for our clients. Our team is 100% remote and we work with teams across North America, South America, and Europe to help them hire.
Senior ML Infrastructure Engineer | AI Infrastructure Scale-Up | SF Based
Base: $180K - $300K + Equity (0.1-3%) | Visa Sponsorship Available
Are you excited about building the future of AI infrastructure? We're scaling our inference systems to handle millions of LLM requests daily, and we need exceptional talent to drive this growth.
The Role:
We're seeking a Senior ML Infrastructure Engineer to architect and implement large-scale, fault-tolerant systems. You'll be joining a team that's pushing the boundaries of AI infrastructure, handling hundreds of millions of API calls daily.
What You'll Do:
Design and implement distributed systems for our inference network
Develop resource allocation models across heterogeneous hardware
Optimize network performance metrics (latency, throughput, availability)
Build robust monitoring and observability systems
Drive architectural decisions and best practices
Collaborate directly with founders and engineering teams
What You Bring:
5+ years building high-performance, scalable distributed systems
Strong programming skills in TypeScript, Python, and either Go, Rust, or C++
Experience with Kubernetes/Nomad orchestration
Hands-on experience with AI tooling (ChatGPT, Claude, Cursor)
GPU programming and optimization skills (CUDA experience is a plus)
Startup experience (pre-seed to series A)
Bonus Points:
Experience with LLM inference engines (vLLM, TensorRT-LLM)
Track record of scaling distributed systems
Location & Details:
San Francisco, CA (In-person)
Full-time W-2 position