Software Engineer, LLM Inference Engine and Product

Details of the offer

Job title: Software Engineer, LLM Inference Engine and Product / Member of Technical Staff
Who We Are
WaveForms AI is an audio large language model (LLM) company building the future of audio intelligence through advanced research and products.
Our models will transform human-AI interactions, making them more natural, engaging, and immersive.
Role Overview
The Software Engineer, LLM Inference Engine and Product will focus on developing and optimizing a real-time inference engine for multimodal large language models (LLMs) that handle audio and text inputs seamlessly.
This role involves leveraging technologies such as LiveKit, RTC engines, WebRTC, and FastAPI to create an efficient, real-time API layer.
You will contribute to cutting-edge AI systems that enable smooth user experiences across platforms, including iOS, Android, and desktop.
Key Responsibilities
Real-time Inference Development: Build and optimize a robust inference engine that supports multimodal LLMs, handling real-time audio and text inputs.
Technology Integration: Leverage tools like LiveKit, RTC engines, WebRTC, and FastAPI to enable low-latency, real-time communication and inference.
End-to-End Pipeline Design: Create and maintain the complete inference pipeline, from data ingestion to model serving, ensuring real-time performance.
Cross-platform Compatibility: Ensure the inference engine operates efficiently across various platforms, including mobile (iOS/Android) and desktop.
Optimization & Performance Tuning: Optimize the inference system to reduce latency, improve throughput, and enhance user experience.
API Development: Design and maintain scalable APIs to support real-time LLM interaction for diverse applications.
Required Skills & Qualifications
Inference Engine Expertise: Proven experience in building and optimizing inference engines for multimodal AI systems, particularly combining audio and text inputs.
Technical Proficiency: Strong experience with LiveKit, RTC engines, WebRTC, and FastAPI for real-time communication and model inference.
Real-time System Design: Expertise in creating real-time pipelines and maintaining low-latency performance in production systems.
Cross-platform Development: Familiarity with iOS, Android, and desktop app development, ensuring seamless integration with inference systems.
Performance Optimization: Proficiency in optimizing inference engines to reduce latency and improve computational efficiency.
API Development: Experience in designing and maintaining APIs for real-time AI applications.

Nominal Salary: To be agreed

Source: Jobleads

Job Function:

Engineering
