At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems, from security threat detection to cancer drug development. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions.
We are the Search Platform team, responsible for managing a complex system that indexes all user data created on the Databricks platform and makes it searchable. Our system supports the in-product search feature and various GenAI functionality, acting as an information retrieval layer, including the RAG layer for our world-class Data Intelligence Platform.
As a Software Engineer on our team, you will be at the forefront of our mission to develop the premier Data Intelligence platform, DatabricksIQ.
Responsibilities:
Develop and manage our indexing infrastructure, which operates independently across all major cloud platforms.
Manage distinct real-time and offline indexing pipelines that handle various data types and process terabytes of data in each region.
Expand our capabilities by indexing more data, enhancing reliability, and supporting a broader range of use cases beyond search.
Collaborate with the team to tackle complex challenges in distributed systems and cutting-edge technologies such as vector search, LLMs, and RAG.

Competencies:
BS (or higher) in Computer Science or a related field.
6+ years of production-level experience in one of Java, Scala, C++, or a similar language.
5+ years of experience working on related systems.
Experience in search, distributed systems, search indexing, ETL pipelines, Elasticsearch, real-time message processing, vector search, and scalable system design.
Ability to identify performance issues, root-cause problems, and propose potential solutions.
Excellent cross-functional collaboration and communication skills; a consensus builder.