We are hiring for a Sr. AI/ML Data Scientist to support a multi-year federal program in Woodlawn, MD. Successful candidates must be able to work 2-3 days ONSITE at the government facility. Candidates must be within a 2-hour commuting radius of the government onsite facility. Successful candidates will possess a strong knowledge of AI/ML/LLM, Python, NLP, Generative AI, and experience with the clinical domain.

Position Description:
Staying updated on new methods in NLP, ML, and Generative AI; understanding real-world challenges and developing automated data solutions.
Developing, testing, and deploying new techniques for NLP; understanding scalable development and deployment of ML and Generative AI approaches (such as Large Language Models (LLMs)).
Training and optimizing NLP/LLM models and creating a Python-based pipeline.
Determining the nature of analytic problems, evaluating options, and offering recommendations for resolution.
Advising on the methods and data needed and/or available to evaluate the (intelligence or data) problem.
Collaborating with data collectors and analysts to identify and close gaps on complex monitoring problems.

Required Skills:
Bachelor's degree in Applied Mathematics, Computer Science, or Information Science with industry experience in NLP, data science, or AI/ML/LLM engineering.
Minimum 8 years of Data Scientist experience.
Must be able to obtain and maintain a Public Trust (contract requirement).
Experience with Natural Language Processing (NLP), Generative AI, and Large Language Models (LLMs).
Fluency in Python programming, version control and collaboration with Git, standard Python packages (e.g., Pandas, NumPy, Matplotlib), and ML frameworks.
Knowledge of TensorFlow, PyTorch, scikit-learn, NLTK, Azure ML (optional), and Amazon Web Services EC2.
Experience with scalable data engineering frameworks such as Apache Spark and orchestration frameworks such as Airflow, and/or experience with semantic search.
Expert knowledge in conducting data analysis and applying advanced statistical concepts and machine learning methods to build, train, test, and evaluate a variety of supervised and unsupervised analytic models.
Experience with ML model deployment and operations practices such as DevOps, MLOps, and LLMOps.
Experience with NLP and Generative AI tools and libraries (e.g., regular expressions, spaCy, LangChain), text annotation tools, and semantic frameworks.
Experience with statistical and machine learning software such as Pandas and scikit-learn.
Prior experience working on applications that relate to the clinical domain.
Ability to clean and process large amounts of real-world data.
Experience retrieving and manipulating data from various data sources, including Db2, Oracle, SQL Server, Hadoop, and flat files.
Experience with SQL and database management systems, e.g., MySQL and SQLite.
Experience with, or the ability and willingness to learn, distributed processing via the Hadoop ecosystem, e.g., Spark, Impala, and Hive.
Excellent analytical skills to identify potential risks and propose effective solutions.
Clear communication skills to convey complex technical concepts to various partners.
Ability to collaborate with cross-functional teams.
Proven problem-solving skills and proven written and verbal communication with various audiences, including executive leadership.

Desired Skills:
Prior experience with federal or state government IT projects.
Experience working in an analytical research environment.
Experience in parallel processing, such as GPU programming with CUDA.
Experience using markup languages such as LaTeX and HTML.
Natural Language Processing for anomaly detection.

Position Details:
Clearance: Ability to Obtain a Public Trust
US Citizenship or Authorization to work in the US required
Travel: < 10% (CONUS)
Centurion Consulting Group, LLC is an Equal Opportunity Employer EOE M/F/D/V
No third parties or subcontractors.