Data Scientist // Houston, TX (Hybrid) 12+yrs exp H1B OR H4EAD - C2C.

Houston | 1 day ago | Contractor | External
Negotiable
Data Scientist – NLP & AI (Healthcare)

Positions: 5
Location: Houston, TX (Hybrid – 3 days onsite per week)
Experience: 12+ years

About the Role

As a Senior Data Scientist – NLP & AI, you will be part of an agile healthcare AI team focused on building intelligent clinical data solutions. The role involves developing advanced NLP models, integrating large language models (LLMs) and agentic workflows, and leveraging AWS big data platforms to improve clinical data processing, insights, and usability. This is a hands-on role for experienced professionals with strong healthcare NLP, LLM, and cloud data engineering expertise.

Key Responsibilities

• Analyze and process clinical textual data using AI-driven NLP techniques and advanced machine learning / deep learning models.
• Design, modify, and enhance existing workflows by incorporating LLMs and agentic AI frameworks (e.g., LangGraph) for healthcare use cases.
• Develop and maintain NLP modules using Python and related scripting languages.
• Perform data preprocessing, quality analysis, and performance validation for NLP outputs.
• Create systematic testing procedures, error-handling mechanisms, and documentation for NLP modules.
• Build and optimize ETL pipelines to ingest and process data from diverse sources, including MCP servers.
• Leverage SQL and AWS big data technologies (EMR, Spark/pySpark) to support scalable data workflows.
• Collaborate closely with engineering and platform teams to ensure efficiency, scalability, and reliability.
• Utilize AWS services, particularly AWS Bedrock, to develop and deploy generative AI solutions.
• Work with relational databases such as PostgreSQL or MySQL for data storage and retrieval.

Required Skills & Experience

• 12+ years of experience in data science, NLP, or AI-focused roles.
• Strong proficiency in Python for NLP and machine learning development.
• Proven experience with clinical NLP, including ML and deep learning approaches.
• Hands-on experience with large language models (LLMs) and agentic workflows (e.g., LangGraph).
• Expertise in SQL and AWS big data platforms, including EMR and Spark/pySpark.
• Practical knowledge of AWS services, with emphasis on AWS Bedrock for generative AI.
• Experience working with relational databases (PostgreSQL, MySQL).

Nice-to-Have Skills

• Experience building generative AI solutions in healthcare environments.
• Familiarity with healthcare data standards such as HL7, FHIR, and CCDA.
• Background in automated testing and validation frameworks for NLP models.
• Experience creating technical documentation, user manuals, and specifications.
• Exposure to LangChain or similar AI orchestration frameworks.
• Strong collaboration skills with cross-functional teams (engineering, product, data).

Education

• Engineering or Science degree (BE / ME / BTech / MTech / BSc / MSc).
• Technical certifications across relevant technologies are a plus.