Dice is the leading career destination for tech experts at every stage of their careers. Our client, Dizer Corp, is seeking the following. Apply via Dice today!
Title: Data Engineer with Machine Learning
Location: Remote
Description
• Design and develop data pipelines for Generative AI projects using a combination of technologies, including Vector DB, Graph DB, Airflow, Spark, PySpark, Python, LangChain, AWS Functions, Redshift, and SSIS. This involves integrating these tools into seamless, high-performance data flows that support the data requirements of our AI initiatives.
• Collaborate with data scientists, AI researchers, and other stakeholders to understand data requirements and translate them into effective data engineering solutions.
• Use data integration services such as AWS Glue and Azure Data Factory for data ingestion, transformation, and orchestration across various sources and destinations.
• Build data warehouses and data lakes that organize and consolidate large volumes of structured and unstructured data for efficient storage, retrieval, and analysis.
• Optimize and maintain data pipelines to ensure high-performance, reliable, and scalable data processing.
• Develop and implement data validation and quality assurance procedures to ensure the accuracy and consistency of the data used in Generative AI projects.
• Stay current with emerging trends and technologies in the fields of data engineering, Generative AI, and related areas to ensure the continued success of our projects.
• Collaborate with team members on documentation, knowledge sharing, and best practices for data engineering within a Generative AI context.
• Ensure data privacy and security compliance in accordance with industry standards and regulations.
Qualifications
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• Strong experience with data engineering technologies, including Vector DB, Graph DB, Airflow, Spark, PySpark, Python, LangChain, AWS Functions, Redshift, and SSIS.
• Familiarity with Generative AI concepts and technologies, such as GPT-4, Transformer architectures, and related natural language processing techniques.
• Strong understanding of data warehousing concepts, ETL processes, and data modeling.
• Knowledge of cloud computing platforms, such as AWS, Azure, or Google Cloud Platform, is a plus.
• Experience with big data technologies, such as Hadoop, Hive, or Presto, is a plus.
• Familiarity with machine learning frameworks, such as TensorFlow or PyTorch, is a plus.
• A continuous learning mindset and a passion for staying up-to-date with the latest advancements in data engineering and Generative AI.
Thanks & Regards
Satish Reddy | Lead Technical Recruiter | Dizer Corp.
1912 Mentor Ave | Painesville | OH 44077
Direct : |Work : Ext: 134 |