Key decision factors:
• Cloudera, Spark, Scala, HDFS, Hive
• Experience with Cloudera upgrades and migrations
• End-to-end (E2E) data engineers who do their own automation and testing, and are therefore relatively senior (8+ years)
Key responsibilities:
• Design, build, and maintain data processing systems to migrate and refactor data pipelines in Cloudera Data Platform (CDP), including the related version upgrades of underlying components.
• Perform data analysis, data profiling, and data mapping to ensure data integrity during the migration process.
• Develop and implement data migration strategies, including data validation and testing.
• Collaborate with cross-functional teams to ensure that data is migrated in a timely and efficient manner.
• Create technical documentation, including data migration plans, data flow diagrams, and process documentation.
• Provide technical guidance and support to team members and stakeholders.
Qualifications:
• 4+ years of experience in data engineering and migration, preferably with Cloudera Data Platform.
• Strong understanding of data processing and migration tools, such as Spark, HDFS, Scala, and Airflow.
• Experience with scripting languages, such as Python or Shell scripting.
• Cloudera Developer certification.