Role: GCP Consultant
Location: Remote
Job Description:
In-depth understanding of Google Cloud products and their underlying architecture:
• BigQuery - warehouse/data marts - Thorough understanding of BigQuery internals to write efficient queries for ELT needs, create views/materialized views, build reusable stored procedures, etc. (a BigQuery sketch follows this list).
• Dataflow (Apache Beam) - reusable Flex templates/data processing frameworks using Java for both batch and streaming needs (a streaming pipeline sketch follows this list).
• CDC (Change Data Capture) - Real-time streaming of database changes or events.
• Experience in designing, building, and deploying production-grade data pipelines using Kafka; strong experience working with event-driven architecture.
• Strong knowledge of the Kafka Connect framework, with experience using several connector types: HTTP REST proxy, JMS, File, SFTP, JDBC, etc.
• Experience in handling high volumes of streaming messages from Kafka (a consumer sketch follows this list).
• Cloud Composer (Apache Airflow) - to build, monitor, and orchestrate pipelines.
• Knowledge of Bigtable.
• Cloud SQL, Compute Engine, Cloud Functions, Cloud Run, App Engine, and Cloud Storage.
• Experience with open-source distributed storage and processing utilities in the Apache Hadoop family.
• Extensive knowledge of processing various file formats: ORC, Avro, CSV, JSON, XML, etc.
• Knowledge/experience in ETL tools such as DataStage or Informatica - ability to understand existing on-premises ETL workflows and redesign them in GCP.
• Experience and expertise with Terraform to provision GCP resources through CI/CD pipelines.
• Knowledge/experience in connecting to on-premises APIs from Google Cloud.
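For the BigQuery bullet above, a minimal sketch of running a DDL statement through the google-cloud-bigquery Java client, here creating a materialized view. The project, dataset, and table names (my_project.my_dataset.orders, daily_orders) are placeholders for illustration, not anything specified by this role:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class MaterializedViewSketch {
  public static void main(String[] args) throws InterruptedException {
    // Uses Application Default Credentials and the active GCP project.
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // Precompute a daily aggregate so downstream ELT queries stay cheap.
    String ddl =
        "CREATE MATERIALIZED VIEW IF NOT EXISTS `my_project.my_dataset.daily_orders` AS "
            + "SELECT order_date, COUNT(*) AS order_count "
            + "FROM `my_project.my_dataset.orders` "
            + "GROUP BY order_date";

    // DDL statements run as ordinary query jobs.
    bigquery.query(QueryJobConfiguration.newBuilder(ddl).build());
  }
}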
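For the Dataflow (Apache Beam) bullet, a minimal Java streaming sketch that reads from Pub/Sub and appends to BigQuery. The subscription and table names are assumptions; a production Flex template would add validated pipeline options, JSON parsing, and dead-letter handling:

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptor;

public class StreamingIngestSketch {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    p.apply("ReadFromPubSub",
            PubsubIO.readStrings()
                .fromSubscription("projects/my-project/subscriptions/events-sub"))
        // Wrap each raw message in a TableRow; a real pipeline would parse and validate here.
        .apply("ToTableRow",
            MapElements.into(TypeDescriptor.of(TableRow.class))
                .via(msg -> new TableRow().set("payload", msg)))
        .setCoder(TableRowJsonCoder.of())
        .apply("WriteToBigQuery",
            BigQueryIO.writeTableRows()
                .to("my-project:my_dataset.events")
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

    p.run();
  }
}

The same pipeline shape serves batch needs by swapping the Pub/Sub source for a bounded one (e.g., TextIO or a BigQuery read).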
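For the Kafka bullets, a minimal Java consumer sketch with manual offset commits, a common starting point for handling high message volumes safely. The broker address, group id, and topic name are placeholders:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrderEventsConsumerSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-events-group");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    // Commit offsets manually so a crash mid-batch never silently skips records.
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(List.of("order-events"));
      while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
          // Per-record processing; at scale this is where batching or parallelism goes.
          System.out.printf("offset=%d key=%s%n", record.offset(), record.key());
        }
        consumer.commitSync(); // commit only after the batch is fully processed
      }
    }
  }
}

Scaling beyond one consumer is a matter of adding instances under the same group id; Kafka rebalances partitions across them.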