We are seeking a Hands-On Data Architect to design, build, and operate a high-scale, event-driven data platform supporting payment and channel operations. This role combines strong data architecture fundamentals, deep streaming expertise, and hands-on engineering in a regulated, high-throughput environment.
You will lead the evolution from legacy data ingestion patterns to a modern AWS-based lakehouse and streaming architecture, handling tens of millions of events per day, while applying domain-driven design (DDD) and data-as-a-product principles.
This is a builder role, not a documentation-only architect position.
Key Responsibilities
Data Products & Architecture
• Design and deliver core data products including:
• Channel Operations Warehouse (high-performance, ~30-day retention)
• Channel Analytics Lake (long-term retention, 7+ years)
• Define and expose data APIs and status/statement services with clear SLAs.
• Architect an AWS lakehouse using S3, Glue, Athena, and Iceberg, with Redshift for BI and operational analytics.
• Enable dashboards and reporting using Amazon QuickSight (or equivalent BI tools).
Streaming & Event-Driven Architecture
• Design and implement real-time streaming pipelines using:
• Kafka (Confluent or Amazon MSK)
• Amazon Kinesis / Kinesis Data Firehose
• EventBridge for AWS-native event routing
• Define patterns for:
• Ordering, replay, retention, and idempotency
• At-least-once and exactly-once processing
• Dead-letter queues (DLQs) and failure recovery
• Implement CDC pipelines from Aurora PostgreSQL into Kafka and the lakehouse.
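For illustration only, a minimal Python sketch of the consumer patterns listed above, combining at-least-once delivery, an idempotency check, and a dead-letter topic. The broker address, topic names, and process() handler are hypothetical placeholders, not a prescribed implementation.

```python
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "broker:9092",   # placeholder broker address
    "group.id": "channel-ops-consumer",
    "enable.auto.commit": False,          # commit offsets only after successful processing
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "broker:9092"})
consumer.subscribe(["payment-events"])    # placeholder topic name

processed_ids = set()  # illustration only; production would use a durable store

def process(payload: bytes) -> None:
    """Hypothetical business logic for a single event."""
    ...

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event_id = msg.key().decode() if msg.key() else None
    if event_id in processed_ids:         # idempotency: ignore redelivered events
        consumer.commit(msg)
        continue
    try:
        process(msg.value())
        processed_ids.add(event_id)
        consumer.commit(msg)              # at-least-once: commit after the work is done
    except Exception:
        # route poison messages to a dead-letter topic and keep the partition moving
        producer.produce("payment-events.dlq", key=msg.key(), value=msg.value())
        producer.flush()
        consumer.commit(msg)
```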
Event Contracts & Schema Management
• Define and govern event contracts using Avro or Protobuf.
• Manage schema evolution through Schema Registry, including:
• Compatibility rules
• Versioning strategies
• Backward and forward compatibility
• Align domain events with Kafka topics and analytical storage models.
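As a hedged example of the contract governance described above, the sketch below registers a hypothetical Avro event against a Confluent Schema Registry and pins the subject to backward compatibility. The registry URL, subject name, and fields are assumptions for illustration only.

```python
from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

client = SchemaRegistryClient({"url": "http://schema-registry:8081"})  # placeholder URL

# Hypothetical domain event contract for a payment status change
payment_status_changed_v1 = Schema(
    schema_str="""
    {
      "type": "record",
      "name": "PaymentStatusChanged",
      "namespace": "channel.payments.events",
      "fields": [
        {"name": "payment_id",  "type": "string"},
        {"name": "status",      "type": "string"},
        {"name": "occurred_at", "type": "long"}
      ]
    }
    """,
    schema_type="AVRO",
)

subject = "payment-events-value"
client.set_compatibility(subject_name=subject, level="BACKWARD")  # enforce compatibility rule
schema_id = client.register_schema(subject, payment_status_changed_v1)
print(f"Registered {subject} as schema id {schema_id}")
```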
Migration & Modernization
• Assess the current ('as-is') ingestion mechanisms (APIs, files, SWIFT feeds, Kafka, relational stores).
• Design and execute migration waves, cutover strategies, and rollback runbooks.
• Ensure minimal disruption during platform transitions.
Governance, Quality & Security
• Apply data-as-a-product and data mesh principles:
• Clear ownership
• Quality SLAs
• Access controls
• Retention and lineage
• Implement security best practices:
• Data classification
• KMS-based encryption
• Tokenization where required
• Least-privilege IAM
• Immutable audit logging
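A small, hedged illustration of the encryption and classification practices above: writing an object to S3 with SSE-KMS and a classification tag via boto3. The bucket, object key, KMS alias, and tag values are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket, object key, and KMS key alias
s3.put_object(
    Bucket="channel-analytics-lake",
    Key="raw/payments/2024/01/01/events.json",
    Body=b'{"example": "payload"}',
    ServerSideEncryption="aws:kms",         # KMS-based encryption at rest
    SSEKMSKeyId="alias/data-platform-key",
    Tagging="classification=confidential",  # data classification carried as an object tag
)
```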
Observability, Reliability & FinOps
• Build observability for streaming and data platforms using:
• CloudWatch, Prometheus, Grafana
• Track operational KPIs:
• Throughput (TPS)
• Processing lag
• Success/error rates
• Cost per million events
• Define actionable alerts, dashboards, and operational runbooks.
• Design for high availability with multi-AZ / multi-region patterns, meeting defined RPO/RTO targets.
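To make the KPI list above concrete, a minimal sketch publishing custom metrics to CloudWatch with boto3; the namespace, metric names, and values are illustrative assumptions rather than a mandated schema.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Illustrative KPI values emitted by a pipeline job; namespace and names are placeholders
cloudwatch.put_metric_data(
    Namespace="ChannelDataPlatform",
    MetricData=[
        {"MetricName": "EventsProcessed",      "Value": 1_250_000, "Unit": "Count"},
        {"MetricName": "ProcessingLagSeconds", "Value": 4.2,       "Unit": "Seconds"},
        {"MetricName": "ErrorRatePercent",     "Value": 0.3,       "Unit": "Percent"},
        {"MetricName": "CostPerMillionEvents", "Value": 1.85,      "Unit": "None"},
    ],
)
```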
Hands-On Engineering
• Write and review production-grade code using:
• Python, Scala, SQL
• Spark / AWS Glue
• AWS Lambda & Step Functions
• Build infrastructure as code (IaC) using Terraform.
• Implement CI/CD pipelines (GitLab CI/CD, Jenkins).
• Enforce automated testing, performance profiling, and secure coding practices.
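As one hedged example of the hands-on work above, a skeletal AWS Glue (PySpark) job that reads a catalog table and writes curated Parquet to S3; the database, table, and bucket names are placeholders.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw events from the Glue Data Catalog (placeholder database/table)
events = glue_context.create_dynamic_frame.from_catalog(
    database="channel_ops", table_name="payment_events"
)

# ...transformations, data-quality checks, and enrichment would go here...

# Write curated output as Parquet to the lake (placeholder bucket/prefix)
glue_context.write_dynamic_frame.from_options(
    frame=events,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/payment_events/"},
    format="parquet",
)

job.commit()
```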
Required Skills & Experience
Streaming & Event-Driven Systems
• Strong experience with Kafka (Confluent) and/or Amazon MSK
• Experience with Amazon Kinesis / Kinesis Data Firehose
• Deep understanding of:
• Event ordering and replay
• Delivery semantics
• Outbox and CDC patterns
• Practical experience using EventBridge for event routing and filtering
AWS Data Platform
• Hands-on experience with:
• S3, Glue, Athena
• Redshift
• Step Functions and Lambda
• Familiarity with Iceberg-based lakehouse architectures
• Experience building streaming pipelines into S3 and Glue
Payments & Financial Messaging
• Experience with payments data and flows
• Knowledge of ISO 20022 messages:
• pain, pacs, and camt (e.g. pain.001, pacs.008, camt.053)
• Understanding of payment lifecycle, reconciliation, and statements
• Exposure to API, file-based, and SWIFT-based integration channels
Data Architecture Fundamentals (Must-Have)
• Logical data modeling (ER diagrams, normalization up to 3NF/BCNF)
• Physical data modeling:
• Partitioning strategies
• Indexing
• Slowly changing dimension (SCD) types
• Strong understanding of:
• Transactional vs analytical schemas
• Star schema, Data Vault, and 3NF trade-offs
• Practical experience with:
• CQRS and event sourcing
• Event-driven architecture
• Domain-driven design (bounded contexts, aggregates, domain events)