In this role, you will:
In-depth knowledge and hands-on experience using Python (PySpark) with real-time data streaming technologies such as Kafka, Spark, and Lambda
Experience in building data processing pipelines on AWS using AWS MSK, EMR (Spark Streaming), DynamoDB, Lambda, Glue, and Athena
Knowledge of device data ingestion and processing using AWS IoT Core, IoT Rules, and EventBridge
Design, implement, and optimize Kafka- and Spark-based near-real-time (NRT) data processing pipelines (a minimal sketch follows this list)
Expertise in building reusable, cloud-native, scalable, and reliable frameworks and tools
Design and implement reusable, cost-effective solutions that meet functional and non-functional requirements such as availability, latency, and fault tolerance
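As a rough illustration of the kind of Kafka/Spark NRT pipeline described above, the sketch below reads JSON device telemetry from an MSK topic with PySpark Structured Streaming, computes windowed averages, and writes Parquet to S3. The bootstrap servers, topic name, schema, and S3 paths are hypothetical placeholders, and the spark-sql-kafka connector is assumed to be available on the cluster; this is a sketch, not a prescribed implementation.

# PySpark Structured Streaming sketch: Kafka (MSK) -> windowed aggregation -> S3.
# All endpoints, names, and paths below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col, window, avg
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("nrt-telemetry").getOrCreate()

# Expected shape of each Kafka message value (JSON)
schema = StructType([
    StructField("device_id", StringType()),
    StructField("temperature", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "b-1.example-msk:9092")  # placeholder MSK endpoint
       .option("subscribe", "device-telemetry")                    # placeholder topic
       .load())

parsed = (raw.select(from_json(col("value").cast("string"), schema).alias("t"))
          .select("t.*"))

# 5-minute average temperature per device, tolerating 10 minutes of late data
agg = (parsed.withWatermark("event_time", "10 minutes")
       .groupBy(window(col("event_time"), "5 minutes"), col("device_id"))
       .agg(avg("temperature").alias("avg_temp")))

(agg.writeStream.outputMode("append")
    .format("parquet")
    .option("path", "s3://example-bucket/telemetry/agg/")                 # placeholder output path
    .option("checkpointLocation", "s3://example-bucket/checkpoints/agg/") # placeholder
    .start()
    .awaitTermination())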
Architectural & Design Skills
Designing scalable data pipelines for IoT telemetry
Real-time vs batch processing architecture
Data governance, security, and compliance
Cost optimization strategies on AWS
Technical Skill Set
Cloud & Infrastructure (AWS)
Amazon EMR – for big data processing using Spark/Hadoop
AWS Lambda, Step Functions – for serverless workflows
S3, DynamoDB, RDS – for data storage and management
IAM, KMS, CloudWatch, CloudTrail – for security and monitoring
AWS IoT Core – for IoT device integration (see the Lambda sketch after this list)
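A minimal sketch of how the services above can fit together for device data: an AWS IoT rule is assumed to route each telemetry message to a Lambda function, which persists it to DynamoDB with Boto3. The table name and payload fields are hypothetical.

# Lambda handler sketch: invoked by an AWS IoT rule, stores each device message in DynamoDB.
# Table name and payload fields are hypothetical.
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("DeviceTelemetry")  # hypothetical table name

def handler(event, context):
    # The IoT rule is assumed to forward the device payload as the event object.
    table.put_item(Item={
        "device_id": event["device_id"],
        "event_time": event["event_time"],
        "payload": json.dumps(event),
    })
    return {"statusCode": 200}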
Big Data & Analytics
Apache Spark & PySpark – for distributed data processing
Data ingestion using Kinesis, Kafka, or AWS IoT Analytics
ETL pipeline design and optimization
Data lake architecture using S3 + Glue + Athena (an Athena query sketch follows this list)
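To make the S3 + Glue + Athena item above concrete, here is a Boto3 sketch that runs a SQL query against a Glue-catalogued table through Athena and polls for the result. The database, table, and output location are hypothetical placeholders.

# Boto3 sketch: query a Glue-catalogued data lake table through Athena.
# Database, table, and result location are hypothetical placeholders.
import time
import boto3

athena = boto3.client("athena")

def run_query(sql, database, output_s3):
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )["QueryExecutionId"]

    # Simple polling loop; production code would add backoff and error handling
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError("Athena query ended in state " + state)
    return athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]

rows = run_query(
    "SELECT device_id, avg(temperature) AS avg_temp FROM telemetry GROUP BY device_id",
    database="iot_lake",                              # hypothetical Glue database
    output_s3="s3://example-bucket/athena-results/",  # hypothetical result location
)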
Programming & Scripting
Python – core language for scripting, automation, and data processing
Boto3 – AWS SDK for Python (see the EMR step-submission sketch after this list)
SQL – for querying structured data
Shell scripting – for automation on EMR or EC2
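As an example of the Python/Boto3 automation listed above, the sketch below submits a spark-submit step to a running EMR cluster. The cluster ID and script location are hypothetical placeholders.

# Boto3 sketch: submit a spark-submit step to an existing EMR cluster.
# Cluster ID and script path are hypothetical placeholders.
import boto3

emr = boto3.client("emr")

response = emr.add_job_flow_steps(
    JobFlowId="j-EXAMPLECLUSTER",  # hypothetical EMR cluster id
    Steps=[{
        "Name": "telemetry-etl",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                "s3://example-bucket/jobs/etl_job.py",  # hypothetical PySpark script
            ],
        },
    }],
)
print(response["StepIds"])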
Education Qualification
Bachelor’s degree in engineering with at least 5 years of experience in relevant technologies.
Technical Expertise:
Excellent knowledge of software design and coding principles
Experience working in an Agile environment
Familiarity with a range of implementation options
Demonstrates knowledge of technical topics such as caching, APIs, data transfer, scalability, and security
Experience in building and managing big data solutions on the cloud (AWS), including data lakes, data warehouses, data integration, data migration, and business intelligence/artificial intelligence solutions
Experience in architecting and implementing data mesh and data fabric solutions specifically leveraging AWS services, including designing domain-oriented data architectures, data products, and data access patterns in a multi-tenant environment.
Expertise in API integration, subscription-based APIs, and multi-tenancy, enabling efficient data exchange and synchronization between applications and platforms.
Familiarity with advanced data management principles and best practices within AWS environments, including data as a service, data modelling, data lineage, data cataloguing, and metadata management.
Develop and maintain data models, schemas, and databases while ensuring high performance, security, and reliability in a global context.
Expertise in data modelling, database design principles, and best practices for data management within a global context.
Business Acumen:
Demonstrates initiative in exploring alternative technologies and approaches to solving problems
Skilled in breaking down problems, documenting problem statements, and estimating effort