🔆 Overview

Crayon Data, an esteemed client of ours, stands as a leading provider of AI-led revenue acceleration solutions. Headquartered in Singapore, with a local footprint extending to India and the UAE, Crayon Data specialises in harnessing artificial intelligence to propel revenue growth for businesses worldwide. We are privileged to collaborate with Crayon Data, combining our technical expertise to augment their innovative solutions. Together, we endeavour to redefine possibilities in the domain of revenue acceleration through AI.

🎢 Challenges

⦿ Data sync must complete within a few hours (2-4).

⦿ The data set is large: around 100 TB.

⦿ Multiple AWS accounts are involved, so cross-account privileges/permissions are needed.

⦿ No loss of data on S3.

⦿ Only one production environment exists.

🏆 Solution

⦿ Create infrastructure resources (VPC, subnets, route tables, IGW, S3) using CloudFormation (IaC) in the AWS environment.

⦿ Launch EC2 instances in the new VPC from AMIs taken from the existing Blue account.

⦿ Create a VPC peering connection between Account A and Account B.

⦿ Create VPC endpoints for services (EC2/S3/EMR).

⦿ Create cross-account roles.

⦿ Apache Airflow runs on the EC2 servers, so once the instances are launched from the AMIs, test it against the new S3 buckets and EMR resources.

⦿ Start the data migration using the AWS CLI command aws s3 cp to copy data from Account A to Account B.

⦿ Use aws s3 sync to keep the data continuously synced.

⦿ Once testing in the Green account (AWS Account B) with the new S3 and new EMR environment is complete, plan the cutover.

⦿ Shut down the Blue environment, make the requisite DNS changes, and turn on the Green environment.
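The cross-account and sync steps above can be sketched roughly as follows. This is a minimal illustration, not the configuration actually used in the project: the bucket names, the role ARN, the prefixes, and both helper functions are hypothetical, and it assumes a role in Account B is granted read access to the source bucket in Account A. Splitting the sync by prefix so that several aws s3 sync processes can run side by side is one common way to speed up a large copy; the case study does not say which optimizations were applied.

```python
import json

def source_bucket_read_policy(bucket: str, reader_role_arn: str) -> dict:
    """Bucket policy for the source bucket (Account A) that lets a role
    in the destination account (Account B) list and read its objects.
    Both arguments are hypothetical placeholders."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowListFromDestinationRole",
                "Effect": "Allow",
                "Principal": {"AWS": reader_role_arn},
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "AllowReadFromDestinationRole",
                "Effect": "Allow",
                "Principal": {"AWS": reader_role_arn},
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }

def build_sync_commands(src_bucket, dst_bucket, prefixes, dry_run=True):
    """Build one `aws s3 sync` command per prefix so the copies can be
    run in parallel. With dry_run=True the CLI only reports what it
    would copy."""
    commands = []
    for prefix in prefixes:
        p = prefix.strip("/")
        cmd = [
            "aws", "s3", "sync",
            f"s3://{src_bucket}/{p}/",
            f"s3://{dst_bucket}/{p}/",
            "--only-show-errors",
        ]
        if dry_run:
            cmd.append("--dryrun")
        commands.append(cmd)
    return commands

if __name__ == "__main__":
    # Placeholder names; replace with real values before use.
    policy = source_bucket_read_policy(
        "example-src", "arn:aws:iam::222222222222:role/example-migration-role"
    )
    print(json.dumps(policy, indent=2))
    for cmd in build_sync_commands("example-src", "example-dst", ["raw", "curated"]):
        print(" ".join(cmd))
```

Each generated command could then be handed to subprocess.run or a job scheduler. Re-running the same commands after the initial copy transfers only new or changed objects, which is what keeps the final sync before cutover short.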

📌 Technologies/Tools/Services Used

1. AWS CloudWatch

2. AWS EC2

3. AWS EBS

4. AWS NACL & AWS Security Groups

5. IAM Roles & Policies

6. AWS ALB

7. AWS VPC Endpoints

8. AWS Classic ELB

9. AWS S3

10. AWS EMR

11. AWS Route53

12. AWS CLI

13. AWS CloudFormation

14. Apache Airflow

15. AWS SSM

📈 Impact

Data was synced from the data lake in Account A to the S3 buckets in Account B in less than 2 hours, with S3 sync optimizations done by CloudWayZ.

The customer reported no data loss and an RTO of less than an hour.

The 100 TB of data was synced using S3 sync, which incurred no AWS service charges, only AWS data transfer in and out charges. The team had initially planned to use AWS DataSync, which could have cost a lot more.
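The cost comparison comes down to simple arithmetic: a managed transfer service that bills per GB copied adds a fee proportional to the data size, while a CLI-based sync has no such per-GB service fee. The sketch below only shows the shape of that calculation. The function name is hypothetical, the rate passed in is a placeholder rather than a quoted AWS price, and 1 TB is assumed to be 1024 GB.

```python
def per_gb_service_fee(size_tb: float, rate_per_gb: float) -> float:
    """Service fee for copying size_tb terabytes at a per-GB rate.
    Assumes 1 TB = 1024 GB; rate_per_gb is a caller-supplied placeholder."""
    size_gb = size_tb * 1024
    return size_gb * rate_per_gb

if __name__ == "__main__":
    # Placeholder rate for illustration only; check current AWS pricing.
    example_rate = 0.01
    print(per_gb_service_fee(100, example_rate))
```

Because the fee scales linearly with size, any per-GB rate is multiplied by roughly 102,400 GB for a 100 TB data set, and again for each full re-copy.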

Costs were also saved by using right-sized instances, not using gp3 SSD volumes, removing NAT gateways, etc.
