Cloud Based Data Lake for Enhanced Performance

Applied Cloud Computing
5 min read · Aug 16, 2021

The world of Financial Services is a rich playground for real-time data analytics. It has all the necessary ingredients: exploding data volumes, millisecond latencies, extreme volatility, and the need to detect complex patterns in real time and act on them immediately. The ability to correlate, analyze, and act on data such as trading data, market prices, company updates, and other information arriving from multiple sources at lightning speed is imperative for organizations in this industry.

Real-time insights and analytics on data in motion give organizations the business intelligence they need for digital transformation. From a business perspective, the potential benefits are many: you can use location and contextual data to create better customer experiences, create radically new data-based products for your business, make more informed decisions in complex scenarios, carry out effective monitoring and analysis, detect even the smallest change and trigger immediate action, and extend your solutions to analyze the past, present, and future.

One of India’s leading financial services firms ran its data warehouse in an on-premises setup. More than six applications pushed data into it daily, and the firm had to run manual ETL (Extract, Transform, and Load) processes to generate reports and KPIs. This hampered its ability to provide fresh, up-to-the-minute data to customers and partners. To unlock insights from the data, the firm needed to overcome the limitations of its on-premises data warehouse, which had grown complex and fragmented over time. It also needed to expand its reporting capabilities to address the needs of business units and meet increasingly complex reporting requirements for planning and compliance purposes. As customer data volumes increased with business growth, the firm recognized the need for scalable infrastructure.

The firm was looking for a cloud-based solution that could help it scale and process data within the desired response time. On top of that solution, it also wanted to build an API layer that would power interactive reports and dashboards for visualizing the processed data.

For the firm, this meant engaging the services of Amazon Web Services (AWS). AWS recommended Applied Cloud Computing (ACC), an Advanced AWS Consulting Partner, to help it overcome the above challenges.

Solution

With the help of Applied Cloud Computing, the firm addressed its challenges by implementing several AWS solutions to tap the hidden potential of its unstructured data.

ACC worked with the firm to build its new customer data lake on Amazon Simple Storage Service (Amazon S3). Amazon Redshift was incorporated as the data warehouse, providing query-in-place and analytics capabilities over petabytes of structured and semi-structured data across the data lake. Finally, AWS Glue was adopted as a fully managed extract, transform, and load (ETL) service so that the firm’s data analysts could easily prepare and load data for analytics.

We at ACC selected three applications whose data would be replicated to the data warehouse on the cloud.

  • CBOS: stored all back-office-related data.
  • CRM: stored client-to-advisor mapping and sales mapping data.
  • 64 Bit: stored all trading-related information. This application supplied most of the data, shared on a daily basis.

A total of three AWS Glue jobs were run to replicate the data warehouse on the cloud and to analyze the data once it had been cleaned and stored in a structured format.
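The "clean and store in a structured format" step can be illustrated with a minimal plain-Python sketch. This is not the Glue job code used in the project; the column names (`client_id`, `trade_date`, `amount`), the sample rows, and the validation rules are all hypothetical and only show the general shape of an extract-transform step.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw extract; column names and values are illustrative only.
RAW_CSV = """client_id,trade_date,amount
C001,2021-08-16,1500.50
C002,16/08/2021,  200
,2021-08-16,75.00
C003,2021-08-16,not_a_number
"""


def parse_date(text):
    """Accept YYYY-MM-DD or DD/MM/YYYY; return an ISO date string, else None."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(raw_text):
    """Read CSV text, normalize types, and separate valid rows from rejects."""
    cleaned, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        client_id = (row.get("client_id") or "").strip()
        trade_date = parse_date(row.get("trade_date") or "")
        try:
            amount = float((row.get("amount") or "").strip())
        except ValueError:
            amount = None
        if client_id and trade_date and amount is not None:
            cleaned.append(
                {"client_id": client_id, "trade_date": trade_date, "amount": amount}
            )
        else:
            rejected.append(row)  # kept aside for inspection, not silently dropped
    return cleaned, rejected


cleaned, rejected = clean_rows(RAW_CSV)
```

Keeping rejected rows in a separate list, rather than discarding them, makes it easier to audit what the cleaning step filtered out.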

In addition to this, we at ACC built an API layer to create dynamic reports and interactive dashboards that visualize the processed data, so the firm could derive meaningful insights from it.
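As a rough illustration of what such an API layer might return to a dashboard, the sketch below aggregates processed rows into a JSON-serializable report. The function name `build_report`, the field names (`advisor`, `amount`), and the payload shape are assumptions made for this example and are not taken from the actual implementation.

```python
import json
from collections import defaultdict


def build_report(rows, group_by, metric):
    """Aggregate `metric` per `group_by` value into a JSON-serializable dict."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = row[group_by]
        totals[key] += row[metric]
        counts[key] += 1
    return {
        "group_by": group_by,
        "metric": metric,
        "groups": [
            {"key": key, "total": totals[key], "count": counts[key]}
            for key in sorted(totals)  # stable ordering for the dashboard
        ],
    }


# Hypothetical processed rows, as they might come back from the warehouse.
rows = [
    {"advisor": "A1", "amount": 100.0},
    {"advisor": "A2", "amount": 50.0},
    {"advisor": "A1", "amount": 25.0},
]

report = build_report(rows, group_by="advisor", metric="amount")
payload = json.dumps(report)  # body an API endpoint could send to a dashboard
```

Because `group_by` and `metric` are parameters, the same function can back several report views without a separate endpoint for each.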

Benefits

Scaling Up with Business Growth: With rapid customer growth, the last thing the firm needed to worry about was whether its data infrastructure could keep up with business demands. ACC provided a solution that was highly agile and flexible, allowing the firm to scale quickly and serve its customers with better insights for improved end-user satisfaction, all while keeping costs under control.

Well-integrated Data Architecture: By building the new data lake, the firm could bring together all its data from various sources, and at any scale, into one central repository. The firm could also run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better financial decisions.

Leveraging Parquet for higher performance: Parquet is a columnar data format, so a query reads only the columns it references. This allows Amazon Redshift to scan significantly less data. With less I/O, queries run faster and cost less per query.
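The effect of a columnar layout on data scanned can be shown with a small stand-alone sketch. It does not use Parquet itself; it packs a hypothetical three-field table (`trade_id`, `qty`, `price`) once row by row and once column by column, then counts the bytes each layout has to read to sum a single column. Real Parquet files also apply encoding and compression, which this sketch ignores.

```python
import struct

# Hypothetical table: 1,000 trades, three 8-byte fields each (trade_id, qty, price).
rows = [(i, float(i % 50), float(i) * 1.5) for i in range(1000)]

# Row-oriented layout: each record stores all three fields together.
row_store = b"".join(struct.pack("<qdd", *r) for r in rows)

# Column-oriented layout: each field is stored contiguously on its own.
n = len(rows)
col_store = {
    "trade_id": struct.pack(f"<{n}q", *(r[0] for r in rows)),
    "qty": struct.pack(f"<{n}d", *(r[1] for r in rows)),
    "price": struct.pack(f"<{n}d", *(r[2] for r in rows)),
}


def sum_price_row_store(buf):
    """Sum price from the row layout; every full record is read to reach it."""
    total = 0.0
    for _trade_id, _qty, price in struct.iter_unpack("<qdd", buf):
        total += price
    return total, len(buf)  # (result, bytes scanned)


def sum_price_col_store(cols):
    """Sum price from the column layout; only the price column is read."""
    buf = cols["price"]
    total = sum(value for (value,) in struct.iter_unpack("<d", buf))
    return total, len(buf)  # (result, bytes scanned)


row_total, row_bytes = sum_price_row_store(row_store)
col_total, col_bytes = sum_price_col_store(col_store)
```

Both functions compute the same sum, but the columnar version reads one third of the bytes because the other two columns are never touched.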

Lower cost: From a cost perspective, one pays standard rates for data stored in Amazon S3 and only a small amount per query to analyze it with Amazon Redshift. Because the Parquet format significantly reduces the amount of data scanned, costs are much lower, and users get fast results even for large, complex queries. The pay-as-you-go model gives further control over cost. By rightsizing cloud resources and continually optimizing the use of software licenses, our goal was to put the best cost optimization strategy in place.
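To make the relationship between data scanned and per-query cost concrete, here is a small worked calculation. It assumes a pricing model billed per terabyte scanned; the rate `PRICE_PER_TB`, the 2 TB dataset size, and the eight equal-width columns are placeholder values for illustration, not figures from this project or from current AWS pricing.

```python
TB = 1024 ** 4  # bytes per terabyte (binary definition, assumed for this sketch)


def estimate_scan_cost(bytes_scanned, price_per_tb):
    """Estimated cost of one query billed by the amount of data scanned."""
    return bytes_scanned / TB * price_per_tb


PRICE_PER_TB = 5.0  # placeholder rate; check current pricing before relying on it

full_scan = 2 * TB             # hypothetical: query reads the whole 2 TB dataset
columnar_scan = full_scan // 8  # hypothetical: reads 1 of 8 equal-width columns

cost_full = estimate_scan_cost(full_scan, PRICE_PER_TB)
cost_columnar = estimate_scan_cost(columnar_scan, PRICE_PER_TB)
```

Under these assumptions the full scan costs 10.0 and the single-column scan costs 1.25 in the same currency unit, so the saving scales directly with the fraction of data the query avoids reading.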

Enhanced Performance: In an industry as dynamic as Financial Services, speed matters. The firm was finding it difficult to serve its global customer base with its existing on-premises data warehouse. We provided a solution that delivered 99.99% uptime and the ability to make real-time updates, which is essential for delivering financial data.

Improved Security: Because financial data transactions were part of the firm’s daily routine, a highly secure system was required. We implemented a modern cloud solution with a secure connection between the on-premises data warehouse and the cloud network, minimizing the chance of cyber threats or attacks that could result in massive outages.

Written by Applied Cloud Computing

Applied Cloud Computing (ACC) is an IT Services & Consulting Company. It helps customers with Product Engineering, Digitalization, Big Data, and Security Assessment.
