Reshma G

Cary, NC

Summary

Responsive database expert experienced in monitoring database performance, troubleshooting issues, and optimizing database environments. Possesses strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems. Equally confident working independently or collaboratively, with excellent communication skills.

Overview

13 years of professional experience

Work History

Data Engineer

TCS
01.2024 - Current
  • Developed Spark applications using PySpark and Spark-SQL for data extraction, transformation, and aggregation from multiple file formats
  • Developed Spark applications using Spark-SQL in Databricks to extract, transform, and aggregate data from multiple file formats, analyzing the data to uncover insights into customer usage patterns
  • Developed Snowflake data flows using Snowpipes, implemented CDC using Streams, and automated them with cron jobs/Tasks
  • Built orchestration jobs to recreate tables in Snowflake that replicate SQL Server table structures and load the data
  • Worked with Snowflake Multi-Cluster Warehouses and Snowflake Virtual Warehouses while building Snowpipes
  • Experienced with AWS data-processing services such as Amazon EMR, Amazon S3, and AWS Glue; proficient in scripting languages such as Python and Bash for data processing and automation tasks
  • Worked with various file formats, including delimited text files, clickstream log files, Apache log files, Avro files, JSON files, and XML files
  • Proficient with columnar file formats such as RC, ORC, and Parquet
  • Created and configured a Snowflake warehouse strategy to move terabytes of data from S3 into Snowflake via PUT scripts.
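For illustration, the multi-format ingestion and aggregation described above can be sketched in plain Python (the production work used PySpark/Spark-SQL at scale; the data, field names, and `load_records` helper here are hypothetical):

```python
import csv
import io
import json

def load_records(raw: str, fmt: str) -> list:
    """Normalize delimited-text or JSON input into a common list-of-dicts shape."""
    if fmt == "json":
        data = json.loads(raw)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {fmt}")

# Hypothetical sample inputs in two of the formats mentioned above
csv_raw = "user,clicks\nalice,3\nbob,5\n"
json_raw = '[{"user": "carol", "clicks": 2}]'
records = load_records(csv_raw, "csv") + load_records(json_raw, "json")

# Aggregate clicks per user -- the kind of aggregation Spark-SQL would express
# as GROUP BY over the unified dataset
totals = {}
for r in records:
    totals[r["user"]] = totals.get(r["user"], 0) + int(r["clicks"])
```

In the actual pipeline, each format would map to a Spark reader (`spark.read.csv`, `spark.read.json`, etc.) producing DataFrames that are unioned and aggregated.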

Data Engineer

AmeriGas
04.2023 - 12.2023
  • Designed and developed ETL processes in AWS Glue to migrate data, including campaign data, from external sources such as S3 (ORC/Parquet/text files) into AWS Redshift
  • Collected data using Spark Streaming from an AWS S3 bucket in near-real time, performed the necessary transformations and aggregations on the fly to build the common learner data model, and persisted the data in HDFS
  • Involved in file movements between HDFS and AWS S3; worked extensively with S3 buckets in AWS
  • Designed, developed, and orchestrated data pipelines for real-time and batch data processing using AWS Redshift
  • Worked with structured, unstructured, and semi-structured data across various sources to identify patterns, and implemented data quality metrics with Python scripts tailored to each source.
  • Automated routine tasks using Python scripts, increasing team productivity and reducing manual errors.
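A minimal sketch of the per-source data-quality metrics mentioned above (the `quality_metrics` helper, column names, and sample rows are hypothetical; real checks would run against each source's actual schema):

```python
def quality_metrics(rows, column):
    """Compute simple data-quality metrics for one column: null rate and distinct count."""
    values = [r.get(column) for r in rows]
    total = len(values)
    nulls = sum(1 for v in values if v in (None, ""))
    distinct = len({v for v in values if v not in (None, "")})
    return {
        "total": total,
        "null_rate": nulls / total if total else 0.0,
        "distinct": distinct,
    }

# Hypothetical sample: one row is missing its email
rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@x.com"},
]
metrics = quality_metrics(rows, "email")
```

Metrics like these can be emitted per source and column, then compared against thresholds to flag quality regressions.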

Senior Data Engineer

EPIQ
03.2021 - 12.2021
  • Worked on Snowflake schemas and data warehousing; processed batch and streaming data-load pipelines using Snowpipe from the data lake to an AWS S3 bucket
  • Used AWS Data Pipeline for data extraction, transformation, and loading from homogeneous and heterogeneous data sources, and built various graphs for business decision-making
  • Created and configured a Snowflake warehouse strategy to move terabytes of data from S3 into Snowflake via PUT scripts
  • Updated Python scripts to match training data with the database stored in AWS CloudSearch, so that each document is assigned a response label for further classification
  • Experienced in building Snowpipes and using Snowflake Cloning and Time Travel.
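The document-labeling step above can be sketched as a simple key-based match (illustrative only: the `assign_labels` helper, `doc_id` key, and label values are hypothetical; the real scripts matched against an AWS CloudSearch index):

```python
def assign_labels(documents, labeled_index):
    """Attach a response label to each document by matching its id against a labeled index."""
    out = []
    for doc in documents:
        # Fall back to a sentinel label when no training match exists
        label = labeled_index.get(doc["doc_id"], "unlabeled")
        out.append({**doc, "label": label})
    return out

docs = [{"doc_id": "d1"}, {"doc_id": "d2"}]
index = {"d1": "responsive"}  # hypothetical training-data labels
labeled = assign_labels(docs, index)
```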

Big Data Engineer

Synechron
08.2018 - 01.2021
  • Architected and implemented medium- to large-scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, and NoSQL DB)
  • Handled, transformed, and managed Big Data using Big Data frameworks and NoSQL databases
  • Worked on data migration from on-prem SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB)
  • Worked extensively with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (DW)
  • Created SQL tables with referential integrity and developed queries using SQL, SQL*Plus, and PL/SQL
  • Performed data analysis and data profiling using complex SQL queries on various source systems, including MS SQL Server and Teradata
  • Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob storage, and Azure SQL Data Warehouse.
  • Optimized data processing by implementing Hadoop and Spark frameworks for big data management.

Senior ETL Developer

IBM
03.2012 - 07.2018
  • Transformed business problems into Big Data solutions and defined Big Data strategy and roadmap
  • Installed, configured, and maintained data pipelines
  • Developed Scala scripts and UDFs using DataFrames/SQL/Datasets and RDDs in Spark for data aggregation and queries, and wrote data back into the OLTP system through Sqoop
  • Experienced in working with various data sources, such as Teradata and Oracle
  • Successfully loaded files to HDFS from Teradata, and loaded data from HDFS into Hive and Impala
  • Responsible for wide-ranging data ingestion using Sqoop and HDFS commands
  • Accumulated partitioned data in various storage formats, such as text, JSON, and Parquet
  • Involved in loading data from the Linux file system to HDFS
  • Created new mapping designs using various tools in Informatica Designer, including Source Analyzer, Warehouse Designer, Mapplet Designer, and Mapping Designer
  • Performed data manipulations using various Informatica transformations, such as Filter, Expression, Lookup (connected and unconnected), Aggregate, Update Strategy, Normalizer, Joiner, Router, Sorter, and Union.
  • Mentored junior developers, sharing knowledge of best practices in ETL development and design.
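The partitioned-storage pattern mentioned above can be sketched in plain Python (the `partition_records` helper and sample records are hypothetical; in practice Hive/Spark write each group to a directory such as `dt=2024-01-01/`):

```python
from collections import defaultdict

def partition_records(records, key):
    """Group records into partitions by a key column, mirroring partitioned output layout."""
    parts = defaultdict(list)
    for r in records:
        parts[r[key]].append(r)
    return dict(parts)

# Hypothetical records partitioned by ingestion date
records = [
    {"dt": "2024-01-01", "v": 1},
    {"dt": "2024-01-02", "v": 2},
    {"dt": "2024-01-01", "v": 3},
]
parts = partition_records(records, "dt")
```

Partitioning by a high-selectivity key like date lets downstream queries prune entire partitions instead of scanning the full dataset.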

Education

Master's in Big Data Analytics

University of Central Missouri
Missouri City, MO
12.2022

Skills

  • Data Modeling Techniques
  • Python Programming
  • Data Pipeline Design
  • Big Data Processing
  • Spark Framework
  • AWS Glue, AWS Redshift, Lambda Functions
  • Azure Data Factory, Azure Synapse, Azure Data Lake
