Reshma G

Cary, NC

Summary

Responsive database expert experienced in monitoring database performance, troubleshooting issues, and optimizing database environments. Possesses strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems. Equally confident working independently or collaboratively, with excellent communication skills.

Overview

13 years of professional experience

Work History

Data Engineer

TCS
01.2024 - Current
  • Developed Spark applications using PySpark and Spark-SQL for data extraction, transformation, and aggregation from multiple file formats
  • Developed Spark applications using Spark-SQL in Databricks to extract, transform, and aggregate data from multiple file formats, analyzing the data to uncover insights into customer usage patterns
  • Developed Snowflake data flows using Snowpipes, implemented CDC using Streams, and automated them with cron jobs/Tasks
  • Built orchestration jobs to recreate tables in Snowflake that replicate SQL Server table structures and load the data
  • Worked with Snowflake Multi-Cluster Warehouses and Snowflake Virtual Warehouses while building Snowpipes
  • Experienced with AWS data-processing services such as Amazon EMR, Amazon S3, and AWS Glue; proficient in scripting languages such as Python and Bash for data processing and automation tasks
  • Worked with various file formats, including delimited text files, clickstream log files, Apache log files, Avro files, JSON files, and XML files
  • Proficient with columnar file formats such as RC, ORC, and Parquet
  • Created and configured a Snowflake warehouse strategy to move terabytes of data from S3 into Snowflake via PUT scripts.
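For illustration, the multi-format ingestion and aggregation described above can be sketched in plain Python (the production work used PySpark/Spark-SQL at scale; the data, field names, and `load_records` helper here are hypothetical):

```python
import csv
import io
import json

def load_records(raw: str, fmt: str) -> list:
    """Normalize delimited-text or JSON input into a common list-of-dicts shape."""
    if fmt == "json":
        data = json.loads(raw)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {fmt}")

# Hypothetical sample inputs in two of the formats mentioned above
csv_raw = "user,clicks\nalice,3\nbob,5\n"
json_raw = '[{"user": "carol", "clicks": 2}]'
records = load_records(csv_raw, "csv") + load_records(json_raw, "json")

# Aggregate clicks per user -- the kind of aggregation Spark-SQL would express
# as GROUP BY over the unified dataset
totals = {}
for r in records:
    totals[r["user"]] = totals.get(r["user"], 0) + int(r["clicks"])
```

In the actual pipeline, each format would map to a Spark reader (`spark.read.csv`, `spark.read.json`, etc.) producing DataFrames that are unioned and aggregated.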

Data Engineer

AmeriGas
04.2023 - 12.2023
  • Designed and developed ETL processes in AWS Glue to migrate data, including campaign data, from external sources such as S3 (ORC/Parquet/text files) into AWS Redshift
  • Collected data using Spark Streaming from an AWS S3 bucket in near-real time, performed the necessary transformations and aggregations on the fly to build the common learner data model, and persisted the data in HDFS
  • Involved in file movements between HDFS and AWS S3; worked extensively with S3 buckets in AWS
  • Designed, developed, and orchestrated data pipelines for real-time and batch data processing using AWS Redshift
  • Worked with structured, unstructured, and semi-structured data across various sources to identify patterns, and implemented data quality metrics with Python scripts tailored to each source.
  • Automated routine tasks using Python scripts, increasing team productivity and reducing manual errors.
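A minimal sketch of the per-source data-quality metrics mentioned above (the `quality_metrics` helper, column names, and sample rows are hypothetical; real checks would run against each source's actual schema):

```python
def quality_metrics(rows, column):
    """Compute simple data-quality metrics for one column: null rate and distinct count."""
    values = [r.get(column) for r in rows]
    total = len(values)
    nulls = sum(1 for v in values if v in (None, ""))
    distinct = len({v for v in values if v not in (None, "")})
    return {
        "total": total,
        "null_rate": nulls / total if total else 0.0,
        "distinct": distinct,
    }

# Hypothetical sample: one row is missing its email
rows = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@x.com"},
]
metrics = quality_metrics(rows, "email")
```

Metrics like these can be emitted per source and column, then compared against thresholds to flag quality regressions.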

Senior Data Engineer

EPIQ
03.2021 - 12.2021
  • Worked on Snowflake schemas and data warehousing; processed batch and streaming data-load pipelines using Snowpipe from the data lake to an AWS S3 bucket
  • Used AWS Data Pipeline for data extraction, transformation, and loading from homogeneous and heterogeneous data sources, and built various graphs for business decision-making
  • Created and configured a Snowflake warehouse strategy to move terabytes of data from S3 into Snowflake via PUT scripts
  • Updated Python scripts to match training data with the database stored in AWS CloudSearch, so that each document is assigned a response label for further classification
  • Experienced in building Snowpipes and using Snowflake Cloning and Time Travel.
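The document-labeling step above can be sketched as a simple key-based match (illustrative only: the `assign_labels` helper, `doc_id` key, and label values are hypothetical; the real scripts matched against an AWS CloudSearch index):

```python
def assign_labels(documents, labeled_index):
    """Attach a response label to each document by matching its id against a labeled index."""
    out = []
    for doc in documents:
        # Fall back to a sentinel label when no training match exists
        label = labeled_index.get(doc["doc_id"], "unlabeled")
        out.append({**doc, "label": label})
    return out

docs = [{"doc_id": "d1"}, {"doc_id": "d2"}]
index = {"d1": "responsive"}  # hypothetical training-data labels
labeled = assign_labels(docs, index)
```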

Big Data Engineer

Synechron
08.2018 - 01.2021
  • Architected and implemented medium- to large-scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, and NoSQL DB)
  • Handled, transformed, and managed Big Data using Big Data frameworks and NoSQL databases
  • Worked on data migration from on-prem SQL Server to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB)
  • Worked extensively with Azure Blob and Data Lake storage, loading data into Azure Synapse Analytics (DW)
  • Created SQL tables with referential integrity and developed queries using SQL, SQL*Plus, and PL/SQL
  • Performed data analysis and data profiling using complex SQL queries on various source systems, including MS SQL Server and Teradata
  • Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob storage, and Azure SQL Data Warehouse.
  • Optimized data processing by implementing Hadoop and Spark frameworks for big data management.

Senior ETL Developer

IBM
03.2012 - 07.2018
  • Transformed business problems into Big Data solutions and defined Big Data strategy and roadmap
  • Installed, configured, and maintained data pipelines
  • Developed Scala scripts and UDFs using DataFrames/SQL/Datasets and RDDs in Spark for data aggregation and queries, and wrote data back into the OLTP system through Sqoop
  • Experienced in working with various data sources, such as Teradata and Oracle
  • Successfully loaded files to HDFS from Teradata, and loaded data from HDFS into Hive and Impala
  • Responsible for wide-ranging data ingestion using Sqoop and HDFS commands
  • Accumulated partitioned data in various storage formats, such as text, JSON, and Parquet
  • Involved in loading data from the Linux file system to HDFS
  • Created new mapping designs using various tools in Informatica Designer, including Source Analyzer, Warehouse Designer, Mapplet Designer, and Mapping Designer
  • Performed data manipulations using various Informatica transformations, such as Filter, Expression, Lookup (connected and unconnected), Aggregate, Update Strategy, Normalizer, Joiner, Router, Sorter, and Union.
  • Mentored junior developers, sharing knowledge of best practices in ETL development and design.
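The partitioned-storage pattern mentioned above can be sketched in plain Python (the `partition_records` helper and sample records are hypothetical; in practice Hive/Spark write each group to a directory such as `dt=2024-01-01/`):

```python
from collections import defaultdict

def partition_records(records, key):
    """Group records into partitions by a key column, mirroring partitioned output layout."""
    parts = defaultdict(list)
    for r in records:
        parts[r[key]].append(r)
    return dict(parts)

# Hypothetical records partitioned by ingestion date
records = [
    {"dt": "2024-01-01", "v": 1},
    {"dt": "2024-01-02", "v": 2},
    {"dt": "2024-01-01", "v": 3},
]
parts = partition_records(records, "dt")
```

Partitioning by a high-selectivity key like date lets downstream queries prune entire partitions instead of scanning the full dataset.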

Education

Master's in Big Data Analytics

University of Central Missouri
Missouri City, MO
12.2022

Skills

  • Data Modeling Techniques
  • Python Programming
  • Data Pipeline Design
  • Big Data Processing
  • Spark Framework
  • AWS Glue, AWS Redshift, Lambda Functions
  • Azure Data Factory, Azure Synapse, Azure Data Lake
