Pranusha Simhadri - Software Engineer II

Summary

Detail-oriented and highly skilled Software Engineer with 5+ years of experience in designing, implementing, and maintaining data pipelines and infrastructure. Proficient in various programming languages and technologies, adept at optimizing data systems for efficiency and reliability. Seeking to leverage expertise in data engineering to contribute to the success of a dynamic organization.

Overview

11 years of professional experience

Work History

Software Engineer II

S&P Global

Princeton

02.2020 - 03.2025

· Led the design, development, and deployment of scalable ETL pipelines using AWS Glue and AWS Data Pipeline, processing and transforming large datasets from multiple sources.

· Implemented real-time and batch data ingestion frameworks with Amazon EMR and Apache Spark, optimizing for high-performance distributed computing.

· Built serverless ETL workflows with AWS Lambda, improving scalability and reducing operational overhead.

· Designed and optimized Amazon Redshift schemas for efficient data warehousing and query performance.

· Developed real-time data streaming solutions using Kafka and integrated them with AWS Lambda to process and analyze large volumes of data.

· Used Hadoop and Hive for large-scale batch data processing and analytics.

· Built and maintained data processing pipelines for NoSQL databases (e.g., MongoDB, Cassandra) to support high-volume, unstructured data analysis.

· Created and optimized ETL workflows with Informatica IICS and DataStage to migrate and integrate data across various systems.

· Integrated data quality and governance processes using AWS Glue Data Catalog and metadata management.

· Collaborated with cross-functional teams to implement CI/CD pipelines with Jenkins and Terraform for automated deployments and infrastructure management.

· Ensured end-to-end monitoring and troubleshooting of data pipelines using AWS CloudWatch and Splunk.

· Designed and implemented data warehousing solutions using Snowflake to store and analyze structured and semi-structured data, ensuring scalability and performance optimization.

· Optimized query performance in Snowflake through clustering keys and partitioning strategies, reducing query latency.

· Integrated Snowflake with AWS services (S3, Lambda) for seamless data ingestion and processing.

Software Engineer

Parexel

Hyderabad

01.2014 - 06.2018

· Clients: Bayer, Roche, Johnson & Johnson, Pfizer, Genentech, Novartis, Novo Nordisk, Merck, GSK

· Developed Oracle DB-driven Java Interactive Voice Response (IVR) and Interactive Web Response (IWR) call flows for clinical trial processes (EDC, IRT, eTMF, CTMS, clinical data warehouses)

· Involved in the full software development life cycle, from requirements gathering and analysis through application design, coding, testing, debugging, and maintenance

· Used Oracle 11g as the database, a cluster of Tomcat servers as the application server, and Eclipse as the development IDE

· Wrote complex SQL queries, stored procedures, and SQL jobs to improve the efficiency of internal processes

· Analyzed client requirements and developed logic models

· Provided knowledge transfer to other support teams

Education

Master of Science - Computer Science

Villanova University

Villanova, PA

05-2020

Bachelor of Science - Computer Science

GITAM University

Vizag, AP, India

05-2013

Skills

Cloud Platforms: AWS (AWS Glue, AWS Data Pipeline, Amazon EMR, AWS Lambda, Amazon Redshift)

Big Data Technologies: Hadoop, Hive, Spark, NoSQL DB (Cassandra), Kafka, API

Programming Languages: Python, Scala, Java, SQL, PySpark

ETL Tools: SSIS, Informatica IICS, DataStage

Data Warehousing: Redshift, Snowflake, Amazon S3, PostgreSQL

DevOps & CI/CD: Terraform, Docker, Git, Jenkins

Workflow Orchestration: Apache Airflow, AWS Step Functions

Monitoring & Logging: CloudWatch, Splunk

Academic Projects:

Data Science with Python:

· Conducted comprehensive exploratory data analysis, statistical analysis, and data visualization in Python using the NumPy, Pandas, Seaborn, Matplotlib, Plotly, and Folium packages on a large dataset covering the city of Barcelona.

· Sentiment analysis of movie reviews from Twitter: Conducted sentiment analysis on tweets about popular movies, collected in real time through the Twitter API, to assess how audiences feel about specific movies by categorizing sentiment as positive, negative, or neutral.

· Used Tweepy and the Twitter API to collect a large volume of tweets related to specific movies, hashtags, or movie titles, with query filtering on time frames, keywords, and geographical regions.

· Cleaned and preprocessed the raw tweet data by removing stopwords, special characters, URLs, and non-alphanumeric characters to prepare it for text analysis.

· Applied machine learning techniques, such as Naive Bayes, Logistic Regression, and Support Vector Machines (SVM), to classify tweets as positive, negative, or neutral. Used scikit-learn and TensorFlow for model training and evaluation, tuning hyperparameters to improve performance.

· Created visualizations with Matplotlib and Seaborn to display sentiment distributions across movie reviews, showing how audience sentiment changed over time or with tweet volume.

· Built interactive sentiment dashboards using Plotly and Dash, enabling users to explore results for different movies, track trending opinions, and view sentiment shifts.

· Evaluated model performance using accuracy, precision, recall, and F1-score.

DevOps: Tools and Techniques:

· Designed an application using Microsoft Azure Cloud services suitable for continuous deployment systems with a DevOps focus.

· Created deployment processes using continuous integration tools such as Jenkins.

· Maintained Git repositories for the DevOps environment, covering version control and build automation by integrating Git with Jenkins.

· Built a website in C# with Visual Studio and hosted it on Microsoft Azure using the Azure Web App service.

· Provisioned a MySQL database for the web app through the Azure portal. Created an Azure Function App and storage containers.

Java:

· Built a REST API using Spring Boot and Hibernate in Java 8. Created services to consume REST APIs and to communicate between components using dependency injection.

· Applied object-oriented programming (OOP) principles using interfaces, abstract classes, method overriding, and method overloading. Developed server-side applications that use Spring Boot and Hibernate to query the database and perform CRUD operations.

· Used REST controllers in the Spring Framework to create RESTful web services, with JSON objects for communication.

· Implemented SLF4J for logging in the project.

· Developed test classes in JUnit for unit testing.

· Used Postman to test the RESTful API for HTTP requests such as GET, POST, and PUT.

· Implemented Swagger to document the REST services.
