
Nagarjuna Damarla
Email: [email protected]

Mobile No: +91-9941618664

OBJECTIVE

I look forward to associating myself with an organization where I can contribute the best of my abilities to its growth, and which gives me the opportunity to share and upgrade my knowledge by taking on a challenging position.

PROFESSIONAL SUMMARY

Around 4 years of IT experience with Big Data processing tools (Spark, AWS cloud, HDFS, MapReduce, HiveQL, Pig, HBase, Sqoop, Flume), the BI reporting tool Endeca, and Unix shell scripting.

2 years of exclusive experience with Big Data ecosystem components: Spark 2.0, HDFS, MapReduce, Apache Pig, Sqoop, Hive, Flume and HBase.

Hands-on experience using AWS EC2 r3.2xlarge instances.

Proficient in writing PySpark jobs using SparkSQL, and in configuring and deploying Airflow DAGs.

Involved in deployment activities such as deploying control services, EMR and Airflow configurations in the production environment.

Involved in writing Pig and Hive scripts

Good knowledge of Core Java and Object-Oriented Programming concepts.

Good communication, interpersonal and analytical skills, and a strong ability to perform as part of a team.

Interested in learning new concepts to keep up with technology trends.

Smart working and enthusiastic. 

Received client appreciation and the Q1 2015 Quarterly Award for my contribution to the project.

EDUCATION:

Master of Computer Applications with 70% from Osmania University - 2013
Bachelor of Computer Applications with 75% from Nagarjuna University - 2010

PROFESSIONAL EXPERIENCE:


Worked as a Hadoop Developer for Cognizant, Hyderabad, from 2013 to 2016.
Currently working as a Spark Developer for TATA Consultancy Services since May 2016.

TECHNICAL SKILLS

Technologies: Spark 2.0, Python 3.5, HDFS, MapReduce, Apache Pig, Hive, Sqoop, Flume
Schedulers: Airflow
Continuous Integration: Jenkins, Git, CodeCommit
Databases: MySQL, Facets Sybase
Operating Systems: Windows 7, Windows 8 and Linux
IDE & Editors: Eclipse 3.0, EditPlus

PROJECT PROFILES

PROJECT #1

Title : Client Letters – Batch Rewrite
Client : Cambia
Duration : May 2016 to present
Role : Spark Developer
Environment : Spark 2.0, Amazon 2.7.2, Python 3.5, AWS (IAM, VPN, EC2, RDS, S3), Jenkins, CodeCommit, Git and Airflow
Description :

Cambia needs to migrate templates from the Client Letter platform to RedCard's Docs platform. The existing system has no tracking process and has issues such as duplicate letters. As part of the batch letter rewrite, the existing Sybase stored procedures need to be converted into PySpark jobs, with Airflow used as the scheduler for the new batch letters. The new system has to read the Facets completion status from Control-M, schedule the jobs with Airflow, and generate an XML file in an AWS S3 location, which is then used by RedCard.
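For illustration, a minimal sketch of the kind of Airflow DAG used for this scheduling is shown below; the DAG id, task names and commands are hypothetical placeholders rather than the project's actual configuration, and the import paths assume the Airflow 1.x releases in use at the time.

    # Minimal Airflow DAG sketch (all names hypothetical).
    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    default_args = {
        "owner": "batch_letters",                 # hypothetical owner
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
    }

    dag = DAG(
        dag_id="client_letters_batch",            # hypothetical DAG id
        default_args=default_args,
        start_date=datetime(2016, 5, 1),
        schedule_interval="@daily",
    )

    # Placeholder step for checking the Facets completion status fed from Control-M.
    check_facets_status = BashOperator(
        task_id="check_facets_status",
        bash_command="echo 'poll Facets completion status here'",
        dag=dag,
    )

    # Submit the PySpark job that generates the letter XML into S3.
    generate_letters = BashOperator(
        task_id="generate_letters",
        bash_command="spark-submit generate_letters.py",   # hypothetical script name
        dag=dag,
    )

    check_facets_status >> generate_letters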

Responsibilities:

1. Involved in technical discussions, architecture design meetings and estimation reviews.


2. Analysed the PDM (Product Data Management) document and developed PySpark ETL jobs that extract the data related to the requested letter from the FACETS REPLICA DB (Sybase) tables (a minimal sketch of this pattern follows the list).

3. Set up installations for Vagrant, Jupyter Notebook and PySpark in Eclipse.
4. Involved in project presentations on Spark and AWS.
5. Developed client letters in PySpark code converted from Sybase stored procedures.
6. Coordinated with the onsite team on requirements for the development.
7. Stretched work hours to complete tasks on time and gave weekend support whenever required.
8. Installed Airflow and configured DAGs to schedule batch letters.
9. Involved in production deployment activities such as control services deployment, EMR deployment and Airflow deployment across different environments.
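A minimal sketch of the PySpark/SparkSQL pattern referred to in point 2, assuming hypothetical connection details, table names and S3 paths; the actual jobs implement the logic converted from the Sybase stored procedures.

    # Sketch of a PySpark ETL job (hypothetical names and paths): read letter data
    # from the Facets replica DB over JDBC, transform with SparkSQL, write to S3.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("client_letters_etl").getOrCreate()

    # JDBC read from the Sybase replica; URL and table are placeholders, and the
    # jConnect driver jar / "driver" option depend on the environment.
    letters = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sybase:Tds:facets-replica-host:5000/FACETS")
        .option("dbtable", "letter_requests")
        .option("user", "etl_user")
        .option("password", "****")
        .load()
    )

    letters.createOrReplaceTempView("letter_requests")

    # SparkSQL stands in for part of the converted stored-procedure logic.
    selected = spark.sql("""
        SELECT member_id, letter_type, request_date
        FROM letter_requests
        WHERE status = 'REQUESTED'
    """)

    # Persist the extract to S3 for the downstream letter/XML generation step.
    selected.write.mode("overwrite").parquet("s3://example-bucket/letters/extract/")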

PROJECT #2

Title : Capital One Banking Project
Client : Capital One
Duration : Aug 2015 to March 2016
Role : Developer
Environment : Java, Red Hat Linux, Hadoop, HDFS, Hive, HBase, Pig, Sqoop, MapReduce
Description :

Capital One Financial Corporation is an American bank holding company specializing in credit cards, home loans, banking and savings products. The project deals with developing a new platform in Hadoop using historical data held in a MySQL database and various other file formats, and provides storage and maintenance of data spread across all branches. The main aim of the project is to centralize the data and perform analytical processing.

Responsibilities:

1. Used Sqoop to import the data from MySQL into the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.

2. Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.

3. Applied various performance optimizations, such as using the distributed cache for small datasets and partitioning and bucketing in Hive.

4. Created tables in Hive, loaded the data into them, and then loaded the same data into HBase using Hive-HBase integration (see the sketch after this list).

5. Involved in loading and transforming large sets of Structured, Semi-Structured and Unstructured data and analyzed them by running Hive queries and Pig scripts.
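The Hive layout and Hive-HBase mapping mentioned in points 3 and 4 can be sketched as follows; the project ran the equivalent HiveQL in Hive itself, and it is shown here through a Hive-enabled SparkSession only for consistency with the other code in this document. Table, column and column-family names are hypothetical.

    # Sketch of a partitioned, bucketed Hive table and a Hive table mapped onto
    # HBase (all names hypothetical).
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("hive_layout_sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Partitioned and bucketed table for the data imported via Sqoop.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS transactions (
            account_id STRING,
            amount     DOUBLE,
            txn_ts     TIMESTAMP
        )
        PARTITIONED BY (branch_id STRING)
        CLUSTERED BY (account_id) INTO 16 BUCKETS
        STORED AS ORC
    """)

    # Hive-HBase integration: a Hive table backed by an HBase table. This DDL needs
    # the HBase storage-handler jars available, so in practice it is run in Hive.
    hbase_ddl = """
        CREATE TABLE IF NOT EXISTS transactions_hbase (
            rowkey     STRING,
            account_id STRING,
            amount     DOUBLE
        )
        STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
        WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:account_id,cf:amount')
        TBLPROPERTIES ('hbase.table.name' = 'transactions')
    """
    # spark.sql(hbase_ddl)  # run where the storage handler is on the classpath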

PROJECT #3

Title : Data Analysis of Great Eastern Life Insurance


Client : Great Eastern Life Insurance
Duration : Nov 2014 to Aug 2015
Role : Developer
Environment : Java, Red Hat Linux, Hadoop, HDFS, Hive, HBase, Pig, Sqoop, Oozie, MapReduce, MS Excel, Pentaho Report Designer
Description :

The primary goal is to implement Hadoop as the solution for ETL on terabytes of data and to generate various reports based on client requirements. Great Eastern Life Insurance has a number of insurance products; once a product's specification and design are complete and it is available for sale, it is presented to customers through agents and online. Customers log in to the online portal and show interest in a product, but not every customer who views an insurance product actually buys it. Great Eastern Life Insurance wants to find the details of the customers who have shown interest; based on this data, predictive analysis can be performed (an illustrative sketch of this analysis follows the responsibilities list).

Responsibilities:

Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Built a data pipeline using Pig and Java MapReduce to store data onto HDFS, applying transformations and filtering with Pig.
Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
Developed simple to complex MapReduce jobs using Hive and Pig.
Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
Involved in writing HiveQL and Pig Latin, and in importing and exporting data between MySQL/Oracle and HDFS using Sqoop.
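As an illustration of the "showed interest but did not buy" analysis described above, a short sketch follows; the project implemented the equivalent logic in Hive, Pig and MapReduce, and the table and column names here are hypothetical.

    # Illustrative sketch (hypothetical tables/columns): customers who showed
    # interest in a product online but have no matching purchase record.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("interest_analysis_sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    interested_not_bought = spark.sql("""
        SELECT i.customer_id, i.product_id, i.interest_date
        FROM product_interest i
        LEFT JOIN policy_purchases p
          ON i.customer_id = p.customer_id
         AND i.product_id  = p.product_id
        WHERE p.customer_id IS NULL
    """)

    # This extract feeds the reporting and predictive analysis mentioned above.
    interested_not_bought.write.mode("overwrite").saveAsTable("interested_not_purchased")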

PROJECT #4

Title : Endeca iPlus 3.1
Environment : Endeca 3.1, SQL
Role : Developer
Hardware : Virtual Machines, UNIX
Duration : Nov 2013 – Sep 2014

Description:

Deliver the best incentive system to meet the needs of customers and dealers, and provide a state-of-the-art system that allows the business to be more efficient and flexible and to capture increased sales, market share and profit. The existing SIMS R2.2 business processes have been modified, and new functionalities and new reports have been added. Business Intelligence has been introduced to provide better reporting solutions through Endeca. The following are the key changes that have been implemented in iPlus.

Responsibilities:

1. Created Endeca UI reports with components such as charts, results tables and crosstabs.

2. Fine-tuned queries for better performance.
3. Involved in requirement gathering and analysis.
4. Validated 3.1 Endeca reports against 2.2.1 reports.
5. Prepared unit test cases for the reports.
6. Took ownership of handling the more challenging components in terms of both design and configuration.