CAPSTONE PROJECT: BAP-R - imarticus.org



Registered Office: Imarticus Learning Pvt. Ltd.
5th Floor, B-Wing, Kaledonia, HDIL Building, Sahar Road, Andheri (E), Mumbai-400058
CIN No: U74900MH2012PTC230745
022 61419500 www.imarticus.org [email protected]
MUMBAI | PUNE | BANGALORE | CHENNAI | GURUGRAM

CAPSTONE PROJECT: BAP-R

PROJECT TITLE:
Predict survival of passengers on the Titanic ship.

OBJECTIVE:
Perform an analysis of what sorts of people were likely to survive, using the machine learning tools taught during the BAP-R course.

DESCRIPTION:
On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This tragedy shocked the international community and led to better safety regulations for ships.

One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class.

EVALUATION:
• Divide the data randomly into two groups, a 'training set' and a 'test set', in a 70:30 ratio. Build your model on the training set first, then use it to generate predictions on the test set.
• For each passenger in the test set, you must predict whether or not they survived the sinking (0 for deceased, 1 for survived).
• Your evaluation is based on the overall accuracy of the predictions, i.e.
  Overall Accuracy = (No. of survived accurately predicted as survived + No. of deceased accurately predicted as deceased) / Total number of passengers in the test set
• You need to submit the actual R code (with comments on each step performed).
• You need to submit a write-up on the capstone project methodology, steps performed, results and interpretation.
• You must achieve a minimum overall accuracy of 60% on the test dataset to pass the capstone project.
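The evaluation steps above (random 70:30 split, model built on the training set, 0/1 predictions on the test set, overall accuracy) can be sketched in base R roughly as follows. This is only an illustrative sketch under stated assumptions: the data frame is simulated rather than the real passenger data, the survival probabilities are arbitrary, the object names (titanic, train, test, model, overall_accuracy) are hypothetical, and logistic regression is just one possible model choice, not necessarily the one expected by the course.

```r
# Simulated stand-in data (NOT the real Titanic records). Column names follow
# the variable descriptions in this brief; the survival probabilities are
# arbitrary values chosen only so the model has some signal to learn.
set.seed(42)  # assumed seed, used only to make the sketch reproducible
n <- 200
sex      <- factor(sample(c("female", "male"), n, replace = TRUE))
pclass   <- factor(sample(1:3, n, replace = TRUE))
age      <- round(runif(n, min = 1, max = 70))
survival <- rbinom(n, size = 1, prob = ifelse(sex == "female", 0.7, 0.2))
titanic  <- data.frame(survival, pclass, sex, age)

# Step 1: divide the rows randomly into a 70% training set and a 30% test set
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- titanic[train_idx, ]
test  <- titanic[-train_idx, ]

# Step 2: build the model on the training set only
# (logistic regression is an assumed choice for illustration)
model <- glm(survival ~ pclass + sex + age, data = train, family = binomial)

# Step 3: predict 0 (deceased) or 1 (survived) for each test passenger
prob <- predict(model, newdata = test, type = "response")
pred <- ifelse(prob > 0.5, 1, 0)

# Step 4: cross-tabulate actual vs. predicted and compute overall accuracy
conf <- table(actual    = factor(test$survival, levels = 0:1),
              predicted = factor(pred, levels = 0:1))
overall_accuracy <- (conf["1", "1"] + conf["0", "0"]) / nrow(test)

print(conf)
print(overall_accuracy)
```

The diagonal of the cross-tabulation holds the two counts in the accuracy formula (survived predicted as survived, deceased predicted as deceased), so the same calculation applies unchanged to whatever model is finally submitted.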


DATASET DESCRIPTION:

VARIABLE DESCRIPTIONS:

survival    Survival (0 = No; 1 = Yes)
pclass      Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name        Name
sex         Sex
age         Age
sibsp       Number of Siblings/Spouses Aboard
parch       Number of Parents/Children Aboard
ticket      Ticket Number
fare        Passenger Fare
cabin       Cabin
embarked    Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)
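
As a rough illustration of how the coded variables above might be prepared in R before modelling, the sketch below converts the numeric and letter codes into labelled factors and counts missing values per column. The three-row data frame is a made-up sample (not taken from the actual dataset), and the object names passengers and missing_per_column are hypothetical.

```r
# Made-up three-row sample using the column names and codes listed above
passengers <- data.frame(
  survival = c(0, 1, 1),
  pclass   = c(3, 1, 2),
  sex      = c("male", "female", "female"),
  age      = c(22, NA, 35),   # NA included only to demonstrate the check below
  embarked = c("S", "C", "Q"),
  stringsAsFactors = FALSE
)

# Attach the documented meanings to the coded variables
passengers$survival <- factor(passengers$survival, levels = c(0, 1),
                              labels = c("No", "Yes"))
passengers$pclass   <- factor(passengers$pclass, levels = 1:3,
                              labels = c("1st", "2nd", "3rd"))
passengers$embarked <- factor(passengers$embarked, levels = c("C", "Q", "S"),
                              labels = c("Cherbourg", "Queenstown", "Southampton"))
passengers$sex      <- factor(passengers$sex)

# Count missing values in each column before deciding how to handle them
missing_per_column <- colSums(is.na(passengers))

str(passengers)
print(missing_per_column)
```

Treating survival, pclass, sex and embarked as factors lets modelling functions handle them as categories rather than as numbers or free text.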