Introduction to the Course, Social Science Approach, ALM Context and Proseminar Objectives

Graduate Research Methods and Scholarly Writing in the Social Sciences: Government and History. Harvard Summer School: SSCI S-100b Section 2 (32761). Joe Bond, 6/24/2013

Course Requirements and Grading

Facilitation (1 minimum) & participation in class discussions: 7%
In-class exercises (8 of 10 required, but NOT graded): 8%
Argument writing assignment (1st paper): 10%
Book review writing assignment (2nd paper): 15%
Literature review writing assignment (3rd paper): 25%
Mid-term exam: 10%
Research design writing assignment (4th paper): 20%
Final class presentation: 5%

Harvard Extension School is not a traditional graduate program. Explain.

Volunteers for next week's facilitation

Introductions

Basics will be brief, but the .ppt will be posted on the course website

Qualitative vs. Quantitative vs. Mixed Methods

Largely a moot debate (more and more studies utilize mixed methods)
Your questions should always determine your methodological approach, not the reverse
Why and when to use A, and why and when to use B, depends on the question:
  Is there a relationship between regime type and violent conflict?
  What are the odds that the nation of Fester will fail in the next 5 years?
  What is the role of political culture as it relates to negotiations?
  What would have happened if Germany had refrained from invading Poland in 1939?
Your choice of methods depends on how you operationalize your variables (e.g. how do you intend to measure political culture?)

Variables

Independent variables (IVs) are those variables that help explain a dependent variable

Independent variables must be antecedent to dependent variables (e.g. relationship between education and income)

Dependent variables (DVs) are the things you are trying to explain

Example: Relationship between SAT scores (IV) and success in college (DV)

Dependent variable should always be labeled along the y axis of a graph

Level of Measurement

Why is it important?

Nominal (measures not ranked: gender, religion, etc.)
Ordinal (measures rank ordered: economic class)
Interval (measures equally spaced: income)
Ratio, as characterized in the social sciences (the measure has an absolute zero: mass, length, time)
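One common answer to "Why is it important?" is that the level of measurement limits which summary statistics are meaningful. The sketch below is only an illustration with made-up survey responses; the variable names and values are assumptions, not course data.

```python
from statistics import mean, median, mode

# Hypothetical survey responses (invented for illustration only).
religion = ["Catholic", "Protestant", "Catholic", "None", "Catholic"]  # nominal
economic_class = ["low", "middle", "middle", "high", "low"]            # ordinal
income = [28000, 41000, 39000, 95000, 30000]                           # interval/ratio

# Nominal: categories have no order, so counts and the mode are what we can report.
most_common_religion = mode(religion)

# Ordinal: categories can be ranked, so the median is meaningful. The distance
# between ranks is arbitrary, so we do not average the rank codes.
rank = {"low": 1, "middle": 2, "high": 3}
label = {code: name for name, code in rank.items()}
median_class = label[median(rank[c] for c in economic_class)]

# Interval/ratio: values are equally spaced, so the mean is meaningful.
mean_income = mean(income)

print(most_common_religion, median_class, mean_income)
```

The same logic carries over to the choice of statistical test: a technique that assumes equal spacing should not be applied to a nominal or ordinal measure.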

Think Nominal, Ordinal, Interval, Ratio (NOIR)!

Association

An association between two variables: the values of one variable tend to coincide (vary or covary) with the values of the other
Example 1: the relationship between sex education and teen pregnancy. Teen pregnancy is the DV, sex education an IV (note: in this example we treat the latter as antecedent to the former)
We might hypothesize that increased exposure to sex education programs helps reduce the incidence of teen pregnancy (i.e. they vary: as X goes up, Y goes down)
Example 2: the relationship between education (IV) and income (DV)
We might hypothesize that the more education one has, the higher one's future income will be (i.e. they covary: as X goes up, Y goes up)

An anomaly. Wrong career trajectory.

Correlation

A statistical term that indicates the strength and direction of a linear relationship between variables (e.g. the relationship between education and income)

IMPORTANT! Association or correlation DOES NOT imply causation

Example 1: drowning (DV) and consumption of ice cream (IV) covary (as ice cream consumption goes up, incidents of drowning increase)
Example 2: children's shoe size (IV) and math performance (DV) covary (as shoe size gets bigger, math skills go up)
Example 2 also highlights the importance of definitions, operationalization and transparency
In example 1, ice cream consumption is a proxy for temperature
In example 2, shoe size is a proxy for age
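The ice cream example can be made concrete with a small sketch. The numbers below are invented so that temperature drives both ice cream sales and drownings; all names and values are assumptions for illustration. Once temperature is held constant (by comparing only observations at the same temperature), the association disappears.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: twelve observations at three temperature levels.
temperature = [10] * 4 + [20] * 4 + [30] * 4
# Both outcomes depend on temperature plus unrelated small fluctuations.
ice_cream = [2 * t + e for t, e in zip(temperature, [1, 1, -1, -1] * 3)]
drownings = [0.5 * t + e for t, e in zip(temperature, [1, -1, 1, -1] * 3)]

# Across all observations the two variables covary strongly.
overall_r = pearson_r(ice_cream, drownings)

# Holding temperature constant: correlate within each temperature level.
within_r = {}
for level in sorted(set(temperature)):
    idx = [i for i, t in enumerate(temperature) if t == level]
    within_r[level] = pearson_r([ice_cream[i] for i in idx],
                                [drownings[i] for i in idx])

print(round(overall_r, 2), within_r)
```

The overall coefficient is close to +1, while the within-temperature coefficients are zero: the association is real, but the causal story runs through the third variable.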

More on Correlation

Correlation is a measure of the direction and degree of strength between two or more variables

A correlation coefficient (r or Pearson's r) is a numerical index of that relationship

The magnitude of the correlation coefficient indicates the strength of the relationship between variables (i.e. -1 to +1)

+1 means a perfect positive correlation (co-vary) while -1 shows a perfect negative correlation (vary)

The closer the correlation coefficient is to +1 or -1, the stronger the relationship

But even a strong [negative or positive] correlation is meaningless if the level of error (significance) is large (e.g. p < 0.5 vs. p < 0.01)
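A minimal sketch of both ideas, strength (r) and level of error (p), using invented education and income figures. The exact permutation test used here is one simple way to obtain a p-value and is an assumption of this sketch, not necessarily the test used in the course: it asks how often a correlation at least this strong would appear if incomes were paired with education levels purely by chance.

```python
from itertools import permutations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: covariance scaled by the spread of each variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(xs, ys):
    """Share of all re-pairings of ys with xs whose |r| is at least the observed |r|."""
    observed = abs(pearson_r(xs, ys))
    extreme = total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so floating-point ties count as "at least as strong".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical sample of seven people (values invented for illustration).
education_years = [10, 12, 12, 14, 16, 18, 20]
income_thousands = [22, 30, 28, 41, 55, 60, 83]

r = pearson_r(education_years, income_thousands)
p = permutation_p_value(education_years, income_thousands)
print(f"r = {r:.3f}, p = {p:.4f}")
```

With only seven cases all 5,040 re-pairings can be enumerated; for realistic sample sizes a random sample of re-pairings or a standard parametric test would be used instead.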

Hypotheses & Null Hypotheses

H1: as education increases, likelihood of voting increases

H0: education has no effect on the likelihood of a person voting

Why do we test the null hypothesis?
The strongest proof is the inability to disprove
Error cannot be eliminated
Like it or not, facts change

Avoid words like "this proves" or "this is irrefutable proof"; instead, use "supports," "lends support to," etc.

Types of Analysis

Analysis may have exploratory, descriptive, explanatory, and predictive objectives or some combination of these aims

Evaluation research is a 5th type that is not discussed here, although it is no less important

Exploratory Research

Undertaken when very little is known about a phenomenon

Forms the foundation for subsequent descriptive and explanatory research

In the early 1980s we did not have a good handle on how many Americans were infected with HIV/AIDS or even what caused it.

This sort of research is often linked with activism

Descriptive Research

Serves to identify important areas of inquiry

Often serves as the first step in explanatory inquiry

Addresses whether a phenomenon is a common occurrence or a rare event

Describe the U.S. electorate and electoral behavior:
Jewish Americans tend to vote for Democrats
Catholics tended to vote Democratic, but the abortion issue has created a rift
Latinos tended to vote overwhelmingly Democratic, but this began to change in 1999 and swung back again in 2008

Examples: Observational Research, Historical Research, etc.

Explanatory Research

Scientific inquiry usually does not end with description but proceeds to explanation

Descriptive findings are likely to lead to the investigation of the factors associated with the outcome and to attempts to understand how these factors contribute to the occurrence of the outcome

Understanding how something works allows us to better predict the future (applies to both qualitative and quantitative research)

Examples: Lessons Learned, Counterfactual Thought Experiments, Regression Analysis, etc.

Prediction: optimistic/happy pop hits predict a bull market six months in advance

Typically follows explanatory research, but not always!

State Failures, Stock Predictions, etc.

Model, below, yields between 50% and 55% excess returns with no compounding using events data

Émile Durkheim's Suicide (1897): An Example of the Research Process

Durkheim's Variables

Inductive Approach or Theory Building

Dependent Variable(s) (what is he trying to explain): RATES of SUICIDE in Europe (1800s)

Independent Variables (those things that help explain the Dependent Variable(s)): CLIMATE, AGE, GENDER, POLITICAL TURMOIL, RELIGION (limited to Christianity), MARITAL STATUS, DEPENDENTS, ETC.

Recall Levels of Measurement (NOIR)
Nominal (can't be ranked)
Ordinal (ranked with unequal or arbitrary intervals)
Interval (equal intervals)
Ratio (as interval with true zero)

Some of Durkheim's Descriptive Findings

Suicide rates are higher for widowed, single and divorced men than married men
Suicide rates are higher for people without children than with children
Suicide more pronounced in colder climates
Suicide rates are higher among Protestants than Catholics

Differences between Protestants and Catholics

Suicide is [more of] a sin for Catholics
Role of coroners: if no suicide note is left, it comes down to the coroner's interpretation (circa 1897)
Differences in social integration: Catholics tend to have higher levels of social integration (think of the movie My Big Fat Greek Wedding)

The Notion of Integration: Going Beyond Religion

Catholic countries tend to be more integrated than Protestant countries, with closer family ties
This is why people who are married and/or have children commit suicide less often
Simply put, they have more to live for

This is even reflected in physical proximity when speaking with others

Social bonds are composed of two factors:
Social integration: attachment to other individuals within society
Social regulation: attachment to society's norms

Suicide rates may increase when either factor reaches an extreme

Building a Theory: Social Integration

Abnormally high or low levels of social integration may result in increased suicide rates
Low levels of social integration result in a disorganized society (chaos)
High levels of social integration drive some to suicide in order to avoid becoming burdens on society

Durkheim's Suicide Typology

Egoistic suicide
  Ties attaching the individual to society are weak
  Few social ties to keep the individual from taking his or her own life (Why not?)
Altruistic suicide
  Individuals are extremely attached to society and have no life of their own (self-immolation)
  They believe their death can bring about a benefit to the society
Anomic suicide
  Weak social regulation between the society's norms and the individual (life becomes too unpredictable and uncertain)
  Often brought on by dramatic changes in economic and/or social circumstances (e.g. wars, recessions and other turmoil)
Fatalistic suicide
  Social regulation is completely instilled in the individual (suicide bombers)
  No hope of change against an oppressive society

Research Cycle as an Iterative Process

Durkheim used an inductive approach, moving from steps #2 & #3 to build step #1 (observation → theory)
Most quantitative research involves deductive research (i.e. theory → empirical testing)

Group Exercise (groups of 2 or 3)

Form groups of 3 or 4
For each group, define one of the four concepts below
Operationalize the concept (i.e. how would you measure the concept in your research?)
Reconvene in 5-8 minutes (max)

Attractiveness Democracy Leadership Love

State Failures

State Failure project (1994)
Objective: then-VP Gore asked the CIA to predict which states would fail 5 years out
Analyzed thousands of [structural] variables
Found that 3 variables could predict failures 85% of the time looking out 5 years:
  infant mortality (a proxy? for what?)
  level of democratization
  openness to trade
Other salient factors: youth bulge, religious distributions, etc.
We will return to this later on in the semester

Africa Prospects: Predicting State Failure with Structural Data

Purpose: to assess the vulnerability of countries to conflict escalation based on their profile or set of structural indicators.
Overall Accuracy: the ratio of correct classifications (C) to all classifications (A). Accuracy = C/A * 100%.
Recall: the ratio of correct classifications (C) to the observed classifications (O). Recall = C/O * 100%; it represents the ability of the algorithm to classify the conflicts as they were observed.
Precision: the ratio of correct classifications (C) to correct (C) and incorrect classifications (I). Precision = C/(C+I) * 100%. It illuminates the algorithm's false positives; specifically, the higher the ratio, the lower the false positives.
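The three definitions can be sketched in a few lines. This is an illustration under stated assumptions: forecasts and outcomes are coded as True (unstable) or False (stable), C in the recall and precision formulas is read as the number of correctly forecast conflicts, and I as the number of false alarms. The function name and the ten country outcomes are invented.

```python
def forecast_metrics(predicted, observed):
    """Return accuracy, recall and precision (in percent) for paired forecasts.

    predicted, observed: equal-length lists of booleans (True = unstable).
    A metric is None when its denominator is zero.
    """
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p and o)              # conflicts forecast and observed
    false_alarms = sum(1 for p, o in pairs if p and not o)  # forecast but did not occur
    misses = sum(1 for p, o in pairs if o and not p)        # occurred but not forecast
    correct = sum(1 for p, o in pairs if p == o)            # all correct classifications

    def pct(numerator, denominator):
        return 100 * numerator / denominator if denominator else None

    return {
        "accuracy": pct(correct, len(pairs)),         # C / A
        "recall": pct(hits, hits + misses),           # C / O
        "precision": pct(hits, hits + false_alarms),  # C / (C + I)
    }

# Hypothetical forecasts and outcomes for ten countries.
predicted = [True, True, True, False, False, False, False, False, True, False]
observed = [True, True, False, False, False, False, False, True, False, False]

metrics = forecast_metrics(predicted, observed)
print(metrics)
```

In this made-up case two of three observed conflicts were forecast (recall), but only two of four alarms were correct (precision), even though seven of ten classifications were right overall (accuracy).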

Near-perfect forecast model
High recall = 99%
High precision = 100%
High accuracy = 99.5%

Bad forecast model #1 (almost every country will be unstable)
High recall = 99% (1% miss rate)
Low precision = 5% (95% false positive rate)
Low accuracy = 40-50%
NET IS CAST TOO BROADLY

Bad forecast model #2 (few countries will be unstable)
Low recall = 5% (95% miss rate)
High precision = 100% (0% false positives)
Low accuracy = 40-50%
NET IS CAST TOO NARROWLY

Forecasting Performance Metrics: Definitions and Illustrations

Figure labels: countries forecast to be unstable at some level of intensity; countries that experience instability; countries that DO NOT experience instability; false positives; misses
Overall Accuracy = # of correct predictions / # of predictions made
Recall = # of correctly predicted conflicts / # of conflicts that occurred
Precision = # of correctly predicted conflicts / # of conflicts predicted to occur

5-15 Year Validation of Forecasting

Average Performance Scores For Different Training Sets / Forecast Periods

Scores reported for each forecast period: Accuracy, Recall, Precision

Low precision scores (high false positive rates) in the out years indicate that the world was more stable than would have been expected given macro-structural conditions.
However, high recall scores indicate the net is cast wide enough to correctly forecast conflicts that DO occur (errors fall on the side of caution).
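The trade-off behind these scores, a wide net that buys recall at the cost of precision, can be illustrated by moving the alarm threshold on a set of risk scores. Everything in this sketch is hypothetical: the ten risk scores, the outcomes, and the idea of a single numeric threshold are assumptions for illustration, not the project's actual algorithm.

```python
# Hypothetical risk scores (0 to 1) and observed outcomes for ten countries.
risk_scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.40, 0.35, 0.30, 0.20, 0.15]
was_unstable = [True, True, False, True, False, False, True, False, False, False]

def score_forecast(threshold):
    """Return (accuracy, recall, precision) in percent for one alarm threshold."""
    predicted = [score >= threshold for score in risk_scores]
    pairs = list(zip(predicted, was_unstable))
    hits = sum(1 for p, o in pairs if p and o)
    false_positives = sum(1 for p, o in pairs if p and not o)
    misses = sum(1 for p, o in pairs if o and not p)
    correct = sum(1 for p, o in pairs if p == o)
    accuracy = 100 * correct / len(pairs)
    recall = 100 * hits / (hits + misses)
    # Guard: with no alarms raised there is nothing to be precise about.
    alarms = hits + false_positives
    precision = 100 * hits / alarms if alarms else 0.0
    return accuracy, recall, precision

# A low threshold casts the net broadly; a high threshold casts it narrowly.
results = {t: score_forecast(t) for t in (0.1, 0.5, 0.9)}
for threshold, (accuracy, recall, precision) in results.items():
    print(f"threshold {threshold}: accuracy {accuracy:.0f}%, "
          f"recall {recall:.0f}%, precision {precision:.0f}%")
```

At the lowest threshold every conflict is caught but most alarms are false; at the highest threshold every alarm is correct but most conflicts are missed.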

Independent Variables

Caloric Intake: Estimate of the average number of calories consumed per person, per day.
GDP per Capita: Annual gross domestic product per person measured in constant 1995 U.S. dollars.
Male/Female Infant Mortality: Number of deaths of male and female children under 1 year of age per 1,000 live births.
Life Expectancy: Average life expectancy (males and females combined).
Youth Bulge: Ratio of population aged 15-29 to those aged 30-69.
Among others.
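Several of these indicators are simple ratios, so their operationalization can be written down directly. The sketch below follows the definitions given above; the function names and the population figures are invented for illustration.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Deaths of children under 1 year of age per 1,000 live births."""
    return 1000 * infant_deaths / live_births

def youth_bulge(population_15_29, population_30_69):
    """Ratio of the population aged 15-29 to the population aged 30-69."""
    return population_15_29 / population_30_69

def gdp_per_capita(gdp, population):
    """Annual gross domestic product per person (same currency units as gdp)."""
    return gdp / population

# Hypothetical country-year (all figures invented).
imr = infant_mortality_rate(infant_deaths=4_500, live_births=90_000)
bulge = youth_bulge(population_15_29=6_000_000, population_30_69=8_000_000)
gdp_pc = gdp_per_capita(gdp=12_000_000_000, population=20_000_000)

print(imr, bulge, gdp_pc)
```

A youth bulge below 1 means the 30-69 group is the larger of the two; values approaching or exceeding 1 indicate a comparatively young population.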


Instability Levels

3 levels of instability intensity:
High intensity (if combined probability > 67%)
Moderate intensity (if combined probability > 67%)
None/Low intensity (if combined probability > 67%)

Maximum level/intensity of conflict per country-year; source: KOSIMO Data Project, Heidelberg Institute of International Conflict Research (HIIK), 1975-2003. http://www.hiik.de/de/index_d.htm
Represents a high threshold of instability

Dependent Variable: Index of Instability

Key Assumption: a country is unstable if (and only if) the government or its opponent(s) threatens or initiates a conflict to restore equilibrium or harmony with respect to its internal or external relations.

Steps
1. Compile a time series data set of the selected target variable's intensity
2. Compile a time series data set of candidate indicators associated with the target's intensity
3. Train an algorithm that explains the historical target intensities with the candidate indicators
4. Calculate performance measures of the explanation from a time series of historical test data
5. Generate vulnerability scores based on projections from the current value of the indicators
6. Calculate a confidence level for the forecasts using the likelihood of occurrence at each intensity
7. Iterate from step #1 for experimentation with alternative target variables
8. Iterate from step #2 for experimentation with alternative explanatory variables or indicators
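Steps 1 through 5 can be sketched with a deliberately tiny example. This is not the project's algorithm: the "algorithm" here is a single cutoff on one indicator (infant mortality), the country-years are invented, and the train/test split by year is an assumption chosen to mirror the idea of training on earlier periods and testing on later ones. Step 6 (confidence levels) is omitted.

```python
# Steps 1-2: hypothetical country-years as (country, year, infant_mortality, unstable).
records = [
    ("A", 1990, 120, True), ("B", 1990, 30, False),
    ("C", 1990, 95, True), ("D", 1990, 45, False),
    ("A", 1992, 110, True), ("B", 1992, 28, False),
    ("C", 1992, 80, False), ("D", 1992, 50, False),
    ("A", 1996, 105, True), ("B", 1996, 25, False),
    ("C", 1996, 90, True), ("D", 1996, 60, False),
]
training_set = [r for r in records if r[1] <= 1993]  # historical training period
test_set = [r for r in records if r[1] > 1993]       # later, held-out test period

def accuracy(rows, cutoff):
    """Share of rows where 'infant mortality >= cutoff' matches observed instability."""
    correct = sum(1 for _, _, imr, unstable in rows if (imr >= cutoff) == unstable)
    return correct / len(rows)

def train_cutoff(rows):
    """Step 3: pick the cutoff that best explains the training period."""
    candidates = sorted({imr for _, _, imr, _ in rows})
    return max(candidates, key=lambda cutoff: accuracy(rows, cutoff))

cutoff = train_cutoff(training_set)

# Step 4: performance on historical data the rule has not seen.
test_accuracy = accuracy(test_set, cutoff)

# Step 5: flag countries as vulnerable from current (hypothetical) indicator values.
current_infant_mortality = {"A": 100, "B": 24, "C": 97, "D": 70}
flagged = [c for c, imr in current_infant_mortality.items() if imr >= cutoff]

print(cutoff, test_accuracy, flagged)
```

The rule fits the training period perfectly yet misses one conflict in the test period, which is why step 4 evaluates on held-out data and steps 7 and 8 iterate over alternative targets and indicators.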

In-Class Writing Exercise 1 June 24, 2013

Educating Sergeant Pantzke (7:35)
Should take no longer than 10-15 minutes

On the opposite side of this paper only, take a position: The U.S. government should [should not] decide which schools can receive GI Bill funding. For example, veterans working their way through Harvard should be able to use GI Bill funds, whereas vets working on a degree at the University of Phoenix should be prohibited from funding their education through the GI Bill.
Include any evaluation criteria that come to mind if you take the position that some schools but not others should qualify for GI Bill funding.

Average Performance Scores For Different Training Sets / Forecast Periods (each cell: accuracy / recall / precision)

Length / period of training set    1989-93            1994-98            1999-03
14 years (1975-88)                 80% / 69% / 73%    77% / 68% / 60%    77% / 75% / 47%
19 years (1975-93)                 --                 82% / 82% / 68%    77% / 84% / 54%
24 years (1975-98)                 --                 --                 78% / 87% / 57%

Conflict Type

4 - War
Examples: WWII, Gulf War, Six-Day War
Definition: Systematic, collective use of force by regular troops

3 - Violent crisis
Examples: Northern Ireland, Basque separatists, ethnic conflict in Bosnia
Definition: Sporadic, irregular use of force, war-in-sight crises

2 - Crisis
Examples: Russian Federation vs. Ukraine over possession of strategic weapons
Definition: Mostly non-violent

1 - None