jonas-singleton -
Research Terminology for the Social Sciences
Data is a collection of observations
  Observations have associated attributes
  These attributes are variables
  A collection of data is often called a “data set”
What are variables?
  A measure that takes different values for different observations
    Across a population (cross-sectional)
    Across time (cross-temporal)
    Both! (Panel data)
Independent/explanatory variables are variables we think have an effect on other variables
  Control variables are a special category
Dependent/outcome variables are the variables we are trying to explain or predict
What is Data?
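To make the vocabulary concrete, here is a minimal Python sketch of a tiny panel data set. The country labels, variable names (`turnout`, `gdp_growth`), and all values are invented for illustration and do not come from the slides. Each row is one observation, each key is a variable, and the two helper functions slice the panel cross-sectionally and across time.

```python
# Hypothetical panel data: each row is one observation (a country in a year).
# Country labels and values are made up purely for illustration.
panel = [
    {"country": "A", "year": 2000, "turnout": 61.0, "gdp_growth": 2.1},
    {"country": "A", "year": 2004, "turnout": 58.5, "gdp_growth": 1.4},
    {"country": "B", "year": 2000, "turnout": 72.3, "gdp_growth": 3.0},
    {"country": "B", "year": 2004, "turnout": 70.1, "gdp_growth": 2.6},
]

# Which variable plays which role is a research-design choice (assumed here).
DEPENDENT = "turnout"        # outcome we want to explain
INDEPENDENT = "gdp_growth"   # explanatory variable


def cross_section(rows, year):
    """Observations across the population at a single point in time."""
    return [r for r in rows if r["year"] == year]


def time_series(rows, country):
    """Observations for a single unit across time."""
    return [r for r in rows if r["country"] == country]


print(cross_section(panel, 2000))
print(time_series(panel, "A"))
```

Keeping both `country` and `year` on every row is what makes this a panel: the same list can be read across units or across time.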
Features of variables
  Take on some set of values
  Different values have different meanings
Could be numerical, meaning they have number values attached
  Continuous
  Discrete or Limited
Could be categorical, meaning they have descriptive terms attached
  Ordinal (the categories have numerical ranks associated with them)
  Typological (the categories are descriptive and do not represent some ordering/ranking/values)
Unpacking Variables
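The difference between ordinal and typological variables shows up as soon as the data are coded for analysis. The sketch below is one common way to do it, not the only one; the education scale and region list are hypothetical examples. Ordinal categories map to ranked integers, while typological categories get one 0/1 indicator each so that no ordering is implied.

```python
# Ordinal: the categories are ranked, so integer codes preserve their meaning.
EDUCATION_RANK = {"primary": 1, "secondary": 2, "tertiary": 3}  # hypothetical scale

# Typological: the categories are labels only; any numbering would be arbitrary.
REGIONS = ["north", "south", "east", "west"]  # hypothetical categories


def encode_ordinal(value, ranks=EDUCATION_RANK):
    """Return the rank attached to an ordinal category."""
    return ranks[value]


def encode_typological(value, categories):
    """Return one 0/1 indicator per category; no ordering is implied."""
    return [1 if value == c else 0 for c in categories]


print(encode_ordinal("secondary"))
print(encode_typological("east", REGIONS))
```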
Determine what kind of data will be needed based upon your research question
  Quantitative?
    Large-N
    Measurable in a clear and consistent way
  Qualitative?
    Case studies
    Not easily quantifiable
The Holy Grail of Social Science Research: Turning Qualitative Data into Quantitative Measures
Research Design
Libraries have a large collection of data sets that are ready to be used, in common software formats
Digital Centers have software suites for all steps of the data collection process
  Bibliographic packages
  Data management software
  Data analysis software
Reference librarians are useful resources for discovery
Sometimes, you may need to collect original data
  Field work: going out and gathering data from observations
  Archival work: finding the data in other information sources and aggregating it into a data set
Collecting Data
Operationalization is the process of turning theoretical concepts into measurements
Matching theory with variables
  Ideological framework
  The type of problem should suggest an appropriate measure
Matching levels
  Macro vs. micro, and everything in between
Matching observations
  Individuals? Pairs? Groups?
Matching meanings
  This is the hardest
Operationalization
Models are statements about the way variables relate to one another
Two basic types in social science: analytical and formal
Analytical Models
  Describe the causal relationships between variables
  Rely upon probability and statistics
Formal Models
  Describe a simplified version of reality
  Variables become elements of this simplified reality
  Rely upon theoretical frameworks
Both types of models can be tested with data
Using Variables
Mixed methods analysis is the “gold standard”
  Combination of quantitative and qualitative data
Formal models
  Mathematical representations of decisions
  Game theory
Matching the research design to the hypothesis under investigation is critical
  How questions are asked and answered
  What counts as evidence?
Research Methods
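As an illustration of what a "mathematical representation of decisions" can look like, the sketch below encodes a two-player game as a payoff table and searches for pure-strategy Nash equilibria. The game is a textbook Prisoner's Dilemma and the payoff numbers are illustrative assumptions, not taken from the slides.

```python
from itertools import product

# Payoffs are (row player, column player); higher is better.
# The numbers are illustrative values for a Prisoner's Dilemma.
ACTIONS = ["cooperate", "defect"]
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def pure_nash_equilibria(actions, payoffs):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(actions, repeat=2):
        row_payoff, col_payoff = payoffs[(row, col)]
        # Row player's alternatives, holding the column player's action fixed.
        row_ok = all(payoffs[(alt, col)][0] <= row_payoff for alt in actions)
        # Column player's alternatives, holding the row player's action fixed.
        col_ok = all(payoffs[(row, alt)][1] <= col_payoff for alt in actions)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


print(pure_nash_equilibria(ACTIONS, PAYOFFS))
```

With these payoffs the only pure-strategy equilibrium is mutual defection, even though mutual cooperation would leave both players better off. That gap between individual and collective outcomes is the kind of prediction a formal model produces and data can then test.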
Descriptive Statistics
  These are measures designed to help you “picture” your data
  Means, Medians, Modes
  Standard Deviations, Variances
Exploratory Visualization
  These are graphs that visually depict the information contained in descriptive statistics
  Distribution plots
    Histograms
    Density plots
  Simple correlation plots
    Graphing two variables, one on each axis (i.e., X & Y)
    You can get more complicated later!
Discovering the Data
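The descriptive measures named above are all available in Python's standard `statistics` module. The sketch below computes them for a small made-up sample of ages (the data are hypothetical) and builds a crude text histogram as a stand-in for a distribution plot.

```python
import statistics
from collections import Counter

ages = [23, 25, 25, 31, 34, 34, 34, 40, 52]  # hypothetical sample

summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "variance": statistics.variance(ages),  # sample variance (n - 1 denominator)
    "std_dev": statistics.stdev(ages),      # square root of the sample variance
}


def text_histogram(values, bin_width):
    """Return one 'low-high | ###' line per non-empty bin of integer values."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return [
        f"{low}-{low + bin_width - 1} | {'#' * bins[low]}"
        for low in sorted(bins)
    ]


for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print("\n".join(text_histogram(ages, 10)))
```

In practice a plotting library would draw the histogram, but the text version shows the same idea: count how many observations fall in each range of values.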
Simple inferences
  Correlations/covariances
    These measures show the relationships between and among variables
  ANOVA (ANalysis Of VAriance)
    ANOVA is about comparing the means of two (or more) samples, groups, or populations
Basic Linear Models
  These models explore linear relationships between variables
  Simple regression: one dependent variable, one independent variable
    This is really just a rescaled correlation
  Multivariate regression: one dependent variable, many independent variables
    This technique looks at simultaneous correlations among several variables
Analyzing the Data
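The link between correlation and simple regression can be checked numerically. The sketch below uses made-up x and y values and hand-written helper functions (the names are hypothetical) to compute the sample covariance, the correlation r, and the least-squares line. The fitted slope equals r multiplied by the ratio of the standard deviations of y and x.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def simple_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


x = [1, 2, 3, 4, 5]                 # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical dependent variable

r = correlation(x, y)
slope, intercept = simple_regression(x, y)
sd_x = math.sqrt(covariance(x, x))
sd_y = math.sqrt(covariance(y, y))

print(f"r = {r:.4f}, slope = {slope:.4f}, r * sd_y / sd_x = {r * sd_y / sd_x:.4f}")
```

The correlation is unit-free, while the slope is expressed in units of y per unit of x. Multivariate regression extends the same idea but needs matrix algebra, which statistical software handles.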
Models for non-continuous/limited/discrete variables
  Logit and probit models: the dependent variable can take two values
  Tobit models: the dependent variable is censored (observed only within a limited range)
  Ordered logit, ordered probit, and multinomial logit models: the dependent variable can take a small and discrete set of values
Models for complex data
  Simultaneous equations models (SEMs): the dependent variable can also affect the independent variable
    Instrumental variables are a technique used to deal with this issue
  Time-series and panel data models
    The data cover multiple years and may have serial correlations (i.e., the values for one year are highly correlated with values from the previous year)
  Non-linear models
    The relationships between the variables are not of the form Y = mX + b
Advanced Data Analysis
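A logit model is a handy example of both a two-valued dependent variable and a non-linear relationship. The sketch below fits P(y = 1 | x) = 1 / (1 + exp(-(a + b x))) by plain gradient ascent on the log-likelihood. The data, variable names, learning rate, and step count are all illustrative assumptions. Statistical packages use faster and more robust estimation routines.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(xs, ys, learning_rate=0.1, steps=5000):
    """Estimate a and b in P(y = 1 | x) = sigmoid(a + b * x) by gradient ascent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals: observed outcome minus currently predicted probability.
        residuals = [y - sigmoid(a + b * x) for x, y in zip(xs, ys)]
        a += learning_rate * sum(residuals) / n
        b += learning_rate * sum(res * x for res, x in zip(residuals, xs)) / n
    return a, b


# Hypothetical data: a standardized income score (x) and whether the person voted (y).
income_score = [-3.0, -2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0, 3.0]
voted = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_logit(income_score, voted)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"P(voted | score = 2) = {sigmoid(a + b * 2.0):.3f}")
```

The coefficient b tells you the direction of the effect, but the predicted probability follows an S-shaped curve, not a straight line. A one-unit change in x therefore shifts the probability by different amounts depending on where you start.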