Building Your Own Data Set - Ajit Phadnis
Transcript of Building Your Own Data Set - Ajit Phadnis
When does research become significant?
1. It addresses a pressing issue that has been overlooked
2. It brings a new perspective to the way an issue has been looked at
3. It introduces new data to investigate an already examined or new issue
Many researchers begin with ambitions of ‘breaking in’ with new theories or challenging established ones, but oftentimes bringing in new data may be the best way to make a ‘mark’.
‘Borrowed’ vs ‘Built’ data
‘Borrowed’ data-set - Where the data for all variables (dependent, independent) of interest come from one database; your data-set is therefore a subset of a larger database
‘Built’ data-set - Where more than one data source is used to construct a data-set
=> The researcher can build this data-set from a combination of primary and secondary sources
Primary and secondary data
Primary data: Data collected by the investigator himself/herself for a specific purpose
Advantage: Control over quality of the data, can collect additional data
Disadvantage: Cost of gathering data
Secondary data: Data collected by someone else for some other purpose (but being utilized by the investigator for another purpose)
Advantage: Less cost, can get temporal data
Disadvantage: Cannot be sure of quality!
Caveat with secondary data
Illustration from the National Election Study (NES) 2014:
Q15 (a) Which party is better for administration?
Now look at this question in the backdrop of what was asked earlier in the questionnaire:
Q1 (a) Whom did you vote for?
Q15 (a) Which party is better for administration?
Do you think the responses would be identical if Q15 (a) was asked as an independent question?
Why do many empirical researchers NOT attempt to build data-sets?
• They believe that:
  - Large ‘n’ sample sizes are a must to prove any point
  - Borrowed data is a good means to deflect questions regarding data
  - Time pressures of working within a ‘Publish or Perish’ culture leave no room for it
  - What if I collect all that data and nothing proves ‘significant’?
• But what they lose out on in the bargain:
  - Data drives the research question rather than the other way round => Torture the data till it confesses!
  - It pushes you to look for data from other countries, where Indian researchers have little contextual knowledge. Further, we collectively end up with very little research on India
  - There is far less enthusiasm in conducting research that you were not driven about in the first place => How is this different from the corporate job that you quit to come to research?
* There are a few lucky ones who manage to get ‘perfect’ borrowed data to answer their self-driven research question!
A personal journey with building data
Background of political science research (esp. parties) in India
Much of the research would qualify as historical narratives
Abundant empirical research in the qualitative domain: election studies, party studies, leadership studies
Relatively few attempts at quantitative research: mostly of the descriptive kind, with few statistical efforts, unlike the abundant quantitative literature in public policy
I estimate that a large portion of quantitative research on political parties uses data from the National Election Study (NES)
In general, there is a perception that data for cross-party comparisons in India is very difficult to get! (e.g.: no. of members, selection of candidates, intra-party careers, leadership selection)
Research interest
Investigating intra-party functioning of Indian political parties
NES gives limited information on party members and supporters, but not much can be derived about a party’s internal processes
Specifically, I wished to look at the intra-party career paths that parties offer to their members
Proliferating literature on party switching in many countries; no such studies on party switching in India
This presented an opportunity for me to connect the phenomenon of party switching with the intra-party career paths offered by parties
My specific contention: “Parties that offer systematic career paths are likely to experience lower levels of party switching”
Operationalizing intra-party careers
How does one represent ‘systematic intra-party careers’ in quantitative terms?
I conceptualized that systematic intra-party careers should have two properties:
1. Party career lengths should be long => implying that members grow up through the ranks
2. Party career lengths should be predictable => members should follow a similar career path
Next question: Where do we get data for intra-party careers?
The first step is to explore possible information sources where politician career data may be available: Election Commission, party constitutions
It then occurred to me that it may be possible to locate information on the career backgrounds of party legislators => published on the Lok Sabha website
Illustration of data source
Triangulated with other sources:
MP’s personal website
MP’s social media profile
Political party website
Candidate interviews on Mera Neta
National newspaper reports
How does the data-set look?
Column key:
(1) Local assembly/ govt.
(2) State assembly/ Council
(3) State body/ Minister/ Chairman Comm.
(4) National assembly
(5) National body/ Minister/ Chairman Comm.
(6) Party ancillary (Youth, women)
(7) Caste/ Community in-charge
(8) Local party
(9) State party
(10) National party

S.No. State Party Name                             1  2  3  4  5  6  7  8  9  10  Career Score
1     Mah   SS    Adhalrao Patil, Shri Shivaji     0  0  0  1  0  0  0  0  0  0   1
2     WB    AITC  Adhikari, Shri Deepak            0  0  0  0  0  0  0  0  0  0   0
3     WB    AITC  Adhikari, Shri Sisir Kumar       1  1  0  1  1  0  0  0  0  0   4
4     WB    AITC  Adhikari, Shri Suvendu           1  1  0  1  0  0  0  0  0  0   3
5     UP    BJP   Adityanath, Shri Yogi            0  0  0  1  0  0  0  0  0  0   1
6     Mah   SS    Adsul, Shri Anandrao             0  0  1  1  1  0  0  0  0  0   3
7     Guj   BJP   Advani, Shri Lal                 1  0  0  1  1  0  0  1  0  1   5
8     UP    BJP   Agrawal, Shri Rajendra           0  0  0  1  0  0  0  1  1  0   3
9     Ker   IUML  Ahamed, Shri E.                  1  1  1  1  1  0  0  0  1  1   7
10    Mah   BJP   Ahir, Shri Hansraj Gangaram      0  1  1  1  0  0  0  0  0  0   3
11    Raj   BJP   Ahlawat, Smt. Santosh            1  1  0  0  0  1  0  1  1  0   5
12    WB    BJP   Ahluwalia, Shri S.S.             0  0  0  1  1  0  0  0  0  1   3
13    WB    AITC  Ahmed, Shri Sultan               1  1  0  1  1  1  0  0  0  1   6
14    Ass   AIUDF Ajmal, Maulana Badruddin         0  1  0  1  1  0  0  0  0  1   4
15    Ass   AIUDF Ajmal, Shri Sirajuddin           0  1  1  0  0  0  0  0  0  0   2
16    WB    AITC  Ali, Shri Idris                  0  0  0  0  0  0  1  0  0  0   1
17    Kar   BJP   Ananth Kumar, Shri               0  0  0  1  1  1  0  0  1  1   5
18    Kar   BJP   Angadi, Shri Suresh Chanabasappa 0  0  0  1  0  0  0  1  0  0   2
19    Ker   INC   Antony, Shri Anto                0  0  0  1  0  1  0  1  1  0   4
20    Bih   NCP   Anwar, Shri Tariq                0  0  1  1  1  1  0  1  1  1   7
No.  Position                     Points
1.   Local governing bodies       1.0
2.   State assembly               1.0
3.   State body/ minister         1.0
4.   National assembly            1.0
5.   National body/ minister      1.0
6.   Party at local level         1.0
7.   Party at state level         1.0
8.   Party at national level      1.0
9.   Ancillary party bodies       1.0
10.  Party community groups       1.0
     Career Score                 0-10
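A minimal sketch of how the Career Score could be computed from the ten position indicators. It assumes, as the sample rows suggest, that each indicator is coded 0 or 1 and that the score is the points-weighted sum of the indicators. The field names and the career_score function are hypothetical labels introduced for illustration, not names from the original data-set.

```python
# Position indicators in the order of the data-set columns.
# These short names are hypothetical labels, not the author's variable names.
POSITIONS = [
    "local_govt", "state_assembly", "state_body", "national_assembly",
    "national_body", "party_ancillary", "community_in_charge",
    "local_party", "state_party", "national_party",
]

# Every position carries 1.0 point, as in the points table.
POINTS = {position: 1.0 for position in POSITIONS}


def career_score(indicators):
    """Return the points-weighted sum of ten 0/1 position indicators."""
    if len(indicators) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} indicators, got {len(indicators)}")
    for value in indicators:
        if value not in (0, 1):
            raise ValueError(f"indicators must be 0 or 1, got {value!r}")
    return sum(POINTS[p] * v for p, v in zip(POSITIONS, indicators))


# Three rows copied from the sample data-set.
sample_rows = {
    "Adhikari, Shri Deepak": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Adhikari, Shri Sisir Kumar": [1, 1, 0, 1, 1, 0, 0, 0, 0, 0],
    "Ahamed, Shri E.": [1, 1, 1, 1, 1, 0, 0, 0, 1, 1],
}

for name, indicators in sample_rows.items():
    print(name, career_score(indicators))
```

Keeping the points in a separate table means a different weighting (for example, more points for national posts) could be tried later without recoding the profiles.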
The data collection effort
Roughly, it took me 18-20 mins to gather the background profile of one MP
So far I have completed coding the profiles of 540 MPs from the 16th Lok Sabha
=> Approximate time taken for this effort ~ 10,000 mins => 170 hours
In order to beef up the sample size, I will also be coding the 540 MP profiles from the 15th Lok Sabha
So my data collection effort is only half done!
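The effort estimate is simple arithmetic and can be checked before committing to a coding exercise of this size. The sketch below uses the figures quoted on the slide (540 profiles, 18-20 minutes each); the function name effort_hours is a hypothetical helper introduced for illustration.

```python
def effort_hours(n_profiles, minutes_per_profile):
    """Total coding time in hours for a given number of profiles."""
    return n_profiles * minutes_per_profile / 60


low = effort_hours(540, 18)   # 9,720 minutes  -> 162.0 hours
high = effort_hours(540, 20)  # 10,800 minutes -> 180.0 hours

print(f"One Lok Sabha: between {low:.0f} and {high:.0f} hours")
print(f"Two Lok Sabhas: between {2 * low:.0f} and {2 * high:.0f} hours")
```

The slide's figure of roughly 10,000 minutes (about 170 hours) sits inside this range.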
Gains from a built data-set
The data-set becomes a strong selling point for your paper
A good data-set has the potential to be used for more than one research project
Anyone who uses your data in future also cites you. So more citations!
It is easier to ask others for data if you have data to share
It is your contribution to the universe of data in a particular domain
Important do’s and don’ts
Your proposal is the basis on which you propose to gather data. Get brief proposals (2-3 pages) vetted by good academic minds before beginning data collection.
Ensure that you have created a Coding Manual before you start collecting data. Edit the manual as you come across data that do not fall under your initial classification.
After collecting 10% of the data, check whether the data trend broadly matches your initial expectations. Repeat such checks at regular intervals.
Consider collecting additional data (at no additional cost) that may be useful for future research.
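One way such an early check might look, as a sketch only: summarize the Career Score by party for the rows coded so far and compare the pattern with what the proposal expected. The slides do not describe the actual test used, so the choice of mean and standard deviation, the min_n threshold, and the function name summarize_by_party are all assumptions. The 20 sample rows from the data-set stand in for a pilot batch; they are alphabetical by name, not a random sample.

```python
from collections import defaultdict
from statistics import mean, pstdev

# (party, career score) pairs copied from the 20 sample rows in the data-set.
SAMPLE = [
    ("SS", 1), ("AITC", 0), ("AITC", 4), ("AITC", 3), ("BJP", 1),
    ("SS", 3), ("BJP", 5), ("BJP", 3), ("IUML", 7), ("BJP", 3),
    ("BJP", 5), ("BJP", 3), ("AITC", 6), ("AIUDF", 4), ("AIUDF", 2),
    ("AITC", 1), ("BJP", 5), ("BJP", 2), ("INC", 4), ("NCP", 7),
]


def summarize_by_party(rows, min_n=3):
    """Group scores by party; report n, mean and spread where n >= min_n."""
    groups = defaultdict(list)
    for party, score in rows:
        groups[party].append(score)
    return {
        party: {"n": len(scores), "mean": mean(scores), "sd": pstdev(scores)}
        for party, scores in groups.items()
        if len(scores) >= min_n  # skip parties with too few coded MPs
    }


summary = summarize_by_party(SAMPLE)
for party, stats in summary.items():
    print(f"{party}: n={stats['n']}, mean={stats['mean']:.2f}, sd={stats['sd']:.2f}")
```

With so few rows the numbers only indicate whether the coding scheme produces usable variation; they are not evidence for or against the contention about party switching.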