Custom Dictionaries in Computer-Aided Text Analysis

Aaron F. McKenny, Price College of Business, University of Oklahoma

Transcript of Custom Dictionaries in Computer-Aided Text Analysis

Page 1: Custom Dictionaries in Computer-Aided Text Analysis

University of Oklahoma

Price College of Business

Custom Dictionaries in Computer-Aided Text Analysis

Aaron F. McKenny

Page 2: Custom Dictionaries in Computer-Aided Text Analysis

Computer-Aided Text Analysis

• Word choice provides valuable information in the context of a particular organizational narrative.
• Processes hundreds of documents quickly with extremely high reliability.
• Complementary to manual coding
• Many tools
– CAT Scanner
– DICTION
– LIWC
– …

Page 3: Custom Dictionaries in Computer-Aided Text Analysis

How CATA Works

• Dictionary-based coding
– Completely automated, computer does the coding
– Dictionaries (lists of words) are created and validated prior to analysis
• Computer looks for words from the dictionary in the narratives being analyzed
– When it finds a word, it increments the value for that dictionary by 1

Page 4: Custom Dictionaries in Computer-Aided Text Analysis

Example of Dictionary-based coding

• Simple dictionary: Innovativeness
– “Innovative” “Innovation” “Innovate” “Research” “Inventions” “Inventive” “Creative” “Creativity”
• Simple narrative to analyze:
– “The creativity of our research and development team makes this organization one of the most innovative in the industry, with patents on over 2,300 inventions.”
• Computer-aided text analysis result:
– Innovativeness: 4
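The counting in this example can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by CAT Scanner, DICTION, or LIWC: the function name `count_dictionary_words` and the tokenization rule (lowercase runs of letters and apostrophes, exact whole-word matches only) are assumptions made for the sketch.

```python
import re


def count_dictionary_words(text, dictionary):
    """Count tokens in `text` that exactly match a word in `dictionary`.

    Assumption for this sketch: matching is case-insensitive and on whole
    words only, with no stemming or wildcard handling.
    """
    words = {w.lower() for w in dictionary}
    # Split the narrative into lowercase word tokens; digits and punctuation are dropped.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each token found in the dictionary increments the score by 1.
    return sum(1 for token in tokens if token in words)


# Dictionary and narrative taken from the slide.
innovativeness = [
    "Innovative", "Innovation", "Innovate", "Research",
    "Inventions", "Inventive", "Creative", "Creativity",
]
narrative = (
    "The creativity of our research and development team makes this "
    "organization one of the most innovative in the industry, with "
    "patents on over 2,300 inventions."
)

score = count_dictionary_words(narrative, innovativeness)
print(f"Innovativeness: {score}")
```

The four matches are "creativity", "research", "innovative", and "inventions", which is how the slide arrives at a score of 4.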

Page 5: Custom Dictionaries in Computer-Aided Text Analysis

Standard vs. Custom Dictionaries

• Standard dictionaries
– Dictionaries that ship with the software
– Developed/validated by others
• Custom dictionaries
– You provide the list of words to tabulate
– As valid as you make it

Page 6: Custom Dictionaries in Computer-Aided Text Analysis

Developing a Valid Custom Dictionary

• Same forms of validity apply to CATA as to most other measures
– Internal
– Discriminant
– Content, etc.
• Two-step approach
– Deductive
– Inductive

Page 7: Custom Dictionaries in Computer-Aided Text Analysis

Developing a Valid Custom Dictionary

• Use multiple raters to determine what words to include
– Assess interrater reliability
– Construct vs. method experts?
• Narrative matters
– Context of words
– Level of analysis

Page 8: Custom Dictionaries in Computer-Aided Text Analysis

Useful Dictionary Creation Tools

• Text file cleaner
– CAT Scanner (built-in functionality)
• Inductive word list generation
– DICTION 5: list of insistence words
   • In DICTION 6 you can’t copy-and-paste the list
– CAT Scanner: generate inductive word list
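For readers without these tools at hand, an inductive word list can be approximated by tabulating the words that actually occur in the sample narratives and handing the frequent ones to judges for review. The sketch below is only an assumption about what such a list might look like; it does not reproduce how CAT Scanner or DICTION builds its list, and the function name `inductive_word_list`, the `stop_words` filter, and the `min_count` threshold are all hypothetical.

```python
import re
from collections import Counter


def inductive_word_list(documents, stop_words=(), min_count=2):
    """Return (word, count) pairs across all documents, most frequent first.

    Assumptions for this sketch: tokens are lowercase runs of letters and
    apostrophes, words in `stop_words` are ignored, and words occurring
    fewer than `min_count` times are left off the list.
    """
    stop = {w.lower() for w in stop_words}
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z']+", doc.lower())
        counts.update(t for t in tokens if t not in stop)
    return [(word, n) for word, n in counts.most_common() if n >= min_count]


# Hypothetical mini-corpus, for illustration only.
documents = [
    "We innovate through research.",
    "Research drives our growth.",
    "Our growth reflects creative research.",
]
candidates = inductive_word_list(documents, stop_words={"we", "our", "through"})
for word, n in candidates:
    print(word, n)
```

The output is a candidate list only; judges still decide which of these words fit the construct.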

Page 9: Custom Dictionaries in Computer-Aided Text Analysis

Useful Dictionary Creation Tools

• Dictionary Judging Sheet/Reliability Calculator
– You enter the words
– Judges identify whether words fit construct or not
– It calculates interrater reliability
– Support for up to 3 judges, but easy to extend.
– Available at: http://www.amckenny.com/CATScanner/resources.php
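The slides do not say which reliability statistic the judging sheet reports. As one plausible illustration, the sketch below assumes each judge marks every candidate word 1 (fits the construct) or 0 (does not), then computes pairwise percent agreement and Cohen's kappa. The function names and the ratings are hypothetical.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Share of words on which two judges gave the same 0/1 rating."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa for two judges with binary (0/1) ratings."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    # Chance agreement from each judge's own rate of saying "fits".
    p_yes_a, p_yes_b = sum(a) / n, sum(b) / n
    p_chance = p_yes_a * p_yes_b + (1 - p_yes_a) * (1 - p_yes_b)
    if p_chance == 1:
        return 1.0  # both judges gave one identical rating to every word
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings of eight candidate words by three judges.
ratings = {
    "judge1": [1, 1, 1, 1, 0, 0, 1, 0],
    "judge2": [1, 1, 1, 0, 0, 0, 1, 0],
    "judge3": [1, 1, 0, 1, 0, 1, 1, 0],
}

pairwise = {
    (j1, j2): percent_agreement(ratings[j1], ratings[j2])
    for j1, j2 in combinations(ratings, 2)
}
mean_agreement = sum(pairwise.values()) / len(pairwise)
kappa_12 = cohens_kappa(ratings["judge1"], ratings["judge2"])
print(pairwise, mean_agreement, kappa_12)
```

Because the pairs come from `combinations`, adding a fourth judge to `ratings` needs no other change. Kappa is shown alongside raw agreement because it corrects for the agreement two judges would reach by chance.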

Page 10: Custom Dictionaries in Computer-Aided Text Analysis

CATA Toolkit

• CAT Scanner
– Free
– http://www.amckenny.com/CATScanner/
• DICTION
– $179
– http://www.dictionsoftware.com/
• LIWC
– $30 (lite); $90
– http://www.liwc.net/

Page 11: Custom Dictionaries in Computer-Aided Text Analysis

Useful Publications

• ** Short, J. C., Broberg, J. C., Cogliser, C. C., & Brigham, K. H. (2010). Construct validation using computer-aided text analysis (CATA): An illustration using entrepreneurial orientation. Organizational Research Methods, 13, 320-347.
• ** McKenny, A. F., Short, J. C., & Payne, G. T. (in press). Using CATA to elevate constructs in organizational research: Validating an organizational-level measure of psychological capital. Organizational Research Methods. doi: 10.1177/1094428112459910.
• Duriau, V. J., Reger, R. K., & Pfarrer, M. D. (2007). A content analysis of the content analysis literature in organization studies: Research themes, data sources, and methodological refinements. Organizational Research Methods, 10, 5-34.

** - Related to CATA dictionary creation

Page 12: Custom Dictionaries in Computer-Aided Text Analysis

Useful Publications

• Pennebaker, J., Mehl, M., & Niederhoffer, K. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54, 547-577.
• Morris, R. (1994). Computerized content analysis in management research: A demonstration of advantages & limitations. Journal of Management, 20, 903-931.
• Short, J. C., & Palmer, T. B. (2008). The application of DICTION to content analysis research in strategic management. Organizational Research Methods, 11, 727-752.
• Kabanoff, B. (1997). Computers can read as well as count: Computer-aided text analysis in organizational research. Journal of Organizational Behavior, 18(S1), 507-511.

Page 13: Custom Dictionaries in Computer-Aided Text Analysis

Other Useful Resources

• CAT Scanner site
– http://www.amckenny.com/CATScanner
• University of Georgia content analysis site
– http://www.terry.uga.edu/management/contentanalysis/
• CARMA Short Courses
– Dr. Jeremy Short gives a short course on CATA dictionary creation.