RET560 Research Methods Course Material V01


KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY, KUMASI

(MSc. RENEWABLE ENERGY TECHNOLOGY, 2011/2012)

RET 560: Research Methods

[Credit: 3]

Prof. Abeeku BREW-HAMMOND

Publisher’s Information


© IDL, 2009

All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without permission from the copyright holders.

For any information contact:

Dean

Institute of Distance Learning

New Library Building

Kwame Nkrumah University of Science and Technology

Kumasi, Ghana

Phone: +233-51-60013

+233-51-61287

+233-51-60023

Fax: +233-51-60014

E-mail: [email protected]

[email protected]

[email protected]

[email protected]

Web: www.idl-knust.edu.gh

www.kvcit.org

ISBN:

Editors:

Publisher’s notes to the Learners:


1. Icons: The following icons are used to give readers quick access to where similar information may be found in the text of this course material. Writers may use them as and when necessary in their writing. Facilitators and learners should take note of them.

Icon #1: Learning Objective
Icon #2: Learning Activity
Icon #3: Unit Assignments
Icon #4: Review
Icon #5: Summary
Icon #6: Time For Activity
Icon #7: Self Assessment
Icon #8: Group Discussion
Icon #9: Read
Icon #10: New Terms
Icon #11: Answer Tips
Icon #12: Note/Learning Tip
Icon #13: Pause
Icon #14: Interactive CD
Icon #15: Online

2. Guidelines for making use of learning support (virtual classroom, etc.)

This course material is also available online at the virtual classroom (v-classroom) Learning Management System. You may access it at www.kvcit.org.


Course Writers

Abeeku BREW-HAMMOND

Associate Professor of Mechanical Engineering

Director of Energy Centre at KNUST

College of Engineering, KNUST

David Ato Quansah

Lecturer

Mechanical Engineering

College of Engineering, KNUST

Owusu Amponsah

Lecturer

Department of Planning

College of Architecture and Planning, KNUST

Wahib Faisal Adams

Lecturer

Mechanical Engineering

College of Engineering, KNUST


Acknowledgement

The authors are indebted to Dr Gabriel Takyi, Lecturer in the Department of Mechanical Engineering, for managing the whole of the MSc RETS e-Learning programme, including the course materials development process.

Thanks also go to Mr Ebenezer Nyarko Kumi for invaluable assistance to Prof Abeeku Brew-Hammond in the writing of the second half of this document.


Course Introduction

This course forms part of the Master of Science Degree Programme in Renewable Energy Technologies via E-Learning. It is a 3 credit-hour course with 2 hours of teaching and 2 hours of tutorial per week. The programme is hosted by the Department of Mechanical Engineering under the auspices of The Energy Centre, KNUST.

COURSE OVERVIEW

Research methods in engineering and the physical sciences: design of experiments, instrumentation, data acquisition and analysis, error analysis, mathematical modelling and computer simulation, statistical analysis, and interpretation and presentation of experimental results and simulations.

Research methodology in the social sciences: qualitative and quantitative research, design of surveys and questionnaires, case study design, sampling and interview techniques, and analytical techniques (analysis of variance, analytic generalisation, etc.).

Preparation of research proposals, including thesis research design; reporting and publication of findings (thesis writing, preparation of conference papers and journal articles, posters, etc.); critical reviews of journal papers and other publications; oral presentations using PowerPoint; and software applications for data analysis (SPSS, STATA, etc.).

COURSE OBJECTIVES

By the end of the course the student should be able to do the following:

1. Develop Methodology for Research Projects/Thesis Research involving Engineering/Physical Science and Social Science Research Methods;

2. Write His/Her Thesis Synopsis and Research Proposals/Concept Notes; and

3. Write Journal/Conference Papers.

COURSE OUTLINE

Unit 1: Introduction to Research Proposals and Thesis Synopsis

Unit 2: Engineering Research Design and Data Analysis

Unit 3: Social Science Research Design and Data Analysis

Unit 4: Statistical Analysis with STATA and SPSS

Unit 5: Introduction to writing of Journal Articles, Conference Papers and Theses


COURSE STUDY GUIDE

Week #   Unit/Session                     FFFS/Practical/Exam/Quiz
1        General Introduction + Unit 1
2        Unit 2                           Take-Home Quiz/Exercise No. 1 – 10%
3        Unit 3
4        Unit 3 Cont’d
5        Unit 4
6        Unit 4 Cont’d                    Take-Home Quiz/Exercise No. 2 – 10%
7        Tutorial to review Units 1 - 4
8        Unit 5                           Take-Home Exercise No. 3 – 10%
9        Unit 6
10       Unit 6 Cont’d                    Take-Home Exercise No. 4 – 10%
11       Unit 7                           Take-Home Exercise No. 5 – 10%
12       Tutorial to review Units 5 – 7
13       Final Written Examination on All Units – 60%
14       Mini-Project Presentation – 20%

GRADING

Continuous assessment: 30%

End of semester examination: 70%

RESOURCES

You will require a basic knowledge of engineering science and mathematics as well as access to the internet and a computer for this course.


READING LIST

1. Journal Articles, Recommended Textbooks, etc.

Annabel, B.K. (2006). Using interviews as research instruments, Language Institute

Chulalongkorn University publications.

Beavon, J. R. (2009). The origins of experimental error. Retrieved August 5, 2010, from

http://home.clara.net/rod.beavon/err_orig.htm

Becker, H. S. and Pamela, R. (Eds) (1986). Writing for Social Scientists: How to Start and Finish Your Thesis, Book, or Article. London: University of Chicago Press Ltd.

Bell, J. (2004). Doing Your Research Project: A Guide for First-time Researchers in Education and Social Science, 3rd edn. Berkshire, UK: Open University Press.

Bell, J. (2010). Doing Your Research Project: a Guide For First-time Researchers in Education

and Social Science. 5th edn. Maidenhead: Open University Press

Brian, Allison (Eds.) 1996, 1998, 2000. Research Skills for Students. London: Kogan Page

Limited.

Chapin, P. G. (2004). Research Projects and Research Proposals; A Guide to Scientists Seeking

Funding. UK: Cambridge University Press.

Colleen, H. (2009). Researcher as Goldilocks. Bournemouth University. International Journal of

Evidence Based Coaching and Mentoring, Special Issue 3:11-19.

Dawson C. (2009). Introduction To Research Methods; A Practical Guide For Anyone

Undertaking A Research Project, 4th Edition. UK: How to Contents.

Denscombe, M. (2010) The good research guide. 4th edn. Maidenhead: Open University Press

Duane, D. (2000). Introduction to Measurements & Error Analysis. Retrieved February 12,

2012, from The University of North Carolina at Chapel Hill, Department of Physics and

Astronomy : http://www.physics.unc.edu/~deardorf/uncertainty/UNCguide.html


Eade, Deborah (Ed.) 2003. Development Methods and Approaches: Critical Reflections. Oxford;

OXFAM GB.

Eric M. Uslaner (December 1999). Brief Guide to STATA Commands

emathzone. (2012). Continuous Random Variable. Retrieved Feb 2012, from emathzone:

http://www.emathzone.com/tutorials/basic-statistics/continuous-random-variable.html

Frankfort-Nachmias, C. and Nachmias, D. (1996). Research Methods in Social Science, 5th

Edition, New York, St. Martin’s Press Inc.

Gagnon, S. (Undated). How cold is liquid nitrogen? Retrieved from Jefferson Lab:

http://education.jlab.org/qa/liquidnitrogen_01.html

Ghanfoor A. (2006). Manual for synopsis and thesis preparation. University of Agriculture,

Faisalabad, Pakistan.

Harrison, D. M. (2008). Error Analysis in Experimental Physical Science. Retrieved September

25, 2010, from University of Toronto:

http://www.upscale.utoronto.ca/PVB/Harrison/ErrorAnalysis/

Hart, C. (1998) Doing a literature review: releasing the social science imagination. Thousand

Oaks, Sage

Harvey, G. (1998) Writing with sources: a guide for students. Indiana: Hackett Publishing

Ivan Iachine, Lars Korsholm, Henrik Støvring, Kirstin Vach, Werner Vac (2004). Stata Reference Manual

James H. Stock and Mark W. Watson, (2003). Introduction to Econometrics

Julie Pallant, (2002). A step by step guide to data analysis using SPSS for Windows

School of Graduate Studies-KNUST. (undated). Manual for thesis preparation for Masters and

Doctoral degrees awarded by the Kwame Nkrumah University of Science and Technology.

School of Graduate Studies, KNUST, Kumasi, Ghana.

Kenneth L. Simons, (2010). Useful Stata Commands

Kumekpor, T.K.B. (2002). Research Methods and Techniques of Social Research, Accra,

SunLife Publications.

Kurt Schmidheiny, (2008). Short Guides to Microeconometrics, Universitat Pompeu Fabra


Lazić, Z. R. (2004). Design of Experiments in Chemical Engineering: A Practical Guide. Weinheim: WILEY-VCH Verlag GmbH & Co. KGaA.

Lester, J. (2005) Writing research papers: a complete guide. 11th edn. New York, Longman

Manfred W. Keil, (2010). STATA 10 Tutorial

Montgomery, D. C., Runger, G. C., & Hubele, N. F. (2000). Engineering Statistics (2nd ed.). New York: John Wiley & Sons, Inc.

Moore, N. (undated). How to do Research. Third Edition. London: Library Association

Publishing.

Narasimhan, B. (1996). The Normal Distribution. Retrieved Jan 30, 2012, from Stanford

University : http://www-stat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html

Neale, P., Thapa, S. and Boyce, C. (2006). Preparing a Case Study: A Guide for Designing and Conducting a Case Study for Evaluation Input. Pathfinder International, Watertown, Massachusetts.

Neville C. (2010). The Complete Guide To Referencing And Avoiding Plagiarism, 2nd edition.

UK: Open University Press.

Nsowah-Nuamah, N.N.N. (2005). A Handbook of Descriptive Statistics for Social and Biological

Sciences. Accra: Acadec Press.

Ogden, T.E. and Goldberg, I. A. (2002). Research Proposals; A Guide To Success, 3rd Edition.

USA: Academic Press

Seawright, J. and Gerring, J. (2008). “Case Selection Techniques in Case Study Research : A

Menu of Qualitative and Quantitative Options”, Political Research Quarterly 2008 61: 294.

Singleton, R.A., Jr. Bruce C. S. and Straits, M.M. (1993). Approaches to Social Research.

Second Edition. Oxford University Press, New York.

Steinar, K. (1996). Interviews: An Introduction to Qualitative Research Interviewing. SAGE

Publications, California.

Susan B. Gerber, Kristin Voelkl Finn, (1999). Using SPSS For Windows. New York: State University of New York Graduate School of Education


Taylor, J. R. (2004). An Introduction to Error Analysis: the study of uncertainties in physical

measurements. CA: University Science Books.

The Health Communication Unit (1999). Conducting Survey Research, The Health

Communication Unit, at the Centre for Health Promotion, University of Toronto.

Urdan, T. C. (2010). Statistics in Plain English. New York: Taylor & Francis Group.

WWF (2005). Logical Framework Analysis. Retrieved on 1st February, 2012 from:

http://www.artemis-services.com/downloads/logical-framework.pdf

Zaidah, Z. (2007). Case study as a research method. Universiti Teknologi Malaysia, Jurnal

Kemanusiaan bil.9, Jun.

2. Websites, CD ROMs, etc

NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/

https://classshares.student.usp.ac.fj/EN400/2007%20Lecture%20Materials/Sections%201,%202,%20and%203%20EN400.pdf

http://www.engr.sjsu.edu/bjfurman/courses/ME120/me120pdf/UncertaintyAnal.pdf

http://www.sonoma.edu/aa/gs/guidelines/toc.shtml

http://www.mhhe.com/mayfieldpub/tsw/toc.htm


Table of Contents

Publisher’s Information
Course Writers
Acknowledgement
Course Introduction
Table of Contents
List of Tables
List of Figures

Unit 1: INTRODUCTION TO RESEARCH PROPOSALS AND THESIS SYNOPSES PREPARATION
SESSION 1.1: CONCEPT NOTES
1.1.1 Introduction to Concept Notes
1.1.2 Structure of Concept Notes
SESSION 1.2: RESEARCH PROPOSALS
1.2.1 Introduction to Research Proposals
1.2.2 Structure of research proposals
1.2.3 Logical framework
1.2.4 Detailed Budget
SESSION 1.3: THESIS SYNOPSES
1.3.1 Introduction to Thesis Synopses
1.3.2 Structure of Thesis Synopses

Unit 2
2.1.1 Motivation for Research in Engineering and some basic concepts
2.1.2 Classification of Engineering Experiments
2.1.3 Research Questions in Engineering
2.1.4 Experiment Design Process
2.2.1 Sources of Error in Experimental Work
2.2.1.1 Instrumental Errors - A Closer Look
2.2.2 Estimating Uncertainties
2.3.1 Probability Distributions and Standard Errors
2.3.2 Properties of Probability Density Function
2.3.3 Mean and Variance
2.3.4 The Normal Distribution (also called the bell-curve)
2.3.5 Skewed Distributions
2.3.6 Standardization and Z-Scores
2.4.1 Error of the Mean
2.4.2 Central Limit Theorem
2.4.3 The t-distribution
2.5.1 Part 1 – Examples in Normal Distributions and z-Scores
2.5.2 Part 2 – Applying Normal Distribution to Engineering Problems

Unit 3: SOCIAL SCIENCE RESEARCH DESIGN AND DATA ANALYSIS
Merits of Structured questions
Demerits of Structured questions
Merits of Structured questions
Demerits of open-ended questions
c) Contingency questions
Characteristics of a Good Sample Design
Advantages
Disadvantages
3.2.2 Purpose of Case Studies
SESSION 3.3 OTHER TYPES OF RESEARCH DESIGN
3.3.1.1 Purpose of Observational Research
3.3.1.2 Steps in carrying out Observational Research
3.3.1.3 Types of Observational Research
3.3.1.4 Limitations of Observational Research
3.3.2.1 Steps in carrying out ethnographic studies
3.3.2.2 Advantages
3.3.3.1 Purpose of Historical Research
3.3.3.2 Steps in conducting historical research
3.3.3.3 Limitations of historical research
SESSION 3.4 RESEARCH ETHICS
3.4.2 Balancing Costs and benefits in Research
3.4.3 Informed Consent
3.4.4 Competence
3.4.5 Privacy

Unit 4: STATISTICAL ANALYSIS WITH STATA AND SPSS
SESSION 4.1: INTRODUCTION TO SPSS
4.1.1 The Nature of SPSS
4.1.2 Data Management
4.1.3 Descriptive Statistics
SESSION 4.2: INTRODUCTION TO STATA
4.2.1 The Stata Environment
4.2.2 Data Management
4.2.3 Descriptive Statistics In Stata

Unit 5: INTRODUCTION TO JOURNAL ARTICLES, CONFERENCE PAPERS AND THESES WRITING
SESSION 5.1: RESEARCH AND THESIS REPORTS
5.1.1 Thesis Report Writing
5.1.2 Research Report Writing
SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION
SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING
5.3.1 Abstracts and Summaries
5.3.2 Tables and Figures
5.3.2 Referencing
5.3.3 Referencing Formats
5.3.4 Introduction to referencing software packages

COURSE SUMMARY
APPENDIX A1
APPENDIX A2
APPENDIX B


List of Tables

Table 1.1: Typical Structure of a Logical Framework
Table 1.2: Example of a Research Budget
Table 2.1: Determining the average, average deviation and standard deviation
Table 2.2: Basic rules in error propagation
Table 3.1: Advantages and Disadvantages of the Interview Methods
Table 3.2: Advantages and Disadvantages of Open-ended and Close questions
Table 3.3: Sampling techniques: Advantages and disadvantages

List of Figures

Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel
Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel
Figure 2.3: Determining Instrumental Limits of Error and Least Count
Figure 2.4: Plot of f(x) vs X
Figure 2.5: Samples are drawn from populations
Figure 2.6: The normal distribution is bell-shaped
Figure 2.7: Positively skewed distribution
Figure 2.8: Negatively skewed distribution
Figure 2.9: Interpreting the z-score
Figure 2.10: Distribution of the means of the samples
Figure 2.11: Average difference between expected value and sample mean


Unit 1

INTRODUCTION TO RESEARCH PROPOSALS AND THESIS SYNOPSES PREPARATION

Introduction

This unit introduces students to the documents that are prepared before a research project begins. These are concept notes, which are mostly directed towards donor or funding agencies; research proposals, which are full proposals stating the need for the research as well as the expected results; and thesis synopses, which are essentially research proposals written specifically for academic purposes.

Learning Objectives

After reading this unit you should be able to:

1. Write a concept note capable of securing funding for a research project, specifically your master’s research project.

2. Prepare a research proposal as well as a thesis synopsis for your master’s thesis.

UNIT CONTENT

SESSION 1.1: CONCEPT NOTES

1.1.1 Introduction to Concept Notes

1.1.2 Structure of Concept Notes

SESSION 1.2: RESEARCH PROPOSALS

1.2.1 Introduction to Research Proposals

1.2.2 Structure of research proposals

1.2.3 Logical framework

1.2.4 Detailed Budget

SESSION 1.3: THESIS SYNOPSES

1.3.1 Introduction to Thesis Synopses

1.3.2 Structure of Thesis Synopses


SESSION 1.1: CONCEPT NOTES

1.1.1 Introduction to Concept Notes

A concept note is a brief summary of a proposed research project, usually prepared for donors or sponsors. It should not be more than 550 words or 3 to 7 pages. It should outline the background to the project and state the research problem to be investigated. It should also give the objectives and the methodology to be used for the research, and spell out the timeframe as well as a summary of the budget for the project.

1.1.2 Structure of Concept Notes

1.1.2.1 Research Title

The title of the research should be concise and should focus the reader’s attention on the critical theme of the proposed research. It should be short, usually not more than one line in length, and free of unnecessary punctuation and repeated words.

1.1.2.2 Background

This contains a review of the main research work and current issues specific to the subject area. It should also state what is already known about the research subject. Note that the background is not the same as a literature review; a literature review is not necessary for a concept note. The background is usually about 200 words in length.

1.1.2.3 Research Problem

This section should state the research problem to be investigated clearly and without ambiguity. It should not be more than 200 words.

1.1.2.4 Objectives

The main objectives as well as the specific objectives of the proposed research should be

clearly outlined in this section.

1.1.2.5 Methodology

This section outlines clearly the proposed methodology to be used for the research work.

It spells out exactly how the research work will be carried out and the procedures

involved. It should usually be about 100 words in length.

1.1.2.6 Project location, timeframe and budget

The proposed site for the research, time frame for the completion of the research as well

as a summary of the budget should clearly be outlined in this section.

Self Assessment 1.1

1. What is the purpose of a concept note?

2. What are the major components of a concept note?

SESSION 1.2: RESEARCH PROPOSALS

1.2.1 Introduction to Research Proposals

A research proposal, as defined by the School of Advanced Study, University of London, is a piece of work that, ideally, would convince scholars that your project has the following three merits: conceptual innovation, methodological rigour, and rich substantive content.

A research proposal is therefore supposed to:

• Provide a logical presentation of the research idea

• Illustrate the significance of the idea

• Relate the idea to past literature

• Outline the activities for the proposed research project

A research proposal may be written for any of the following reasons: to request funding for a research project; as a task in tertiary education (in which case it is referred to as a thesis synopsis); or as a condition for employment at a research institution.

1.2.2 Structure of research proposals

The structure of most research proposals includes a title, introduction and background, statement and significance of the research problem, research objectives, literature review, methodology and hypotheses, expected or preliminary results, researchers' details, timetable, detailed budget and, finally, references.

1.2.2.1 Title

The title must be succinct and should give the reader an overview of what to expect in the main document. It should appear on the first page of the proposal, be no more than 20 words long, and be free of unnecessary punctuation and repeated keywords.

1.2.2.2 Abstract

The abstract is a concise summary of the main points of the proposal and should be kept as

short as possible without leaving out any important point. It should be a maximum of 500

words.

1.2.2.3 Background

This contains a summary of the background information to the research problem and the

context within which the study will take place. It draws a relation between the study,

research idea and the policy environment. It should contain what is already known about

the research area and how the research will complement what is already known. It should be a maximum of 1,000 words.

1.2.2.4 Statement and Significance of Research Problem

It is important to state clearly the research problem and the significance of the research to

the community. This section of the proposal should answer questions such as: What is going to be studied or investigated? Why is it important to study this subject? It should not be more than 250 words in length.

1.2.2.5 Research Objectives

It is important to outline the key objective(s) of the research, which spell out what the researcher seeks to accomplish. A single principal objective with two or three specific objectives is usually enough. The objectives should be listed in order of importance, and the section should be a maximum of 200 words.

1.2.2.6 Literature Review

The Literature review should provide a brief description of available literature in terms of

research works done, policy statements and their implications as well as the identification

of shortfalls to be studied and complemented. This section should indicate how existing

literature contributes to the proposed research and how the proposed research is also going

to add to existing work. It should be a maximum of 3,000 words.

1.2.2.7 Methodology

A methodology is a system of methods and principles employed in performing specific

tasks; in this case a research project. The main research techniques to be used for the

project must be described in detail in this section. The methodology should also answer questions such as 'Will the study be based on existing information, interviews or a combination of both?' This section should also give a thorough description of the data

required, the nature of the fieldwork to be undertaken and how the data collected will be

analyzed. In the case of a survey using questionnaires, the sampling procedure as well as

approximate sample size should be stated (a draft questionnaire may be included). It is

important to state clearly how the data collected will address the research question. It

should be a maximum of 1,000 words.

1.2.2.8 Expected/Preliminary Results

This section gives a good indication of what is expected out of the research. It joins the

data analysis and possible outcomes to the theory and questions that have been raised. It

should include the following:

• Scope of inference (i.e., to what extent are the results applicable to other locations, times, or situations?)

• Pitfalls that may be encountered

• Limitations of the proposed methods
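
The word limits quoted in Section 1.2.2 can be checked mechanically while drafting. The sketch below is only an illustration: the section keys, the function name `check_word_limits` and the sample draft are assumptions made for this example, and counting words by splitting on whitespace is a simplification.

```python
# Maximum word counts quoted in Section 1.2.2 of this unit.
WORD_LIMITS = {
    "title": 20,
    "abstract": 500,
    "background": 1000,
    "problem_statement": 250,
    "objectives": 200,
    "literature_review": 3000,
    "methodology": 1000,
}


def check_word_limits(sections):
    """Return {section: (word_count, limit)} for every section over its limit."""
    over = {}
    for name, text in sections.items():
        limit = WORD_LIMITS.get(name)
        count = len(text.split())  # crude count: whitespace-separated tokens
        if limit is not None and count > limit:
            over[name] = (count, limit)
    return over


# Hypothetical draft, for illustration only.
draft = {
    "title": "Performance of a tracking solar panel in Kumasi",
    "abstract": "word " * 520,
}
print(check_word_limits(draft))  # {'abstract': (520, 500)}
```

Sections that are not listed in `WORD_LIMITS` are simply ignored, so the same helper can be reused if a funding agency sets different limits.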

1.2.3 Logical framework

A logical framework is a tool that helps in the planning, monitoring and evaluation of a project. It is an effective planning tool for defining inputs, outputs, timelines as well as

performance indicators for a particular project. It provides a structure for specifying the

components of an activity and for relating them to one another. It has the power to

communicate a project's objectives clearly and simply on a single page as well as the

ability to incorporate the full range of views of all stakeholders of a project.

The logical framework is presented in the form of a 4x4-matrix in which the overall

objectives, the project purpose, the mid-term results, and the activities of a project are

systematically presented in the first column of the matrix. The second and third columns

of the matrix present the corresponding indicators and their sources of information while

the fourth column presents important assumptions that are beyond the direct control of the

project but need to be fulfilled in order to ensure a successful implementation of the

project. A logical framework can only be done after a thorough analysis of problems,

objectives and strategies to be employed in the project. Table 1.1 shows a typical structure

of a logical framework.
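
One way to keep track of the 4x4 matrix while drafting is to hold it as a nested dictionary and list the cells that are still empty. This is a minimal sketch only: the row and column labels follow Table 1.1, while the helper names and the sample entry are assumptions made for illustration.

```python
ROWS = ["Overall objectives", "Specific objective", "Expected results", "Activities"]
COLUMNS = [
    "Intervention logic",
    "Objectively verifiable indicators",
    "Sources and means of verification",
    "Assumptions",
]


def empty_logframe():
    """Build a 4x4 logical framework with every cell blank."""
    return {row: {col: "" for col in COLUMNS} for row in ROWS}


def missing_cells(logframe):
    """List the (row, column) pairs that have not been filled in yet."""
    return [(r, c) for r in ROWS for c in COLUMNS if not logframe[r][c].strip()]


logframe = empty_logframe()
# Hypothetical entry, for illustration only.
logframe["Activities"]["Intervention logic"] = "Install and monitor a test solar panel"
print(len(missing_cells(logframe)))  # 15
```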

Table 1.1: Typical Structure of a Logical Framework

| | Intervention logic | Objectively verifiable indicators of achievement | Sources and means of verification | Assumptions |
|---|---|---|---|---|
| Overall objectives | What are the overall broader objectives to which the action will contribute? | What are the key indicators related to the overall objectives? | What are the sources of information for these indicators? | |
| Specific objective | What specific objective is the action intended to achieve to contribute to the overall objectives? | Which indicators clearly show that the objective of the action has been achieved? | What are the sources of information that exist or can be collected? What are the methods required to get this information? | Which factors and conditions outside the Beneficiary's responsibility are necessary to achieve that objective? (external conditions) Which risks should be taken into consideration? |
| Expected results | The results are the outputs envisaged to achieve the specific objective. What are the expected results? (enumerate them) | What are the indicators to measure whether and to what extent the action achieves the expected results? | What are the sources of information for these indicators? | What external conditions must be met to obtain the expected results on schedule? |
| Activities | What are the key activities to be carried out and in what sequence in order to produce the expected results? (group the activities by result) | Means: What are the means required to implement these activities, e.g. personnel, equipment, training, studies, supplies, operational facilities, etc. | What are the sources of information about action progress? Costs: What are the action costs? How are they classified? (breakdown in the Budget for the Action) | What pre-conditions are required before the action starts? What conditions outside the Beneficiary's direct control have to be met for the implementation of the planned activities? |

Source: The European Commission

1.2.4 Detailed Budget

A detailed budget is an itemized list accounting for every expense required to complete the

project. Itemized budgets are essential even if the granting agency does not require the

submission of a detailed budget (Ingersoll & Eberhard, 1999). It is important to note that the

researcher could easily overestimate or underestimate the cost of completing the study if

serious considerations are not given to all potential expenses. Some of the items included

in a research proposal budget can be divided roughly into the following categories:

personnel, consultation, subcontracts, equipment and supplies, travel, facilities,

administration costs and miscellaneous costs. Table 1.2 shows the typical structure of a

detailed research budget.

Table 1.2: Example of a Research Budget

| Costs | Unit | # of units | Unit rate | Costs |
|---|---|---|---|---|
| 1. Human Resources | | | | |
| 1.1 Salaries (gross salaries including social security charges and other related costs, local staff) | | | | |
| 1.2 Salaries (gross salaries including social security charges and other related costs, expat/int. staff) | Per month | | | |
| 1.3 Per diems for missions/travel | | | | |
| Subtotal Human Resources | | | | |
| 2. Travel | | | | |
| 2.1 International travel | Per flight | | | |
| 2.2 Local transportation | Per month | | | |
| Subtotal Travel | | | | |
| 3. Equipment and supplies | | | | |
| 3.1 Purchase or rent of vehicles | Per vehicle | | | |
| 3.2 Furniture, computer equipment | | | | |
| 3.3 Machines, tools… | | | | |
| 3.4 Spare parts/equipment for machines, tools | | | | |
| 3.5 Other (please specify) | | | | |
| Subtotal Equipment and supplies | | | | |
| 4. Local office | | | | |
| 4.1 Vehicle costs | Per month | | | |
| 4.2 Office rent | Per month | | | |
| 4.3 Consumables - office supplies | Per month | | | |
| 4.4 Other services (tel/fax, electricity/heating, maintenance) | Per month | | | |
| Subtotal Local office | | | | |
| 5. Other costs, services | | | | |
| 5.1 Publications | | | | |
| 5.2 Studies, research | | | | |
| 5.3 Expenditure verification | | | | |
| 5.4 Evaluation costs | | | | |
| 5.5 Translation, interpreters | | | | |
| 5.6 Financial services (bank guarantee costs etc.) | | | | |
| 5.7 Costs of conferences/seminars | | | | |
| 5.8 Visibility actions | | | | |
| Subtotal Other costs, services | | | | |
| 6. Other | | | | |
| Subtotal Other | | | | |
| 7. Subtotal direct eligible costs of the Action (1-6) (excluding taxes) | | | | |
| 8. Provision for contingency reserve (maximum 5% of 7, subtotal of direct eligible costs of the Action) (excluding taxes) | | | | |
| 9. Total direct eligible costs of the Action (7+8) (excluding taxes) | | | | |
| 10. Administrative costs (maximum 7% of 9, total direct eligible costs of the Action) (excluding taxes) | | | | |
| 11. Total eligible costs (9+10) (excluding taxes) | | | | |
| 12. Taxes | | | | |
| 13. Total eligible/accepted costs of the Action (11+12) | | | | |
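
The roll-up in lines 7 to 13 of Table 1.2 is simple arithmetic and can be sketched as below. The function name `budget_totals`, the dictionary keys and the sample figures are assumptions made for illustration; the 5% and 7% ceilings are those stated in the table, and using the ceilings as default rates is also an assumption.

```python
def budget_totals(direct_subtotals, contingency_rate=0.05, admin_rate=0.07, taxes=0.0):
    """Roll up lines 7 to 13 of Table 1.2 from the subtotals of lines 1 to 6."""
    if not 0.0 <= contingency_rate <= 0.05:
        raise ValueError("contingency reserve is capped at 5% of direct costs")
    if not 0.0 <= admin_rate <= 0.07:
        raise ValueError("administrative costs are capped at 7% of total direct costs")

    subtotal_direct = sum(direct_subtotals.values())   # line 7
    contingency = contingency_rate * subtotal_direct    # line 8
    total_direct = subtotal_direct + contingency        # line 9
    admin = admin_rate * total_direct                   # line 10
    total_eligible = total_direct + admin               # line 11
    total_accepted = total_eligible + taxes             # line 13 = 11 + 12

    figures = {
        "subtotal_direct": subtotal_direct,
        "contingency": contingency,
        "total_direct": total_direct,
        "administrative_costs": admin,
        "total_eligible": total_eligible,
        "total_accepted": total_accepted,
    }
    # Round to two decimals, as for currency amounts.
    return {name: round(value, 2) for name, value in figures.items()}


# Hypothetical subtotals in an arbitrary currency, for illustration only.
example = {
    "human_resources": 6000.0,
    "travel": 1500.0,
    "equipment_and_supplies": 2000.0,
    "local_office": 500.0,
    "other_costs_services": 0.0,
    "other": 0.0,
}
totals = budget_totals(example)
for name, value in totals.items():
    print(f"{name}: {value:.2f}")
```

Note that administrative costs are computed on line 9 (direct costs plus contingency), not on line 7, which is easy to get wrong when the budget is built by hand.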

Self Assessment 1.2

1. What is the difference between a concept note and a proposal?

2. Explain the concept of a logical framework and outline its importance.

SESSION 1.3: THESIS SYNOPSES

1.3.1 Introduction to Thesis Synopses

A thesis synopsis is an academic research proposal which should establish the area of the

research project, define clearly the central research question, and outline the methods to be

employed for the research. It should be developed in consultation with members of staff

(such as a proposed supervisor or the school postgraduate coordinator). It is important to

note that the initial ideas for the research could be refined during the course of the study.

1.3.2 Structure of Thesis Synopses

1.3.2.1 Title

The title must be concise and should give the reader an overview of what to expect in the

main document. It should be on the first page and must be the same as the title of the

thesis. It should be short with not more than 20 words and should be devoid of unnecessary

punctuations and repetition of keywords.

1.3.2.2 Introduction/Background

Outline briefly in this section the relevance of the research work to be presented in the thesis. The introduction should be precise and include only relevant background material in that particular field of study. It is important to provide information on past work by other researchers by giving appropriate references. This section should be at most one page, preferably half a page.

1.3.2.3 Justification/Motivation

This section develops further the background material provided in the introduction, adding some of the major achievements made in the chosen area of research. It should clearly indicate the existing challenges and why further research is required to address them. It is necessary to stress the importance of the research problem identified, as well as the technical challenges that must be addressed to solve it, so as to emphasise the quality of the research work. This section should be at most one page, preferably half a page.

1.3.2.4 Objectives and Scope

State clearly the main as well as the specific objectives for the research and define the

conceptual, analytical, experimental and/or methodological boundaries within which the

exercise should be carried out.

1.3.2.5 Methods

It is important to outline how you will approach your research topic. One should

demonstrate, in this section, that the chosen method or approach will serve to advance the

thesis. If you need to gather data, describe how you will go about this. This might involve

archival research, interviews with stakeholders, or various forms of fieldwork. There are

many established research methodologies. If your approach is experimental or

comparative, outline how this approach will yield results.

1.3.2.6 Work plan / Project Timelines

A project plan outlines in specific detail how a project will be conducted, who will work

on which part, and when and in what order each part will be accomplished. Develop this

section with some care, since it will provide you with a means of measuring your progress in

relation to your allotted time. This section should detail the timing of specific activities to

be implemented towards the achievement of the specific objectives within a reasonable

1.3.2.7 Budget and Available Resources

It is important to present the full budget as well as the various resources available for the research work in this section. This section should indicate any bibliographic, laboratory, computing or other physical resources required to execute the study and a budget for projected expenditures, including stipends/allowances where needed.

1.3.2.8 References

List the references in the same order as they are referred to in the synopsis, and make sure all references listed here are properly cited in the text. It is best to get into the habit of

using a standard referencing system (preferably in conformity with the Harvard System)

so that material can be transferred into your thesis. Do not cite from memory without

referencing.

1.3.2.10 Signature(s)

It is very important to include signatures attesting that your proposed supervisor(s) is (are) in agreement with your proposed study as elaborated in the synopsis.

Self Assessment 1.3

1. How different is a thesis synopsis from a research proposal?

Learning Track Activities

Unit Summary

Concept notes, research proposals and thesis synopses are the first things that come to mind when one thinks of research work. These documents give various levels of information about the research work and are mostly intended for different stakeholders. This unit introduces students to the preparation of concept notes, which are mostly directed towards donor/funding agencies; research proposals, which are the full proposals stating the need for the research as well as the expected results; and thesis synopses, which are basically research proposals written specifically for academic purposes. By the end of this unit, you should be able to write a concept note capable of securing funding for a research project, specifically your master's research project, and prepare a research proposal as well as a thesis synopsis for your master's thesis.

Key terms/ New Words in Unit

1. Thesis

2. Synopsis

3. Proposal

4. Logical framework

Unit Assignments 1

1. Prepare a zero-order draft of your thesis synopsis in PowerPoint format for

presentation to the class.

2. The presentation should last no more than 10 minutes to be followed by

another 10 minutes of questions and answers. This assignment will fetch 5

marks.

3. Following the presentation you will be required to do a first-draft of your

thesis synopsis (word-processed) for submission within one week. The draft

synopsis will also fetch 5 marks.

Unit 2

ENGINEERING RESEARCH DESIGN AND DATA ANALYSIS

Introduction

This Unit introduces the student to concepts and methods in engineering research. The first

section (2.1) presents various contexts in engineering practice which necessitate research and

classifies experiments that may be undertaken as part of the research. Procedures for the design

of experiments are also presented.

Section 2.2 is on experimental errors, and catalogues various sources from which errors can be

introduced into our experimental work. Students are also presented with tools for the analysis of

such errors and how they are propagated as measurements are repeated and computations are

done.

Section 2.3 looks at probability distributions and standard errors. In this section the student is

introduced to probability density functions and their common features. The normal distribution

(the most widely used) is then discussed along with the concept of standard scores and the procedures for its application. The Poisson and Binomial probability distributions are

also briefly presented to conclude the section.

Section 2.4 is on standard errors and considers Errors of the Mean, the Central Limit Theorem

and the t-Distribution.

The Unit concludes with Section 2.5, on examples in normal distributions and their application to

engineering problems.

Learning Objectives

After reading this unit you should be able to:

1. Clearly understand the importance of research in engineering,

2. Establish methodology for engineering experiments,

3. Identify sources of error in experiments and be able to minimize or eliminate them,

4. Report inherent errors in experimental measurements,

5. Analyze engineering data using the normal probability curve.

Unit content

SESSION 2.1: INTRODUCTION TO ENGINEERING RESEARCH

2.1.1 Motivation for Research in Engineering and some basic concepts

2.1.2 Classification of Engineering Experiments

2.1.3 Research Questions in Engineering

2.1.4 Experiment Design Process

SESSION 2.2: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS

2.2.1 Sources of Error in Experimental Work

2.2.2 Estimating Uncertainties

SESSION 2.3: PROBABILITY DISTRIBUTIONS AND STANDARD ERRORS

2.3.1 Properties of Probability Density Function

2.3.2 Mean and Variance

2.3.3 The Normal Distribution

2.3.4 Skewed Distributions

2.3.4 Standardization and Z-Scores

SESSION 2.4: STANDARD ERRORS

2.4.1 Error of the Mean

2.4.2 Central Limit Theorem

2.4.3 The t-distribution

SESSION 2.5: EXAMPLES IN NORMAL DISTRIBUTIONS APPLIED TO ENGINEERING PROBLEMS

2.5.1 Part 1 – Examples in Normal Distributions and z-Scores

2.5.2 Part 2 – Applying Normal Distribution to Engineering Problems

SESSION 2.1: INTRODUCTION TO ENGINEERING RESEARCH

2.1.1 Motivation for Research in Engineering and some basic concepts

Research in engineering is motivated by factors that include the advantage to be gained by improving an existing technology (e.g. an existing drilling machine) or the need to address a problem.

More formally, engineering research may be described as a systematic, rigorous approach to

engineering problem-solving that applies principles and techniques to collect data, to ensure the

generation of valid, defensible and supportable engineering conclusions. This is usually carried

out under the constraint of a minimal expenditure of engineering runs, time, and money.1

To guarantee the integrity of the research process and to obtain high-quality results and usable conclusions, a number of practices are recommended:

• Following the standards of the scientific method

• Purpose clearly defined

• Research process detailed (for replicability by others)

• Research design thoroughly planned

• High ethical standards applied

• Limitations frankly revealed

• Adequate analysis for decision maker’s needs

• Findings succinctly presented

• Conclusions must reflect research objectives

1 (US-NIST- National Institute of Standards and Technology)

2.1.2 Classification of Engineering Experiments

As part of research in engineering, experiments may be conducted for one or more of the following reasons:

• A theoretical relationship between two or more variables is already known (or at least suspected) and an experiment is needed to verify or quantify this relationship.

• A theoretical relationship between two or more variables is not available but is sought through an experiment.

• A new product is being developed and a test is needed to confirm that it meets the design specifications before committing it to production.

2.1.3 Research Questions in Engineering

The engineer is interested in assessing whether a change in a single factor has in fact resulted in a

change/improvement to the process as a whole.

The engineer is interested in "understanding" the process as a whole in the sense that he/she

wishes (after design and analysis) to have in hand a ranked list of important through unimportant

factors (most important to least important) that affect the process.

The engineer is interested in functionally modeling the process with the output being a good-

fitting (high predictive power) mathematical function, and to have good estimates (maximal

accuracy) of the coefficients in that function.

The engineer is interested in determining optimal settings of the process factors; that is, to

determine for each factor the level of the factor that optimizes the process response.

2.1.4 Experiment Design Process

In conducting experiments in engineering research, the following procedure is prescribed to help the researcher obtain valid and defensible conclusions:

1. Scientific/Engineering Concept

2. Questions Posed

3. Equipment /Materials

4. Design of Procedure

5. Analysis of Results

6. Conclusions

The procedure prescribed above may be expanded further into a flow process for the design of

experiments as presented below:

Process Flow for Design of Experiments

1. Define the goals and objectives of the experiment. While the goal may be general, the

objectives need to be more specific and measurable, directly or indirectly;

2. Research any relevant theory and previously published data from similar experiments.

Performing computer simulations may also be part of this research, assuming that

appropriate software is available. The purpose of this step is to have an idea about what

to expect from the experiment;

3. Select the dependent and independent variable(s) to be measured;

4. Select appropriate methods for measuring these variables;

5. Choose appropriate equipment and instrumentation;

6. Select the proper range of the independent variable(s);

7. Determine an appropriate number of data points needed for each type of measurement;

8. Data analysis and reporting - qualitative analysis and quantitative analysis.

Additional Skills

In addition to the steps outlined above, the researcher must be careful to:

1. Familiarize himself/herself with the equipment to be used;

2. Ensure that instruments are properly calibrated;

3. Follow the proper procedure to collect the data and / or measure the performance of the

product, e.g. reading from the meniscus in volume measurements.

Analyzing and interpreting data constitutes an important component of research, and the

researcher should be able to:

1. Carry out the necessary calculations;

2. Perform an error and uncertainty analysis;

3. Tabulate and plot the results using appropriate choice of variables and software

(such as STATA, SPSS, Microsoft Excel, etc.);

4. Make observations and draw conclusions regarding the variation of the

parameters involved;

5. Compare results with predictions from theory or design calculations and attempt

an explanation of any discrepancies observed.

EXAMPLE

A student was tasked to investigate and compare the power output of a solar panel with a fixed orientation to that of a solar panel whose orientation tracked the sun. He also tried to verify that the power output of a photovoltaic cell is a function of temperature.

METHODOLOGY

1. Define goals and objectives:

The goals and objectives for the experiment were to verify that:

• A logarithmic relationship exists between angle of incidence of sunlight on a solar panel and power output;

• A tracking system increases power output by 20%; and

• The power output of a solar cell is a function of temperature.

2. Research relevant theory and previously published data:

The student investigated various sources of information in designing his experiment: Internet resources, textbooks, interviews with experts in solar systems, etc.

3. Select the independent / dependent variables:

The key variables were identified to be:

Angle of orientation of solar panel (independent), and

Output power of solar panel (dependent)

4. Select appropriate methods:

The student chose a direct method for measuring the angle of the solar panel and measured

voltage and current to determine power output of the solar panel.
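
Since power is obtained indirectly, as the product of the measured voltage and current, the calculation for each reading can be sketched as follows. The function name and the sample readings are hypothetical and are not data from the student's experiment.

```python
def power_output(voltage_v, current_a):
    """Electrical power in watts from a voltage (V) and current (A) reading."""
    return voltage_v * current_a


# Hypothetical multimeter readings (volts, amperes), for illustration only.
readings = [(17.2, 0.55), (16.8, 0.48)]
for volts, amps in readings:
    print(f"{volts} V x {amps} A = {power_output(volts, amps):.2f} W")
```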

5. Choose equipment and instrumentation:

The student used a camera tripod, protractor, and plumb bob to orient and determine the angle of

the solar panel for the fixed panel measurements, and a sundial2 rod to orient the panel normal to

the sun’s rays for the tracking measurements; a digital multimeter to measure current and

voltage; and a thermometer to measure the temperature of the solar panel.

6. Select the range of the independent variable:

The tripod allowed a 55-degree range of motion and this set the range for the angle of incidence. For the tracking measurements, readings were taken from 6:45 am to 6:00 pm. For investigating the effect of temperature, the available resources limited the student to temperatures obtainable under ambient conditions and by cooling the solar panel with ice cubes.

7. Determine the appropriate number of data points:

2 A sundial is a device that determines the time of day by the position of the Sun.

To investigate the logarithmic relationship between angle of incidence and power output, the

student chose 5-degree increments, which resulted in 12 data points.

For the tracking measurements, the student reoriented the panel to be normal to the sun’s rays

using the following schedule:

• 15 min intervals 6:45 am – 10:00 am

• 30 min intervals 10:30 am – 4:00 pm
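
The number of data points implied by these choices can be enumerated with a short script. This is a sketch under stated assumptions: the angle range is assumed to start at 0 degrees (the text gives only the 55-degree span and the 5-degree step), and the helper names are invented for this example.

```python
from datetime import datetime, timedelta


def angle_settings(span_deg=55, step_deg=5):
    """Panel angles from 0 up to span_deg inclusive (start at 0 is an assumption)."""
    return list(range(0, span_deg + 1, step_deg))


def time_schedule(start, end, step_min):
    """Clock times from start to end inclusive, every step_min minutes (24-hour HH:MM)."""
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    stop = datetime.strptime(end, fmt)
    times = []
    while current <= stop:
        times.append(current.strftime(fmt))
        current += timedelta(minutes=step_min)
    return times


angles = angle_settings()
morning = time_schedule("06:45", "10:00", 15)
midday = time_schedule("10:30", "16:00", 30)

print(len(angles))   # 12
print(len(morning))  # 14
print(len(midday))   # 12
```

Writing the plan out in this way before going to the field makes it easy to confirm that the 5-degree step over a 55-degree span gives the 12 angle settings quoted above.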

Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel

Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel

Self Assessment 2.1

It is claimed that a 10% blend of biodiesel with conventional diesel improves the emissions characteristics

of engines. Design an experiment to investigate the veracity of this claim.

SESSION 2.2: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS

2.2.1 Sources of Error in Experimental Work

In conducting experiments errors may arise from a number of sources including:

Blunders (mistakes) - e.g. dropping a solid on the balance pan;

Human error (different from blunders) - arises mainly from inexperience, e.g. not reading from the meniscus of a volumetric cylinder.

Instrumental limitations – inherent errors and limitations in instruments used

(discussed later in this section)

Errors due to external influences – e.g. impurity in the chemicals used. This could

be minimized with careful design of experiment.

Sampling Error - Errors arising out of samples that do not adequately represent the

population.

o Example 1 - in measuring solar radiation, data taken at peak sunshine hours

could be misleading (unrepresentative).

o Example 2 – in measuring pollution levels in a river, different pollutants dominate depending on the time of day; if this is not taken into account, samples taken for analysis will be unrepresentative of reality.
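
The solar radiation example of sampling error can be illustrated numerically. The sketch below assumes an idealised half-sine irradiance profile between 6:00 and 18:00 with a 1000 W/m² peak; this profile, the hours chosen and the function names are assumptions for illustration, not measured data.

```python
import math


def irradiance(hour, sunrise=6.0, sunset=18.0, peak=1000.0):
    """Idealised half-sine irradiance in W/m^2 (hypothetical clear-sky profile)."""
    if hour <= sunrise or hour >= sunset:
        return 0.0
    return peak * math.sin(math.pi * (hour - sunrise) / (sunset - sunrise))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hourly samples over the whole day versus samples at peak hours only.
all_day = mean(irradiance(h) for h in range(6, 19))
peak_only = mean(irradiance(h) for h in range(11, 14))

print(f"All-day mean:   {all_day:.0f} W/m^2")
print(f"Peak-hour mean: {peak_only:.0f} W/m^2")
```

Under these assumptions the peak-hour sample overstates the daily mean considerably, which is the sense in which such a sample is unrepresentative.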

2.2.1.1 Instrumental Errors - A Closer Look

Random errors occur due to inherent limitations of the measuring instrument used. The smallest division that is marked on a measuring instrument is referred to as the least count. Thus a metre rule will have a least count of 1.0 mm; a digital stop watch might have a least count of 0.01 sec, etc. The Instrument Limit of Error (ILE) is the precision to which a measuring device can be read, and is always equal to or smaller than the least count.

Instrument Limit of Error (ILE): Good measuring tools are calibrated against national and international standards, e.g. ISO, IEC, the US National Institute of Standards and Technology (NIST), Ghana Standards Board, etc.

The Instrumental Limit of Error (ILE) is generally taken to be the least count or some fraction

(1/2, 1/5, 1/10) of the least count. For some devices the ILE is given as a tolerance or a

percentage.

Resistors may be specified as having a tolerance of 5%, implying that the ILE is 5% of the

resistor's value.
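
The two ways of arriving at an ILE described above can be written out as a short calculation. The function names and the 220-ohm resistor are hypothetical, chosen only to illustrate the arithmetic.

```python
def ile_from_least_count(least_count, fraction=1.0):
    """ILE taken as the least count or a fraction (e.g. 1/2, 1/5, 1/10) of it."""
    return least_count * fraction


def ile_from_tolerance(nominal_value, tolerance_percent):
    """ILE for a device whose limit of error is quoted as a percentage tolerance."""
    return nominal_value * tolerance_percent / 100.0


# Metre rule with a 1.0 mm least count, read to half a division.
print(ile_from_least_count(1.0, 0.5))   # 0.5 (mm)

# Hypothetical 220-ohm resistor with a 5% tolerance.
print(ile_from_tolerance(220.0, 5.0))   # 11.0 (ohm)
```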

Figure 2.3: Determining Instrumental Limits of Error and Least Count

2.2.2 Estimating Uncertainties

The statistical method for finding a value with its uncertainty is to repeat the measurement

several times, find the average, and find either the average deviation or the standard

deviation. The example below is presented for 4 repeated measurements of time (7.4, 8.1, 7.9 and 7.0 sec).

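Table 2.3: Determining the average, average deviation and standard deviation

| Time, $t$ (sec) | $t - \langle t \rangle$ (sec) | $\lvert t - \langle t \rangle \rvert$ (sec) | $(t - \langle t \rangle)^2$ (sec$^2$) |
|---|---|---|---|
| 7.4 | -0.2 | 0.2 | 0.04 |
| 8.1 | 0.5 | 0.5 | 0.25 |
| 7.9 | 0.3 | 0.3 | 0.09 |
| 7.0 | -0.6 | 0.6 | 0.36 |
| Average: $\langle t \rangle = 7.6$ | $\langle t - \langle t \rangle \rangle = 0.0$ | Average deviation: $\langle \lvert t - \langle t \rangle \rvert \rangle = 0.4$ | $\sum (t - \langle t \rangle)^2 / (N - 1) = 0.247$; standard deviation $= 0.50$ |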

Measurements are then reported with the uncertainty as:

Measurement = Best Estimate ± Uncertainty

The average (mean) value is usually taken as the best estimate, and is determined as:

$$\text{Average, } \langle t \rangle = \frac{t_1 + t_2 + t_3 + \cdots + t_N}{N}$$

where N is the number of observations or measurements.

A way to express the variation among the measurements is to use the average deviation. This

statistic tells us on average (with 50% confidence) how much the individual measurements vary

from the mean. As indicated above, the average deviation is calculated by summing the absolute

values of the deviation of measurements from the mean, and dividing by the number of

observations.

$$\text{Average deviation, } \langle |t - \langle t \rangle| \rangle = \frac{|t_1 - \langle t \rangle| + |t_2 - \langle t \rangle| + \cdots + |t_N - \langle t \rangle|}{N}$$

However, the standard deviation is the most common way to characterize the spread of a data

set. The standard deviation is always slightly greater than the average deviation, and is used

because of its association with the normal distribution that is frequently encountered in statistical

analyses.

$$\text{Standard deviation} = \sqrt{\frac{|t_1 - \langle t \rangle|^2 + |t_2 - \langle t \rangle|^2 + \cdots + |t_N - \langle t \rangle|^2}{N - 1}}$$

In the example above (Table 2.3), the standard deviation of 0.5 sec implies that, for the same series of measurements, an additional measurement may be expected (with about 68% confidence) to lie within ± 0.5 sec of the average value of 7.6 sec.
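The calculation in Table 2.3 can be checked with a few lines of code. The following is a minimal sketch in Python, using only the four time readings given in the text; the variable names are chosen here for illustration.

```python
import math

# The four repeated time measurements from Table 2.3, in seconds.
times = [7.4, 8.1, 7.9, 7.0]
n = len(times)

# Best estimate: the average of the readings.
mean = sum(times) / n

# Average deviation: mean of the absolute deviations from the average.
avg_dev = sum(abs(t - mean) for t in times) / n

# Standard deviation with the N - 1 denominator used in the text.
std_dev = math.sqrt(sum((t - mean) ** 2 for t in times) / (n - 1))

print(f"average            = {mean:.1f} s")
print(f"average deviation  = {avg_dev:.1f} s")
print(f"standard deviation = {std_dev:.2f} s")
```

Running the sketch reproduces the tabulated values of 7.6 sec, 0.4 sec and 0.50 sec.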

Fractional Uncertainty

When a reported value is determined by taking the average of a set of independent readings, the

fractional uncertainty is given by the ratio of the uncertainty divided by the average value.

$$\text{Fractional uncertainty} = \frac{\text{Uncertainty}}{\text{Average}}$$

The fractional uncertainty is dimensionless, and is sometimes reported as a percentage.

Propagation of Errors - Basic Rules

General theory:

Suppose we want to determine a quantity f which depends on variables x, y ... etc.

f(x,y,...)

$$df = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \cdots$$

Taking the square of the above expression, we get the law of propagation of uncertainty:

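$$(df)^2 = \left(\frac{\partial f}{\partial x}\right)^2 (\delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\delta y)^2 + 2\left(\frac{\partial f}{\partial x}\right)\left(\frac{\partial f}{\partial y}\right)\delta x\,\delta y$$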

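If the measurements of x and y are uncorrelated, then the cross term δxδy averages to zero and the error in the function f may be approximated as:

$$\Delta f = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 (\Delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\Delta y)^2}$$

where δx ≈ Δx and δy ≈ Δy.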
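The general formula can also be evaluated numerically when the partial derivatives are tedious to work out by hand. The sketch below is one possible approach, not a prescribed method: it approximates each partial derivative by a central finite difference and adds the contributions in quadrature, assuming uncorrelated errors. The function name `propagate`, the step size and the input values are all assumptions made for illustration.

```python
import math

def propagate(f, values, errors, h=1e-6):
    """Estimate the error in f(*values) for uncorrelated input errors.

    Each partial derivative is approximated by a central finite difference.
    """
    total = 0.0
    for i, (v, dv) in enumerate(zip(values, errors)):
        step = h * max(abs(v), 1.0)
        up = list(values)
        down = list(values)
        up[i] = v + step
        down[i] = v - step
        dfdx = (f(*up) - f(*down)) / (2 * step)
        total += (dfdx * dv) ** 2
    return math.sqrt(total)

# Hypothetical inputs, chosen only to illustrate the calculation.
x, dx = 2.0, 0.1
y, dy = 3.0, 0.2

err_sum = propagate(lambda a, b: a + b, [x, y], [dx, dy])
err_prod = propagate(lambda a, b: a * b, [x, y], [dx, dy])
print(f"error in x + y = {err_sum:.3f}")
print(f"error in x * y = {err_prod:.3f}")
```

For a sum the result is the square root of (Δx)² + (Δy)², and for a product it is the square root of y²(Δx)² + x²(Δy)², which is what the partial derivatives predict.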

Examples:

a) If $f = x + y$, then

$$\frac{\partial f}{\partial x} = 1, \quad \frac{\partial f}{\partial y} = 1$$

$$\Delta f = \sqrt{(\Delta x)^2 + (\Delta y)^2}$$

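b) If $f = xy$, then

$$\frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x$$

$$\Delta f = \sqrt{y^2 (\Delta x)^2 + x^2 (\Delta y)^2}$$

Dividing by the function $f = xy$, we obtain

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2}$$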

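c) If $f = x/y$, then

$$\frac{\partial f}{\partial x} = \frac{1}{y}, \quad \frac{\partial f}{\partial y} = -\frac{x}{y^2}$$

$$\Delta f = \sqrt{\left(\frac{1}{y}\right)^2 (\Delta x)^2 + \left(\frac{x}{y^2}\right)^2 (\Delta y)^2}$$

(The negative sign of the second derivative disappears on squaring.)

Dividing by the function $f = x/y$, we obtain

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2}$$

Therefore the fractional uncertainty in the function f has the same form for both multiplication and division. Note that, unlike for sums, this is always written in terms of fractional errors for dimensional consistency.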

By a similar process the error in a function of the form

$$f = x^m y^n$$

may be expressed as:

$$\frac{\Delta f}{f} = \sqrt{\left(\frac{m\,\Delta x}{x}\right)^2 + \left(\frac{n\,\Delta y}{y}\right)^2}$$

A summary of some of the basic rules is presented in the table below:

Table 4.2: Basic rules in error propagation

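Operation        Example        Error
Addition         S = A + B      $\Delta S = \sqrt{(\Delta A)^2 + (\Delta B)^2}$
Subtraction      D = A − B      $\Delta D = \sqrt{(\Delta A)^2 + (\Delta B)^2}$
Multiplication   P = A × B      $\frac{\Delta P}{P} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2}$
Division         Q = A / B      $\frac{\Delta Q}{Q} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2}$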

For equations involving mixtures of multiplication, division, addition, subtraction, and powers;

the same basic rules are applied systematically to evaluate the error contained in the dependent

variable as a result of errors in the independent variables.
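As an illustration of applying the rules systematically, the sketch below evaluates z = (a + b) × c in two steps: the addition rule first, then the multiplication rule. The quantities a, b and c and their errors are hypothetical values chosen here for illustration; they do not come from the course text.

```python
import math

# Hypothetical measured values and absolute errors (not from the text).
a, da = 4.0, 0.1
b, db = 6.0, 0.2
c, dc = 2.0, 0.05

# Step 1: addition rule for s = a + b (absolute errors add in quadrature).
s = a + b
ds = math.sqrt(da**2 + db**2)

# Step 2: multiplication rule for z = s * c (fractional errors add in quadrature).
z = s * c
dz = z * math.sqrt((ds / s) ** 2 + (dc / c) ** 2)

print(f"z = {z:.2f} ± {dz:.2f}")
```

The same result is obtained from the general formula with the partial derivatives of z with respect to a, b and c, which is a useful cross-check when an expression is broken into steps.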

Example

In an experiment to determine the enthalpy of neutralization of sodium hydroxide by hydrochloric acid, the initial temperature was (19.2 ± 0.2) °C, and the final temperature (26.4 ± 0.2) °C. What is the temperature rise?

Solution: Temperature rise = (T2 – T1) ± ΔT
= (26.4 – 19.2) °C ± ΔT
= 7.2 °C ± ΔT

The error ΔT is given by:

$$\Delta T = \sqrt{(\Delta T_1)^2 + (\Delta T_2)^2} = \sqrt{(0.2)^2 + (0.2)^2} = 0.28\ ^{\circ}\mathrm{C}$$

Temperature rise = (7.2 ± 0.28) °C, or (7.2 ± 0.3) °C when the uncertainty is rounded to one significant figure.
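The temperature-rise example can be reproduced in code as a quick check. This is a minimal sketch using the values quoted in the example; the variable names are illustrative.

```python
import math

# Initial and final temperatures with their uncertainties, in °C.
t1, dt1 = 19.2, 0.2
t2, dt2 = 26.4, 0.2

# Subtraction rule: the absolute errors add in quadrature.
rise = t2 - t1
d_rise = math.sqrt(dt1**2 + dt2**2)

print(f"temperature rise = ({rise:.1f} ± {d_rise:.2f}) °C")
```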

Self Assessment 2.2

Calculate z and Δz for each of the following cases:

1. z = x − 2.5y + w for x = (4.72 ± 0.12) m, w = (15.63 ± 0.16) m

2. z = (w × x / y) for w = (14.42 ± 0.03) m/s², x = (3.61 ± 0.18) m, y = (650 ± 20) m/s

3. z = A sin y for A = (1.602 ± 0.007) m/s, y = (0.774 ± 0.003) rad.

SESSION 2.3: EXPERIMENTAL ERROR PROPAGATION AND ANALYSIS

2.3.1 Probability Distributions and Standard Errors

A probability distribution is a function that describes the probability of a random variable³ taking certain values. In more precise definitions, a distinction is made between discrete and continuous random variables (emathzone, 2012).

The probability function of a continuous random variable is called the probability density function. It is denoted by 𝑓(𝑥), where 𝑓(𝑥)∆𝑥 is approximately the probability that the random variable X takes a value between 𝑥 and 𝑥 + ∆𝑥, and ∆𝑥 is a very small change in X.

³ A random variable is a numerical variable whose measured value can change from one replicate experiment to another.

A random variable is called continuous if it can assume any value within its possible range. Suppose the temperature in a certain city in the month of June has, over many past years, always been between 35 °C and 45 °C. The temperature can take any value in the range 35 °C to 45 °C.

For a discrete random variable the values of the variable are exact, like 0, 1, 2 good bulbs. For a continuous random variable the value is always expressed as an interval, although the interval may be very small.

Figure 2.4: Plot of f(x) vs X

Credit: (emathzone, 2012)

The probability that X is between a and b is determined as the integral of 𝑓(𝑥) from a to b, and

is expressed mathematically as:

$$P(a < X < b) = \int_{a}^{b} f(x)\,dx$$
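The integral can be approximated numerically when no closed form is convenient. The sketch below uses the trapezoidal rule on a hypothetical uniform density for the June temperature example (every temperature between 35 °C and 45 °C assumed equally likely); the uniform shape is an assumption made here purely for illustration.

```python
def pdf(x):
    """Hypothetical uniform density for a temperature between 35 and 45 °C."""
    return 0.1 if 35.0 <= x <= 45.0 else 0.0

def probability(f, a, b, steps=10_000):
    """Approximate P(a < X < b) as the area under f by the trapezoidal rule."""
    h = (b - a) / steps
    area = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        area += f(a + i * h)
    return area * h

p = probability(pdf, 38.0, 40.0)      # chance of a temperature between 38 and 40 °C
total = probability(pdf, 35.0, 45.0)  # area under the whole density

print(f"P(38 < X < 40) = {p:.3f}")
print(f"total area     = {total:.3f}")
```

Under this assumed density the probability of a reading between 38 °C and 40 °C is 0.2, and the area under the whole curve is 1.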

2.3.2 Properties of Probability Density Function

The probability density function 𝑓(𝑥) has the following properties.

1. The function is non-negative for all values of 𝑥: 𝑓(𝑥) ≥ 0

2. The total area under the curve is 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$

3. $P(X = c) = \int_{c}^{c} f(x)\,dx = 0$, where c is a constant⁴. A probability of zero is assigned to each point of the random variable. This means that we must calculate a probability for a continuous random variable over an interval and not for any particular point. The probability can be interpreted as the area under the graph over the interval from a to b.

4. If X is a continuous random variable, then for any a and b,

$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$$

⁴ The probability of a continuous random variable assuming a specific value is zero. This does not necessarily mean that a particular value cannot occur. The interpretation is that the point (event) is one of an infinite number of possible outcomes.

2.3.3 Mean and Variance

Important parameters in presenting probability distributions include the mean (arguably the most

popular statistical parameter), the variance and standard deviation. These parameters could be

based on the population (N) or on a sample of the population (n), see figure 2.5 below:

Figure 2.5: samples are drawn from populations

Variance

The mean may be determined as below:

$$\mu = \frac{\sum X}{N} \quad \text{(based on the population, } N\text{)}$$

Or

$$\bar{X} = \frac{\sum X}{n} \quad \text{(based on a sample, } n\text{)}$$

Where:

μ = the population mean,

X̄ = the sample mean,

X = a score in the distribution

The variance is then determined as:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Where:

𝜎2 = 𝑣𝑎𝑟𝑖𝑎𝑛𝑐𝑒 𝑏𝑎𝑠𝑒𝑑 𝑜𝑛 𝑡ℎ𝑒 𝑝𝑜𝑝𝑢𝑙𝑎𝑡𝑖𝑜𝑛

𝑠2 = 𝑣𝑎𝑟𝑖𝑎𝑛𝑐𝑒 𝑏𝑎𝑠𝑒𝑑 𝑜𝑛 𝑠𝑎𝑚𝑝𝑙𝑒

The standard deviation is then the square root of the variance, i.e. s or σ.
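The difference between the population and sample formulas is easy to see numerically. The sketch below uses Python's standard `statistics` module; the data are the four time readings from Table 2.3, reused here only as a convenient example.

```python
import statistics

# Four readings reused from Table 2.3 as example data.
scores = [7.4, 8.1, 7.9, 7.0]

pop_var = statistics.pvariance(scores)   # divides by N (population formula)
samp_var = statistics.variance(scores)   # divides by n - 1 (sample formula)
samp_sd = statistics.stdev(scores)       # square root of the sample variance

print(f"population variance = {pop_var:.3f}")
print(f"sample variance     = {samp_var:.3f}")
print(f"sample std dev      = {samp_sd:.2f}")
```

The sample variance is the larger of the two because it divides the same sum of squares by n − 1 rather than N.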

2.3.4 The Normal Distribution (also called the bell-curve)

The normal distribution is the most widely used model for the distribution of random variables

and helps in determining the probability of something occurring in a given sample just due to

chance. It is also called the bell-curve because of its resemblance to the shape of a bell (see

below, Fig 2.6).

Figure 2.6: the normal distribution is bell-shaped.

The normal distribution has three fundamental characteristics:

Symmetrical - upper half and the lower half of the distribution are mirror images of each

other.

Unimodal - the mean, median, and mode are all in the same place, in the center of the

distribution (i.e., the top of the bell curve); and the normal distribution is highest in the

middle.

Asymptotic - the upper and lower tails of the distribution never actually touch the

baseline, also known as the x-axis.

In a normal distribution, a random variable X has a probability density function given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty$$

where −∞ < μ < ∞ and σ > 0.

The notation 𝑁(𝜇, 𝜎2) is often used to denote a normal distribution with mean μ and variance σ2.
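The density function translates directly into code. The following minimal sketch evaluates it for the standard normal case (μ = 0, σ = 1); the function name `normal_pdf` is chosen here for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution N(mu, sigma^2) at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

peak = normal_pdf(0.0)    # highest point, at the mean
left = normal_pdf(-1.0)   # one standard deviation below the mean
right = normal_pdf(1.0)   # one standard deviation above the mean

print(f"f(0) = {peak:.4f}, f(-1) = {left:.4f}, f(1) = {right:.4f}")
```

The equal values at −1 and +1 illustrate the symmetry of the curve, and the largest value occurs at the mean.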

2.3.5 Skewed Distributions

When a sample of scores is not normally distributed, two terms, skew and kurtosis, are used to

characterise it.

If there are a few scores creating an elongated tail at the higher end of the distribution, it is said

to be positively skewed (see Fig 2.7). If the tail is pulled out toward the lower end of the

distribution, the shape is called negatively skewed (see Fig 2.8).

Figure 2.7: Positively skewed distribution

Figure 2.8: Negatively skewed distribution

Kurtosis refers to the shape of the distribution in terms of height, or flatness. When a distribution

has a peak that is higher than that found in a normal, bell-shaped distribution, it is called

leptokurtic. When a distribution is flatter than a normal distribution, it is called platykurtic.

2.3.6 Standardization and Z-Scores

Using the mean and the standard deviation, researchers are able to generate a standard score,

also called a z score to help them understand where an individual score falls in relation to other

scores in the distribution.

A standard normal random variable is defined as a random variable with μ=0 and σ2=1. It is

normally denoted as Z.

Through a process of standardization, researchers are also better able to compare individual scores in the distributions of different variables. Standardization is simply the process of converting each raw score in a distribution into a z score, i.e. into standard deviation units.

A z score indicates how far above or below the mean a given score in the distribution is, in standard deviation units.

The z-score is computed as indicated below, in terms of the mean and standard deviation:

$$z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} = \frac{X - \mu}{\sigma} \quad \text{or} \quad \frac{X - \bar{X}}{s}$$

The 68-95-99.7% Rule

All normal probability density curves satisfy the following properties (see figure 2.9):

68% of the observations fall within 1 standard deviation of the mean,

i.e. 𝑃(𝜇 − 𝜎 < 𝑋 < 𝜇 + 𝜎) = 0.6827

95% of the observations fall within 2 standard deviations of the mean,

i.e. 𝑃(𝜇 − 2𝜎 < 𝑋 < 𝜇 + 2𝜎) = 0.9545

99.7% of the observations fall within 3 standard deviations of the mean,

i.e. 𝑃(𝜇 − 3𝜎 < 𝑋 < 𝜇 + 3𝜎) = 0.9973
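These three probabilities can be verified without statistical tables. The sketch below relies on the standard identity that, for a normal variable, the probability of lying within k standard deviations of the mean equals erf(k/√2); the helper name `within` is an illustrative choice.

```python
import math

def within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```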

Figure 2.9: Interpreting the z-score

Interpreting z-Scores

• z scores tell researchers instantly how large or small an individual score is relative to

other scores in the distribution.

• For example, if a student got a z score of -1.5 in an exam, it can be inferred that the student scored 1.5 standard deviations below the mean in that exam.

• If another student had a z score of 0.29, that student scored 0.29 standard deviation units above the mean in the exam.

Self Assessment 2.3

Suppose that the average score of a student in an automobile engineering class is 517, with a

standard deviation of 100, and the distribution of scores is normal. What is the score that marks

the 90th percentile?

Remarks

Remember that the 90th percentile is 40 percentile points above the mean in a normal

distribution, so we are looking for the z score at which 40% of the distribution falls between the

mean and this z-score.

OR

The z score at which 10% of the distribution falls above, because the 90th percentile score

divides the distribution into sections with 90% of the score falling below this point and 10%

falling above

1. From traditional statistics tables⁵, the z score that corresponds with the 90th percentile (probability of 0.9) is 1.28.

So z = 1.28

These tables are developed using the probability function of a normal distribution.

2. Convert this z score back into the original unit of measurement:

𝑋 = 𝜇 + (𝑧)(𝜎)

𝑋 = 517 + (1.28)(100)

𝑋 = 517 + 128

𝑋 = 645

⁵ For cumulative standard normal distribution

Quick Questions:

• What does a z-score of 1.0 mean?

• What DOES it say?

• What does it NOT say?
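The table lookup and the conversion back to raw scores can be cross-checked in code. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module (available from Python 3.8), with the mean of 517 and standard deviation of 100 given in the exercise.

```python
from statistics import NormalDist

# Distribution of class scores as stated in the exercise.
scores = NormalDist(mu=517, sigma=100)

z90 = NormalDist().inv_cdf(0.90)  # z score that marks the 90th percentile
x90 = scores.inv_cdf(0.90)        # the same percentile in the original units

print(f"z = {z90:.2f}, X = {x90:.0f}")
```

The inverse cumulative distribution function plays the role of reading the statistical table backwards, from a probability to a z score.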

2.3.7 Other Probability Distribution Functions

In addition to the normal distribution, other probability distributions include:

The binomial distribution which is used for the reporting of outcomes of random

experiments consisting of n repeated trials such that

o The trials are independent,

o Each trial results in only two possible outcomes, labeled as success and failure,

and

o The probability of a success on each trial, denoted as p, remains constant.

The random variable X that equals the number of trials that result in a success has a binomial

distribution with parameters p and n, where 0 < 𝑝 < 1, and 𝑛 = {1,2,3, … }

The probability function of X is:

$$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$

The mean and variance are determined as:

𝜇 = 𝐸(𝑋) = 𝑛𝑝 and 𝜎2 = 𝑉(𝑋) = 𝑛𝑝(1 − 𝑝)
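The binomial probability function and its mean and variance can be checked numerically. In the sketch below, n = 10 and p = 0.5 are hypothetical values chosen for illustration, and `binomial_pmf` is an illustrative function name.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5  # hypothetical parameters, chosen for illustration

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(f"P(X = 3) = {pmf[3]:.4f}")
print(f"mean = {mean:.2f} (np = {n * p:.2f})")
print(f"variance = {variance:.2f} (np(1 - p) = {n * p * (1 - p):.2f})")
```

Summing over all possible outcomes recovers the mean np and variance np(1 − p), and the probabilities themselves add up to 1.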

The Poisson distribution:

The Poisson distribution is used to model the number of events over an interval, such as the number of e-mails that arrive in an hour. Assume that events occur at random throughout the interval. If the interval can be partitioned into subintervals of small enough length such that:

the probability of more than one count in a subinterval is zero,

the probability of one count in a subinterval is the same for all subintervals and

proportional to the length of the subinterval, and

the count in each subinterval is independent of other subintervals,

then the random experiment is called a Poisson process.

If the mean number of counts in the interval is λ > 0, the random variable X, which is the number of counts in the interval, has a Poisson distribution with parameter λ, and the probability function is:

$$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

The mean and variance of X are

E(X) = λ and V(X) = λ
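The Poisson probability function can be evaluated in the same way. In this sketch λ = 2 (for instance, an average of two e-mails per hour) is a hypothetical value chosen for illustration, and the sum is truncated at 60 counts, beyond which the probabilities are negligible for this λ.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x counts when the mean number of counts is lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 2.0  # hypothetical mean number of counts per interval

# Truncate the infinite sum at 60 counts; the remaining tail is negligible here.
pmf = [poisson_pmf(x, lam) for x in range(60)]
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(f"P(X = 0) = {pmf[0]:.4f}")
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```

Both the mean and the variance come out equal to λ, as stated above.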

SESSION 2.4: STANDARD ERRORS

The standard error is the measure of how much random variation we would expect from samples

of equal size drawn from the same population.

2.4.1 Error of the Mean

When samples are drawn from a given population, say the scores by students in an examination,

the samples will be characterized by their own means (sample means). As an example if 100

students score marks ranging from 2 to 10 in an examination in which 0 is the least and 10 is the

highest; we may at random draw 10 students from the population of 100. The scores of these 10

students will yield a mean of say 5.5. If the earlier 10 students are put back into the population

and another sampling of 10 students is done, their scores may yield another mean of say 6.0. If

this process is continued, a distribution of the means of the samples will be obtained, as indicated

in figure 2.10 below.

Figure 2.10: distribution of the means of the samples

The distribution of the means of the samples drawn also possesses the characteristics of other probability distributions, i.e. a mean and a standard deviation. The mean of the sampling distribution is called the expected value of the mean, because it is the same as the population mean. The associated standard deviation (of the sampling distribution) is called the standard error.

The standard errors of the mean are calculated as below:

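$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

or, when the sample estimate is used,

$$s_{\bar{X}} = \frac{s}{\sqrt{n}}$$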

Where;

σ = the standard deviation for the population

s = the sample estimate of the standard deviation (used when σ is not known)

n = the size of the sample
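A short calculation shows how the sample estimate of the standard error is obtained in practice. The ten scores below are hypothetical, invented here to represent one random draw of 10 students; they are not data from the course.

```python
import math
import statistics

# Hypothetical exam scores for one random sample of 10 students.
sample = [5, 7, 4, 6, 8, 5, 6, 3, 7, 4]

n = len(sample)
sample_mean = statistics.mean(sample)
s = statistics.stdev(sample)        # sample estimate of the standard deviation
standard_error = s / math.sqrt(n)   # sample estimate of the standard error

print(f"sample mean    = {sample_mean:.2f}")
print(f"s              = {s:.2f}")
print(f"standard error = {standard_error:.2f}")
```

For these assumed scores the sample mean is 5.5 and the standard error is 0.5, so a different sample of 10 students would typically give a mean within about half a mark of the expected value.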

The standard error of the mean refers to the average difference between the expected value

(e.g., the population mean) and an individual sample mean as shown in Figure 2.11 below.

Figure 2.11: Average difference between expected value and sample mean

2.4.2 Central Limit Theorem

The Central Limit Theorem simply states that as long as you have a reasonably large sample size (e.g., n = 30), the sampling distribution of the mean will be approximately normally distributed, even if the distribution of scores in your sample is not.

This theorem says that even when you have a non-normal distribution in a population, the

sampling distribution of the mean will most likely approximate a normal, bell-shaped

distribution as long as you have at least 30 cases in your sample.
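The theorem can be illustrated by simulation. The sketch below draws repeated samples of 30 from a deliberately skewed population and examines the resulting sample means. The exponential population (mean 1, standard deviation 1), the number of repetitions and the random seed are all assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

n = 30             # sample size suggested in the text
n_samples = 2000   # number of repeated samples (an arbitrary choice)

# Hypothetical skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / math.sqrt(n)

# Fraction of sample means lying within one standard error of the population mean.
frac_within = sum(abs(m - 1.0) < expected_se for m in sample_means) / n_samples

print(f"mean of sample means = {mean_of_means:.3f}")
print(f"std dev of sample means = {sd_of_means:.3f} (sigma / sqrt(n) = {expected_se:.3f})")
print(f"fraction within one standard error = {frac_within:.2f}")
```

Although the population is strongly skewed, the sample means cluster around the population mean with a spread close to σ/√n, and roughly 68% of them fall within one standard error, as expected for an approximately normal sampling distribution.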

2.4.3 The t-distribution

The t distribution is used for samples of small size, for which the standardized sample mean does not follow the normal distribution exactly. With larger sample sizes (n ≥ 120) the t distribution is practically identical to the normal distribution.

Whenever the population standard deviation is not known and an estimate from a sample must be

used, it is wise to use the family of t distributions.

When σ is known:

$$z = \frac{\text{sample mean} - \text{population mean}}{\text{standard error}} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$

When σ is not known:

$$t = \frac{\text{sample mean} - \text{population mean}}{\text{standard error}} = \frac{\bar{X} - \mu}{s_{\bar{X}}}$$

Where:

μ = population mean

σ_X̄ = standard error using the population standard deviation

X̄ = sample mean

s_X̄ = sample estimate of the standard error

These equations help us to address the question below:

With known population mean, what is the probability of having a sample distribution with a

particular mean?

Example:

Suppose the average American man exercises for 60 minutes a week. Suppose, further, that I have a random sample of 144 men and that this sample exercises for an average of 65 minutes per week with a standard deviation of 10 minutes. What is the probability of getting, by chance, a random sample of this size with a mean of 65 if the actual population mean is 60?

$$t = \frac{65 - 60}{10 / \sqrt{144}} = \frac{5}{0.833} = 6.0$$

From t-tables, the probability of getting a t value of this size or larger by chance with a sample of this size is less than 0.001
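The arithmetic of this example can be reproduced as follows. The sketch computes only the standard error and the t statistic from the figures quoted in the example; the associated probability is still read from t-tables.

```python
import math

sample_mean = 65.0       # minutes per week, from the sample of 144 men
population_mean = 60.0   # minutes per week
s = 10.0                 # sample standard deviation, in minutes
n = 144                  # sample size

standard_error = s / math.sqrt(n)
t = (sample_mean - population_mean) / standard_error

print(f"standard error = {standard_error:.3f}, t = {t:.2f}")
```

Evaluated without rounding the standard error, the statistic is t = 6.0.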

Self Assessment 2.4

An article in the Journal of Heat Transfer described a new method for measuring the thermal

conductivity of Armco iron. Using a temperature of 100 °F and a power input of 550 W, the following 10 measurements of thermal conductivity were obtained (in Btu/hr-ft-°F):

41.60, 41.48, 42.34, 41.95, 41.86,

42.18, 41.72, 42.26, 41.81, 42.04

Determine the standard error of the sample mean.

SESSION 2.5: Examples in Normal Distributions Applied to Engineering Problems

(Courtesy Montgomery, Runger, & Hubele, 2000)

2.5.1 Part 1 – Examples in Normal Distributions and z-Scores

Find the probability P and represent it on a normal distribution diagram, under the following

assumptions for the normalized score Z:

(1) P(Z > 1.26)

(2) P(Z < −0.86)

(3) P(Z > −1.37)

(4) P(−1.25 < Z < 0.37)

(5) P(Z ≤ −4.6)

(6) Find z such that P(Z > z) = 0.05

(7) Find the value of z such that P(−z < Z < z) = 0.99

Q1-SOLUTION

P(Z>1.26)

=1-P(Z≤1.26)= 1-0.89616=0.10384 or 10.384%

Q2-SOLUTION

P(Z<-0.86)

From normal distribution tables:

P(Z<-0.86)= 0.19490

Q3-SOLUTION

P(Z>-1.37)

Remember normal distributions are symmetrical!

P(Z>-1.37)=P(Z<1.37)=0.91465

Q4-SOLUTION

P(-1.25<Z<0.37)

P(Z<0.37)-P(Z<-1.25)

P(Z<0.37)=0.64431;

P(Z<-1.25)= 0.10565

P(Z<0.37)-P(Z<-1.25)= 0.64431-0.10565=0.53866

Q5-SOLUTION

P(Z≤-4.6)

P(Z≤-4.6) is not available in the tables, but using the last score of -3.99; P(Z≤-3.99) = 0.00003

Implying that P(Z≤-4.6) is negligible.

Q6-SOLUTION

Find the value of z such that P(Z>z)=0.05

The z in the expression above is the same z that satisfies P(Z≤z)=0.95

Search through the probabilities in the tables for the value closest to 0.95.

z = 1.65

Q7-SOLUTION

Find the value of z such that P(-z<Z<z)=0.99

Using the symmetry concept, the area remaining in each tail outside the shaded region is (1-0.99)/2=0.005

The value for z therefore corresponds to a cumulative probability of 0.995. The nearest probability in the tables is 0.99506, when z=2.58
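The table lookups in Part 1 can be cross-checked with software. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the names q1 to q7 simply mirror the question numbers as they are worked in the solutions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution: mean 0, standard deviation 1

q1 = 1 - Z.cdf(1.26)              # P(Z > 1.26)
q2 = Z.cdf(-0.86)                 # P(Z < -0.86)
q3 = 1 - Z.cdf(-1.37)             # P(Z > -1.37), equal to P(Z < 1.37) by symmetry
q4 = Z.cdf(0.37) - Z.cdf(-1.25)   # P(-1.25 < Z < 0.37)
q5 = Z.cdf(-4.6)                  # P(Z <= -4.6), beyond the range of most tables
q6 = Z.inv_cdf(0.95)              # z such that P(Z > z) = 0.05
q7 = Z.inv_cdf(0.995)             # z such that P(-z < Z < z) = 0.99

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4),
                    ("Q5", q5), ("Q6", q6), ("Q7", q7)]:
    print(f"{name}: {value:.5f}")
```

The computed z values for Q6 and Q7 are about 1.645 and 2.576; the answers of 1.65 and 2.58 quoted from the tables differ only because the tables list z to two decimal places.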

2.5.2 Part 2 – Applying Normal Distribution to Engineering Problems

Questions

1. The compressive strength of samples of cement from a manufacturing company can be modeled by a normal distribution with a mean of 6000 kg/cm² and a standard deviation of 100 kg/cm².

(i) What is the probability that a sample’s strength is less than 6250 kg/cm²?

(ii) What is the probability that a sample’s strength is between 5800 kg/cm² and 5900 kg/cm²?

(iii) What compressive strength is exceeded by 95% of the samples?

2. The fill volume of an automated filling machine used for filling cans of carbonated

beverage is normally distributed with a mean of 12.4 fluid ounces (fl oz) and a standard

deviation of 0.1 fluid ounce.

(i) What is the probability that a fill volume is less than 12 fluid ounces?

(ii) If all cans less than 12.1 or greater than 12.6 ounces are scrapped, what proportion of cans is scrapped?

(iii) Determine the specifications that are symmetric about the mean that include 99%

of all cans.

Learning Track Activities

Unit Summary

Research is essential in the practice of engineering and is necessitated by the need to solve existing problems, the advantages that could be derived from improved products and services, etc.

A clear methodology should be established in the conduct of experiments, being

conscious of the possible sources of errors including the inherent errors in

equipment used.

Data generated from experimental work can be presented by the probability

distribution curves such as the normal, binomial and Poisson distributions, and

from these important inferences could be made.

Key terms/ New Words in Unit

Least Count

Sampling error

Uncertainty

Random variable

Standard error

Expected value

Z-score

Standardization

Unit Assignments 2

Calculate z and Δz for questions 1 and 2:

1. z = x³ for x = (3.55 ± 0.15) m

2. z = v(xy + w) with v = (0.644 ± 0.004) m, x = (3.42 ± 0.06) m, y = (5.00 ± 0.12) m, w = (12.13 ± 0.08) m².

3. The reaction time of a driver to visual stimulus is normally distributed, with

a mean of 0.4 second and a standard deviation of 0.05 second.

a. What is the probability that a reaction requires more than 0.5 second?

b. What is the probability that a reaction requires between 0.4 and 0.5

second?

c. What is the reaction time that is exceeded 90% of the time?

Unit 3

SOCIAL SCIENCE RESEARCH DESIGN AND DATA

ANALYSIS

Introduction

It is common knowledge that research works within the social sciences draw on various long-

established traditions (viz. anthropology, psychology, sociology, etc). Fundamentally, social

science research works are concerned with people and their life contexts, and seek to answer

philosophical questions relating to the nature of knowledge and truth, values and being which

underpin human judgments and activities. One of the fundamental distinctions between social

science research and that of the natural sciences is the focus of the former on people saddled with

the unpredictability of human behaviour. Researchers in the natural sciences, e.g. medical researchers, are able to use probability theories to develop therapeutic drugs because bodily systems function relatively autonomously from the mind. Social science researchers are, however, unable to

develop such powerful solutions to social problems since the mind enables individuals and

groups to take decisions that vary widely with different motives.

The purpose of this unit is to introduce participants to research designs and the array of research

methodologies within the social sciences. Emphasis is placed on the “empirical social science

research” which involves the design of data collection instruments and the collection,

management, simulation, analysis and presentation of data about people and their social contexts

by a range of methods.

Learning Objectives

After reading this unit you should be able to:

1. Identify the research approaches available to researchers in the social

sciences;

2. Know the factors that affect the effectiveness of these research designs

3. Operationalise these research approaches in ways that the weaknesses do

not limit the credibility of the research findings.

UNIT CONTENTS

SESSION 3.1: SURVEY RESEARCH

3.1.1 Introduction to Survey Research

3.1.2 Detailed Steps in Conducting Successful Surveys

SESSION 3.2 CASE STUDY RESEARCH

3.2.1 Introduction to Case Study

3.2.2 Purpose of Case Studies

3.2.3 Advantages of Case Study

3.2.4 Disadvantages of Case Study

3.2.5 Designing a Case Study

3.2.6 Categories of Case Study

3.2.7 How to Select Cases in a Case Study Research

SESSION 3.3 OTHER TYPES OF RESEARCH DESIGNS

3.3.1 Observational Research

3.3.2 Ethnographic Research (Ethnography)

3.3.3 Historical Research

3.3.4 Descriptive Research

3.3.5 Explanatory Research and Research on Causality

3.3.6 Comparative Research Design

3.3.8 Experimental Design

SESSION 3.4 RESEARCH ETHICS

3.4.1 Why Research Ethics

3.4.2 Balancing Costs and benefits in Research

3.4.3 Informed Consent

3.4.4 Competence

3.4.5 Privacy

SESSION 3.1 SURVEY RESEARCH

3.1.1 Introduction to Surveys

A survey is a systematic method of collecting data from a population of interest. Generally,

survey research tends to be quantitative in nature and aims to collect information from a sample

such that the results are representative of the population within a certain degree of error. Surveys gather quantitative information, usually through the use of a structured and standardized questionnaire. They are appropriate for assessing perceptions, opinions, knowledge, attitudes and behaviors using structured questionnaires which are often closed-ended.

3.1.2 Detailed Steps for Conducting a Survey Research

There are about 12 steps in conducting a survey research. These steps are briefly described

below.

3.1.2.1 Clarify the Purpose of the Research

Clarify the purpose of the research. This step will spell out the reasons for conducting the survey

and identify who will be involved in the design and data collection exercises. It is also important

to clarify that the survey is the best approach for the collection of the information required. In

seeking to clarify the research, the following questions may be relevant:

Why conduct a survey?

Who are the stakeholders (primary and secondary stakeholders)? That is who is interested

and/or has influence over the survey?

Who is the population of interest? Demographic characteristics; Where do they live?

What is the best means of communicating with them (medium, time of day and time of the

week); What is the best way to reach them (i.e. direct interviews, mail, telephone)?

Are you interested in any sub-groups of this population? Determining the characteristics

of the population of interest gives the researcher some indication of how he can get a

representative sample, whether he needs to set quotas for subgroups, and how many

people he would need to survey.

Self Assessment 3.1

You have observed that several children of school-going age are not in school in a

particular farming community. You want to investigate the causes of this phenomenon

through a survey.

Which sub groups would be of interest to you as a researcher?

When will you undertake the survey?

Answer Tips

List everyone who has a stake in children’s schooling in a locality.

Identify the reasons why timing is of the essence in data collection.

3.1.2.2 Resource Assessment

Following the clarification of the survey’s purpose is an assessment of the resource requirements

for the survey. This stage helps the researcher to evaluate the adequacy of in-house resources to enable him to design a survey that is within the budget line. The in-house resource assessment also enables the researcher to know which resources he needs to contract out. The Health

Communication Unit at the Centre for Health Promotion, University of Toronto recommends the

following questions for a comprehensive resource assessment:

Which in-house resources are available for use? Staff availability and skills; logistics

(materials and equipment, etc).

What external resources will you need? After assessing internal resources, if any gaps

are identified in the resources required, they can be filled by external resources.

Is the budget enough to enable the researcher to acquire these resources?

Further Insight

The budget line for a UNDP survey (which was to “address gaps in data on energy access for rural and urban areas in Ghana”) undertaken by The Energy Center, KNUST was US$ 29 000. The original intent was to survey 56 communities from the 10 administrative regions of Ghana. After the survey expenses were assessed, it was found that the budget could cover only 15 communities from three administrative regions. If the planners had glossed over this important stage of the research process, the survey would have landed on an impermeable rock.

3.1.2.3 Decision on the Methods and Procedures

The third step in survey research is to decide on the methods to be used, that is, the most appropriate method for the research work. Primarily, there are three methods for obtaining survey data:

Face to face interviews;

Mailed and e-mailed questionnaires; and

Telephone and computerized telephone interviews.

Table 3.5: Advantages and Disadvantages of the Interview Methods

Method: Face-to-face interviews

Advantages:
• Usually results in a higher response rate.
• Reduces non-response to individual questionnaire items.
• Interviewers can document characteristics of non-respondents and reasons for refusal.
• Preferable for surveys addressing complex issues where further explanations would be required to ensure clarity on the part of the respondent.

Disadvantages:
• A social desirability bias may affect the accuracy of responses, especially when the survey is addressing sensitive issues.
• Recruitment and training of interviewers is time consuming and expensive.
• Cost per interview is high.

Method: Mailed and e-mailed questionnaires

Advantages:
• Social desirability bias is minimized.
• Administrative costs and costs per respondent are significantly reduced.

Disadvantages:
• It may not be possible to determine the demographics and characteristics of non-respondents and/or reasons for refusal.
• Some responses may not be complete on returned questionnaires.
• The time taken to receive completed questionnaires may be long.

Method: Telephone and computerized telephone interviews

Advantages:
• It is possible to achieve high response rates.
• Interviewers are able to document characteristics of non-respondents and reasons for refusal.
• The amount of non-response to questionnaire items can be minimized.
• Results can be obtained quickly.
• Less costly than face-to-face interviews (but more expensive than mail surveys).

Disadvantages:
• It is sometimes difficult to reach a selected resident of a household.
• Complex questions make it difficult for respondents to retain the questions and response categories.

Self Assessment 3.2

In the exercise in 3.1, and with your appreciation of the pros and cons of the three

interview methods in Table 3.5, which of the methods will you use in your home district?

Briefly explain the factors you considered in your choice of the most appropriate

interview method.

Answer Tips

The methods you will use should be informed by how effective each of the

methods will be in your home-district.

3.1.2.4 Design the Questionnaire

Questionnaires should be designed to address the research objectives. In designing the

questionnaires, it is crucial to note that the quality and usefulness of the information collected

will depend on how the questions are worded. Vague questions will result in the collection of

less useful responses or cause non-responses. The researcher could be guided by the following

guidelines:


Language and Wording

Proper wording of the questions is essential. The questions should be simple and straightforward

to ensure that the respondent understands the questions correctly. Highly technical terms, slang,

abbreviations and words that may be considered insulting should be avoided. All the

questions should be available in the native language of the respondent.

Recall Bias

When formulating the questions, it is imperative to have in mind that people tend to forget

events. The longer the recall period, the less accurate the responses tend to be. Therefore, recall of

events should be assisted by adding aids to the questionnaire and by ordering the questions. For

example, holidays and national festivals can be used as reference points, or the respondents can use a calendar.

Order of the Questions

The order of the questions in the questionnaire is also important. A poorly organized

questionnaire may confuse the respondent, bias the responses, and affect the response rate as well

as the willingness to answer sensitive questions. The questionnaire should start with the easy

questions. When more difficult questions are placed at the end of the questionnaire and the

respondent stops answering, at least some data for the earlier questions will have been collected. Asking

the easy questions first may lead to the establishment of a healthy rapport and thus the

respondent may be willing to answer the difficult (or probing) questions.

Length of the Questionnaire

The length of the questionnaire affects the response rate as well as reliability of the data. With

long questionnaires, the respondents often get careless towards the end of the interview which

will affect the reliability of the responses. A short questionnaire increases the response rate but

may lack important questions for the indicators. The ideal length for a self-administered

questionnaire is 15 minutes and for the face-to-face interview 30 minutes. Sometimes, longer


questionnaires cannot be avoided. In such cases, the interviews could be conducted in phases so that the

responses will not suffer.

Types of Questions

The following types of questions are used in questionnaires:

a) Structured Questions

These are questions that are followed by a list of possible alternative responses from which

respondents select the response that best describes their situation. It is impossible to list all possible

alternative responses so what is normally done is to provide space for responses that were not

mentioned in the list. In such cases it is customary to provide additional space for a response

labelled “others {specify}”. This takes care of all other responses, which do not fit in the list of

alternative responses provided. (Look at the examples provided as an attachment.)

Merits of Structured questions

They are easier to analyse since the responses are in an immediately usable form.

They are economical to use in terms of time and money.

They are easier to administer because of the alternative answers provided.

Demerits of Structured questions

They provide limited responses or responses provided may fall short of the responses that

respondents may provide.

Respondents are compelled to answer questions according to the researcher’s choices.

More difficult to construct because one needs to carefully think through the categories of

response to provide.

b) Unstructured Questions

Unstructured or open-ended questions are those that are left open for respondents to provide

answers. Respondents have the freedom to provide answers that they think are appropriate


irrespective of what the researcher thinks. Individuals respond in their own ways and the length

of the response is determined by the kind of space provided for the response. E.g. where a little

space is provided, a short response is provided too. The reverse is also true. An instrument can

have both open-ended and closed-ended questions, depending on the objective behind each question.

Merits of open-ended questions

Open-ended questions tend to stimulate respondents to think about their

feelings or motives and to express what they consider most appropriate or most

important.

Responses express the respondents’ feelings about a particular issue.

Responses say a lot about the respondents in terms of their background, hidden motivation,

decisions and interests.

Open-ended questions are easier and simpler to formulate.

They allow for a greater depth of response.

Demerits of open-ended questions

This approach has a tendency of allowing people to provide irrelevant information or

information which does not answer the questions or objectives.

It could be time consuming. What is the implication of this?

It could be difficult to categorise such responses and hence difficult to analyse

quantitatively. (Where information cannot be categorized, it is better to include it in the

narrative to be sure it does not get lost. Some open-ended questions are good for

qualitative purposes.)

c) Contingency questions

Where certain questions are only applicable to certain groups, they are followed with other

questions, which are referred to as contingency questions. Follow-up questions are required to

get further information from the relevant sub-groups. Thus subsequent questions asked after the

initial question are called contingency questions or filter questions. They are used to probe for

more information.


An example of contingency questions is as follows:

3. Have you eaten today?

Responses:

Yes (if yes, please move on to question no. 4)

No (if no, please move to question no. 5)

4. If Yes, explain. ………………………………………………………………………
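The skip instructions in the example above can also be written down as a small routing rule, which is useful when the questionnaire is administered electronically. The sketch below is only an illustration: the function name and the rule table are assumptions, and only the routing for question 3 is taken from the example.

```python
def next_question(current, answer):
    """Return the number of the question to ask after `current`.

    The rule table is hypothetical; it encodes only the routing shown
    in the example: Yes on question 3 leads to 4, No leads to 5.
    """
    skip_rules = {(3, "yes"): 4, (3, "no"): 5}
    key = (current, answer.strip().lower())
    # Questions without a skip rule simply continue to the next number.
    return skip_rules.get(key, current + 1)


print(next_question(3, "Yes"))
print(next_question(3, "No"))
```

Keeping the rules in one table makes it easy to check, during the pilot test, that every filter question leads somewhere sensible.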

d) Matrix Questions

These are questions where the same set of responses is used to answer all the questions. Likert scales are

usually used for such responses, such as extremely satisfied, satisfied, neutral, dissatisfied and extremely

dissatisfied. An example of this is shown below:

Responses: 1 – Extremely satisfied

2 – satisfied

3 – neutral

4 – dissatisfied

5 – extremely dissatisfied

How satisfied are you with your research methodology lecturer who is not into

management or business administration?

How satisfied are you with the research methods lectures so far, as far as your

research work is concerned?

How satisfied are you with the length of time allocated for each lecture? Etc.
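Because every item in a matrix shares the same response codes, the answers can be tallied and compared item by item. The following is a minimal sketch under stated assumptions: the response data are invented for illustration, and the function name `summarise` is hypothetical. Only the five response codes come from the example above.

```python
from collections import Counter

# Response codes taken from the example scale above.
SCALE = {
    1: "extremely satisfied",
    2: "satisfied",
    3: "neutral",
    4: "dissatisfied",
    5: "extremely dissatisfied",
}


def summarise(responses):
    """Tally codes per item. `responses` maps an item to a list of codes 1..5."""
    summary = {}
    for item, codes in responses.items():
        counts = Counter(codes)
        summary[item] = {
            "counts": {label: counts.get(code, 0) for code, label in SCALE.items()},
            "mean": sum(codes) / len(codes),  # lower mean = more satisfied
        }
    return summary


# Hypothetical answers from four respondents to two matrix items.
responses = {"lecturer": [1, 2, 2, 3], "lecture length": [4, 5, 4, 4]}
summary = summarise(responses)
for item, result in summary.items():
    print(item, result["mean"], result["counts"])
```

Treating the codes as numbers and averaging them is a common convenience, but it assumes the steps of the scale are evenly spaced, which may not hold for every Likert item.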

Merits of Matrix Questions

Space is used efficiently

It is easier to complete questions presented in a matrix form.


They rarely put off respondents

Easy to compare responses given to different items.

It facilitates easy determination of a trend in the response.

Demerits of Matrix Questions

It is often overused because it is so easy to construct and to answer.

It can encourage a fixed pattern of responses from respondents who have made up their minds not

to give considered answers.

Table 3.6: Advantages and Disadvantages of Open-ended and Closed Questions

Type of question: Open-ended

Advantages:

Elicit “rich” qualitative data

Encourage thought and freedom of expression

Disadvantages:

May discourage responses from less literate respondents

Take longer to answer and may put some people off

Are more difficult to analyze, and responses can be misinterpreted

Type of question: Closed

Advantages:

Elicit quantitative data

Are easy for all literacy levels to respond to

Are quick to answer and may improve your response rate

Are easy to ‘code’ and analyze

Disadvantages:

Can encourage ‘mindless’ replies

Can suggest ideas that the respondent would not otherwise have

Respondents with no opinion or no knowledge can answer anyway

Respondents can be frustrated because their desired answer is not a choice

It is confusing if many response choices are offered

Misinterpretation of a question can go unnoticed

Marking the wrong response is possible

They force respondents to give simplistic responses to complex issues


Self Assessment 3.3

The purpose of this exercise is to enable students to understand the rules for writing questions in

surveys. Offer reasons why the following rules must be observed in drafting survey questions:

Avoid leading questions: E.g. “Wouldn’t you say that…”, “Isn’t it fair to say…”

Be specific. Avoid words like “regularly”, “often”, or “locally”.

Avoid jargon and colloquialisms.

Avoid double-barreled questions. E.g. “Will you like to use charcoal and LPG?”

Avoid double negatives. E.g. “Smoking in public places should not be abolished”.

Explain the rationale for asking very personal and probing questions.

Ensure options are mutually exclusive – e.g. “How many years have you worked in

academia: 0-5, 6-10, 11-15, over 15.” Not, “0-5, 5-10, 10-15”.

Answer Tips

Consider the kind of responses you will elicit from your respondents if the above

errors are not avoided.

3.1.2.5 Pilot Test the Questionnaires

A pilot test is an evaluation of the specific questions, format, question sequence and instructions

prior to use in the main survey. Pilot testing is a crucial step in avoiding costly errors. The pilot

testing of survey instruments helps to:

Ascertain whether each question addresses the research questions;

Know if the questions are interpreted in a similar vein by interviewers/enumerators and

respondents;

Identify if options provided for close-ended questions are exhaustive. That is, they

address the views of all respondents;

Assess clarity and understandability of the questions;


Evaluate whether the instrument takes too long to administer;

Ascertain if the questions are obtaining responses for all the different response categories

or if the responses are all the same;

Have a fair knowledge of the kind of reactions to expect from respondents in order to

prepare to meet them in the main survey.

3.1.2.6 Preparation of the Sample

Sampling is used to cut cost and effort while still obtaining information from a representative

sample of the target population. What is essential here is that the researcher should ensure that

the number of individuals participating in the survey is representative of the target group. The

main questions in selecting your sampling design are:

How many will be included (the sample size)?

How will the survey respondents be selected?

Determining the Sample Size

Below are some relevant questions to consider in deciding on the sample size:

What is the size of your target population?

What can the budget allow?

How confident do you need to be with the results?

Do you need to look at any subgroups?

Deciding on the sample size is primarily driven by the budget and the size of the subgroups

the researcher wishes to analyze. The researcher has to ensure that he has sampled enough people

to obtain an adequate number of respondents in his subgroups so he can accurately draw

conclusions about that group. If the target population is very small (say less than 100), the

researcher should consider doing a census (i.e. complete enumeration). However, if the target

population is very large (e.g. in the millions), the researcher will not meaningfully improve the accuracy of his

results by interviewing more and more people, while covering everyone would be prohibitively expensive.

Sample sizes are often determined statistically at a chosen confidence level. The Miller and Brewer

(2003) model can be a useful tool for determining the sample size.


Formula:

$$n = \frac{N}{1 + N\alpha^{2}}$$

where n is the sample size, N is the sampling frame (total number of units in the target population) and α is the margin of error (e.g. 0.05 for a 95% confidence level).

Working Example 1:

It has been observed that the performance of students in examinations has been declining over

the years. In a school with a population of 10 000 students, a researcher wants to know the

causes of the poor performance. How many students will you sample if your budget and time

will not permit a census?

Solution:

The question the researcher needs to answer first is the error he is ready to accept. If settled, then

he can go ahead and determine the sample size. Assuming the researcher wants the error margin

to be 5% the sample size can be determined as follows:

Student population = 10 000

$$n = \frac{N}{1 + N\alpha^{2}} = \frac{10\,000}{1 + 10\,000\,(0.05)^{2}} = \frac{10\,000}{26} \approx 384.6$$

Thus, approximately 385 students would be required for the survey at the 95% confidence level.
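The same calculation can be repeated for other population sizes and error margins with a few lines of code. This is a minimal sketch of the formula used in the working example; the function name `sample_size` and the choice to round up to the next whole respondent are assumptions made for illustration.

```python
import math


def sample_size(population, margin_of_error):
    """Return n = N / (1 + N * alpha^2), rounded up to a whole respondent."""
    if population <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("population must be positive and margin_of_error between 0 and 1")
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)  # assumption: always round up, never down


# Working Example 1: 10 000 students, 5% error margin.
print(sample_size(10_000, 0.05))  # 385
```

Trying a few larger values of N shows that the result levels off just below 400 at a 5% error margin, which is why very large populations do not call for proportionally larger samples.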

Characteristics of a Good Sample Design

Sample design must result in a truly representative sample. A representative sample is a

segment of a population being studied chosen because it is as representative as possible

of the population from which it is drawn.

It must result in a small sampling error.

It must be viable in the context of funds available for the research study.

It must be such that systematic bias can be controlled in a better way.


It should be such that the result of the sample study can be applied in general, for the

universe, with a reasonable level of confidence

Sampling (i.e. how will the survey respondents be selected?)

After the sample size has been determined, the next question to address is how to select the units

for the sample. Should it be random or follow other approaches? Even if the selection is by a

random approach, how is it operationalised? There are several approaches available for

selecting respondents. These approaches fall into two categories, namely probability and

non-probability sampling techniques.

Probability sampling requires that each member of the defined target population has a known,

and non-zero, chance of being included in the sample. It is not possible to determine whether a

non-probability sample is likely to provide very accurate or very inaccurate estimates of

population parameters. Consequently, these types of samples are not appropriate for dealing

objectively with issues concerning either the estimation of population parameters or the testing

of hypotheses.

The use of non-probability samples is sometimes justified on the grounds that estimates derived from the

sample may be linked to some hypothetical universe of elements rather than to a real population.

In some circumstances, probability sample design can be turned accidentally into a non-

probability sample design if subjective judgement is exercised at any stage during the execution

of the sample design.

Types of Probability Samples

There are many ways in which a probability sample may be drawn from a population. The

method that is most commonly described is simple random sampling. The others are

systematic, stratified, cluster and multi-stage sampling.

a. Random sampling


The first statistical sampling method is simple random sampling. In this method, each item in the

population has the same probability of being selected as part of the sample as any other item.

Random sampling can be done with or without replacement. If it is done without replacement, an

item is not returned to the population after it is selected and thus can only occur once in the

sample.

Having determined a sample size of 385 in example 1, the simple random sampling technique

will be operationalised by assigning numbers to all the 10 000 units and drawing them out from a

basket 385 times. This approach is simple random sampling without replacement.
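In practice the basket is usually replaced by a random number generator. The sketch below shows one way to draw 385 of the 10 000 numbered units without replacement using Python's standard library; the function name and the fixed seed are assumptions chosen so that the draw can be repeated.

```python
import random


def simple_random_sample(population_size, n, seed=None):
    """Draw n distinct unit numbers from 1..population_size without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(range(1, population_size + 1), n)


chosen = simple_random_sample(10_000, 385, seed=1)
print(len(chosen), len(set(chosen)))  # same count twice: no unit is repeated
```

Recording the seed alongside the sampling frame lets another researcher reproduce exactly the same selection.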

Advantages

The selection procedure ensures that every sampling unit of the population has an equal

and known (non zero) probability of being included in the sample.

Highly representative if all subjects participate; the ideal

Disadvantages

Not possible without complete list of population members; potentially uneconomical to

achieve; can be disruptive to isolate members from a group; time-scale may be too long,

data/sample could change

b. Systematic Sampling with a Random Start

This consists of selecting every Kth sampling unit of the population (K is called the sampling interval)

after the first sampling unit is selected at random from the first K units. That is, an

element of randomness is introduced into this kind of sampling by using a random number

between 1 and K to pick the unit with which to start. The sampling interval (K) is

determined by dividing the sampling frame (N) by the sample size (n).

E.g. With a sample size of 385 students, the sampling interval will be 10 000 / 385 ≈ 26.


Assuming the random number selected is 7, the next number to be selected will be 33 (i.e. 7 +

26), the next number will be 59 (i.e. 33 + 26). This process is continued till the sample size of 385

is reached.
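The procedure can be sketched in code as follows. The function name is hypothetical, and rounding N / n to the nearest whole number to obtain K is an assumption made for illustration. With a start of 7 and K = 26, the selected positions run 7, 33, 59 and so on.

```python
import random


def systematic_sample(population_size, n, start=None, seed=None):
    """Select every Kth unit after a random start between 1 and K."""
    k = round(population_size / n)  # sampling interval K (assumption: nearest integer)
    if start is None:
        start = random.Random(seed).randint(1, k)
    units = list(range(start, population_size + 1, k))
    return k, units[:n]


k, units = systematic_sample(10_000, 385, start=7)
print(k, units[:3], len(units))
```

Because K is rounded, a late random start can leave the list one unit short of n; the researcher should decide beforehand how such a shortfall will be handled.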

Advantages

Systematic sampling is more convenient than simple random sampling, especially when

interviewers are untrained in sampling techniques, since they can be instructed to select every

Kth person.

It is more convenient for use with very large populations or when large samples are to

be selected.

It is an easier and less costly method of sampling.

Each sampling unit in the population has a 1/K probability of being included in the

sample.

Disadvantages

It can prove inefficient if the ordering of the list is defective, as the

sample depends solely upon the random starting position. In practice, this method can be

used when lists of the population are available and are of considerable length.

The system may interact with some hidden pattern in the population, e.g. every third

house along the street might always be the middle one of a terrace of three.

c. Stratified Sampling

The stratified sampling method is used when each subgroup within the

population needs to be represented in the sample. The first step in stratified sampling is to divide

the population into subgroups (strata) based on mutually exclusive criteria. Random or

systematic samples are then taken from each subgroup. The sampling fraction for each subgroup

may be taken in the same proportion as the subgroup has in the population. For example, a

person conducting a customer satisfaction survey could select random customers from each customer

type in proportion to the number of customers of that type in the population. Stratified sampling

can also sample an equal number of items from each subgroup.


Steps involved in stratified sampling:

Define the population;

Determine the desired sample size;

Identify the variable and subgroups (strata) for which you want to guarantee appropriate

representation (either proportional or equal);

Classify all members of the population as members of one of the identified subgroups; and

Randomly select (using a table of random numbers) an appropriate number of individuals

from each subgroup.
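The steps above can be sketched as a short routine for proportional allocation. Everything in the example data is hypothetical (a school of 10 000 students split into three levels), and a random number generator stands in for the table of random numbers.

```python
import random


def stratified_sample(strata, n, seed=None):
    """Proportional stratified sample. `strata` maps a stratum name to its members."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum gets a share of n equal to its share of the population.
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, min(share, len(members)))
    return sample


# Hypothetical strata: student numbers grouped by level.
strata = {
    "junior": list(range(0, 5000)),
    "middle": list(range(5000, 8000)),
    "senior": list(range(8000, 10000)),
}
picked = stratified_sample(strata, 385, seed=1)
print({name: len(members) for name, members in picked.items()})
```

Rounding each share separately can leave the total one or two units away from n for some populations, so the allocation should be checked before fieldwork begins.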

Advantages

Can ensure that specific groups are represented, even proportionally, in the sample(s)

(e.g., by gender), by selecting individuals from strata list

Disadvantages

More complex, requires greater effort than simple random; strata must be carefully

defined

d. Cluster Sampling

In cluster sampling, the population that is being sampled is divided into groups called clusters.

Instead of these subgroups being homogeneous based on a selected criterion as in stratified

sampling, each cluster should be as heterogeneous as possible so that it mirrors the population. A random sample

is then taken from within one or more selected clusters. Cluster sampling can tell us a lot about

that particular cluster, but unless the clusters are selected randomly and a lot of clusters are

sampled, generalizations cannot always be made about the entire population.

Steps:

Define the population

Determine the desired sample size

Identify and define a logical cluster

Obtain, or make a list of all clusters in the population


Estimate the average number of population members per cluster

Determine the number of clusters needed by dividing the sample size by the estimated

size of the cluster

Randomly select the needed number of clusters (using a table of random numbers)

Include in the sample all population members in selected cluster
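The steps above can be sketched as follows. The data are hypothetical (250 classes of 40 students each, standing in for the clusters), and the function name is an assumption made for illustration.

```python
import math
import random


def cluster_sample(clusters, n, seed=None):
    """Pick whole clusters at random. `clusters` maps a cluster name to its members."""
    rng = random.Random(seed)
    average_size = sum(len(m) for m in clusters.values()) / len(clusters)
    # Number of clusters needed = sample size / estimated cluster size, rounded up.
    needed = min(len(clusters), math.ceil(n / average_size))
    chosen = rng.sample(sorted(clusters), needed)
    # Every member of a selected cluster enters the sample.
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members


# Hypothetical school: 250 classes (clusters) of 40 students each.
clusters = {f"class_{i}": [f"s{i}_{j}" for j in range(40)] for i in range(250)}
chosen, members = cluster_sample(clusters, 385, seed=1)
print(len(chosen), len(members))  # 10 clusters, 400 students
```

Because whole clusters are taken, the final sample (400 here) usually overshoots the target of 385 slightly.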

Advantages:

Generating sampling frame for clusters is economical, and sampling frame is often

readily available at cluster level

Most economical form of sampling

Larger sample for a similar fixed cost

Less time for listing and implementation

Also suitable for survey of institutions

Disadvantages:

May not reflect the diversity of the community.

Other elements in the same cluster may share similar characteristics.

Provides less information per observation than an SRS of the same size (redundant

information: similar information from the others in the cluster).

Standard errors of the estimates are high, compared to other sampling designs with same

sample size

e. Multi-stage sampling

In many situations, there are natural divisions of the population into several different sizes of

units. For example, a forest management unit consists of several stands, each stand has several

cut blocks, and each cut block can be divided into plots. These divisions can be easily

accommodated in a survey through the use of multi-stage methods. Selection of units is done in

stages. For example, several stands could be selected from a management area; then several cut

blocks are selected in each of the chosen stands; then several plots are selected in each of the


chosen cut blocks. Note that in a multi-stage design, units at any stage are selected at random

only from those larger units selected in previous stages.

Example:

You have been asked to undertake a survey in a farming district in your home country. How

will you select respondents for interview?

Steps:

Note that not all the communities may be farming in the district. You need to

identify the farming communities. First stage.

The units to be sampled are the particular farming activities e.g. Food or Cash

Crop Production – Second stage.

The units to be sampled from the farming activities (food or cash crop) are the

farming households who undertake the particular farming activity considered. –

Third Stage.
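The three stages in the example can be sketched in code. The district data below are invented for illustration, as are the function name and the decision to pick one farming activity per selected community; a real survey would substitute its own lists and rules at each stage.

```python
import random


def multi_stage_sample(district, n_communities, n_households, seed=None):
    """Three-stage selection: farming communities, then an activity, then households."""
    rng = random.Random(seed)
    # Stage 1: only communities with some farming activity are eligible.
    farming = sorted(name for name, activities in district.items() if activities)
    selected = []
    for community in rng.sample(farming, min(n_communities, len(farming))):
        # Stage 2: choose a farming activity within the selected community.
        activity = rng.choice(sorted(district[community]))
        # Stage 3: choose households that undertake that activity.
        households = district[community][activity]
        for household in rng.sample(households, min(n_households, len(households))):
            selected.append((community, activity, household))
    return selected


# Hypothetical district: community -> activity -> farming households.
district = {
    "A": {"food": ["A-f1", "A-f2", "A-f3"], "cash": ["A-c1", "A-c2"]},
    "B": {"food": ["B-f1", "B-f2"]},
    "C": {},  # not a farming community
    "D": {"cash": ["D-c1", "D-c2", "D-c3"]},
}
sample = multi_stage_sample(district, n_communities=2, n_households=2, seed=1)
print(sample)
```

Note that households are drawn only from the communities and activities chosen at the earlier stages, which is the defining feature of a multi-stage design.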

Types of Non-probability Samples

There are several types of non-probability samples: convenience, purposive/judgment,

quota, snowball, etc. These approaches to sampling result in the elements in the target

population having an unknown chance of being selected into the sample. It is always wise to

treat research results arising from these types of sample design as suggesting statistical

characteristics about the population – rather than as providing population estimates with

specifiable confidence.

a. Convenience sampling

A sample of convenience is the terminology used to describe a sample in which elements have

been selected from the target population on the basis of their accessibility or convenience to the

researcher. Convenience samples are sometimes referred to as ‘accidental samples’ for the

reason that elements may be drawn into the sample simply because they just happen to be


situated, spatially or administratively, near to where the researcher is conducting the data

collection.

Advantages

Convenience sampling is very easy to carry out with few rules governing how the sample

should be collected.

The relative cost and time required to carry out a convenience sample are small in

comparison to probability sampling techniques. This enables the researcher to achieve the desired

sample size in a relatively fast and inexpensive way.

Disadvantages

Convenience sampling can lead to the under-representation or over-representation of

particular groups within the sample.

The ability to make generalizations is undermined if the interest group is under-

represented in the sample

b. Quota sampling

It is sometimes misleadingly referred to as ‘representative sampling’ because numbers of

elements are drawn from various target population strata in proportion to the size of these strata.

The population is stratified by important variables and the required quota is obtained from each

stratum.
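Filling a quota can be pictured as accepting people in the order they are encountered until each stratum has its required number. The sketch below illustrates this with invented data; the function name and the two strata are assumptions.

```python
def quota_sample(arrivals, quotas):
    """Accept people in arrival order until every stratum's quota is filled.

    `arrivals` is a sequence of (person, stratum) pairs;
    `quotas` maps each stratum to the number of respondents required.
    """
    remaining = dict(quotas)
    selected = []
    for person, stratum in arrivals:
        if remaining.get(stratum, 0) > 0:
            selected.append((person, stratum))
            remaining[stratum] -= 1
        if not any(remaining.values()):
            break  # all quotas are full
    return selected


# Hypothetical arrivals and quotas.
arrivals = [("p1", "male"), ("p2", "male"), ("p3", "male"), ("p4", "female"), ("p5", "female")]
print(quota_sample(arrivals, {"male": 2, "female": 1}))
```

Nothing in the routine is random: who gets in depends entirely on who turns up first, which is why the sampling error of a quota sample cannot be assessed.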

Advantages

Quick and cheap to organize

Disadvantages

Not as representative of the population as a whole as other sampling methods

Because the sample is non-random, it is impossible to assess the possible sampling error


c. Purposive Sampling

This is often referred to as judgment sampling. With this technique the researcher selects sampling

units subjectively in an attempt to obtain a sample that appears to be representative of the

population. That is, the chance that a particular sampling unit will be selected depends on the

subjective judgment of the researcher. The researcher’s selection may yield results favorable

to his/her point of view, so that the entire study is vitiated by bias. However,

in careful hands the technique can yield tolerably reliable results.

Advantages

Ensures balance of group sizes when multiple groups are to be selected

Disadvantages

Samples are not easily defensible as being representative of populations due to potential

subjectivity of researcher

d. Snowball sampling

Researchers use this sampling method if the population of interest is very rare or is limited to a

very small subgroup of the population. This type of sampling technique works like chain referral.

After observing the initial subject, the researcher asks for assistance from the subject to help

identify people with a similar trait of interest. The process of snowball sampling is much like

asking your subjects to nominate another person with the same trait as your next subject. The

researcher then observes the nominated subjects and continues in the same way until a

sufficient number of subjects has been obtained.

For example, if obtaining subjects for a study that wants to observe a rare disease, the researcher

may opt to use snowball sampling since it will be difficult to obtain subjects. It is also possible

that the patients with the same disease have a support group; being able to observe one of the

members as your initial subject will then lead you to more subjects for the study.
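The chain referral process can be sketched as follows. The referral network is invented for illustration, and the function name is an assumption; in the field, the nominations come from the subjects themselves.

```python
from collections import deque


def snowball_sample(referrals, seeds, target):
    """Follow nominations from the initial subjects until `target` subjects are found.

    `referrals` maps each person to the people he or she nominates.
    """
    selected = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(selected) < target:
        person = queue.popleft()
        selected.append(person)
        for nominee in referrals.get(person, []):
            if nominee not in seen:  # nobody is recruited twice
                seen.add(nominee)
                queue.append(nominee)
    return selected


# Hypothetical referral network starting from one initial subject "a".
referrals = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"]}
print(snowball_sample(referrals, ["a"], target=4))  # ['a', 'b', 'c', 'd']
```

The sample can never reach anyone who is not connected, directly or indirectly, to the initial subjects, which is one source of the bias associated with this technique.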


Advantages

The chain referral process allows the researcher to reach populations that are difficult to

sample when using other sampling methods.

The process is simple and cost-efficient.

This sampling technique needs little planning and a smaller workforce compared to

other sampling techniques.

Disadvantages

The researcher has little control over the sampling method. The subjects that the

researcher can obtain rely mainly on the previous subjects that were observed.

Representativeness of the sample is not guaranteed. The researcher has no idea of the true

distribution of the population and of the sample.

Sampling bias is also a fear of researchers when using this sampling technique. Initial

subjects tend to nominate people that they know well. Because of this, it is highly

possible that the subjects share the same traits and characteristics, thus, it is possible that

the sample that the researcher will obtain is only a small subgroup of the entire

population.

The advantages and disadvantages of the various sampling techniques are summarised in Table

3.7.

Table 3.7: Sampling techniques: Advantages and disadvantages

Technique: Simple random

Brief description: Random sample from the whole population

Advantages: Highly representative if all subjects participate; the ideal

Disadvantages: Not possible without a complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be too long, data/sample could change

Technique: Stratified random

Brief description: Random sample from identifiable groups (strata), subgroups, etc.

Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata lists

Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined

Technique: Cluster

Brief description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units

Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members

Disadvantages: Clusters in a level must be equivalent and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ)

Technique: Purposive

Brief description: Hand-pick subjects on the basis of specific characteristics

Advantages: Ensures balance of group sizes when multiple groups are to be selected

Disadvantages: Samples are not easily defensible as being representative of populations due to potential subjectivity of researcher

Technique: Quota

Brief description: Select individuals as they come to fill a quota by characteristics proportional to populations

Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics

Disadvantages: Not possible to prove that the sample is representative of the designated population

Technique: Snowball

Brief description: Subjects with desired traits or characteristics give names of further appropriate subjects

Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals)

Disadvantages: No way of knowing whether the sample is representative of the population

Self Assessment 3.4

As the Senior Research Officer of your organisation, you have been tasked to conduct a

household survey in a large town whose households fall into three income categories,


namely high income, middle income and low income earners. Determined to ensure that

the sample size takes care of the diversity in the target population, what sampling

technique will you use to select units and why?

3.1.2.7 Train Interviewers for Telephone and Intercept Surveys

Training interviewers involves providing them with the skills needed to undertake successful

interviewing. Having trained interviewers is imperative as the interviewer is the interface

between your organization and the respondents. Interviewers have a tremendous amount of

influence on the quality of the research. A good interviewer can make all the difference in the

world to the usefulness of the data collected.

What makes a Great Interviewer?

A great interviewer follows a few simple guidelines which ensure detailed, accurate, and

unbiased data.

Read the Questions as Written

Do Not Suggest Responses

Clarify Responses

Probe for Responses

Record Information Neatly and Thoroughly

Maintain Strict Confidentiality

Be Polite and Professional

3.1.2.8 Data Collection

This step describes how the information is collected for the different survey methods. This is an

important step that must be done right in order to ensure the integrity of the information

collected. The following procedures are to be observed in data collection:

Face-to-face Interviews

Select location(s) to conduct interviews: The most appropriate location to conduct a face

to face interview is a place that members of your population frequent and where they are

comfortable participating.

If you are randomly selecting respondents for a face to face intercept interview it is

important to utilize more than one location in order to ensure a better representation of

the population.

Train interviewers in how to conduct a structured questionnaire face to face and how to

intercept respondents if they are doing intercept interviews. It is quite difficult to ensure

the interviewers randomly select people to participate in intercept interviews. Interviewer

and respondent biases may influence the people who are selected to participate and those

who agree to. Interviewers should follow a standardized and systematic approach to

selecting people who pass by to be interviewed.

If you require a particular group for your survey you may have to develop a questionnaire

screener which would be used to find eligible respondents. A questionnaire screener is a

series of one or two questions (usually demographics like age or family status) which

help you to identify people who are in your target population before doing a full

interview. If a person is not eligible the interview is ended after the screening questions.
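
One way to make intercept selection standardized and systematic is to approach every k-th person who passes a fixed point and then apply the screener. The sketch below is illustrative only: the interval `k`, the starting position, the function names and the age-based screener rule are all assumptions made for the example and are not prescribed by this course material.

```python
from itertools import islice


def select_every_kth(passers_by, k, start=0):
    """Systematic selection: take every k-th passer-by, beginning at index `start`."""
    return list(islice(passers_by, start, None, k))


def passes_screener(person, min_age=18):
    """Hypothetical one-question screener: only adults are eligible."""
    return person["age"] >= min_age


# Hypothetical stream of people passing the interview location.
ages = [34, 16, 52, 27, 15, 41, 19, 70, 12, 33]
passers_by = [{"id": i, "age": age} for i, age in enumerate(ages)]

# Approach every 3rd person, starting with the second person to pass.
approached = select_every_kth(passers_by, k=3, start=1)

# The interview continues only for people who pass the screener.
eligible = [p for p in approached if passes_screener(p)]
```

Because the rule decides who is approached, the interviewer's own preferences play no part in the choice. In practice the starting position would normally be chosen at random.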

Using Telephone Surveys

It is important to supervise interviewers when they are calling respondents to monitor

whether they are following the interviewing protocol.

It is important to verify a sample of completed interviews by calling back some of the

respondents to confirm that they did complete the interview.

Do not distribute your sample to interviewers all at once; give each interviewer chunks of

sample as needed.

If you require a particular group for your survey you may have to develop a questionnaire

screener which would be used to find eligible respondents. A questionnaire screener is a

series of one or two questions (usually demographics like age or family status) which

help you to identify people who are in your target population before doing a full

interview. If a person is not eligible the interview is ended after the screening questions.

Using Mail Surveys

Send out the first mailing (usually results in a 40% response)

Send a reminder card 10 days after the 1st mailing to thank those participants who have

already responded and to remind those who have not of the importance of the study. The

card should also indicate where people can obtain another copy of the questionnaire if

they have mislaid their original copy.

Three to four weeks later, send a second mailing emphasizing the importance of receiving

responses. Also include a new questionnaire and return envelope.

The covering letter is one of the most important aspects of a mailed questionnaire. It will

determine whether the recipient reads the survey and the attitude with which respondents

complete the questionnaire.

The letter should explain why the study is important and why their responses are needed.
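
The mailing timetable described above can be worked out with simple date arithmetic. This is a minimal sketch: the function name and the example date are hypothetical, and it assumes that "three to four weeks later" is counted from the reminder card, which the text does not state explicitly.

```python
from datetime import date, timedelta


def mailing_schedule(first_mailing, weeks_to_second=3):
    """Return the dates of the three contacts in a mail survey.

    Assumption: the second mailing is counted from the reminder card.
    """
    reminder = first_mailing + timedelta(days=10)
    second_mailing = reminder + timedelta(weeks=weeks_to_second)
    return {
        "first_mailing": first_mailing,
        "reminder_card": reminder,
        "second_mailing": second_mailing,
    }


# Hypothetical start date for the first mailing.
schedule = mailing_schedule(date(2012, 3, 1))
```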

3.1.2.9 Processing the Data

Processing the data involves preparing and translating the data for analysis. It involves taking the

completed questionnaires and putting them into a format that can be summarized and interpreted.

There are many errors that can be made during this step and it is essential that the quality of the

data is preserved.

Coding

The following are the steps involved in coding respondents’ answers to your questionnaire:

Familiarize yourself with the questionnaire and topic area.

Divide open ended questions into groups that can share a code list (not always possible).

For each question (or group) read through at least 15% of the questionnaires writing

down all the unique responses (this is a rough code list).

When no new responses are found, rewrite codes and assign a number to each code

(master code list).

Write the corresponding code number(s) beside each open-ended question on each

questionnaire.

Repeat this for each open ended question.
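
The coding steps above can be sketched in a few lines. The example is a simplification under stated assumptions: the question, the answers, the function names and the catch-all code 99 are hypothetical, and responses are matched on their exact wording, whereas a human coder would group answers with the same meaning.

```python
def build_code_list(sample_responses):
    """Assign a code number to each unique response, in order of first appearance."""
    master = {}
    for response in sample_responses:
        key = response.strip().lower()
        if key not in master:
            master[key] = len(master) + 1
    return master


def code_responses(responses, master, other_code=99):
    """Replace each response with its code; responses not on the list get `other_code`."""
    return [master.get(r.strip().lower(), other_code) for r in responses]


# Hypothetical answers to one open-ended question.
responses = [
    "Too expensive", "Not available", "too expensive ",
    "Unreliable", "Not available", "No need",
]

# A subset of questionnaires is read first to draft the code list.
sample = responses[:4]
master = build_code_list(sample)

# The master code list is then applied to every questionnaire.
codes = code_responses(responses, master)
```

A large share of answers falling under the catch-all code would suggest that the code list was drafted from too few questionnaires.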

Data Entry

There are two common approaches to data entry:

Direct data entry. Interviewers complete the questionnaires, which are then coded and the data

entered into a computer for analysis.

Computer assisted telephone interviewing (CATI). Interviewers enter responses directly

into a computer and the questions requiring coding are entered at a different time.

Methods to Avoid Data Entry Errors

Data entry errors are minimized when the data is verified. Verification of 10% of the data

entered results in increased confidence in the accuracy of the data.

An additional means to reduce the incidence of data entry errors is to program your data

entry program to check each field for out-of-range data. When errors or inconsistencies

are identified, the ID number of the record is used to locate the questionnaire. The source

of the error is identified and the corrected data is entered.
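
An out-of-range check of the kind described above might look like the following sketch. The field names, the allowed ranges and the records are hypothetical; a real data entry program would take its ranges from the questionnaire's code book.

```python
# Hypothetical allowed range (low, high) for each field.
VALID_RANGES = {
    "age": (15, 99),
    "household_size": (1, 30),
    "income_group": (1, 3),  # 1 = low, 2 = middle, 3 = high
}


def find_out_of_range(records, valid_ranges=VALID_RANGES):
    """Return (record ID, field, value) for every missing or out-of-range value."""
    errors = []
    for record in records:
        for field, (low, high) in valid_ranges.items():
            value = record.get(field)
            if value is None or not (low <= value <= high):
                errors.append((record["id"], field, value))
    return errors


records = [
    {"id": 101, "age": 34, "household_size": 5, "income_group": 2},
    {"id": 102, "age": 340, "household_size": 4, "income_group": 1},  # keying error in age
    {"id": 103, "age": 28, "household_size": 6, "income_group": 4},   # invalid category
]

errors = find_out_of_range(records)
```

Each reported ID points back to a paper questionnaire, so the source of the error can be checked before the corrected value is entered.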

3.1.2.10 Analysis of Results

Once the data has been entered into your statistical package, the analyses required to answer your

research questions can be performed. Analyzing the survey results is done in order to answer the

original questions that were posed for the evaluation. It allows you to draw conclusions.

Analyzing the results is one of the most crucial steps in the process of ensuring useful findings

which accurately reflect the opinions and views of the participants involved and answer the

original questions.

Both quantitative and qualitative methods are employed for data analysis. The qualitative inquiries

capture areas where in-depth information is required for better understanding of issues. The qualitative

data will also serve as a means of triangulating data gathered through the quantitative approach and

providing in-depth explanation to some of the quantitative data. The quantitative analysis is good for

generalization and numbers. The analysis can be done using SPSS, STATA or any other statistical

software which will be discussed in Unit 4.

For most surveys simple descriptive statistics (frequencies, means, ranges, etc) may be all that is

needed to be able to interpret the results. This involves determining how many of the respondents

answered a particular way for each of the questions. More complex analyses may be required

when comparisons are needed between subgroups of the population or for measurements taken at

different times.
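
As an illustration of simple descriptive statistics, the sketch below computes frequencies, percentages, a mean and a range using only the Python standard library. The answers and ages are invented for the example; in practice the same figures would come from a statistical package.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers to one closed question and respondents' ages.
answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes"]
ages = [23, 35, 41, 29, 52, 35, 30]

# How many respondents answered each way, and as a percentage of all answers.
frequencies = Counter(answers)
percentages = {
    answer: round(100 * count / len(answers), 1)
    for answer, count in frequencies.items()
}

# Mean and range for a continuous variable.
mean_age = mean(ages)
age_range = (min(ages), max(ages))
```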

Statistical analysis aims to show that your results are not just due to chance or the ‘luck of the

draw’. It provides a way to determine the repeatability of any differences observed. If the same

outcome is found when a study is repeated over and over again, we really don’t need a statistical

analysis. Similarly when we study a ‘sample’ of the population, statistical analysis is used to help

us decide whether it is likely that these same differences would be found if we repeated the

experiment in multiple samples or in the entire population. Hypotheses can be tested with

common statistical tools such as t-tests (to compare results for continuous data) and the Z-test or

chi-square test (to compare results for categorical data).
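
To show what a test on categorical data involves, the sketch below computes the Pearson chi-square statistic for a small contingency table by hand. The counts and the variable names are hypothetical, and no continuity correction is applied. The result is compared with 3.841, the standard critical value for one degree of freedom at the 5% level. A statistical package would report an exact p-value instead.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = urban / rural households,
# columns = uses the technology / does not use it.
observed = [[30, 20],
            [15, 35]]

chi2 = chi_square_statistic(observed)

# Critical value of chi-square with 1 degree of freedom at the 5% level.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
significant = chi2 > CRITICAL_VALUE_DF1_ALPHA_05
```

A statistic above the critical value suggests that the difference between the two groups is unlikely to be due to chance alone.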

3.1.2.11 Interpret and Disseminate Results

The results of a survey should be reported back through written reports and/or

presentations. It is important to feed back the results of the survey to management, staff,

interested participants and other stakeholders in order to keep them informed and establish buy-

in for implementing any changes resulting from the survey.

Interpreting survey results

Survey results need to be interpreted within the context of the purpose of the project.

Keep the audience in mind when preparing the report. What do they need and want to know?

Consider the limitations of the survey (e.g. possible biases, and the validity, reliability

and generalizability of the results).

Presenting Results

It is easy to become overwhelmed with too much information so focus on the research

questions and only present the information which answers those questions.

Choose a format which will highlight the key result.

Keep it simple

Pictures are worth a thousand words

3.1.2.12 Take Action

Taking action refers to implementing the changes suggested by the results of your survey. It is

important to take action and implement changes in order to make improvements to the subjects

under study.

How to Decide which Actions to Take

Involve the stakeholders in interpreting and taking action on the results.

Revisit the original goals of data collection. The data should provide answers to the

original questions.

Write a list of recommended actions which address the outcomes of the survey.

Prioritize those changes which are most important and feasible to implement.

Set up an action plan to implement the recommended changes.

Implement the changes.

SESSION 3.2 CASE STUDY RESEARCH

3.2.1 Introduction to Case Study

A case study is an intensive study of a single unit for the purpose of understanding a larger class of

(similar) units. A case study is an in-depth investigation of an individual, group, institution or

phenomenon. Case studies are often based on the premise that locating one case is enough to draw a

conclusion about other cases, since a case can be taken as typical of other similar cases. A case being studied is

taken as an example of other similar things/situations.

As a means of overcoming the shortcomings of quantitative research studies, case study research is

often undertaken to obtain a holistic and in-depth investigation of social and behavioral problems

such as unemployment, poverty, drug addiction, governance, management and illiteracy.

Through case study methods, a researcher is able to go beyond the quantitative statistical results

and understand the behavioral conditions through the actor’s perspective. Whilst in quantitative

research certain peripheral but relevant information might be omitted and obscured, case study

research explains both the process and outcome of a phenomenon through complete observation,

reconstruction and analysis of the issue under study and thereby covers all relevant information.

A case study in true essence is the exploration, investigation or analysis of a contemporary

practical life phenomenon of a specific contextual scope, a small geographical area or a limited

population through detailed background examination of an event or condition.

3.2.2 Purpose of Case Studies

According to Singleton et al. (1993), the primary purpose is to determine factors and

relationships among the factors that have resulted in the behaviour under study. The investigation,

therefore, involves a detailed examination of a single subject, group or phenomenon.

3.2.3 Advantages of Case Study

There are a number of advantages in using case studies.

The examination of the data is most often conducted within the context of its use (Yin,

1984), that is, within the situation in which the activity takes place.

Variations in terms of intrinsic, instrumental and collective approaches to case studies

allow for both quantitative and qualitative analyses of the data.

The detailed qualitative accounts often produced in case studies not only help to explore

or describe the data in real-life environment, but also help to explain the complexities of

real-life situations which may not be captured through experimental or survey research.

3.2.4 Disadvantages of Case Study

Despite these advantages, case studies have received criticisms. There are primarily three types

of arguments against case study research.

Case studies are often accused of lack of rigor. Too many times, the case study

investigator has been sloppy, and has allowed equivocal evidence or biased views to

influence the direction of the findings and conclusions.

Case studies provide very little basis for scientific generalization since they use a small

number of subjects, some conducted with only one subject. The question commonly

raised is “How can you generalize from a single case?”

Case studies are often labeled as being too long, difficult to conduct and producing a

massive amount of documentation. In particular, case studies of ethnographic or

longitudinal nature can elicit a great deal of data over a period of time.

A common criticism of case study method is its dependency on a single case exploration

making it difficult to reach a generalizing conclusion.

3.2.5 Designing a Case Study

Case studies have been generally criticized for their lack of strength as a research tool. This makes

their design very important. Depending on the issue at hand, a single-case or a multiple-case design can be

adopted. In situations where there is no room for replication of a particular study or the case is rare,

uncommon and limited to a single occurrence, a single-case design can be adopted.

Although a single-case design is generally limited by its inability to provide generalizing conclusions, this

drawback can be overcome by triangulating the study with other methods to confirm the

validity of the process. A multiple-case design, on the other hand, can be adopted with real-life

events that show numerous sources of evidence through replication rather than sampling logic to

enhance and support previous results. This helps raise the level of confidence in the strength of

the method adopted. For instance, whilst a study on the psychological impacts of the 1983

drought on children may be difficult to replicate and is hence appropriate for a single case study,

the assessment of the sensing ability of deaf children is replicable and hence more appropriate

for a multiple case study. The design of a case study is therefore very important. A case study

method must be able to prove, through interviews or journal entries, that:

It is the only viable method to elicit implicit and explicit data from the subjects

It is appropriate to the research question

It follows the set of procedures with proper application

The scientific conventions used in social sciences are strictly followed

A ‘chain of evidence’, either quantitative or qualitative, is systematically recorded

and archived, particularly when interviews and direct observation by the researcher are the

main sources of data

The case study is linked to a theoretical framework.

3.2.6 Categories of Case Study

Though there are several types of case studies, the prominent ones are explained below:

3.2.6.1 Explorative case studies

These are intended to investigate information which serves as a point of interest to the

researcher. In this category of case study, owing to the relative novelty of the subject matter,

prior fieldwork and small-scale data collection need to be conducted before the research

question and hypotheses are proposed, to help prepare the framework of the study. An example of

an explorative study is a pilot study.

3.2.6.2 Descriptive case studies

Descriptive case studies set out to describe the natural phenomena which occur within the

data in question. The aim of the researcher is to describe or narrate the data in their original state.

The challenge of a descriptive case study is that the researcher must begin with a descriptive

theory to support the description of the phenomenon or story. If this fails there is the likelihood

that the description lacks thoroughness and that problems may arise during the project.

3.2.6.3 Explanatory case studies

Explanatory case studies examine the data both at a surface and deep level in order to explain the

phenomena in the data. On the foundation of the data, the researcher then forms a theory and sets out

to test it. Furthermore, explanatory cases are also deployed for causal studies where pattern-matching

can be used to investigate certain phenomena in very complex and multivariate cases.

The complex and multivariate cases can be explained by three rival theories: a knowledge-driven

theory, a problem-solving theory, and a social-interaction theory.

The knowledge-driven theory stipulates that eventual commercial products are the results of

ideas and discoveries from basic research. A similar notion applies to the problem-solving

theory. In this theory, however, products are derived from external sources rather than from

research. The social-interaction theory, on the other hand, suggests that overlapping professional

networks cause researchers and users to communicate frequently with each other.

3.2.6.4 Interpretative and Evaluative case studies

Through interpretive case studies, the researcher aims to interpret the data by developing

conceptual categories, supporting or challenging the assumptions made regarding them. In

evaluative case studies, the researcher goes further by adding their judgment to the phenomena

found in the data.

3.2.7 Techniques for selection of cases in Case Study Research

Case selection in case study research has objectives similar to those of random sampling. In case

selection, a researcher desires a representative sample which has useful variation on the

dimensions of theoretical interest. One’s choice of cases is therefore driven by the way a case is

situated along these dimensions within the population of interest. That is, how the case fits into

the theoretically specified population. The following steps are useful for the selection of cases:

Cases should be selected in the same way as the topic of an experiment is selected;

The preliminary theory developed is used as a template with which to compare the

characteristics and empirical findings from the case(s); and

Selected cases should reflect characteristics and problems identified in the underlying

theoretical propositions/conceptual framework.

SESSION 3.3 OTHER TYPES OF RESEARCH DESIGN

3.3.1 Observational Research

Observational study involves observing a phenomenon. For example, instead of asking how the

Black Stars are likely to perform in the World Cup in Germany, you may observe them playing

prior to the trip to Germany. Observational research is also guided by clearly defined hypotheses

or objectives to make the research objective. The observations should be systematic rather than

opportunistic and disorderly.

3.3.1.1 Purpose of Observational Research

It is used to collect objective information. The information is said to be objective because

the researcher observes the behaviour rather than depending on the self-report as the

basic source of the information.

This method avoids the limitations associated with the survey research.

3.3.1.2 Steps in carrying out Observational Research

Selection and definition of the problem

Sample selection

Define observational variables (this is an important step in the research and what is

observed is determined by hypothesis and objectives)

Record observational information (there are four ways of doing this: duration recording,

frequency count recording, interval recording and continuous observation)
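
Three of the recording methods named above can be derived from the same log of observed episodes, as the sketch below shows. The episode times, the 60-second session and the 10-second intervals are hypothetical. The interval record uses one common convention (an interval is marked if the behaviour occurred at any point within it), which is an assumption rather than a rule given in this material. Continuous observation corresponds to keeping the full log itself.

```python
def frequency_count(episodes):
    """Frequency count recording: how many times the behaviour occurred."""
    return len(episodes)


def total_duration(episodes):
    """Duration recording: total time (in seconds) the behaviour lasted."""
    return sum(end - start for start, end in episodes)


def interval_record(episodes, session_length, interval):
    """Interval recording: True for each interval in which the behaviour occurred at all."""
    record = []
    for low in range(0, session_length, interval):
        high = low + interval
        occurred = any(start < high and end > low for start, end in episodes)
        record.append(occurred)
    return record


# Hypothetical log: (start, end) of each episode, in seconds from the start of the session.
episodes = [(5, 12), (30, 34), (48, 55)]

count = frequency_count(episodes)
duration = total_duration(episodes)
intervals = interval_record(episodes, session_length=60, interval=10)
```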

3.3.1.3 Types of Observational Research

Non-participant observation

Naturalistic observation

Simulation observation

Participant observation

Participatory Rural Appraisal/Action

3.3.1.4 Limitations of Observational Research

There is a high tendency to infringe on participants’ rights by observing people without

their knowledge and recording conversations with concealed recording devices.

There is a problem of the impact of the observer’s participation on the situation and the

subjects.

It could be very biased.

3.3.2 Ethnographic Research (Ethnography)

This is a method that involves very intensive data collection. The data on many variables are

collected over an extended period of time in a natural setting. The use of this method is based on

the belief that behaviour is greatly influenced by the environment in which it occurs.

Ethnographers do not study individuals outside the context in which the behaviour occurs. The key

characteristic of ethnographic research is that the researcher (now the observer) goes through a

continuous process of observation, trying to record everything that occurs in the area being

studied making very lengthy notes of what is observed.

3.3.2.1 Steps in carrying out ethnographic studies

The ethnographer uses a variety of data collection strategies in conjunction with

observation. This involves non-participant observation, participant observation or both.

Like all other research, define the research problem

Determine the research hypothesis

Plan the research

Decide on appropriate setting to conduct the ethnographic research

Decide on the best level of participation

3.3.2.2 Advantages

Hypotheses or theories developed are grounded firmly in observational data gathered in a

naturalistic setting.

It provides a very vivid (lifelike) picture of the environment being studied.

The long period of study required in ethnographic research gives the research a

longitudinal perspective that cannot be achieved in many other types of research.

3.3.2.3 Disadvantages

Ethnographic research requires the skills of someone trained in observational techniques

to make results valid.

The outcome of the field data can easily be influenced by the observer’s bias.

Since the field reports are usually long handwritten notes, such field records are usually

difficult to quantify and interpret.

Ethnographic research goes on for a long period of time, which makes it very expensive.

A lot of time is first devoted to understanding the environment where the

study will be carried out, long before the study takes place, which adds to the expense.

The observer is forced to become an active participant in the society/environment being

studied, which could lead to role conflicts (e.g. one can easily forget the role he/she is

expected to play and disclose his/her real self) and this could reduce the validity of data

being collected.

It requires an observer who is alert and a fast writer who can also write clearly.

3.3.3 Historical Research

Moore (1988) defines historical research as the study of a problem that requires collecting

information from the past. This type of research involves understanding, studying and

experiencing past events. Historical research studies do not involve the use of instruments to

gather data from individuals as in survey research, but make use of existing data. Thus it is up to

the researcher to determine whether the data adequately cover the events in which he or she is

interested.

Historical research is also defined as “the discovery and analysis of records of previous events,

interpretation of trends in the attitudes or events of the past and generalizations from these past

events to help guide present or future behaviour. Historical research consists of locating,

integrating and evaluating evidence from physical relics, written records or documents in order

to establish facts or generalizations regarding past or present events, human characteristics or

other problems in question” (Compton and Hall, 1972). The historical researcher is interested in

understanding and analyzing the past. The search for evidence or facts is always guided by a

broad theory or interpretation relevant to the researcher’s interest and therefore the facts do not

speak for themselves.

Examples of historical sources of data, which could either be primary or secondary sources, include:

Official records which may include legal records, legal instruments such as contracts and

wills, court decisions, etc.

Eye witness accounts of events, which could be given orally or in written form.

Creative productions such as works of art, photographs, literature, museum pieces and

costumes.

Expressive documents, such as personal letters, life histories (from diaries or

autobiographies, etc.).

3.3.3.1 Purpose of Historical Research

Historical research aims at arriving at conclusions concerning causes, effects or trends of past

occurrences that may help explain present events, which could be used to anticipate future

events. Thus historical research is useful for understanding:

Histories of specific individuals

Histories of political systems

Histories of important events of a country, e.g. wars, etc.

Historical research also attempts to interpret ideas or events that had previously seemed

unrelated. It emphasizes old data or merges old data with new historical facts that others have

discovered. Historical research is also used to reinterpret past events that have been studied.

3.3.3.2 Steps in conducting historical research

Identify the research problem. The problem must be of historical significance and this

makes this step difficult.

Develop the research hypothesis or objectives that one wants to test.

Collect and classify research resource materials, determining facts by internal and external

criticism.

Organize facts into results.

Interpret data in terms of the stated hypothesis or theory. It is important to note that isolated facts

have no meaning and a mere listing of historical events is not research.

Synthesize and present the research in an organized form.

3.3.3.3 Limitations of historical research

Collecting historical data involves long and tedious hours of search through piles of

records, files, documents, etc.

Establishing the validity of the data (source and content) involves a dual process of

internal and external audit/criticism, which is also time-consuming. The limitations in

this research raise a lot of ethical issues.

3.3.4 Descriptive Research

Although some people dismiss descriptive research as ‘mere description’, good description is

fundamental to the research enterprise and it has added immeasurably to our knowledge of the

shape and nature of our society. Descriptive research encompasses much government sponsored

research including the population census, the collection of a wide range of social indicators and

economic information such as household expenditure patterns, time use studies, employment and

crime statistics and the like.

Descriptions can be concrete or abstract. A relatively concrete description might describe the

ethnic mix of a community, the changing age profile of a population or the gender mix of a

workplace. Alternatively the description might ask more abstract questions such as ‘Is the level

of social inequality increasing or declining?’, ‘How secular is society?’ or ‘How much poverty is

there in this community?’ Accurate descriptions of the level of unemployment or poverty have

historically played a key role in social policy reforms (Marsh, 1982). By demonstrating the

existence of social problems, competent description can challenge accepted assumptions about

the way things are and can provoke action.

Good description provokes the ‘why’ questions of explanatory research. If we detect greater

social polarization over the last 20 years (i.e. the rich are getting richer and the poor are getting

poorer) we are forced to ask ‘Why is this happening?’ But before asking ‘why?’ we must be sure

about the fact and dimensions of the phenomenon of increasing polarization. It is all very well to

develop elaborate theories as to why society might be more polarized now than in the recent past,

but if the basic premise is wrong (i.e. society is not becoming more polarized) then attempts to

explain a non-existent phenomenon are silly.

Of course description can degenerate to mindless fact gathering or what C.W. Mills (1959) called

‘abstracted empiricism’. There are plenty of examples of unfocused surveys and case studies that

report trivial information and fail to provoke any ‘why’ questions or provide any basis for

generalization. However, this is a function of inconsequential descriptions rather than an

indictment of descriptive research itself.

3.3.5 Explanatory Research and Research on Causality

Explanatory research focuses on ‘why’ questions. For example, it is one thing to describe the

crime rate in a country, to examine trends over time or to compare the rates in different

countries. It is quite a different thing to develop explanations about why the crime rate is as high

as it is, why some types of crime are increasing or why the rate is higher in some countries than in

others.

The way in which researchers develop research designs is fundamentally affected by whether the

research question is descriptive or explanatory. It affects what information is collected. For

example, if we want to explain why some people are more likely to be apprehended and

convicted of crimes we need to have hunches about why this is so. We may have many possibly

incompatible hunches and will need to collect information that enables us to see which hunches

work best empirically. Answering the ‘why’ questions involves developing causal explanations.

Causal explanations argue that phenomenon Y (e.g. income level) is affected by factor X (e.g.

gender). Some causal explanations will be simple while others will be more complex. For

example, we might argue that there is a direct effect of gender on income (i.e. simple gender

discrimination). People often confuse correlation with causation. Simply because one event

follows another, or two factors co-vary, does not mean that one causes the other. The link

between two events may be coincidental rather than causal.

There is a correlation between the number of fire engines at a fire and the amount of damage

caused by the fire (the more fire engines the more damage). Is it therefore reasonable to conclude

that the number of fire engines causes the amount of damage? Clearly the number of fire engines

and the amount of damage will both be due to some third factor - such as the seriousness of the

fire.
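
The fire engine example can be reproduced with a small simulation. This is a sketch on invented data: the severity scale, the way engines and damage depend on severity, and the random seed are all assumptions chosen for illustration. Across all fires, the number of engines and the damage should be strongly correlated. Among fires of the same seriousness, the correlation should be close to zero, because the engines themselves add nothing to the damage.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


random.seed(1)  # fixed seed so the simulation is repeatable
fires = []
for _ in range(5000):
    severity = random.randint(1, 5)               # the third factor: seriousness of the fire
    engines = severity + random.randint(0, 1)     # serious fires attract more engines
    damage = 10 * severity + random.gauss(0, 2)   # serious fires cause more damage
    fires.append((severity, engines, damage))

# Correlation between engines and damage across all fires.
overall_r = pearson([f[1] for f in fires], [f[2] for f in fires])

# The same correlation among fires of one level of seriousness only.
same_severity = [f for f in fires if f[0] == 3]
within_r = pearson([f[1] for f in same_severity], [f[2] for f in same_severity])
```

Holding the third factor constant in this way is the basic idea behind controlling for a confounding variable.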

Confusing causation with correlation also confuses prediction with causation and prediction with

explanation. Where two events or characteristics are correlated we can predict one from the

other. Knowing the type of school attended improves our capacity to predict academic

achievement. But this does not mean that the school type affects academic achievement.

Predicting performance on the basis of school type does not tell us why private school students

do better. Good prediction does not depend on causal relationships. Nor does the ability to

predict accurately demonstrate anything about causality.

3.3.6 Comparative Research Design

This design entails studying two contrasting cases using more or less identical methods. It embodies the logic of comparison, in that it implies that the researcher can understand social phenomena better when they are compared in relation to two or more meaningfully contrasting cases or situations. The key to the comparative design is its ability to allow the distinguishing characteristics of two or more cases to act as a springboard for theoretical reflections about contrasting findings.

3.3.7 Longitudinal Design

This is a distinct form of research design because of the time and cost involved, and it is relatively little used in social research. In a longitudinal research design, data are collected on a sample (of people, documents, etc.) on at least two occasions.

Two types of Longitudinal Design

The Panel Study

With this type, a sample, often randomly selected, is the focus of data collection on at least two (and often more) occasions. The data may be collected from different types of cases within a panel study framework: people, households, organizations, schools, etc.

The Cohort Study

The study selects an entire cohort of people, or a randomly selected sample of them, as the focus of data collection. The cohort is made up of people who share a certain characteristic, such as all being born in the same week, or a certain experience, such as being unemployed or getting married on a certain day in the same week.

The panel and cohort studies share similar features:

• They share a similar design structure, i.e. the data are collected in at least two waves on the same variables from the same people.

• They are both concerned with illuminating social change and improving the understanding of causal influence over time. Because the same cases are observed at different points in time, longitudinal designs are somewhat better able to deal with the problem of ambiguity about the direction of influence.

3.3.8 Experimental Design

An experimental research design is one that rules out alternative explanations of the findings deriving from it (i.e. it possesses internal validity) by having at least:

• an experimental group, which is exposed to the treatment, and a control group, which is not exposed to the treatment; and

• random assignment to the two groups.
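
Random assignment can be sketched in a few lines of code. This is a minimal illustration, not a prescribed procedure: the function name `assign_groups`, the participant codes and the seed are hypothetical, and real studies may use blocked or stratified schemes instead.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into experimental and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical subject IDs
experimental, control = assign_groups(participants, seed=42)
print("experimental:", sorted(experimental))
print("control:     ", sorted(control))
```

Because chance alone decides who receives the treatment, pre-existing differences between subjects are not systematically related to group membership.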

3.3.8.1 Advantages

• Experiments enable researchers to exert a great deal of control over extrinsic and intrinsic variables, strengthening the validity of causal inferences (internal validity).

• Experiments enable researchers to control the introduction of the independent variable so that they may determine the direction of causation.

3.3.8.2 Disadvantages

• External validity is weak because an experimental design does not allow researchers to replicate real-life social situations.

• Researchers must often rely on volunteer or self-selected subjects for their samples. The sample may therefore not be representative of the population of interest, preventing researchers from generalizing to the population and limiting the scope of their findings.

A true experiment is often used as a yardstick against which non-experimental research is assessed. Experimental research is frequently held up as a touchstone because it engenders considerable confidence in the robustness and trustworthiness of causal findings. That is, a true experiment tends to be very strong in terms of internal validity.

SESSION 3.4 RESEARCH ETHICS

3.4.1 Why Research Ethics

Concern about the ethics of conducting social science research has grown over the years. Research ethics has to do with the rights and welfare of those being researched as well as the obligations of the researcher. The purpose of research, as we have been saying, is to contribute to knowledge. Unfortunately, carrying out research can violate the rights and welfare of those being researched, and ethical codes have been developed to protect the interests of these people. Each stage in the research process has ethical implications.

3.4.2 Balancing Costs and Benefits in Research

Social scientists are confronted with two competing rights: the right of the researcher to conduct research in search of new knowledge, and the rights of the person providing the information. Not conducting the research for fear of infringing on the rights of the research participants would be unfair to the researcher, since it blocks the chances of gaining new knowledge. Conducting research that abuses the rights of the individuals being researched would also be unfair. This may be the case in research that employs deception because deception provides methodological and practical advantages. Social scientists therefore often find themselves in an ethical dilemma.

There are no absolute answers to this conflict, but it is important to be aware of it and to guard against it as much as possible, or to be able to manage it. The values people attach to the benefits or costs of conducting research depend on many factors, including background, culture, experience and convictions. Some of the costs that the researcher may impose on the researched are affronts to the dignity of the individual, embarrassment, loss of trust in social relations, and loss of self-esteem or self-confidence. For the researcher, the gains could be developing more theory about the hidden agenda of people, potential advances in applied knowledge, etc. For the researched, the gains could be monetary benefits, satisfaction in contributing to knowledge, etc. All ethical decisions have to be made individually.

3.4.3 Informed Consent

It is important to inform the people to be researched about the research ahead of time and to seek their consent. This is especially important where those to be researched are exposed to risks of any kind (for example, when the research has to do with drugs, theft, sexuality, etc.). It is also important to know that providing responses to a researcher’s questions is voluntary. In other words, one should not force responses from respondents. A researcher who is unable to convince a person to respond should move on to another person.

3.4.4 Competence

Not everyone is competent enough to provide informed responses to questions posed by the researcher. It is often assumed that adults are capable of providing responses of any kind while children are not. This could be true or untrue depending on the research topic. In some cases children may be more competent in providing responses than adults, and vice versa. Ethically, competence must be taken into account in deciding on the respondents. The freedom to decide whether or not to participate in research is left to those to be researched, and so on ethical grounds participation is considered voluntary.

3.4.5 Privacy

Privacy as an ethical issue in research needs safeguarding. It is viewed from three angles:

a) Sensitivity of information

Sensitivity of information refers to how personal or potentially threatening the information sought by the researcher is. The greater the sensitivity of the information, the more the researcher needs to protect the privacy of the respondent. People are often sensitive about issues related to religion, income, sexual practices, racism and personal attributes such as honesty, intelligence, etc.

b) Settings being observed

The setting could vary from a private place (e.g. the home) to a public place. The extent to which observing people in either setting intrudes on their privacy is not always certain, which can raise an ethical issue. An example is trying to interview a homosexual person in a public drinking place.

c) Dissemination of the information

It should not be easy to match information with the people who provided it. Being able to do so would mean that the privacy of those who provided the information is not protected. This is easily achieved by not putting names on the questionnaires or other research instruments used.

d) Anonymity and Confidentiality

This is similar to the point on dissemination of information under privacy. Here researchers avoid collecting information together with the identity of the person providing the data. A quick way of ensuring anonymity is to collect information without the names of the respondents or other identifiers. It is easy to maintain anonymity through a mail survey. Where the identity is provided, the researcher can ensure anonymity by separating the data from the identity of the person who provided them during data entry.
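
Separating identities from responses at data entry can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`name`, `phone`, `income_band`, `uses_solar`), the sample records and the function `split_identity` are all hypothetical. The idea is to keep a small link table apart from the analysis file, which carries only a random respondent code.

```python
import secrets


def split_identity(records, id_fields=("name", "phone")):
    """Return (link_table, anonymous_rows) from a list of dict records."""
    link_table = {}      # respondent code -> identifying details (stored separately)
    anonymous_rows = []  # analysis file: code plus answers only
    for record in records:
        code = secrets.token_hex(4)   # random code, not derived from the person
        while code in link_table:     # regenerate on the unlikely collision
            code = secrets.token_hex(4)
        link_table[code] = {k: record[k] for k in id_fields if k in record}
        answers = {k: v for k, v in record.items() if k not in id_fields}
        anonymous_rows.append({"respondent": code, **answers})
    return link_table, anonymous_rows


# Hypothetical questionnaire returns
records = [
    {"name": "Ama", "phone": "000-0001", "income_band": 2, "uses_solar": "yes"},
    {"name": "Kofi", "phone": "000-0002", "income_band": 3, "uses_solar": "no"},
]
link_table, anonymous_rows = split_identity(records)
print(anonymous_rows)
```

Only `anonymous_rows` would be passed on for analysis. The link table would be kept under restricted access, or destroyed if re-contacting respondents is not needed.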

It is common for those being researched to be told that any information they provide will be treated as confidential. This is often written in the introductory letter that accompanies the questionnaire. Sometimes, however, researchers are unable to keep this promise for a number of reasons. It could happen that the information provided is unique and therefore stands out among the others. If such information is used as an example or to make a case, the promise of confidentiality will be broken. Thus it is important to explain to those being researched exactly what is meant by confidentiality and what its limits are.

Learning Track Activities

Unit Summary

Social science research is always limited by the unpredictability of human behaviour. For this reason, it has to be approached in a systematic manner devoid of bias. Depending on the nature of the problem, several approaches can be used to address the research problem. What is imperative is for the researcher to clarify the purpose of the research and examine the suitability of the chosen research approach for addressing the research questions. Because researchers use samples and then generalise the findings to the population, it is imperative that the units in the sample reflect the nuances in the target population. Another significant factor worth considering is the need to observe research ethics, which include, but are not limited to, informed consent, competence and privacy. The quality of the data gathered depends primarily on the nature of the instruments used. Hence, the instruments are to be designed to gather the required data from respondents. The questions should be unambiguous and devoid of technicalities and jargon, so that the enumerator and the respondent understand them and the required data can be gathered.

Key terms/ New Words in Unit

Interviewer. The person who is collecting data by conducting interviews.

Respondent. The person who is answering the questionnaire.

Researcher. The person who is analyzing the data collected.

Sample. The list of people who will be interviewed.

Survey. An instrument designed to gather information from a specific group of people

(employees, customers, all people in a province or country, women, children, etc.)

Questionnaire. A set of questions designed for a specific purpose (evaluation, polling,

market research, etc.). Can be either printed on paper, or programmed into a

computerized interviewing system.

Closed-end. A type of question that allows only for specific responses (Yes or No, etc.).

The interviewer circles the response on the questionnaire.

Open-end. A type of question that allows the respondent to give any answer they wish.

The interviewer writes in, verbatim, the response.

Probe. Asking for more responses.

Precodes. A list of possible responses to a question. The instructions on the questionnaire

will inform the interviewer whether they should read the list or not.

Response rate. The percentage of subjects who respond to the questionnaire. A response rate of 70% or more is considered very good.

Non-respondents. Those who do not respond to the questionnaire.
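
As a worked example of the response rate defined above, the sketch below uses invented counts (240 questionnaires distributed, 168 returned). The function name `response_rate` is hypothetical.

```python
def response_rate(returned, distributed):
    """Percentage of distributed questionnaires that were returned."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return 100.0 * returned / distributed


rate = response_rate(returned=168, distributed=240)  # hypothetical counts
non_respondents = 240 - 168
print(f"response rate: {rate:.0f}%, non-respondents: {non_respondents}")
```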

Unit Assignment

The Government of your country wants to curtail the impact of a hydroelectric power dam it is about to construct on the livelihood sources of people within the proposed dam’s catchment area. The official records of the National Statistical Service indicate that about 15 000 people will be affected through inundation. As a research fellow, you have been asked to carry out a preliminary assessment so that the implementers can evaluate the effects of their interventions and plan appropriately to curtail them.

Use the narrative to answer the following questions:

Budget and time constraints prevent you from undertaking a census. How many people will you interview?

How will you select your units to reflect the nuances in the target

population?

What ethics will you apply to ensure that the survey process is not

compromised?

What types of data do you envisage and how will they be gathered?

Unit 4

STATISTICAL ANALYSIS WITH STATA AND SPSS

Introduction

Statistical analysis software packages such as SPSS and STATA provide a comprehensive set of tools for performing various statistical procedures and producing outputs such as tables, line plots, scatter plots, bar charts, pie charts, dot charts, regression analysis, multivariate analysis, time series analysis, survival analysis, etc.

Learning Objectives

After reading this unit you should be able to:

8. Use SPSS to do various forms of statistical

analysis.

9. Perform various forms of statistical analysis with

the STATA software.

Unit content

SESSION 4.1: INTRODUCTION TO SPSS

4.1.1 The Nature of SPSS

4.1.2 Data Management in SPSS

4.1.3 Descriptive Statistics in SPSS

SESSION 4.2: INTRODUCTION TO STATA

4.2.1 The Nature of STATA

4.2.2 Data Management in STATA

4.2.3 Descriptive Statistics in STATA

SESSION 4.1: INTRODUCTION TO SPSS

4.1.1 The Nature of SPSS

SPSS (Statistical Package for the Social Sciences) is a statistical analysis and data management

software package. SPSS can take data from almost any type of file and use them to generate tabulated reports, charts, plots of distributions and trends, and descriptive statistics, and to conduct complex statistical analyses.

There are two important limitations of SPSS that deserve mention at the outset:

o SPSS users have less control over statistical output than, for example, Stata or Gauss

users. For novice users, this hardly causes a problem. But, once a researcher wants

greater control over the equations or the output, she or he will need to either choose

another package or learn techniques for working around SPSS’s limitations;

o SPSS has problems with certain types of data manipulation, and it has some built-in quirks that seem to reflect its early creation. The best-known limitation is its weak lag functions, that is, how it transforms data across cases. For new users working from standard data sets, this is rarely a problem. But once a researcher wants to alter data sets significantly, he or she will have to either learn a new package or develop greater skill at manipulating SPSS.
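
To make the idea of a lag transformation concrete, the sketch below shows in Python what "transforming data across cases" means: each case receives the value of the preceding case. It does not show how SPSS itself implements lags. The function `lag` and the readings are hypothetical.

```python
def lag(values, k=1):
    """Shift values down by k cases; the first k cases have no predecessor."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return list(values)
    return [None] * min(k, len(values)) + list(values[:-k])


# Hypothetical yearly readings, one case (row) per year
output_kwh = [120, 135, 150, 160]
previous = lag(output_kwh)            # value from the preceding case
change = [None if p is None else cur - p
          for cur, p in zip(output_kwh, previous)]
print(previous)
print(change)
```

The first case has no predecessor, so its lagged value and its change are left missing (`None`).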

4.1.1.2 Getting Started with SPSS

SPSS for Windows is a versatile computer package that can perform a wide variety of statistical

procedures. When using SPSS, you will encounter several types of windows. The window with

which you are working at any given time is called the active window. There are six different

windows that can be opened when using SPSS. The following will give a description of each of them.

Data Editor Window. This window shows the contents of the current data file. A blank data

editor window, as shown in figure 4.1, automatically opens when you start SPSS for Windows;

only one data window can be opened at a time. From this window, you may create new data files

or modify existing ones.

Output Viewer Window. This window displays the results of any statistical procedures you run,

such as descriptive statistics or frequency distributions. All tables and charts are also displayed in

this window. The viewer window automatically opens when you create output. Figure 4.2 shows

an output viewer window.

Chart Editor Window. In this window, you can modify charts and plots. For instance, you can

rotate axes, change the colors of charts, select different fonts, and rotate three-dimensional

scatter plots.

Figure 4.1: SPSS Data Editor

Syntax Editor Window. You will use this window if you wish to use SPSS syntax to run

commands instead of clicking on the pull-down menus. An advantage of this method is that it gives you access to special features of SPSS that are not available through dialog boxes.

Syntax is also an excellent way to keep a record of your analyses.

Figure 4.2: SPSS Output Viewer Window

Pivot Table Editor. Output displayed in pivot tables can be modified in many ways with the

Pivot Table Editor. You can edit text, swap data in rows and columns, add colour, create

multidimensional tables, and selectively hide and show results.

Figure 4.3: SPSS Syntax Editor

Figure 4.4: SPSS Chart Editor

Text Output Editor. Text output not displayed in pivot tables can be modified with the Text

Output Editor. You can edit the output and change font characteristics (type, style, color, size).

4.1.1.3 The Main Menu

SPSS for Windows is a menu-driven program. Most functions are performed by selecting an

option from one of the menus. For example, to activate the File menu, either click on File with the mouse or press Alt-F on the keyboard. The main menu bar lists 11 menus:

File. This menu is used to create new files, open existing files, read files that have been created

by other software (e.g., spreadsheets or databases), and print files.

Edit. This menu is used to modify or copy text from output or syntax windows.

View. This menu allows you to change the appearance of your screen. You can, for instance,

change fonts, customize toolbars, and display data using their value labels.

Data. Use this menu to make temporary changes in SPSS data files, such as merging files,

transposing variables and cases, and selecting subsets of cases for analyses. Changes are not

permanent unless you explicitly save the changes.

Transform. The transform menu makes changes to selected variables in the data file and

computes new variables based on values of existing variables. Transformations are not

permanent unless you explicitly save the changes.

Analyze. Use this menu to select a statistical procedure to be performed such as descriptive

statistics, correlations, analysis of variance, and cross-tabulations.

Graphs. This menu is used to create bar charts, pie charts, histograms, and scatter plots. Some

procedures under the Analyze menu also generate graphs.

Utilities. This menu is used to change fonts, display information on the contents of SPSS data

files, or open an index of SPSS commands.

Window. Use the window menu to arrange, select, and control the attributes of the SPSS

windows.

Help and add-on. These menus open a Microsoft Help window containing information on how to

use many SPSS features.

4.1.1.4 Some Mathematical Expressions and Logical or Relational Operators

Some Mathematical Expressions

• + , addition

• -, subtraction

• / , division

• *, multiplication

• **, exponentiation

• abs(x) returns the absolute value of x.

• exp(x) returns the exponential function of x.

• int(x) returns the integer by truncating x towards zero.

• ln(x), log(x) returns the natural logarithm of x if x>0.

• log10(x) returns the log base 10 of x if x>0.

• max(x1,...,xn) returns the maximum of x1, ..., xn.

• min(x1,...,xn) returns the minimum of x1, ..., xn.

• round(x) returns x rounded to the nearest whole number.

• round(x,y) returns x rounded to units of y.

• sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.

• sqrt(x) returns the square root of x if x>=0.
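
The difference between truncating, rounding and rounding to units of y is easy to confuse. The sketch below illustrates the three ideas with Python analogues. It is not SPSS or STATA code, and the helper names `sign` and `round_to` are hypothetical. Halves are rounded up here, which is an assumption of this sketch and may differ from a given package's rule.

```python
import math


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def round_to(x, unit=1):
    """Round x to the nearest multiple of unit, with halves rounded up."""
    return math.floor(x / unit + 0.5) * unit


print(math.trunc(-2.7))   # truncation moves towards zero
print(round_to(-2.7))     # rounding moves to the nearest whole number
print(round_to(137, 5))   # rounding to units of 5
print(sign(-4.2), sign(0), sign(3))
```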

Logical Operators

& and

| or

! not

∼ not

Relational Operators

> greater than

< less than

>= greater than or equal

<= less than or equal

= equal (for conditional statements)

!= not equal

4.1.2 Data Management

Data can be entered directly into SPSS, or it can be imported from a number of different sources.

The processes for reading data stored in SPSS data files, spreadsheet applications, such as

Microsoft Excel, database applications, such as Microsoft Access, and text files are all discussed

in this chapter.

4.1.2.1 Entering Your Own Data

To begin entering data in the data editor, follow these steps:

1. Click on File from the menu bar.

2. Click on New and then Data from the file pull-down menu.

3. Click on the cell in which you wish to enter data (or use the arrow keys to highlight the

cell). A darkened border will appear around the cell; this tells you that this is the cell you

have selected.

4. Type in the value you wish to appear in that cell and then press Enter.

5. Repeat this process until you have entered all of the data you wish for column 1 (values

for all cases on variable 1).

6. When you are ready to add another variable, click on the first cell in the next column

(row 1, column 2).

7. Repeat this process for all values in column 2.

8. Continue this procedure until you have entered values for all cases and variables that you

wish for your data file.

Once you have entered data in the data editor, you may change or delete values. To change or

delete a value in a cell, simply click on the cell you wish to alter. You will notice that a dark

border appears around the selected cell, and the value in the cell appears at the top of the data

editor. If you are changing the value, simply type the new value and press Enter.

Adding Cases and Variables

To insert a new case (row) between cases that already exist in your data file:

1. Point the mouse arrow and click on the row number below the row where you wish to

enter the new case. The row should be highlighted in black.

2. Click on Data on the menu bar.

3. Click on Insert Cases from the pull-down menu. A new row is now inserted and you may

begin entering data in the cells. Notice that before you enter your values, all of the cells

have system-missing values (represented by a period).

To insert a new variable (column) between existing variables:

1. Click on the column variable name that is to the right of the position where you wish to

enter a new variable. The column should be highlighted in black.

2. Click on Data on the menu bar.

3. Click on Insert Variable from the pull-down menu. A new variable (column) is now

inserted and you may begin entering data in the cells.

Deleting Cases and Variables

To delete a case:

1. Click on the case number that you wish to delete.

2. Click on Edit from the menu bar.

3. Click on Clear. The selected case will be deleted and the rows below will shift upward.

To delete a variable:

1. Click on the variable name that you wish to delete.

2. Click on Edit from the menu bar.

3. Click on Clear. The selected variable will be deleted and all variables to the right of the

deleted variable will shift to the left. Deleting variables can also be accomplished using

SPSS syntax with the Drop and Keep subcommands.

Defining Variables

By default, SPSS assigns variable names and formats to all variables in the SPSS data file. By

default, variables are named VAR##### (prefix VAR followed by five digits) and all values are

valid (blanks are assigned system missing values). Most of the time, however, you will want to

customize your data file. For example, you may want to give your variables more meaningful

names, provide labels for specific values, change the variable formats, and assign specific values

to be regarded as “missing.”

To do any or all of these:

1. First, make sure that your data file window is the active window and click on the variable

name that you wish to change.

2. Click on the Variable View tab or else double-click on the variable name in the data

editor.

3. Type the name of the variable in the Name column. Variable names have to be unique,

begin with a letter, and cannot contain blank spaces.

4. If you wish to change the type or format of a variable, click the button in the Type cell to

open the Variable Type dialog box. By default, all variables are numeric, but you may

work with other types such as names, dates, and other non-numeric data.

5. Suppose you have a variable representing the average cost of groceries per person that was entered to the nearest cent (e.g., 32.24), and you want the average cost to be displayed as a whole number (rounded to the nearest dollar, e.g., 32). To do this, change the value in the Decimal places box. To change the display width of a numeric variable, change the value in the Width box.

6. If one of your variables is categorical, you can assign numbers to represent the categories of the variable. For example, the variable sex will have 2 categories: male and female. Males may be assigned the value “1” and females the value “2”. It is useful to have descriptive labels assigned to the values 1 and 2 so that it is easy to see which number represents which category in your output files.

7. If there are specific values that you would like to be treated as missing values, click on Missing to open the Missing Values dialog box. Click on Discrete Missing Values to tell SPSS that you have specific values that are considered to be missing. Type the value(s) in the boxes (you may have up to three values). If you have more than three missing values, click on Range plus one optional discrete missing value and enter the lower and upper bounds of the range. Click OK when you have entered all of your missing values.
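
The roles of value labels and user-defined missing values can be sketched outside SPSS as follows. This is an illustration of the concepts only, not of SPSS behaviour. The codes 9 and 99, the sample column and the function `decode` are hypothetical.

```python
VALUE_LABELS = {1: "male", 2: "female"}   # codes as in step 6
MISSING_CODES = {9, 99}                   # hypothetical user-defined missing values


def decode(code):
    """Return the label for a code, or None if the code is declared missing."""
    if code is None or code in MISSING_CODES:
        return None
    return VALUE_LABELS.get(code, f"unlabelled ({code})")


sex = [1, 2, 2, 99, 1, None]              # hypothetical column; None = blank cell
labels = [decode(c) for c in sex]
valid = [lab for lab in labels if lab is not None]

print(labels)
print({lab: valid.count(lab) for lab in sorted(set(valid))})
```

Cases coded 99 or left blank are excluded from the frequency count, which is what declaring a value as missing is meant to achieve.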

Reading SPSS Data Files

We will illustrate how to read an existing SPSS data file. The reader may follow along using the

data accompanying this guide.

To open a data file:

1. Click on File from the menu bar.

2. Click on Open on the file pull-down menu.

3. Click on Data on the open pull-down menu. This opens the Open File dialog box as

shown in Figure 4.5.

4. Choose the correct directory from the Look in: box at the top of the screen.

5. Point the arrow to the data file you wish to open and click on it.

6. Click on Open.

Note: Most of the examples in the following chapters use the SPSS data files that are provided

with this manual. Unless you are required to enter data on your own into a new file, all

procedures assume that you have opened the SPSS data file before beginning any computations

or analyses.

Reading Data Files in Text and Other Formats

To read a text data file, begin at the main menu bar in the Data Editor window:

1. Click on File.

2. Click on Read Text Data.

3. Select the appropriate file from the Open file dialog box and click Open.

4. Follow the steps in the Text Import Wizard to read the data file. You will have to answer

questions about type of data, arrangement of data, number of cases to import, and missing

values. Use the Help button of the Text Import Wizard for more detailed information.

To open data from a file such as an Excel spreadsheet, begin at the Data Editor window:

1. Click on File.

2. Click on Open and then click on Data.

3. Select the file format from the drop-down list of file types in the Files of type: box.

4. Choose the appropriate directory and file.

5. Click on Open.
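
The questions asked when importing text data (how fields are separated, where the variable names are, how missing values are marked) correspond to the choices in the sketch below. It uses Python's standard `csv` module rather than SPSS. The file contents, the "." missing-value marker and the helper `to_number` are assumptions made for illustration.

```python
import csv
import io

# Hypothetical comma-delimited text file; "." marks a missing value
raw_text = """id,daysleep,nightsleep
1,4.5,9.0
2,.,10.5
3,3.0,8.5
"""


def to_number(field):
    """Convert a text field to float, treating '' and '.' as missing."""
    field = field.strip()
    return None if field in ("", ".") else float(field)


reader = csv.DictReader(io.StringIO(raw_text))   # first row supplies variable names
cases = [{name: to_number(value) for name, value in row.items()} for row in reader]

print(len(cases), "cases read")
print(cases[1])
```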

Figure 4.5: Open File Dialog Box

Saving Data Files

Unless you save your files, all of your data and changes will be lost when you leave the SPSS

session. To save a file, first make the Data Editor the active window. Then:

1. Click on File from the menu.

2. Select Save from the list of options in the File pull-down menu.

3. Select the appropriate directory in the Save in: box. Type the name of your file in the

File name box. Notice that the default file type is set for SPSS format as indicated by the

“.sav” extension.

4. Click on Save.

By default, this will save the data file as an SPSS data file. If you were working with a

previously existing data file, the old file will be overwritten by the modified data file. To save

the file with a different name, select Save As … from the File pull-down menu.

If you wish to save the data file in a format other than SPSS (e.g., Lotus, Excel, dBASE, fixed-

format ASCII text):

1. Click on File from the menu.

2. Select Save As from the list of options in the File pull-down menu.

3. Select the appropriate directory in the Save in: box. Type the name of your file in the File

name box.

4. Choose the appropriate file type in the Save as type: box.

5. Click on Save.

4.1.2.2 Transforming Variables and Data Files

At times, you may need to alter or transform the data in your data file to allow you to perform

the calculations you require. There are many ways in which you can transform data. This section

discusses three commonly used techniques: computing new variables, recoding variables, and

selecting subsets of cases.

Computing New Variables

There may be occasions when you need to compute new variables that combine or alter existing

variables in your data file. For instance, your data file may contain daytime and nighttime

sleeping hours for a sample of infants, but you are interested in examining total sleep hours (i.e.,

the sum of the separate daytime and nighttime hours).

To create a new variable:

1. Click on Transform from the menu bar.

2. Click on Compute from the pull-down menu. This opens the Compute Variable dialog

box (see Fig.4.6).

3. Enter the name of the new variable (in the above illustration, total) in the Target Variable

box. (You also have the option to describe the nature and format of the new variable by

clicking on the Type & Label box.)

4. You will then need to perform a series of steps to construct an expression used to

compute your new variable. In this illustration, you would first select the daytime

variable (“daysleep”) from the variable list box on the lefthand side of the dialog box and

move it to the Numeric Expression box using the right directional arrow.

5. Then click on the “+” on the calculator pad. You will notice that a plus sign is placed in the Numeric Expression box after the variable name daysleep.

6. Complete the expression by selecting the nighttime variable (“nightsleep”) and moving it

to the Numeric Expression box, following the instructions in step (4) above.

7. When you have completed the expression, click on OK to close the Compute Variable

dialog box. Your new variable will be added to the end of your data file.

In addition to the simple arithmetic operators on the calculator pad (+, -, x, ÷), there are many other arithmetic functions, such as absolute value, truncate, round and square root, and statistical functions, including sum, mean, minimum and maximum. These are displayed in the Function group box to the right of the calculator pad. First, select a group in the Function group box, and then select the specific function in the Functions and Special Variables box.

4.1.2.3 Using SPSS Syntax

As illustrated throughout this book, most SPSS procedures are conducted using the pull-down

menus because they are convenient and easy to use. However, an alternative way to run SPSS

procedures is through command syntax. SPSS commands are the instructions that you give the

program for conducting procedures.

SPSS syntax commands are typed into a command file using the SPSS syntax editor. Syntax files

have the extension “.sps”. There are several reasons why command syntax is useful, such as

when the user wants to: (1) have a record of the analyses conducted during a session; (2) repeat

long and complex analyses; (3) review how variables were created or transformed; and (4)

modify commands to run slightly different or customized statistics.

When working with syntax, the user must enter commands instructing the program what

procedures to conduct. You can enter syntax by either typing or pasting syntax into the syntax

editor. Because most users do not know the commands from memory, it is useful to refer to the

SPSS Syntax Reference Guide for a complete reference to the command syntax. Help is also

available by using the Help button on the toolbar in the syntax editor window. Pasting syntax

commands from dialog boxes is perhaps the easiest way to construct syntax commands. Rather

than typing the commands, you initiate a procedure using pull-down menus and then instruct

SPSS to provide the commands and paste them into the syntax editor.

To open a new window and begin typing commands:

1. Click on File from the main menu.

2. Click on New from the pull-down menu.

3. Click on Syntax to open the SPSS syntax editor (see Fig. 4.7).

4. Begin typing syntax into the editor.

Figure 4.6: Compute Variable Dialog Box

For example, suppose you want to open the sleep.sav data file, but you only want to read a subset

of variables — body weight, total sleep, and danger index.

The syntax command would be:

GET FILE='sleep.sav'

  /KEEP=BODYWT TOTSLEEP DANGER.

Figure 4.7: SPSS Syntax Editor

You can also run a procedure by pasting syntax from a dialog box. When you use the paste

button, SPSS creates the syntax commands to execute procedures requested from pull-down

menus. For example, to compute a new variable (total sleep hours) as shown in session 4.1.2.2,

follow steps 1–6. Instead of clicking on OK, click on the Paste button. The compute commands

will automatically be displayed in a syntax window. To run the syntax commands, click the

Right arrow button on the toolbar.

Once you have created a syntax file, you can save it using the same procedures described in

Session 4.1.2.1 of this chapter. The file can then be opened and edited for future modifications.

Make sure when you open, edit, and save a syntax file that you correctly identify it with the

“.sps” file type.

4.1.3 Descriptive Statistics

A statistical data set consists of a collection of values on one or more variables. The variables

can be either numerical or categorical. Numerical variables are further classified as discrete or

continuous. These distinctions determine the statistical approaches that are appropriate for

summarizing the data. Examples of data include

crime rates for large cities across Africa;

body temperatures for a randomly chosen sample of adults;

The basic features of any data can be presented in the form of:

Graphical displays

Tabular descriptions

Summary statistics

Linear regressions

4.1.3.1 Tabular description

One approach to organizing data is by using tables. The type of table you use depends in part on

the way the data are measured — in categories (e.g., occupations) or on a numerical scale (e.g.,

number of errors). This chapter demonstrates how to examine different types of data through

frequency distributions.

Summarizing Categorical Data

Categorical variables are those that have qualitatively distinct categories as values. For example,

gender is a categorical variable with categories “male” and “female”.

Frequencies

One way to display data is in a frequency distribution, which lists the values of a variable (e.g.,

for the variable region: Accra, Kumasi, Volta, etc.) and the corresponding numbers and

percentages of people for each value. Let us begin by creating a simple frequency distribution of

Regions using the “sec7.sav” SPSS data file from the GLSS5 accompanying this manual. Follow

along by using SPSS to open the data file on your computer (using the procedure given in

Chapter 2). This data set was used in a study of the Housing Characteristics in Ghana.

Notice that the data view lists numbers as the values for all of the variables, even though region is a categorical variable. To see the categories each of the values represents, you can

examine the contents of the data file (variable labels, variable type, and value labels) by clicking

on Utilities on the menu bar and clicking on Variables from the pull-down menu.

To create a frequency distribution of the region variable:

1. Click on Analyze from the menu bar.

2. Click on Descriptive Statistics from the pull-down menu.

3. Click on Frequencies from the second pull-down menu to open the Frequencies dialog box.

4. Click on the label/name of the variable you wish to examine (“region”) in the left-hand

box.

5. Click on the right arrow button to move the variable name into the Variable(s) box.

6. Click on OK.

The frequency distribution produced by SPSS is shown in Figure 4.8. This figure shows the

content of the output — that which is in the right-hand frame of your Output Viewer. The

“Statistics” table in the output indicates the number of valid and missing values for this variable.

There are 8687 valid cases and no missing values. The “Region” table displays the frequency

distribution.

For example, there are 834 people in the Western region and 1257 people in the Greater Accra

region. The numbers in the “Percent” column represent the percentage of the total number of

cases that are in each region. These are obtained by dividing each frequency by the total number

of cases and multiplying by 100. For example, 18.1% of the people are in the Ashanti region.

Statistics
Region
N    Valid      8687
     Missing       0

Region
                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Western              834       9.6             9.6                  9.6
        central              689       7.9             7.9                 17.5
        greater accra       1257      14.5            14.5                 32.0
        volta                720       8.3             8.3                 40.3
        eastern              914      10.5            10.5                 50.8
        ashanti             1574      18.1            18.1                 68.9
        brong ahafo          795       9.2             9.2                 78.1
        northern             795       9.2             9.2                 87.2
        upper east           600       6.9             6.9                 94.1
        upper west           509       5.9             5.9                100.0
        Total               8687     100.0           100.0

Figure 4.8: Frequency Distribution of number of people in the various regions of Ghana

The “Valid Percent” column is computed from the non-missing cases only. In this case, there are no missing

values, so the “Percent” and “Valid Percent” columns are the same. The “Cumulative Percent” is

a cumulative percentage of the cases for the category and all categories listed before it in the

table.
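To check the Percent and Cumulative Percent columns by hand, the calculation can be sketched in a few lines of Python. The frequencies are copied from Figure 4.8; the function name frequency_table is only illustrative and is not part of SPSS.

```python
# Frequencies copied from Figure 4.8, in the order they appear in the table.
frequencies = {
    "Western": 834, "Central": 689, "Greater Accra": 1257, "Volta": 720,
    "Eastern": 914, "Ashanti": 1574, "Brong Ahafo": 795, "Northern": 795,
    "Upper East": 600, "Upper West": 509,
}


def frequency_table(freqs):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    total = sum(freqs.values())
    rows, running = [], 0
    for category, count in freqs.items():
        running += count  # cases in this category and all categories before it
        rows.append((category, count,
                     round(100 * count / total, 1),
                     round(100 * running / total, 1)))
    return rows


table = frequency_table(frequencies)
for row in table:
    print(row)
```

Note that the cumulative percentage is computed from the running count, not by adding up the rounded percentages, which is why it always ends at exactly 100.0.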

Worked example 2

Draw a table showing the variation in cooking fuel in the urban areas of the Greater Accra

Region. (Use the data in the file “sec7.sav”, from the GLSS5 accompanying this manual).

Solution:

1. Click on Data to open the Data pull-down menu.

2. Click on “Select Cases” to open the Select Cases dialog box.

3. Click on “If condition is satisfied”, then click on the “If…” button.

4. Type the condition “region=3 & loc=1” (Greater Accra, urban), click Continue and then OK.

5. Click on Analyze from the menu bar.

6. Click on Descriptive Statistics from the pull-down menu.

7. Click on Frequencies from the second pull-down menu to open the Frequencies dialog box.

8. Click on the label/name of the variable you wish to examine (“Main fuel used for

cooking”) in the left-hand box.

9. Click on the right arrow button to move the variable name into the Variable(s) box.

10. Click on OK.

Figure 4.9 shows the contents of the output.

Main fuel used for cooking
                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid   None, No Cooking         4       1.4             1.4                  1.4
        Wood                    61      20.7            20.7                 22.1
        Charcoal               185      62.9            62.9                 85.0
        Gas                     39      13.3            13.3                 98.3
        Electricity              1        .3              .3                 98.6
        Kerosene                 4       1.4             1.4                100.0
        Total                  294     100.0           100.0

Figure 4.9: Frequency Distribution of Main fuel used for cooking in the urban areas of the Greater Accra Region
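The two stages of this worked example, selecting cases and then counting them, can be sketched in Python as follows. The household records are made up for illustration, and the codes region = 3 (Greater Accra) and loc = 1 (urban) are assumed from the condition used in the solution.

```python
from collections import Counter

# Hypothetical household records; region == 3 and loc == 1 are assumed to
# mean Greater Accra and urban, as in the worked example.
households = [
    {"region": 3, "loc": 1, "fuel": "Charcoal"},
    {"region": 3, "loc": 1, "fuel": "Gas"},
    {"region": 3, "loc": 2, "fuel": "Wood"},
    {"region": 6, "loc": 1, "fuel": "Charcoal"},
    {"region": 3, "loc": 1, "fuel": "Charcoal"},
]

# Stage 1: keep only the cases that satisfy both conditions (Select Cases).
selected = [h for h in households if h["region"] == 3 and h["loc"] == 1]

# Stage 2: count how often each fuel occurs among the selected cases.
fuel_counts = Counter(h["fuel"] for h in selected)
print(fuel_counts.most_common())
```

Because the filter is applied first, the total of the frequency table equals the number of selected cases, not the size of the whole data file.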

4.1.3.2 Graphical displays

One approach to organizing data is through a chart or graph. The type of chart you use depends

in part on the way the data are measured — in categories (e.g., occupations) or on a numerical

scale (e.g., number of errors). This chapter demonstrates how to examine different types of data

through graphical representations.

Figure 4.10: Frequencies Charts Dialog Box

4.1.3.3 Bar Charts, Pie Charts, Histogram and Line Graphs

Bar charts and pie charts are useful for examining categorical data, while histograms and line graphs are better suited to numerical data. In a bar chart, the height of each bar represents the frequency of occurrence for each category of the variable. Let us create a

bar chart for the region data using an option within the Frequencies procedure. From the

Frequencies dialog box (see steps 1–3 of the Frequencies section):

1. Click on Charts to open the Frequencies Charts dialog box (see Fig. 4.10).

2. Click on Bar charts in the Chart Type box.

3. Choose the type of values you want to chart — frequencies or percentages — in the Chart

Values box. For this example, we have selected frequencies.

4. Click on Continue.

5. Click on OK to run the chart procedure.

A bar chart like that in Figure 4.11 should appear in your SPSS Viewer. The information

displayed in this chart is a graphical version of that shown in the frequency distribution in Figure

4.8. The region with the greatest number of people is the Ashanti region.

Worked example 1

Draw bar graphs to show the Rural – Urban correlation for the various Regions. (Use the data in

the file “sec7.sav”, from the GLSS5 accompanying this manual).

Solution:

1. Click on Graphs to open the Graphs pull-down menu.

2. Click on Bar charts to open the bar chart pop-up menu.

3. Click on “Clustered”, select “Summaries for groups of cases” and click on “Define”.

4. Select “% of cases”, move the region variable to the category axis and the rural/urban

variable to the “define clusters by:”

5. Click on OK to run the chart procedure

A bar chart like that in Figure 4.12 should appear in your SPSS Viewer.

Figure 4.11: Bar chart of number of people in the various regions of Ghana (x-axis: Region; y-axis: Frequency)

4.1.3.4 Summary statistics

Summarizing Numerical Data

There are two types of numerical variables — discrete and continuous. The values for discrete

variables are counting numbers. For example, an American football game is won by one, two, or

three points, not a quantity in between. Continuous variables, on the other hand, do not have such

indivisible units. Body temperature, for instance, can be measured to the nearest degree, half

degree, quarter-degree, and so on. For practical purposes in SPSS, there is no difference in

summarizing these two types of numerical data.

Figure 4.12: Bar graphs showing the Rural – Urban correlation of the various regions in Ghana (x-axis: Region; y-axis: Percent; clusters: rural, urban)

4.1.3.5 Mean, Sum, Standard Deviation, Variance, Minimum Value, Maximum Value, and

Range

When generating these statistics, the Data Editor must be open with the appropriate data set

before continuing.

Worked Problem

Using the data in the file “sec7.sav”, determine the mean, sum, standard deviation, variance,

minimum value, maximum value, and range for s7fq6 only.

Solution

1. Repeat steps 1–2 of the Frequencies section, then select Descriptives. This will open the

Descriptives dialog box as shown in Fig.4.13.

Figure 4.13: Descriptives Dialog Box

2. In the variable list, select the variable Area in square meters. Left click on the right arrow

button between the boxes to move this variable over to the Variable(s) box. To calculate

statistics for many variables, simultaneously add variables to the Variable(s) box.

3. Click on the Options button. This will open the Descriptives: Options dialog box.

4. Click on mean, sum, standard deviation, variance, minimum value, maximum value, and range.

5. Click on the Continue button when done.

6. Click OK. The Descriptives dialog box closes and SPSS activates the Output Navigator

to illustrate the statistics.
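For readers who want to verify such output by hand, the same seven statistics can be sketched with Python's standard library. The five area values are hypothetical, not taken from “sec7.sav”. The sketch uses the sample (n - 1) formulas for the standard deviation and variance, which is also what the SPSS Descriptives procedure reports.

```python
import statistics

# Hypothetical areas in square metres for five households.
areas = [10.0, 20.0, 30.0, 40.0, 50.0]

summary = {
    "mean": statistics.mean(areas),
    "sum": sum(areas),
    "std_dev": statistics.stdev(areas),       # sample standard deviation (n - 1)
    "variance": statistics.variance(areas),   # sample variance (n - 1)
    "minimum": min(areas),
    "maximum": max(areas),
}
# The range is the difference between the largest and smallest values.
summary["range"] = summary["maximum"] - summary["minimum"]

for name, value in summary.items():
    print(f"{name}: {value}")
```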

4.1.3.6 Measures of Central Tendency and Measures of Variability

Measures of central tendency (or location) specify the “center” of a set of measurements, while measures of variability indicate how spread out the observations are, that is, how much the values differ from individual to individual. This chapter describes ways to use SPSS to obtain three common measures of location for a sample: the mode, the median, and the mean. How SPSS can be used to obtain measures of variability such as the range and standard deviation is discussed earlier in this chapter (Section 4.1.3.5). Measures of central tendency and variability can be used to:

find the most common college major for a group of students;

find the midpoint of a set of ordered body weights that divides the set in half;

calculate the average gross of the top movies from a given year;

find the difference between the largest and smallest salary paid to people working at a

particular company;

determine how daily hours of sleep vary among different species of mammals;

Figure 4.14: Descriptives of areas in square meters for households in Ghana.

The Mode, Median and Mean

The mode, especially useful in summarizing categorical or discrete numerical variables, is the

category or value that occurs with the greatest frequency. One way to obtain the mode with SPSS

for Windows is by using the Frequencies procedure. This is the same procedure used to obtain

frequency distributions, histograms, and bar charts as discussed. To obtain the mode

of any variable:

1. Click on Analyze from the menu bar.

2. Click on Descriptive Statistics from the pull-down menu.

3. Click on Frequencies from the pull-down menu.

4. Click on the variable of interest and then the right arrow button to move the

variable into the Variable(s) box.

5. Click on the Statistics button at the bottom of the screen. This opens the Frequencies:

Statistics dialog box, as shown in Figure 4.15.

6. Click on the Mode option in the Central Tendency section.

7. Click on Continue to close this dialog box.

8. Click on OK to close the Frequencies dialog box and execute the procedure.

Notice that the same method could be used to obtain the median, mean, sum, percentiles, etc.
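The difference between the three measures of location is easy to see on a small example. The Python sketch below finds the mode of a categorical variable, using the frequencies from Figure 4.9, and the mode, median and mean of a numerical variable. The household sizes are hypothetical.

```python
import statistics
from collections import Counter

# Categorical variable: rebuild the cases from the frequencies in Figure 4.9.
fuel_frequencies = {"None, no cooking": 4, "Wood": 61, "Charcoal": 185,
                    "Gas": 39, "Electricity": 1, "Kerosene": 4}
fuels = [fuel for fuel, n in fuel_frequencies.items() for _ in range(n)]
fuel_mode = Counter(fuels).most_common(1)[0][0]  # most frequent category

# Numerical variable: hypothetical household sizes with one large household.
sizes = [2, 3, 3, 4, 5, 6, 12]
size_mode = statistics.mode(sizes)      # value that occurs most often
size_median = statistics.median(sizes)  # middle value of the ordered data
size_mean = statistics.mean(sizes)      # arithmetic average

print(fuel_mode, size_mode, size_median, size_mean)
```

The single large household pulls the mean above the median, whereas the mode and median are unaffected by it. Only the mode is meaningful for the categorical fuel variable.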

Figure 4.15: Frequencies: Statistics Dialog Box

Figure 4.16: Descriptives: Options Dialog Box

Self Assessment 4.1

(Use the data in the file “sec7.sav”, from the GLSS5 accompanying this manual). Using SPSS;

4.1.1 With the aid of tables and bar charts, show how access to the different cooking fuels

varies between rural and urban areas for the Greater Accra Metropolitan Area

(GAMA).

4.1.2 Still using tables and bar charts, show how access to the different cooking fuels in

rural and urban areas for Accra compares with one other region of your choice.

4.1.3 With the aid of tables and pie charts, show the distribution of different cooking fuel

usage for all the regions of Ghana

4.1.4 Draw bar graphs to show the Rural – Urban correlation for the entire sample in

percentages and actual number of cases

SESSION 4.2: INTRODUCTION TO STATA

Stata is a full-featured statistical programming language for Windows, Macintosh, Unix and

Linux. It can be considered a “stat package,” like SAS, SPSS, RATS, or eViews. The number of

variables is limited to 2,047 in standard Stata/IC, but can be much larger in Stata/SE or

Stata/MP. The number of observations is limited only by memory. Stata has traditionally been a

command-line-driven package that operates in a graphical (windowed) environment. It contains a

graphical user interface (GUI) for command entry.

4.2.1 The Stata Environment

When you start Stata for Windows you will see the following windows: the Command window, where you type in your Stata commands; the Results window, where Stata results are displayed; the Review window, where past Stata commands are displayed; and the Variables window, which lists all the variables in the active data file, as shown in Figure 4.17. The data in the active data file can be browsed (read-only) in the Browser window, which is activated from the menu Data/Data browser or by

browse varlist

where varlist (e.g. income age) is a list of variables to be displayed.

The Editor window, as shown in Figure 4.18, allows you to edit data either by directly typing into the editor window or by copying and pasting from spreadsheet software. It is opened by

edit varlist

Stata has implemented every Stata command (except the programming commands) as a dialog

that can be accessed from the menus. This makes commands you are using for the first time

easier to learn as the proper syntax for the operation is displayed in the Review window.

4.2.1.1 Stata Toolbar

open: open a Stata dataset.

save: save a dataset.

print: print contents of active window.

Figure 4.17: Stata Environment

log: to start or stop, pause or resume a log file.

viewer: open viewer window, or bring to the front

graph: open graph window, or bring to the front.

do-file editor: open do-file editor, or bring window to the front.

data editor: open data editor, or bring window to the front.

data browser: open data browser, or bring window to the front.

more: command to continue when paused in long output.

break: stop the current task. This command returns the system to the state it was in before you issued the command.

Figure 4.18: Stata Toolbar

4.2.1.2 Working Directory

The working directory shown in Fig. 4.17, displayed at the bottom left-hand corner of the window, is your default directory. Any files you save without specifying a directory will be saved here. To change your working directory, use the cd command:

cd directoryname

Note: You are advised to use the cd command at the beginning of your do-files and programs;

this will save a lot of editing if the data you are using is moved.

4.2.1.3 Memory

To change the memory assigned to STATA:

set mem #k

where # is a number greater than the size of the dataset, and less than the total amount of

memory available on your system.

To check the size of the dataset, look in My Computer or your Explorer package. To check the

amount of memory (RAM) your system has available, go to the Start menu and click on

\Settings\Control Panel\System. The bottom line, under General tells you how many KB of RAM

you have available.

STATA 10 opens with a default memory of 10.00 MB. To increase the default memory: Right

click on the STATA icon and choose Properties\Shortcut

Edit the Target field to say: \\St-server5\stata8$\wsestata.exe /k#

Where k# is the number of kb you wish to assign to STATA.

Note: If you do not have enough memory available on your machine to read a whole dataset,

open a subset of the variables you need.

4.2.1.4 Where to Get Help

The Stata User's Guide is an introduction to the capabilities and basic concepts of Stata. The Stata Base Reference Manual provides systematic information about all Stata commands. It is also often an excellent treatise on the implemented statistical methods. The online help in Stata describes all Stata commands with their options. However, it does not explain the statistical methods as the Reference Manual does. You can start the online help by issuing the command:

help command

If you don't know the exact expression for the command, you can search the Stata documentation by:

search word

In both cases the result is written into the result window. Alternatively, you can display the result

in the Viewer window by issuing the command

view help command

or by calling the Stata online help in the menu bar: Help/Search...

4.2.1.5 Some Mathematical Expressions, Logical and Relational Operators

Some Mathematical Expressions

• + addition

• - subtraction

• / division

• * multiplication

• ^ exponentiation

• abs(x) returns the absolute value of x.

• exp(x) returns the exponential function of x.

• int(x) returns the integer by truncating x towards zero.

• ln(x), log(x) returns the natural logarithm of x if x>0.

• log10(x) returns the log base 10 of x if x>0.

• max(x1,...,xn) returns the maximum of x1, ..., xn.

• min(x1,...,xn) returns the minimum of x1, ..., xn.

• round(x) returns x rounded to the nearest whole number.

• round(x,y) returns x rounded to units of y.

• sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.

• sqrt(x) returns the square root of x if x>=0.
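Three of these definitions are easy to misread: truncation is not the same as rounding down, rounding “to units of y” means rounding to the nearest multiple of y, and sign returns only -1, 0 or 1. The Python sketch below mirrors the descriptions given in the list. The function names are hypothetical, and the handling of exact ties (for example 2.5) in this sketch follows Python's rules and may differ from Stata's.

```python
import math


def truncate_toward_zero(x):
    """Drop the fractional part, as described for int(x)."""
    return math.trunc(x)


def sign(x):
    """Return -1 if x < 0, 0 if x == 0 and 1 if x > 0."""
    return (x > 0) - (x < 0)


def round_to_units(x, y=1):
    """Round x to the nearest multiple of y, as described for round(x,y)."""
    return round(x / y) * y


print(truncate_toward_zero(-2.7))        # fractional part dropped, not floored
print(sign(-3.2), sign(0), sign(8))
print(round_to_units(17.3, 5))           # nearest multiple of 5
print(round_to_units(1234.5678, 0.01))   # nearest hundredth
```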

Logical Operators

& and

| or

! not

~ not

Relational Operators

> greater than

< less than

>= greater or equal

<= smaller or equal

== equal (for conditional statements)

!= not equal

4.2.2 Data Management

4.2.2.1 Data Entry and Importing Data in Stata

There are two ways of getting data into Stata. One is manual data entry, that is, inputting interactively from the keyboard. This method is useful for small datasets. For example, to enter data on accident rates (ar) and speed limits (sl) directly into Stata, the syntax is:

input ar sl

1. 4 55

2. 1.5 60

3. 1 .

4. end

This data could also be entered manually by clicking on the data editor on the toolbar menu; note

that you can copy-and-paste into the data editor. The output is as shown in figure 4.19.

Figure 4.19: Data Editor

Inputting from files and spreadsheets (data entry software) is the common way data are brought into Stata. (Note: Excel is not data entry software.)

To prepare data in data entry software for conversion:

Make sure that missing data values are coded as empty cells or as numeric values (e.g., 999 or -1). Do not use character values (e.g., - or N/A) to represent missing data.

Make sure that there are no commas in the numbers. You can change this under the Format menu, then select Cells... .

Make sure that variable names are included only in the first row of your spreadsheet. Variable names should be 32 characters or less, start with a letter and contain no special characters except ‘_’.

Under the File menu, select Save As... . Then Save as type Text (tab delimited). The file will be

saved with a .txt extension.

Start Stata. Then issue the following command:

insheet using filename [, clear]

where filename is the name of the tab-delimited file (with extension .txt).

If you have already opened a data file in Stata you can replace the old data file using the option

clear.

4.2.2.2 Opening and Saving Data

To open an existing Stata datafile (extension .dta), type the following command at the command

prompt;

use filename [, clear]

where the option clear clears the dataset already in memory.

To save a datafile in Stata format, type

save [filename]

If filename is not specified, the name under which the data was last known is used. If filename is

specified without an extension, .dta is used.

Stata looks for data files, and saves data and log files, in the drive and directory specified by

cd drive:directory

See help memory if you encounter memory problems when loading a file.

4.2.2.3 Creating new variables

New variables are created by the following syntax;

generate newvar = expression [if expression]

where newvar is the name of the new variable and expression is a mathematical function of

existing variables. The if option applies the command only to the data specified by a logical

expression. The (system) missing value code ‘.’ is assigned to observations that take no value.

Some examples:

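generate age2 = age^2

generate agewomen = age if women == 1

generate rich = 0 if wealth != .

replace rich = 1 if wealth >= 1000000 & wealth != .

/* one-line alternative to the two commands above (use it instead of them, not after them) */

generate rich = wealth >= 1000000 if wealth != .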
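The effect of the if qualifier is that observations which fail the condition receive a missing value. The Python sketch below shows the intended result of the examples on three made-up observations, with None standing in for Stata's missing value ‘.’. One point to bear in mind in Stata itself: a missing value is treated as larger than any number, so a comparison such as wealth >= 1000000 is true when wealth is missing unless missing values are excluded explicitly.

```python
# Hypothetical observations; None plays the role of Stata's missing value '.'.
people = [
    {"age": 34, "women": 1, "wealth": 2_500_000},
    {"age": 51, "women": 0, "wealth": 40_000},
    {"age": 27, "women": 1, "wealth": None},
]

for p in people:
    # age2: square of age, defined for every observation
    p["age2"] = p["age"] ** 2
    # agewomen: age for women, missing where the condition fails
    p["agewomen"] = p["age"] if p["women"] == 1 else None
    # rich: 1 or 0 when wealth is known, missing when wealth is missing
    if p["wealth"] is None:
        p["rich"] = None
    else:
        p["rich"] = int(p["wealth"] >= 1_000_000)

for p in people:
    print(p)
```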

4.2.2.4 Changing Existing variables

Existing variables can be changed by the syntax below;

replace oldvar = expression [if expression]

or by double clicking on the variable name in the data editor to open the variable properties dialog box, as shown in Fig. 4.20, and typing the new variable name in the name edit box.

Figure 4.20: Variable Properties

The command egen extends the functionality of generate. For example

egen average = mean(income)

creates a new variable containing the (constant) mean income for all observations. See the last

section for some available functions.

Both the generate and the egen command allow the by varlist prefix which repeats the command

for each group of observations for which the values of the variables in varlist are the same. For

example,

sort nationality

by nationality: egen referenceinc = mean(income)

generates the new variable referenceinc containing for each observation the mean income of all

observations of the same nationality. Note that the data has to be sorted by nationality

beforehand.
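What the by prefix does with egen can be sketched in Python: the mean is computed once per group and then attached to every observation in that group, so the dataset keeps its original number of rows. The observations and nationality codes below are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations.
observations = [
    {"nationality": "GH", "income": 100.0},
    {"nationality": "NG", "income": 300.0},
    {"nationality": "GH", "income": 200.0},
    {"nationality": "NG", "income": 500.0},
]

# First pass: collect the incomes observed for each nationality.
incomes = defaultdict(list)
for obs in observations:
    incomes[obs["nationality"]].append(obs["income"])

# Second pass: attach the group mean to every observation of that group.
for obs in observations:
    group = incomes[obs["nationality"]]
    obs["referenceinc"] = sum(group) / len(group)

for obs in observations:
    print(obs)
```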

The recode command is a convenient way to exchange the values of ordinal variables:

recode var (rule1) [(rule2)]

e.g. recode gender (1=0) (2=1) will produce a dummy variable.

The following system variables (note the leading ‘_’) may be useful:

_n contains the number of the current observation.

_N contains the total number of observations in the dataset.

_pi contains the value of pi to machine precision.

A lagged variable can be created in the following way: First define a time series index. Second

declare the data a time series. For example this can be done with the commands

generate t = _n /* generate a variable with values 1...N */

tsset t /* declare the time series */

Lagged values can now be designated as L.varname. For example L.gdp designates a lagged

value of the variable gdp, L2.invest designates the variable invest lagged twice.
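A lag simply shifts the series down by one or more periods, leaving missing values at the start. The following Python sketch illustrates the idea with a made-up gdp series. The helper name lag is hypothetical, it assumes the number of periods k is not negative, and None stands in for Stata's missing value.

```python
def lag(values, k=1):
    """Return the series lagged by k periods; the first k entries are missing."""
    n = len(values)
    k = min(k, n)  # a lag longer than the series leaves only missing values
    return [None] * k + list(values[:n - k])


gdp = [100, 104, 109, 115]  # hypothetical series, ordered by the time index t

print(lag(gdp))      # counterpart of L.gdp
print(lag(gdp, 2))   # counterpart of L2.gdp
```

The lagged series always has the same length as the original, which is why the observations must be ordered by the time index before lags are taken.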

4.2.2.5 Labelling Values

The command label values attaches a value label to a variable. If no value label is specified, any existing value label is detached from that variable. The value label, however, is not deleted. Value labels may be up to 32,000 characters long.

To define the value label yesno, use the syntax:

label define yesno 1 "no" 2 "yes"

meaning that the value 1 is labelled “no” and the value 2 is labelled “yes”. Remember that value labels may include many associations and typing them all on one line can be ungainly or impossible.

4.2.2.6 Deleting variables

You can delete variables from the dataset by either specifying the variables to be dropped or to

be kept:

drop varlist

keep varlist

You can delete observations from a dataset by specifying the observations to be dropped (or kept), either by a logical expression or by specifying the first and last observation:

drop [if expression] [in range first/last]

keep [if expression] [in range first/last]

4.2.2.7 Sorting Variables

Arrange the observations of the current dataset in ascending order with respect to varlist

sort varlist

Change the order of the variables in the current dataset:

order varlist

by specifying a list of variables to be moved to the front of the dataset. You can convert the data into a dataset of the means (or other statistics, see help) of varlist. varname specifies the groups over which the means are calculated.

collapse varlist, by(varname)

A description of the variables in the dataset is produced by describe and codebook [varlist].

4.2.2.8 Commands and Variables

It is possible to scroll through past commands by using the page up and page down buttons on

your keyboard. Alternatively you can double click on a command in the Review window and it

will appear in your Command window. Similarly you can click on any variable that appears in

the Variables window and they will appear in the Command window (or wherever the Target in

the Variables window specifies).

4.2.2.9 Command Interface

There have been some significant changes in STATA. One of the main ones is that it now has a Statistics Menu in the style of SPSS. This enables the user to select an item from a pull-down menu, which opens a dialogue box in which you can build STATA commands. The details of this menu-driven way of analysing data are discussed in the first part of this chapter, i.e., the introduction to SPSS. Users of STATA are encouraged to learn the commands so that they can write do-files and programs.

However, one point that may be useful: The command issued by the dialogue box is submitted as

if you typed it by hand. Therefore if you cannot remember the syntax of a command, using the

dialogue box and then checking the command in the Review window is a good way to get a

reminder.

4.2.2.10 File Extensions

Data file filename.dta

Do file filename.do (program file)

Dictionary file filename.dct

Log file filename.smcl (only readable in Stata)

Log file filename.log (text file)

4.2.2.11 Opening Files

Most of the commands discussed below can also be run from the toolbar or the menus; however, in this document the syntax of typed commands is discussed.

To open a file, use the following syntax:

use filename, clear

use varlist using filename, clear [for a subset of the data file]

In some cases you may get the message no room to add more observations or no room to add

more variables. This is because not enough memory has been assigned to STATA.

4.2.3 Descriptive Statistics In Stata

In terms of statistics, Stata provides all of the standard univariate, bivariate and multivariate

statistical tools, from descriptive statistics and t-tests through one-, two- and N-way ANOVA,

regression, principal components, and the like. Stata’s regression capabilities are full-featured,

including regression diagnostics, prediction, robust estimation of standard errors, instrumental

variables and two-stage least squares, seemingly unrelated regressions, vector autoregressions

and error correction models, etc. It has a very powerful set of techniques for the analysis of

limited dependent variables: logit, probit, ordered logit and probit, multinomial logit etc.

Just like in SPSS, the basic features of any data in Stata can be presented in the form of:

Graphical displays

Tabular descriptions

Summary statistics

Linear regressions

4.2.3.1 Graphical displays

Stata graphics are excellent tools for exploratory data analysis, and can produce publication-quality 2-D

graphics in several dozen different forms. Every aspect of the graphics may

be programmed and customized, and new graph types and graph “schemes” are being

continuously developed. The programmability of graphics implies that a number of similar

graphs may be generated without any “pointing and clicking” to alter aspects of the graphs. Stata

does not have 3-D graphics capabilities, but those are under development in the new graphics

system.

To draw a scatter plot of the variables yvar1 yvar2 ... (y-axis) against xvar (x-axis): the syntax is

scatter yvar1 yvar2 ... xvar

To draw a line graph, i.e. scatter with connected points

line yvar1 yvar2 ... xvar

To draw a histogram of the variable var

histogram var

To draw a scatter plot with regression line:

scatter yvar xvar || lfit yvar xvar
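As an illustration, the four graph commands above can be run on the auto example dataset installed with Stata. The variables mpg, weight and price belong to that dataset and are used here only as stand-ins for your own variables.

```stata
sysuse auto, clear

* Scatter plot of mileage (y-axis) against weight (x-axis)
scatter mpg weight

* Line graph; the sort option orders the points by the x variable first
line mpg weight, sort

* Histogram of price, with frequencies rather than densities on the y-axis
histogram price, frequency

* Scatter plot with a fitted regression line overlaid
scatter mpg weight || lfit mpg weight
```

Each command replaces the graph currently shown in the Graph window. Without the sort option, line joins the points in the order in which they appear in the data, which usually produces a tangled plot.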

4.2.3.2 Summary Statistics

To display univariate summary statistics of the variables in varlist: type

summarize varlist

at the command prompt
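A short sketch using the auto example dataset installed with Stata (the variable names belong to that dataset, not to the course data). The detail option and the stored results r(mean) and r(p50) are standard features of summarize.

```stata
sysuse auto, clear

* Observations, mean, standard deviation, minimum and maximum
summarize price mpg

* Add percentiles, skewness and kurtosis for a single variable
summarize price, detail

* Results of the last summarize are stored and can be reused
display r(mean)
display r(p50)
```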

4.2.3.3 Tabular descriptions

Report the frequency counts of varname:

tabulate varname [if expression] [, missing]

The missing option requests that missing values are reported.

To display the correlation or covariance matrix for varlist, use the syntax

correlate varlist

To produce a two-way table of absolute and relative frequency counts along with Pearson's chi-square statistic:

tabulate var1 var2, col chi2

To perform a two-sample t-test of the hypothesis that varname has the same mean within the two

groups defined by the dummy variable groupvar

ttest varname [if exp], by(groupvar) [unequal]

where the option unequal indicates that the two-sample data are not to be assumed to have equal

variances.
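The tabulation, correlation and t-test commands in this section can be sketched on the auto example dataset installed with Stata. Here foreign is that dataset's 0/1 dummy variable and rep78 a repair-record variable with some missing values; both are stand-ins for your own variables.

```stata
sysuse auto, clear

* One-way frequency table
tabulate foreign

* One-way table that also reports missing values
tabulate rep78, missing

* Two-way table with column percentages and Pearson's chi-square
tabulate rep78 foreign, col chi2

* Correlation matrix
correlate price mpg weight

* Two-sample t-test without assuming equal variances
ttest mpg, by(foreign) unequal
```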

4.2.3.4 Regression

To regress a dependent variable depvar on a constant and one or more independent variables in

varlist use

regress depvar [varlist] [if exp] [, level(#) noconstant]

The if qualifier limits the estimation to a subsample specified by the logical expression exp. The noconstant option suppresses the constant term.

level(#) specifies the confidence level, in percent, for confidence intervals of the coefficients. See help regress for more options.

You can access the estimated parameters and their standard errors from the most recently estimated model:

_b[varname] (or, equivalently, _coef[varname]) contains the value of the coefficient on varname

_se[varname] contains the standard error of the coefficient

Stata calculates predictions from the previously estimated regression by

predict newvarname [, stdp]

The stdp option provides the standard error of the prediction.

[post-estimation commands: predict, cve, ...]
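A minimal sketch of estimation, stored results and prediction, using the auto example dataset installed with Stata. In Stata the stored coefficient and standard error are referenced as _b[varname] and _se[varname]. The new variable names price_hat and price_hat_se are hypothetical names chosen for illustration.

```stata
sysuse auto, clear

* Regress price on mileage and weight (a constant is included by default)
regress price mpg weight

* Stored coefficient and standard error for mpg
display _b[mpg]
display _se[mpg]

* Same model for domestic cars only, with 90% confidence intervals
regress price mpg weight if foreign == 0, level(90)

* Fitted values and their standard errors from the last regression
predict price_hat
predict price_hat_se, stdp
```

Note that predict uses the most recent regression and, unless restricted with if, fills in predictions for every observation, including those left out of the estimation sample.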

4.2.3.4 Log Files

A log file keeps a record of the commands you have issued and their results during your Stata

session. You can create a log file with;

log using filename [, append replace text]

where filename is any name you wish to give the file. The append option simply adds more

information to an existing file, whereas the replace option erases anything that was already in the

file. Full logs are recorded in one of two formats: SMCL (Stata Markup and Control Language)

or text (meaning ASCII). The default is SMCL, but the option text can change that.

A command log contains only your commands

cmdlog using filename

Both types of log file can be viewed in the Viewer:

view filename

You can temporarily suspend, resume or stop logging with the commands:

log { on | off | close }

cmdlog { on | off | close }
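The logging commands can be sketched as a short session. The log name mysession is a hypothetical name, and the auto example dataset installed with Stata is used only to give the log something to record.

```stata
* Start a plain-text log, overwriting any earlier file of the same name
log using mysession, replace text

sysuse auto, clear
summarize price

* Suspend logging: the next command is not recorded
log off
describe

* Resume logging
log on
tabulate foreign

* Stop logging and close the file
log close

* Display the contents of the log in the Results window
type mysession.log
```

Stata reports an error if log using is issued while another log is already open, so close the earlier log first.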

4.2.3.5 Do-Files

A do-file is a set of commands, written just as you would type them one by one during a regular Stata

session. Any command you use in Stata can be part of a do-file. The default extension of do-files

is .do, which explains its name. Do-files allow you to run a long series of commands several

times with minor or no changes. Furthermore, do-files keep a record of the commands you used

to produce your results.

To create or edit a do-file, click on the new do-file icon in the toolbar. To run the file, save it in the

do-file editor and issue the command:

do mydofile

You can also click on the Do current file icon in the do-file editor to run the do-file you are

currently editing.

Comments are indicated by a * at the beginning of a line. Alternatively, what appears inside /* */

is ignored. The /* and */ comment delimiter has the advantage that it may be used in the middle

of a line. Appendix A shows some typical do files.
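A short do-file might look like the sketch below. It is only an illustration and is not one of the do-files in Appendix A; it uses the auto example dataset installed with Stata, and the file name mydofile.do is assumed to match the do mydofile command shown earlier.

```stata
* mydofile.do: a short illustrative do-file
* A line beginning with an asterisk is a comment

/* Text between these delimiters is ignored,
   even when it runs over several lines */

sysuse auto, clear
summarize price mpg      /* this form can follow a command on the same line */
regress price mpg weight
```

Saving these lines as mydofile.do in the working directory and typing do mydofile runs the three commands in order and shows their output in the Results window.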

Worked Example 1

You are required to use Stata to analyze data from the 5th Ghana Living Standards Survey

(GLSS5) on the use of electricity for lighting and traditional fuels for cooking.

1. Generate a table showing total household income, main source of lighting and main fuel

used for cooking for all households covered in the GLSS5.

Solution

Command:

tabstat category1 totalincome s7dq13 s7dq11,by(hhid) by(loc2) columns(variables)

where totalincome is generated from the data using:

gen totalincome=totemp+agric1c+agric2c+nsfey1+nsfey2+nsfey3+import+remitinc+otherinc

and category1 holds the income quintiles, which are obtained with the commands below:

pctile category = totalincome, nq(5)

xtile category1 = totalincome, cut(category)

A table like that in Figure 4.21 should appear in your Results window.

Figure 4.21: A table of total household income, main source of lighting and main fuel used for

cooking, for all households in Ghana.
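The two-step construction of income quintiles used in Worked Example 1 can be tried on the auto example dataset installed with Stata, with price standing in for totalincome. The variable names cutpoint, quintile_a and quintile_b are hypothetical. The last command shows a one-step alternative, xtile with the nq() option, which should give the same grouping.

```stata
sysuse auto, clear

* Step 1: store the four cutpoints that divide price into five groups
pctile cutpoint = price, nq(5)

* Step 2: assign each observation to a group (1 to 5) using those cutpoints
xtile quintile_a = price, cut(cutpoint)

* One-step alternative: let xtile compute the quintiles itself
xtile quintile_b = price, nq(5)

* Compare the two groupings
tabulate quintile_a quintile_b
```

The two-step form is useful when the cutpoints themselves are of interest, or when cutpoints computed on one sample are to be applied to another.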

Worked Example 2

Generate a bar chart to show the % distribution by income quintile for households in Ghana who

use electricity as their main source of lighting.

Solution

Command:

graph bar (count) totalincome if s7dq11==1, over(category1) asyvars percentages

Figure 4.22 shows the content of the output.

Figure 4.22: A bar chart of distribution of electricity as fuel for cooking, for the various income

quintiles in Ghana.

Self Assessment 4.2

Use Stata to analyze the data from the 5th Ghana Living Standards Survey (GLSS5),

accompanying this manual.

4.2.1 Generate a bar chart to show the % distribution by income quintile for households in

Ghana which use traditional energy sources (Wood, Charcoal, Crop Residue/Sawdust,

Animal Waste and Other) as their main fuel for cooking.

[Figure 4.22: bar chart titled "Distribution of electricity as cooking fuel for the various income quintiles in Ghana"; y-axis: Number of households; bars: Group One to Group Five. Source: Fifth Ghana Living Standards Survey.]

4.2.2 With the aid of tables and bar charts, show how access to LPG varies between rural

and urban areas for the Greater Accra region (GAR).

4.2.3 Still using tables and bar charts, show how access to LPG in rural and urban areas for

GAR compares with those for two other regions of your choice, both of which should

not be in the same ecological zone.

Learning Track Activities

Unit Summary

Statistical analysis software increases the accuracy and speed of data analysis, especially for complex data. Sound planning and policy-making depend on the requisite data analysis being done, and done correctly. SPSS and Stata are among the common statistical packages that can be used to analyse data such as census data.

SPSS for Windows is a menu-driven program, i.e. most functions are performed by selecting an option from one of the menus. Users have less control over statistical output than, for example, Stata or Gauss users.

Even though Stata can also be used as a menu-driven program, it is traditionally a command-line-driven package that operates in a graphical windowed environment. In Stata, researchers have greater control over the equations or the output.

Key terms/ New Words in Unit

1. Toolbar

2. SPSS/STATA

3. Data editor

4. GLSS5

Unit Assignments 4

Use Stata to analyze the data accompanying this manual, from the 5th Ghana Living

Standards Survey (GLSS5).

If a question involves producing table(s), submit your results and commands as a log file.

For problems that involve drawing graphs, write a do-file to draw those graphs.

1. With the aid of tables and pie charts, show the variation between rural and

urban usage of LPG for the whole country and compare with those for

Greater Accra Region and any two regions of your choice.

2. With the aid of tables and bar charts, show how access to main source of

lighting varies between rural and urban areas of Ghana as a whole.

3. Again using tables and bar charts, show how the main source of lighting in

rural and urban areas of Ghana compares with that for any one region of

your choice.

4. Still with the aid of tables and bar charts, show how access to electricity as

main source of lighting varies between the region of your choice and Ghana

as a whole for each total household income quintile in Ghana.

Unit 5

INTRODUCTION TO JOURNAL ARTICLES,

CONFERENCE PAPERS AND THESES WRITING

Introduction

Research findings are meant to be published so as to add to the body of knowledge in that

particular field of study. Research reports/papers, theses, journal articles and conference papers

are the widely used means of publishing research findings for the benefit of all interested parties.

This unit will guide students through the preparation of research reports/papers, theses, journal

articles and conference papers with proper referencing.

Learning Objectives

After reading this unit you should be able to:

1. Write journal articles and conference papers

for publication.

2. Prepare a full research report or thesis with

proper referencing.

UNIT CONTENT

SESSION 5.1: RESEARCH AND THESIS REPORTS

5.1.1 Thesis Report Writing

5.1.2 Research Report Writing

SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION

SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING

5.3.1 Abstracts and Summaries

5.3.2 Referencing

5.3.3 Referencing Formats

5.3.4 Introduction to referencing software packages

SESSION 5.1: RESEARCH AND THESIS REPORTS

5.1.1 Thesis Report Writing

A thesis is a document submitted in support of candidature for an academic degree or

professional qualification in which the author's research and findings are presented. It is a

demonstration of a graduate student's ability to explore, develop, and organize materials

relating to a certain topic or problem in a field of study. The main aim of a thesis or project

is not only to pursue research and investigation, but also to write an extended scholarly

statement clearly, effectively and directly addressing the research problem.

Title page

The title page gives the full title of the thesis, the name and previous qualification(s) of the author, the Department and Faculty to which the thesis is being submitted, the degree for which it is submitted in partial fulfilment of the requirements, and the month or year of presentation.

Abstract

An abstract is a brief summary of the thesis and the most likely part of the thesis to be

widely published and read. It should have a concise description of the problem addressed,

the methodology used, the results as well as conclusions. The abstract should usually be

composed as a single paragraph not exceeding 500 words.

Table of contents

This clearly lists the chapters and subchapters of the thesis, together with the pages where they are located.

List of figures, List of tables, List of Acronyms/Abbreviations

Where figures and tables are used, a list of figures and a list of tables must be provided, showing the pages where the various figures and tables are located.

A list of the acronyms/abbreviations used, with full explanations, must also be provided in this section.

Prefatory matter

Material such as the preface, foreword and acknowledgements may be presented in this section. The acknowledgement page is, however, mandatory.

Introduction

The introduction provides background information as well as the rationale for the

research work. It also provides information related to the need for the research and in the

process builds an argument for the research and presents research question(s) and aims.

The introduction should also give a detailed description of the various chapters as well as

their contents.

Literature Review

The literature review should provide a detailed account of research works done by other

researchers in the selected area of study, highlighting the merits as well as limitations.

Referencing is particularly important in this section because it consists mostly of work by other researchers, and it is here that plagiarism most often becomes an issue. It is also important to discuss theory that is directly relevant to your research in this section.

Methodology

This section of the thesis presents an understanding of the philosophical framework within

which the research will be carried out and gives the methodological approach as well as a

justification of the chosen methodology. This section should also clearly define the

boundaries of the research in terms of methodological approach and describe steps taken to

ensure ethical research practice.

Results and discussion

Research findings should be clearly reported in this section. Figures and tables should be used

where necessary to provide clarity. This section describes the observations made during the

research and the interpretations given to them. Results may be presented in tables, figures or both, and should be clearly explained. Note that information presented in tables should not be repeated in figures, and vice versa.

The discussions could also include references to contemporary literature in the area of the

subject being studied.

Conclusions and recommendations

This section draws all the important arguments and findings together, giving the reader a strong sense that the work has been done satisfactorily and that it was worthwhile. It provides summaries of the major findings and presents their limitations as well as their implications. It is important to end this section on a strong note by suggesting directions for future research in the respective field.

References

This comprises a list of the major works (publications and authorities) consulted in the

course of writing the thesis. See the reference sections of these notes for more details of the

various referencing styles.

Appendices

An appendix provides a place for important information which, if placed in the main text, would distract the reader from the flow of the argument. Appendices may include examples of raw data and reorganised data (e.g. a table of interview quotes organised around themes). They may be named, lettered or numbered; decide on the convention early.

5.1.2 Research Report Writing

Title

The title should be concise, attract attention, and highlight the main point of your paper.

It should be clear about the subject matter and devoid of abbreviations.

Abstract

The abstract is a concise summary of the paper and should be able to tell the reader

whether the paper is worth reading or not. It should therefore be as informative as

possible with respect to the objectives, methodology, results as well as the conclusions. It

should mostly not exceed 300 words.

Introduction

The introduction to a research paper should be as brief as possible and should touch on

background of the research problem, a clear justification of why the research is being

undertaken and also the underlying theory and hypothesis. It should contain a short

review of literature in the field of study and should be limited to a maximum of two

pages.

Materials and Methods

This section of the paper should describe concisely the procedure used to undertake the research, such that anyone wishing to replicate the study can do so and obtain comparable results. It should be detailed enough to remove all ambiguity regarding the design of the research and the analysis of the results. Where known methodologies are used, however, the details can be omitted and the methods cited in the references instead, but any modifications to known methodologies should be clearly explained.

Results and Discussions

This section describes the observations made during the research and the interpretations given to them. Results may be presented in tables, figures or both, and should be clearly explained. Note that information presented in tables should not be repeated in figures, and vice versa. The discussion may also include references to contemporary literature on the subject being studied.

Conclusions and Recommendations

The conclusions drawn from the results of the research should be briefly and clearly

outlined and the importance of these conclusions should also be stated. All conclusions

should be supported by data presented in the research findings. This section should also

contain recommendations for future research in the respective field of study.

References

The report should include a bibliography or list of literature cited, consisting of

references to original literature relevant to the area of inquiry. It must include, but is not

limited to, all works cited in the text. Students should follow the approved departmental

style manual for the format of the reference.

Self Assessment 5.1

SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER

PREPARATION

A journal article, sometimes referred to as a scientific article, a peer-reviewed article or a scholarly research article, is the means by which a scholar puts forward the results of academic research, or other information, to add to the body of knowledge in their field of study; it is usually published in a journal. Conference papers are similar to journal articles, except that they are delivered at conferences.

Guidelines for preparing journal articles and conference papers vary from journal to journal and from conference to conference, but there is a basic format common to most of them. This includes the title, the name(s) and affiliation(s) of the researcher(s), an abstract, and an introduction made up of background information, problem statement, objectives and justification of the research topic. The introduction should also give a general overview of the whole paper.

5.2.1 Title

The title should be concise, attract attention, and highlight the main point of your paper. It should

be clear about the subject matter and devoid of abbreviations.

5.2.2 Authors

The list of authors with their institutional affiliation should be presented immediately after the

title. It should be ordered according to the level of contribution to the paper with the lead

contributor/principal author’s name listed first.

5.2.3 Abstract

It is important to provide an abstract of about 350 words which summarises the entire paper, highlighting the most important information such as the purpose of the research, the methodology used, the results and the conclusions.

5.2.4 Introduction

The introduction should provide a background to the research, state the problem briefly and

clearly outline the objectives of the research. It should

5.2.5 Methodology

The methodology tells how the research was conducted. It is important to describe in detail the various processes involved in carrying out the research, with illustrations if possible.

5.2.6 Results and discussions

The results of the research should be presented in this section in the clearest form possible, whether as text, figures or tables. It is also important to use text to provide essential information on figures and tables, and to define all terms used in the text, figures and tables.

5.2.7 Conclusions and recommendations

State directly and briefly your conclusions and the utility of these conclusions. All conclusions

should be supported by data presented in the paper. Present your recommendations also in this

section of the paper.

5.2.8 References

References should be listed in alphabetical order at the end of the text in this section.

Self Assessment 5.2

SESSION 5.3: ABSTRACTS AND SUMMARIES AND

REFERENCING

5.3.1 Abstracts and Summaries

An abstract is a brief summary of a research article, thesis, review, conference proceeding

or any in-depth analysis of a particular subject and is mostly used to help the reader

quickly grasp the purpose of the paper. An abstract always appears at the beginning of a manuscript, acting as the first port of call for any given academic paper. It usually contains between 300 and 500 words.

A summary is an abbreviated version of the most significant points in a book, article,

report or meeting. It is usually about 5% to 15% of the length of the original. It is useful

because it condenses material, informing the reader of the original’s most important points.

The commonest type of summary is the executive summary, which is mostly used for business and management purposes. It differs from an abstract in that an abstract is usually shorter, providing a neutral overview or orientation rather than a condensed version of the full document. Abstracts are extensively used in academic research, where the concept of the executive summary would be meaningless.

5.3.2 Tables and Figures

Research data and results are mostly presented in tables and figures. Tables present lists of numbers or text in columns, each column having a title or label, whereas figures are visual presentations of results, including graphs, diagrams, photos, drawings, schematics, maps, etc. Graphs are the most common type of figure and will be discussed in detail. When

figures and tables are used in a manuscript, they must be referred to from the text. It is

important to use sentences that draw the reader's attention to the major issues to be

highlighted by referring to the appropriate figure or table. They must also be properly

captioned for clarity.

5.3.2 Referencing

A reference, as defined by De Montfort University, is the detailed bibliographic description of the item from which information is gained. The basic idea behind referencing is to support and identify the evidence you use in your research work. It helps to direct readers of your work to the source of the evidence. References can be presented in two ways: in-text, where the source is briefly cited within the text, and/or in the reference list, where it is given in full at the end of the work. Items read for background information but not referred to in the text are usually given in full at the end of the work in a list sometimes referred to as the bibliography. In short, references should:

• Enable the reader to locate the sources used for a research work

• Help support arguments and add credibility to research work

• Show the scope and breadth of a research work

• Acknowledge the source of an argument or idea by acknowledging the various authors so

as to avoid plagiarism

5.3.3 Referencing Formats

There are many referencing styles, which can be grouped under two main headings: the in-text name style and the numeric referencing system.

In-Text Name Style

In-text name style involves citing the name(s) of the author(s) or organization(s) in the text with the year of publication. All the sources are then listed in alphabetical order at the end of the work under one of these headings, depending on the style used: ‘References’, ‘Reference list’, ‘Works cited’, ‘Works consulted’ or ‘Bibliography’. Examples of this style of referencing include the Author-date (Harvard) style, American Psychological Association (APA) style, Modern Language Association of America (MLA) style, Chicago style, Modern Humanities Research Association (MHRA) style and the Council of Science Editors (CSE) style (Neville, 2010). The Author-date (Harvard) style is discussed in detail here, but students can read more on the other referencing styles.

Author-date (Harvard) Style

This style, when used in the text, cites the last, family name or surname of the author(s), or

organizational name and the year of publication in the text of the document being worked on.

A full list of all references in alphabetical order must be given at the end of the text. It is

however important to ensure that the name used in the in-text citation connects with the name

used to start the full reference entry at the end of the text.

E.g. In-Text Citation style

1. There would appear to have emerged by the end of the twentieth century two broad

approaches to the management of people within organizations (Handy 1996).

2. Handy (1996) argues that by the end of the twentieth century two broad approaches to

the management of people within organizations had emerged.

3. Some commentators, for example, Handy (1996), have argued that by the end of the

twentieth century two broad approaches to the management of people within

organizations had emerged.

4. It has been argued, (Handy 1996; see also Brown 1999 and Clark 2000), that two

approaches to the management of people within organizations had emerged by the

end of the twentieth century.

5. Charles Handy, amongst others, has argued that by the end of the twentieth century

two broad approaches to the management of people within organizations could be

observed (Handy 1996).

Full Reference Citation

1. Book Reference

AUTHOR(S) (Year) Title. Edition – (if not the 1st). Place of publication: Publisher.

E.g.

o WILMORE, G.T.D. (2000). Alien plants of Yorkshire. Kendall: Yorkshire

Naturalists’ Union.

o LI, X. and CRANE, N.B. (1993) Electronic style: a guide to citing electronic

information. London: Meckler.

2. Books with one or more editor(s)

EDITOR(S) (ed./eds.) – (Year) Title. Edition. Place of Publication: Publisher

E.g. SAUNDERS, M. (ed.) (1998). Advances in food science. Waterford: Nore Press.

3. Chapters in books

AUTHOR(S) (Year) Title of chapter. In: AUTHOR(S)/EDITOR(S), ed(s). Book title.

Edition. Place of publication: Publisher, Pages (use p. or pp.)

e.g. TUCKMAN, A. (1999) Labour, skills and training. In: LEVITT, R. et al, (eds.) The reorganised National Health Service. 6th ed. Cheltenham: Stanley Thornes, pp. 135-155.

4. Publications from a corporate body (e.g. Government publications)

NAME OF ISSUING BODY (Year) Title. Place of publication: Publisher, Report no.

(where relevant), Pages, use p. or pp.

e.g. ENERGY COMMISSION OF GHANA (2006). Strategic National Energy Plan

2006 – 2020 and Ghana Energy Policy Main version. Ghana: Energy Commission.

5. Journal articles

AUTHOR(S) (Year) Title of article. Title of journal, Volume number (Part no./Issue/Month), Pages, use p. or pp.

e.g. RYAN, J. (2006) ‘Management accounting for developers’, Journal of advanced accounting, Vol. 1, No 5: pp. 21-24

6. Papers in conference proceedings

AUTHOR(S) (Year) Title. In: EDITOR(S) Title of conference proceedings. Place and

date of conference (unless included in title). Place of publication: Publisher, Pages,

use p. or pp.

e.g. GIBSON, E.J. (1977) The performance concept in building. In: Proceedings of the 7th CIB Triennial Congress, Edinburgh, September 1977. London: Construction Research International, pp. 129-136.

7. Electronic sources

AUTHOR(S) (Year) Title of document [Type of resource, e.g. CD-ROM, e-mail, WWW] Organization responsible (optional). Available from: web address [Date accessed].

e.g. UNIVERSITY OF SHEFFIELD LIBRARY (2001) Citing electronic sources of information [WWW] University of Sheffield. Available from: http://www.shef.ac.uk/library/libdocs/hsl-dvc1.pdf [Accessed 23/02/07].

Numeric Referencing System

This system comprises mainly two referencing styles, namely the consecutive numbering style and the recurrent numbering style.

Consecutive numbering uses superscript numbers in the text that connect with references in either footnotes or chapter endnotes (but usually the former). This system uses a different, consecutive number for each reference in the text. A list of sources is included at the end of the document, which lists all the works referred to in the notes (‘References’, ‘Works cited’) (Neville, 2010).

Recurrent numbering style uses bracketed (or superscript) numbers in the text that connect

with a list of references at the end of the chapter/assignment. In this case, the same number

can recur if a source is mentioned more than once in the text. (Neville, 2010)

5.3.4 Introduction to referencing software packages

Referencing software, also referred to as bibliographic management software, is designed to help you store the references you have located and then cite them in an essay, paper, thesis or book that you are writing. It helps one to:

Create a bibliography for a thesis, assignment or journal article in a preferred citation style

Download and store references

Include abstracts, keywords and notes with the references and also full texts

Produce lists of references for yourself or others

Automatically insert citations of references while typing (Cite While You Write)

Create a bibliography while typing (Cite While You Write)

Examples of Referencing Software Packages

There are numerous referencing software packages, but the commonest are EndNote, EndNote Web and Reference Manager.

Endnote

EndNote is a commercial reference management software package, used to manage

bibliographies and references when writing essays and articles. EndNote is probably the most

sophisticated referencing product available today and can perform a wide range of referencing

tasks. There are extensive possibilities for the advanced user to customize the software to

individual needs. It can also be used to:

Create a personal database or library of reference information

Download and organise references and associated images and PDF files

Insert citations and figures in documents and create bibliographies

Import bibliographic data from external databases and library catalogues

EndNote is compatible with recent versions of Microsoft Word (Windows and Macintosh) and

installs an add-in for easy integration with your word processing software. It is used most

effectively from the start of a project, when information is being resourced, rather than when

writing up begins.

Endnote Web

EndNote Web is a simplified version of the full desktop EndNote product. It has only recently

been released and is still under development, but it can perform many common referencing tasks.

EndNote Web is compatible with recent versions of Microsoft Word (Windows and Macintosh).

One must download and install a plug-in to enable EndNote Web to work with Word. Once

registered for Endnote Web one can:

Format citations and footnotes or a bibliography

Use ‘Cite While You Write’ in Microsoft Word to easily cite references in your paper.

Download the ‘Cite While You Write’ plug-in for Word from the EndNote Web site.

Transfer references to and from EndNote on your desktop

Share references with others who have EndNote Web

Reference Manager

Reference Manager is most commonly used by people who want to share a central database of references and need multiple users to add and edit records at the same time. You can specify whether users are allowed to edit the database. Reference Manager offers different in-text citation templates for each Reference Type. It is, however, limited to Windows operating systems. Use Reference Manager to:

Create a personal database of reference information

Insert citations and create bibliographies

Import bibliographic data from external databases and library catalogues

Reference Manager is used most effectively from the start of a project, when information is

being gathered, rather than when writing up begins.

Further details about the features of Reference Manager are available on the Reference Manager website, along with an online overview of the new features of Reference Manager 12.

Self Assessment 5.3

Learning Track Activities

Unit Summary

Communicating research findings to interested stakeholders is very important, since research is usually carried out to address a specific issue in society. Journal articles and conference papers are among the most common ways of communicating research findings to stakeholders.

Theses and research papers are a more detailed way of disseminating research findings, with the former directed more towards academics or scholars. Proper referencing is also very important when putting these documents together, in order to avoid plagiarism.

Key terms/ New Words in Unit

Abstracts

Bibliography

EndNote

Journal articles

Referencing

Summaries

Unit Assignments 5

COURSE SUMMARY

The course is organised under five units. Introduction to research proposal writing and thesis synopsis development is treated in Unit 1, while engineering research design and data analysis is treated in Unit 2. Unit 3 looks at social science research design and data analysis, with Unit 4 concentrating on statistical analysis using SPSS and STATA. Finally, Unit 5 introduces the concept of journal article/conference paper writing and thesis report preparation.

Unit 1 introduced students to the preliminary stages of research, which involve the preparation of concept notes that give a brief idea of the nature of the research. It also tackled the preparation of a full research proposal, including logical framework analysis and detailed budget preparation. The unit ended with an introduction to thesis synopsis writing.

Unit 2 dealt with the rudiments of engineering research design and data analysis. It covered the various contexts in engineering practice which necessitate research, the classification of experiments that may be undertaken as part of the research, and procedures for the design of experiments. It went on to treat error theory and the various sources of research errors. The unit also treated the concept of probability theory.

Unit 3 covered social science research design and data analysis, looking at various research methodologies including survey research and case study research. The unit also treated some basic research ethics, including balancing costs and benefits in research.

Unit 4 introduced statistical analysis software packages and their importance in increasing the accuracy and speed of analysis, especially for sophisticated data. It went on to indicate that planning and good policy-making can only be done accurately if the requisite data analysis is done, and done correctly. SPSS and STATA are among the common statistical analysis software packages that can be used to analyse data such as census data.

Unit 5 dealt with putting together all the work done during the research into a document for dissemination. It introduced journal article/conference paper writing, research report/thesis writing and abstracts/summaries. The unit ended with a brief discussion of the various referencing styles and a more elaborate explanation of the Harvard style of referencing.

APPENDIX A1

KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY

THESIS SYNOPSIS

NAME: EMMANUEL YEBOAH OSEI

INDEX NO: PG2678108

PROGRAMME: MSC MECHANICAL ENGINEERING (THERMOFLUIDS)

DEPARTMENT: MECHANICAL ENGINEERING

FACULTY: AGRICULTURAL AND MECHANICAL ENGINEERING

DURATION: TWO YEARS (FULL TIME)

TITLE: FEASIBILITY STUDY FOR A WIND POWER GENERATION PROJECT IN GHANA

SUPERVISOR: PROF. ABEEKU BREW-HAMMOND

……………………… E. Y. OSEI (CANDIDATE)

…………………………… DR. A. K. SUNNU (HEAD OF DEPARTMENT)

…………………………. PROF. BREW-HAMMOND (SUPERVISOR)

August 2009

BACKGROUND

Modern life would come to a halt without energy, and this makes it simply impossible to live without it. Studies have shown, for example, that simply harnessing the power of oxen in ancient times increased the power available to the human being by a factor of 10 (World Energy Council et al., 2000). The invention of the vertical water wheel increased human productivity by another factor of 6 (World Energy Council et al., 2000). The use of motor vehicles and airplanes has drastically reduced journey times and increased the ability of humans to transport goods over wider distances. Because energy is the foundation of industrial civilization and conventional fossil sources are being depleted, the world must seek alternative sources to meet the increasing demand.

Renewable energy sources are becoming increasingly attractive due to the limited fossil

reserves and the adverse effects associated with their use. They have the potential to provide

energy with zero or almost zero emissions of greenhouse gases and other air pollutants. The

renewable energy sources including solar, wave, wind, hydro, tidal, geothermal and bio-energy

are readily available and can provide complete energy security if their technologies are well

established (REN21, 2008).

Wind energy, first used by the Egyptians around the 4th century BC, is a promising source of electrical power because it has key advantages such as cleanliness, low cost, sustainability, popularity, safety and abundance in most parts of the world. Studies in Ghana indicate that the monthly average wind speed measured at a height of 12 m above ground level lies in the range of 4.8 – 5.5 m/s (Akuffo et al., 2003). For a wind speed of less than or equal to 4.4 m/s at a height of 10 m, the wind power density is less than or equal to 100 W/m² according to Li and Li (2005). Despite this potential, the electrification rate in Ghana is 49.2% and 11.3 million people are without electricity (IEA, 2006). The productivity of this large number of people is seriously compromised, and this constrains their opportunities for economic development and improved living standards. This project seeks to assess the technical performance and determine the cost of building a 50 MW wind power plant in Ghana.
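The link between a mean wind speed of 4.4 m/s and a power density of about 100 W/m² can be illustrated with a short calculation. The sketch below is only an illustration and is not the method of Li and Li (2005): it assumes a standard air density of 1.225 kg/m³ and wind speeds that follow a Rayleigh distribution, neither of which is stated in the synopsis, and it ignores the difference between the 10 m and 12 m measurement heights. The function names are hypothetical.

```python
import math

# Standard sea-level air density in kg/m^3 (an assumption, not from the synopsis).
AIR_DENSITY = 1.225


def power_density(speed_ms: float, rho: float = AIR_DENSITY) -> float:
    """Wind power density in W/m^2 at a steady speed: 0.5 * rho * v^3."""
    return 0.5 * rho * speed_ms ** 3


def mean_power_density_rayleigh(mean_speed_ms: float, rho: float = AIR_DENSITY) -> float:
    """Mean power density in W/m^2 for Rayleigh-distributed wind speeds.

    For a Rayleigh distribution the mean of v^3 equals (6 / pi) times
    the cube of the mean speed.
    """
    return (6.0 / math.pi) * power_density(mean_speed_ms, rho)


if __name__ == "__main__":
    # 4.4 m/s is the threshold quoted in the text; 4.8 and 5.5 m/s bound the Ghana range.
    for speed in (4.4, 4.8, 5.5):
        print(f"{speed} m/s -> {mean_power_density_rayleigh(speed):.0f} W/m^2")
```

Under these assumptions a mean speed of 4.4 m/s gives a mean power density just under 100 W/m², so the 4.8 – 5.5 m/s range quoted for Ghana would lie above that threshold.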

JUSTIFICATION

The need to ensure security of electricity supply first came to light in the 1980s, when Ghana suffered a major drought that reduced inflows to the Akosombo Dam. This disrupted electricity supplies and adversely affected the performance of the economy. Today, Ghana faces the challenge of providing reliable energy to meet rapidly growing demand from all sectors, driven by the expanding economy and growing population. It has been estimated that grid electricity demand would grow from about 6,900 GWh in 2000 to about 18,000 GWh by 2015, reaching about 24,000 GWh by 2020 (Energy Commission, 2006). The existing installed electricity generating capacity of 1,760 MW would have to be doubled by the year 2020 if Ghana is to be assured of a secure, uninterrupted electricity supply (Energy Commission, 2006). To become wealthy as a country, Ghana needs its GDP to grow at between 8 and 10%, and such growth rates require a significant amount of electricity (Brew-Hammond et al., 2007).
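The demand projections quoted above imply an average annual growth rate, which can be checked with the compound annual growth rate formula. The sketch below is illustrative only; the function name `cagr` is hypothetical, and the demand figures are those attributed to the Energy Commission (2006) in the text.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values separated by `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Grid electricity demand in GWh, as quoted in the synopsis.
    print(f"2000 to 2015: {cagr(6900, 18000, 15):.1%} per year")
    print(f"2000 to 2020: {cagr(6900, 24000, 20):.1%} per year")
```

Both projections correspond to average demand growth of roughly 6 to 7% per year.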

Wind power use and development worldwide is growing rapidly, having doubled in the three years between 2005 and 2008. The global wind industry installed close to 20,000 MW of new capacity in 2007. This development, led by Spain, China and the United States, took the worldwide total to 93,864 MW. The new capacity was an increase of 31% compared with the 2006 market and represented an overall increase in global installed capacity of about 27% (GWEC et al., 2008). In 2008, wind power accounted for 19% of the electricity production in Denmark, 10% in Spain and Portugal, and 7% in Germany and the Republic of Ireland. At the end of that same year, the worldwide nameplate capacity of wind-powered generators was 120.8 GW (Wikipedia, 2009). These success stories attest to the efficacy of wind power technology as a viable option for providing energy and reducing environmental pollution.

The installation of a 50 MW wind power plant in Ghana would augment the existing sources of electricity in the country, which are mainly thermal and hydro. This would, to some extent, help to ease the worsening energy situation in the country. Wind energy, being a renewable source, can provide energy in a sustainable manner and with virtually zero emission of pollutants and greenhouse gases.

The Energy Commission of Ghana in 2003 conducted a study to gather and analyze wind

energy data in some areas of the country (Akuffo et al., 2003). This data would help determine

the wind turbine technology to use and the estimate of the cost required for installation.

OBJECTIVES

The main objective of this thesis is to conduct a feasibility study of generating 50 MW from

wind energy in the coastal areas of Ghana.

The specific objectives are listed as follows:

1. To collate up-to-date wind measurements for Ghana’s coastal belt.

2. To select the area best suited for wind power development along the coastal belt of

Ghana based on the collated data.

3. To select wind turbine technology in the 50 MW range best suited for the selected area.

4. To undertake the technical performance assessment and greenhouse gas emissions

analysis for a 50 MW wind power plant using the selected technology at the selected

area.

5. To do a financial analysis of building the selected 50 MW wind power plant.

METHODOLOGY

Literature will be reviewed in order to become acquainted with the relevant work that has been done in the field of wind power. The areas of interest will include wind flow velocities in the world and particularly in Ghana, the energy situation in the country, standard relationships between wind speed and the estimated power that can be generated per square metre, the relevance of wind power in the country and wind turbine design technologies. Sources of information will include the KNUST library, the internet, etc.

A prefeasibility study of a 50 MW wind power plant will be done using RETScreen with its in-built data and turbine specifications. The total initial cost will be determined, as well as the simple payback period. A greenhouse gas analysis will also be done.
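The simple payback period mentioned above is the total initial cost divided by the net annual income from the plant. The sketch below shows that arithmetic in its most basic form; RETScreen performs its own, more detailed calculation. The function names and every input value (capacity factor, tariff, operating cost, initial cost) are hypothetical placeholders, not figures from the synopsis.

```python
HOURS_PER_YEAR = 8760


def annual_energy_mwh(capacity_mw: float, capacity_factor: float) -> float:
    """Expected annual energy output in MWh for a plant of the given rated capacity."""
    if not 0.0 <= capacity_factor <= 1.0:
        raise ValueError("capacity_factor must be between 0 and 1")
    return capacity_mw * HOURS_PER_YEAR * capacity_factor


def simple_payback_years(initial_cost: float, annual_net_income: float) -> float:
    """Years needed for the net annual income to repay the initial cost (no discounting)."""
    if annual_net_income <= 0:
        raise ValueError("annual_net_income must be positive")
    return initial_cost / annual_net_income


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to make the example run.
    energy = annual_energy_mwh(capacity_mw=50, capacity_factor=0.25)
    tariff_per_mwh = 100.0          # currency units per MWh (assumed)
    annual_operating_cost = 2.0e6   # currency units per year (assumed)
    initial_cost = 100.0e6          # currency units (assumed)
    net_income = energy * tariff_per_mwh - annual_operating_cost
    print(f"Annual energy: {energy:.0f} MWh")
    print(f"Simple payback: {simple_payback_years(initial_cost, net_income):.1f} years")
```

Because the simple payback ignores the time value of money, it is usually read alongside the discounted measures produced by a full financial analysis.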

Areas of the country best suited for wind power development will be selected based on

the recommendations of Solar and Wind Energy Resource Assessment compiled by the Energy

Commission of Ghana in 2003 and more recent data to be collected from them. The help of the

Ministry of Energy will be sought to approach private companies who have also made their own

measurements for coastal areas with the view to acquiring their data sets to be included with

those of the Energy Commission.

Wind turbine design technologies, their technical performance characteristics and their costs will be collected from the manufacturers and reviewed, and those best suited to the country’s situation determined. The comparison criteria will include merits and demerits, technical considerations, applicability to the Ghanaian situation, etc. The technical assessment of the whole plant will be carried out with the Wind Atlas Analysis and Application Program (WAsP) designed by Risø National Laboratory.

The cost of building a 50 MW wind power plant in the areas of interest will again be determined using the Computer Model for Feasibility Analysis and Reporting (COMFAR) software package designed by UNIDO for feasibility studies.

WORK PLAN

TIMELINES FOR THE COMPLETION OF THESIS

2009

MONTHS: MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC (WEEKS 1 to 4 in each month)

Synopsis

Literature Review

Thesis Writing

Prefeasibility Study

Technical Assessment

Financial Analysis

Thesis wrap up

Submission of Draft Thesis

BUDGET FOR COMPLETION OF THESIS

EXPENSES                                     TOTAL COST (GH¢)
Stipend (GH¢ 400 per month for 10 months)    4,000
Printing of Draft Thesis                     150
Printing of Final Thesis                     150
Total                                        4,300
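A budget total is easy to verify by recomputing it from the line items. The short sketch below does this for the figures in the budget above; the dictionary and function names are hypothetical.

```python
# Line items from the thesis budget, in GH¢.
budget = {
    "Stipend (GH¢ 400 per month for 10 months)": 400 * 10,
    "Printing of Draft Thesis": 150,
    "Printing of Final Thesis": 150,
}


def budget_total(items: dict) -> int:
    """Sum of all line items in the budget."""
    return sum(items.values())


if __name__ == "__main__":
    print(f"Total: GH¢ {budget_total(budget):,}")
```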

REFERENCES

Akuffo F. O., Brew-Hammond A., Antonio J., Forson F., Edwin I. A., Sunnu A., Akwensivie F.,

Agbeko K. E., Ofori D. D., Appiah F. K. (2003). Solar and Wind Energy Resource Assessment

(SWERA). Department of Mechanical Engineering, KNUST.

Brew-Hammond A., Kemausuor F., Akuffo F. O., Akaba S., Braimah I., Edjekumhene I.,

Essandoh E., King R., Mensah-Kutin R., Momade F., Ofosu-Ahenkorah A. K., Sackey T. (2007).

Energy Crisis in Ghana: Drought, Technology or Policy? Kwame Nkrumah University of

Science and Technology, Kumasi, Ghana. ISBN: 9988-8377-2-0.

Energy Commission of Ghana (2003). Solar and Wind Energy Resource Assessment (SWERA).

Department of Mechanical Engineering, KNUST.

Energy Commission of Ghana (2006). Strategic National Energy Plan 2006 – 2020 and Ghana

Energy Policy. Main version.

Global Wind Energy Council, Greenpeace, Wind Power Works (2008). Global Wind Energy

Outlook 2008.

International Energy Agency (2006). World Energy Outlook. OECD/IEA, Paris.

Meishen Li, Xianguo Li (2005). Investigation of wind characteristics and assessment of wind

energy potential for Waterloo region, Canada. Department of Mechanical Engineering,

University of Waterloo, 200 University Avenue West, Waterloo, Ont., Canada, N2L 3G1.

REN21 (2008). Renewables 2007 Global Status Report. Paris: REN21 Secretariat and

Washington, DC: Worldwatch Institute.

Resource Center for Energy Economics and Regulation (2005). Guide to Electric Power in

Ghana – First Edition. Institute of Statistical, Social and Economic Research, University of

Ghana, Legon.

Wikipedia (2009). Wind Power. http://en.wikipedia.org/wiki/Wind_energy (accessed: 23 March 2009).

World Energy Council, United Nations Development Programme, United Nations Department of

Economic and Social Affairs (2000). World Energy Assessment: energy and the challenge of

sustainability. New York, NY 10017. ISBN: 92-1-126126-0.

APPENDIX A2

KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY

THESIS SYNOPSIS

NAME: FAISAL WAHIB ADAM

INDEX NO: PG3759209

PROGRAMME: M.Sc. MECHANICAL ENGINEERING

DEPARTMENT: MECHANICAL ENGINEERING

FACULTY: AGRICULTURAL AND MECHANICAL ENGINEERING

DURATION OF PROGRAMME: TWO YEARS (FULL TIME)

TITLE: A STUDY ON THE FAILURE OF THE FEMORAL SHAFT IMPLANTS (A CASE STUDY OF THE KOMFO ANOKYE TEACHING HOSPITAL, KATH)

SUPERVISORS: DR. JOSHUA AMPOFO (MECHANICAL ENGINEERING, KNUST)

DR. R. KUMA AMATEPEY (TRAUMA AND ORTHOPEDIC DEPARTMENT, KATH)

SIGNATURES:

………………….. FAISAL WAHIB ADAM (Student)

………………………………. DR. A. K. SUNNU (Head of Department)

……………......................... DR. JOSHUA AMPOFO (Supervisor)

BACKGROUND

The femur, or thigh bone, is the strongest, longest and heaviest bone in the body and is essential for normal ambulation. It consists of three parts: the femoral shaft or diaphysis, the proximal metaphysis and the distal metaphysis (Douglas et al., 2008).

Figure 1 (Wikipedia, 2010)

A femoral shaft fracture is a severe injury that generally occurs in high-speed motor vehicle collisions and significant falls. It is often one of several major injuries experienced by a patient (Jonathan, 2005). This type of fracture, like many other bone fractures, has become more common in Ghana due to the exponential increase in the number of motor vehicle accidents.

The occurrence of femoral shaft fractures in the United States is reported to follow a bimodal distribution, with peaks at 25 and 65 years of age and an overall incidence of approximately 1 per 10,000 people per year. Motor vehicle accidents are the most common cause, followed by pedestrian versus automobile accidents, falls from height and gunshot injuries (Jesse, 2008). A similar study by Hinton et al. (2000) revealed that the rate of femoral shaft fractures in children in Maryland was 19.5 per 100,000 per year, the same as the overall incidence in Finland. The most commonly occurring fracture in children aged 6 to 9 years was caused when they were struck by cars. Once children reached driving age, the most frequent cause was a motor vehicle accident. This variation gave rise to a bimodal distribution with peaks at 2 and 17 years.

In the Department of Orthopaedic Surgery and Traumatology, Obafemi Awolowo University Teaching Hospital, Ile-Ife, Osun State, Nigeria, a study of fractures reported that the bones involved were the humerus (10%), the femoral shaft (65%) and the tibia (25%) (Innocent et al., 2006).

Nowadays femoral shaft fractures in adults are usually treated operatively. As more and more femoral shaft fractures are operated on, the number of complications has increased proportionately. One such complication is implant failure. An implant is said to have failed if it is found to be inadequate in performing the function expected of it.

The study of the causes of this failure for engineering purposes requires the quantification of many factors. The surgeon is aware of most of these factors but cannot assess the requirements of a particular situation quantitatively, as an engineer does. This is why an engineering analysis needs to be done to find these causes.

JUSTIFICATION

A discussion with a number of orthopedic doctors and nurses at the orthopedic department of KATH, Kumasi, Ghana, has revealed an alarming rate of femoral shaft implant failures. This calls for an objective assessment of the exact circumstances that lead to implant failure, since it is necessary to prevent this complication in one of the major weight-bearing bones of the body.

Failure of an implant in the human body needs to be avoided completely because of the devastating complications that it can bring. For instance, a bend in the implant gradually removes the thin film of oxide on its surface and hastens the corrosive process. If the implant is not removed, the metal sheds continually, so that the surrounding soft tissue slowly becomes saturated with metal particles, which may lead to aseptic inflammation many years after implantation (Charles et al., 1959). Another complication is shortening of the femur, which leaves the patient with torsion on the pelvic girdle.

The causes of implant failure are complex to examine because they involve the engineer (designer), the surgeon, operating-room personnel and the patient. All of these people can contribute to the failure, as well as to the success, of the implant. From the standpoint of Mechanical Engineering, every device has points of weakness at which it will fail when the margin of safety is exceeded. It is the designer's responsibility to provide an adequate minimum margin, and it is the surgeon's responsibility not to exceed that margin (Cohen, 1964).

A lot of work has been done on the failure of femoral shaft implants in many countries, but to my knowledge the causes of the failure of femoral shaft implants in operative orthopaedic practice have not been reported for the Komfo Anokye Teaching Hospital, Ghana. Against this background, this work will study the causes of failure of femoral shaft implants from the Mechanical Engineering point of view. The mechanical properties of the implant will be tested to obtain the allowable stress, which will be compared with the stresses acting on the implant, so as to suggest guidelines to minimise further failures.

OBJECTIVES

The objective of this work is to find the causes of failure of the femoral shaft implants at the Komfo Anokye Teaching Hospital (KATH).

The specific objectives are:

To find the mechanical properties of the femoral shaft plate implant that is used:

- the tensile strength

- the modulus of elasticity

To find out the material composition of this implant

To find out the possible forces that could act on the bone and plate assembly

To find the possible causes of failure so as to suggest guidelines to minimise further failures

METHODOLOGY

Two cases of healed femoral shaft implants and three cases of failed ones that presented at the Department of Orthopaedics, KATH, Kumasi, Ghana, will be studied under the following headings:

Age

Sex

Body weight

Nature of primary injury

Anatomical site of the fracture

Type of primary fixation

Weight bearing

An X-ray of the fracture site will be taken, and the removed implant will be collected. The implant will be taken to the mechanical engineering laboratory for a tensile test. The X-ray will aid the computer modelling, in which the ANSYS software will be used to do a progressive failure analysis and predict the forces that could have caused that kind of failure.

Exclusion criteria

Infected implant failures and implant failures in pathological fractures of the femur will be excluded.
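The comparison between allowable stress and acting stress that this synopsis proposes can be sketched numerically. The example below is a minimal illustration, assuming that tensile strength is taken as maximum load divided by original cross-sectional area and that the allowable stress is the strength divided by a chosen safety factor. The function names, the safety factor and all numbers are hypothetical and do not come from the study.

```python
def tensile_strength_mpa(max_load_n: float, area_mm2: float) -> float:
    """Tensile strength in MPa: maximum load (N) over original cross-section (mm^2)."""
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return max_load_n / area_mm2  # N/mm^2 is numerically equal to MPa


def modulus_of_elasticity_mpa(stress_mpa: float, strain: float) -> float:
    """Modulus of elasticity in MPa from a stress and strain pair in the elastic range."""
    if strain <= 0:
        raise ValueError("strain must be positive")
    return stress_mpa / strain


def allowable_stress_mpa(strength_mpa: float, safety_factor: float) -> float:
    """Allowable stress in MPa: material strength reduced by a safety factor."""
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    return strength_mpa / safety_factor


def is_overstressed(acting_stress_mpa: float, allowable_mpa: float) -> bool:
    """True when the acting stress exceeds the allowable stress."""
    return acting_stress_mpa > allowable_mpa


if __name__ == "__main__":
    # Hypothetical tensile-test reading and model output.
    strength = tensile_strength_mpa(max_load_n=48000.0, area_mm2=60.0)
    allowable = allowable_stress_mpa(strength, safety_factor=2.0)
    acting = 450.0  # MPa, for example a peak stress from a finite element model
    print(f"Strength {strength:.0f} MPa, allowable {allowable:.0f} MPa, "
          f"overstressed: {is_overstressed(acting, allowable)}")
```

Whether the strength used is the yield or the ultimate value, and what safety factor is appropriate for an implant, are design decisions that the sketch leaves open.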

FACILITIES AVAILABLE

KNUST Library – Kumasi

The Universal Testing Machine (Mech. Dept., KNUST)

A and E Theatre (KATH) – Kumasi

World Wide Web (Internet)

The Metallurgy Laboratory (Mech. Dept., KNUST)

ANSYS Software

REFERENCES

1. Jonathan Cohen. Failure in Performance of Surgical Implants. Journal of Bone and Joint Surgery. http://www.jbjs.org (accessed 2010 February 6).

2. Charles Orville Bechtol, Albert Barnett Ferguson, Patrick Gowans Laing (1959). Metal and Engineering in Bone and Joint Surgery.

3. Alfred O. Ogbemudia, Phillip F. A. Umebee (2006). Implant Failure in Osteosynthesis of Fractures of Long Bones. Journal of Medicine and Biomedical Research (College of Medical Sciences, University of Benin, Nigeria).

4. C. R. F. Azevedo, E. Hippert Jr. (2002). Failure Analysis of Surgical Implants in Brazil.

5. Jesse T. Torbert. Femoral Shaft Fractures. http://www.orthopaedic.com (accessed 2010 March 11).

6. Douglas F. Aukerman, John R. Deitch, Janos P. Ertl, William Ertl (2008). Femur Injuries and Fractures.

7. Richard Y. Hinton, Andrew Lincoln, Gordon Smith. Fracture of the Femoral Shaft in Children. Incidence, Mechanism and Sociodemographic Risk Factors. Journal of Bone and Joint Surgery. http://www.ejbjs.org (accessed 2010 March 18).

8. Rozbruch, Roberts S.; Müller, Urs; Gautier, Emanuel; Gans, Reinhold. The evolution of femoral shaft plating technique. http://journals.lww.com (accessed 2010 March 4).

9. Feres S. Haddad, Clive P. Duncan, Daniel J. Berry, David G. Lewallen, Allan E. Gross, Hugh P. Chandler. Periprosthetic Femoral Fractures Around Well-fixed Implants: Use of Cortical Onlay Allografts with or without a Plate. Journal of Bone and Joint Surgery. http://www.ejbjs.org (accessed 2010 March 4).

10. R. J. Brumback, S. Uwagie-Ero, R. P. Lakatos, A. Poka, G. H. Bathon and A. R. Burgess (1988). Intramedullary Nailing of Femoral Shaft Fractures. Part II: Fracture Healing with Static Interlocking Fixation. Journal of Bone and Joint Surgery.

11. R. W. Bucholz, S. E. Ross and K. L. Lawrence (1987). Fatigue Fracture of the Interlocking Nail in the Treatment of Fractures of the Distal Part of the Femoral Shaft. Journal of Bone and Joint Surgery.

DETAIL BUDGET FOR COMPLETION OF THESIS

EXPENSES                          Unit Cost (GH¢)   Period (months)   Total Cost (GH¢)
Visits to the hospital (KATH)     20                5                 100
Printing and Binding of Thesis    200                                 200
Stipends                          400               8                 3 200
Books                             500                                 500
Miscellaneous                     100                                 100
Total                                                                 3 800

WORKPLAN FOR COMPLETION OF PROJECT

YEAR 2010

MONTHS: MARCH, APRIL, MAY, JUNE, JULY, AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER (WEEKS 1 to 4 in each month)

Literature review, Synopsis Writing, Sponsorship

Taking samples from hospital

Design of experiment

Chapters one and two

Testing of samples

Computer modeling

Chapter three

Analysis of results

Chapters four and five

Submission of draft thesis

APPENDIX B

/*#########################################################################*/

/* DO-FILES WRITTEN BY: FAISAL WAHIB ADAM */

/* MECHANICAL ENGINEERING DEPARTMENT */

/* KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY */

/*#########################################################################*/

*DATE: [05-03-2011]

*****************************************************************************

// FIRST RESULT (2 tables)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3a.dta"
log using result1, replace text
describe
// Cross-tabulate s7dq13 against category1: first as counts, then as column percentages
tabulate s7dq13 category1
tabulate s7dq13 category1, column nofreq
log close
exit

*****************************************************************************

// SECOND RESULT (1 graph)
use "C:/Documents and Settings/Administrator/Desktop/Stata10.0/Faisal/finalgraphV1.dta"
log using result2, replace
describe
// Stacked percentage bars of s7dq13 within each category1 group;
// each /// continues the graph command onto the next line
graph bar (count) hhid, over(s7dq13) over(category1) asyvars percentages stack ///
    title("DISTRIBUTION OF COOKING FUELS") ///
    subtitle("FOR THE VARIOUS INCOME QUINTILES IN GHANA") ///
    ytitle("Percentage of households") ///
    note("Source: Fifth Ghana Living Standards Survey") ///
    legend(position(3) cols(1) order(8 7 6 5 4 3 2 1))
log close
exit

*****************************************************************************

// THIRD RESULT (1 table)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3b.dta"
log using result3, replace text
describe
// Tabulate region for the households whose s7dq13 value is 7
tab region if s7dq13==7
log close
exit

*****************************************************************************

// FOURTH RESULT (1 graph)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3b.dta"
log using result4, replace
// Shorten the value labels so that they fit under the bars
label define category1 1 "1", modify
label define category1 2 "2", modify
label define category1 3 "3", modify
label define category1 4 "4", modify
label define category1 5 "5", modify
label define region 1 "UE", modify
label define region 2 "Nortn", modify
label define region 3 "UW", modify
label define region 4 "BA", modify
label define region 5 "Volta", modify
label define region 6 "Centl", modify
label define region 7 "Eastn", modify
// Codes 8 to 10 are an assumption: ten regions numbered consecutively
label define region 8 "Westn", modify
label define region 9 "Ashti", modify
label define region 10 "Accra", modify
describe
graph bar (count) hhid, over(s7dq13) over(category1) over(region) asyvars percentages stack ///
    title("DISTRIBUTION OF COOKING FUELS") ///
    subtitle("FOR THE VARIOUS INCOME QUINTILES AND REGIONS IN GHANA") ///
    ytitle("Percentage of households") ///
    note("Source: Fifth Ghana Living Standards Survey") ///
    legend(position(3) cols(1) order(8 7 6 5 4 3 2 1))
log close
exit

*****************************************************************************

Selected Answers to Unit Assignments

[Supply selected answers to Unit Assignments]

Course Quiz/Exams

[Supply course quiz of this course here for the attention of the Institute’s examinations officer]

RESEARCH/PROJECT AREAS AND RELATED TOPICS

IN THIS COURSE

[Supply research/project areas and related topics in this course for use by students]

SOME CASE STUDIES IN THIS COURSE

MY PAGE

Name: _______________________________________ Learning Centre: _________________

Contact: Tel. ____________ Email: ________________ Emergency Name/Phone: __________

Important numbers: Student number ______________ Examination number ________________

Program: ___________ Year: _______ Course code/title: _______________________________

Course objectives: ______________________________________________________________

_____________________________________________________________________________

Course dates/Semester No ( ): Starts_____________________ Ends ____________________

FFFS schedule/Dates: ___________________________________________________________

Quiz dates: ____________________________________________________________________

Assignments hand in dates: _______________________________________________________

Revision dates: _________________________________________________________________

Group discussion/work members (names and contacts): _________________________________

______________________________________________________________________________

______________________________________________________________________________

End of course Self Evaluation:

I have completed all Units & interactive sessions , mastered all learning objectives, completed

all self Assessments, unit summary, key words and terms, discussion questions, review

questions, reading activity, web activity, unit assignments, and submitted all CA scoring

assignments, learner feedback on this course and submitted my comments and course focus

contributory questions to facilitator for discussion.

Self-grading: self assessment questions score ______ % Unit Assignments scored ______ %

My course conclusion remarks: ____________________________________________________

______________________________________________________________________________

______________________________________________________________________________

____________________________________________________ (may continue on reverse side)

= = = = = = = = = = Detach and return to IDL, KNUST = = = = = = = = = =

Learner Feedback Form/[insert course code]

Dear Learner,

While studying the units in the course, you may have found certain portions of the text

difficult to comprehend. We wish to know your difficulties and suggestions, in order

to improve the course. Therefore, we request you to fill out and send the following

questionnaire, which pertains to this course. If you find the space provided

insufficient, kindly use a separate sheet.

1. How many hours did you need for studying the units?

Unit no.        1     2     3     4     5     6
No. of hours

2. Please give your reactions to the following items based on your reading of the

course

Items                                          Excellent   Very good   Good   Poor   Give specific examples, if poor
Presentation quality
Language and style
Illustrations used (diagrams, tables, etc.)
Conceptual clarity
Self assessment
Feedback to SA

3. Any other comments (may continue on reverse side)

Unit 1: _______________________________________________________________

Unit 2: _______________________________________________________________

Unit 3: _______________________________________________________________

Unit 4: _______________________________________________________________

Unit 5: _______________________________________________________________

Unit 6: _______________________________________________________________
