The Use of Text Mining and Data Visualization to Assist in Managing a Scientific Grants Portfolio


Page 1

The Use of Text Mining and Data Visualization to Assist in Managing a Scientific Grants Portfolio

Elizabeth Ruben, Jerry Phelps, Kristianna Pettibone, and Christina H. Drew

Program Analysis Branch

Division of Extramural Research and Training

National Institute of Environmental Health Sciences

November 4, 2011

Page 2

NIEHS Mission

Reduce the burden of human illness and disability by understanding how the environment influences the development and progression of human disease.

Page 3

Purpose

To investigate the use of the text mining/data visualization tool OmniViz™ as a way to:

• Help us understand patterns in our portfolio that could inform the management of science in a new way.

• Visualize the assignment of grants to program officers.

• Explore emerging areas of science.

• Identify gaps in research.

Page 4

What is OmniViz?

Software designed to find and display trends in large amounts of data.

Designed specifically for the biomedical, healthcare, and pharmaceutical industries.

Page 5

The Process:

1. Obtain our active grant portfolio data.

2. Limit our data set by grant type and program to focus on our Research Grant Program portfolio.

3. Import data into OmniViz.

4. Select text mining algorithm.

5. Identify stop words (words for the text mining algorithm to ignore).
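The five steps above can be sketched in a few lines of Python. The slides do not describe the algorithms OmniViz uses, so this sketch substitutes a generic TF-IDF weighting with cosine similarity. The field names (`grant_type`, `abstract`), the sample records, the `R01` filter, and the stop-word list are all hypothetical.

```python
import math
import re
from collections import Counter

# Step 1: hypothetical records standing in for an active grant portfolio.
GRANTS = [
    {"id": "G1", "grant_type": "R01", "abstract": "DNA repair pathways after oxidative damage"},
    {"id": "G2", "grant_type": "R01", "abstract": "Oxidative DNA damage and repair in lung cells"},
    {"id": "G3", "grant_type": "R01", "abstract": "Childhood asthma and air pollution exposure"},
    {"id": "G4", "grant_type": "T32", "abstract": "Training program in environmental toxicology"},
]

# Step 5: hypothetical stop words to drop before mining.
STOP_WORDS = {"and", "in", "the", "of", "after", "a"}


def tokenize(text, stop_words):
    """Lowercase, split on non-letters, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stop_words]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either has zero length)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Step 2: keep only the grant types of interest (here, hypothetically, R01).
research = [g for g in GRANTS if g["grant_type"] == "R01"]

# Steps 3-5: tokenize without stop words, then weight the remaining terms.
vectors = tfidf_vectors([tokenize(g["abstract"], STOP_WORDS) for g in research])

sim_12 = cosine(vectors[0], vectors[1])  # two DNA repair abstracts
sim_13 = cosine(vectors[0], vectors[2])  # DNA repair vs. asthma
print(f"G1~G2: {sim_12:.2f}  G1~G3: {sim_13:.2f}")
```

Abstracts that share vocabulary score higher than those that do not, which is the kind of signal a clustering or galaxy-style layout would then group on.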

Page 6

Question 1: Can OmniViz help us understand patterns in a portfolio that could inform the management of science in a new way?

Page 7

Galaxy: DERT Active Research Grants

Legend: . = One grant; (symbol not preserved) = Cluster of grants

Note: Labels are created by NIEHS; not the OmniViz default.

Page 8

Galaxy: DERT Active Research Grants

Legend: . = One grant; (symbol not preserved) = Cluster of grants

Note: Labels are created by NIEHS; not the OmniViz default.

Page 9

Galaxy: DERT Active Research Grants

Legend: . = One grant; (symbol not preserved) = Cluster of grants

Note: Labels are created by NIEHS; not the OmniViz default.

Page 10

Galaxy: DERT Active Research Grants

Legend: . = One grant; (symbol not preserved) = Cluster of grants

Note: Labels are created by NIEHS; not the OmniViz default.

Page 11

Galaxy: DERT Active Research Grants

Legend: . = One grant; (symbol not preserved) = Cluster of grants

Note: Labels are created by NIEHS; not the OmniViz default.

Page 12

Initial View of Grant Clusters

DERT Active Research Project Grant Portfolio

Cluster labels: Basic Science, Human Studies, Transitional, Training/Education

Note: Labels are created by NIEHS; not the OmniViz default.

Page 13

DNA Repair Grants

Program Officer      Number of DNA Repair Grants
Program Officer 1    39
Program Officer 2    2
Program Officer 3    1
Program Officer 4    1
Grand Total          43
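
How strongly this cluster is concentrated under one program officer is easy to quantify. A minimal sketch, using the counts from the table above; the variable names are illustrative only.

```python
from collections import Counter

# Counts copied from the DNA Repair Grants table; labels are the slide's anonymized names.
dna_repair_grants = Counter({
    "Program Officer 1": 39,
    "Program Officer 2": 2,
    "Program Officer 3": 1,
    "Program Officer 4": 1,
})

total = sum(dna_repair_grants.values())
top_officer, top_count = dna_repair_grants.most_common(1)[0]
top_share = top_count / total  # fraction of the cluster held by the top officer

print(f"{top_officer} holds {top_count} of {total} grants ({top_share:.0%})")
# prints "Program Officer 1 holds 39 of 43 grants (91%)"
```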

Page 14

Question 2: Understand Program Administrator Workload Distribution

• Examples of individuals across galaxy visualization

• Similar/Different

• Branch Distribution

Page 15

Portfolio Distribution Across Program Officers

Legend: Program Officer 1, Program Officer 2

Page 16

Portfolio Distribution Across Program Officers

Legend: Program Officer 1, Program Officer 3

Page 17

Portfolio Distribution Across Branches

Legend: Branch A, Branch B, Branch C

Page 18

Galaxy: DERT Active Research Grants

Cluster labels: Basic Science, Human Studies, Transitional, Training/Education

Note: Labels are created by NIEHS; not the OmniViz default.

Page 19

Program Officers by Category of Science

Columns: Program Officer; Human Studies; Transitional; Basic Science; Training and Education; Number of Categories

KG X       1
LO  X       1
LC     X   1
LR     X   1
FT     X   1
DC     X   1
CT     X   1
CD X X     2
DB  X   X   2
DS   X  X   2
JH   X X 2
LM   X X   2
SN   X X   2
CL   X X 2
KM   X X   2
MH     X X 2
AK  X X X 3
CS X X X X 4
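
A matrix like this one can be derived from per-grant assignments by collecting the set of categories each officer covers. The sketch below is one way to do that; the officer codes (`PO-A`, `PO-B`, `PO-C`) and the assignment list are hypothetical and are not the slide's data.

```python
from collections import defaultdict

CATEGORIES = ["Human Studies", "Transitional", "Basic Science", "Training and Education"]

# Hypothetical (officer, category) pairs, one per grant.
assignments = [
    ("PO-A", "Basic Science"),
    ("PO-A", "Basic Science"),
    ("PO-B", "Human Studies"),
    ("PO-B", "Transitional"),
    ("PO-C", "Human Studies"),
    ("PO-C", "Transitional"),
    ("PO-C", "Basic Science"),
    ("PO-C", "Training and Education"),
]

# Collect the distinct categories covered by each officer.
covered = defaultdict(set)
for officer, category in assignments:
    covered[officer].add(category)


def matrix_row(officer, categories_covered):
    """One text row: officer code, an X or - per category, then the category count."""
    marks = ["X" if c in categories_covered else "-" for c in CATEGORIES]
    return f"{officer:<6}" + " ".join(f"{m:^3}" for m in marks) + f" {len(categories_covered)}"


# Sort by number of categories, then by officer code.
ordered = sorted(covered.items(), key=lambda kv: (len(kv[1]), kv[0]))
rows = [matrix_row(officer, cats) for officer, cats in ordered]
print("\n".join(rows))
```

Officers whose grants span several categories rise to the bottom of the listing, which makes broad portfolios easy to spot.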

Page 20

Pros and Cons of Using This Tool

Pro

• Big picture view of our portfolio

• Output

• Novel way of doing pattern analysis

• Ability to identify outliers

• Cool factor is very high

Con

• Cost of software ($1,000 annually for education and federal government)

• Steep learning curve

• Transferring output to PowerPoint is challenging

• Difficult to interpret

Page 21

What questions could this method of analysis answer for you?

• Strategic planning

• Emerging areas of science

• Gaps in research

• Institute/Center niches/across all Institutes/Centers

Page 22

Contact Information

• Elizabeth Ruben: [email protected]