The KnowledgeMap Project: Development of a Concept-Based Medical School Curriculum Database
Joshua C. Denny, MD
Plomarz R. Irani
Firas H. Wehbe, MD
Jeffrey D. Smithers, MD
Anderson Spickard, III, MD, MS
Setting
• Vanderbilt School of Medicine
  – 104 medical students in each class
  – 4 local hospitals
• No electronic repository or course schedule
Goals for KM
• Provide a set of tools to help improve the curriculum and students’ access to it
• Accommodate a variety of presentation styles
• Automate document conversion
• Provide a secure repository of documents that protects intellectual property
KM Structure
• Web application
  – Apache web server, MySQL database
  – Written in Perl, Visual C++, and Visual Basic
• Multiple servers
• All documents mapped to UMLS concepts via KM Concept Identifier
Document Corpus
• Manually converted 2001-2002 preclinical lecture handouts
  – “legacy” documents
• New handouts/presentations uploaded by faculty
Pilot
• Anatomy (Fall) and Cell Biology (Spring) for 2002-2003
  – 4th year elective later came online
KM Concept Identifier
• Uses NLP techniques
  – Abbreviation and acronym extraction
  – Semantic regularization
• Score-based
  – Derivational forms (stenosis → stenotic, lungs → pulmonary)
  – Document-based disambiguation
    • Word and concept clustering
• Performs favorably compared with MetaMap on educational documents (82% recall, 89% precision)
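Recall and precision figures like those above are usually computed by comparing the concepts a tool identifies against a manually reviewed gold standard. The sketch below shows that calculation only. The concept IDs are made-up placeholders, and it does not reproduce the evaluation behind the reported 82% and 89%.

```python
def precision_recall(identified, gold):
    """Compare identified concept IDs against a gold-standard set.

    precision = correct identified / all identified
    recall    = correct identified / all gold-standard concepts
    """
    identified, gold = set(identified), set(gold)
    true_positives = len(identified & gold)
    precision = true_positives / len(identified) if identified else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Placeholder IDs, not real UMLS concepts.
gold_standard = ["X1", "X2", "X3", "X4"]
tool_output = ["X1", "X2", "X3", "X9"]
print(precision_recall(tool_output, gold_standard))
```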
Document Processing
• Document uploaded by lecturer, placed in queue
• Document Conversion Server pulls next document off queue, converts to HTML and text
• HTML & PDF versions → Apache web server
• Text version placed in queue → KM Concept Identifier
• Identified concepts indexed for searching
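The queue-based flow above can be sketched in a few lines. This is an illustration under stated assumptions, not KM's code (KM was written in Perl, Visual C++, and Visual Basic). The function names, the in-memory queues, and the one-entry phrase table are all hypothetical. Only the concept ID C0019202 comes from the slides.

```python
from collections import deque

# Hypothetical phrase table standing in for the KM Concept Identifier.
PHRASE_TO_CUI = {"hepatolenticular degeneration": "C0019202"}


def convert(raw):
    """Stand-in for the Document Conversion Server: produce HTML and plain text."""
    text = raw.strip()
    return {"html": f"<p>{text}</p>", "text": text}


def identify_concepts(text):
    """Return the concept IDs whose phrase appears in the text."""
    lowered = text.lower()
    return sorted({cui for phrase, cui in PHRASE_TO_CUI.items() if phrase in lowered})


def process(upload_queue):
    """Drain the upload queue, then the text queue, and build a concept index."""
    text_queue = deque()
    served = {}  # doc_id -> HTML handed to the web server
    index = {}   # concept ID -> set of doc_ids
    while upload_queue:
        doc_id, raw = upload_queue.popleft()
        converted = convert(raw)
        served[doc_id] = converted["html"]
        text_queue.append((doc_id, converted["text"]))
    while text_queue:
        doc_id, text = text_queue.popleft()
        for cui in identify_concepts(text):
            index.setdefault(cui, set()).add(doc_id)
    return served, index


uploads = deque([("lec1", " Hepatolenticular degeneration overview ")])
served, index = process(uploads)
print(index)
```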
Search Processing
• User enters a search query, example: “Wilson’s disease”
• Search Concept Identifier: C0019202 -- “Hepatolenticular Degeneration”
• C0019202 found in index of curriculum documents (MySQL database)
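A minimal sketch of the lookup above: the query is mapped to a concept ID, and that ID is looked up in the document index. The two dictionaries are hypothetical stand-ins for the concept identifier and the MySQL index, and the document name is invented.

```python
# Hypothetical synonym table; the slides state only that "Wilson's disease"
# resolves to C0019202, "Hepatolenticular Degeneration".
SYNONYM_TO_CUI = {
    "wilson's disease": "C0019202",
    "hepatolenticular degeneration": "C0019202",
}

# Hypothetical index: concept ID -> documents mentioning it.
CONCEPT_INDEX = {"C0019202": ["hypothetical-lecture-12"]}


def normalize(query):
    """Lowercase, straighten curly apostrophes, and collapse whitespace."""
    return " ".join(query.lower().replace("’", "'").split())


def search(query):
    """Return (concept ID, matching documents), or (None, []) if unmapped."""
    cui = SYNONYM_TO_CUI.get(normalize(query))
    if cui is None:
        return None, []
    return cui, CONCEPT_INDEX.get(cui, [])


print(search("Wilson’s disease"))
```

Because documents and queries both resolve to the same concept ID, a search for “Wilson’s disease” can find a handout that only says “hepatolenticular degeneration”.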
Content Coverage Query
• Created to answer questions such as “Where is Women’s Health taught?”
  – “metaconcepts”
• Uses relationships defined in the UMLS to expand queries with related child and child-like concepts
Content Coverage Query
• User enters a “metaconcept” query: “Women’s Health”
• Search Concept Identifier: C0080339 -- “Women’s Health”
• KM finds related UMLS concepts (MySQL database)
• These concepts found in index of curriculum documents
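Query expansion of this kind can be pictured as a walk over parent-to-child relationships followed by an index lookup on every concept reached. In the sketch below only C0080339 comes from the slides. The child IDs, the relationship table, and the document names are placeholders, and the real UMLS relationship types KM uses are not specified here.

```python
from collections import deque

# Hypothetical parent -> children table (placeholder IDs, not real UMLS concepts).
CHILDREN = {
    "C0080339": ["X_CHILD_1", "X_CHILD_2"],
    "X_CHILD_1": ["X_GRANDCHILD_1"],
}

# Hypothetical index: concept ID -> documents mentioning it.
CONCEPT_INDEX = {
    "C0080339": {"doc-a"},
    "X_CHILD_2": {"doc-b"},
    "X_GRANDCHILD_1": {"doc-a", "doc-c"},
}


def expand(cui):
    """Breadth-first walk collecting the concept and all of its descendants."""
    seen = {cui}
    queue = deque([cui])
    while queue:
        current = queue.popleft()
        for child in CHILDREN.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def coverage(cui):
    """Documents that mention the metaconcept or any concept it expands to."""
    docs = set()
    for concept in expand(cui):
        docs |= CONCEPT_INDEX.get(concept, set())
    return sorted(docs)


print(coverage("C0080339"))
```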
Other features
• Relevant PubMed searches
  – Based on the document title and the most frequent MeSH concepts in the document
• Definition searching
  – Based on UMLS SRDEF file and MedlinePLUS
• Course management
• Lecture calendar
  – Organized by semester and student year
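One way a PubMed search could be derived from a document's title and its most frequent MeSH concepts is sketched below. How KM actually combines the two is not stated in the slides, so the AND/OR structure, the `top_n` cutoff, and the function name are assumptions. The `[MeSH Terms]` field tag is standard PubMed query syntax.

```python
from collections import Counter


def pubmed_query(title, mesh_terms, top_n=2):
    """Build a PubMed query string from a title and a list of MeSH term mentions.

    mesh_terms holds one entry per mention, so frequent terms repeat.
    """
    top = [term for term, _ in Counter(mesh_terms).most_common(top_n)]
    mesh_part = " OR ".join(f'"{term}"[MeSH Terms]' for term in top)
    if not mesh_part:
        return title
    return f"({title}) AND ({mesh_part})"


mentions = ["Copper", "Liver", "Copper", "Liver", "Copper", "Kidney"]
print(pubmed_query("Copper metabolism", mentions))
```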
Analysis
• Primary data source was log files:
  – All events in Apache
  – Key events in KM, including:
    • Logins/logoffs
    • Searches
    • Documents viewed (by browse or by search)
    • PubMed searches
    • Content Coverage queries (available only to Course Directors and Administrators)
  – Removed all events generated by a developer or researcher
• Downtime measured by a separate server that logged any time the system (or a component) was unavailable
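The filtering step above amounts to dropping events from excluded accounts before counting by event type. The sketch assumes a simplified log of (user, event type) pairs. The account names and event labels are hypothetical, and the real KM log format is not described in the slides.

```python
from collections import Counter

# Hypothetical developer/researcher accounts to exclude from the analysis.
EXCLUDED_USERS = {"dev1", "researcher1"}


def count_events(events, excluded=EXCLUDED_USERS):
    """Count events by type, skipping any generated by an excluded user."""
    return Counter(kind for user, kind in events if user not in excluded)


log = [
    ("student1", "login"),
    ("student1", "search"),
    ("dev1", "search"),
    ("student2", "login"),
    ("researcher1", "document_view"),
]
print(count_events(log))
```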
Calculations
• Browsed document: any document accessed via a course home page or via the “browse” function on the toolbar
• Searched document: any document accessed via a search

\[
\text{Search/browse ratio} = \frac{\#\ \text{documents searched}}{\#\ \text{documents browsed}}
\]
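As a worked example of the ratio, the sketch below takes one label per document view, either "search" or "browse", and divides the counts. The input format is an assumption made for illustration.

```python
def search_browse_ratio(views):
    """Ratio of documents reached via search to documents reached via browse.

    views is a list of access labels, each either "search" or "browse".
    Returns None when there are no browsed documents (ratio undefined).
    """
    searched = sum(1 for source in views if source == "search")
    browsed = sum(1 for source in views if source == "browse")
    if browsed == 0:
        return None
    return searched / browsed


# One searched document for every four browsed documents.
print(search_browse_ratio(["browse", "browse", "search", "browse", "browse"]))  # 0.25
```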
Results
• 3271 searches
  – 84% completed with a concept search
  – 15% definition searches (since 5/03)
• Total of 526 users logged in 15,885 times and viewed 1,143 documents a total of 32,113 times
  – All members of the first, second, and third year classes have logged on
Results
• 1264 active documents (1489 total)
  – 722 uploaded by 28 faculty members
  – 135 documents uploaded by authors
  – 407 legacy documents
• Total downtime was about 20 hours, including scheduled downtime
  – No true downtime since 1/03
[Figure: # documents by month, Sep-02 through Oct-03 (y-axis 0 to 8000); series: 2003, 2004, 2005, 2006, 2007, Total; annotation: start of new academic year]
[Figure: Logins/person (axis 0 to 60) and search/browse ratio (axis 0 to 0.9) by user group: VMS I, VMS II, VMS III, VMS IV, Faculty, Course directors. * p < 0.01 vs VMS I-II; ** p < 0.0001 vs all]
[Figure: Percentage of classes using KM (y-axis 0% to 45%) by semesters after introduction: 1 (pilot), 2, 3]
Conclusions
• KM is being adopted by more classes
  – Predominant class and student use is still in first year courses
• Students are using KM more
  – Student use precedes classes coming online
  – All students with courses online have used KM
  – Initial reactions seem positive
  – Heaviest use by 1st and 2nd year students
Future Directions
• Automation of content coverage queries
• Expand to more courses and another site
• More types of media
• Expansion of search algorithms to include a spell checker
• Support for a PocketPC/Palm-compatible site
• Student tracking
Acknowledgements
• Randy Miller, M.D.
• Michel Décary of Cogilex R & D, Inc
• Dean’s Office
• Art Dalley, Ph.D.
• Cathleen Pettepher, Ph.D.
For more information
• http://knowledgemap.mc.vanderbilt.edu/research
• [email protected]