Archival Stewardship of Email using ePADD Software
Glynn Edwards, Director, ePADD Project
SAA, August 22, 2015
Developed and funded by:
Lifecycle Tools for Archival Email Stewardship
Lifecycle phases: Preservation; Access; Accessioning; Archival Processing
Functions compared for each program: Collection Development; Pre-Acquisition Appraisal; Capture; Normalization; Item-level processing; Bulk processing; Intellectual Arrangement; Search Capability; Personal/Sensitive Information Processing; Packaging; Repository; Online Discovery; Access
Legend: Full support; Not Supported; Unknown
Text annotations by program:
• CERP Parser: Email message; Email message
• DArcMail: Email message; Email message; Fielded
• EMCAP: Server version; Email message; Email message; Server version only
• Archivematica: Message + attachments; Message + attachments
• PeDALS: Email message; Email message; Other: not declared
• ePADD: Message + attachments; Message + attachments; NLP, fielded, full-text, lexicon; Identification (Reg. Ex.)
• EAS: Message + attachments; Message + attachments; fielded, full-text; Identification (Reg. Ex.)
• eMailchemy
• MailStore Server: Message + attachments; Message + attachments; Full-text
• AccessData FTK: Message + attachments; Message + attachments; Full-text; Identification (Reg. Ex.)
• ZL Unified Archive: Message + attachments; Message + attachments; Full-text
• Preservica Standard: Message + attachments; Message + attachments; Other: not declared
• Paraben Email Examiner: Message + attachments; Message + attachments; Other: not declared
• Aid4Mail Professional: Other: not declared
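Several programs in the comparison, ePADD among them, list "Identification (Reg. Ex.)" under personal/sensitive information processing. As a rough illustration of what regular-expression identification can look like, here is a minimal Python sketch. The two patterns and the function name `flag_sensitive` are assumptions made for this example; they are not taken from ePADD or any other tool listed.

```python
import re

# Hypothetical patterns for illustration only; a real deployment would use
# expressions and lexicons tuned by the processing archivist.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def flag_sensitive(text):
    """Return a dict mapping each pattern name to the matches found in text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits
```

For example, `flag_sensitive("My SSN is 123-45-6789.")` returns `{"ssn": ["123-45-6789"]}`. Pattern matching of this kind only surfaces candidates; an archivist still has to review each flagged message.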
Appraisal Module
ePADD Technical Information
ePADD is written in Java and JavaScript and runs on Apache Tomcat (v7.0), using the Java EE Servlet API (v3.x) and JavaMail (v1.4.2). Text and metadata extraction, indexing, and retrieval are performed by Apache Lucene (v4.7) and Apache Tika (v1.8). Charting and visualization use the D3-based reusable chart library (v0.4.10). Oracle's Java Application Bundler and Launch4j package the application for Mac and Windows, respectively. Other Apache Java libraries (Lang, Commons, CLI, IO, Logging, etc.) are also used. JSON formatting is handled by the org.json and Gson libraries.
ePADD has implemented its own natural language processing (NLP) toolkit, which is used for named entity extraction, disambiguation, and other tasks. This toolkit supplants the Apache OpenNLP library used in earlier beta versions of ePADD: OpenNLP proved insufficient for our needs (at least for name recognition), and after several rounds of customization we built our own named entity recognizer. We continue to use Muse as an internal library within ePADD. The toolkit draws on external datasets such as Wikipedia/DBpedia, Freebase, GeoNames, OCLC FAST, and the LC Subject Headings/LC Name Authority File.
The project is developed in IDEs such as IntelliJ IDEA and Eclipse, built with Apache Maven, Ant, and custom shell scripts, and managed with Git for source control and issue tracking. The ePADD client is browser-based and compatible with Chrome and Firefox. It is optimized for Windows 7 and OS X 10.9/10.10 machines running Java 7 or 8.
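To make the extract, index, and retrieve pipeline concrete, the following Python sketch parses raw messages with the standard library and builds a toy inverted index. It is only a conceptual stand-in: ePADD delegates this work to Lucene and Tika in Java, and the function names `extract`, `build_index`, and `search` are invented for this illustration.

```python
import email
import re
from collections import defaultdict


def extract(raw_message):
    """Parse one RFC 822 message into header metadata and plain body text."""
    msg = email.message_from_string(raw_message)
    # Assumption: only single-part text bodies are handled in this sketch.
    body = "" if msg.is_multipart() else msg.get_payload()
    return {
        "from": msg.get("From", ""),
        "subject": msg.get("Subject", ""),
        "body": body,
    }


def build_index(raw_messages):
    """Map each lowercase token to the set of message positions containing it."""
    index = defaultdict(set)
    for doc_id, raw in enumerate(raw_messages):
        fields = extract(raw)
        text = (fields["subject"] + " " + fields["body"]).lower()
        for token in re.findall(r"[a-z0-9]+", text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of messages that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A production index adds analyzers, fielded queries, ranking, and attachment text extraction, none of which this sketch attempts.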
Correspondents: Resolving multiple accounts into a single entry
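One way to picture this resolution step is as a grouping problem: any two (display name, address) pairs that share a name or an address are treated as the same correspondent. The Python sketch below uses a union-find structure to do that. The matching rule and the function name `resolve_correspondents` are assumptions for illustration; the slides do not state which rule ePADD applies.

```python
def resolve_correspondents(pairs):
    """Group (display_name, address) pairs that share a name or an address.

    Returns a sorted list of address groups, one per resolved correspondent.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for name, address in pairs:
        addr_key = ("addr", address.strip().lower())
        find(addr_key)  # register the address even when no name is given
        if name.strip():
            union(("name", name.strip().lower()), addr_key)

    groups = {}
    for key in list(parent):
        if key[0] == "addr":
            groups.setdefault(find(key), set()).add(key[1])
    return sorted(sorted(group) for group in groups.values())
```

Exact-match grouping like this will over-merge common names and miss spelling variants, which is why a manual review step for the resulting entries remains useful.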
Actions: do not transfer, restrict, reviewed
Processing Module
Disambiguation of names
Discovery & Delivery (Access)
Query generator
1.1 release - August 2015
New features
• Upload of CSV files of email addresses for matching with existing archive
• Search by Date and Date Range
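The two features in this release can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ePADD's implementation: the CSV is assumed to hold one address per row in its first column with no header row, messages are assumed to be `(date, subject)` tuples, and all three function names are hypothetical.

```python
import csv
import io
from datetime import date


def load_addresses(csv_text):
    """Read email addresses from the first column of CSV text, lowercased."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip().lower() for row in reader if row and row[0].strip()}


def match_correspondents(csv_text, archive_addresses):
    """Return the uploaded addresses that also occur in the archive, sorted."""
    wanted = load_addresses(csv_text)
    return sorted(wanted & {a.strip().lower() for a in archive_addresses})


def in_date_range(messages, start, end):
    """Keep (date, subject) tuples whose date falls within start..end inclusive."""
    return [m for m in messages if start <= m[0] <= end]
```

For instance, `in_date_range(messages, date(2015, 8, 1), date(2015, 8, 31))` keeps only the messages dated in August 2015, and a single-date search is the special case where `start == end`.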
Future Roadmap
• Enhance Natural Language Processing Capability
• Enhance the Processing Module Features
• Enhance the Discovery/Delivery Module Features
• Recommend and Test Preservation Strategy
• Collaboration with other Platforms & Services
• Explore Sustainability Model
• Add Restriction Management/Annotation Functions
• Enhance the Error Handling Capability
https://library.stanford.edu/projects/epadd
https://epadd.nimeyo.com/
@e_padd
Glynn Edwards
Peter Chan
Josh Schneider
http://epadd.stanford.edu/epadd/collections