ERulemaking CS501 Final Presentation. Who We Are Summer Plans Sam Phillips –MEng in CSR.S.A in...

eRulemaking CS501 Final Presentation

Who We Are / Summer Plans

• Sam Phillips – MEng in CS; R.S.A in Bedford, MA

• Dan Rassi – Junior in CS; internship with Amazon.com in Seattle

• Michael Wang – MEng in CS; internship?

• Krzysztof Findeisen – Senior in Astro & CS; research

• Raymond McGill – Senior in IS; U.B.S. in Stamford, CT

Project Overview

• Federal requirement to read comments to proposed rulemakings.

• Cornell eRulemaking Initiative (CERI) is working on a system to automatically classify comments.

• Classification techniques need “supervised learning”

– Computer must be provided with human-generated examples.

– Accomplished by experts “annotating” comments.
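To make the supervised-learning idea concrete, here is a minimal sketch of training a classifier from annotated comments. It uses a small naive Bayes model as a stand-in; the slides do not say which classifier CERI uses, and the issue labels ("cost", "environment") and example comments are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def train(examples):
    """examples: (comment_text, issue_label) pairs produced by human annotators."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of example comments
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical annotated examples (not from the project's data).
annotated = [
    ("the fee increase hurts small farms", "cost"),
    ("compliance costs are too high for small business", "cost"),
    ("runoff will pollute the river and harm fish", "environment"),
    ("this rule protects wetlands and wildlife", "environment"),
]
model = train(annotated)
prediction = classify("high costs for small farms", model)
print(prediction)
```

The point of the sketch is the data flow: annotators supply labeled examples, and the program generalizes from them to unlabeled comments.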

Overview – Long Term Goal

• EARS – Electronic Annotation and Rulemaking System

– Will provide a single interface for managing comments the government receives as part of its eRulemaking process

– Will use Natural Language Processing (NLP) tools to automate handling of large comment sets

– Each is a computational challenge.

Our Part

• Project is in very early stages

• The Workgroup was asked to investigate computational and functional aspects of future system.

• Goals of project were:

– 1) Generate base system framework which will be extended by future groups.

– 2) Investigate and document possible future functionality.

Project Summary

• Very Successful!!

• Base functionality close to beta stage as Callisto replacement.

• Significant lessons learned regarding importance of various functionality.

• Documentation of all functionality which has been discussed to facilitate future extension.

Activities From Phase III To Now

• Meeting With Hronn Brynjarsdottir For Usability

– Thoughts about summaries of information

– Text-to-speech

– Access w/out mouse?

• Meeting With LII Annotator Group

– Thoughts on features for “context menu”

– Feedback on some proposed functionality (e.g. list of issues in context menu not so important)

• Meeting With NLP Group

– Future Directions

• Meetings with Tom Bruce

– Summarized Possible Functionality As Reported In Phase III

– Selected A Subset To Focus On

Working Demo

Go!

EARS System Overview

[Diagram of system components: Admin Tools; User Management; Rulemaker Tools; Annotation; NLP Tools; Workflow Tools; Automated Tagging; Active Learning; Intra/Inter Comment Summaries; Web Service Considerations; Flagging; Filter / Search]

Proposed Functionality: I

• Natural Language Processing

– Program can display the comments for each rule.

– Annotators can choose different comments from the full comment set.

– Comments will support only one set of annotations.

– The interface supports up to 50 issue tags per regulation.

– Annotators can add annotations.

– Annotators can delete annotations.

– Annotators can change the order of issues while annotating.

– Annotators can view comment text with highlighted annotations.

– Program can show annotations in multiple colors.

– Program allows import and export of XML annotation data in the ATLAS format.

– Program allows import and export from NLP format.

– Annotators can feed changes back into the NLP.

– Program supports “inter-annotator agreement” code produced by NLP group.
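As an illustration of what an "inter-annotator agreement" computation can look like, the sketch below calculates Cohen's kappa for two annotators who labeled the same items. This is an assumption for illustration only: the slides do not say which agreement measure the NLP group's code uses, and the labels shown are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical issue tags assigned to six text segments by two annotators.
annotator_1 = ["cost", "cost", "safety", "none", "safety", "cost"]
annotator_2 = ["cost", "safety", "safety", "none", "safety", "cost"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")  # 17/23, about 0.739
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why a chance-corrected measure is often preferred over raw percent agreement.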

Proposed Functionality: II

• Workflow Tools

– Program supports nested, hierarchical, or otherwise organized issues.

– Users can view lists of comments which match filters.

– Program supports “flags”: non-issue-related metadata that applies to a full comment.

– Users can view a report summarizing comments for a given rule.

– Program automatically collects workflow metadata (last read by, annotation history, etc).

– Program has Undo and Redo functions.

– Users can see and post to an announcement page. (provided by Drupal)

– Program has a general method to grab text from annotation spans.

– Program reproduces the tab interface from Callisto.

– Program provides an info block displaying information about annotations and issues.

– Users have a general way to jump to annotations within comments by following a hyperlink.

– Users can customize annotation colors.
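Several of the items above (grabbing text from annotation spans, viewing highlighted annotations) come down to treating an annotation as a character range over the comment text. The following is a minimal sketch under that assumption; the `Annotation` class, the bracket markers, and the sample comment are hypothetical and not taken from the EARS code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # offset of the first annotated character (inclusive)
    end: int    # offset one past the last annotated character (exclusive)
    issue: str  # issue tag attached to the span


def annotate(comment, phrase, issue):
    """Helper for the example: build an Annotation covering `phrase`."""
    start = comment.index(phrase)
    return Annotation(start, start + len(phrase), issue)


def span_text(comment, ann):
    """Grab the text covered by an annotation span."""
    return comment[ann.start:ann.end]


def highlight(comment, annotations):
    """Render non-overlapping spans as [issue: text] markers."""
    pieces, cursor = [], 0
    for ann in sorted(annotations, key=lambda a: a.start):
        if ann.start < cursor:
            raise ValueError("overlapping spans are not handled in this sketch")
        pieces.append(comment[cursor:ann.start])
        pieces.append(f"[{ann.issue}: {span_text(comment, ann)}]")
        cursor = ann.end
    pieces.append(comment[cursor:])
    return "".join(pieces)


comment = "The fee is too high and the deadline is too short."
anns = [
    annotate(comment, "deadline is too short", "timing"),
    annotate(comment, "fee is too high", "cost"),
]
rendered = highlight(comment, anns)
print(rendered)
```

In a web front-end the bracket markers would presumably be replaced by colored markup, but the offset bookkeeping stays the same.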

Proposed Functionality: III

• Administration Tools

– Users can add rules.

– Users can add comments.

– Users can add new tags and flags.

– Users can change sets of issues associated with a regulation.

– Users can modify issue tags (including adding and merging issues).

– Users can delete rules, comments, tags, and flags.

• User Management

– Program functions require user authentication. (provided by Drupal)

– Program has a site administrator who handles user accounts, new sections, etc. (provided by Drupal)

– The annotation database shares user roles from Drupal.

• Web Services

– System is compatible with Firefox 1.5+ and Internet Explorer 5.5+.

– Program is protected from common security threats (e.g. SQL injection attacks).

– System handles comment locking and concurrency in a multi-user environment.

– Program automatically saves changes.

– Program indicates to the user when the connection is lost.

– SQL queries are separated from code into their own file.

– System is designed with low coupling between the front-end, middleware and back-end.

– Entire system can be extended into new modules. (provided by Drupal)
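One common way to meet the SQL injection requirement, and to keep query text apart from program logic, is to use parameterized queries. The sketch below shows the idea with Python's built-in `sqlite3` module as a stand-in; the actual EARS system sits on Drupal, and the table layout, column names, and sample rows here are assumptions made for the example.

```python
import sqlite3

# Query text kept in one place, separate from the logic that uses it.
CREATE_COMMENTS = (
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
)
INSERT_COMMENT = "INSERT INTO comments (author, body) VALUES (?, ?)"
FIND_BY_AUTHOR = "SELECT id, body FROM comments WHERE author = ?"


def comments_by_author(conn, author):
    """The '?' placeholder passes `author` as data, never as SQL text."""
    return conn.execute(FIND_BY_AUTHOR, (author,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_COMMENTS)
conn.executemany(
    INSERT_COMMENT,
    [("alice", "The fee is too high."), ("bob", "I support the rule.")],
)

# A classic injection attempt: with string concatenation this would match
# every row; as a bound parameter it is just an author name nobody has.
hostile = "alice' OR '1'='1"
print(comments_by_author(conn, "alice"))
print(comments_by_author(conn, hostile))
```

Because the hostile string is bound as a value, the second lookup returns no rows instead of the whole table.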

Deliverables

• System Framework: Demo System

• SVN Repository Access.

• Documentation, Documentation, Documentation!

– Our View Of This Project: Stakeholders, Requirements, Exhaustive Future Functionality

– Subsystem Reports (1 Per Layer): Requirements, Design Considerations, Implementation Details, Examples of Extension

– Minutes From Meetings

– Phase I, II, III Reports

– Basic User Interaction Example

Where To Go: Beta Use For LII

• This project can serve as a Callisto Replacement

• Simpler Interface

• Automatic Persistence

• Support for Multiple Users

• Straightforward Customizability

• Support For Hierarchies

• More Natural Coupling To NLP

– E.g. partial processing

– E.g. Inter-annotator agreement “modules”

Suggested LII Task List To Beta 1.0

• Simple Admin Module [Est: 1 week]

• Modification to capture Drupal sign-in in annotation module [Est: 4 hrs]

• Security Review / Update [Est: 8 hrs]

• Backup / Rollback [Est: 4 hrs]

• Agreement With NLP on Intermediate Format & Implementation [Est: 10 hrs]

• Tracking of user activity / subset of comments for each user [Est: 16 hrs]

• Updates for multiple users at same time [Est: 8 hrs]

Where To Go: Beta Use For Gov’t

• The EARS system promises many advantages for Agency Rulemakers which are distinct from the NLP.

– Central Location For Comment Storage

– Multi-user, real-time environment

– Sorting / Searching / Filtering For Tags / Flags / Read / Author Information

– Comment Summaries / Issue Summaries

• E.g. All segments about X on one PDF

• E.g. % of comments read

• E.g. % of comments containing issue X

– Future Handling of Near-Dupes
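The percentage-style summaries listed above are straightforward to compute once each comment records its read status and issue tags. Here is a minimal sketch; the dictionary layout (`read`, `issues`) and the sample data are assumptions for illustration, not the project's schema.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, returning 0.0 for an empty set."""
    return 100.0 * part / whole if whole else 0.0


def summarize(comments, issue):
    """Report % of comments read and % of comments tagged with `issue`."""
    total = len(comments)
    read = sum(1 for c in comments if c["read"])
    with_issue = sum(1 for c in comments if issue in c["issues"])
    return {
        "pct_read": percent(read, total),
        "pct_with_issue": percent(with_issue, total),
    }


# Hypothetical comment records for one rule.
comments = [
    {"id": 1, "read": True, "issues": {"cost"}},
    {"id": 2, "read": True, "issues": {"cost", "timing"}},
    {"id": 3, "read": False, "issues": set()},
    {"id": 4, "read": True, "issues": {"timing"}},
]
summary = summarize(comments, "cost")
print(summary)
```

With the four sample records, three are read and two carry the "cost" tag, so the report shows 75% read and 50% containing the issue.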

Suggested Gov’t Task List To Beta 1.0

• Creation of “Administration Suite” [4 weeks]

– Add / Remove Issue Sets, Comments, Rules

– Import Wizards

• Creation of “Summary Views” [2.5 weeks]

– Filter Across Comments

– Like Guided Search

• Add Multi-user interaction [2 weeks]

– Show / Hide Others’ Annotations

– Lock / Unlock Comments

• NLP Integration [12+ wks]

• Self Install / Uninstall [4 wks]
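For planning purposes, the estimates on this slide can be added up as a quick worked calculation. The sketch below treats the open-ended "12+" NLP figure as a lower bound of 12 weeks, which is an assumption, so the total is a minimum rather than a forecast.

```python
# Estimates in weeks, as listed on the slide. "12+" is taken as 12 here,
# so the sum is a lower bound on the total effort.
estimates_weeks = {
    "Administration Suite": 4,
    "Summary Views": 2.5,
    "Multi-user interaction": 2,
    "NLP Integration": 12,
    "Self Install / Uninstall": 4,
}

minimum_total = sum(estimates_weeks.values())
nlp_share = 100.0 * estimates_weeks["NLP Integration"] / minimum_total

print(f"At least {minimum_total} weeks in total")
print(f"NLP Integration is about {nlp_share:.0f}% of that minimum")
```

Even at its lower bound, NLP integration accounts for roughly half of the listed effort.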

Where To Go: Beta Use For NLP

• NLP presents unique challenges:

– Actual NLP behavior not well defined

– Actual NLP running time not well defined

– May include “Active Learning”

• This Behavior Is Leading Edge

• System may actively interact with users

• May require “push” behavior

– NLP Performance is unknown and will influence design decisions

• 70% accuracy vs 95% accuracy
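Two of the points above can be illustrated with a short sketch. The first function shows uncertainty sampling, one common active-learning strategy in which the system asks annotators to review the comments it is least sure about; the slides do not say which strategy EARS would use. The second shows why 70% versus 95% accuracy matters for design. The confidence scores and the 10,000-comment volume are invented for illustration.

```python
def least_confident(confidences, k):
    """Pick the k comment ids whose predicted label has the lowest confidence."""
    ranked = sorted(confidences.items(), key=lambda item: item[1])
    return [comment_id for comment_id, _ in ranked[:k]]


def expected_errors(n_comments, accuracy):
    """Expected number of misclassified comments at a given accuracy."""
    return round(n_comments * (1 - accuracy))


# Hypothetical classifier confidence for each comment's predicted label.
confidences = {"c1": 0.97, "c2": 0.55, "c3": 0.71, "c4": 0.99, "c5": 0.62}
review_queue = least_confident(confidences, 2)

errors_at_70 = expected_errors(10_000, 0.70)
errors_at_95 = expected_errors(10_000, 0.95)

print(review_queue)
print(errors_at_70, errors_at_95)
```

On the assumed volume, the lower accuracy leaves 3,000 comments for humans to catch versus 500 at the higher one, which changes how much review tooling the interface needs.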

Conclusion

• Very Successful Project!

• We Had Fun!

Thanks!