eRulemaking CS501 Presentation 2


Transcript of eRulemaking CS501 Presentation 2

Page 1: eRulemaking CS501 Presentation 2

eRulemaking CS501 Presentation 2

Page 2: eRulemaking CS501 Presentation 2

Who We Are

• Sam Phillips
– MEng in CS
• Dan Rassi
– Junior in CS
• Michael Wang
– MEng in CS
• Krzysztof Findeisen
– Senior in Astro and CS
• Raymond McGill
– Senior in IS

Page 3: eRulemaking CS501 Presentation 2

Project Overview

• Federal agencies are required to read the comments submitted on proposed rulemakings
• The Cornell eRulemaking Initiative (CERI) is working on a system to automatically classify comments
• Classification techniques need “supervised learning”, i.e. comments that people have already annotated to train on

Page 4: eRulemaking CS501 Presentation 2

Overview

• EARS – Electronic Annotation and Rulemaking System

– Will provide a single interface for managing comments the government receives as part of its eRulemaking process

– Will use Natural Language Processing (NLP) tools to automate handling of large comment sets

• We are working on a prototype EARS for the Legal Information Institute (LII)

• Tom Bruce of the LII is our chief contact, but we are also working with several other LII groups

• As of Phase II, we had a simple, nonfunctional website that demonstrated our interface

Page 5: eRulemaking CS501 Presentation 2

The Stakeholders

• Funding: NSF
• Long-Term Users: Agency Analysts
• Grantee: Cornell eRulemaking Initiative
• Grantee: Other Universities
• Subject Matter Experts: LII Student Annotators
• Researchers: NLP Group
• Researchers: Usability
• Software: Our Group

Page 6: eRulemaking CS501 Presentation 2

Term Dictionary

• Rule / Reg.: Proposed rule by a federal agency
• Rulemaker / Analyst: Domain expert in agency
• Issue: A logical facet which the Rule impacts
• Annotate / Tag (v): To “highlight” text and associate it with a specific issue
• Metadata: Data about data
– (e.g. e-mail to/from/size)
• Tag (n): An issue as metadata
• Flag (n): Non-issue-related metadata (e.g. workflow)
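The vocabulary above maps naturally onto a small data model. The following is a minimal sketch only, assuming character offsets for highlights; the class and field names (`Issue`, `Annotation`, `Comment`, `flags`) are hypothetical and are not taken from the EARS code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Issue:
    """A logical facet that a rule impacts; used as a tag (hypothetical model)."""
    name: str


@dataclass
class Annotation:
    """A highlighted span of comment text associated with one issue."""
    start: int  # offset of the first highlighted character
    end: int    # offset one past the last highlighted character
    issue: Issue


@dataclass
class Comment:
    """A public comment on a rule, with its annotations and flags."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)
    # Flags hold non-issue metadata such as workflow state.
    flags: Dict[str, str] = field(default_factory=dict)

    def annotate(self, start: int, end: int, issue: Issue) -> Annotation:
        """Highlight text[start:end] and associate it with an issue."""
        if not 0 <= start < end <= len(self.text):
            raise ValueError("highlight must lie inside the comment text")
        annotation = Annotation(start, end, issue)
        self.annotations.append(annotation)
        return annotation

    def highlighted(self, issue: Issue) -> List[str]:
        """Return every highlighted passage tagged with the given issue."""
        return [self.text[a.start:a.end]
                for a in self.annotations if a.issue == issue]
```

In this sketch a tag is simply the `Issue` attached to an annotation, while a flag lives on the comment as a whole, which mirrors the noun definitions of "Tag" and "Flag" in the list.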

Page 7: eRulemaking CS501 Presentation 2

Activities From Start To Phase II

• Meetings With Tom Bruce
– Introduced Project
– Explained Requirements / Known Unknowns
• Meetings With LII Student Annotators Heidi Craig and Laura Klimpel
– Discussed Current Annotation System
– Got Feedback for Early Design Ideas
• Created Static Webpage To Prove That It’s Possible
• Attended Full CERI Meetings

Page 8: eRulemaking CS501 Presentation 2

Example

Page 9: eRulemaking CS501 Presentation 2

Activities Since Phase II Report

• Creation of Backend / Middleware Architecture
– Backend in relational MySQL database
– Middleware in OO PHP
• Clarification of Some Requirements
– XML Format
– Color of highlights
• Discovery of Some Known Unknowns
– How NLP System Should React
– How Extra Data Should Be Displayed
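The slides do not show the XML format that was agreed on, so the following is only an illustration of what exchanging annotations as XML could look like. The element and attribute names (`comment`, `annotation`, `start`, `end`, `issue`) are assumptions, and Python stands in for the project's PHP middleware.

```python
import xml.etree.ElementTree as ET


def annotations_to_xml(comment_id, annotations):
    """Serialize (start, end, issue) triples for one comment to an XML string.

    The element and attribute names are hypothetical, not the EARS format.
    """
    root = ET.Element("comment", {"id": str(comment_id)})
    for start, end, issue in annotations:
        ET.SubElement(
            root,
            "annotation",
            {"start": str(start), "end": str(end), "issue": issue},
        )
    return ET.tostring(root, encoding="unicode")


def annotations_from_xml(xml_text):
    """Parse the XML produced above back into (comment_id, triples)."""
    root = ET.fromstring(xml_text)
    triples = [
        (int(a.get("start")), int(a.get("end")), a.get("issue"))
        for a in root.findall("annotation")
    ]
    return root.get("id"), triples
```

A round trip through both functions returns the same spans and issue names, which is the property an NLP component on the other end of the connection would rely on.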

Page 10: eRulemaking CS501 Presentation 2

System Overview

Login
– Administrator
  • Add / Remove Rules, Tags, Comments
– Annotator
  • Choose Rule
  • Choose Comment
  • Add / Remove Annotations
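The two paths after login can be read as a role-to-permission mapping. This is a minimal sketch under that reading; the action names are hypothetical, and whether an administrator may also annotate is not stated in the slides, so the sketch keeps the two roles disjoint.

```python
from enum import Enum


class Role(Enum):
    ADMINISTRATOR = "administrator"
    ANNOTATOR = "annotator"


# Hypothetical action names, one per box in the system overview.
PERMISSIONS = {
    Role.ADMINISTRATOR: {
        "add_rule", "remove_rule",
        "add_tag", "remove_tag",
        "add_comment", "remove_comment",
    },
    Role.ANNOTATOR: {
        "choose_rule", "choose_comment",
        "add_annotation", "remove_annotation",
    },
}


def is_allowed(role, action):
    """Return True if the given role may perform the given action."""
    return action in PERMISSIONS[role]
```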

Page 11: eRulemaking CS501 Presentation 2

Design Overview

• Web site backed by a central database

Page 12: eRulemaking CS501 Presentation 2

General Design Strategy

• Our system architecture is highly modular

– Website, database, etc. can be swapped out easily

• All components already available on LII servers

Page 13: eRulemaking CS501 Presentation 2

Database Design

• Primary goal: flexibility
– Unified representation of data
– Supports more than our web release will
– Lots of room for administrator preferences
• Secondary goal: speed
– 4000 regulations issued per year
– Usually ~100, up to 500,000 comments per regulation
– Demands on the LII version will be much lower
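The actual schema appears only as a diagram on the following slides, so what follows is a rough sketch of how the two goals could be met, not the team's design. It assumes a single `metadata` table that stores both tags and flags (one way to get a "unified representation") plus indexes on the lookup paths that matter when a rule has many comments. All table and column names are hypothetical, and SQLite is used in place of MySQL so the example runs without a server.

```python
import sqlite3

# Hypothetical schema: names and columns are assumptions for illustration.
SCHEMA = """
CREATE TABLE rule (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE comment (
    id      INTEGER PRIMARY KEY,
    rule_id INTEGER NOT NULL REFERENCES rule(id),
    body    TEXT NOT NULL
);
-- Tags (issue highlights) and flags (workflow data) share one table.
CREATE TABLE metadata (
    id         INTEGER PRIMARY KEY,
    comment_id INTEGER NOT NULL REFERENCES comment(id),
    kind       TEXT NOT NULL CHECK (kind IN ('tag', 'flag')),
    name       TEXT NOT NULL,
    span_start INTEGER,  -- NULL for flags, which apply to the whole comment
    span_end   INTEGER
);
-- Indexes for the common lookups: comments of a rule, metadata of a comment.
CREATE INDEX comment_by_rule ON comment(rule_id);
CREATE INDEX metadata_by_comment ON metadata(comment_id);
"""


def open_demo_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def tags_for_rule(conn, rule_id):
    """Count how often each tag was applied across a rule's comments."""
    rows = conn.execute(
        "SELECT m.name, COUNT(*) FROM metadata m "
        "JOIN comment c ON c.id = m.comment_id "
        "WHERE c.rule_id = ? AND m.kind = 'tag' "
        "GROUP BY m.name ORDER BY m.name",
        (rule_id,),
    )
    return rows.fetchall()
```

Keeping tags and flags in one table means a new kind of metadata needs no schema change, at the cost of nullable span columns.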

Page 14: eRulemaking CS501 Presentation 2

Database Design

Page 15: eRulemaking CS501 Presentation 2

Database Implementation

Page 16: eRulemaking CS501 Presentation 2

Web Technology

• Currently using the Drupal Content Management System on the LII server to host our web application, but we have minimized this dependence

• Website uses JavaScript to dynamically change contents of page when user performs an action

• AJAX technology is used to send annotations between client and server without reloading page

• Our primary goal has been client compatibility across major browsers and operating systems
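To make the AJAX exchange concrete, here is a sketch of the server side of one annotation request. It assumes the client posts a JSON body with `start`, `end`, and `issue` fields; that payload shape and the function name are hypothetical, and Python stands in for the PHP middleware the project actually uses.

```python
import json


def handle_annotation_request(body, comment_length):
    """Validate a posted annotation and return a JSON response string.

    The payload fields (start, end, issue) are an assumed format.
    """
    try:
        data = json.loads(body)
        start = int(data["start"])
        end = int(data["end"])
        issue = str(data["issue"])
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, missing fields, and non-numeric offsets.
        return json.dumps({"ok": False, "error": "malformed request"})

    if not 0 <= start < end <= comment_length:
        return json.dumps({"ok": False, "error": "span out of range"})

    # A real handler would store the annotation here before replying.
    return json.dumps(
        {"ok": True, "annotation": {"start": start, "end": end, "issue": issue}}
    )
```

Because the response is a small JSON document rather than a new page, client-side JavaScript can update the highlight in place, which is the point of sending annotations without reloading.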

Page 17: eRulemaking CS501 Presentation 2

Working Demo

Go!

Page 18: eRulemaking CS501 Presentation 2

Where We’re Going

• Documentation
– Describe SQL Schema and ER Diagram To Future CS501 Groups
  • Include Design Decisions
  • Include MySQL-specific queries
– Describe How Implemented Features Work
  • Low Level (Comments in Code)
  • High Level (Why Features Are Needed / Trade-Offs)
– Describe How Unimplemented Features Might Work
  • Design Considerations
  • Stakeholders Affected

Page 19: eRulemaking CS501 Presentation 2

Where We’re Going (2)

• Features
– Will Certainly Add
  • UI To Add / Remove
    – Comments
    – Rules
    – Metadata Sets
    – Metadata Names
– Will Fix UI For
  • Deleting Comments
  • Navigating Comments
– May Add
  • Hierarchical Tags
  • “Fake” NLP Interaction
  • Multi-user Interaction
  • NLP XML To/From Connection
  • Colors

Page 20: eRulemaking CS501 Presentation 2

Future Work

ID  Task Name                          Duration  Start        Finish
6   Refine And Select Web Layout       3 days    Tue 2/27/07  Thu 3/1/07
7   Install Web Manager                1 day     Sun 2/25/07  Sun 2/25/07
8   Install CVS System                 7 days    Fri 2/16/07  Thu 2/22/07
9   Dummy Website                      4 days    Wed 2/28/07  Sat 3/3/07
10  Refine Website                     4 days    Sat 3/3/07   Tue 3/6/07
11  1st Stage Presentation             4 days    Fri 3/2/07   Mon 3/5/07
12  Presentation and Report            1 day     Tue 3/6/07   Tue 3/6/07
13  Website Feedback                   3 days    Wed 3/7/07   Fri 3/9/07
14  Install DBMS                       7 days    Wed 2/21/07  Tue 2/27/07
15  Learn About NLP                    16 days   Sun 2/25/07  Mon 3/12/07
16  Design Database                    7 days    Tue 3/6/07   Mon 3/12/07
17  Implement Database                 5 days    Tue 3/13/07  Sat 3/17/07
18  Design Middle Tier                 7 days    Tue 3/13/07  Mon 3/19/07
19  Implement Middle Tier              7 days    Mon 3/19/07  Sun 3/25/07
20  Refine Middle Tier                 7 days    Mon 3/26/07  Sun 4/1/07
21  Write Manual                       7 days    Sat 3/31/07  Fri 4/6/07
22  Write Back-End Documentation       4 days    Tue 4/3/07   Fri 4/6/07
23  2nd Stage Presentation             7 days    Tue 3/27/07  Mon 4/2/07
24  Presentation and Report            1 day     Tue 4/3/07   Tue 4/3/07
25  Major Review                       4 days    Tue 4/3/07   Fri 4/6/07
26  Design Annotation Interface        15 days   Tue 3/6/07   Tue 3/20/07
27  Implement Annotation Interface     15 days   Tue 3/6/07   Tue 3/20/07
28  Refine Annotation Interface        7 days    Fri 4/6/07   Thu 4/12/07
29  Design Issue Set Interface         7 days    Fri 4/6/07   Thu 4/12/07
30  Implement Issue Set Interface      7 days    Thu 4/12/07  Wed 4/18/07
31  Refine Issue Set Interface         7 days    Wed 4/18/07  Tue 4/24/07
32  Write Manual                       7 days    Mon 4/16/07  Sun 4/22/07
33  Review and Polish Product          7 days    Thu 4/19/07  Wed 4/25/07
34  3rd Stage Presentation             7 days    Wed 4/25/07  Tue 5/1/07
35  Presentation, Report, and Release  1 day     Thu 5/10/07  Thu 5/10/07

[Gantt chart: timeline through March, April, and May 2007, with milestones marked at 3/6, 4/3, and 5/10]