Presentation Agenda Introduction NSF Project Overview Current State Of The Art Our Understanding Of Your Requirements Design Implementation / Demo Progress Questions?


Presentation Agenda

Introduction

NSF Project Overview

Current State Of The Art

Our Understanding Of Your Requirements

Design

Implementation / Demo

Progress

Questions?


eRulemaking

CS501 Presentation 1

The Workgroup


Who We Are

• Sam Phillips
  – MEng in CS
• Dan Rassi
  – Junior in CS
• Michael Wang
  – MEng in CS
• Krzysztof Findeisen
  – Senior in Astro and CS
• Raymond McGill
  – Senior in IS


Federal Rulemaking

• Executive agencies issue over 4000¹ regulations per year
  – Preliminary regulations published daily as Notices of Proposed Rulemaking (NPRMs)
  – Public can submit feedback on NPRMs
  – Usually ~100, up to 500,000 comments per regulation

¹ C. Cardie, C. Farina, & T. Bruce. Using Natural Language Processing to Improve eRulemaking. In Proceedings of 2006 International Conference on Digital Government Research, San Diego, 2006.


Rules and Comments

• Rules tend to be long and address several “issue topics”.
  – Well organized, “written like laws”
• Comments vary in type significantly
  – From individuals in organizations (e.g. Sierra Club, NRA)
  – From professionals (e.g. lawyers / lobbyists / domain experts)
  – From potential stakeholders (beneficiaries, those potentially hurt)
  – General public
• Comments may address none to several of the “issue topics”


Putting the “e” in “eRulemaking”

• Federal directive to read and consider all comments
• Currently comments are read and sorted by hand
• For controversial issues, this is a lot of work!
• Natural Language Processors (NLPs) can be used to classify comments
• NLP software is “trained” through annotation of a subset of comments

Ideally the system can be automated
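The idea of training on an annotated subset and then classifying the remaining comments can be sketched in a few lines. This is only an illustration, not the NLP group's actual system: it uses a multinomial naive Bayes classifier (our choice for the example), and the issue names, sample comments, and function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(annotated):
    """Count words per issue from (comment_text, issue) pairs."""
    word_counts = defaultdict(Counter)  # issue -> word -> count
    issue_counts = Counter()            # issue -> number of comments
    for text, issue in annotated:
        issue_counts[issue] += 1
        word_counts[issue].update(tokenize(text))
    return word_counts, issue_counts


def classify(text, word_counts, issue_counts):
    """Return the most probable issue (naive Bayes, add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(issue_counts.values())
    best_issue, best_score = None, float("-inf")
    for issue, n in issue_counts.items():
        score = math.log(n / total)  # log prior of the issue
        denom = sum(word_counts[issue].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((word_counts[issue][word] + 1) / denom)
        if score > best_score:
            best_issue, best_score = issue, score
    return best_issue


# Hypothetical annotated subset: (comment text, issue label).
annotated = [
    ("The rule raises costs for small carriers", "cost"),
    ("Compliance costs are too high for owners", "cost"),
    ("This will make highways safer for drivers", "safety"),
    ("Safer trucks mean fewer highway accidents", "safety"),
]
model = train(annotated)
print(classify("Costs for small owners will rise", *model))  # prints "cost"
```

A real comment usually touches several issues at once, so a production system would need multi-label output rather than the single label returned here.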


The Project

• The Legal Information Institute (LII) is working on automating the sorting process

Our propose [sic] to apply and develop a range of methods from the field of natural language processing (NLP) to create NLP tools to aid agency rule writers in:
  – organization, analysis, and management of the sometimes overwhelming volume of comments, studies, and other supporting documents associated with a proposed rule; and
  – analyzing proposed rules to flag possibly relevant mandates from the large number of statutes and Executive Orders that require studies, consultations, or certifications during rulemaking.

C. Cardie, C. Farina, & T. Bruce. Using Natural Language Processing to Improve eRulemaking. In Proceedings of 2006 International Conference on Digital Government Research, San Diego, 2006.


The Stakeholders

[Diagram: Rulemakers / NSF; Cornell eRulemaking Initiative; LII Annotators; Our Group; Natural Language Processor Group; Other Universities]


Related Projects

• Carnegie Mellon is working on a set of analysis tools²
  – Comment statistics
    • Redundancy
    • Stakeholder phrases
  – Correlations between issues
    • Unknown interest groups
• University of Pittsburgh and University of Southern California are also working on eRulemaking.

² J. Callan, R. Krishnan, & P. Suen. CMU eRulemaking Project Description. http://erulemaking.cs.cmu.edu/.


Current Analyst Workflow

• Analysts receive comments by e-mail
• They filter comments for useful statements
• Build an issues-(comment summary) matrix as they read comments
• Categorize type of commenter
• Organize by section of regulation
• Combine massive charts, discuss, analyze
• If rule is adopted, analysts publish statement on how they addressed the comments³

³ C. Cardie, C. Farina, & T. Bruce. Using Natural Language Processing to Improve eRulemaking. In Proceedings of 2006 International Conference on Digital Government Research, San Diego, 2006.
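The issues-(comment summary) matrix from the analyst workflow could be represented as a nested mapping from issue to comment to summary. The sketch below is only one possible shape; the comment ids, issue names, and summaries are invented for illustration.

```python
from collections import defaultdict


def build_matrix(notes):
    """Group analyst notes into matrix[issue][comment_id] = summary."""
    matrix = defaultdict(dict)
    for comment_id, issue, summary in notes:
        matrix[issue][comment_id] = summary
    return matrix


def print_matrix(matrix, comment_ids):
    """Print one row per issue and one column per comment."""
    for issue in sorted(matrix):
        cells = [matrix[issue].get(cid, "-") for cid in comment_ids]
        print(issue.ljust(10) + " | " + " | ".join(cells))


# Hypothetical notes: (comment id, issue, one-line summary).
notes = [
    ("c1", "cost", "too expensive"),
    ("c2", "safety", "supports rule"),
    ("c2", "cost", "cost is acceptable"),
]
matrix = build_matrix(notes)
print_matrix(matrix, ["c1", "c2"])
```

Commenter type and regulation section, which the analysts also track, could be added as extra keys in the same structure.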


Analyst Flowchart

³ C. Cardie, C. Farina, & T. Bruce. Using Natural Language Processing to Improve eRulemaking. In Proceedings of 2006 International Conference on Digital Government Research, San Diego, 2006.


Current LII Annotator Workflow

• Annotators have set of ~300 comments from Department of Transportation
• Annotators agree a priori on set of issues
• Issue set relatively large (38)
• Annotators identify phrases in each comment with one or more issues (this is annotating)
  – Multiple annotators per comment for research purposes
• Early annotating picks up overlooked issues
  – Tom Bruce updates issue set
• Annotated comments delivered to the NLP group
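One way to picture what the annotators produce: each annotation ties a span of a comment to one or more issues, and because several annotators mark the same comment, their choices can be compared. The sketch below uses hypothetical field names and data. Jaccard overlap appears only as a simple illustrative agreement measure and is not claimed to be the one used in the research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    comment_id: str
    annotator: str
    start: int          # character offset where the phrase begins
    end: int            # character offset just past the phrase
    issues: frozenset   # one or more issue names


def issues_by_annotator(annotations, comment_id):
    """Collect the set of issues each annotator assigned to one comment."""
    result = {}
    for a in annotations:
        if a.comment_id == comment_id:
            result.setdefault(a.annotator, set()).update(a.issues)
    return result


def jaccard(a, b):
    """Overlap of two issue sets: |a & b| / |a | b| (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical annotations by two annotators on the same comment.
annotations = [
    Annotation("c1", "ann_a", 0, 24, frozenset({"cost"})),
    Annotation("c1", "ann_a", 30, 58, frozenset({"safety", "enforcement"})),
    Annotation("c1", "ann_b", 0, 20, frozenset({"cost"})),
    Annotation("c1", "ann_b", 31, 58, frozenset({"safety"})),
]
per_annotator = issues_by_annotator(annotations, "c1")
print(jaccard(per_annotator["ann_a"], per_annotator["ann_b"]))
```

Here the two annotators share two of three distinct issues, so the printed overlap is two thirds.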


Callisto Demo

• Callisto is the software LII annotators use to annotate
• Callisto is published by the MITRE Corporation
• Although it works, it is not well-suited for eRulemaking


Term Dictionary

• Rule / Reg.: Proposed rule by a federal agency
• Rulemaker / Analyst: Domain expert in agency
• Issue: A logical facet which the Rule impacts.
• Annotate / Tag (v): To “highlight” text and associate it with a specific issue.
• Tag (n): The implementation of a tag as metadata
• Flag (n): Non-issue related metadata (e.g. workflow)


Requirements

• Our understanding of your immediate requirements is:
  – The system is accessible to any reasonable client system
  – The system can display several hundred annotated or NLP-processed comments and indicate how each comment is classified
  – The system must be extensible, so that the LII can continue working towards a production system
  – The system can display the annotations associated with each comment
  – The system allows users to add or modify annotations


Requirements

• Our understanding of your optional requirements is:
  – The system can feed comments with changed annotations into the NLP
  – The system allows users (or a subset thereof) to change the set of issues associated with a regulation (grow/collapse)
  – The system allows comments to have flags not directly related to issues
  – The system can handle large numbers of regulations (thousands) and comments per regulation (tens or hundreds of thousands)
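Changing the issue set (grow/collapse) implies that existing annotations have to be relabelled when issues are merged. A minimal sketch, assuming annotations are stored as a set of issue names per comment; the function names and sample issues are hypothetical.

```python
def grow(issue_set, new_issue):
    """Return a new issue set with one issue added."""
    return issue_set | {new_issue}


def collapse(issue_set, tags, merged, into):
    """Merge the issues in `merged` into the single issue `into`.

    `tags` maps comment id -> set of issue names. Returns the new issue
    set and a relabelled copy of `tags`; the inputs are left unchanged.
    """
    merged = set(merged)
    new_issues = (issue_set - merged) | {into}
    new_tags = {}
    for comment_id, issues in tags.items():
        kept = issues - merged
        if issues & merged:  # comment carried at least one merged issue
            kept = kept | {into}
        new_tags[comment_id] = kept
    return new_issues, new_tags


issues = {"fuel cost", "labor cost", "safety"}
tags = {"c1": {"fuel cost"}, "c2": {"labor cost", "safety"}, "c3": {"safety"}}

issues = grow(issues, "privacy")
issues, tags = collapse(issues, tags, {"fuel cost", "labor cost"}, into="cost")
print(sorted(issues))
print({cid: sorted(t) for cid, t in tags.items()})
```

Returning copies rather than editing in place keeps the old labelling available, which may matter if the changed annotations are later fed back to the NLP.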


Requirements

• Our understanding of your long-term requirements is:
  – The system supports hierarchies of issues
  – The system blends into the federal department’s workflow
  – The system must be easy to set up and install


Assumptions

• Government agencies work roughly as summarized in the transcripts provided.
• When government agencies adopt annotation, they will do so similarly to the LII
• LII prefers a solid but feature-sparse prototype to a feature-laden but less easily extensible version
• LII prefers a system designed for Rulemakers first, with “research” interests as a secondary concern.


Design / Implementation

• Based on your requirements, we have selected the iterative design process.
  – Several iterations over the whole project
  – Implement mock-ups to help clarify
• Many stakeholders
  – Full requirements unknown
  – Underspecified UI very important
• Prototype System
  – Desires may change if practical issues crop up


Design

• Our system design is a standard relational database-driven website
  – Lots of implementation software available
  – “Drag and drop” content modules
  – Minimal retraining of team members
  – Natural “three tier” architecture
  – Front-end / Middleware / Backend can be replaced independently
  – Simple cross-platform compatibility because of web interface
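The back-end tier of such a design can be pictured as a handful of related tables. The sketch below is an assumption made for illustration, not the project's schema: every table and column name is invented, and Python's built-in sqlite3 module stands in for the database server so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative.
SCHEMA = """
CREATE TABLE regulations (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE comments (
    id INTEGER PRIMARY KEY,
    regulation_id INTEGER NOT NULL REFERENCES regulations(id),
    body TEXT NOT NULL
);
CREATE TABLE issues (
    id INTEGER PRIMARY KEY,
    regulation_id INTEGER NOT NULL REFERENCES regulations(id),
    name TEXT NOT NULL
);
-- A tag links a highlighted span of a comment to one issue.
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    comment_id INTEGER NOT NULL REFERENCES comments(id),
    issue_id INTEGER NOT NULL REFERENCES issues(id),
    start_offset INTEGER NOT NULL,
    end_offset INTEGER NOT NULL
);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO regulations VALUES (1, 'Sample DOT rule')")
db.execute("INSERT INTO comments VALUES (1, 1, 'Costs are too high.')")
db.executemany("INSERT INTO issues VALUES (?, 1, ?)", [(1, "cost"), (2, "safety")])
db.execute("INSERT INTO tags VALUES (1, 1, 1, 0, 19)")

# Middle-tier style query: which issues is each comment tagged with?
rows = db.execute(
    """
    SELECT comments.id, issues.name
    FROM tags
    JOIN comments ON comments.id = tags.comment_id
    JOIN issues ON issues.id = tags.issue_id
    ORDER BY comments.id, issues.name
    """
).fetchall()
print(rows)  # [(1, 'cost')]
```

Because the front end only sees query results like `rows`, the database behind the middle tier could be swapped without touching the interface, which is the point of the three-tier split.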


Main Module

[Flowchart nodes: Start; Registration model; Log in; Is admin? (Y/N); Admin model; User model; View info; Log off; End? (Y/N); End]


Registration Module

[Flowchart nodes: Registration module start; License agreement; Agree? (Y/N); Input user info; Info satisfied? (Y/N); Update database; Registration module end]


Admin Module

[Flowchart nodes: Admin module start; Edit announcement; Collapse tags; Add new tags; End? (Y/N); Admin module end]


User Module

[Flowchart nodes: User module start; Choose regulation; Choose annotation or view model; Is annotation model? (Y/N); Annotation model; View model; Change regulation? (Y/N); End? (Y/N); User module end]


View Module

[Flowchart nodes: View module start; Filter by tags or flags; Read matching comments or sections; End? (Y/N); View module end]
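The filtering step of the View module can be sketched as a plain function. The field names are hypothetical, and the sketch assumes that "tags or flags" means a comment matches when it carries any of the requested tags or any of the requested flags.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    comment_id: str
    text: str
    tags: set = field(default_factory=set)    # issue names
    flags: set = field(default_factory=set)   # non-issue metadata


def filter_comments(comments, tags=(), flags=()):
    """Return comments carrying any requested tag or any requested flag.

    With no tags and no flags requested, every comment is returned.
    """
    tags, flags = set(tags), set(flags)
    if not tags and not flags:
        return list(comments)
    return [c for c in comments if (c.tags & tags) or (c.flags & flags)]


# Hypothetical sample data.
comments = [
    Comment("c1", "Costs are too high.", {"cost"}, {"needs review"}),
    Comment("c2", "Highways will be safer.", {"safety"}),
    Comment("c3", "Form letter.", set(), {"duplicate"}),
]
matching = filter_comments(comments, tags=["cost"], flags=["duplicate"])
print([c.comment_id for c in matching])  # ['c1', 'c3']
```

In the database-driven design the same filter would more likely run as a query in the middle tier than in memory.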


Annotation Module

[Flowchart nodes: Annotation module start; Add tag or flags; Edit tag or flags; Remove tag or flags; Submit to NLP for learning; End? (Y/N); Annotation module end]


Implementation

• The website will be written using the Drupal content management system
  – Designed to produce dynamic websites with minimal administration
  – LII was already considering Drupal for another project
• The database will be running on a MySQL system already present on LII servers
  – No installation required
  – Most content management systems require an SQL-based relational database


UI Alpha 0.1 Demo

• First Release Alpha Website


ID  Task Name                       Duration  Start        Finish
1   Interface Requirements          2 days?   Fri 2/16/07  Sat 2/17/07
2   Web Concept Sketches            3 days?   Sat 2/17/07  Mon 2/19/07
3   Pick Web Management System      1 day?    Tue 2/20/07  Tue 2/20/07
4   Get Sample Data                 5 days?   Sat 2/17/07  Wed 2/21/07
5   Learn About Annotating          2 days?   Sun 2/25/07  Mon 2/26/07
6   Refine and Select Web Layout    3 days?   Tue 2/27/07  Thu 3/1/07
7   Install Web Manager             1 day?    Sun 2/25/07  Sun 2/25/07
8   Install CVS System              7 days?   Fri 2/16/07  Thu 2/22/07
9   Dummy Website                   4 days?   Wed 2/28/07  Sat 3/3/07
10  Refine Website                  4 days?   Sat 3/3/07   Tue 3/6/07
11  Write Back-End Documentation    8 days?   Fri 3/2/07   Fri 3/9/07
12  1st Stage Presentation          4 days?   Fri 3/2/07   Mon 3/5/07
13  Presentation and Report         1 day     Tue 3/6/07   Tue 3/6/07
14  Website Feedback                3 days?   Wed 3/7/07   Fri 3/9/07
15  Install DBMS                    7 days?   Wed 2/21/07  Tue 2/27/07
16  Learn About NLP                 16 days?  Sun 2/25/07  Mon 3/12/07
17  Design Database                 7 days    Tue 3/6/07   Mon 3/12/07
18  Implement Database              5 days?   Tue 3/13/07  Sat 3/17/07
19  Design Middle Tier              7 days    Tue 3/13/07  Mon 3/19/07
20  Implement Middle Tier           7 days    Mon 3/19/07  Sun 3/25/07
21  Refine Middle Tier              7 days    Mon 3/26/07  Sun 4/1/07
22  Write Manual                    7 days    Sat 3/31/07  Fri 4/6/07
23  Write Back-End Documentation    4 days    Tue 4/3/07   Fri 4/6/07
24  2nd Stage Presentation          7 days    Tue 3/27/07  Mon 4/2/07
25  Presentation and Report         1 day     Tue 4/3/07   Tue 4/3/07
26  Major Review                    4 days    Tue 4/3/07   Fri 4/6/07
27  Design Annotation Interface     30 days?  Sat 3/10/07  Sun 4/8/07
28  Implement Annotation Interface  3 days    Sun 4/8/07   Tue 4/10/07

[Gantt chart; milestones marked at 3/6 and 4/3]


Questions?

• Any Questions?