documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow...

70
documeta. Team 410 Crystal | Design Presentation December 06 , 2018 410 Crystal 1 | Design

Transcript of documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow...

Page 1: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

documeta.

Team 410 Crystal | Design PresentationDecember 06 , 2018

410 Crystal 1 | Design

Page 2: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Team Bio

❖ Nick Cason

➢ Systems Administrator

❖ Andrew Davidson

➢ Software Engineer

❖ Khang Nguyen

➢ Systems Administrator

➢ VDI Engineer

❖ Josh Pena➢ Biologist

➢ 2nd degree seeking student

410 Crystal 2 | Design

❖ Sean Tosloskie

➢ Sr. Developer

➢ Soccer player

❖ Alex Tucker, RHCSA

➢ Systems Engineer

❖ CAPT Mark “Forrest” Stoops, USN

➢ Director, C4I & Space

Page 3: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Table Of Contents

I. Problem Background 4II. Problem Statement 5

III. Problem Characteristics 7IV. Customer/End user 9V. Current Process Flow 10

VI. Solution Statement 11VII. Proposed Process Flow 12

VIII. Major Functional Components Diagram 13IX. Competition Analysis 14X. Build Tools 15

XI. Core Components 17XII. User Interface 18

XIII. Infrastructure 25XIV. Backend 40XV. Risks: Overview 48

XVI. Technical Risks 49XVII. User Risks 56

XVIII. Customer Risks 60XIX. Agile Development 64XX. Conclusion 65

XXI. References 66XXII. Glossary 69

410 Crystal 3 | Design

Page 4: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Problem Background

“We get stacks of documents as thick as phone books … most don’t read past the executive

summary.”

- Capt. Mark Stoops, USN

❖ Naval intelligence produces more information than can be consumed in a reasonable timeframe. ❖ Finding significant connections between new and existing intelligence documents is a difficult

problem.❖ Meet our mentor on page 3 of the handout

410 Crystal 4 | Design

Page 5: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Problem Statement

“Documents deliver information. The links and relationships between them

contain meaning. Curation of these relationships offers clarity.”

410 Crystal 5 | Design

Page 6: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Documeta is a document insight & comprehension tool.

Visit us at our website for more information.

410 Crystal 6 | Design

Page 7: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Problem Characteristics

❖ The relationships connecting documents and their information aren’t always apparent

❖ Relationships are inferred by the reader

➢ Unnoticed connections

➢ Oversights, inattentiveness, incompetence, bias

❖ Determining relationships can require tedious effort

➢ Error-prone

➢ Frustration-prone

410 Crystal 7 | Design

Page 8: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Problem Characteristics (continued)

❖ Large sets of documents: additional challenges

➢ Time constraints

■ Takes considerably longer to find connections

■ Humans can only process one document at a time

■ Not feasible to find all possible connections

➢ Daunting task

■ “Where should I begin my search? This is over 1,000 documents”

❖ Important questions need quick answers

➢ There isn’t always time available to read through documents

➢ “I don’t have time to read.”

410 Crystal 8 | Design

Page 9: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Who is our specific application relevant to?

❖ Our Customer: Capt. Mark Stoops of the U.S. Navy

❖ Our End Users: Enlisted members of the U.S. Navy

❖ The general case:

➢ Project Managers

➢ Researchers

➢ Consultants

410 Crystal 9 | Design

Page 10: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Current Process Flow

410 Crystal 10 | Design

❖ Inefficient

❖ Time Consuming

❖ Wasted Effort

Page 11: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Solution Statement

“Extracting important characteristics from documents reveals relevant

information on a topic. Visualizing these discoveries in a fun, interactive, and

meaningful way acts as a vehicle of tangential learning.”

410 Crystal 11 | Design

Page 12: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Proposed Process Flow page 4 of handout

410 Crystal 12 | Design

Page 13: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Major Functional Components refer to page 5 of the handout

410 Crystal 13 | Design

Page 14: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Competition Analysis

410 Crystal 14 | Design

Page 15: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Build Tools

Technology Stack❖ Django Web Framework❖ MySQL Community Edition

Deployment Stack❖ Kubernetes Container-Orchestration

System❖ Docker Software

410 Crystal 15 | Design

User Interface❖ Data Driven Documents (D3.js)

Document Text Extraction❖ Textract

RESTful API❖ Flask (Python web-framework)

Version Control❖ Git

Page 16: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Core Components

410 Crystal 16 | Design

Page 17: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

User Interface

Data Driven Documents

410 Crystal 17 | Design

Page 18: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js library

❖ Javascript library for producing dynamic, interactive data visualizations in web browsers

❖ Embedded within an HTML webpage❖ Uses pre-built Javascript functions to:

➢ Select elements

➢ Create & style SVG objects

➢ Add transitions, dynamic effects, or tooltips to SVG

objects

❖ Large datasets are bound to SVG objects using D3.js functions to generate rich text/graphics charts and diagrams

❖ Binds arbitrary data to a Document Object Model (DOM)

410 Crystal 18 | Design

Page 19: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js❖ Built upon a series of injectors, or an

injection chain

❖ There are four main components that

play a part in the processing chain of

D3.js:

➢ Input Data

➢ Element binding

➢ Method chain

➢ DOM (Document Object

Model)

19

Page 20: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js | Technical Principles

Selections:❖ Enables a programmer to select a given set of DOM nodes,

and use operators to manipulate them❖ Selection can be based on:

➢ Tag

➢ Class

➢ Identifier

➢ Attribute

➢ Place in the hierarchy

❖ Operations include:➢ Accessing/mutating attributes

➢ Adding/removing attributes

➢ Display text

➢ Display styles

❖ All operations to HTML elements can be made dependent on data

410 Crystal 20 | Design

Page 21: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js | Technical Principles

Transitions:❖ Transitions are a form of animation❖ A D3 transition is a “selection-like” interface for animating

changes to the DOM❖ Will smoothly interpolate the DOM from its current state to

a desired target state over a given duration❖ D3 transitions will determine an appropriate interpolator by

inferring a type including:➢ Numbers

➢ Colors

➢ Geometric transforms

➢ Strings with embedded numbers ( e.g., “100px”)

410 Crystal 21 | Design

Page 22: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js | Technical Principles

Data-binding:❖ Loaded data can drive the creation of elements❖ D3.js loads a given dataset, then for each of its elements

can:➢ Create an SVG object with associated properties

■ Shape■ Color■ Values

➢ Create an SVG object with associated behaviors■ Transitions■ Events

410 Crystal 22 | Design

Page 23: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

User Interface: Proposed Model

410 Crystal 23 | Design

Refer to page 3 of handout for additional information on the D3 API

Page 24: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

D3.js | Data Flow

410 Crystal 24 | Design

Page 25: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Infrastructure

Modern Deployments

410 Crystal 25 | Design

Page 26: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Infrastructure Overview refer to pages 7 & 8 for brief overview of select technologies

Docker & Kubernetes❖ Scalability

❖ Resiliency

❖ Portability

❖ Cloud Ready

410 Crystal 26 | Design

MySQL & Memcached❖ Enterprise Support

❖ Native JSON data types

❖ Encryption & Security

❖ Performance

Nginx❖ Async, Event-driven

❖ Concurrency

❖ Caching, load-balancing

Page 27: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Infrastructure Overview cont.

Kerberos & LDAP❖ Security

❖ Reliability

❖ Scalable

410 Crystal 27 | Design

Traefik & Haproxy❖ HTTPS load-balancing

❖ Ingress

❖ Internal load-balancing

Gitlab❖ Continuous Integration

❖ Continuous Deployment

❖ Issue Tracker

Page 28: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Docker

❖ Patching and Updates

❖ Scaling

❖ Deployment

❖ Cloud Ready

410 Crystal 28 | Design

Page 29: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Kubernetes

❖ Management

❖ Automation

❖ Integration

410 Crystal 29 | Design

Page 30: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Nginx

❖ Performance

➢ Smaller footprint

➢ Modular

➢ Event based

❖ Enterprise Options

➢ Support

➢ More features

➢ Active Development

410 Crystal 30 | Design

Page 31: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Memcached

❖ Performance

➢ UI Experience

➢ Scaling

➢ Improved utilization

410 Crystal 31 | Design

Page 32: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

OpenLDAP

❖ Directory Services

❖ Internal to each Pod

❖ Identifies service UPNs

410 Crystal 32 | Design

Page 33: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Bind9

❖ DDNS

❖ Internal Resource Identification

❖ isc-dhcp-server companion

410 Crystal 33 | Design

Page 34: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

MIT Kerberos KDC

❖ Authentication

❖ Authorization

❖ Encrypt Credentials over the Network

410 Crystal 34 | Design

Page 35: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Traefik

❖ Ingress ACLs

❖ Load Balancing

410 Crystal 35 | Design

Page 36: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Proposed Design

410 Crystal 36 | Design

Page 37: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Application Pod

410 Crystal 37 | Design

Page 38: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Security Considerations

❖ Pod Isolation

❖ Internal API end-to-end Encryption

❖ Ingress & ACLs

❖ Handling Krb5 Keytabs in Containers

410 Crystal 38 | Design

Page 39: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Continuous Integration / Deployment

❖ Gitlab

➢ Continuous Integration

➢ Continuous Deployment

➢ Issue Tracker

➢ Enterprise Support Options

410 Crystal 39 | Design

Page 40: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Backend

Automated Document Classification

410 Crystal 40 | Design

Page 41: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Backend Overview

410 Crystal 41 | Design

Page 42: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Text Extraction

❖ A collection of documents serve as an argument for the backend➢ Collection of various formats (pdf, docx, etc)

❖ Documents are normalized into a uniform text format

❖ Utilizing Textract, a Python text extraction library➢ Wrapper around several different extraction

libraries➢ Supported document formats include:

.pdf, .docx, .html, .xlsx,. pptx, and many more➢ Simple interface:

410 Crystal 42 | Design

import textract text = textract.process(file).decode(‘utf-8’)

Page 43: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Metadata Extraction: Goals

❖ Automate the classification of each document

➢ Determine the category of focus for each document

❖ Create a structure the frontend can process efficiently

➢ Communicate with frontend via RESTful API

410 Crystal 43 | Design

Page 44: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Metadata Extraction: Implementation❖ Extract the metadata

➢ Break text up into sentences

■ Offers more efficient means for locating keywords

● concurrent regex operations over a set of sentences

➢ Extract the relevant information

➢ Sentences containing keywords will be used for relaying information user desires

■ only display the relevant information

❖ Calculate the relatedness strength of target keywords

➢ As sentences are parsed, record a reference count of explicitly mentioned keywords

■ Target categories are read in from file (yaml), placed into fast access structure

■ Text is broken up into individual words

■ Words are matched against target categories

❖ Processed metadata is stored in database

➢ Avoid reprocessing documents every request

410 Crystal 44 | Design

Page 45: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Database | MySQL RDBMS

410 Crystal 45 | Design

❖ dbo.Documents➢ Contains all uploaded documents

❖ dbo.Keywords➢ All predefined keywords used for

text processing

❖ dbo.Frequencies➢ Determine the size of categories

displayed in the U.I.➢ Number of matches made for given

keyword during text processing➢ doc_id and key_id are attributed to

each matched keyword■ establish relationship between

document and keyword

Page 46: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Constructed JSON Object

❖ The extracted metadata is ultimately serialized in a JSON object the frontend will process➢ Constructs the U.I

❖ Acts as a hierarchical structure➢ Establish relationships between defined

keywords and document entities

{ "name": "Order of Battle", "children": [ { "name": "Ship", "children": [ { "name": "Aircraft Carrier", "children": [ { "name": "LIAONING CV-16", "children": [ { "name": "General", "children": [ { "name": "Hull Number", "size": "n" }, { "name": "Power Source", "size": "n" }, { "name": "Propulsion", "size": "n" }, { "name": "Speed", "size": "n" }, { "name": "Crew Size", "size": "n" }, "name": "Mission Areas", "children": [ { "name": "Surface Warfare", "size": "n" }, { "name": "Anti-Submarine", "size": "n" }, { "name": "Surveillance", "size": "n" }, { "name": "Command & Control", "size": "n" }

…..…......

410 Crystal 46 | Design

Page 47: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Communication with Frontend

❖ Constructed JSON object is sent to frontend for U.I.

➢ Assemble and update

❖ Communication is handled through a REST API built with the Flask web-framework

➢ GET

■ User sends server request for updates

■ https://localhost:8080/endpoint/update

➢ PUT

■ Creation or update of JSON object

■ https://localhost:8080/endpoint/write

➢ DELETE

■ Removal of erroneous document references from U.I configuration

■ https://localhost:8080/endpoint/documents?delete=<file_id>

410 Crystal 47 | Design

Page 48: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Risks

410 Crystal 48 | Design

Green: LowYellow: MediumRed: Severe

T: TechnicalU: UserC: Customer

Page 49: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Technical Risks

T1 Inaccurate Backend ResultsT2 Erroneous DocumentsT3 Database FailureT4 Inability to Scale PerformanceT5 Software BugsT6 Docker Failure

410 Crystal 49 | Design

Page 50: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T1: Inaccurate Backend Results

410 Crystal 50 | Design

Description❖ Backend fails to place documents in

correct categories (incorrect mappings)

Mitigation

❖ Recalibrate connections and rebuild

database

❖ Explore alternative text extraction library

Page 51: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T2: Erroneous Documents

Description❖ Malformed or corrupted documents are

inadvertently added to document repository

Mitigation

❖ Have backend gracefully skip over any

malformed or corrupted documents during

processing

410 Crystal 51 | Design

Page 52: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T3: Database Issues

Description❖ Lack of documentation❖ Corrupt logs❖ Duplicate entries

Mitigation

❖ Concise documentation

❖ Log backups

❖ Data entry validation

410 Crystal 52 | Design

Page 53: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T4: Inability to Scale Performance

Description❖ As the size of document sets increase,

application performance becomes unacceptable

Mitigation

❖ Only process documents that are new or

modified

❖ Perform periodic document processing at

a user specified interval and time

❖ Memcache

410 Crystal 53 | Design

Page 54: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T5: Software Bugs

Description❖ Serious bugs are introduced to code base

during development, posing risk for application instability

Mitigation

❖ Test driven design

❖ Unit testing

❖ UI testing

❖ Continuous integration

410 Crystal 54 | Design

Page 55: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

T6: Docker Failure

Description❖ Host machine inadvertently loses

communication with Docker daemon

Mitigation

❖ Restart the Docker daemon

❖ Run the Docker daemon in debug mode

❖ Audit the Docker daemon configuration

file

410 Crystal 55 | Design

Page 56: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

User Risks

410 Crystal 56 | Design

U1 Difficult Interface NavigationU2 Erroneous Document DiscoveryU3 Using Internet Explorer

Page 57: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

U1: Difficult Interface Navigation

410 Crystal 57 | Design

Description❖ User has difficulty navigating the

user-interface

Mitigation

❖ Provide excellent user-documentation

❖ UX quality assurance

❖ Intuitive design strategy

Page 58: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

U2: Erroneous Document Discovery

Description❖ User discovers an erroneous document

while navigating the user interface

Mitigation

❖ Implement feature to allow user to flag

inappropriate documents for removal

410 Crystal 58 | Design

Page 59: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

U3: Using Internet Explorer

Description❖ User is currently using Internet Explorer

(i.e a browser that is notoriously broken)

Mitigation

❖ Have application check if user is running

Internet Explorer to display a gentle,

disableable message recommending

installation of Firefox or Chrome for the

best UX

410 Crystal 59 | Design

Page 60: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Customer Risk

410 Crystal 60 | Design

C1 DoD RestrictionsC2 Inefficient HardwareC3 Unsatisfied with Application

Page 61: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

C1: DoD Restrictions

410 Crystal 61 | Design

Description❖ DoD compliance regulations prevent

customer from installing application on DoD computers

Mitigation

❖ Maintain clear and consistent

communication with customer to take

necessary steps to make software DoD

compliant

Page 62: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

C2: Inefficient Hardware

Description❖ Customer is currently using computers

with hardware that does not meet the application minimum system requirements

Mitigation

❖ Customer explores cost effective

hardware to run application

❖ Clearly define minimum hardware

requirements

410 Crystal 62 | Design

Page 63: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

C3: Unsatisfied with Application

410 Crystal 63 | Design

Description❖ Customer is unhappy with the application

UI/UX

Mitigation

❖ Continuous Prototyping

❖ Iterative Design Strategy

Page 64: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Agile Development

410 Crystal 64 | Design

Page 65: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Conclusion

410 Crystal 65 | Design

● A document insight & comprehension tool● Provides a visual interface to explore relationships between documents● Organize and optimize information consumption

Page 66: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

References

Kubernetes Documentation. (2018). Retrieved from https://kubernetes.io/docs/home/

Configure and Troubleshoot the Docker Daemon. (2018). Retrieved from https://docs.docker.com/config/daemon/

MIT Kerberos Documentation for Administrators. (2018). Retrieved from

https://web.mit.edu/kerberos/krb5-devel/doc/admin/index.html

Traefik Basics. (2018). Retrieved from https://docs.traefik.io/basics/

Kerberos Sidecar Container. (2018). Retrieved from https://blog.openshift.com/kerberos-sidecar-container/

Textract Documentation. (2018). Retrieved from https://textract.readthedocs.io/en/stable/

OpenLDAP Software 2.4 Administrator’s Guide. (2018). Retrieved from

https://www.openldap.org/doc/admin24/OpenLDAP-Admin-Guide.pdf

MySQL Documentation. (2018). Retrieved from https://dev.mysql.com/doc/

410 Crystal 66 | Design

Page 67: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

References (continued)

Baker, D. (2017). The Document Similarity Network:A Novel Technique for Visualizing Relationships in Text Corpora.

Retrieved from https://www.math.hmc.edu/~dbaker/thesis/DylanBaker_thesis.pdf

Jusufi, I., Kerren, A., Liu, J., & Zimmer, B. (2014). Visual Exploration of Relationships between Document Clusters. Retrieved

from http://cs.lnu.se/isovis/pubs/docs/kerren-ivapp14-postprint.pdf

Andreotti, A. L., Silva, L. F., & Eler, D. M. (2018). Hybrid Visualization Approach to Show Documents Similarity and Content in

a Single View. Retrieved from http://www.mdpi.com/2078-2489/9/6/129/pdf

Hetzler, B., Harris, W. M., Havre, S., & Whitney, P. (1999). Visualizing the Full Spectrum of Document Relationships. Retrieved

from https://pdfs.semanticscholar.org/ad88/10fb843840dbf2879adfd21a43b7c8ea868a.pdf

SAS. (2018). 4 Reasons Why You Can't Do Without Data Visualization Any Longer. Retrieved from

https://www.sas.com/content/dam/SAS/it_it/doc/other1/ebook/sas-data-visualization-ebook-english.pdf

Burger, R. (2016). The Ultimate Guide to Agile Software Development. Retrieved from

https://blog.capterra.com/the-ultimate-guide-to-agile-software-development/

410 Crystal 67 | Design

Page 68: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

References (continued)

Sherpa Software. (n.d.). Structured and Unstructured Data: What is It? Retrieved from

https://sherpasoftware.com/blog/structured-and-unstructured-data-what-is-it/.

Qlik. (2018). Retrieved from https://www.qlik.com/us/products

Tableau. (2018). Retrieved from https://www.tableau.com/products

Envisioning. (2018). Retrieved from https://www.envisioning.io/

Bind 9 Administrator’s Guide. (2018). Retrieved from https://bind.isc.org/doc/arm/9.13/Bv9ARM.html

DHCP manual pages. (2018). Retrieved from https://www.isc.org/dhcp-manual-pages/

D3 Data Flow. (2018). Retrieved from

https://www.researchgate.net/publication/281286605_Visualizing_the_Universal_Data_Cube

Flask-RESTful User’s Guide. (2018). Retrieved from https://flask-restful.readthedocs.io/en/latest/

410 Crystal 68 | Design

Page 69: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

GlossaryACL: Access Control List, a list specifies to the operating system the access and respective operations a user has towards a listed object.

Agile: A software development methodology focused on incremental, iterative development cycles; where requirements and solutions evolve through collaborations with customers via continuous prototyping.

API: Application Programming Interface, an abstract set of functions, communication protocols, and other various tools that specify how software components can interact.

Async: Asynchronous processing, units of work running separately from the main thread (process is not blocked on async operation).

Containers: A standard unit of software that packages up all code and dependencies into a sandboxed, virtual environment.

DNS: Domain Name System, a decentralized naming system used to associate various information with assigned domain names.

DoD: Department of Defense.410 Crystal 69 | Design

DOM: Document Object Model.

JSON: JavaScript Object Notation.

KRB5: Kerberos 5 (current release), computer network authentication protocol that works on the notion of tickets (allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner).

Method Chain: Technique used in Object Oriented Programming where multiple methods are invoked in sequence, with each method operating on the object returned by the previous call.

RDBMS: Relational Database Management System.

REST: Representational State Transfer.

Tangential Learning: A process by which people educate themselves on a topic when it is presented in a context that they enjoy (e.g video games that teach educational topics)

UI: User Interface.

UX: User Experience.

Page 70: documeta. - Old Dominion Universityadavidso/crystal/documeta_design.pdf · V. Current Process Flow 10 VI. Solution Statement 11 VII. Proposed Process Flow 12 VIII. Major Functional

Thank you all.

410 Crystal 70 | Design