Textus Overview
TEXTUS: Import, structure, annotate, read & share
An overview of the TEXTUS project. Early design phase, 15th Feb 2012, Tom Oinn ([email protected])
Document Pipeline
● Import
● Add Structure
● Annotate
● Read
● Share
Annotations
● Defined as any piece of metadata pertaining to a section of the text, whether automatically created or manually added by a user
● Location, type and type-specific fields
  ○ References back to scanned pages
  ○ Free text comments
  ○ Author, title, edition, publication date and other bibliographic metadata
● Locations stored as character ranges within text, but rendered with reference to structure
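
As an illustration of the annotation model described above, here is a minimal Python sketch. The class name `Annotation`, its field names, the half-open range convention and the `creator` default are all assumptions made for this example; the slides only state that an annotation has a location (a character range), a type and type-specific fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record: a character range plus typed metadata."""
    start: int                      # first character covered (0-based, inclusive)
    end: int                        # one past the last character covered (exclusive)
    type: str                       # e.g. "comment" or "bibliographic" (assumed names)
    fields: Dict[str, Any] = field(default_factory=dict)  # type-specific fields
    creator: str = "auto"           # "auto" stands for automatically created (assumed)

    def __post_init__(self) -> None:
        # Reject ranges that are negative or run backwards.
        if not 0 <= self.start <= self.end:
            raise ValueError("annotation range must satisfy 0 <= start <= end")

    def covered_text(self, text: str) -> str:
        """Return the slice of the text that this annotation refers to."""
        return text[self.start:self.end]


text = "Call me Ishmael."
note = Annotation(start=8, end=15, type="comment",
                  fields={"body": "Narrator"}, creator="tom")
```

Storing the location as a plain character range keeps the annotation independent of any structure that is added to the text later.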
Structure
● Hierarchical, initially containing a single root node corresponding to the entire text
● Nodes have type, location and metadata
  ○ Types determined by type of text
  ○ Location specified as character range
    ■ Location must be contained by parent location
● Shows context when browsing
● Translates annotation locations for display
  ○ e.g. "20665 to 20782" becomes "Section 1 chapter 2 paragraph 12 character 2 to..."
● Incrementally defined - can read and annotate texts before any structure exists
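
The structure tree and the translation of character offsets into structural locations could be sketched as follows. This is only one possible reading of the slide: the names `Node`, `add` and `describe`, the half-open ranges, the 1-based character numbering and the numbering of children by their position under the parent are assumptions, not part of the TEXTUS design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical structure node: a typed character range with children."""
    type: str
    start: int                      # inclusive character offset
    end: int                        # exclusive character offset
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        # Enforce the rule that a location must be contained by its parent.
        if not (self.start <= child.start <= child.end <= self.end):
            raise ValueError("child location must lie within parent location")
        self.children.append(child)
        return child


def describe(root: Node, offset: int) -> str:
    """Translate an absolute character offset into a structural location."""
    parts = []
    node = root
    while True:
        for index, child in enumerate(node.children, start=1):
            if child.start <= offset < child.end:
                parts.append(f"{child.type} {index}")
                node = child
                break
        else:
            break  # no child contains the offset, so this is the deepest node
    # Report the character position relative to the deepest enclosing node.
    parts.append(f"character {offset - node.start + 1}")
    return " ".join(parts)


root = Node("text", 0, 100)
section = root.add(Node("section", 0, 60))
section.add(Node("chapter", 0, 30))
section.add(Node("chapter", 30, 60))
```

With this tree, `describe(root, 31)` gives "section 1 chapter 2 character 2". On a text that has only its root node, the same function falls back to a bare character position, which is how a text can be read and annotated before any structure exists.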
Read
● View text in browser or download as PDF, ebook etc.
● View annotations pertaining to currently visible text
  ○ Filter annotations by annotation metadata
    ■ Creator, date etc.
    ■ Simple search mechanism to create filters
  ○ See count of non-visible annotations
● Location and annotation visibility preferences stored on a per-text, per-user basis
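
A rough Python sketch of the reading view's annotation filtering is given below. The function names, the dictionary keys and the choice to count every annotation that is not shown (whether it is off-screen or filtered out) as "non-visible" are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

AnnotationDict = Dict[str, Any]  # hypothetical shape: start, end, creator, ...


def overlaps(annotation: AnnotationDict, start: int, end: int) -> bool:
    """True if the annotation's half-open range intersects [start, end)."""
    return annotation["start"] < end and start < annotation["end"]


def visible_annotations(
    annotations: List[AnnotationDict],
    view_start: int,
    view_end: int,
    keep: Callable[[AnnotationDict], bool] = lambda a: True,
) -> Tuple[List[AnnotationDict], int]:
    """Return the annotations to display and a count of those not shown."""
    shown = [a for a in annotations
             if overlaps(a, view_start, view_end) and keep(a)]
    return shown, len(annotations) - len(shown)


notes = [
    {"start": 10, "end": 20, "creator": "tom"},
    {"start": 30, "end": 40, "creator": "ann"},
    {"start": 90, "end": 95, "creator": "tom"},
]
# Show only tom's annotations within the first 50 characters.
shown, hidden = visible_annotations(
    notes, 0, 50, keep=lambda a: a["creator"] == "tom")
```

The `keep` predicate stands in for whatever filter the search mechanism produces; a per-user, per-text preference store could simply remember the last view range and predicate used.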
Share
● Create sets of annotations
  ○ Sets have identifiers themselves and can be shared
  ○ A set containing other sets copies the contained sets
    ■ Sets of annotations don't change unless the creator explicitly changes them!
● Provisional annotations
  ○ Mark annotations as 'not validated' - still visible but flagged as pending approval
● Share any location and set of annotations
  ○ Stable link which can be shared like any other URL
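
The copy-on-include behaviour of annotation sets and the stable share link might look something like the sketch below. Everything named here is hypothetical: `AnnotationSet`, `make_set`, `share_link`, the use of random UUIDs as set identifiers and the URL layout are assumptions, since the slides do not specify any of them. The sketch models a set as immutable; an explicit change by the creator would be a separate operation that is not shown.

```python
import uuid
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class AnnotationSet:
    """Hypothetical immutable set of annotation identifiers with its own id."""
    annotation_ids: FrozenSet[str]
    set_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def make_set(annotation_ids: Iterable[str] = (),
             include: Iterable[AnnotationSet] = ()) -> AnnotationSet:
    """Build a set, copying the contents of any included sets."""
    ids = set(annotation_ids)
    for other in include:
        # Copy the members rather than keeping a reference, so later
        # changes to the included set cannot alter this one.
        ids |= other.annotation_ids
    return AnnotationSet(frozenset(ids))


def share_link(base_url: str, text_id: str, start: int, end: int,
               annotation_set: AnnotationSet) -> str:
    """Compose a link to a location plus an annotation set (assumed layout)."""
    return (f"{base_url}/text/{text_id}"
            f"?start={start}&end={end}&set={annotation_set.set_id}")


first = make_set(["a1", "a2"])
combined = make_set(["a3"], include=[first])
link = share_link("https://example.org", "t1", 20665, 20782, combined)
```

Because the included members are copied at creation time, a recipient of the link keeps seeing the same annotations even if the original set is revised afterwards.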
Specialisation
● For each document type we define
  ○ The node types in the structure tree
  ○ The annotation types available
● Storage and server model is the same for all texts
● Front-end presentation may be different to take advantage of type-specific structures
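
One way to picture specialisation is a small registry that lists, per document type, the node types and annotation types it permits, while the storage code stays generic. The sketch below is an assumption-laden illustration: the `DocumentType` class, the `REGISTRY` contents, the example document types "treatise" and "play" and the type names are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class DocumentType:
    """Hypothetical per-document-type definition of allowed types."""
    name: str
    node_types: FrozenSet[str]
    annotation_types: FrozenSet[str]


# Illustrative entries only; real document types would be defined by the project.
REGISTRY: Dict[str, DocumentType] = {
    "treatise": DocumentType(
        name="treatise",
        node_types=frozenset({"section", "chapter", "paragraph"}),
        annotation_types=frozenset({"comment", "page_reference", "bibliographic"}),
    ),
    "play": DocumentType(
        name="play",
        node_types=frozenset({"act", "scene", "line"}),
        annotation_types=frozenset({"comment", "page_reference"}),
    ),
}


def allows_node(doc_type: str, node_type: str) -> bool:
    """Check whether a node type is defined for the given document type."""
    return node_type in REGISTRY[doc_type].node_types


def allows_annotation(doc_type: str, annotation_type: str) -> bool:
    """Check whether an annotation type is available for the document type."""
    return annotation_type in REGISTRY[doc_type].annotation_types
```

The same storage and server code can consult such a registry for any text, leaving only the front end to vary with the document type.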