Global Philology Open Conference - February 20-23, 2017 - Leipzig, Germany
Thinking like the "Modern Operating Systems": The Omega architecture and the Clavius on the Web project
Angelo Mario Del Grosso, Emiliano Giovannetti, Simone Marchi
{nome.cognome}@ilc.cnr.it
Istituto di Linguistica Computazionale, Consiglio Nazionale delle Ricerche
Literary Computing Group, licolab.ilc.cnr.it
Talk Outline
❏ Clavius on the Web Context
❏ Clavius on the Web Demo
❏ Domain-Driven Design within the Digital Humanities Field
❏ Abstract Data Types for the Textual Scholarship Domain
❏ Microkernel Architecture and the Omega Framework
❏ The Omega Core Entities and a Domain Specific API Example
❏ Conclusions
An Omega Case Study: the Clavius on the Web project
Developing Omega to enhance DH tools by adopting domain-driven design and the microkernel architecture.
Aim: to preserve, exploit and promote the correspondence of Christophorus Clavius (1537-1612)
Partners: IIT-CNR, ILC-CNR, APUG
Funding: Registro.it
[Architecture diagram. Labels: Web GUI Clavius Edition (Client); Omega Clavius Edition (Server); Lexica; Search; TEA; API; source; loci; annotations]
DEMO of the Clavius on the Web Prototype Tools
1. Clavius Digital Archive
2. Encoder and Annotator sample 1
3. Encoder and Annotator sample 2
4. Search v1
5. Search v2 (Annotarium)
6. CLAVIUS LEXICA URL
7. InfoVis Annotation
The prototype tools were developed in close collaboration between the Institute of Informatics and Telematics (IIT-CNR) and the Institute for Computational Linguistics (ILC-CNR).
Are There Tools That Meet (Digital) Humanists' Needs?
Monica Berti: It is important to "build a model for representing quotations and text reuses of lost works in a digital environment." (in jTEI 2014)
Franz Fischer: "There is no out-of-the-box software available for creating truly critical and truly digital editions at the same time." (in Variants 2013)
Elena Pierazzo (translated from Italian): "from many quarters there are complaints about the lack of software and tools that are easy to use and that could limit the need for editors to do everything by themselves [...], one might ask why, after almost 30 years of research in the field of digital editions, there is to this day such a limited number of tools of this kind." (in AIUCD2016)
Peter Robinson: "Study of literary works that exist in many different forms is one of the most important and difficult tasks in the humanities. There are many editorial tools under development and a few that are already functional." (in DH2016)
Abstraction Tiers and Communication among DH Actors
Designing applications for the digital humanities is a challenging task because the available information is incomplete and the requirements are both discordant and demanding.
Knoernschild, Kirk. Java Application Architecture: Modularity Patterns with Examples Using OSGi. Prentice Hall Press, 2012.
Aim of our Approach Following the Domain Driven Design
Designing and developing digital scholarly tools starts from the definition of formal abstractions within the textual scholarship domain.
The first step is to formalize conceptual models, starting from a set of basic entities that represent a first level of data abstraction. The next step is to define appropriate Abstract Data Types (ADTs) and their functionality. Developing a suitable OO domain model ensures that the implemented software is flexible and reusable across different contexts and goals.
[Diagram. Labels: Graphical User Interface (Client); Domain Model (Omega); Data (Persistence)]
Domain Driven Design for Textual Scholarship
Our aim is to avoid producing Digital Humanities software too quickly, software that only looks like a success. Indeed, paying little attention to the design of the domain model leads to trouble in reusing, evolving and maintaining digital scholarly applications.
Our research challenge is to adopt the practices, principles and patterns of Domain-Driven Design (DDD), as defined by Eric Evans, within the Digital Textual Scholarship field. DDD encourages shaping the logical design and structure of a software application with modeling languages such as UML and with design and architectural patterns, so that the whole community (including non-technical people) can understand and actively participate in how the software is built.
Application Programming Interfaces Design (API)
★ The proper design of Application Programming Interfaces is a critical task within a Domain Driven Application, because APIs are the only point of dependence between clients (users) and service providers (who implement the ADT).
★ The APIs formally establish the behavior of the Abstract Data Types for all possible inputs and interactions. The interfaces can therefore be used to verify the correctness of the implementations against the specifications.
‣ Specification problem: in general, it is undecidable whether an implementation meets its specification for all possible inputs.
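The relationship between an interface, its implementations and the specification can be sketched in Java as follows. This is a minimal illustration under stated assumptions: `TokenSequence`, `WhitespaceTokenSequence` and `ContractCheck` are hypothetical names invented for this example and are not part of the Omega API. Because conformance cannot be decided for all possible inputs, the check only exercises a finite sample of inputs.

```java
/** Hypothetical ADT interface: the only thing clients depend on. */
interface TokenSequence {
    /** Number of tokens; never negative. */
    int size();

    /** Token at index i (never null or empty); fails outside 0 <= i < size(). */
    String tokenAt(int i);
}

/** One possible implementation; clients never see the underlying array. */
final class WhitespaceTokenSequence implements TokenSequence {
    private final String[] tokens;

    WhitespaceTokenSequence(String text) {
        String trimmed = text.trim();
        this.tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    @Override public int size() { return tokens.length; }

    @Override public String tokenAt(int i) {
        if (i < 0 || i >= tokens.length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return tokens[i];
    }
}

/** Checks an implementation against the interface contract, on sample inputs only. */
final class ContractCheck {
    static boolean satisfiesContract(TokenSequence seq) {
        if (seq.size() < 0) return false;
        for (int i = 0; i < seq.size(); i++) {
            String token = seq.tokenAt(i);
            if (token == null || token.isEmpty()) return false;
        }
        try {
            seq.tokenAt(seq.size());      // an out-of-range index must be rejected
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"", "Clavius", "  in  sphaeram   commentarius "};
        for (String sample : samples) {
            System.out.println(satisfiesContract(new WhitespaceTokenSequence(sample)));
        }
    }
}
```

The same `satisfiesContract` method can be reused for any other class that implements `TokenSequence`, which is the sense in which the interface serves as the reference for verification.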
Definition of an Abstract Data Type (ADT)
An ADT is a high-level, user-defined and customizable data type. Its internal representation is not directly accessible to users (information hiding): the data and the functions that operate on them are bound into a single entity.
The OO paradigm natively supports encapsulating data as ADTs in an effective and efficient way.
There is a very strong distinction between the usage and the implementation of the ADT.
The focus shifts from data values and data representation to component behavior, expressed through suitable Application Programming Interfaces (APIs).
The ADT functions have mathematically well-defined properties.
Example: Stack ADT
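As a hedged illustration of the Stack ADT named on this slide, the sketch below separates usage (the `StackAdt` interface) from implementation (`DequeStack`, backed by `java.util.ArrayDeque`). Both type names are assumptions chosen for this example. The demo exercises two of the well-defined properties of a stack: a new stack is empty, and `pop` returns the most recently pushed element.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

/** Stack ADT: clients see behavior only, never the representation. */
interface StackAdt<T> {
    void push(T item);
    T pop();
    T peek();
    boolean isEmpty();
}

/** One implementation; the Deque could be swapped out without touching clients. */
final class DequeStack<T> implements StackAdt<T> {
    private final Deque<T> items = new ArrayDeque<>();

    @Override public void push(T item) { items.addFirst(item); }

    @Override public T pop() {
        if (items.isEmpty()) throw new NoSuchElementException("pop on empty stack");
        return items.removeFirst();
    }

    @Override public T peek() {
        if (items.isEmpty()) throw new NoSuchElementException("peek on empty stack");
        return items.peekFirst();
    }

    @Override public boolean isEmpty() { return items.isEmpty(); }
}

final class StackDemo {
    public static void main(String[] args) {
        StackAdt<String> stack = new DequeStack<>();
        stack.push("recto");
        stack.push("verso");
        System.out.println(stack.peek());    // verso
        System.out.println(stack.pop());     // verso
        System.out.println(stack.pop());     // recto
        System.out.println(stack.isEmpty()); // true
    }
}
```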
Generic Model of Abstraction Reference
    // Build the Document ADT from a feature configuration and a TEI-XML file (java.io.File).
    Document doc = AbstractBuilderFactory.buildDocument(
        new File("features.properties"),
        new File("teiDocument.xml"));
    // Ask the document for its sentences without touching the XML directly.
    Sentence[] sentences = doc.getContentCollection(Sentence.class);
An electronic document (encoded at a low level in TEI-XML) is abstracted through OOP, using the Java programming language.
doc is an instance of the Document ADT class and represents the encoded document. It hides the implementation details and exposes all and only the functionality needed for processing.
What Do We Need?
A suitable digital scholarly framework for encoding, processing, persisting, indexing, comparing, interpreting, and retrieving literary and historical documents is still missing. The main reason is the lack of a formal, shared and structured representation of the different types of textual entities that occur in the humanities domain, followed by a standardization of their behavior and operations.
Type System for Textual Scholarship
Microkernel Architecture Pattern in the Omega Framework
The Microkernel is a POSA pattern, adopted in various modern operating systems, which provides the minimal functional entities of a system. It manages the evolution of the application in terms of functional and non-functional requirements. External functionality can be plugged into the microkernel through specific interfaces and Adapters.
An important modern approach to application design, and one we believe deserves attention, is re-engineering the traditional monolithic application into independent modules or components. These components run on top of the microkernel, which handles domain use cases as microservices. The modules exchange messages with each other and with the microkernel through suitable communication mechanisms and patterns.
We are adopting the microkernel approach to build a modular framework for textual scholarship applications that can also incorporate existing software libraries.
[UML class diagram of the Microkernel architectural pattern. Classes: Client, Adapter, Microkernel, Internal Server, External Server, connected by "Calls" relations]
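The plug-in idea behind the pattern can be sketched in a few lines of Java. This is a simplified illustration, not the Omega implementation and not a full rendering of the POSA roles: every name below (`Service`, `Microkernel`, `TokenCounter`, `LegacyUppercaser`, `UppercaserAdapter`) is hypothetical. The kernel only registers services and routes requests; an adapter wraps a pre-existing library so that it can be plugged in through the same interface.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical contract for any functionality plugged into the kernel. */
interface Service {
    String handle(String request);
}

/** Minimal kernel: it only registers services and routes requests to them. */
final class Microkernel {
    private final Map<String, Service> services = new LinkedHashMap<>();

    void register(String name, Service service) {
        services.put(name, service);
    }

    String call(String name, String request) {
        Service service = services.get(name);
        if (service == null) {
            throw new IllegalArgumentException("no service registered as '" + name + "'");
        }
        return service.handle(request);
    }
}

/** Internal server: written directly against the Service contract. */
final class TokenCounter implements Service {
    @Override public String handle(String request) {
        String trimmed = request.trim();
        return String.valueOf(trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length);
    }
}

/** Stand-in for an existing library that has its own, different API. */
final class LegacyUppercaser {
    String toUpper(String s) { return s.toUpperCase(Locale.ROOT); }
}

/** Adapter: exposes the existing library through the Service contract. */
final class UppercaserAdapter implements Service {
    private final LegacyUppercaser delegate = new LegacyUppercaser();

    @Override public String handle(String request) {
        return delegate.toUpper(request);
    }
}

final class MicrokernelDemo {
    public static void main(String[] args) {
        Microkernel kernel = new Microkernel();
        kernel.register("count", new TokenCounter());
        kernel.register("upper", new UppercaserAdapter());
        System.out.println(kernel.call("count", "in sphaeram commentarius")); // 3
        System.out.println(kernel.call("upper", "clavius"));                  // CLAVIUS
    }
}
```

Adding a new capability means registering one more `Service`; neither the kernel nor the existing services need to change.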
Towards the Omega Framework
✓ Follow the Domain Driven Design and Domain Specific Modeling
✓ Implement the Microkernel Architecture
✓ Define the Core Entities
✓ Model the domain of interest by adopting an Object-Oriented Approach
✓ Implement the Domain Specific Abstract Data Types
✓ Design the domain model by using the Unified Modeling Language (UML)
✓ Develop components by following Design Patterns and Technologies for Knowledge Representation
Provide a flexible method, reusable tools and long-term infrastructure for scholarly textual processing
[Diagram of the Omega framework. Labels: Ω; μkernel; DS-ADT Low API; DS-ADT Middle API; Use; Provides; Use cases; Image management; Linguistic Analysis; Text Processing; Multilayered Annotation; Lexica; Terminologies; Ontologies; Assisted Translation; Advanced Search; Editing and Publishing]
The Omega Core Entities for Domain Specific-ADT
[Class diagram of the core entities; each description below annotates one entity]
It encapsulates the information conveyed by the resource.
It is in charge of managing the raw data.
It identifies specific data fragments of the resource content, and it is used to establish the boundaries of an annotation.
It represents a piece of information associated with a locus; an annotation is a source in itself and, thus, it can be recursively annotated.
It indicates the nature of the Source (e.g. text, image, audio, etc.).
It represents a coordinate of a locus; depending on the SourceType, POIs define the boundaries of a sequence of interest (textual fragment) or a region of interest (image portion).
It indicates the type of the Annotation (e.g. a token, a lemma, a named entity, etc.).
These classes implement the Role Design Pattern, which is meant to manage changes in the underlying text representation schema.
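The relationships described above (a locus delimits a fragment of a source, and an annotation is itself a source that can be annotated in turn) can be sketched in Java as follows. This is an illustration under stated assumptions, not the Omega classes: the class shapes, constructors and accessor names are invented for this example, only textual sources are handled, and the end offset is treated as exclusive, which the slides do not specify.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Nature of a source (hypothetical enum modeled on the slide's examples). */
enum SourceType { TEXT, IMAGE, AUDIO }

/** Anything that can be annotated, identified by a URI. */
class Source {
    private final URI uri;
    private final SourceType type;
    private final String content;

    Source(URI uri, SourceType type, String content) {
        this.uri = uri;
        this.type = type;
        this.content = content;
    }

    URI uri() { return uri; }
    SourceType type() { return type; }
    String content() { return content; }
}

/** A fragment of a textual source, bounded by two offsets (end is exclusive here). */
final class Locus {
    private final Source source;
    private final int start;
    private final int end;

    Locus(Source source, int start, int end) {
        if (source.type() != SourceType.TEXT) {
            throw new IllegalArgumentException("this sketch handles text only");
        }
        if (start < 0 || end < start || end > source.content().length()) {
            throw new IllegalArgumentException("offsets out of range");
        }
        this.source = source;
        this.start = start;
        this.end = end;
    }

    String fragment() { return source.content().substring(start, end); }
}

/** An annotation is itself a Source, so it can be annotated recursively. */
final class Annotation extends Source {
    private final List<Locus> loci = new ArrayList<>();

    Annotation(URI uri, String content) {
        super(uri, SourceType.TEXT, content);
    }

    void addLocus(Source target, int start, int end) {
        loci.add(new Locus(target, start, end));
    }

    List<Locus> loci() { return Collections.unmodifiableList(loci); }
}

final class CoreEntitiesDemo {
    public static void main(String[] args) {
        Source text = new Source(URI.create("//source/text/000"),
                SourceType.TEXT, "Literary Text to process");
        Annotation note = new Annotation(URI.create("//annotation/text/123"),
                "Annotation on the text");
        note.addLocus(text, 9, 13);
        System.out.println(note.loci().get(0).fragment()); // Text

        // Annotating the annotation: possible because Annotation extends Source.
        Annotation comment = new Annotation(URI.create("//annotation/text/124"),
                "Comment on the annotation");
        comment.addLocus(note, 0, 10);
        System.out.println(comment.loci().get(0).fragment()); // Annotation
    }
}
```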
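Application Program Interface and Data Representation
    // Create a textual source identified by a URI (java.net.URI).
    text = Text.of("Literary Text to process",
        URI.create("//source/text/000"));
    // Create an annotation, itself identified by a URI.
    annotation = AnnotationText.of("Annotation on the text",
        URI.create("//annotation/text/123"));
    // Anchor the annotation to a locus on the text (offsets 13 and 18), then persist it.
    annotation.addLocus(text, 13, 18);
    annotation.save();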
Conclusions
WHAT WE HAVE DONE:
➔ Outlined a methodology for Software Engineering in Textual Scholarship based on Abstract Data Types defined through the Domain Driven Design
➔ Implemented a microkernel architecture for modular applications
➔ Defined a set of “core entities” for Domain Specific-ADT
➔ Started to apply this approach whenever we develop new tools, as shown in the Clavius on the Web case study
➔ Published the source code on GitHub (stay tuned at https://goo.gl/mMip5T)
WHAT WE WANT TO DO:
★ To carry on the work by enhancing the first set of APIs
★ To implement a first set of use cases
★ To share Omega with the community, foster its application to more use cases and, last but not least, to involve other developers in its advancement
Selected References
B. Almas, M.C. Beaulieu, 2013. Developing a New Integrated Editing Platform for Source Documents in Classics. LLC (28). pp. 493-503.
M. Berti, et al., 2015. The Linked Fragment: TEI and the Encoding of Text Reuses of Lost Authors. JTEI (8).
F. Boschetti, A.M. Del Grosso, 2015. TeiCoPhiLib: A library of components for the domain of collaborative philology. JTEI (8).
G. Crane, et al., 2014. Participatory Philology: Computational Linguistics and the Future of Historical Language Education. Human Computation 1(2). pp. 177-84.
B. Dathan, S. Ramnath, 2015. Object-Oriented Analysis, Design and Implementation: An Integrated Approach. Springer.
A.M. Del Grosso, et al., 2016. Defining the Core Entities of an Environment for Textual Processing in Literary Computing. DH.
E. Evans, 2003. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley Professional.
F. Fischer, 2013. All texts are equal, but... Textual Plurality and the Critical Text in Digital Scholarly Editions. Variants (10). pp. 77-92.
M. Gabbrielli, S. Martini, 2010. Programming Languages: Principles and Paradigms. Springer. pp. 265-332.
F. Gibbs, T. Owens, 2012. Building Better Digital Humanities Tools: Toward broader audiences and user-centered designs. DHQ 6 (2).
W. McCarty, 2005. Humanities Computing. Palgrave Macmillan.
S. Millett, N. Tune, 2015. Patterns, Principles, and Practices of Domain-Driven Design. John Wiley & Sons, Inc.
E. Pierazzo, 2015. Digital Scholarly Editing: Theories, Models and Methods. Farnham, Surrey: Ashgate.
P. Robinson, 2013. Toward a Theory of Digital Editions. Variants (10). pp. 105-132.
P.L. Shillingsburg, 2015. Development Principles for Virtual Archives and Editions. Variants (11). pp. 9-28.
S. Schreibman, et al., 2016. A New Companion to Digital Humanities, 2nd Edition. Wiley-Blackwell.
M. Thaller, 2006. Waiting for the Next Wave: Humanities Computing in Computers, Literature and Philology (CLiP), King’s College London.
F. Tomasi, et al., 2014. Proceedings of the Third AIUCD Annual Conference on Humanities and their Methods in the Digital Ecosystem. ACM, New York, NY, USA.
E. Vanhoutte, 2010. Defining Electronic Editions: A Historical and Functional Perspective, in Text and Genre in Reconstruction. Effects of Digitalization on Ideas, Behaviours, Products and Institutions. pp. 119-144.
V. Vernon, 2013. Implementing Domain-Driven Design. Addison-Wesley.
THANK YOU! Any questions?