Developing Chemical Information Systems (An Object-Oriented Approach Using Enterprise Java)

CHAPTER 8

Software Architecture

Modern information systems provide many layers of abstraction to ease development and increase the portability of software. The operating system is an abstraction layer that hides the hardware architecture. For example, both Windows and Linux can run on Intel and AMD hardware, and application developers usually do not care what underlying hardware is being used. Virtual machines are a layer of abstraction that hides the operating system. For example, Java Virtual Machines are available for almost all kinds of operating systems, so in most cases Java developers can write portable code without even thinking about the underlying operating system their code has to run on. The Microsoft Common Language Runtime (CLR) is a similar concept, although its implementation on operating systems other than Windows remains to be seen. At least the CLR is language independent, in that you can write your code in any language that is supported by .NET, and components written in different languages can call each other and interoperate seamlessly within the CLR. Application server specifications are another layer of abstraction, with containers in which business components are deployed. For example, if you adhere to the J2EE standard APIs, your J2EE components should be easily portable from one application server implementation (WebLogic) to another (JBoss) or vice versa.

These abstraction layers offer tremendous benefits to the software development process: reduced development complexity and cost, and increased productivity. Application server platforms and blueprints also provide software development frameworks to help the software fit into specific architecture patterns. One of the most commonly adopted software architecture patterns for enterprise systems is the layered architecture (Buschmann et al., 1996; Fowler, 2003a). It is also the heart of the J2EE blueprint (Alur et al., 2003).

Developing Chemical Information Systems: An Object-Oriented Approach Using Enterprise Java, by Fan Li. Copyright © 2007 John Wiley & Sons, Inc.

The Layered Architectural Pattern: This helps to structure applications that can be decomposed into groups of subtasks, in which each group of subtasks is at a particular level of abstraction.

In a layered architecture, the software system is divided into layers of subsystems in which lower layers provide services to upper layers. A classic example of the layered architecture is the ISO network reference model (Figure 8.1).

Please note that layers and tiers are two different concepts. Tiers are physical separations of subsystems: each subsystem runs on different hardware, or on the same hardware but in a different process. In a multitiered system, the interaction between the subsystems is accomplished through remote procedure calls (RPCs). Any RPC involves communication overhead and therefore carries a performance penalty, whether the remote procedure is on separate hardware or on the same physical hardware but in a different process. Layers, on the other hand, are logical separations of the subsystems. Each layer can run on a different physical tier, or all layers can run on a single tier. The purpose of physical tiers is to leverage distributed hardware resources or to reuse a piece of software that is deployed on different hardware. The purpose of a layered software architecture is to separate the system into highly cohesive and loosely coupled modules (see Chapter 2 for software development principles).

Figure 8.1 ISO network reference model. Layers, from top to bottom: Application, Presentation, Session, Transport, Network, Link, Physical.

Figure 8.2 shows a typical layered architecture in a Web application. It also shows how the layers are typically distributed among the physical tiers.

From the top, the client layer resides on an end user's desktop, laptop, or handheld device, which is typically, but not necessarily, a Windows PC with a Web browser. A Web-based client layer is usually called a thin client because it is lightweight. The programs that run inside the browser are typically JavaScript, VBScript, Java applets, ActiveX controls, or Web browser plug-ins. Using a rich client such as .NET or Java Swing is another choice, although this book focuses on a Web-based architecture.

The next three layers reside on an application middleware server, although in some systems there is a further physical separation in which the presentation layer runs on different hardware from the domain and data access layers. If EJB is used in a J2EE application, the presentation layer runs in a Web container and the domain layer runs in an EJB container. With the EJB local interfaces introduced in J2EE 1.3, this physical separation becomes unnecessary, which eliminates the network overhead between the two.

Figure 8.2 A layered architecture in a Web application and how these layers are typically distributed among the three physical tiers. Client tier: client layer. Middle tier: presentation layer, domain layer, data access layer. Data tier: data source layer.

Although a J2EE application usually implies a Web application based on Java Servlets, JSP, and Enterprise Java Beans (EJB), J2EE does not mandate the use of EJB. In fact, not using EJB gives you some performance advantages and some programming freedom. On the other hand, EJB, if used effectively, can ease development and deployment because the EJB container provides many low-level services that allow you to focus on business logic. However, being an effective EJB developer does not mean just understanding the APIs. You have to understand how the EJB container works to write robust and fast EJB objects. For example, Stateful Session Beans are more expensive than Stateless Session Beans and should be avoided when possible. Entity Beans are far more expensive than session beans and therefore should be used only to represent "first class" entities (e.g., Employee) in the database. Dependent objects should be used to represent "second class" entities (e.g., Address).

The presentation layer is responsible for receiving and "interpreting" requests from the client layer, delegating the requests to the domain layer, and generating and presenting responses back to the client layer. Please note that the presentation layer should not actually process the requests. It should delegate them to the domain layer. This separation of responsibilities increases the cohesion of each layer and makes changes easier: a single domain layer can support multiple flavors of presentation layers and vice versa.
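The delegation described above might be sketched as follows. This is a minimal illustration, not code from the book: `CompoundService`, `SimpleCompoundService`, `RegistrationController`, the `structure` request parameter, and the `CPD-` identifier format are all hypothetical names chosen for the example, and a plain `Map` stands in for a servlet request.

```java
import java.util.Map;

// Domain layer: owns the business rule (a hypothetical registration check).
interface CompoundService {
    String register(String structure);
}

class SimpleCompoundService implements CompoundService {
    private int nextId = 1;

    @Override
    public String register(String structure) {
        // The validation rule lives in the domain layer, not in the presentation layer.
        if (structure == null || structure.trim().isEmpty()) {
            throw new IllegalArgumentException("structure is required");
        }
        return "CPD-" + (nextId++);
    }
}

// Presentation layer: interprets the request, delegates, and formats the response.
class RegistrationController {
    private final CompoundService service;

    RegistrationController(CompoundService service) {
        this.service = service;
    }

    String handle(Map<String, String> requestParams) {
        try {
            String id = service.register(requestParams.get("structure"));
            return "Registered " + id;
        } catch (IllegalArgumentException e) {
            return "Error: " + e.getMessage();
        }
    }
}
```

Because the controller holds no business rules, a second presentation layer (for example, a command-line client) could reuse `SimpleCompoundService` unchanged.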

The business layer (or domain layer) is the center of the system and does the real work. It implements all business logic and workflows. In J2EE, EJB can be used to implement the business layer. However, you can also use Plain Old Java Objects (POJOs) with an object-relational mapping tool or the JDBC API directly to do the job.

The data access layer (or data persistence layer) encapsulates interactions (select, insert, update, and delete) with the backend databases. The purpose of this layer is to hide database schemas from the business objects in the domain layer so that when the database schemas change, the domain layer is not affected. The data access layer can be implemented with Entity Beans or with POJOs using the JDBC API. Entity Beans are not recommended for several reasons. First, as was discussed, there is a huge performance impact when Entity Beans are used. Second, if you use MDL RCG Oracle Gateway, you will not be able to use Container Managed Persistence (CMP). In J2EE 1.3 and 1.4, the Enterprise Java Beans Query Language (EJB QL) does not support MDLDirect operators. I do not know if it ever will. Not being able to use CMP takes away one of the biggest advantages of Entity Beans. An alternative is to use an object-relational mapping tool such as Hibernate. In the chemical information domain, there is the MDL Isentris Integrated Data Source Framework. It does work similar to that of other object-relational mapping tools, but it has a high license fee. Hibernate, on the other hand, is an open source tool and is free.
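One common way to hide the schema from the domain layer is a data access object (DAO) interface. The sketch below is illustrative only: `Compound`, `CompoundDao`, and `InMemoryCompoundDao` are hypothetical names, and the map-backed class merely stands in for an implementation that would use JDBC or an object-relational mapping tool against a real database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain object: knows nothing about tables or columns.
final class Compound {
    final String id;
    final String name;

    Compound(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The only persistence contract the domain layer sees (hypothetical interface).
interface CompoundDao {
    void save(Compound compound);

    Optional<Compound> findById(String id);
}

// Stand-in implementation backed by a map. A JDBC or ORM-based class would
// implement the same interface and keep all SQL and schema details to itself.
class InMemoryCompoundDao implements CompoundDao {
    private final Map<String, Compound> rows = new HashMap<>();

    @Override
    public void save(Compound compound) {
        rows.put(compound.id, compound);
    }

    @Override
    public Optional<Compound> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }
}
```

If the schema changes, only the class behind `CompoundDao` has to change; code written against the interface is unaffected.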

The very bottom layer in the architecture is the data storage layer. This is where the compound data are stored when a registration or update is committed. Almost all chemistry database vendors use Oracle as the data storage DBMS, including MDL, Daylight, Accelrys, Tripos, and CambridgeSoft. They provide some kind of chemistry data cartridge that allows you to query, insert, update, and delete compound data using direct SQL. I have experience using MDL's MDLDirect Data Cartridge version 2.0 with the MDL RCG database, and I am very satisfied with it. Storing compound data in an Oracle database allows you to query across chemical and biological data easily, which is a huge advantage.

An obvious benefit of the layered architecture is that you can easily swap out a particular layer and replace it with a different one without impacting the service consumer layer above it, provided that the service consumer layer depends on the interface of the layer being swapped out rather than on its implementation. For example, the chemistry intelligence component resides in the domain layer in Figure 8.2. Assume the component has an implementation-independent interface on which the presentation layer depends. Today you are using vendor A's implementation of that interface. Suppose that for some reason (maybe the vendor is going out of business, another vendor has a better implementation or a better price, or you have developed a better in-house implementation) you want to replace it with a different implementation. All you have to do is swap out the component and replace it with the other. To achieve easy plug and play, the higher level layer must depend on the abstraction of the lower level layer, not on its implementation. This is called the Dependency Inversion Principle (Martin, 2003), which is discussed further in Chapters 10–12.
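The swap described above can be illustrated with a small sketch. All names here are hypothetical (`ChemistryEngine`, `VendorAEngine`, `InHouseEngine`, `RegistrationPage`), and the name-normalizing behavior is a deliberately trivial stand-in for real chemistry intelligence.

```java
// Abstraction that the consuming layer depends on (hypothetical interface).
interface ChemistryEngine {
    String normalizeName(String rawName);
}

// Toy stand-in for vendor A's implementation: trims surrounding whitespace only.
class VendorAEngine implements ChemistryEngine {
    @Override
    public String normalizeName(String rawName) {
        return rawName.trim();
    }
}

// Toy stand-in for an in-house replacement: also collapses internal whitespace runs.
class InHouseEngine implements ChemistryEngine {
    @Override
    public String normalizeName(String rawName) {
        return rawName.trim().replaceAll("\\s+", " ");
    }
}

// Consumer in the layer above: it sees only the interface, never a concrete class.
class RegistrationPage {
    private final ChemistryEngine engine;

    RegistrationPage(ChemistryEngine engine) {
        this.engine = engine;
    }

    String show(String rawName) {
        return "Name: " + engine.normalizeName(rawName);
    }
}
```

Replacing `new RegistrationPage(new VendorAEngine())` with `new RegistrationPage(new InHouseEngine())` changes the implementation without touching `RegistrationPage` itself.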

This book demonstrates how the layered architecture can be used in an enterprise chemical information system.

MDL's new architecture, Isentris, is based on a layered architecture. It provides services that a standard J2EE application server provides, such as session management, object lifecycle management, messaging, object pooling, and object-relational mapping. It also provides chemical informatics functionality such as chemistry rules, compound registration, and a standard query language for both chemical structure and alphanumeric data. In my view, Isentris is still young and needs some time to mature. It is also not cheap compared with J2EE application server products such as BEA WebLogic and IBM WebSphere. However, if your organization does not have J2EE or .NET expertise in-house, it is worth considering as a post-ISIS architecture.

There are other architecture patterns, one of which is Pipe and Filter (Buschmann et al., 1996). Pipeline Pilot of SciTegic (now part of Accelrys) is a good application of the Pipe and Filter pattern and is widely used in the chemical information domain.

Pipe and Filter Pattern: Data flows between the filters via pipes, and the filters apply some logic to the data so that the data that flows out from a filter is the data needed by the next filter.
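A minimal sketch of the pattern in plain Java follows. It does not reflect how Pipeline Pilot is implemented. Each filter is modeled as a `java.util.function.Function`, the pipe is function composition, and the names `Pipeline`, `Filters`, `TRIM`, `DROP_BLANK`, and `DEDUPE` are hypothetical.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class Pipeline {
    // Connects the filters in order: each filter's output becomes the next filter's input.
    static <T> Function<T, T> pipe(List<Function<T, T>> filters) {
        Function<T, T> result = Function.identity();
        for (Function<T, T> filter : filters) {
            result = result.andThen(filter);
        }
        return result;
    }
}

class Filters {
    // Filter 1: strip surrounding whitespace from every record.
    static final Function<List<String>, List<String>> TRIM =
            in -> in.stream().map(String::trim).collect(Collectors.toList());

    // Filter 2: drop records that are empty.
    static final Function<List<String>, List<String>> DROP_BLANK =
            in -> in.stream().filter(s -> !s.isEmpty()).collect(Collectors.toList());

    // Filter 3: remove duplicates, keeping the first occurrence of each record.
    static final Function<List<String>, List<String>> DEDUPE =
            in -> in.stream().distinct().collect(Collectors.toList());
}
```

Given the input `[" benzene", "", "toluene ", "benzene"]`, composing `TRIM`, `DROP_BLANK`, and `DEDUPE` in that order with `Pipeline.pipe` yields `["benzene", "toluene"]`. The order matters here: `DROP_BLANK` only removes whitespace-only records if `TRIM` runs first.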
