Hippo TechWP Understanding Hippo CMS 7 Software Architecture


  • Colofon

    Author: Woonsan Ko

    Get in touch with Hippo: [email protected]

    North America: +1 877 414 47 76 (toll free)

    Europe: +31 20 522 44 66

    Introduction

    This document describes the architecture of Hippo CMS on an abstract level. It aims to give architects a basic understanding of the Hippo CMS software architecture so that they can define custom requirements for their specific IT ecosystem.

    Whitepaper

    Amsterdam Boston Follow the Hippo trail: onehippo.com

    Understanding Hippo CMS 7 Software Architecture

  • Table of Contents

    1. Introduction

    2. Use Cases Overview
       Actors List
       Use Cases List

    3. Quality Attributes
       User/Usability Considerations
       Runtime Qualities
          Availability
          Interoperability
          Manageability
          Performance
          Reliability
          Scalability
          Security
       Design Qualities
          Modifiability
          Maintainability
          Reusability
       System Qualities
          Supportability

    4. Software Architecture
       Overall Views
       Content Production
       Content Delivery
       Security Concerns
       Module View
       Simple Enterprise SSO Enabled Architecture-Deployment View
       Shibboleth/SAML Enabled Architecture

    5. Summary

    6. References


  • 1. Introduction

    This document describes the software architecture of Hippo CMS at a high level to help various stakeholders understand Hippo CMS in general.

    For enterprise architects and technology leaders, this document will help to understand how Hippo CMS can solve technical problems and challenges and how Hippo CMS can be integrated with industry standard technologies transparently and seamlessly.

    For developers, it will help to understand how components interact with each other and how modules are constructed and relate to each other.

    For system administrators, it will help to understand how systems are deployed, maintained and monitored.

    For testers, it will help to understand how systems can be integrated and tested properly.


  • Actors

    Web User: Users accessing the websites to search, view and edit (e.g., adding comments to an article) the content of the system. Can be an anonymous or authenticated user.

    Author: CMS users who can create and edit content in the system, but normally cannot publish the content directly by themselves. They can request publication of the content instead.

    Editor: CMS users who can create, edit, publish and depublish content in the system. Normally they can approve the publication requests from authors.

    Webmaster: Webmaster who manages channels, content delivery web application configurations and personas.

    Administrator: CMS Administrator who can manage access controls of the system, manage content migration tasks from external sources and do all other CMS related administration tasks.

    Search Engine: External search engines, e.g., Google, Bing, etc. CMS provides Search Engine Optimization features for external search engines.

    Content Subscriber: External users or systems subscribing to the content through RSS, Atom feeds, e-mails, etc.

    External Content Source: External content source which can be migrated into CMS, e.g., XML files/folders and images/assets, external repositories, etc.

    Use Cases

    Search Content: Web User searches content by text, categories, tags or metadata.

    View Content: Web User views selected content on the website. External Search Engines also crawl content or metadata on the website for indexing purposes. The website content includes generated content in the markup, meta tags, sitemap data, etc.

    View Social Content: Web User views opinions, comments, etc. shared by Social Applications (e.g., Facebook, Twitter, Disqus, etc.) on the content.

    Edit Content: Web User can create and edit content on the website, e.g., comments on an article.

    Syndicate Content: Content Subscriber can receive content through RSS, Atom feeds, e-mails, etc.

    Author Content: Author creates and edits content in CMS, and requests publication of the content.

    Manage Publication: Editor can accept or reject the publication requests on the content, and schedule publication of the content.

    Manage Channel: Webmaster manages channels and their associated content delivery web application configurations.

    Manage Persona: Webmaster configures collectors, defines characteristics and manages personas for targeting and personalization.

    Manage Access Control: Administrator manages access controls by defining users, groups and privileges for each content security domain.

    Migrate Content: Administrator configures and monitors content migration tasks.
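The split between the Author Content and Manage Publication use cases can be read as a small state machine: an author submits a draft for publication and an editor accepts or rejects the request. The following is a minimal sketch of that reading in Python, chosen only for brevity. The names `State`, `Document`, `request_publication` and `review` are hypothetical and are not part of the Hippo CMS workflow API.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    REQUESTED = "publication requested"
    PUBLISHED = "published"


@dataclass
class Document:
    title: str
    state: State = State.DRAFT


def request_publication(doc: Document, role: str) -> None:
    """An author (or editor) asks for a draft to be published."""
    if role not in ("author", "editor"):
        raise PermissionError(f"{role} may not request publication")
    if doc.state is not State.DRAFT:
        raise ValueError("only drafts can be submitted for publication")
    doc.state = State.REQUESTED


def review(doc: Document, role: str, accept: bool) -> None:
    """An editor accepts or rejects a pending publication request."""
    if role != "editor":
        raise PermissionError(f"{role} may not review publication requests")
    if doc.state is not State.REQUESTED:
        raise ValueError("no pending publication request")
    # A rejected request sends the document back to draft (an assumption).
    doc.state = State.PUBLISHED if accept else State.DRAFT
```

In this sketch an author calling `review` raises `PermissionError`, which mirrors the rule that authors normally cannot publish content by themselves.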

    [Figure: Key Roles and their Use Cases. Actors shown: Web User, Author, Editor, Webmaster, Administrator, Search Engine, Social Application, Content Subscriber, External Content Source. Systems shown: Site, CMS. Use cases shown: Search Content, View Content, View Social Content, Edit Content, Author Content, Syndicate Content, Manage Channel, Manage Persona, Manage Publication, Manage Access Control, Migrate Content.]


  • 2. Use Cases Overview

    The use cases overview diagram above shows higher level use cases. Please note that the actors and use cases will vary in specific systems; the overview shows a simplified, generic model only.

    3. Quality Attributes

    Hippo follows a number of core architectural principles when designing features and functionality for Hippo CMS, which are described in more detail below.

    User/Usability Considerations

    Intuitive and customizable user experiences for easy creation and management of web content without IT assistance.

    Enables authors to manage multilingual content and deliver the right content based on user locale.

    Enables publishing content to multiple, disparate channels such as mobile, email, website, landing page, social network, print, and campaign management system.

    Enables targeting and personalization for any visitor based on context, behavior, geography and profile data.

    Easy management of rich media files such as images and asset files, and seamless integration for streaming audio and video.

    Provides social and community features that can be incorporated into existing sites for sharing of opinions, articles and reviews.

    Runtime Qualities

    Availability

    Websites are mission critical systems for online businesses. The website and CMS system have to remain functional without being affected by critical system errors, infrastructure problems, malicious attacks or system load.

    Interoperability

    Conforms to web standards (e.g., W3C XHTML, CSS, etc.).

    Supports search engine optimization for websites.

    Supports enterprise search engine integration.

    Enables industry standards based communications (e.g., Servlet/JSP/JSTL, JSR-283, etc.) and supports easy JEE standard based integrations.

    Allows standards based information exchanges, e.g., XML, RSS, Atom feeds, RESTful APIs.

    Enables the use of external authoring tools such as Office documents.

    Supports content migration from external sources in various formats such as XML and image files.

    Supports application integration. Enables seamless integration with various enterprise systems such as eCommerce, CRM, BPM, etc.

    Manageability

    Supports easy management with useful tools for monitoring and tuning.

    Performance

    Meets performance needs for online business. The system has to be responsive and execute any action within a given time interval.

    Reliability

    Remains operational over time. The system will not fail to perform its intended functions over a specified time interval.

    Scalability

    Meets massive scalability needs for online business. The system should handle increases in load without impact on its performance and should be readily enlarged.

    Security

    Provides safe and secure access, and supports access control with user groups on domain rules.

    Supports integration with industry security standards based solutions, e.g., JAAS, LDAP, and Single Sign-On using popular central authentication services (e.g., OpenID, SAML, Shibboleth, SiteMinder, etc.).

    Supports customization and plugging in custom authentication implementations, Remember-me, CAPTCHA, concurrent sessions, etc.

  • Design Qualities

    Modifiability

    Supports templates that can be applied to new and existing content, allowing the appearance of all content to be changed from one central place.

    Supports a plugin architecture. Developers should be able to add functionality with plugins easily.

    Maintainability

    Components, services, features and interfaces should be easy to change when adding or changing functionality, fixing errors, and meeting new business requirements.

    Reusability

    Components and subsystems should be designed to be suitable for use in other applications and in other scenarios as much as possible in order to minimize the duplication of components and implementation time.

    System Qualities

    Supportability

    Should provide information helpful for identifying and resolving issues when it fails to work correctly.

  • 4. Software Architecture

    Overall Views

    Deployment View

    The following deployment view shows a simplified deployment with core systems. For simplicity, it just shows a typical simple deployment, without considering specific concerns such as security, caching options, etc. in detail.

    [Figure: System Overview. Nodes shown: Browser Client, Search Engine, Content Subscriber, JCR Client, Load Balancer, HTTPd, Application Server (1) to Application Server (n), DBMS, Lucene Index (1) to Lucene Index (n), CouchBase Server. Artifacts shown: SITE (war), CMS (war), System Admin Tools (war), Repository (jar), Import Tool. Connection labels: http, RSS/Atom, JCR (over WebDAV or RMI).]

    An explanation of core nodes:

    Browser Clients can visit websites through HTTP/S connections.

    Search Engines can crawl websites and metadata through HTTP/S connections.

    Content Subscribers (e.g., RSS/Atom feed clients) can retrieve contents through content syndication protocols.

    JCR Clients can communicate with the repository, which can be deployed onto the same Application Server node as shown above or a separate Application Server node, through the JCR API. The underlying connection for JCR API for remote clients can be either WebDAV or RMI. By the standard of JCR API, JCR Clients can communicate with the repository in either client-server invocation style or asynchronous event subscription style.

    The content delivery web application, SITE, can be deployed onto any JEE compliant servlet containers such as Tomcat 6, JBoss, Glassfish, WebSphere, etc.

    The content production web application, CMS, can be deployed onto any JEE compliant servlet containers such as Tomcat 6, JBoss, Glassfish, WebSphere, etc.

    The repository server module, Repository, can be deployed onto an application server together with the CMS web application, but also can be deployed onto a separate application server or as a separate web application.

    Each repository instance has its own Lucene index, while all the cluster repository nodes should share the same DBMS.

    Hippo CMS can leverage CouchBase as separate server installation for storing visitor data used for targeting and analytics.

    Hippo repository supports various DBMS such as MySQL, PostgreSQL, Oracle, MS SQL, Amazon RDS and IBM DB2.

    System Admin Tools are mostly web-based applications and can be deployed onto any JEE compliant servlet containers. System administrators can also use JMX tools to monitor JVMs and Content Delivery web applications, by connecting through JMX protocol (either local, RMI, or HTTP based).

    Content Import Tool application, which imports XML files and binary files into the repository, can be deployed onto any JEE compliant servlet containers, too.

  • Content Management System Application Components and Connectors View

    [Figure: Content Management System Application Components and Connectors View. Components shown: Wicket Application (Servlet Filter), Main, User Session, JCR Session, Plugin Config Service, Plugin Config, Hippo JCR Repository, Plugin Page, Login Page, Dashboard Perspective, Browse Perspective, Admin Perspective, Reports Perspective, Targeting Perspective, Channel Manager Perspective, Channel Manager, Document Wizard Plugin, Document History Plugin, Folder Tree Plugin, Document List Plugin, Todo List Plugin, Experience Optimizer Plugin, Google Analytics Plugin. Connection label: http/rest.]

    An explanation of core components and connectors:

    The content production CMS web application is based on the Apache Wicket framework, providing a very flexible Hippo plugin architecture and extension points.

    Like normal Apache Wicket applications, a page component consists of descendant components. In addition, the CMS Frontend Plugin Architecture allows dynamic aggregation of plugin components at runtime, which can be configured in the repository, without having to know all the descendant components at design time.

    With a rich set of Wicket based plugin components, native Wicket AJAX support and extensions of ExtJS and jQuery by Hippo, usability can be maximized.

    WicketApplication is just a standard Apache Wicket filter, and Main is the entry point Wicket application which shows homepages such as PluginPage, LoginPage, etc. and has the UserSession which is associated with a JCR Session.

    PluginPage consists of multiple Perspective plugin components and each Perspective plugin component consists of multiple child plugin components.

    For example, the Dashboard Perspective plugin component consists of Document Wizard, Document History, Todo List, Folder Tree, Document List plugin components, etc.

    A plugin component can contain multiple child plugin components by defining configurations in the repository, which increases customizability, maintainability and reusability.

    All components may use the JCR Session in the UserSession to retrieve/update content in the repository through the JCR API. They may also use the Hippo Repository API to handle virtual nodes and workflows.

    Because it has to communicate with the SITE web application at runtime when composing page layouts or assembling components in pages, the ChannelManager Perspective can connect to the ChannelManager through RESTful APIs.
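The runtime aggregation of plugin components described on this page, where a parent plugin learns its children from configuration held in the repository rather than from code, can be illustrated with a small sketch. It is written in Python for brevity and is not Wicket or Hippo code. `PLUGIN_CONFIG`, `Plugin` and `build` are hypothetical names, and the dictionary merely stands in for configuration that would be read from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class Plugin:
    name: str
    children: list = field(default_factory=list)

    def render(self, depth: int = 0) -> list:
        """Return the component tree as a list of indented lines."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


# Hypothetical stand-in for plugin configuration stored in the repository.
PLUGIN_CONFIG = {
    "dashboard": {"children": ["document-wizard", "todo-list"]},
    "document-wizard": {"children": []},
    "todo-list": {"children": []},
}


def build(name: str, config: dict) -> Plugin:
    """Assemble a plugin and its descendants from configuration at runtime."""
    node = config[name]
    return Plugin(name, [build(child, config) for child in node["children"]])
```

Changing the `children` list in the configuration changes the assembled tree without touching the `Plugin` class, which is the property the text credits for customizability and reusability.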

  • Hippo Delivery Tier Modules View

    [Figure: Hippo Delivery Tier Modules view. Modules shown: HST Core, HST Client, HST JAXRS, HST Security, HST Session Pool, HST Content Beans, HST Rewriter, HST Commons, HST API, HST Mock, Hippo Repository API, JCR API.]

    The figure above shows module dependencies of the Hippo Delivery Tier:

    HST Core is the core module of the Hippo Delivery Tier, including component manager, pipeline, valves, component invoker, etc.

    HST Client is the base module for Content Delivery applications such as custom HSTComponents, containing base component classes, utilities, etc.

    HST JAXRS contains RESTful API support components based on JAX-RS standard. Custom JAXRS Resource Beans can be implemented based on the JAX-RS standard and configured with this module in the Spring Framework configurations.

    HST Security contains authentication/authorization support for websites, including JAAS and form based authentication support. Spring Security Framework can be configured with this, too, in order to support various security requirements such as SiteMinder integration, Enterprise SSO integrations, etc.

    HST Session Pool has JCR Session pool support, with sophisticated resource management and JMX management features.

    HST Content Beans has Object-Content mapping support, which allows to map JCR content nodes to POJOs and vice versa.

    HST Rewriter has HTML content rewriting support with link and image rewriting features.

    HST Commons has default implementations of standard interfaces and common utilities.

    HST API provides all the standard APIs of the Hippo Delivery Tier.

    HST Mock contains some necessary mocking classes for easy unit testing, which increases testability.

    The Content Delivery Framework depends on Hippo Repository API and JCR API.
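The Object-Content mapping idea attributed to HST Content Beans on this page, mapping repository content nodes to plain objects and back, can be sketched as follows. This is Python for brevity, not the HST API. The bean class `NewsBean`, the property names in `FIELD_TO_PROPERTY`, and the use of a plain dictionary in place of a JCR node are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from bean field names to repository property names.
FIELD_TO_PROPERTY = {
    "title": "myproject:title",
    "summary": "myproject:summary",
}


@dataclass
class NewsBean:
    title: str
    summary: str


def to_bean(node: dict) -> NewsBean:
    """Map a content node (here a plain dict of properties) to a bean."""
    return NewsBean(**{f: node[p] for f, p in FIELD_TO_PROPERTY.items()})


def to_node(bean: NewsBean) -> dict:
    """Map a bean back to content node properties."""
    return {p: getattr(bean, f) for f, p in FIELD_TO_PROPERTY.items()}
```

Keeping the field-to-property table in one place means both directions of the mapping stay consistent, so rendering code can work with `bean.title` instead of repository property names.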

  • Security Aspects

    An Enterprise SSO Enabled Architecture-Deployment View

    [Figure: An Enterprise SSO Enabled Architecture-Deployment View. Nodes shown: Browser Client, HTTPd, Application Server, SSO Server, DBMS, LDAP Server. Artifacts shown: SSL Configuration, CMS (war), Site (war), Form Authentication Configuration, JAAS Configuration, Spring Security Integration Configuration. Connection label: http/s.]

    An explanation of core nodes:

    In most cases, the HTTPS connection for browser clients is configured and enabled on the HTTPd (Apache2) layer.

    In many cases, HTTPd or another reverse proxy layer node can redirect clients that require authentication to the Enterprise SSO Server, and the request can be redirected back with a valid security token.

    An Enterprise SSO Server such as SiteMinder can be accessed by applications on the Application Server to validate the security tokens if needed.

    CMS and site applications on the Application Server can also authenticate users against an LDAP Server if configured.

    CMS and site applications on the Application Server can also authenticate users by Form Authentication, JAAS or Spring Security Integration. If an Enterprise SSO Server is used, Spring Security Integration is capable of integrating with it seamlessly.

  • Shibboleth/SAML Enabled Architecture

    Shibboleth is a single sign-on (login) system for computer networks and the internet. It allows people to sign in, using just one identity, to various systems run by federations of different organizations or institutions. The federations are often universities or public service organizations. [5] This section describes how Hippo CMS 7 can be deployed, integrating with Enterprise Shibboleth SSO Solution, and how internal components interact with each other. For detail of

    Deep Dive Example for SSO Shibboleth/SAML Deployment

    [Figure: Deep Dive Example for SSO Shibboleth/SAML Deployment. Nodes shown: Browser Client, Apache2 HTTPd, Shibboleth Daemon, Shibboleth Identity Provider, Tomcat, LDAP Server, grouped under WCMS and Enterprise Federated Security Resources. Artifacts shown: mod_shib2, shibboleth2.xml, CMS (war), SITE (war), Spring Security Integration Configuration. Connection labels: http/s, SAML/https.]

    An explanation of core nodes:

    Browser Client attempts to access the WCMS system through the web URLs served by the frontend Apache HTTPd Server.

    The Apache HTTPd Server has authentication configuration for secured resources (e.g., /cms, /SITE/secured/articles/, etc.) with the Shibboleth authentication option. For the Shibboleth authentication option, the Apache HTTPd Server invokes the mod_shib2 module which is deployed onto the server.

    The mod_shib2 module communicates with the Shibboleth Daemon to request authentication through either a Unix socket or a TCP socket, based on configuration.

    The Shibboleth Daemon may initiate user sessions, manage the sessions for the specified time duration, let the Browser Client be redirected to the Shibboleth Identity Provider, and provide authentication information as environment variables or HTTP headers. The Shibboleth Identity Provider is normally deployed centrally in the enterprise, and so the WCMS system should normally be configured to access a remote Shibboleth Identity Provider through the Shibboleth Daemon.

    The Shibboleth Daemon is configured by shibboleth2.xml. Because the authentication information should be used in Java Applications connected through either mod_proxy or mod_jk2, the Shibboleth Daemon should be configured to leave the authentication information as HTTP Headers.

    For an authenticated user session, the Apache HTTPd Server will serve Java Web Applications (e.g., Content Production Application and Content Delivery Application) hosted by Tomcat, which is connected by either mod_proxy or mod_jk2.

    The CMS (Content Production) Application and SITE (Content Delivery) Application can be configured with Spring Security Framework enabled. Spring Security Framework can read the pre-authenticated HTTP Header [8], which is provided by the Shibboleth Daemon, and it can build a user principal based on the pre-authenticated HTTP Header. The CMS (Content Production) Application and SITE (Content Delivery) Application should be configured to use the user principal provided by Spring Security, instead of trying to authenticate the user by themselves.

    The CMS (Content Production) Application should be configured to synchronize the user data from the LDAP Server.

    httpd authenticates requests for secured resources through the mod_shib2 module, which invokes the Shibboleth Daemon. The Shibboleth Daemon may communicate with the Enterprise Shibboleth Identity Provider via SAML/HTTPS.

    If the user is successfully authenticated by the handshakes between the Client Browser and the Shibboleth Identity Provider, then httpd will reverse proxy the request to Tomcat.

    If the user is authenticated at the httpd level by the mod_shib2 module and the Shibboleth Daemon, then the request is regarded as pre-authenticated from the viewpoint of the Java Web Applications on Tomcat.

    The Spring Security Framework Filter initializes the proper user principal based on the pre-authenticated user information, which should be provided by an HTTP Header. [8] The components initializing the pre-authenticated user details can be easily customized. See [9] for an example.

    Now, the HST-2 Container can use the initialized user principal when serving secured page resources.

    Also, the CMS Frontend Application can create a user JCR session based on the initialized user principal.
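The pre-authentication step in this section, where the web application trusts the identity that the Shibboleth Daemon leaves in HTTP headers and builds a user principal from it instead of authenticating the user itself, can be sketched as follows. This is a Python illustration under stated assumptions, not Spring Security code. The header names `X-Remote-User` and `X-Remote-Groups`, the `;` separator and the `Principal` type are hypothetical, since the real names depend on how the Shibboleth Daemon and HTTPd are configured.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed header names; the real ones depend on the proxy configuration.
USER_HEADER = "X-Remote-User"
GROUPS_HEADER = "X-Remote-Groups"


@dataclass(frozen=True)
class Principal:
    name: str
    groups: tuple


def principal_from_headers(headers: Mapping[str, str]) -> Optional[Principal]:
    """Build a principal from headers set by the upstream authenticator."""
    name = headers.get(USER_HEADER, "").strip()
    if not name:
        # No pre-authenticated identity: treat the request as anonymous.
        return None
    raw_groups = headers.get(GROUPS_HEADER, "")
    groups = tuple(g.strip() for g in raw_groups.split(";") if g.strip())
    return Principal(name, groups)
```

A design of this kind is only safe when the application server accepts requests exclusively from the authenticating proxy; otherwise a client could set the headers itself.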

  • About Hippo

    At Hippo, we believe digital is here to make our lives a little bit better.

    Hippo sets the standard for how organizations can bring real-time relevance to their audience and is the foundation for personalized communication across all channels: mobile, social and web. Our purpose is to facilitate innovation so our customers can create digital miracles. We serve our customers by creating a platform that is fun to use, easy to implement and open for innovation.

    Hippo CMS is a powerful, enterprise-class foundation to deliver outstanding Customer Experiences based on Enterprise Agility and Innovation Power.

    Hippo CMS is open source, 100% Java and convinces with its lean product architecture that is built for uptime, security and performance.

    Our dedicated, Certified Partner Network delivers Hippo Awesomeness around the globe to our valued customers.

    Hippo is proud to serve organizations such as Disney, British Telecom, Dolce & Gabbana, Max Bahr, the Dutch Police, Thomson Reuters and Crédit Agricole.

    Hippo is headquartered in Amsterdam, The Netherlands and Boston, USA.

    Curious for more? Visit www.onehippo.com


    Resources

    1. Hippo Campus Community
       http://www.onehippo.org

    2. Spring Framework
       http://www.springsource.org/spring-framework

    3. Spring Security Framework
       http://static.springsource.org/spring-security/site/

    4. Apache Wicket
       http://wicket.apache.org

    5. Shibboleth (Internet2)
       http://en.wikipedia.org/wiki/Shibboleth_%28Internet2%29

    6. Shibboleth Documentation
       https://wiki.shibboleth.net/confluence/dashboard.action

    7. Deployment of Shibboleth Service Provider (SP) 2.0 on Debian GNU/Linux 4.0
       https://www.switch.ch/aai/docs/shibboleth/SWITCH/2.0/sp/deployment-sp-2.0-debian-4.0.html

    8. Spring Security Framework-Pre-Authentication Scenario
       http://static.springsource.org/spring-security/site/docs/3.1.x/reference/preauth.html