path/to/content - the Apache Jackrabbit content repository

description

Looking for a database where user profiles and image galleries are equally at home, one that comes with built-in full-text search, fine-grained access control, flexible schemas, versioning and many more advanced features? Take a look at Apache Jackrabbit, the Java-based content repository that combines the best parts of file systems and databases. This introductory presentation covers Apache Jackrabbit and its hierarchical content model, and shows how it can serve as a foundation for modern content-based applications.

Transcript of path/to/content - the Apache Jackrabbit content repository

1. /path/to/content - the Apache Jackrabbit content repository

2. Outline
    Repository model
    Property and node types
    Sessions and namespaces
    References and versioning
    Search and observation
    Access control
    Persistence and clustering
    Deployment and configuration
    Questions?

3. Repository model

4. Repository structure
    Repository
        Workspace A
        Workspace B
        Workspace C
        /jcr:system

5. or more commonly
    Repository
        default workspace
        /jcr:system

6. Workspace structure
    /                 (root node)
        jcr:system    (node /jcr:system)
        a             (node /a)
            b         (node /a/b)
            c         (node /a/c)

7. Node structure
    Property name     Type     Value
    jcr:primaryType   Name     nt:unstructured
    jcr:mixinTypes    Name[]   mix:referenceable
    jcr:uuid          String   c6d27a10-bf23-11e3-b
    title             String   My new node
    author            String   Jukka Zitting
    Child nodes: foo, bar, baz[1], baz[2]

8. Property and node types

9. Common property types
    Property type           Used for                     Examples
    String                  Short to medium-sized text   foo, This paragraph
    Binary                  Binary data and long text    PNG, PDF, This book
    Name                    Node and property names      nt:folder, content
    Path                    Node and property paths      /jcr:system, /etc/map
    Boolean, Long, Double   Scalar data                  true, 0, -2846, 3.14, NaN
    Date                    ISO 8601 timestamp           2014-04-08T12:00:00.000Z
    Reference               Graph structures             c6d27a10-bf23-11e3-b

10. Multi-valued properties
    Zero or more values
        Limit at around 10-100k values, depending on size of values
    All values must be of the same type
    Duplicates allowed
    No null values
        Automatically removed
    Order is preserved

11. Common node types
    nt:base
        - jcr:primaryType: Name
        - jcr:mixinTypes: Name[]
    nt:unstructured
        - * (any properties OK)
        + * (any child nodes OK)
    oak:Unstructured (w/o order)
        - * (any properties OK)
        + * (any child nodes OK)
    mix:referenceable
        - jcr:uuid: String
    mix:versionable -
    mix:lockable -

12. Common node types, cont.
    nt:hierarchyNode (abstract)
        - jcr:created: Date
    nt:file
        + jcr:content
    nt:folder
        + *: nt:hierarchyNode
    nt:resource
        - jcr:data: Binary
        - jcr:mimeType: String
        - jcr:lastModified: Date

13. Example
    (tree diagram; nodes listed in slide order as name: primary type)
    Site: nt:unstructured
    form: nt:folder
    style.css: nt:file
    logo.png: nt:file
    function: nt:folder
    jquery.js: nt:file
    Blog: nt:unstructured
    Post 1: nt:unstructured
    attachment.pdf: nt:file
    Post 2: nt:unstructured
    Comment 1: nt:unstructured

14. Sessions and namespaces

15. Sessions
    (diagram: a workspace and five sessions)

16. Session
    All content access goes through a session
    Sessions are created with an authenticated login() call
    Session-based authorization of reads, writes and other operations
    Tracking of transient changes
        Atomic save()
    Not thread-safe!
        for concurrent operations, use multiple sessions

17. Namespaces
    The repository has a set of prefix -> URI namespace mappings
        jcr: http://www.jcp.org/jcr/1.0
        nt: http://www.jcp.org/jcr/nt/1.0
        mix: http://www.jcp.org/jcr/mix/1.0
        xml: http://www.w3.org/XML/1998/namespace
        etc.
    Used to prevent naming conflicts between different clients
    Each session can override (non-default) mappings locally
        designed for cases like XML imports, etc.
        in practice seldom used, and often not recommended

18. References and versioning

19. References
    mix:referenceable node: jcr:uuid = c6d27a10-bf23-11e3-b1b6-0800200c9a66
    referring node: seeAlso = c6d27a10-bf23-11e3-b

20. References, cont.
    hard references
        enforced integrity; target cannot be removed
        least flexibility; think twice before using
    weak references
        remain valid across moves/renames
    paths, names, URLs, etc.
        no backreferences

21. Versioning
    (diagram: a mix:versionable node and a checkin)

22. Versioning, cont.
    To make a node versionable, add the mix:versionable mixin
        scope of versionability determined by node types (OPV)
    A checkin freezes a piece of content and makes a copy of it in the version history
    A checkout unfreezes the content and allows it to be modified
    A restore goes back in time to a previously checked-in version
    A merge combines changes from another workspace with those made in this workspace

23. Search and observation

24. Search examples
    // find all PDF files within this workspace, most recent first
    SELECT * FROM [nt:file]
    WHERE [jcr:mimeType] = 'application/pdf'
    ORDER BY [jcr:lastModified] DESC

    // find all content about Christmas within my blog
    /jcr:root/sites/myblog//*[jcr:contains(., 'Christmas')]

25. Search
    By default all content is indexed
        Configurable per repository
    Support for full text search
        Also binaries indexed with automatic text extraction
    Full access control of search results
    However:
        Limited join support/performance
        No facets or aggregate queries

26. Observation
    An observation listener can choose to receive events
        on changes of specified types
        on changes at or below a specified path
        on changes at nodes with specified identifiers
        on changes at nodes of specified types
    The events are delivered in asynchronous callbacks
        Remember the non-thread-safety of sessions!
    Often used to maintain a cache of expensive-to-compute data

27. Access control

28. Access control
    Fine-grained, ACL-based access control
    Applies to all content accesses
        Writes
        Reads
        Search
        Observation
        etc.
    Support for custom privileges
        e.g. an execute privilege

29. Persistence and clustering

30. Persistence managers
    (diagram: a repository with Workspace A, Workspace B, Workspace C and /jcr:system, backed by Persistence Managers 1 to 4)

31. Persistence alternatives
    Embedded Database PM: Derby, H2
    External Database PM: PostgreSQL, Oracle, etc.

32. Data store
    (diagram: a repository with Workspace A and /jcr:system, Persistence Manager 1, Persistence Manager 2, and a Data Store)

33. Data store alternatives
    File Data Store: Local FS, NFS
    Database Data Store: PostgreSQL, Oracle, etc.
    S3 Data Store: S3

34. Clustering
    (diagram: three Repository instances, each with a Persistence Manager, and one database: PostgreSQL, Oracle, etc.)

35. Deployment and configuration

36. Deployment packages
    jackrabbit-webapp
        basic web interface (still no content browser/editor)
        exposes the repository through JNDI, WebDAV, RMI
    jackrabbit-standalone
        runnable jar
        jackrabbit-webapp plus embedded Jetty
        basic tooling: backup/migration, CLI, etc.
    jackrabbit-jca
        designed for full J2EE environments
        support for managed transactions

37. Embedded deployment
    jackrabbit-core plus all dependencies
        Maven recommended
    slf4j used for logging
    Full control over the repository
    Extra work to make the repository externally manageable

38. Repository configuration
    repository.xml
        main repository configuration file
        security, clustering, data store, /jcr:system, etc.
    workspace.xml
        configuration of each workspace
        persistence manager, search index, etc.
        automatically created based on template in repository.xml
    indexing_configuration.xml
        optional, customizes the search index
    see http://jackrabbit.apache.org/jackrabbit-configuration.html

39. Questions?
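
Note on slide 17 (Namespaces): the slide says the repository keeps prefix -> URI mappings and that a session can override them locally. The sketch below tries to show why a local override is harmless: a prefixed name is only shorthand for an expanded name of the form {URI}localName, so two different prefixes bound to the same URI denote the same name. This is a toy model in plain Java, not the javax.jcr API. The class NamespaceSketch, its methods, and the app namespace with the URI http://example.com/app/1.0 are assumptions invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of prefix -> URI namespace mappings (hypothetical names,
 * NOT the javax.jcr API).
 */
class NamespaceSketch {

    /** Repository-wide mappings from the slide, plus one made-up "app" namespace. */
    static Map<String, String> repositoryMappings() {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("jcr", "http://www.jcp.org/jcr/1.0");
        mappings.put("nt", "http://www.jcp.org/jcr/nt/1.0");
        mappings.put("mix", "http://www.jcp.org/jcr/mix/1.0");
        mappings.put("xml", "http://www.w3.org/XML/1998/namespace");
        mappings.put("app", "http://example.com/app/1.0"); // assumption, not from the slides
        return mappings;
    }

    /** Expands a prefixed name such as "nt:file" to "{uri}file". */
    static String expand(String qualifiedName, Map<String, String> mappings) {
        int colon = qualifiedName.indexOf(':');
        if (colon < 0) {
            // No prefix: treated here as the empty namespace.
            return "{}" + qualifiedName;
        }
        String prefix = qualifiedName.substring(0, colon);
        String uri = mappings.get(prefix);
        if (uri == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return "{" + uri + "}" + qualifiedName.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> repository = repositoryMappings();

        // A session starts from the repository mappings and remaps one prefix locally.
        Map<String, String> session = new HashMap<>(repository);
        session.remove("app");
        session.put("a", "http://example.com/app/1.0");

        String fromRepository = expand("app:title", repository);
        String fromSession = expand("a:title", session);

        System.out.println(fromRepository);
        System.out.println(fromRepository.equals(fromSession)); // prints "true"
    }
}
```

In this model the stored identity of a name is its expanded form, which is why the slide can describe session-local remapping as safe while still noting that it is seldom used in practice.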