Improving Performance and Flexibility of Content Listings Using Criteria API

Post on 23-Jan-2017


Improving Performance and Flexibility of Content Listings Using Criteria API Nils Breunese

Public Broadcaster since 1926 The Netherlands

Online since 1994 Open-source CMS released in 1997

Using Magnolia since 2010 Still migrating websites

Tens of thousands of pages Multiple sites like that

Overview pages Lots of them

Thanks for the warning… Even 10 seconds would be way too long

WARN info.magnolia.module.cache.filter.CacheFilter -- The following URL took longer than 10 seconds (63969 ms) to render. This might cause timeout exceptions on other requests to the same URI.

Overview models Standard Templating Kit

Tracking back from the template newsOverview.ftl

    (...)
    [#assign pager = model.pager]
    [#assign newsList = cmsfn.asContentMapList(pager.pageItems)!]
    (...)

Constructing the pager AbstractItemListModel

    public STKPager getPager() throws RepositoryException {
        (...)
        return new STKPager(currentPageLink, getItems(), content);
    }

Four step pipeline AbstractItemListModel

    public Collection<Node> getItems() throws RepositoryException {
        List<Node> itemsList = search();     // step 1
        this.filter(itemsList);              // step 2
        this.sort(itemsList);                // step 3
        itemsList = this.shrink(itemsList);  // step 4
        return itemsList;
    }

Step 1a: Constructing the query TemplateCategoryUtil

    public static List<Node> getContentListByTemplateNames(...) {
        (...)
        StringBuffer sql = new StringBuffer(
            "select * from nt:base where jcr:path like '" + path + "/%'");
        (...add 'mgnl:template=' clauses...)
        (...add 'ORDER BY' clauses...)
        return getWrappedNodesFromQuery(sql.toString(), repository, maxResultSize);
    }

maxResultSize == Integer.MAX_VALUE

Step 1b: Executing the query TemplateCategoryUtil

    public static List<Node> getContentListByTemplateNames(...) {
        (...)
        NodeIterator items = QueryUtil.search(
            repository, sql.toString(), Query.SQL, NodeTypes.Content.NAME);
    }

Step 2: Filtering the item list STKDateContentUtil

    public static void filterDateContentList(...) {
        CollectionUtils.filter(itemsList, new Predicate() {
            @Override
            public boolean evaluate(Object object) {
                (...)
                return date.after(minDate) && date.before(maxDate);
            }
        });
    }

Step 3: Time to sort STKDateContentUtil

    public static void sortDateContentList(...) {
        Collections.sort(itemsList, new Comparator<Node>() {
            @Override
            public int compare(Node c1, Node c2) {
                (...)
                if (StringUtils.equals(sortDirection, ASCENDING)) {
                    return date2.compareTo(date1);
                }
                return date1.compareTo(date2);
            }
        });
    }

Step 4: Shrinking the list STKTemplatingFunctions

    public List<Node> cutList(List<Node> itemsList, final int maxResults) {
        if (itemsList.size() > maxResults) {
            return itemsList.subList(0, maxResults);
        }
        return itemsList;
    }

NewsOverviewModel passes Integer.MAX_VALUE, so shrink does effectively nothing in this case
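To make the in-memory nature of steps 2 to 4 concrete, here is a minimal sketch that expresses filter, sort and shrink as one Java stream pipeline. It is not the STK code: `InMemoryPipeline`, `Item` and `filterSortShrink` are hypothetical names, `Item` is a stand-in for a page node with a date, and "ascending" is assumed to mean earliest first.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class InMemoryPipeline {

    /** Hypothetical stand-in for a page node that carries a date property. */
    static final class Item {
        final String name;
        final Instant date;

        Item(String name, Instant date) {
            this.name = name;
            this.date = date;
        }
    }

    /**
     * Filters by an exclusive date range, sorts by date and cuts the list.
     * Every item in the input list is already loaded before this method runs.
     */
    static List<Item> filterSortShrink(List<Item> items, Instant minDate, Instant maxDate,
                                       boolean ascending, int maxResults) {
        Comparator<Item> byDate = Comparator.comparing((Item i) -> i.date);
        return items.stream()
                .filter(i -> i.date.isAfter(minDate) && i.date.isBefore(maxDate)) // step 2
                .sorted(ascending ? byDate : byDate.reversed())                   // step 3
                .limit(maxResults)                                                // step 4
                .collect(Collectors.toList());
    }
}
```

With `maxResults` set to `Integer.MAX_VALUE`, as in the note above, the `limit` step keeps everything that passed the filter.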

Step 5: Get the items from the pager STKPager

    public Collection getPageItems() {
        Collection subList = items;
        int offset = getOffset();
        if (count > 0) {
            int limit = maxResultsPerPage + offset;
            if (items.size() < limit) {
                limit = count;
            }
            subList = ((List) items).subList(offset, limit);
        }
        return subList;
    }

maxResultsPerPage is typically something like 20

When this becomes a problem We have multiple sites like this

    select * from nt:base where jcr:path like '/siteX/news/%'
    AND mgnl:template = 'standard-templating-kit:pages/stkNews'

20000 pages under website:/siteX/news

Four step pipeline returns STKPager with 20000 items (page nodes)

[#assign pager = model.pager]

[#assign newsList = cmsfn.asContentMapList(pager.pageItems)!]

STKPager returns list with 20 page nodes

19980 Node objects created, but not rendered
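The numbers above can be reproduced with a small simulation. This is only a sketch under stated assumptions: `MaterializationDemo` and its methods are hypothetical, and `loadItem` is a stand-in for whatever it costs to create one node object. It contrasts loading every match and cutting out a page afterwards with loading only the requested window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class MaterializationDemo {

    /** Counts how many item objects have been created. */
    static final AtomicInteger created = new AtomicInteger();

    /** Hypothetical stand-in for loading one node from the repository. */
    static String loadItem(int index) {
        created.incrementAndGet();
        return "news-" + index;
    }

    /** In-memory approach: load every match, then cut out one page. */
    static List<String> pageInMemory(int totalMatches, int offset, int limit) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < totalMatches; i++) {
            all.add(loadItem(i));
        }
        int from = Math.min(offset, all.size());
        int to = Math.min(offset + limit, all.size());
        return all.subList(from, to);
    }

    /** Query-side approach: only the requested window is ever loaded. */
    static List<String> pageAtSource(int totalMatches, int offset, int limit) {
        List<String> page = new ArrayList<>();
        int end = Math.min(offset + limit, totalMatches);
        for (int i = offset; i < end; i++) {
            page.add(loadItem(i));
        }
        return page;
    }
}
```

Called with 20000 matches and a page size of 20, both methods return the same 20 items, but the first creates 20000 objects and the second creates 20.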

A query could do all steps at once JCR queries are pretty flexible

Everything in a single JCR query Only 20 nodes returned

    SELECT * FROM nt:base
    WHERE jcr:path LIKE '/siteX/news/%'                            <- Search
    AND mgnl:template = 'standard-templating-kit:pages/stkNews'    <- Search
    AND jcr:created < cast('2016-06-07T00:00:00.000Z' AS DATE)     <- Filter
    ORDER BY date ASCENDING                                        <- Sort
    LIMIT 20 OFFSET 20                                             <- Paging
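The paging part of such a query comes down to turning a page number into a limit and an offset. A minimal sketch, assuming 1-based page numbers; `PageWindow` and `forPage` are hypothetical names. With the JCR 2.0 API, values like these would typically be handed to `Query.setLimit` and `Query.setOffset` rather than written into the query string.

```java
final class PageWindow {

    final long limit;
    final long offset;

    private PageWindow(long limit, long offset) {
        this.limit = limit;
        this.offset = offset;
    }

    /** Maps a page size and a 1-based page number to a limit/offset pair. */
    static PageWindow forPage(int pageSize, int pageNumberStartingFromOne) {
        if (pageSize < 1 || pageNumberStartingFromOne < 1) {
            throw new IllegalArgumentException("pageSize and page number must be >= 1");
        }
        // Use long arithmetic so large page numbers cannot overflow an int.
        long offset = (long) (pageNumberStartingFromOne - 1) * pageSize;
        return new PageWindow(pageSize, offset);
    }
}
```

For a page size of 20, page 1 gives offset 0 and page 2 gives offset 20, which corresponds to the LIMIT 20 OFFSET 20 window in the query above.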

Criteria API For those familiar with Hibernate/JPA

    Criteria criteria = JCRCriteriaFactory.createCriteria()
        .setBasePath("/siteX/news")                                      // Search
        .add(Restrictions.eq(
            "@mgnl:template", "standard-templating-kit:pages/stkNews"))  // Search
        .add(Restrictions.betweenDates("@jcr:created", minDate, maxDate)) // Filter
        .addOrder(Order.asc("date"))                                     // Sort
        .setPaging(20, 1);                                               // Paging
    ResultIterator<...> items = criteria.execute(session).getItems();

Criteria API for Magnolia CMS Magnolia module by Openmind

jcr-criteria https://github.com/vpro/jcr-criteria

Custom pager Only a single page worth of items

    public class VtkPager<T> extends STKPager {
        private final List<? extends T> items;
        private final int pageSize;
        private final int count;
        (...)
        @Override
        public List<? extends T> getPageItems() {
            return items;
        }
    }
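A pager that only holds one page of items can no longer derive the number of pages from the size of its list, so it needs the total match count separately. The following is a minimal standalone sketch of that idea, not the VtkPager implementation: `SinglePagePager` and its methods are hypothetical, and it does not extend STKPager.

```java
import java.util.List;

final class SinglePagePager<T> {

    private final List<T> pageItems;
    private final int pageSize;
    private final int totalCount;

    /**
     * @param pageItems  only the items of the current page
     * @param pageSize   maximum number of items per page
     * @param totalCount total number of matches across all pages
     */
    SinglePagePager(List<T> pageItems, int pageSize, int totalCount) {
        if (pageSize < 1 || totalCount < 0) {
            throw new IllegalArgumentException("pageSize must be >= 1 and totalCount >= 0");
        }
        this.pageItems = pageItems;
        this.pageSize = pageSize;
        this.totalCount = totalCount;
    }

    /** Returns the current page as-is; no slicing is needed. */
    List<T> getPageItems() {
        return pageItems;
    }

    /** Rounds up, so a partially filled last page still counts as a page. */
    int getNumberOfPages() {
        return (int) (((long) totalCount + pageSize - 1) / pageSize);
    }
}
```

The design trade-off is that the total count has to come from the query result, which makes the cost of obtaining that count part of every page render.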

Use it in your model classes VtkContentListModel (vpro)

    public abstract class VtkContentListModel ... {
        // Not final: the pager is assigned in execute(), not in a constructor.
        protected VtkPager<T> pager;

        @Override
        public String execute() {
            pager = createPager();
            return super.execute();
        }

        protected abstract VtkPager<T> createPager();
        (...)
    }

Concrete Example VtkNewsOverviewModel (vpro)

    @Override
    protected VtkPager<Node> createPager() {
        (...)
        AdvancedResult result = JCRCriteriaFactory.createCriteria()
            .setBasePath(path)
            .add(Restrictions.in("@mgnl:template", templates))
            .add(Restrictions.betweenDates("@jcr:created", minDate, maxDate))
            .addOrder(Order.asc("date"))
            .setPaging(itemsPerPage, pageNumberStartingFromOne)
            .execute(session);

        List<Node> items = new ArrayList<>();
        for (AdvancedResultItem item : result.getItems()) {
            items.add(item.getJCRNode());
        }
        int count = result.getTotalSize();
        return new VtkPager<>(link, items, content, itemsPerPage, count);
    }

Still this. Was it all for nothing? :o(

WARN info.magnolia.module.cache.filter.CacheFilter -- The following URL took longer than 10 seconds (63969 ms) to render. This might cause timeout exceptions on other requests to the same URI.

Example VtkNewsOverviewModel (vpro)

In the createPager() method shown above, the call to result.getTotalSize() takes 10-60+ seconds!

AdvancedResultImpl (jcr-criteria)

    @Override
    public int getTotalSize() {
        if (totalResults == null) {
            int queryTotalSize = -1;
            try {
                // jcrQueryResult instanceof JackrabbitQueryResult) {
                Method m = jcrQueryResult.getClass().getMethod("getTotalSize");
                queryTotalSize = (int) m.invoke(jcrQueryResult);
            } catch (InvocationTargetException | IllegalAccessException e) {
                LOG.error(e.getMessage(), e);
            } catch (NoSuchMethodException e) {
            }
            if (queryTotalSize == -1 && (itemsPerPage == 0 || applyLocalPaging)) {
                try {
                    totalResults = (int) jcrQueryResult.getNodes().getSize();
                } catch (RepositoryException e) {
                    // ignore, the standard total size will be returned
                }
            }
            if (queryTotalSize == -1) {
                totalResults = queryCounter.getAsInt();
            } else {
                totalResults = queryTotalSize;
            }
        }
        return totalResults;
    }

We end up here
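The reflective lookup in getTotalSize() follows a common pattern: call an optional method if the runtime type offers it, otherwise fall back to another way of counting. A minimal standalone sketch of that pattern, not the jcr-criteria code: `TotalSizeLookup`, `SizedResult` and `totalSize` are hypothetical names, and `SizedResult` only stands in for a result type that can report its own total.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.function.IntSupplier;

final class TotalSizeLookup {

    /** Hypothetical stand-in for a query result that reports its total size. */
    static final class SizedResult {
        private final int total;

        SizedResult(int total) {
            this.total = total;
        }

        public int getTotalSize() {
            return total;
        }
    }

    /**
     * Calls getTotalSize() reflectively when the result type has it and the
     * value is not -1; otherwise asks the fallback counter.
     */
    static int totalSize(Object queryResult, IntSupplier fallbackCounter) {
        try {
            Method m = queryResult.getClass().getMethod("getTotalSize");
            Object value = m.invoke(queryResult);
            if (value instanceof Integer && (Integer) value != -1) {
                return (Integer) value;
            }
        } catch (NoSuchMethodException e) {
            // The result type has no getTotalSize(); use the fallback below.
        } catch (IllegalAccessException | InvocationTargetException e) {
            // The method exists but could not be called; use the fallback below.
        }
        return fallbackCounter.getAsInt();
    }
}
```

Because the call is delegated, the caller inherits whatever the underlying method costs, which is why a slow total-size computation in the query engine shows up as a slow pager.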

jackrabbit-core 2.8.0 QueryResultImpl

    protected void getResults(long size) throws RepositoryException {
        (...)
        result = executeQuery(maxResultSize); // Lucene query
        (...)
        // Doesn’t use result.getSize(), call collectScoreNodes(...)
    }

    private void collectScoreNodes(...) {
        while (collector.size() < maxResults) {
            ScoreNode[] sn = hits.nextScoreNodes();
            (...)
            // check access
            if (isAccessGranted(sn)) {
                collector.add(sn);
            } else {
                invalid++;
            }
        }
    }

It used to be fast! https://issues.apache.org/jira/browse/JCR-3858

jackrabbit-core 2.10.0+ QueryResultImpl

    protected void getResults(long size) throws RepositoryException {
        (...)
        if (sizeEstimate) {
            numResults = result.getSize(); // Use count from Lucene
        } else {
            // do things the Jackrabbit 2.8.0 way
            (...)
        }
        (...)
    }
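The difference between the two code paths can be sketched as two counting strategies. This is an illustration under stated assumptions, not Jackrabbit code: `CountStrategies` is a hypothetical class, hits are plain integers, and the access check is an arbitrary predicate.

```java
import java.util.function.IntPredicate;

final class CountStrategies {

    /** Exact count: visits every hit and applies the access check to each. */
    static int exactCount(int[] hits, IntPredicate accessGranted) {
        int count = 0;
        for (int hit : hits) {
            if (accessGranted.test(hit)) {
                count++;
            }
        }
        return count;
    }

    /** Estimate: trusts the number of index hits and skips per-hit checks. */
    static int estimatedCount(int[] hits) {
        return hits.length;
    }
}
```

The exact count does work proportional to the number of hits, while the estimate is constant time. In this sketch the estimate can be higher than the exact count, because hits that would fail the access check are still included.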

Enable Jackrabbit’s 'sizeEstimate' Jackrabbit 2.10+

    <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
        (...)
        <param name="sizeEstimate" value="true"/>
    </SearchIndex>

Rendering times down to 1-2 seconds Bingo

Time for questions

Anyone?

Feel free to contact me

Nils Breunese
@breun
n.breunese@vpro.nl