
The digital magazine for enterprise developers

Issue #44 | April 2015 | presented by www.jaxenter.com

Speak ALL the languages!
Shutterstock's polyglot enterprise and Komodo's polyglot IDE

RethinkDB and HBase
A closer look at two database alternatives

Top performance fails
Five challenges in performance management

Polyglots do it better

©iStockphoto.com/epic11


Editorial


It’s an old and rather out-of-fashion motto of the United States. E pluribus unum, one out of many. Early twentieth-century ideals of the cultural melting pot may have failed in western society. But they may still work in IT. Developers will forever dream of one Holy Grail language that will rule them all. As nice as it sounds to be a full-stack developer, know-ing your entire enterprise’s technology inside out, from the ListView failure modes to the various components’ linguistic syntaxes – that’s a near superhuman talent. As much as all de-velopers would love to be fl uently multilingual, in practice it’s diffi cult to keep up. But instead it’s the unity of the enterprise itself that can create one whole out of many languages, not the individual developer.

In this issue, Chris Becker explains how Shutterstock's gradual evolution from one to many languages was central to the success of the company's technology, and how out of one language came many specialist development areas, in turn all unified by the enterprise. Meanwhile, for web developers looking for one tool to rule all their front-end languages, we've got a helpful guide to the polyglot IDE Komodo, which just unveiled its ninth release.

We’ve also got some useful introductions to HBase and RethinkDB (both big salary-earners according the recent Dice. com survey), as well as Vaadin web applications. And fi nally for anyone concerned with the speed of their website during high traffi c, we have a couple of valuable lessons about how to avoid performance failures.

Coman Hamilton, Editor

Index

Polyglot enterprises do it better: Shutterstock's multilingual stack (Chris Becker)

An introduction to polyglot IDE Komodo: One IDE to rule all languages (Nathan Rijksen)

HBase, the "Hadoop database": A look under the hood (Ghislain Mazars)

An introduction to building realtime apps with RethinkDB: First steps (Ryan Paul)

Performance fails and how to avoid them: The five biggest challenges in application performance management (Klaus Enzenhofer)

Separating UI structure and logic: Architectural differences in Vaadin web applications (Matti Tahvonen)

Model View ViewModel with JavaFX: A look at mvvmFX (Alexander Casall)


MARK YOUR CALENDAR! Business Design Centre, London, October 12–14th, 2015 – www.jaxlondon.com


Hot or Not


HTML6

If you're a web developer who doesn't check the news that often, make sure you're sitting down before reading this. The web community is furiously debating a radical new proposal for HTML6. The general idea is that the next version of HTML will be developed in a way that allows it to dynamically run single-page apps without JavaScript. Yes, an HTML that wants to compete with JavaScript. The community is torn, but proposal author Bobby Mozumder makes an interesting case, claiming that HTML needs to follow the "standard design pattern emerging via all the front-end JavaScript frameworks where content is loaded dynamically via JSON APIs." He's won over quite a few members of the community, but there's no telling if the W3C will ever lower its eyebrows to this request.

How banks treat technology

In a recent series of interviews on JAXenter, developers in finance have told us about the pros and cons of developing for banks. And while the salary is very definitely a major pro (banks pay 33 percent to 50 percent more, says HFT developer Peter Lawrey), the management-level attitude to technology is less than favourable. "Many financial companies see their development units as a necessary evil," says former NYSE programmer Dr. Jamie Allsop. Not only does technology come across as being second-class, but it's often impossible to drive innovation when working on old-school banking systems. "If it's new then it must be risky," dictates the typical financial ops attitude according to finance technology consultant Mashooq Badar.

Cassandra salaries

It's often considered impolite to ask another programmer how much they earn. But that doesn't mean colleagues, recruiters and employers aren't picturing an estimated annual salary hovering over your head while they pretend to listen to you. It turns out that right now, Cassandra, PaaS and MapReduce pros are the ones with the biggest dollar signs above their heads (according to the latest Dice.com survey). Anyone lucky enough to be an expert in this area (and living in the US) will be making an average of at least $127k per year. America's Java developers will be lucky to scrape anything above $100k, while poor JavaScript experts earn as little as $94k a year. We'd like to invite you to join us in performing the world's smallest violin concerto for the JS community – because seriously that's still a mighty shedload of cash to the kinds of people that write the texts you're reading [long sigh].

Publishing a book in IT

At some stage in their career, most developers will ponder the idea of publishing a book. And the appeal is understandable. It's a great milestone on your CV, developers in the community may look up to you and you'll be sure to make your parents proud (even if they don't have a clue what you're writing about). But don't expect to make big money (like those Cassandra developers, above) when you self-publish on Leanpub. Meanwhile you'll need to be dedicating much of your spare time to promoting the book. Before your eyes go lighting up with dollar signs at the thought of becoming a wealthy international superstar IT author, give yourself a quick reality check with a couple of Google searches on IT publishing.


Languages


Polyglot enterprises do it better

Shutterstock's multilingual stack

Being a multilingual, multicultural company doesn't just bring benefits on a level of corporate culture. Shutterstock search engineer Chris Becker explains why enterprises need to stop speaking just one language.

©iStockphoto.com/jamtoons

by Chris Becker

A technology company must consider the technology stack and programming language that it's built on. Everything else flows from those decisions: they will define the kinds of engineers who are hired and how they will fare.

During Shutterstock's early days, a team of Perl developers built the framework for the site. They chose Perl for the benefits of CPAN and the flexibility the language would bring. However, it opened up a new problem – we could only hire those familiar with Perl. We hired some excellent and skilled engineers, but many others were left behind because of their inexperience with and lack of exposure to Perl. In a way, it limited our ability to grow as a company.

But in recent years, Shutterstock has grown more "multilingual." Today, we have services written in Node.js, Ruby, and Java; data processing tools written in Python; a few of our sites written in PHP; and apps written in Objective-C.

Developers specialize in each language, but we communicate across different languages on a regular basis to debug, write new features, or build new apps and services. Getting from there to here was no easy task. Here are a few strategic decisions and technology choices that have facilitated our evolution:

Service-oriented Architectures



First, we built out all our core functionality into services. Each service could be written in any language while providing a language-agnostic interface through REST frameworks. This has allowed us to write separate pieces of functionality in the language most suited to it. For example, search makes use of Lucene and Solr, so Java made sense there. For our translation services, Unicode support is highly important, so Perl was the strongest option.

Common Frameworks

Between languages there are numerous frameworks and standards that have been inspired by or replicated from one another. When possible, we try to use one of those common technologies in our services. All of our services provide RESTful interfaces, and internally we use Sinatra-inspired frameworks for implementing them (Dancer for Perl, Slim for PHP, Express for Node etc.). For templating we use Django-inspired frameworks such as Template::Swig for Perl, Twig for PHP, and Liquid for Ruby. By using these frameworks we can flatten the learning curve when a developer jumps between languages.
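To give a flavour of what these Sinatra-style micro-frameworks look like, here is a minimal sketch of a JSON endpoint using Express for Node (the route, port and payload are invented for illustration; Dancer and Slim expose very similar routing APIs in Perl and PHP):

var express = require("express");
var app = express();

// A Sinatra-style route: HTTP verb + path pattern + handler.
app.get("/images/:id", function(req, res) {
  // A real service would look this up in a datastore.
  res.json({ id: req.params.id, status: "ok" });
});

app.listen(3000);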

Runtime Management

Often, the biggest obstacle blocking new developers is the technical bureaucracy needed to manage each runtime – managing library dependencies, environment paths, and all the command line settings and flags needed to do common tasks. Shutterstock simplifies all of that with Rockstack. Rockstack provides a standardized interface for building, running, and testing code in any of its supported runtimes (currently: Perl, PHP, Python, Ruby, and Java).

Not only does Rockstack give our developers a standard interface for building, testing, and running code, but it also supplies our build and deployment system with one standard set of commands for running those operations, for any language. Rockstack is used by our Jenkins cluster for running builds and tests, and our home-grown deployment system makes use of it for launching applications in dev, QA, and production.

Testing Frameworks

In order to create a standardized method for testing all the services we have running, we developed (and open sourced!) NTF (Network Testing Framework). NTF lets us write tests that hit special resources on our services' APIs to provide status information that shows the service is running in proper form. NTF supplements our collection of unit and integration tests by constantly running in production and telling us if any functionality has been impaired in any of our services.
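NTF itself is Shutterstock's own tool, so the following is not its actual API; it is only a rough sketch of the idea behind it: a check that hits a status resource on a running service and fails loudly if the response looks unhealthy (the URL and response fields are hypothetical):

var http = require("http");
var assert = require("assert");

// Hypothetical status resource exposed by one of the services.
http.get("http://localhost:8080/status", function(res) {
  var body = "";
  res.on("data", function(chunk) { body += chunk; });
  res.on("end", function() {
    var status = JSON.parse(body);
    // Fail loudly if the service does not report itself healthy.
    assert.equal(res.statusCode, 200);
    assert.equal(status.healthy, true);
    console.log("service is running in proper form");
  });
});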

Developer Meetups

We also want our developers to learn and evolve their skillsets. We host internal meetups for Shutterstock's Node developers, PHP developers, and Ruby developers where they can get fresh looks and feedback on code in progress. These meetups are a great way for them to continue professional development and also to meet others who are tackling similar projects. There's no technology to replace face-to-face communication, and great ideas and methods can come from it.

Openness

We post all the code for every Shutterstock site and service on our internal GitHub. Everyone can find what we've been working on. If you have an idea for a feature, you can fork off a branch and submit a pull request to the shepherd of that service. This openness goes beyond transparency; it encourages people to try new things and to see what others are implementing.

Our strategies and tools echo our mission. We want engineers to think with a language-agnostic approach to better work across multiple languages. It helps us to dream bigger and not get sidelined by limitations. As the world of programming languages becomes much more fragmented, it's becoming more important than ever from a business perspective to develop multilingual-friendly approaches.

We’ve come a long way since our early days, but there’s still a lot more we can do. We’re always reassessing our process and culture to make both the code and system more familiar to those who want to come work with us.

This has been adapted from a post that originally ran on the Shutterstock Tech blog. You can read the original here: bits.shutterstock.com/2014/07/21/stop-using-one-language/.

Chris Becker is the Principal Engineer of Search at Shutterstock, where he's worked on numerous areas of the search stack including the search platform, Solr, relevance algorithms, data processing, analytics, internationalization, and customer experience.

"The biggest obstacle blocking new developers is often the technical bureaucracy needed to manage each runtime."


An introduction to polyglot IDE Komodo

One IDE to rule all languages

The developer world is becoming "decidedly more polyglot", the Komodo CTO recently told JAXenter. To cater to this changing community, the multi-lingual IDE Komodo is steadily increasing its number of supported languages. Let's take a closer look at what it does best and how to get started.

by Nathan Rijksen

A few weeks ago, Komodo IDE 9 was released, featuring a host of improvements and features. Many of you have since downloaded the 21-day trial and have likely spent some time learning more about what this nimble beast can do. I thought now would be a good time to run you through some simple workflows and give you a general idea of how Komodo works. This is just a short introduction, and much more information can be found on the Komodo website under screencasts, forums and of course the documentation.

User Interface Quick Start

Figure 1 shows what you'll see when you first launch Komodo.

The first few icons on the toolbar are pretty self-explanatory, and the ones that aren't immediately self-explanatory are easily discovered by hovering your mouse over them. You'll find quick access to a lot of great Komodo features here: debugging, regex testing, source code control, macro recording/playback, etc. Of course these aren't all displayed by default; you would need a big screen to show all that. You can easily toggle individual buttons or button groups by right-clicking the toolbar and selecting "Customize". Three buttons you'll likely be using a lot are the pane toggles (Figure 2).

Clicking these allows you to toggle a bottom, left and right pane, each of which holds a variety of “widgets” that allow you to do anything from managing your project files to unit testing your project. You can customize the layout of these widgets by right-clicking their icon in the panel tab bar.

At the far right of the toolbar you’ll find a search field, dubbed Commando, which lets you easily search through your project files, tools, bookmarks, etc.

Type in your search query, and Commando will show you results in real-time. Select an entry and press "Enter" to open/activate it, or press tab/right arrow to "expand" the selection, allowing you to perform contextual actions on it. For example, you could rename a file right from Commando (Figure 3) using nothing but your keyboard. Commando doesn't need to be accessed from the toolbar – you can use a shortcut to launch it so you can also be 100 percent keyboard driven (CTRL + Shift + O on Windows and Linux, or CMD + Shift + O on Mac).


Figure 1: Komodo start screen

Figure 2: Pane buttons


Starting a Project

The first thing you'll want to do is start a new project. You don't need to use projects, but it's highly encouraged as it gives Komodo some context: "This is what I'm working with, and this is how I like working with this particular project."

To start a new project, simply click the “Project” menu and then select “New Project”. On Windows and Linux the menus are at the far right of the toolbar under the “burger menu” (Figure 4), as people tend to lovingly call it.

You can also start the project from the “Quick Launch” page which is visible if no other files are opened, or you can use Commando by searching for “New Project” and selecting the relevant result (it should show up first).

When creating a project, Komodo will ask you where to save your project; this will be the project source, and Komodo will create a single project file in this folder and one project folder that can hold your project-specific tools. You can separate your project from your source code by creating your project in the preferred location and afterwards modifying the project source in the "Project Preferences".

Once your project is created you might want to do just that: adjust some of the project-specific preferences. To do this, open your "Project" menu again and select "Project Preferences" (Figure 5), or use Commando to search for "Project Properties". There you can customize your project to your liking, changing its source, excluding files, using custom indentation, and more. It's all there.

If you’re working with a lot of different projects, you’ll definitely want to check out the Projects widget at the bottom of the Places widget. Open the left side pane and select the “Places” widget, then look at the bottom of the opened pane for “Projects”.

Figure 3: Commando

Figure 4: The "burger menu"

Figure 5: Project properties

Figure 6: Scope


Opening (and Managing) Files

You now have your project and are ready to start opening and editing some files. The most basic way of doing that is by using the "Places" widget in the left pane, but where's the fun in that? Again, you could use Commando to open your files… Simply launch Commando from your toolbar or via the shortcut and type in your filename. Commando by default searches across several "search scopes", so you may get results for tools, macros, etc. If you want to focus down on just your files you can hit the icon to the left of the search field to select a specific "search scope". In this case you'll want the Files scope (Figure 6).

You can define custom shortcuts to instantly launch a specific Commando scope, so you don't need to be using this scope menu with your mouse each time if that's not your preferred method.

Now that you've opened a couple of files, you may be starting to notice that your tab bar is getting a bit unwieldy. This is nothing new to editors, and a lot of programmers either deal with it or constantly close files that aren't immediately relevant anymore. Some editors get around this by giving you another way of managing opened files; luckily Komodo is one of these editors, and even goes a step further. We've already spoken about Commando a lot so I'll skip past the details and just say "There's an Opened Files search scope." A more UI-driven method is available through the "Open Files" widget (Figure 7), accessible under the left pane right next to the "Places" widget (you'll have to click the relevant tab icon at the top of the pane).

The Open Files widget allows you to, well, manage your open files. But more than that it allows you to manage how you manage open files – talk about meta! When you press the "Cog" icon you will be presented with a variety of grouping and sorting options, allowing you to group and sort your files the way you want. If you're comfortable getting a bit dirty with a custom JavaScript macro you can even create your own groups and sorting options.

Using Snippets

Many programmers use snippets (also referred to as Abbreviations in Komodo), and many editors facilitate this. Komodo again goes a step further by allowing very fine-tuned control of snippets. First let's just use a simple snippet though.

Open a new file for your preferred language. We’ll use PHP in our example (because so many people do, I’ll refrain from naming my personal favourite). Komodo comes with some pre-defined snippets you can use, but let’s define our own. Open your right pane and look for the “Samples” folder; these are the samples Komodo provides for you to get started with. I would suggest you cut and paste the “Abbreviations” folder residing in this folder to be at the root of your toolbox, as it provides a good starting structure. Once that’s done, head into Abbreviations | PHP; these are abbreviations that only get triggered when you are using a PHP file. Right-click the PHP folder and select Add | New Snippet.

Here you can write your snippet code. Let's write a "private function" snippet. Choose a name for your snippet: this will be the abbreviation that will trigger the snippet, and I'll use "prfunc" because I like to keep it short and simple. Then write in your code. You can use the little arrow menu to the right of the editor to inject certain dynamic values. Most relevant right now is the "tabstop" value, which tells Komodo you want your cursor to stop and enter something there. Figure 8 shows what my final snippet looks like.

You'll note I checked "Auto-Abbreviation". This will allow me to trigger the snippet simply by writing its abbreviation. The actual trigger for this is configurable, so you can instead have it trigger when you press Tab. Now we have our snippet and are ready to start using it. Simply write "prfunc".

Figure 7: Open files

Figure 8: Snippet


Next, let's talk about version control. Wait ... What? You were expecting more? No, that's it: you type "prfunc" and it will auto-trigger your snippet. It's that easy.

There are many other ways of triggering and customizing snippets (ahem, Commando again); you can even create snippets that use EJS so you can add your own black magic to the mix. And for use cases where even that isn't enough, you can create your own macros, which – aside from extending your editor – can be used to do anything from changing the UI font size to writing your own syntax checker, or overhauling the entire UI. You have direct access to the full Komodo API that is used to develop Komodo. That's a whole different topic though … One far too big to get into now, but be sure to have a look at all that the "Toolbox" offers you, because snippets are just the tip of the iceberg.

Previewing Changes

So, you've created some files, edited them (using snippets I hope – that took a while to write down, you know!) and now want to see the results. Rather than leaving Komodo for your terminal or your browser, why not just do it from inside Komodo?

Since it was a PHP file we were working on, you can simply launch your code by pressing the "Start or continue debugging" button in your toolbar. This will of course run your code through a debugger, but it's also useful just to run your code and see the output. You could skip the debugger altogether by using the Debug | Run Without Debugging menu. Assuming your PHP file actually outputs anything, Komodo will open the bottom pane and show you your code output. Output not what you expected it to be? Set a few breakpoints and jump right into the middle of your code (Figure 9).

You could take it even further and start editing variables while debugging, starting a REPL from the current breakpoint, etc., provided the language you are using supports this functionality.

I did say browser back there though, so what about previewing HTML files? Simply open your HTML file and hit the "Browser Preview" button in your toolbar. Select "In a Komodo Tab" for the "Preview Using" setting and customize the rest to your liking, then hit Preview. The preview will be rendered using Gecko. Komodo is built on Mozilla, so basically it's rendering using Firefox, right from Komodo.

With Komodo 9, you can now even preview Markdown files in real-time. Just open a Markdown file and hit the "Preview Markdown" button in your toolbar to start previewing (Figure 10).

Using Version Control

If you're a programmer, you probably use version control of some kind (and if not – you really should!), whether it be Git, Mercurial, SVN, Perforce or whatever else strikes your fancy. Komodo doesn't leave you hanging there; you can easily access a variety of VCS tasks right from inside Komodo (Figure 11).

Let's assume you already have a repository checked out/cloned. We'll go straight to the most basic and most frequently used VCS task – committing. Make your file edits, and simply hit the VCS toolbar button (you may need to enable this first by right-clicking the toolbar and hitting "Customize"). Then select "Commit", enter your commit message and fire it off. Komodo will show you the results (unless you disabled this) in the bottom pane.

Komodo IDE 9 even shows you the changes to files as you are editing them, meaning it will show what was added, edited and deleted since your last commit of that file (Figure 12).

Figure 9: Debug

Figure 10: Preview markdown

Figure 11: VCS


The left margin (with your line numbers) will become colored when edits are made. Clicking on one of these colored blocks will open a small pop-in showing what was changed and allowing you to revert the change or share the changeset via kopy.io. Kopy.io is a new tool created by the Komodo team, which serves as a modernized pastebin, implementing nifty new features you probably haven't seen elsewhere, such as client-side encrypted "kopies" and auto-sizing text to fit your browser window.

Customizing

You've now gone through a basic workflow and are starting to really warm up to Komodo (hopefully), but you wish your color scheme was light instead of dark, or the icon set is too monotone and grey for your taste, or ... or ... Again, Komodo's got you covered. Head into the Preferences dialog via Edit | Preferences, or Komodo | Preferences on OS X. Here you can customize Komodo to your heart's content.

Figure 12: Track changes

Figure 13: Preferences

The "Appearance" and "Color Scheme" sections are probably of particular interest to you. Note that by default Komodo hides all the advanced preferences that you likely wouldn't want to see unless you are a bit more familiar with Komodo. You can toggle these advanced preferences by checking the "Show Advanced" checkbox at the bottom left of your Preferences dialog (Figure 13).

Inevitably some of you will find something that simply isn't in Komodo though, because there isn't a single IDE out there that has ALL the things for EVERYONE. When this happens, the community and customizability of Komodo are there for you, so be sure to check out the variety of addons, macros, color schemes etc. at the Komodo Resources website, and if you want to get creative, share ideas or request features from the community, then head on over to the Komodo forums.

Just a Glimpse

Hopefully I've given you a glimpse of what Komodo has to offer, or at least given you an idea of how to get started with it. Komodo is a great IDE, even if you aren't in the market for a huge IDE, and the best thing about it is you don't need to buy/launch another IDE each time you want to work on another language. Komodo has you covered for a variety of languages and frameworks, including (but not limited to!) Python, PHP, Go, Ruby, Perl, Tcl, Node.js, Django, HTML, CSS and JavaScript. Enjoy!

Nathan Rijksen, Komodo developer, has web dev expertise and experience as a backend architect, application developer and database engineer, and has also worked with third-party authentication and payment modules. Nathan is a long time Komodo user and wrote multiple macros and extensions before joining the Komodo team.


Databases


HBase, the "Hadoop database"

A look under the hood

Even MongoDB has its limits. And where the favourite of the database world reaches its limits in scalability, that's just where HBase enters. Tech entrepreneur Ghislain Mazars shows us the strong points of this community-driven open-source database.

by Ghislain Mazars

HBase and the NoSQL Market

Among the myriad of NoSQL databases available on the market today, HBase is far from having a share comparable to market leader MongoDB. Easy to learn, MongoDB is the NoSQL darling of most application developers. The document-oriented database interfaces well with lightweight data exchange formats, typically JSON, and has become the natural NoSQL database choice for many web and mobile apps (Figure 1).

Where MongoDB (and more generally JSON databases) reaches its limits is for highly scalable applications requiring complex data analysis (the oft-denominated "data-intensive" applications). That segment is the sweet spot of column-oriented databases such as HBase. But even in that particular category, HBase has lately often been overlooked in favour of Cassandra. Quite a surprising turn of events, actually, as Facebook, the "creator" of Cassandra, ditched its own creation in 2011 and selected HBase as the database for its Messages application. We will come back to the technical differences between the two databases, but the main reason for Cassandra's remarkable comeback is to be found elsewhere.

Cassandra, the comeback kid of NoSQL databases

With Cassandra, we find a pattern common to most major NoSQL databases, i.e. the presence of a dedicated corporate sponsor. Just as with MongoDB (with MongoDB Inc., formerly called 10gen) and Couchbase (with Couchbase Inc.), the technical and market development of Cassandra is spearheaded by Datastax Inc. From a continued effort on documentation (Planet Cassandra) to the stewardship of the user community with meetups and summits, Datastax has been doing a remarkable job in waving high the Cassandra flag. These efforts have paid off, and Cassandra now holds the pole position among wide-column databases.

It is worth noting however that in the process, Cassandra has lost a lot of its open-source nature. 80 percent of the committers on the Apache project are from Datastax and the management features beloved by enterprise customers are proprietary and part of DSE (“DataStax Enterprise”). Going one step further, the integration with Apache Spark, the new whizz-kid of Big Data, is currently only available as part of DSE …

HBase, a community-driven open-source project

Unlike Cassandra, HBase very much remains a community-driven open-source project.

Figure 1: Relative adoption of NoSQL skills



No fewer than 12 companies are represented on the Apache project committee, and the three Hadoop distributors, Cloudera, Hortonworks and MapR, share the responsibility of marketing the database and supporting its corporate users.

As a result, HBase sometimes lacks the marketing firepower of one company betting its life on the product. Had that been the case, HBase would no doubt be at a 1.x release by now: while Hadoop made a big jump from 0.2x to 1.0 in 2011, HBase continued to move steadily in the 0.9x range! And the three companies including the database in their portfolio tend to privilege their other (more proprietary) offerings, and thus project a restrictive image of HBase.

In this context, it is quite an achievement that HBase occupies such an enviable place among NoSQL databases. It owes this position to its open-source community, strong installed base within web properties (Facebook, Yahoo, Groupon, eBay, Pinterest) and distinctive Hadoop connection. So in spite of, or maybe thanks to, its unusual model, HBase could still very much win… As Cassandra has shown in the last two to three years, things can move fast in this market. But we will come back to that later on; for now, let us take a more technical look at HBase.

Under the Hood

Hadoop implementation of Google's BigTable: HBase is an open-source implementation of BigTable as described in the 2005 paper from Google (http://research.google.com/archive/bigtable.html). Initially developed to store crawling data, BigTable remains the distributed database technology underpinning some of Google's most famous services, Google Docs and Gmail. Of course, as should be expected from a creation of Google, the wide-column database is super scalable and works on commodity servers. It also features extremely high read performance, ensuring for example that a Gmail user instantaneously retrieves all their latest emails.

Just like BigTable, HBase is designed to handle massive amounts of data and is optimized for read-intensive applications. The database is implemented on top of the Hadoop Distributed File System (HDFS) and takes advantage of its linear scalability and fault tolerance. But the integration with Hadoop does not stop at using HDFS as the storage layer: HBase shares the same developer community as Hadoop and offers native integration with Hadoop MapReduce. HBase can serve as either the source or the destination of MapReduce jobs. The benefit here is clear: there is no need for any data movement between batch MapReduce ETL jobs and the host operational and analytics database.

HBase schema design

HBase offers advanced features to map business problems to the data model, which makes it way more sophisticated than a plain key-value store such as Redis. Data in HBase is placed in tables, and the tables themselves are composed of rows, each of which has a rowkey.

The rowkey is the main entry point to the data: it can be seen as the equivalent of the primary key for a traditional RDBMS database. An interesting capability of HBase is that its rowkeys are byte arrays, so pretty much anything can serve as the rowkey. As an example, compound rowkeys can be created to mix different criteria into one single key, and optimize data access speed.

In pure key-value mode, a query on the rowkey will give back all the content of the row (or, to take a columnar view, all of its columns). But the query can also be much more precise, and specifically address (Figure 2):

• A family of columns
• A specific column, and as a result a cell, which is the intersection of a row and a column
• Or even a specific version of a cell, based on a timestamp

Figure 2: HBase schema design; source: Introduction to HBase Schema Design (Amandeep Khurana)

Combined, these different features greatly improve on the base key-value model. There is one constraint: the rowkey cannot be changed, and should thus be carefully selected at the design stage to optimize rowkey access or scans on a range of rowkeys. But beyond that, HBase offers a lot of flexibility: new columns can be added on the fly, all the rows do not need to contain the same columns (which makes it easy to add new attributes to an existing entity) and nested entities provide a way to define relationships within what otherwise remains a very flat model.
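HBase itself stores plain byte arrays rather than JavaScript objects, but as a purely conceptual sketch (all names and values invented), the logical model described above can be pictured like this:

// Conceptual only: one row, addressed by its rowkey, containing
// column families, columns, and timestamped cell versions.
var row = {
  rowkey: "user42|1437456000000",
  families: {
    profile: {                                    // column family
      name:  { 1437456000000: "Ada" },            // column -> { timestamp: value }
      email: { 1437456000000: "ada@example.com",
               1437455000000: "old@example.com" } // an older version of the same cell
    },
    stats: {
      logins: { 1437456000000: 17 }
    }
  }
};

// Addressing a cell then means: rowkey -> family -> column -> (optional) timestamp.
var latestEmail = row.families.profile.email[1437456000000];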

Cool Features of HBase

Sorted rowkeys: Manipulation of HBase data is based on three primary methods: Get, Put, and Scan. For all of them, access to data is done by row and more specifically according to the rowkey. Hence the importance of selecting an appropriate rowkey to ensure efficient access to data. Usually, the focus will be on ensuring smooth retrieval of data: HBase is designed for applications requiring fast read performance, and the rowkey typically closely aligns with the application's access patterns.

As scans are done over a range of rows, HBase lexicographically orders rows according to their rowkeys. Using these "sorted rowkeys", a scan can be defined simply from its start and stop rowkeys. This is extremely powerful for getting all relevant data in one single database call: if we are only interested in the most recent entries for an application, we can concatenate a timestamp with the main entity id to easily build an optimized request. Another classical example relates to the use of geo-hashed compound rowkeys to immediately get a list of all the nearby places for a request on a geographic point of interest.
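As a purely illustrative sketch (plain string handling, not the HBase client API; the entity id, separator and padding width are arbitrary choices), a compound rowkey and the matching scan boundaries could be built like this:

// Zero-pad the timestamp so that lexicographic order matches numeric order.
function pad(ts, width) {
  var s = String(ts);
  while (s.length < width) { s = "0" + s; }
  return s;
}

// Compound rowkey: entity id first, then timestamp.
function rowkey(entityId, timestampMs) {
  return entityId + "|" + pad(timestampMs, 13);
}

// All rows for "sensor-7" sort next to each other, so one scan over
// [startRow, stopRow) returns that sensor's entries in a single call.
var startRow = rowkey("sensor-7", 0);
var stopRow  = rowkey("sensor-7", 9999999999999);

console.log(rowkey("sensor-7", 1437456000000)); // "sensor-7|1437456000000"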

Control on data sharding

In selecting the rowkey, it is important to keep in mind that the rowkey strongly influences the data sharding. Unlike traditional RDBMS databases, HBase provides the application developer with control over the physical distribution of data across the cluster. Column families also have an influence (all column members of a family share the same prefix), but the primary criterion for ensuring data is evenly distributed across the Hadoop cluster is the rowkey (data is sorted in ascending order by rowkey, column family and finally column key). As rowkeys determine the sort order of a table's rows, each region in the table ends up being responsible for the physical storage of a part of the rowkey space.

Such an ability to perform physical-level tuning is a bit unusual in the database world nowadays, but immensely powerful if the application has a well-defined access pattern. In such cases, the application developer will be able to guide how the data is spread across the cluster and avoid any hotspotting by skillfully selecting the rowkey. And, at the end of the day, disk access speed matters from an application usability perspective, so it is really good to have some control over it!

Strong consistency

In its overall design, HBase tends to favour consistency over availability. It even supports ACID-level semantics on a per-row basis. This of course has an impact on write performance, which will tend to be slower than in comparable, less strictly consistent databases. But again, typical use cases for HBase are focused on high read performance.

Overall, the trade-off plays in favour of the application developer, who will have the guarantee that the datastore always (vs eventually...) delivers the right value of the data. In effect, the choice of delivering strong consistency frees the application developer from having to implement cumbersome mechanics at the application level to mimic such a guarantee. And it is always best when the application developer can focus on the business logic and user experience vs the plumbing ...

What's next for HBase?

In the first section, we had a look at HBase's position in the wider NoSQL ecosystem, and vis-à-vis its most direct competitor, Cassandra. In our second and third sections, we reviewed the key technical characteristics of HBase, and highlighted some key features that make it stand out from other NoSQL databases. In this final section, we will discuss recent initiatives building on these capabilities and the chances of HBase becoming a mainstream operational database in a Hadoop-dominated environment.

Support for SQL with Apache Phoenix

Until recently, HBase did not offer any kind of SQL-like interaction language. That limitation is now over with Apache Phoenix, an open-source initiative for ad hoc querying of HBase.

Phoenix is an SQL skin for HBase, and provides a bridge between HBase and a relational model and approach to manipulating data. In practice, Phoenix compiles SQL queries to native HBase calls using another recent novelty of HBase, coprocessors. Unlike standard Hadoop SQL tools such as Hive, Phoenix can both read and write data, making it a more generic and complete HBase access tool.

Further integration with Hadoop and Spark

Over time, Hadoop has evolved from being mainly an HDFS + MapReduce batch environment to a complete data platform. An essential part of that transformation has been the advent of YARN, which provides a shared orchestration and resource management service for the different Hadoop components.

“HBase tends to favour consistency over availability.”


With the delivery of project Slider at the end of 2014, HBase cluster resource utilisation can now be "controlled" from YARN, making it easier to run data processing jobs and HBase on the same Hadoop cluster.

With a different spin, the ongoing integration work between HBase and Spark also contributes to the unification of database operations and analytic jobs on Hadoop. Just as for MapReduce, Spark can now utilize HBase as both a data source and a target. With nearly two-thirds of users loading data into Spark via HDFS, HBase is the natural database to host low-latency, interactive applications from within a Hadoop cluster. Advanced analytics provided by Spark can be fed back directly into HBase, delivering a closed-loop system, fully integrated with the Hadoop platform (Figure 3).

Final thoughts

With Hadoop moving from exploratory analytics to operational intelligence, HBase is set to further benefit from its position as the "Hadoop database". The imperative of limiting data movements will play strongly in its favour as enterprises start building complete data pipelines on top of their Hadoop "data lake".

In parallel, HBase is a strong contender for emerging use cases such as the management of IoT-related time series data. Incidentally, the recent launch by Microsoft of an HBase as a Service offering on Azure should be read in that context.

For these reasons, there is no doubt that HBase will continue to grow steadily over the next few years. Still, the opportunity is there for more, and for HBase to have a much bigger impact on the enterprise market. MapR has in this perspective recently made a promising move by incorporating its HBase-derived MapR-DB in its free community edition. For their part, Hortonworks and Cloudera have been active on the essential integrations with Slider and Spark. Now is the time for the HBase community and vendors to move to the next stage, and drive a rich enterprise roadmap for the "Hadoop database", to make HBase sexy and attractive for mainstream enterprise customers!

Ghislain Mazars is a tech entrepreneur and founder of Ubeeko, the company behind HFactory, delivering the application stack for Hadoop and HBase. He is fascinated by the wave of disruption brought by data-driven businesses, and the underlying big data technologies underpinning this shift.

Figure 3: Spark Hadoop integration


An introduction to building realtime apps with RethinkDB

First steps

Built for scalability across multiple machines, the JSON document store RethinkDB is a distributed database that uses an easy query language. Here's how to get started.

by Ryan Paul

RethinkDB is an open source database for building realtime web applications. Instead of polling for changes, the developer can turn a query into a live feed that continuously pushes updates to the application in realtime. RethinkDB's streaming updates simplify realtime backend architecture, eliminating superfluous plumbing by making change propagation a native part of your application's persistence layer.

In addition to offering unique features for realtime application development, RethinkDB also benefits from some useful characteristics that contribute to a pleasant developer experience. RethinkDB is a schemaless JSON document store that is designed for scalability and ease of use, with easy sharding, support for distributed joins, and an expressive query language.

This tutorial will demonstrate how to build a realtime web application with RethinkDB and Node.js. It will use Socket.io to convey live updates to the frontend. If you would like to follow along, you can install RethinkDB or run it in the cloud.

First steps with ReQL

The RethinkDB Query Language (ReQL) embeds itself in the programming language that you use to build your application. ReQL is designed as a fluent API, a set of functions that you can chain together to compose queries.

Before we start building an application, let's take a few minutes to explore the query language. The easiest way to experiment with queries is to use RethinkDB's administrative console, which typically runs on port 8080. You can type RethinkDB queries into the text field on the Data Explorer tab and run them to see the output. The Data Explorer provides auto-completion and syntax highlighting, which can be helpful while learning ReQL.

By default, RethinkDB creates a database named test. Let's start by adding a table to the test database:

r.db("test").tableCreate("fellowship")

Now, let’s add a set of nine JSON documents to the table (Listing 1).

When you run the command above, the database will output an array with the primary keys that it generated for all of the new documents. It will also tell you how many new records it successfully inserted. Now that we have some records in the database, let's try using ReQL's filter command to fetch the fellowship's hobbits:

r.table("fellowship").filter({species:"hobbit"})

The filter command retrieves the documents that match the provided boolean expression. In this case, we specifically want documents in which the species property is equal to "hobbit".

Listing 1

r.table("fellowship").insert([
  { name: "Frodo",   species: "hobbit" },
  { name: "Sam",     species: "hobbit" },
  { name: "Merry",   species: "hobbit" },
  { name: "Pippin",  species: "hobbit" },
  { name: "Gandalf", species: "istar" },
  { name: "Legolas", species: "elf" },
  { name: "Gimili",  species: "dwarf" },
  { name: "Aragorn", species: "human" },
  { name: "Boromir", species: "human" }
])



You can chain additional commands to the query if you want to perform more operations. For example, you can use the following query to change the value of the species property for all hobbits:

r.table("fellowship").filter({species: "hobbit"})
  .update({species: "halfling"})

ReQL even has a built-in HTTP command that you can use to fetch data from public web APIs. In the following example, we use the HTTP command to fetch the current posts from a popular subreddit. The full query retrieves the posts, orders them by score, and then displays several properties from the top five entries:

r.http("http://www.reddit.com/r/aww.json")("data")("children")("data")
  .orderBy(r.desc("score")).limit(5).pluck("score", "title", "url")

As you can see, ReQL is very useful for many kinds of ad hoc data analysis. You can use it to slice and dice complex JSON data structures in a number of interesting ways. If you’d like to learn more about ReQL, you can refer to the API reference documentation, the ReQL introduction on the RethinkDB website, or the RethinkDB cookbook.

Use RethinkDB in Node.js and Express

Now that you're armed with a basic working knowledge of ReQL, it's time to start building an application. We're going to start by looking at how you can use Node.js and Express to make an API backend that serves the output of a ReQL query to your end user.

The rethinkdb module in npm provides RethinkDB's official JavaScript client driver. You can use it in a Node.js application to compose and send queries. The following example shows how to perform a simple query and display the output (Listing 2).

The connect method establishes a connection to RethinkDB. It returns a connection handle, which you provide to the run command when you want to execute a query. The example above finds all of the halflings in the fellowship table and then displays their respective JSON documents in your console. It uses promises to handle the asynchronous flow of execution and to ensure that the connection is properly closed when the operation completes.

Let’s expand on the example above, adding an Express server with an API endpoint that lets the user fetch all of the fellowship members of the desired species (Listing 3).

If you have previously worked with Express, the code above should look fairly intuitive. The final path segment in the URL route represents a variable, which we pass to the filter command in the ReQL query in order to obtain just the desired documents. After the query completes, the application relays the JSON output to the user. If the query fails to complete, then the application will return status code 500 and provide the error.

Realtime updates with changefeeds

RethinkDB is designed for building realtime applications. You can get a live stream of continuous query updates by appending the changes command to the end of a ReQL query. The changes command creates a changefeed, which will give you a cursor that receives new records when the results of the query change. The following code demonstrates how to use a changefeed to display table updates (Listing 4).

The cursor.each callback executes every time the data within the fellowship table changes. You can test it for yourself by making an arbitrary change. For example, we can remove Boromir from the fellowship after he is slain by orcs:

Listing 2

var r = require("rethinkdb");

r.connect().then(function(conn) {
  return r.db("test").table("fellowship")
    .filter({species: "halfling"}).run(conn)
    .finally(function() { conn.close(); });
}).then(function(cursor) {
  return cursor.toArray();
}).then(function(output) {
  console.log("Query output:", output);
}).error(function(err) {
  console.log("Failed:", err);
});

Listing 3

var app = require("express")();
var r = require("rethinkdb");

app.listen(8090);
console.log("App listening on port 8090");

app.get("/fellowship/species/:species", function(req, res) {
  r.connect().then(function(conn) {
    return r.db("test").table("fellowship")
      .filter({species: req.params.species}).run(conn)
      .finally(function() { conn.close(); });
  })
  .then(function(cursor) { return cursor.toArray(); })
  .then(function(output) { res.json(output); })
  .error(function(err) { res.status(500).json({err: err}); });
});

Listing 4

r.connect().then(function(c) {
  return r.db("test").table("fellowship").changes().run(c);
}).then(function(cursor) {
  cursor.each(function(err, item) {
    console.log(item);
  });
});


r.table("fellowship").filter({name:"Boromir"}).delete()

When the query removes Boromir from the fellowship, the demo application will display the following JSON data in stdout (Listing 5).

When changefeeds provide update notifications, they tell you the previous value of the record and the new value of the record. You can compare the two in order to see what has changed. When existing records are deleted, the new value is null. Similarly, the old value is null when the table receives new records.
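A small sketch of how a consumer might classify these notifications, using only the old_val/new_val shape shown above (the function name is invented):

function classifyChange(item) {
  // Insert: nothing existed before this notification.
  if (item.old_val === null) { return "created: " + item.new_val.name; }
  // Delete: nothing exists after this notification.
  if (item.new_val === null) { return "deleted: " + item.old_val.name; }
  // Otherwise an existing document was updated.
  return "updated: " + item.old_val.name + " -> " + item.new_val.name;
}

// Could be used inside the changefeed cursor from Listing 4:
// cursor.each(function(err, item) { console.log(classifyChange(item)); });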

The changes command currently works with the following kinds of queries: get, between, filter, map, orderBy, min, and max. Support for additional kinds of queries, such as group operations, is planned for the future.
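For instance, attaching changes to a get or a filter query looks like this (illustrative queries against the fellowship table used above; someId stands in for a document's primary key):

// Changefeed on a single document (someId is a placeholder primary key):
r.table("fellowship").get(someId).changes()

// Changefeed on a filtered set of documents:
r.table("fellowship").filter({species: "halfling"}).changes()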

A realtime scoreboard

Let's consider a more sophisticated example: a multiplayer game with a leaderboard. You want to display the top five users with the highest scores and update the list in realtime as it changes. RethinkDB changefeeds make that easy. You can attach a changefeed to a query that includes the orderBy and limit commands. Whenever the scores or overall composition of the list of top five users changes, the changefeed will give you an update.

Before we get into how you set up the changefeed, let’s start by using the Data Explorer to create a new table and populate it with some sample data (Listing 6).

Creating an index helps the database sort more efficiently on the specified property – which is score in this case. At the present time, you can only use the orderBy command with changefeeds if you order on an index.

To retrieve the current top five players and their scores, you can use the following ReQL expression:

r.db("test").table("scores").orderBy({index: r.desc("score")}).limit(5)

We can add the changes command to the end to get a stream of updates. To get those updates to the frontend, we will use Socket.io, a framework for implementing realtime messaging between server and client. It supports a number of transport methods, including WebSockets. The specifics of Socket.io usage are beyond the scope of this article, but you can learn more about it by visiting the official Socket.io documentation.

The code in Listing 7 uses io.sockets.emit to broadcast the updates from a changefeed to all connected Socket.io clients.

On the frontend, you can use the Socket.io client library to set up a handler that receives the update event:

var socket = io.connect();
socket.on("update", function(data) {
  console.log("Update:", data);
});

That’s a good start, but we need a way to populate the initial list values when the user first loads the page. To that end, let’s extend the server so that it broadcasts the current leaderboard over Socket.io when a user first connects (Listing 8).

The application uses the same underlying ReQL expression in both cases, so we can store it in a variable for easy reuse. ReQL’s method chaining makes it highly conducive to that kind of composability.

To wrap up the demo, let’s build a complete frontend. To keep things simple, I’m going to use Polymer’s data binding system. Let’s start by defining the template:

<template id="scores" is="auto-binding">
  <ul>
    <template repeat="{{user in users}}">
      <li><strong>{{user.name}}:</strong> {{user.score}}</li>
    </template>
  </ul>
</template>

Listing 6

r.db("test").tableCreate("scores")
r.table("scores").indexCreate("score")
r.table("scores").insert([
  {name: "Bill", score: 33},
  {name: "Janet", score: 42},
  {name: "Steve", score: 68}
  ...
])

Listing 7

var sockio = require("socket.io");
var app = require("express")();
var r = require("rethinkdb");

var io = sockio.listen(app.listen(8090), {log: false});
console.log("App listening on port 8090");

r.connect().then(function(conn) {
  return r.table("scores").orderBy({index: r.desc("score")})
    .limit(5).changes().run(conn);
}).then(function(cursor) {
  cursor.each(function(err, data) {
    io.sockets.emit("update", data);
  });
});

Listing 5
{
  new_val: null,
  old_val: {
    id: '362ae837-2e29-4695-adef-4fa415138f90',
    name: 'Boromir',
    species: 'human'
  }
}


It uses the repeat attribute to insert one li tag for each user. The contents of the li tag display the user’s name and their current score. Next, let’s write the JavaScript code (Listing 9).

The handler for the leaders event simply takes the data from the server and assigns it to the template variable that stores the users. The update handler is a bit more complex. It finds the entry in the leaderboard that correlates with the old_val and then it replaces it with the new data.

When the score changes for a user who is already in the leaderboard, the handler simply replaces the old record with a new one that carries the updated number. When a user in the leaderboard is displaced by one who wasn't there previously, it replaces one user's record with that of another. The code in Listing 9 handles both cases properly.

Of course, the changefeed updates don't help us maintain the actual order of the users. To remedy that problem, we simply sort the user array after every update. Polymer's data binding system will ensure that the actual DOM representation always reflects the desired order.

Now that the demo application is complete, you can test it by running queries that change the scores of your users. In the Data Explorer, you can try running something like:

r.table("scores").filter({name: "Bill"}) .update({score: r.row("score").add(100)})

When you change the value of the user’s score, you will see the leaderboard update to reflect the changes.
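You can also insert a brand-new player with a high score to watch one leaderboard entry displace another; the name and score here are invented for the example:

r.table("scores").insert({name: "Frodo", score: 900})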

Next steps

Conventional databases are largely designed around a query/response workflow that maps well to the web's traditional request/response model. But modern technologies like WebSockets make it possible to build applications that stream updates in realtime, without the latency or overhead of HTTP requests.

RethinkDB is the first open source database that is designed specifically for the realtime web. Changefeeds offer a way to build queries that continuously push out live updates, obviating the need for routine polling.

To learn more about RethinkDB, check out the official documentation. The introductory ten-minute guide is a good place to start. You can also check out some RethinkDB demo applications, which are published with complete source code.

Listing 9
var scores = document.querySelector("#scores");
var socket = io.connect();

socket.on("leaders", function(data) {
  scores.users = data;
});

socket.on("update", function(data) {
  for (var i in scores.users)
    if (scores.users[i].id === data.old_val.id) {
      scores.users[i] = data.new_val;
      scores.users.sort(function(x, y) { return y.score - x.score });
      break;
    }
});

Listing 8
var getLeaders = r.table("scores").orderBy({index: r.desc("score")}).limit(5);

r.connect().then(function(conn) {
  return getLeaders.changes().run(conn);
}).then(function(cursor) {
  cursor.each(function(err, data) {
    io.sockets.emit("update", data);
  });
});

io.on("connection", function(socket) {
  r.connect().then(function(conn) {
    return getLeaders.run(conn)
      .finally(function() { conn.close(); });
  })
  .then(function(output) {
    socket.emit("leaders", output);
  });
});

Ryan Paul is a developer evangelist at RethinkDB. He is also a Linux enthusiast and open source software developer. He was previously a contributing editor at Ars Technica, where he wrote articles about software development.


by Klaus Enzenhofer

Whether it was on Black Friday, Cyber Monday or just during general Christmas shopping, this year's holidays have proven that too many online shops were far from well prepared for big traffic on their website or mobile offering. The impact of end-user experience on the business as a whole is still widely underestimated. Application performance management (APM) has come a long way in a few short years, but despite the numerous solutions available on the market, many businesses still struggle with fundamental problems.

With the next stressful situations that will affect company applications in view, business and IT professionals need to evolve their APM strategies to successfully navigate multiple channels in a multi-connected world. Optimizing application performance to deliver high-quality, frictionless user experiences across all devices and all channels isn't easy, especially if you're struggling with these five challenges:

1. Sampling
Looking at an aggregate of what traffic analytics tell you about daily, weekly and monthly visits isn't enough.

What is good news for sales can be bad news for IT. Sudden spikes in application usage need plenty of preparation, so before you unwittingly make any performance no-nos, here are the five areas where you might be slipping up.

©iStockphoto.com/enjoynz

The five biggest challenges in application performance management

Performance fails and how to avoid them


Counting on a sampling of what users experience is just as risky. Having only a partial view of what is happening across your IT systems and applications is like trying to drive a car while blindfolded.

Load testing is essential and an important part of preparing for peak events like Black Friday or Christmas, but it is no substitute for real user monitoring. To ensure a good customer journey for every visitor, you need a combination of methods: load testing, synthetic monitoring AND real user monitoring. Relying on sampling alone not only limits your understanding of what's happening across the app delivery chain, it also leads to the next major scare that organizations face.

2. Lessons learned about performance issues
It's Black Friday at 11 a.m., the phone rings and your boss screams: "Is our site down? Why are transactions slowing to a crawl? The call center is getting overwhelmed with customers asking why they can't check out online – fix it!" This is the nightmare scenario that plays out far too often, but it doesn't need to be that way.

For the best results in performance and availability, continuous real user monitoring of all transactions – 24 hours a day, 7 days a week – is a must. Only this ensures you see any and all issues as they come up, before customers are affected, and only this gives you the ability to respond immediately and head off a heart-stopping call about problems that should have been avoided. If your customers are your "early warning system", they will be frustrated and likely start venting on social media – which can be incredibly damaging to your business's reputation. Frustrated customers will move to a competitor and revenue will be lost.

3. Problems identified, but no explanation
Suppose you and your team can manage the first two challenges without much trouble. Now you face the next major hurdle: application performance monitoring shows you there's a problem, but you can't pinpoint the exact cause. Combing through waterfall charts and logs – especially while racing against the clock to fix a problem – can feel like looking for needles in haystacks. No solution is in sight and the hurdle seems insurmountable. When every minute can mean tens of thousands of dollars in lost revenue, the old adage "time is money" is likely to be ringing in your ears.

But your IT doesn’t just need more data, it needs trans-parency from the end users into the data center, into the applications and deep to the level of the individual code line. It needs a look through a magnifying glass with a new generation APM solution. Today, synthetic monitoring em-powers businesses to detect, classify, identify and gather information on root causes of performance issues, instant triage, problem ranking and cause identification. “Smart analytics” reduces hours of manual troubleshooting to a matter of seconds. Not all APM tools are covering a deep-dive analysis, so you need to test and check all your impor-tant needs.

4. Third parties – the unknown stars
You are flying blind if you can't measure the impact of integrated third-party services and if you don't have control over their SLA compliance. Modern applications execute code on diverse edge devices, often calling elements from a variety of third-party services well beyond the view of traditional monitoring systems. Sure, third-party services can improve end-user experiences and deliver functionality faster than standalone applications, but they have a dark side: they can increase complexity and page weight and decrease site performance to the point of compromising the end-user experience.

Not only that: when a third-party service goes down – whether it's a Facebook "like" button, the "cart" in an online shop, an ad or web analytics – IT is often faced with performance issues that are not its fault and not within its view. Trouble is inevitable if it cannot explain the reason for poor performance or a crash on the website, and the frustration will affect not only your end users, but also your IT team.

5. The cloud – performance in the dark
A global survey of 740 senior IT professionals found that nearly 80 percent of respondents fear that cloud providers hide performance problems. Additionally, 63 percent indicated a need for more meaningful and granular SLA metrics, geared toward ensuring the continuous delivery of a high-quality end-user experience.

Say you've done great work preparing an upcoming major sales campaign and you are confident that all your effort will ensure your websites withstand the rush. But when the big day comes, it turns out that the load testing you did with your CDN isn't playing out as predicted – because the CDN is getting hit with peak demand that wasn't reflected while it was in test mode. Inadequate tracking and real-time response show exactly how a lack of visibility can destroy any plan to make big money from the sales event.

Whether you’ve launched a new app in a public cloud, or in your virtualized data center, full visibility across all cloud and on premise tiers – in one pane of glass – is the only way to maintain control. In this way, you’ll be able to detect regressions automatically and identify root cause in minutes. Reflect and consider these APM best practices in your daily job. The next shopping season is coming sooner than expected.

Klaus Enzenhofer is a Technology Strategist and the Lead of the Dynatrace Center of Excellence Team.


by Matti Tahvonen

As expressed by Pete Hunt (Facebook, React JS) at the JavaOne 2014 Web Framework Smackdown, if you were to create a UI toolkit from scratch, it would look nothing like a DOM. Web technologies were designed for rich text presentation, not application development. Markup-based presentation has proven superior for more static content like websites, but applications are a different story. It is no accident that in the early days of graphical UIs, frameworks settled on "component-based" libraries. Those UI libraries have evolved over decades, but the basic concept of a component-based UI framework is still the most powerful way to create applications.

And yet Swing, SWT, Qt and similar desktop UI frameworks have one major problem compared to web apps: they require you to install special software on the client machine. As we have all learned during the internet era, this can be a big problem. Today's users work with lots of different applications, and installing all of them (and especially maintaining them) becomes a burden for your IT department.

Browser plugins like Java’s Applet/Java WebStart support (and Swing or JavaFX) and Flash are the traditional work-arounds to avoid installing software locally for workstations. But famous security holes in these, especially with outdated software, may become a huge problem and your IT depart-ment will nowadays most likely be against installing any kind of third party browser plugins. For them it is much easier

to just maintain one browser application. This is one of the fundamental reasons why pure web apps are now conquering even the most complex application domains.

Welcome to the wonderful world of web apps
Even for experienced desktop developers it may be a huge jump from the desktop world to web development. Developing web applications is much trickier than developing basic desktop apps. Lots of things complicate matters: client-server communication in its many forms (basic HTTP, Ajax-style requests, long polling, WebSockets etc.), the markup language and CSS used for presentation, and new programming languages on the client side. The fact is that, even with the most modern web app frameworks, web development is not as easy as building desktop apps.

Vaadin Framework is probably the closest thing to component-based Swing UI development in the mainstream web app world. Vaadin is a component-based UI library that tries to make web development as easy as traditional desktop development, maximizing developers' productivity and the quality of the produced end user experience. In a Vaadin application the actual UI logic, written by you, lives in the server's JVM. Instead of browser plugins, Vaadin has a built-in "thin client" that renders the UI efficiently in browsers. The highly optimized communication channel sends only what is really visible on the user's screen to the client. Once the initial rendering has been done, only deltas, in both directions, are transferred between the client and the server.
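To give a feel for this programming model, here is a minimal sketch of a server-side UI written against the Vaadin 7 era API; the framework classes are real, but the example UI itself is invented for illustration:

import com.vaadin.server.VaadinRequest;
import com.vaadin.ui.Button;
import com.vaadin.ui.Notification;
import com.vaadin.ui.TextField;
import com.vaadin.ui.UI;
import com.vaadin.ui.VerticalLayout;

// A minimal server-side UI: the components and the click listener live in
// the server's JVM; the built-in thin client only renders them in the browser.
public class GreetingUI extends UI {

  @Override
  protected void init(VaadinRequest request) {
    final TextField name = new TextField("Your name");
    Button greet = new Button("Greet", new Button.ClickListener() {
      @Override
      public void buttonClick(Button.ClickEvent event) {
        Notification.show("Hello " + name.getValue());
      }
    });
    setContent(new VerticalLayout(name, greet));
  }
}

The button click travels to the server as a small delta, the listener runs in Java on the server, and only the resulting change (the notification) is sent back to the browser.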

Architectural differences in Vaadin web applications

Separating UI structure and logic

In the first part of JAXenter's series on Vaadin-based web apps, Matti Tahvonen shows us why every architectural decision has its pros and cons, and why the same goes for switching from Swing to Vaadin in your UI layer.


Architecture
The architecture of Vaadin Framework gives you an abstraction over the challenges of web development, and most of the time you can forget that you are building a web application. Vaadin takes care of all the communication, HTML markup, CSS and browser differences – you can concentrate your energy on your domain problems with a clean Java approach and take advantage of your experience from desktop applications.

Vaadin uses GWT to implement its "thin client" running in the browser. GWT is another, similar tool for web development; at its heart is a Java-to-JavaScript "compiler". GWT also has a Swing-like UI component library, but in GWT the Java code is compiled into JavaScript and executed in the browser. The compiler supports only a subset of Java, and the fact that the code is not running in a JVM causes some other limitations, but the concepts are the same. Running your code in the browser as a white box also has some security implications.
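For comparison, a minimal GWT entry point might look like the sketch below; the entry point class and its UI are invented for illustration. The Java source is compiled to JavaScript and runs entirely in the browser:

import com.google.gwt.core.client.EntryPoint;
import com.google.gwt.event.dom.client.ClickEvent;
import com.google.gwt.event.dom.client.ClickHandler;
import com.google.gwt.user.client.Window;
import com.google.gwt.user.client.ui.Button;
import com.google.gwt.user.client.ui.RootPanel;

// Compiled to JavaScript by the GWT compiler; there is no server-side state here.
public class HelloEntryPoint implements EntryPoint {

  @Override
  public void onModuleLoad() {
    Button button = new Button("Say hello");
    button.addClickHandler(new ClickHandler() {
      @Override
      public void onClick(ClickEvent event) {
        Window.alert("Hello from compiled Java!");
      }
    });
    RootPanel.get().add(button);
  }
}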

One application instance, many users
The first thing you'll notice is that you are now developing your UI right next to your data. Pretty much all modern business apps, both web and desktop, save their data to a central server in some way. Often the data is "shielded" by a middleware layer (for example EJBs). Once you move to a Vaadin UI, the EJB – or whatever technology you use in your "backend" – is "closer". It can often run in the very same application server as your Vaadin UI, making some hard problems trivial. Using a local EJB is both efficient and secure.

Even if you’d still use a separate application server for your EJBs, they are most probably connected to UI servers using a fast network that can handle chatty connection between UI and business layers more efficiently than typical client server communication – the network requirements by the Vaadin thin client are in many cases less demanding, so your applica-tion can be used over e. g. mobile networks.

Another thing developers arriving from desktop Java will soon notice is that "static" fields behave quite differently in the server world. Many desktop applications use static fields as "user global" variables. For Java apps running on a server they are "application global", which is a big difference: application servers generally use one class loader per web application (.war file), not one per user session. For "user global" variables, use fields in your UI class, the VaadinSession, the HttpSession or e.g. a @SessionScoped CDI bean.
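A small sketch of the session-scoped alternative – the helper class and attribute name are hypothetical, only VaadinSession itself is framework API:

import com.vaadin.server.VaadinSession;

public class CurrentProject {

  // "Application global": one value shared by every user of the deployed .war.
  private static String lastOpenedProjectForEveryone;

  // "User global": one value per browsing user, stored in the Vaadin session.
  public static void remember(String projectId) {
    VaadinSession.getCurrent().setAttribute("lastOpenedProject", projectId);
  }

  public static String get() {
    return (String) VaadinSession.getCurrent().getAttribute("lastOpenedProject");
  }
}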

Web applications in general are much cheaper for IT departments to maintain. They have traditionally been run on a company's internal servers, but the trend of the era is to host them in PaaS services, in the "cloud". Instead of maintaining the application on each user's workstation, updates and changes only need to be applied to the server. Also, all data, not just the shared parts, is saved on the server, whose backups are much easier to handle. When a user's workstation breaks, you can just hand over a replacement and the work can continue.

Memory and CPU usage is centralized on the server
On the negative side, some of the computing previously done on the user's workstation is now moved to the server. The CPU hit is typically negligible, but you might face memory constraints if you don't take this into account. On the other hand, the fact that application memory and processing now live mostly on the server might be a good thing: the server-side approach makes it possible to handle really complex computing tasks, even from really modest handheld devices. This is naturally possible with Swing and a central server as well, but with the Vaadin approach it comes as a free bonus feature.

A typical Vaadin business app consumes 50–500 kB of server memory per user, depending on your application's characteristics. A very small application can get by with less, and if you reference a lot of data from your UI – which usually makes things both faster and simpler – you might need even more memory per user.

The per-user memory usage is in line with e.g. the Java EE standard JSF. Some basic math shows this isn't an issue for most typical applications on modern application servers: at 200 kB per session, even 10,000 concurrent users add up to roughly 2 GB of heap. But if you create an accidental memory leak in application code or carelessly load a whole database table into memory, memory consumption may become an issue earlier than with desktop applications. Accidentally referencing a million basic database entities from a user session will easily consume 100–200 MB of memory per session. That might still be tolerable in a desktop application, but with several concurrent users you'll soon be in trouble.

The memory issues can usually be solved rather easily by paging or lazy loading the data from the backend to the UI. Server capacity is also really cheap nowadays, so buying a more powerful server or clustering your application across multiple application servers is most likely much cheaper than compromising your architectural design. But if each of your application's users needs to do heavy analysis on huge in-memory data sets, web applications are still not the way to go for your use case.
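As a rough sketch of the paging idea, here is a hypothetical backend facade (not a Vaadin API) that the UI would call one page at a time instead of referencing the whole table from the user session:

import java.util.List;

// Hypothetical backend facade: the UI asks for one page of rows at a time
// instead of keeping the whole table in the user's session.
public interface PersonBackend {

  // Minimal domain object for the sketch.
  class Person {
    public final String name;
    public Person(String name) { this.name = name; }
  }

  // Returns only the rows the UI currently needs, e.g. one visible table page.
  List<Person> findPersons(int offset, int pageSize);

  // Lets the UI show row counts and scroll extents without loading everything.
  long countPersons();
}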

If your application's memory usage is much more important than its development cost (read: you are trying to write the next Gmail), Vaadin might not be the right tool for you. If you still want a web application in this scenario, you should strive for a completely stateless server and keep your UI logic in the browser. GWT is a great library for these kinds of applications.


Matti Tahvonen works at Vaadin in technical marketing, helping the community be as productive as possible with Vaadin.


by Alexander Casall

The design pattern "Model View ViewModel" was first published by Microsoft for .NET applications and is nowadays also used in other technologies such as JavaScript frameworks. As with other MV* approaches, the goal is to separate the structure of the user interface from the (UI) logic. To do this, MVVM defines a ViewModel that represents the state of the UI. The ViewModel doesn't know the View and has no dependencies on specific UI components.

Instead, the View contains the UI components but no UI logic and is connected with the ViewModel via data binding. Figure 1 shows a simple example: the preparation of a welcome message in the ViewModel.
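As a rough sketch of what Figure 1 describes – the class and property names here are invented, not taken from the figure:

import javafx.beans.binding.Bindings;
import javafx.beans.property.SimpleStringProperty;
import javafx.beans.property.StringProperty;

// The ViewModel prepares the welcome message; it knows nothing about
// labels, text fields or any other UI component.
public class WelcomeViewModel {

  private final StringProperty name = new SimpleStringProperty("");
  private final StringProperty welcomeMessage = new SimpleStringProperty();

  public WelcomeViewModel() {
    welcomeMessage.bind(Bindings.concat("Welcome, ", name, "!"));
  }

  public StringProperty nameProperty() { return name; }

  public StringProperty welcomeMessageProperty() { return welcomeMessage; }
}

A View could then bind a Label's textProperty to welcomeMessageProperty and a TextField's textProperty bidirectionally to nameProperty, without the ViewModel ever referencing either control.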

One of the benefits of this structure is that all UI state and UI logic is encapsulated in a ViewModel that is independent from the UI. But what is UI logic?

UI logic defines how the user interface reacts to input from the user or to other events such as changes in the domain model – for example, the decision whether a button should be active or inactive. Because of this independence from the UI, the ViewModel can be tested with plain unit tests. In many cases there is no longer a need for complicated integration tests where the actual application is started and remotely controlled by the test tool. This simplifies test-driven development significantly. Thanks to the availability of Properties and data binding, JavaFX is eminently suitable for this design pattern. mvvmFX adds helpers and tools for an efficient and clean implementation of the pattern.

The following example will give an impression of the development process with MVVM. In this example there is a login button that should only be clickable when the username and the password are entered. Following TDD, the first step is to create a unit test for the ViewModel (Listing 1).

After that, the ViewModel can be implemented (Listing 2). Now this ViewModel has to be connected with the View. In the context of mvvmFX the "View" is the combination of an FXML file and the related controller class. It is important to keep in mind that the JavaFX controller is part of the View and should not contain any logic. Its only purpose is to create the connection to the ViewModel (Listing 3).

Please note that the View has a generic type that is the related ViewModel type. This way mvvmFX can manage the lifecycle of the View and the ViewModel.

Additional features
The example shown uses FXML to define the structure of the user interface. This is the recommended approach, but mvvmFX also supports traditional Views written in pure Java code. Another key aspect of the library is its support for dependency injection frameworks, which is essential for using the library in bigger projects. At the moment there are additional modules for integration with Google Guice and JBoss Weld/CDI to allow for an easy start with these frameworks, but other DI frameworks can easily be embedded too.

mvvmFX was recently released in a first stable version, 1.0.0. It is currently used in projects at Saxonia Systems AG. The framework is developed as open source under the Apache licence and is hosted on GitHub. The authors welcome feedback, suggestions and critical reviews.

For future development the focus lies on features needed for bigger projects with complex user interfaces. These include a mechanism that lets many ViewModels access common data without introducing mutual visibility or dependencies between them (Scopes). Additionally, there are helpers for implementing navigation between views and for managing master-detail interfaces. One source of inspiration here is Microsoft's PRISM, an application framework that provides many of the tools needed for developing such applications.

Listing 1

@Test
public void test() {
  LoginViewModel viewModel = new LoginViewModel();

  assertThat(viewModel.isLoginPossible()).isFalse();

  viewModel.setUsername("mustermann");
  assertThat(viewModel.isLoginPossible()).isFalse();

  viewModel.setPassword("geheim1234");
  assertThat(viewModel.isLoginPossible()).isTrue();
}

A look at mvvmFX

Model View ViewModel with JavaFX

The mvvmFX framework provides tools to implement the Model View ViewModel design pattern with JavaFX. After one year of development a first stable 1.0.0 version has been released.



Listing 2

public class LoginViewModel implements ViewModel {

  private StringProperty username = new SimpleStringProperty();
  private StringProperty password = new SimpleStringProperty();
  private BooleanProperty loginPossible = new SimpleBooleanProperty();

  public LoginViewModel() {
    loginPossible.bind(username.isNotEmpty().and(password.isNotEmpty()));
  }

  // getter/setter
}

Figure 1: Welcome message in ViewModel

Listing 3
public class LoginView implements FxmlView<LoginViewModel> {

  @FXML
  public Button loginButton;

  @FXML
  public TextField username;

  @FXML
  public PasswordField password;

  @InjectViewModel // is provided by mvvmFX
  private LoginViewModel viewModel;

  // will be called by JavaFX as soon as the FXML bootstrapping is done
  public void initialize() {
    username.textProperty()
      .bindBidirectional(viewModel.usernameProperty());
    password.textProperty()
      .bindBidirectional(viewModel.passwordProperty());

    loginButton.disableProperty()
      .bind(viewModel.loginPossibleProperty().not());
  }
}

Imprint

Publisher: Software & Support Media GmbH

Editorial Office Address:
Software & Support Media
Saarbrücker Straße 36
10405 Berlin, Germany
www.jaxenter.com

Editor in Chief: Sebastian Meyen

Editors: Coman Hamilton, Natali Vlatko

Authors: Chris Becker, Alexander Casall, Klaus Enzenhofer, Ghislain Mazars, Ryan Paul, Nathan Rijksen, Matti Tahvonen

Copy Editor: Jennifer Diener

Creative Director: Jens Mainz

Layout: Flora Feher, Christian Schirmer

Sales Clerk:
Anika Stock
+49 (0) 69
[email protected]

Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No part of this publication may be reproduced, redistributed, posted online, or reused by any means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH.

The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded articles belong to Publisher. All mentioned trademarks and service marks are copyrighted by their respective owners.


Alexander Casall is a developer at Saxonia Systems AG, with a focus on multi-touch applications using JavaFX.