22 free tools for data visualization and analysis

  • 7/31/2019 22 free tools for data visualization and analysis

    1/21

    22 free tools for data visualization and analysis

    Got data? These useful tools can turn it into informative, engaging graphics.

    Sharon Machlis

    April 20, 2011 (Computerworld)

    You may not think you've got much in common with an investigative journalist or an

    academic medical researcher. But if you're trying to extract useful information from an ever-

    increasing inflow of data, you'll likely find visualization useful -- whether it's to show patterns

    or trends with graphics instead of mountains of text, or to try to explain complex issues to a

    nontechnical audience.

    There are many tools around to help turn data into graphics, but they can carry hefty price

    tags. The cost can make sense for professionals whose primary job is to find meaning in

    mountains of information, but you might not be able to justify such an expense if you or your

    users only need a graphics application from time to time, or if your budget for new tools is

    somewhat limited. If one of the higher-priced options is out of your reach, there are a

    surprising number of highly robust tools for data visualization and analysis that are available

    at no charge.

    Here's a rundown of some of the better-known options, many of which were demonstrated at the

    Computer-Assisted Reporting (CAR) conference last month. Others are not as well known but show

    great promise. They range from easy enough for a

    beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring

    hands-on coding). But they all share one important characteristic: They're free. Your only

    investment: time.

    Data cleaning

    Before you can analyze and visualize data, it often needs to be "cleaned." What does that

    mean? Perhaps some entries list "New York City" while others say "New York, NY" and you

    need to standardize them before you can see patterns. There might be some records with

    misspellings or numerical data-entry errors. The following two tools are designed to help get

    your data in tip-top shape to be analyzed.
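
To make the "New York City" versus "New York, NY" problem concrete, here is a minimal Python sketch of one way such entries could be standardized by hand. The lookup table, function names and sample records are all hypothetical and exist only for illustration; the tools reviewed below automate far more of this.

```python
import re

# Hypothetical lookup of known variants -> canonical spelling (illustration only).
CANONICAL = {
    "new york city": "New York, NY",
    "new york ny": "New York, NY",
    "nyc": "New York, NY",
}

def normalize_key(value: str) -> str:
    """Lowercase, drop punctuation and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(cleaned.split())

def standardize(value: str) -> str:
    """Return the canonical spelling for a known variant, else the trimmed input."""
    return CANONICAL.get(normalize_key(value), value.strip())

# Made-up sample records.
records = ["New York City", "New York, NY", " new york, ny ", "Boston, MA"]
cleaned = [standardize(r) for r in records]
print(cleaned)
```

A hand-built table like this only catches variants you already know about, which is why tools that suggest groupings automatically are useful.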

    DataWrangler

    What it does: This Web-based service from Stanford University's Visualization Group is

    designed for cleaning and rearranging data so it's in a form that other tools such as a

    spreadsheet app can use.

    Click on a row or column, and DataWrangler will suggest changes. For example, if you click

    on a blank row, several suggestions pop up such as "delete row" or "delete empty rows."

    There's also a history list that allows for easy undo -- a feature that's also available in Google

    Refine (reviewed next).

    Want to see all the tools at once?

    For quick reference, check out our chart listing 22 free data

    visualization tools.

    Page 1 of 2122 free tools for data visualization and analysis

    21/04/2011http://www.computerworld.com/s/article/print/9215504/22_free_tools_for_data_visua...

    What's cool: Text editing is especially easy. For example, when I selected "Alabama" in one

    row of sample data headlined "Reported crime in Alabama" and then selected "Alaska" in the

    next group of data, it led to a suggestion to extract every state name. Hover your mouse over

    a suggestion, and you can see affected rows highlighted in red.

    Drawbacks: I found that unexpected changes occurred as I attempted to explore

    DataWrangler's options; I constantly had to click "clear" to reset. And not all suggestions are

    useful ("promote row to header" seemed an odd suggestion when the row was blank) or easy

    to understand ("fold split 1 using 2 as key").

    And while the fact that DataWrangler is a Web-based service makes it convenient to use,

    don't forget that it sends your data off to an external site -- which means it isn't an option for

    sensitive internal information. However, there are plans for a future release of a stand-alone

    desktop version. Another important thing to keep in mind is that DataWrangler is currently

    alpha code, and its creators say it's "still a work in progress."

    Skill level: Advanced beginner.

    Runs on: Any Web browser.

    Learn more: There's a screencast on the DataWrangler home page. Also, see this post on

    using DataWrangler to format data (from Tableau Public's blog).

    Google Refine

    What it does: Google Refine can be described as a spreadsheet on steroids for taking a first

    look at both text and numerical data. Like Excel, it can import and export data in a number of

    formats including tab- and comma-separated text files and Excel, XML and JSON files.

    Figure: DataWrangler helps format table data so it can be better used and analyzed by other applications.

    Figure: Google Refine can make data 'cleaner' by helping to find errors or different versions of the same proper names.

    Refine features several built-in algorithms that find text items that are spelled differently but

    actually should be grouped together. After importing your data, you simply select "edit cells -->

    cluster and edit" and select which algorithm you want to use. After Refine runs, you decide

    whether to accept or reject each suggestion. For example, you could say yes to combining

    Microsoft and Microsoft Corp., but no to combining Coach Inc. with CQG Inc. If it's offering

    too few or too many suggestions, you can change the strength of the suggestion function.
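
The article does not say which algorithms Refine uses, so the following Python sketch should be read as an assumption: it shows one common approach to this kind of grouping, often called key collision, in which each value is reduced to a normalized "fingerprint" and values sharing a fingerprint are offered as a possible merge. The function names and sample values are hypothetical.

```python
import re
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Reduce a value to a key: lowercase, strip punctuation, sort unique tokens."""
    tokens = re.sub(r"[^\w\s]", "", value.strip().lower()).split()
    return " ".join(sorted(set(tokens)))

def cluster(values):
    """Group values whose fingerprints collide; keep only groups with real variants."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(set(g)) > 1]

# Made-up sample values.
names = ["Microsoft Corp.", "microsoft corp", "Corp. Microsoft", "Coach Inc.", "CQG Inc."]
suggestions = cluster(names)
print(suggestions)
```

Each returned group is only a suggestion. As in Refine, a person still decides whether to accept or reject the merge. A stricter or looser key function plays roughly the role of the adjustable suggestion strength described above.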

    There are also numerical options that offer quick and easy overviews of data distributions.

    This functionality can reveal anomalies that might be the result of data input errors -- such as

    $800,000 instead of $80,000 for a salary entry, or it could expose inconsistencies -- such as

    differences in the way compensation data is reported from entry to entry, with some showing,

    say, hourly wages and others showing weekly pay or yearly salaries.
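
As a rough stand-in for that kind of check, the Python sketch below flags values that sit far from the median of a column. It is not how Refine works internally; the function name, the factor of 5 and the salary figures are arbitrary assumptions chosen for illustration.

```python
from statistics import median

def flag_outliers(values, factor=5.0):
    """Return values more than `factor` times above or below the median."""
    mid = median(values)
    return [v for v in values if v > mid * factor or v < mid / factor]

# Made-up salary column with one likely data-entry error.
salaries = [78_000, 80_000, 82_000, 800_000, 79_000]
suspicious = flag_outliers(salaries)
print(suspicious)
```

A flagged value is a prompt to look at the record, not proof of an error: a mix of hourly wages and yearly salaries in one column would trip the same check.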

    Beyond data housekeeping, Google Refine offers some useful analysis tools, such as sorting

    and filtering.

    What's cool: Once you get used to which commands do what, this is a powerful tool for data

    manipulation and analysis that strikes a good balance between functionality and ease of use.

    The undo/redo list of every action you've taken lets you roll back when needed. And text

    functions handle Java-syntax regular expressions, allowing you to look for patterns (such as,

    say, three numbers followed by two digits) as well as specific text strings and numbers.

    Finally, while this is a browser-based application, it works with files on your desktop, so your

    data remains local.
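
For readers new to regular expressions, here is a small pattern-matching sketch. It is written in Python, whose `re` syntax agrees with Java's for a simple pattern like this one. The pattern itself (three letters followed by two digits) and the sample cell values are made up for illustration.

```python
import re

# Three letters followed by two digits, standing alone as a "word".
PATTERN = re.compile(r"\b[A-Za-z]{3}\d{2}\b")

# Made-up cell contents.
cells = ["Order ABC12 shipped", "ref 12345", "xyz99 and QRS07", "ABCD12"]
matches = [PATTERN.findall(cell) for cell in cells]
print(matches)
```

The `\b` word boundaries keep the pattern from matching inside a longer code such as "ABCD12".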

    Drawbacks: Although Google Refine looks like a spreadsheet, you can't do typical

    spreadsheet calculations with it; for that, you must export to a conventional spreadsheet

    application. If you've got a large data set, carve out some time in your day to go through all of

    Refine's suggested changes, since it can take a while. And, depending on the data set, be

    prepared when looking for text items to merge: You're likely to get either a lot of false

    positives or missed problems -- or both.

    Skill level: Advanced beginner. Knowledge of data analysis concepts is more important than

    technical prowess; power Excel users who understand data-cleaning needs should be

    comfortable with this.

    Runs on: Windows, Mac OS X (if it appears to do nothing after loading on a Mac, point a

    browser manually to http://127.0.0.1:3333/ ), Linux.

    Learn more: These three screencasts give a good overview of why and how you'd use

    Refine; there's also fairly detailed documentation on the Google Code project area.

    Statistical analysis

    Sometimes you need to combine graphical representation of your data with heftier numerical

    analysis.

    The R Project for Statistical Computing

    What it does: R is a general statistical analysis platform (the authors call it an "environment")

    that runs on the command line. Need to find means, medians, standard deviations,

    correlations? R can handle that and much more, including "linear and generalized linear

    models, nonlinear regression models, time series analysis, classical parametric and

    nonparametric tests, clustering and smoothing," according to the project website.

    R also graphs, charts and plots results. There are numerous add-ons to this open-source

    project that significantly extend functionality. For users who prefer a GUI, Peter Aldhous, San

    Francisco bureau chief for New Scientist magazine, suggests RExcel, which offers access to

    the R engine through Excel.
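
To give a feel for the basic summary statistics mentioned above, here is a minimal sketch, written in Python rather than R, using only the standard library. The two data columns are made up, and the `pearson` helper is a hand-written illustration of the correlation formula, not R's implementation.

```python
from statistics import mean, median, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Made-up paired observations.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]

summary = {
    "mean": mean(scores),
    "median": median(scores),
    "stdev": stdev(scores),  # sample standard deviation
    "r": pearson(hours, scores),
}
print(summary)
```

In R the same summaries are single built-in calls, which is part of its appeal once the command-line hurdle is cleared.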

    What's cool: There is a great deal of functionality in R, including quite a number of

    visualization options as well as numerical and spatial analysis.

    Drawbacks: The fact that R runs on the command line means that users will have to take the

    time to learn which commands do what, and not all users will be comfortable with a text-only

    interface. In addition, Aldhous says those dealing with large data sets may hit a memory

    barrier (if so, there's a commercial option from Revolution Analytics).

    Skill level: Intermediate to advanced. Comfort with command-line prompts and a knowledge

    of statistics are musts for the core application.

    Runs on: Linux, Mac OS X, Unix, Windows XP or later.

    Learn more: Try R for Statistics: First Steps (PDF) by Peter Aldhous, Hands-on R, a step-

    by-step tutorial (PDF) by Jacob Fenton, and the project's own An Introduction to R. The R

    Statistics blog has a number of visualization samples.

    Visualization applications and services

    These tools offer a number of different visualization options. While some stick to conventional

    charts and graphs, many offer a range of other choices such as treemaps and word clouds. A

    few offer geographical mapping as well, although if you're interested in maps, our sections on

    GIS/mapping focus specifically on that.

    Google Fusion Tables

    What it does: This is one of the simplest ways I've seen to turn data into a chart or map. You

    can upload a file in several different formats and then choose how to display it: table, map,

    heatmap, line chart, bar graph, pie chart, scatter plot, timeline, storyline or motion (animation

    over time). It's somewhat customizable, allowing you to change map icons and style info windows.

    Figure: The R Project for Statistical Computing provides a wide range of data analysis options.

    There are some data editing functions within Fusion Tables, although changing more than a

    few individual cell entries can quickly become tedious. You can also join tables (which is

    important when the data you want to map is in multiple tables), and filter, sort and add

    columns and so on. There are also options to allow others to make comments on the data

    itself.

    Mapping goes beyond just placing points, as many of us are accustomed to with Google

    Maps. Fusion Tables can also map multiple polygons with variations in color based on

    underlying data, such as this intensity map showing the percentage of households with

    Internet access by state from 2007 U.S. Census Bureau data.

    The Knight Digital Media Center notes that a handy undocumented feature allows the use of

    Fusion Tables' "templating" export to generate a JSON file from data in other formats. JSON

    is required by some APIs and JavaScript libraries.
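
For readers who have not handled JSON before, the sketch below shows what such a conversion amounts to. It does not use Fusion Tables or its templating export at all; it is a standalone Python illustration that turns comma-separated text into a JSON array of records, with made-up column names and values.

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Made-up sample data.
sample = "region,pct_online\nRegion A,55.1\nRegion B,72.0\n"
as_json = csv_to_json(sample)
print(as_json)
```

Note that every value comes out as a string here. A JavaScript library expecting numbers would need the numeric columns converted first.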

    Unlike IBM's Many Eyes, Google lets you designate your data as private or unlisted as well

    as public, although your data still resides on Google's servers -- a benefit or drawback,

    depending on whether server bandwidth costs or data privacy is more important to you.

    What's cool: Fusion Tables offers relatively quick charting and mapping, including

    geographic information system (GIS) functions to analyze data by geography. The service

    also automatically geocodes addresses, which is useful when trying to place numerous

    points on a map. This is an excellent tool for beginners and advanced beginners to use to get

    comfortable with analyzing data; it's also a good fit for people who don't program. For more

    advanced users, there's an API.

    Drawbacks: Functionality, customization and data capacity are all limited compared with

    desktop applications or custom code, and interacting with large data sets on the site can be

    sluggish. And it has its limitations -- the site choked on March 11, the day of the devastating

    earthquake and tsunami in Japan. (It is still a Google Labs beta project.)

    Figure: Google Fusion Tables is a user-friendly tool that makes it easy to map data.

    Skill level: Beginner.

    Runs on: Any Web browser.

    Learn more: A Google Fusion Tables tour and several tutorials are available. We've also got

    some examples of what it can do in our story "H-1B Visa Data: Visual and Interactive Tools."

    Also see the Fusion Tables Example Gallery.

    Impure

    What it does: Impure is sort of a Yahoo Pipes for data visualization, designed for creating

    numerous types of highly polished graphical representations of data using a drag-and-drop

    workspace. The service includes a library of objects and various methods, and -- as with

    Yahoo Pipes -- it allows you to click and drag to connect modules so that the output of one

    becomes the input of another. It was developed by Spanish analytics firm Bestiario.

    What's cool: Impure offers a highly visual interface for the task of creating visualizations --

    which is not as common as you might expect. It has a sleek user interface and numerous

    modules, including quite a few APIs that are designed to pull data from the Web. It features

    numerous visualization types that are searchable by keywords like numeric, tables, nodes,

    geometry and map. And although it saves your workspaces to the Web, you can copy and

    save the code behind your workspaces locally, so you can back up your work or maintain

    your own libraries of code snippets.

    Drawbacks: Users of Impure face a surprisingly steep learning curve despite its drag-and-

    drop functionality. The documentation is detailed in some areas, but lacking in others. For

    instance, while it was easy to find a list of APIs, it was more difficult to find basic instructions

    on how to use the workspace -- or even figure out that there was a workspace, let alone how

    to use the various objects and methods.

    Once you save your workspace, it's on the public Web, although it's unlikely that anyone else

    will be able to find it unless you share the URL. And I found some of the samples not all that

    helpful in understanding the underlying data, even if they were visually striking.

    Skill level: Intermediate.

    Runs on: Any Web browser.

    Learn more: To get started, I'd suggest the videos "Interface Basics" (7 minutes) and

    "Workspaces and Code." You can find a sample called The Pay Gap Between Men and

    Women Mapped at the website of British newspaper The Guardian.

    Tableau Public

    What it does: This tool can turn data into any number of visualizations, from simple to

    complex. You can drag and drop fields onto the work area and ask the software to suggest a

    visualization type, then customize everything from labels and tool tips to size, interactive

    filters and legend display.

    What's cool: Tableau Public offers a variety of ways to display interactive data. You can

    combine multiple connected visualizations onto a single dashboard, where one search filter

    can act on numerous charts, graphs and maps; underlying data tables can also be joined.

    And once you get the hang of how the software works, its drag-and-drop interface is

    considerably quicker than manually coding in JavaScript or R for most users, making it more

    likely that you'll try additional scenarios with your data set. In addition, you can easily perform

    calculations on data within the software.

    Drawbacks: In the free version of Tableau's business intelligence software, your

    visualization and data must reside on Tableau's site. Whenever you save your work, it gets

    sent up to the public website -- which means you can't save work in progress without running

    the risk that it will be seen before it's ready (while Tableau's site won't deliberately expose

    your work, it relies on security by obscurity -- so someone could see your work if they guess

    your URL). And once it's saved, viewers are invited to download your entire workbook with

    data. Upgrading to a single-user desktop edition costs $999.

    Not surprisingly, all that functionality comes at a cost: Tableau's learning curve is fairly steep

    compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it'll take more

    than an hour or two to learn how to use the software's true capabilities, although you can get

    up and running doing simple charts and maps before too long.

    Skill level: Advanced beginner to intermediate.

    Runs on: Windows 7, Vista, XP, 2003, Server 2008, 2003.

    Learn more: There are seven short training videos on the Tableau site, where you can also

    find downloadable data files that you can use to follow along.

    You can see a sample in our article "Tech Unemployment Climbs; Self-employment Steady."

    Figure: Tableau Public can turn data into any number of visualizations, from simple to complex.

    Many Eyes

    A pioneer in Web-based data visualization, IBM's Many Eyes project combines graphical

    analysis with community, encouraging users to upload, share and discuss information. It's

    extremely easy to use and very well documented, including suggestions on when to use what

    kind of visual data representation. Many Eyes includes more than a dozen output options --

    from charts, graphics and word clouds to treemaps, plots, network diagrams and some

    limited geographic maps.

    You'll need a free account to upload and post data, although anyone can browse. Formatting

    is basic: For most visualizations, the data must be in a tab-separated text file with column

    headers in the first row.
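
If your data lives somewhere other than a spreadsheet, producing that format is straightforward. Here is a minimal Python sketch that writes rows as tab-separated text with the column headers in the first row; the function name, column names and values are made up for illustration.

```python
import csv
import io

def to_tsv(header, rows):
    """Return tab-separated text with the column headers as the first row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample data.
tsv_text = to_tsv(["employer", "visas"], [["Example Corp", 120], ["Sample LLC", 85]])
print(tsv_text)
```

Using the `csv` module rather than joining strings by hand means values that themselves contain tabs or quotes are escaped properly.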

    It took me about three minutes to create a bar chart of top H-1B visa employers.

    Figure: It takes just a few minutes to create online charts like this with Many Eyes.

    It took perhaps another minute to create a treemap of the same data.

    Figure: Many Eyes offers a number of ways to visualize data, such as treemaps.

    What's cool: Visualization can't get much easier, and the results look considerably more

    sophisticated than you'd expect based on the minimal amount of effort needed to create

    them. Plus, the list of possible visualization types includes explanations of the types of data

    each one is best suited for.

    Drawbacks: Both your visualizations and your data sets are public on the Many Eyes site

    and can be easily downloaded, shared, reposted and commented upon by others. This can

    be great for certain types of users -- especially government agencies, nonprofits, schools and

    other organizations that want to share visualizations on someone else's server budget -- but

    an obvious problem for others. (IBM does offer a contact form for businesses interested in

    hosting their own version of the software.) In addition, customization is limited, as is data file

    size (5MB).

    Skill level: Beginner.

    Runs on: Java and any modern Web browser that can display Flash.

    Learn more: IBM's website features pages explaining data formatting for Many Eyes and

    visualization choices.

    You can see some featured visualizations on the Many Eyes home page or browse through

    some of the tens of thousands of uploads. One interesting map shows popular surnames in

    the U.S. from the 2000 Census by Martin Wattenberg, one of the creators of Many Eyes.

    VIDI

    What it does: Although VIDI's website bills this as a tool for the Drupal content management

    system, graphics created by the site's visualization wizard can be used on any HTML page --

    no Drupal required.

    Upload your data, select a visualization type, do a bit of customization selection, and your

    chart, timeline or map is ready to use via auto-generated embed code (using an iframe, not

    JavaScript or Flash).

    What's cool: This is about as easy as Many Eyes -- with more mapping options and no need

    to make your visualization and data set public on its website. There are quick screencasts

    explaining each visualization type and several different color customization options. And the

    file-size limit of 30MB is six times larger than Many Eyes' 5MB maximum.

    Figure: Graphics created by VIDI's visualization wizard can be used on any HTML page -- no Drupal required.

    Drawbacks: Oddly, the visualization wizard was a lot easier to use than the embed code --

    my embedded iframe didn't display while trying to preview it on the VIDI website; I needed to

    save the visualization and go to the "My VIDI" page to get embed code that actually worked.

    Also, as with any cloud service, if you're using this for Web publishing, you'll want to feel

    confident that the host's servers can handle your traffic and will be available longer than your

    need to display the data.

    Skill level: Beginner.

    Runs on: Any Web browser.

    Learn more: The VIDI home page features a link to an 11-minute video tutorial.

    It took me less than five minutes to create a sample: a map of earthquakes of 7.0 magnitude

    or more since Jan. 1, 2000.

    Zoho Reports

    What it does: One of the more traditional corporate-focused business analytics offerings in

    this group, Zoho Reports can take data from various file formats or directly from a database

    and turn it into charts, tables and pivot tables -- formats familiar to most spreadsheet users.

    What's cool: You can schedule data imports from sources on the Web. Data can be queried

    using SQL and can be turned into visualizations, and the service is set up for Web publishing

    and sharing (although if it's accessed by more than two users, you will need a paid account).
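
For readers who have not written SQL, the sketch below shows the kind of query that typically feeds a chart: a total per category. It uses Python's built-in `sqlite3` module as a local stand-in, not Zoho Reports or its SQL dialect, and the table, columns and values are made up.

```python
import sqlite3

# In-memory database as a stand-in for a hosted reporting database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 30.0)],  # made-up rows
)

# One total per region: the shape a bar chart or pivot table would consume.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
print(totals)
```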

    Drawbacks: Visualization options are fairly basic and limited. Interacting live with the

    Web-based data can be sluggish at times. Data files are limited to 10MB. I found the navigation

    confusing at times -- for example, after I saved a copy of a sample database, I was told it was

    in the folder "My reports," yet I had a hard time finding that.

    Skill level: Advanced beginner.

    Runs on: Any Web browser.

    Learn more: There are video demos and samples on Zoho's website.

    Figure: Zoho Reports provides traditional business charts and graphs.

    Code help: Wizards, libraries, APIs

    Sometimes nothing can substitute for coding your own visualization -- especially if the look

    and feel you're after can't be achieved with an existing desktop or Web app. But that

    doesn't mean you need to start from scratch, thanks to a wide range of available libraries and

    APIs.

    Choosel (under development)

    What it does: This open-source Web-based framework is designed for charts, clouds,

    graphs, timelines and maps. Right now, it is geared more for developers who create

    applications than it is for end users who need to save and/or embed their work; but there's an

    interactive online demo that lets you quickly upload some data to visualize.

    What's cool: As with Tableau Public, you can have more than one visualization on a page

    and connect them so that, for example, mousing over items on a chart will highlight

    corresponding items on a map.

    Drawbacks: This is not yet an application that end users can use to store and share their

    work. And I found the online demo to be finicky about uploading data -- even after I corrected

    field formats for dates (dd/mm/yyyy) and location (latitude/longitude) as documented, my

    data wouldn't load until I had another text field added (rather than just having numerical

    fields). It was also unclear how to customize labels. This project shows promise if it's further

    developed and documented.

    Skill level: Expert.

    Runs on: Chrome, Safari and Firefox.

    Learn more: There's a short video called Choosel -- Timeline and Basic Features and a

    sample titled Earthquakes With 1,000 or More Deaths Since 1900.

    Exhibit

    What it does: This spin-off of the MIT Simile Project is designed to help users "easily create

    Web pages with advanced text search and filtering functionalities, with interactive maps,

    timelines and other visualization." Billed as a publishing framework, the JavaScript library

    allows easy additions of filters, searches and more. The Easy Data Visualization for

    Journalists page offers examples of the code in use at a number of newspaper websites.

    Figure: Still under development, Choosel has potential as an easy way to create online graphics.

    Of course, "easy" is in the eye of the beholder -- what's easy for the professionals at MIT who

    created Exhibit might not be that simple for a user whose comfort level stops at Excel. Like

    most JavaScript libraries, Exhibit requires more hand-coding than services such as Many

    Eyes and Google Fusion Tables. On the other hand, Exhibit has clear documentation for

    beginners, even those with no JavaScript experience.

    What's cool: For those who are comfortable coding, Exhibit offers a number of views --

    maps, charts, timeplots, calendars and more -- as well as customized lenses (ways to format

    an individual record) and facets (properties that can be searched or sorted). You're much

    more likely to get the exact presentation you want with Exhibit than, say, Many Eyes. And

    your data stays local unless and until you decide to publish.

    Drawbacks: For newcomers unused to coding visualizations, it takes time to get familiar with

    coding and library syntax.

    Skill level: Expert.

    Learn more: There are a number of examples you can look at, including Red Sox-Yankees

    Winning Percentages Through the Years, U.S. Cities by Population and others.

    Note: There are numerous other JavaScript libraries to help create visualizations, such as the

    recently released Data-Driven Documents and the jQuery Visualize plug-in. Six Revisions' list

    of 20 Fresh JavaScript Data Visualization Libraries gives you an idea of how many there are

    to choose from.

    Google Chart Tools

    What it does: Unlike Google Fusion Tables, which is a full-fledged, self-contained

    application for uploading and storing data, and generating charts and maps, Chart Tools is

    designed to visualize data residing elsewhere, such as your own website or within Google

    Docs.

    Figure: Google Chart Tools offers both a wizard and an API for creating Web graphics from data.

    Google offers both a Chart API using a "simple URL request to a Google chart server" for

    creating a static image and a Visualization API that accesses a JavaScript library for creating

    interactive graphics. Google offers a comparison of data size, page load, skills needed and

    other factors to help you decide which option to use.
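
To show what a "simple URL request" looks like in practice, here is a Python sketch that assembles a pie-chart URL in the style of the static Chart API. The parameter names (`cht` for chart type, `chs` for size, `chd` for data, `chl` for labels) follow that API's documented conventions, but the base URL, helper name and sample values are assumptions for illustration, and Google has since retired the service, so the URL should not be expected to return an image today.

```python
from urllib.parse import urlencode

# Assumed endpoint of the (now retired) static chart service.
BASE_URL = "https://chart.googleapis.com/chart"

def pie_chart_url(values, labels, size="300x150"):
    """Build a chart URL: all chart settings travel as query parameters."""
    params = {
        "cht": "p",                                      # chart type: pie
        "chs": size,                                     # width x height in pixels
        "chd": "t:" + ",".join(str(v) for v in values),  # data, text encoding
        "chl": "|".join(labels),                         # slice labels
    }
    return BASE_URL + "?" + urlencode(params)

# Made-up sample values.
url = pie_chart_url([60, 40], ["Yes", "No"])
print(url)
```

Because the whole chart is described in the URL, it could be dropped straight into an image tag, which is what made the static API approachable for non-programmers.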

    For the simpler static graphics, there's a wizard to help you create a chart from some sample

    formats; it goes as far as helping you input data row by row, although for any decent-size

    data set -- say, more than half a dozen or so entries -- it makes more sense to format it in a

    text file.

    The visualization API includes various types of charts, maps, tables and other options.

    What's cool: The static image chart is reasonably easy to use and features a Live Chart

    Playground, which allows you to tweak code and see your results in real time.

    The more robust API lets you pull data in from a Google spreadsheet. You can create icons

    that mix text and images for visualizations, such as this weather forecast note, and what it

    calls a "Google-o-meter" graphic. The Visualization API also has some of the best

    documentation I've seen for a JavaScript library.

    Drawbacks: The static charts tool requires a bit more work than some of the other Web-

    based services, and it doesn't always offer lots of extras in return. And for the API, as with

    other JavaScript libraries, coding is required, making this more of a programming tool than an

    end-user business intelligence application.

    Skill level: Advanced beginner to expert.

    Runs on: Any Web browser.

    Learn more: See Getting Started With Charts and Interactive Charts. There are also

    samples in the Google Visualization API Gallery.

    JavaScript InfoVis Toolkit

    What it does: InfoVis is probably not among the best-known JavaScript visualization

    libraries, but it's definitely worth a look if you're interested in publishing interactive data

    visualizations on the Web. The White House agrees: InfoVis was used to create the Obama

    administration's Interactive Budget graphic.

    What sets this tool apart from many others is the highly polished graphics it creates from just

    basic code samples. InfoVis creator Nicolas Garcia Belmonte, senior software architect at

    Sencha Inc., clearly cares as much about aesthetic design as he does about the code, and it

    shows.

    What's cool: The samples are gorgeous and there's no extra coding involved to get nifty fly-in effects. You can choose to download code for only the

    visualization types you want to use to minimize the weight of Web pages.

    Drawbacks: Since this is not an application but a code library, you must have coding

    expertise in order to use it. Therefore, this might not be a good fit for users in an organization

    who analyze data but don't know how to program. Also, the choice of visualization types is

    somewhat limited. Moreover, the data should be in JSON format.
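
    To give a sense of what "data in JSON format" means for a tree-style visualization, here is a small Python sketch that turns a nested directory listing into JSON. The id/name/data/children layout is an assumption based on the toolkit's demos as I recall them, and the function and folder names are invented for illustration; check the demo source code for the exact structure each visualization expects.

```python
import json
from itertools import count


def build_node(name, subtree, ids):
    """Turn a nested dict into an id/name/data/children node (assumed layout)."""
    return {
        "id": f"node{next(ids)}",   # every node gets a unique identifier
        "name": name,               # label shown in the visualization
        "data": {},                 # room for per-node values such as size or color
        "children": [
            build_node(child, grandchildren, ids)
            for child, grandchildren in subtree.items()
        ],
    }


# Hypothetical directory tree: keys are folder names, values are subfolders.
directory = {"src": {"charts": {}, "maps": {}}, "docs": {}}

tree = build_node("project", directory, count())
tree_json = json.dumps(tree, indent=2)
print(tree_json)
```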

    Skill level: Expert.

    Runs on: JavaScript-enabled Web browsers.

    Learn more: See demos with source code.

    Protovis


    What it does: Billed as a "graphical toolkit for visualization," this project from Stanford

    University's Visualization Group is one of the more popular JavaScript libraries for turning

    data into visuals; it's designed to balance simplicity with control over the display.

    What's cool: One of the best things about Protovis is how well it's documented, with plenty of

    examples featuring visualization and sample code. There are also a large number of sample

    visualization types available, including maps and some statistical analyses. This is a robust

    tool, capable of building graphics like this color-coded U.S. map with timeline slider.

    Drawbacks: As is the case with other JavaScript libraries, it's pretty much essential for users to have knowledge of

    JavaScript (or at least some other programming language). While it's possible to copy, paste

    and modify code without really understanding what it's doing, I find it difficult to recommend

    that approach for nontechnical end users.

    Skill level: Expert.

    Runs on: JavaScript-enabled Web browsers.

    Learn more: Try the How-to: Get Started Guide. You can also find examples of the types of

    graphics you can build with Protovis at the Protovis Gallery.

    GIS/mapping on the desktop

    There's a wide range of business uses for geographic information systems (GIS), ranging

    from oil exploration to choosing sites for new retail stores. Or, as The Miami Herald did for its Pulitzer Prize-winning coverage of Hurricane Andrew, you can compare maximum wind

    speeds with damage reports and building information (and perhaps discover, for example,

    that the worst damage didn't happen in the areas suffering the heaviest winds, but in areas

    with a lot of new, shoddy construction).

    Quantum GIS (QGIS)

    What it does: This is full-fledged GIS software, designed for creating maps that offer

    sophisticated, detailed data-based analysis of geographic regions.

    The best-known desktop GIS software is probably Esri's ArcView, a robust, well-supported

    application that costs quite a bit of money. The open-source QGIS is an alternative to

    ArcView.

    Figure: This sunburst of a directory tree shows some of the visualization capabilities of the JavaScript InfoVis Toolkit. You can see a larger, interactive version on the InfoVis website.


    As OpenOffice is to Microsoft Office, QGIS is to ArcView. ArcView enthusiasts argue that

    Esri's offering is a couple of years ahead of open-source alternatives, has a better-developed

    interface, enjoys commercial support and is better suited for print output. But QGIS users say

    the open-source alternative is an excellent program that does a great deal of useful GIS work

    -- and may even be better than ArcView when it comes to generating maps for the Web,

    thanks to a plug-in dedicated to generating HTML image maps.

    What's cool: QGIS has an enormous amount of GIS functionality, including the ability to

    create maps, overlay various types of data, do spatial analysis, publish to the Web and more.

    It can also be enhanced with plug-ins that add support for numerous undertakings, including

    geocoding, managing underlying table data, exporting to MySQL and generating HTML

    image maps.

    Drawbacks: As with any sophisticated GIS application, learning to use this software entails a

    serious commitment of time and training. Even in hour-long hands-on sessions with first

    ArcView and then QGIS, I noticed things that were easier to do in the commercial option. For

    example, ArcView had a one-click "normalize" function to immediately calculate, say, the

    percentage of people 65 and over versus the total population from a data table with both

    columns; in QGIS, I needed to pull up a "field calculator" and create a new column with the

    formula to do that calculation myself.
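
    The calculation itself is simple; the extra work in QGIS is only in setting it up by hand. As a rough sketch of what the new column contains, here is the same arithmetic in Python. The column names and the numbers are placeholders, not real census figures, and this is not QGIS code.

```python
def add_percentage(rows, part_key, total_key, out_key):
    """Return copies of the rows with a new percentage column added."""
    result = []
    for row in rows:
        total = row[total_key]
        # Leave the value empty rather than dividing by zero.
        pct = None if not total else round(100.0 * row[part_key] / total, 1)
        result.append({**row, out_key: pct})
    return result


# Placeholder table: one row per area, with both columns already present.
table = [
    {"area": "A", "pop_total": 2000, "pop_65_plus": 300},
    {"area": "B", "pop_total": 0, "pop_65_plus": 0},
]

for row in add_percentage(table, "pop_65_plus", "pop_total", "pct_65_plus"):
    print(row["area"], row["pct_65_plus"])
```

    Normalizing this way matters for color-coded maps: shading by raw counts mostly shows where the most people live, while shading by percentage shows where older residents make up a larger share.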

    Runs on: Linux, Unix, Mac OS X, Windows. (This is one case where installation is more

    complicated on OS X, since it requires manual installation of several dependencies. There's

    a one-click installer for Windows.)

    Skill level: Intermediate to expert.

    Learn more: Timothy Barmann of The Providence Journal posted two very useful tutorials

    for the CAR conference that are still available: Introduction to QGIS and The Latest in

    Mapping With JavaScript and jQuery. Barmann also offers a sample: Rhode Island's Ethnic

    Mosaic. Another resource to help you get started: QGIS Tutorial Labs from Richard E. Plant,

    professor emeritus at the University of California, Davis.

    Figure: Quantum GIS (QGIS) offers full-fledged geospatial visualization and analysis on the desktop.

    Note: If you're interested in GIS and want to consider other free software options, download

    this PDF listing of Open Source/Non-Commercial GIS Products. And if you're looking for a

    free open-source desktop GIS program that might be fairly easy to use, Jacob Fenton,

    director of computer-assisted reporting at American University's Investigative Reporting

    Workshop, recommends taking a look at the System for Automated Geoscientific Analyses

    (SAGA) site. Finally, if analyzing geographic data in a conventional database sounds

    interesting, PostGIS "spatially enables" the PostgreSQL relational database, according to the

    site.

    Web-based GIS/mapping

    Most of us are familiar with mapping tools from major companies like Google (which has a

    number of third-party front ends such as Map A List, an add-on that adds info to a Google

    Map from a spreadsheet). There's also Yahoo Maps Web Services and Bing Maps -- all with

    APIs. But there are numerous other options from smaller organizations or lone open-source

    enthusiasts that were designed from the ground up to map geographic data.

    OpenHeatMap

    What it does: This user-friendly website generates color-coded maps; the colors change

    depending on underlying info such as population change or average income. It can also

    place markers on a map, varying the size of the markers based on a data table.

    In addition to providing the Web-based service, author Pete Warden has also packaged

    OpenHeatMap as a jQuery plug-in for those who don't want to rely on hosting at

    OpenHeatMap.com. However, not all data formats work correctly when hosted locally. "My

    recommended way is to embed the maps from the site," Warden wrote via Skype chat.

    What's cool: It is astonishingly easy to create a color-coded map from many types of

    location data -- even IP addresses (just use the column header ip_address).

    It took me about 60 seconds to create a basic map from a spreadsheet of magnitude 7 or

    higher earthquakes around the world since Jan. 1, 2000, then a couple of minutes more to

    customize the rollover box to display both date and magnitude. (You can see a larger version

    on OpenHeatMap.com.)

    Figure: OpenHeatMap is extremely easy to use for creating data-based maps, although there are still occasional bugs in this well-thought-out service.

    Marker transparency, size and color are extremely simple to customize; you can also upload

    your own marker image, and customize what appears in the tooltips rollover by adding a

    tooltip column to your data source.

    OpenHeatMap automatically figures out and maps locations based on a wide range of place

    definitions, relying on how the location columns are named -- "address," "country,"

    "fips_code" (used by the U.S. Census Bureau), "zip_code_area" (for five-digit ZIP codes),

    "lat" (latitude), "lon" (longitude) and so on.
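
    Since the service keys off column names, preparing a file for upload mostly means getting the header row right. The sketch below writes such a file with Python's csv module. The "lat", "lon" and "tooltip" headers come from the description above; "value" is my assumption for the column that drives marker size or color, and the rows are placeholders rather than real earthquake records.

```python
import csv
import io

# "lat", "lon" and "tooltip" are named in the article; "value" is an assumed
# header for the number that controls marker size or color.
COLUMNS = ["lat", "lon", "value", "tooltip"]


def to_upload_csv(events):
    """Return CSV text whose header row uses the location column names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for event in events:
        tooltip = f"{event['date']}: magnitude {event['magnitude']}"
        writer.writerow([event["lat"], event["lon"], event["magnitude"], tooltip])
    return buffer.getvalue()


# Placeholder rows, not real earthquake data.
events = [
    {"date": "2001-01-01", "lat": 10.0, "lon": 20.0, "magnitude": 7.1},
    {"date": "2002-02-02", "lat": -5.5, "lon": 150.25, "magnitude": 7.4},
]

csv_text = to_upload_csv(events)
print(csv_text)
```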

    This is a well-thought-out interface from a onetime Apple engineer. (Warden said he worked

    on several software projects at Apple, including Final Cut Studio.)

    Drawbacks: There's no way to delete data once it's been uploaded (you can get around this

    by using a Google Spreadsheet as a data source), and editing time is limited to as long as

    your browser is open and you haven't started a new map. Embedded OpenHeatMap.com-

    hosted maps may be slow to load.

    The documentation doesn't make it clear whether you can set where the map is centered or

    what the default zoom level should be; Warden told me by e-mail that the system remembers

    where you last positioned and zoomed the map before saving. And this feature still can

    occasionally be buggy, although Warden is responsive to bug reports.

    Skill level: Beginner.

    Runs on: Web browsers enabled for Flash or HTML 5 Canvas.

    Learn more: Its title notwithstanding, the four-minute video "How OpenHeatMap Can Help

    Journalists" offers a clear explanation for anyone interested in using the service. You can

    also view samples on the OpenHeatMap Gallery and check out this Guardian interactive map

    of where Facebook is used.

    OpenLayers

    What it does: OpenLayers is a JavaScript library for displaying map information. It's aimed

    at providing functionality similar to those big companies' code libraries -- but with open-

    source code. OpenLayers works with OpenStreetMap and other maps, as this tutorial about

    use with Google shows.

    Other projects build on it to add functionality or ease of use, such as GeoExt, which adds

    more GIS capabilities. For users who are comfortable hand-coding JavaScript and prefer not to use a commercial platform such as Google or Bing, this can be a compelling option.

    Drawbacks: OpenLayers is not yet as developed or as easy to use as, say, Google Maps.

    The project page notes that it is "still undergoing rapid development."

    Skill level: Expert.

    Runs on: Any Web browser.

    Learn more: Try this OpenLayers Simple Example. A good sample is Ushahidi's Haiti map.

    There are other JavaScript libraries for overlaying information on maps, such as Polymaps.

    And there are a number of other mapping platforms, such as Google Maps, which offers numerous mapping APIs; Yahoo Maps Web Services, with its own APIs; the Bing Maps

    platform and APIs; and GeoCommons.


    What's cool: TimeFlow makes it incredibly easy to interact with data in various ways, such

    as switching views or filtering by criteria such as date ranges or earthquakes of magnitude 8

    or more. The timeline view offers a slider so you can zero in on a time period. While many

    applications can plot bar graphs, fewer also offer calendar views. And unlike Web-based

    Google Fusion Tables, TimeFlow is a desktop application that makes it quick and painless to

    edit individual entries.

    Drawbacks: This is an alpha release designed to help individual reporters doing

    investigative work. There are no facilities for publishing or sharing results other than taking a

    screen snapshot, and additional development appears unlikely in the near future.

    Skill level: Beginner.

    Runs on: Desktop systems running Java 1.6, including Windows and Mac OS X.

    Learn more: Check out Top tips.

    Note: If you're looking to publish visualized timelines, better options include Google Fusion

    Tables, VIDI or the SIMILE Timeline widget.

    Text/word clouds

    Some data visualization geeks think word clouds are either not very serious or not very

    original. You can think of them as the tiramisu of visualizations -- once trendy, now overused.

    But I still enjoy these graphics that display each word from a text file once, with the size of

    the words varying depending on how often each one appears in the source.

    IBM Word-Cloud Generator

    What it does: Several tools mentioned previously can create word clouds, including Many

    Eyes and the Google Visualization API, as well as the website Wordle (which is a handy tool

    for making word clouds from websites instead of text files). But if you're looking for easy

    desktop software dedicated to the task, IBM's free Word-Cloud desktop application fits the

    bill.

    What's cool: This is a quick, fun and easy way to find frequency of words in text.

    Drawbacks: Because it's trying to ignore words such as "a" and "the," the basic configuration can miss some important terms. In my tests, it didn't know the difference between "it" and

    "IT," and completely missed "AT&T."
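
    The counting behind a word cloud, and the way common-word filtering can swallow terms like "IT" and "AT&T," can be illustrated with a short Python sketch. This is not IBM's code: the stop-word list, the sample sentence and the keep_acronyms option are all invented here to show the general idea.

```python
import re
from collections import Counter

# A tiny, made-up list of common words to ignore.
STOP_WORDS = {"a", "an", "the", "it", "is", "and", "of", "to", "in"}


def word_counts(text, keep_acronyms=False):
    """Count words, optionally protecting all-caps terms such as IT or AT&T."""
    # The second pattern lets an ampersand join letter runs, so AT&T stays whole.
    pattern = r"[A-Za-z]+(?:&[A-Za-z]+)*" if keep_acronyms else r"[A-Za-z]+"
    counts = Counter()
    for token in re.findall(pattern, text):
        if keep_acronyms and len(token) > 1 and token.isupper():
            counts[token] += 1      # keep acronyms exactly as written
            continue
        word = token.lower()        # case folding merges "It" and "IT"
        if word not in STOP_WORDS:
            counts[word] += 1
    return counts


sample = "IT budgets grew. It is AT&T that leads IT spending."
naive = word_counts(sample)
careful = word_counts(sample, keep_acronyms=True)
print(naive.most_common())
print(careful.most_common())
```

    In the naive count, "IT" disappears along with "it" and "AT&T" is split into fragments; the second call keeps both. In a word cloud, each count would then set the size of the corresponding word.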

    Skill level: Advanced beginner. This app runs on the command line, so users should be

    able to find file paths and plug them into a sample command.

    Runs on: Windows, Mac OS X and Linux running Java.

    Learn more: Check the examples that come with the download.

    Social and other network analysis

    These tools use a pre-Facebook/Twitter definition of "social network analysis" (SNA),

    referring to the discipline of finding connections between people based on various data sets.

    Investigative journalists have used such tools to, for example, find links between people who

    are involved in development projects or who are members of various boards of directors.

    Figure: TimeFlow offers a number of different ways to easily visualize data with an important time component.

    An understanding of statistical theories of network node analysis is necessary in order to use

    this category of software. Since I've only had a very basic introduction to that discipline, this

    is one category of tools I did not test hands-on. But if you're seeking software to do such

    analysis, one of these might meet your needs.

    Gephi

    What it does: Billed as a Photoshop for data, this open-source beta project is designed for

    visualizing statistical information, including relationships within networks of up to 50,000

    nodes and half a million edges (connections or relationships) as well as network analyses of

    factors such as "betweenness," closeness and clustering coefficient.
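
    For readers unfamiliar with these measures, the Python sketch below computes two of them, closeness and clustering coefficient, on a toy network using common textbook definitions. It says nothing about how Gephi implements them, and the names and connections are invented.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected graph (node -> set of neighbors) from an edge list."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def clustering_coefficient(graph, node):
    """Share of a node's neighbor pairs that are themselves connected."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def closeness(graph, node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    distances = {node: 0}
    queue = deque([node])
    while queue:                      # breadth-first search from the node
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# Invented connections, for example people who sit on the same boards.
edges = [("Ann", "Bob"), ("Ann", "Cy"), ("Bob", "Cy"), ("Cy", "Dee")]
graph = build_graph(edges)
for person in sorted(graph):
    print(person, clustering_coefficient(graph, person), closeness(graph, person))
```

    In this toy network, Cy is directly connected to everyone and so has the highest closeness, while Ann and Bob have the highest clustering because their contacts all know each other.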

    Runs on: Windows, Linux, Mac OS X running Java 1.6.

    Learn more: Try this Quick Start tutorial (PDF).

    NodeXL

    What it does: This Excel plug-in displays network graphs from a given list of connections,

    helping you analyze and see patterns and relationships in the data.

    NodeXL merges the older and current definitions of SNA. It's "optimized for analyzing online

    social media -- it includes built-in connections to query the APIs of Twitter, Flickr and

    YouTube, allowing you to draw networks of users and their activity," according to Peter

    Aldhous, San Francisco bureau chief for New Scientist magazine.

    It also handles e-mail and conventional network analysis files (including data created by the

    popular -- but not free -- analysis tool UCINET).

    Runs on: Excel 2007 and 2010 on Windows.

    Learn more: Download this detailed free NodeXL tutorial (PDF) or these basic step-by-step

    instructions on analyzing your own Facebook social network (PDF). One Facebook app for

    downloading your own friend information for use in NodeXL is Name Gen Web.

    Figure: Gephi can visualize networks of up to 50,000 nodes.


    Sharon Machlis is online managing editor at Computerworld. Her email address is [email protected]. You can follow her on Twitter @sharon000, on Facebook

    or by subscribing to her RSS feeds: articles | blogs.
