22 free tools for data visualization and analysis
7/31/2019 22 free tools for data visualization and analysis
Got data? These useful tools can turn it into informative, engaging graphics.
Sharon Machlis
April 20, 2011 (Computerworld)
You may not think you've got much in common with an investigative journalist or an
academic medical researcher. But if you're trying to extract useful information from an ever-
increasing inflow of data, you'll likely find visualization useful -- whether it's to show patterns
or trends with graphics instead of mountains of text, or to try to explain complex issues to a
nontechnical audience.
There are many tools around to help turn data into graphics, but they can carry hefty price
tags. The cost can make sense for professionals whose primary job is to find meaning in
mountains of information, but you might not be able to justify such an expense if you or your
users only need a graphics application from time to time, or if your budget for new tools is
somewhat limited. If one of the higher-priced options is out of your reach, there are a
surprising number of highly robust tools for data visualization and analysis that are available
at no charge.
Here's a rundown of some of the better-known options, many of which were demonstrated at the
Computer-Assisted Reporting (CAR) conference last month. Others are not as well known but
show great promise. They range from easy enough for a
beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring
hands-on coding). But they all share one important characteristic: They're free. Your only
investment: time.
Data cleaning
Before you can analyze and visualize data, it often needs to be "cleaned." What does that
mean? Perhaps some entries list "New York City" while others say "New York, NY" and you
need to standardize them before you can see patterns. There might be some records with
misspellings or numerical data-entry errors. The following two tools are designed to help get
your data in tip-top shape to be analyzed.
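As a minimal sketch of what standardizing such entries involves, the Python snippet below maps known variants of a place name to one canonical spelling. The lookup table and the function name are illustrative assumptions, not part of either tool reviewed here.

```python
# Hypothetical lookup table: every known variant maps to one canonical name.
CANONICAL = {
    "new york city": "New York, NY",
    "new york, ny": "New York, NY",
    "nyc": "New York, NY",
}

def standardize(value: str) -> str:
    """Return the canonical form of a place name, or the trimmed input if unknown."""
    key = " ".join(value.lower().split())  # lowercase and collapse whitespace
    return CANONICAL.get(key, value.strip())

records = ["New York City", "new york, NY ", "Boston, MA"]
cleaned = [standardize(r) for r in records]
print(cleaned)
```

A hand-built table like this only works for variants you already know about, which is why the tools below try to suggest groupings for you.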
DataWrangler
What it does: This Web-based service from Stanford University's Visualization Group is
designed for cleaning and rearranging data so it's in a form that other tools such as a
spreadsheet app can use.
Click on a row or column, and DataWrangler will suggest changes. For example, if you click
on a blank row, several suggestions pop up such as "delete row" or "delete empty rows."
There's also a history list that allows for easy undo -- a feature that's also available in Google
Refine (reviewed next).
Want to see all the tools at once?
For quick reference, check out our chart listing 22 free data
visualization tools.
Page 1 of 2122 free tools for data visualization and analysis
21/04/2011http://www.computerworld.com/s/article/print/9215504/22_free_tools_for_data_visua...
What's cool: Text editing is especially easy. For example, when I selected "Alabama" in one
row of sample data headlined "Reported crime in Alabama" and then selected "Alaska" in the
next group of data, it led to a suggestion to extract every state name. Hover your mouse over
a suggestion, and you can see affected rows highlighted in red.
Drawbacks: I found that unexpected changes occurred as I attempted to explore
DataWrangler's options; I constantly had to click "clear" to reset. And not all suggestions are
useful ("promote row to header" seemed an odd suggestion when the row was blank) or easy
to understand ("fold split 1 using 2 as key").
And while the fact that DataWrangler is a Web-based service makes it convenient to use,
don't forget that it sends your data off to an external site -- which means it isn't an option for
sensitive internal information. However, there are plans for a future release of a stand-alone
desktop version. Another important thing to keep in mind is that DataWrangler is currently
alpha code, and its creators say it's "still a work in progress."
Skill level: Advanced beginner.
Runs on: Any Web browser.
Learn more: There's a screencast on the DataWrangler home page. Also, see this post on
using DataWrangler to format data (from Tableau Public's blog).
[Figure: DataWrangler helps format table data so it can be better used and analyzed by other applications.]
Google Refine
What it does: Google Refine can be described as a spreadsheet on steroids for taking a first
look at both text and numerical data. Like Excel, it can import and export data in a number of
formats including tab- and comma-separated text files and Excel, XML and JSON files.
[Figure: Google Refine can make data 'cleaner' by helping to find errors or different versions of the same proper names.]
Refine features several built-in algorithms that find text items that are spelled differently but
actually should be grouped together. After importing your data, you simply select "edit cells -->
cluster and edit" and select which algorithm you want to use. After Refine runs, you decide
whether to accept or reject each suggestion. For example, you could say yes to combining
Microsoft and Microsoft Corp., but no to combining Coach Inc. with CQG Inc. If it's offering
too few or too many suggestions, you can change the strength of the suggestion function.
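To give a rough idea of how this kind of grouping can work, here is a simplified "key collision" sketch in Python: each value is reduced to a normalized key, and values that share a key are suggested as one cluster. This is an illustration under my own assumptions (the function names and the normalization steps are hypothetical), not a description of the algorithms Refine actually runs.

```python
import string
from collections import defaultdict

def fingerprint(value: str) -> str:
    """Normalize a string so that trivially different spellings share one key."""
    lowered = value.strip().lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(no_punct.split()))  # order and duplicates no longer matter
    return " ".join(tokens)

def cluster(values):
    """Group values by fingerprint; return only groups with more than one spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(g) for g in groups.values() if len(g) > 1]

names = ["Microsoft Corp.", "microsoft corp", "Corp. Microsoft", "Coach Inc.", "CQG Inc."]
suggestions = cluster(names)
print(suggestions)
```

Note that a strict key like this would not merge "Microsoft" with "Microsoft Corp."; catching that pair takes a looser, similarity-based method, which is also where false positives start to creep in.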
There are also numerical options that offer quick and easy overviews of data distributions.
This functionality can reveal anomalies that might be the result of data input errors -- such as
$800,000 instead of $80,000 for a salary entry, or it could expose inconsistencies -- such as
differences in the way compensation data is reported from entry to entry, with some showing,
say, hourly wages and others showing weekly pay or yearly salaries.
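A minimal sketch of the same idea in code, assuming a simple rule of my own choosing: flag any value that is several times larger or smaller than the median. The function name, the factor of 5 and the salary figures are all made up for illustration, and the rule assumes positive values.

```python
from statistics import median

def flag_suspect(values, factor=5.0):
    """Return values at least `factor` times larger or smaller than the median."""
    mid = median(values)
    return [v for v in values if v >= mid * factor or v <= mid / factor]

salaries = [78_000, 80_000, 82_000, 800_000, 79_000]  # made-up figures
suspects = flag_suspect(salaries)
print(suspects)
```

A rule this crude would also flag hourly wages mixed in with yearly salaries, which is exactly the kind of inconsistency worth a second look.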
Beyond data housekeeping, Google Refine offers some useful analysis tools, such as sorting
and filtering.
What's cool: Once you get used to which commands do what, this is a powerful tool for data
manipulation and analysis that strikes a good balance between functionality and ease of use.
The undo/redo list of every action you've taken lets you roll back when needed. And text
functions handle Java-syntax regular expressions, allowing you to look for patterns (such as,
say, three numbers followed by two digits) as well as specific text strings and numbers.
Finally, while this is a browser-based application, it works with files on your desktop, so your
data remains local.
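For readers new to regular expressions, here is a small pattern search of that kind. It is written with Python's re module rather than in Refine, and the pattern itself (three letters followed by two digits) is an assumed example; for a pattern this simple, the syntax is the same in Java.

```python
import re

# Assumed pattern: exactly three letters followed by exactly two digits, as a whole word.
CODE = re.compile(r"\b[A-Za-z]{3}\d{2}\b")

text = "Codes ABC12 and xyz99 are valid; AB123 and ABCD12 are not."
matches = CODE.findall(text)
print(matches)
```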
Drawbacks: Although Google Refine looks like a spreadsheet, you can't do typical
spreadsheet calculations with it; for that, you must export to a conventional spreadsheet
application. If you've got a large data set, carve out some time in your day to go through all of
Refine's suggested changes, since it can take a while. And, depending on the data set, be
prepared when looking for text items to merge: You're likely to get either a lot of false
positives or missed problems -- or both.
Skill level: Advanced beginner. Knowledge of data analysis concepts is more important than
technical prowess; power Excel users who understand data-cleaning needs should be
comfortable with this.
Runs on: Windows, Mac OS X (if it appears to do nothing after loading on a Mac, point a
browser manually to http://127.0.0.1:3333/ ), Linux.
Learn more: These three screencasts give a good overview of why and how you'd use
Refine; there's also fairly detailed documentation on the Google Code project area.
Statistical analysis
Sometimes you need to combine graphical representation of your data with heftier numerical
analysis.
The R Project for Statistical Computing
What it does: R is a general statistical analysis platform (the authors call it an "environment")
that runs on the command line. Need to find means, medians, standard deviations,
correlations? R can handle that and much more, including "linear and generalized linear
models, nonlinear regression models, time series analysis, classical parametric and
nonparametric tests, clustering and smoothing," according to the project website.
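To make those terms concrete, the sketch below computes a mean, median, standard deviation and correlation on a tiny made-up data set. It uses Python's standard library rather than R, so treat it as an illustration of the calculations, not of R's commands.

```python
from math import sqrt
from statistics import mean, median, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]       # made-up sample data
output = [2, 4, 6, 8, 10]

summary = {
    "mean": mean(hours),
    "median": median(hours),
    "stdev": stdev(hours),    # sample standard deviation
    "correlation": pearson(hours, output),
}
print(summary)
```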
R also graphs, charts and plots results. There are numerous add-ons to this open-source
project that significantly extend functionality. For users who prefer a GUI, Peter Aldhous, San
Francisco bureau chief for New Scientist magazine, suggests RExcel, which offers access to
the R engine through Excel.
What's cool: There is a great deal of functionality in R, including quite a number of
visualization options as well as numerical and spatial analysis.
Drawbacks: The fact that R runs on the command line means that users will have to take the
time to learn which commands do what, and not all users will be comfortable with a text-only
interface. In addition, Aldhous says those dealing with large data sets may hit a memory
barrier (if so, there's a commercial option from Revolution Analytics).
Skill level: Intermediate to advanced. Comfort with command-line prompts and a knowledge
of statistics are musts for the core application.
Runs on: Linux, Mac OS X, Unix, Windows XP or later.
Learn more: Try R for Statistics: First Steps (PDF) by Peter Aldhous, Hands-on R, a step-
by-step tutorial (PDF) by Jacob Fenton, and the project's own An Introduction to R. The R
Statistics blog has a number of visualization samples.
Visualization applications and services
These tools offer a number of different visualization options. While some stick to conventional
charts and graphs, many offer a range of other choices such as treemaps and word clouds. A
few offer geographical mapping as well, although if you're interested in maps, our sections on
GIS/mapping focus specifically on that.
Google Fusion Tables
What it does: This is one of the simplest ways I've seen to turn data into a chart or map. You
can upload a file in several different formats and then choose how to display it: table, map,
heatmap, line chart, bar graph, pie chart, scatter plot, timeline, storyline or motion (animation
over time). It's somewhat customizable, allowing you to change map icons and style infowindows.
[Figure: The R Project for Statistical Computing provides a wide range of data analysis options.]
There are some data editing functions within Fusion Tables, although changing more than a
few individual cell entries can quickly become tedious. You can also join tables (which is
important when the data you want to map is in multiple tables), and filter, sort and add
columns and so on. There are also options to allow others to make comments on the data
itself.
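If joining tables is unfamiliar, this minimal Python sketch shows the idea: rows from two tables are merged wherever they share a key value. The tables, column names and percentages are made up for illustration, and the helper function is hypothetical; it is not how Fusion Tables implements joins.

```python
# Two made-up tables keyed by state abbreviation.
boundaries = [
    {"state": "MA", "name": "Massachusetts"},
    {"state": "NY", "name": "New York"},
]
internet_access = [
    {"state": "NY", "pct_households": 60.0},
    {"state": "MA", "pct_households": 65.0},
]

def join_on(left, right, key):
    """Inner join: merge rows from both tables that share the same key value."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

joined = join_on(boundaries, internet_access, "state")
print(joined)
```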
Mapping goes beyond just placing points, as many of us are accustomed to with Google
Maps. Fusion Tables can also map multiple polygons with variations in color based on
underlying data, such as this intensity map showing the percentage of households with
Internet access by state from 2007 U.S. Census Bureau data.
The Knight Digital Media Center notes that a handy undocumented feature allows the use of
Fusion Tables' "templating" export to generate a JSON file from data in other formats. JSON
is required by some APIs and JavaScript libraries.
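For comparison, converting tabular data to JSON outside Fusion Tables takes only a few lines of code. The sketch below uses Python's standard csv and json modules on made-up CSV content; it does not reproduce the undocumented templating feature mentioned above.

```python
import csv
import io
import json

# Made-up CSV content standing in for an uploaded file.
csv_text = "state,pct_households\nMA,65.0\nNY,60.0\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))  # one dict per data row
json_text = json.dumps(rows, indent=2)
print(json_text)
```

Every value comes out as a string here; numeric columns would need an explicit conversion before a charting library could treat them as numbers.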
Unlike IBM's Many Eyes, Google lets you designate your data as private or unlisted as well
as public, although your data still resides on Google's servers -- a benefit or drawback,
depending on whether server bandwidth costs or data privacy is more important to you.
What's cool: Fusion Tables offers relatively quick charting and mapping, including
geographic information system (GIS) functions to analyze data by geography. The service
also automatically geocodes addresses, which is useful when trying to place numerous
points on a map. This is an excellent tool for beginners and advanced beginners to use to get
comfortable with analyzing data; it's also a good fit for people who don't program. For more
advanced users, there's an API.
Drawbacks: Functionality, customization and data capacity are all limited compared with
desktop applications or custom code, and interacting with large data sets on the site can be
sluggish. And it has its limitations -- the site choked on March 11, the day of the devastating
earthquake and tsunami in Japan. (It is still a Google Labs beta project.)
[Figure: Google Fusion Tables is a user-friendly tool that makes it easy to map data.]
Skill level: Beginner.
Runs on: Any Web browser.
Learn more: A Google Fusion Tables tour and several tutorials are available. We've also got
some examples of what it can do in our story "H-1B Visa Data: Visual and Interactive Tools."
Also see the Fusion Tables Example Gallery.
Impure
What it does: Impure is sort of a Yahoo Pipes for data visualization, designed for creating
numerous types of highly polished graphical representations of data using a drag-and-drop
workspace. The service includes a library of objects and various methods, and -- as with
Yahoo Pipes -- it allows you to click and drag to connect modules so that the output of one
becomes the input of another. It was developed by Spanish analytics firm Bestiario.
What's cool: Impure offers a highly visual interface for the task of creating visualizations --
which is not as common as you might expect. It has a sleek user interface and numerous
modules, including quite a few APIs that are designed to pull data from the Web. It features
numerous visualization types that are searchable by keywords like numeric, tables, nodes,
geometry and map. And although it saves your workspaces to the Web, you can copy and
save the code behind your workspaces locally, so you can back up your work or maintain
your own libraries of code snippets.
Drawbacks: Users of Impure face a surprisingly steep learning curve despite its drag-and-
drop functionality. The documentation is detailed in some areas, but lacking in others. For
instance, while it was easy to find a list of APIs, it was more difficult to find basic instructions
on how to use the workspace -- or even figure out that there was a workspace, let alone how
to use the various objects and methods.
Once you save your workspace, it's on the public Web, although it's unlikely that anyone else
will be able to find it unless you share the URL. And I found some of the samples not all that
helpful in understanding the underlying data, even if they were visually striking.
Skill level: Intermediate.
Runs on: Any Web browser.
Learn more: To get started, I'd suggest the videos "Interface Basics" (7 minutes) and
"Workspaces and Code." You can find a sample called The Pay Gap Between Men and
Women Mapped at the website of British newspaper The Guardian.
Tableau Public
What it does: This tool can turn data into any number of visualizations, from simple to
complex. You can drag and drop fields onto the work area and ask the software to suggest a
visualization type, then customize everything from labels and tool tips to size, interactive
filters and legend display.
What's cool: Tableau Public offers a variety of ways to display interactive data. You can
combine multiple connected visualizations onto a single dashboard, where one search filter
can act on numerous charts, graphs and maps; underlying data tables can also be joined.
And once you get the hang of how the software works, its drag-and-drop interface is
considerably quicker than manually coding in JavaScript or R for most users, making it more
likely that you'll try additional scenarios with your data set. In addition, you can easily perform
calculations on data within the software.
Drawbacks: In the free version of Tableau's business intelligence software, your
visualization and data must reside on Tableau's site. Whenever you save your work, it gets
sent up to the public website -- which means you can't save work in progress without running
the risk that it will be seen before it's ready (while Tableau's site won't deliberately expose
your work, it relies on security by obscurity -- so someone could see your work if they guess
your URL). And once it's saved, viewers are invited to download your entire workbook with
data. Upgrading to a single-user desktop edition costs $999.
Not surprisingly, all that functionality comes at a cost: Tableau's learning curve is fairly steep
compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it'll take more
than an hour or two to learn how to use the software's true capabilities, although you can get
up and running doing simple charts and maps before too long.
Skill level: Advanced beginner to intermediate.
Runs on: Windows 7, Vista, XP, 2003, Server 2008, 2003.
Learn more: There are seven short training videos on the Tableau site, where you can also
find downloadable data files that you can use to follow along.
You can see a sample in our article "Tech Unemployment Climbs; Self-employment Steady."
[Figure: Tableau Public can turn data into any number of visualizations, from simple to complex.]
Many Eyes
A pioneer in Web-based data visualization, IBM's Many Eyes project combines graphical
analysis with community, encouraging users to upload, share and discuss information. It's
extremely easy to use and very well documented, including suggestions on when to use what
kind of visual data representation. Many Eyes includes more than a dozen output options --
from charts, graphics and word clouds to treemaps, plots, network diagrams and some
limited geographic maps.
You'll need a free account to upload and post data, although anyone can browse. Formatting
is basic: For most visualizations, the data must be in a tab-separated text file with column
headers in the first row.
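If your data lives somewhere other than a spreadsheet, producing that format is straightforward. This is a minimal Python sketch that writes tab-separated text with the column headers in the first row; the employer names and counts are invented placeholders.

```python
import csv
import io

# Made-up rows; the first row holds the column headers.
rows = [
    ["Employer", "Visas"],
    ["Example Corp", 120],
    ["Sample LLC", 85],
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv_text = buffer.getvalue()
print(tsv_text)
```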
It took me about three minutes to create a bar chart of top H-1B visa employers.
It took perhaps another minute to create a treemap of the same data.
What's cool: Visualization can't get much easier, and the results look considerably more
sophisticated than you'd expect based on the minimal amount of effort needed to create
them. Plus, the list of possible visualization types includes explanations of the types of data
each one is best suited for.
[Figure: It takes just a few minutes to create online charts like this with Many Eyes.]
[Figure: Many Eyes offers a number of ways to visualize data, such as treemaps.]
Drawbacks: Both your visualizations and your data sets are public on the Many Eyes site
and can be easily downloaded, shared, reposted and commented upon by others. This can
be great for certain types of users -- especially government agencies, nonprofits, schools and
other organizations that want to share visualizations on someone else's server budget -- but
an obvious problem for others. (IBM does offer a contact form for businesses interested in
hosting their own version of the software.) In addition, customization is limited, as is data file
size (5MB).
Skill level: Beginner.
Runs on: Java and any modern Web browser that can display Flash.
Learn more: IBM's website features pages explaining data formatting for Many Eyes and
visualization choices.
You can see some featured visualizations on the Many Eyes home page or browse through
some of the tens of thousands of uploads. One interesting map shows popular surnames in
the U.S. from the 2000 Census by Martin Wattenberg, one of the creators of Many Eyes.
VIDI
What it does: Although VIDI's website bills this as a tool for the Drupal content management
system, graphics created by the site's visualization wizard can be used on any HTML page --
no Drupal required.
Upload your data, select a visualization type, do a bit of customization selection, and your
chart, timeline or map is ready to use via auto-generated embed code (using an iframe, not
JavaScript or Flash).
What's cool: This is about as easy as Many Eyes -- with more mapping options and no need
to make your visualization and data set public on its website. There are quick screencasts
explaining each visualization type and several different color customization options. And the
file-size limit of 30MB is six times Many Eyes' 5MB maximum.
[Figure: Graphics created by VIDI's visualization wizard can be used on any HTML page -- no Drupal required.]
Drawbacks: Oddly, the visualization wizard was a lot easier to use than the embed code --
my embedded iframe didn't display while trying to preview it on the VIDI website; I needed to
save the visualization and go to the "My VIDI" page to get embed code that actually worked.
Also, as with any cloud service, if you're using this for Web publishing, you'll want to feel
confident that the host's servers can handle your traffic and will be available longer than your
need to display the data.
Skill level: Beginner.
Runs on: Any Web browser.
Learn more: The VIDI home page features a link to an 11-minute video tutorial.
It took me less than five minutes to create a sample: a map of earthquakes of 7.0 magnitude
or more since Jan. 1, 2000.
Zoho Reports
What it does: One of the more traditional corporate-focused business analytics offerings in
this group, Zoho Reports can take data from various file formats or directly from a database
and turn it into charts, tables and pivot tables -- formats familiar to most spreadsheet users.
What's cool: You can schedule data imports from sources on the Web. Data can be queried
using SQL and can be turned into visualizations, and the service is set up for Web publishing
and sharing (although if it's accessed by more than two users, you will need a paid account).
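As an illustration of the kind of SQL query that typically feeds a chart, the sketch below totals sales by region. It runs against a throwaway SQLite database from Python's standard library, not against Zoho Reports, and the table, column names and figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],  # made-up figures
)

# Total sales per region, largest first: the shape a bar chart would need.
totals = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()
print(totals)
```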
Drawbacks: Visualization options are fairly basic and limited. Interacting live with the Web-based data can be sluggish at times. Data files are limited to 10MB. I found the navigation
confusing at times -- for example, after I saved a copy of a sample database, I was told it was
in the folder "My reports," yet I had a hard time finding that.
Skill level: Advanced beginner.
Runs on: Any Web browser.
Learn more: There are video demos and samples on Zoho's website.
[Figure: Zoho Reports provides traditional business charts and graphs.]
Code help: Wizards, libraries, APIs
Sometimes nothing can substitute for coding your own visualization -- especially if the look
and feel you're after can't be achieved with an existing desktop or Web app. But that
doesn't mean you need to start from scratch, thanks to a wide range of available libraries and
APIs.
Choosel (under development)
What it does: This open-source Web-based framework is designed for charts, clouds,
graphs, timelines and maps. Right now, it is geared more for developers who create
applications than it is for end users who need to save and/or embed their work; but there's an
interactive online demo that lets you quickly upload some data to visualize.
What's cool: As with Tableau Public, you can have more than one visualization on a page
and connect them so that, for example, mousing over items on a chart will highlight
corresponding items on a map.
Drawbacks: This is not yet an application that end users can use to store and share their
work. And I found the online demo to be finicky about uploading data -- even after I corrected
field formats for dates (dd/mm/yyyy) and location (latitude/longitude) as documented, my
data wouldn't load until I had another text field added (rather than just having numerical
fields). It was also unclear how to customize labels. This project shows promise if it's further
developed and documented.
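Checking field formats before uploading can save some of that trial and error. The sketch below validates a dd/mm/yyyy date and a location, assuming the location is written as latitude and longitude separated by a slash; that reading of "latitude/longitude," along with the function name, is my assumption and not taken from Choosel's documentation.

```python
from datetime import datetime

def valid_row(date_text: str, location_text: str) -> bool:
    """Check a dd/mm/yyyy date and a 'latitude/longitude' pair."""
    try:
        datetime.strptime(date_text, "%d/%m/%Y")
        lat_text, lon_text = location_text.split("/")
        lat, lon = float(lat_text), float(lon_text)
    except ValueError:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180

checks = [
    valid_row("11/03/2011", "38.3/142.4"),   # well-formed
    valid_row("03/31/2011", "38.3/142.4"),   # month 31 does not exist
    valid_row("11/03/2011", "138.3/142.4"),  # latitude out of range
]
print(checks)
```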
Skill level: Expert.
Runs on: Chrome, Safari and Firefox.
Learn more: There's a short video called Choosel -- Timeline and Basic Features and a
sample titled Earthquakes With 1,000 or More Deaths Since 1900.
Exhibit
What it does: This spin-off of the MIT Simile Project is designed to help users "easily create
Web pages with advanced text search and filtering functionalities, with interactive maps,
timelines and other visualization." Billed as a publishing framework, the JavaScript library
allows easy additions of filters, searches and more. The Easy Data Visualization for
Journalists page offers examples of the code in use at a number of newspaper websites.
[Figure: Still under development, Choosel has potential as an easy way to create online graphics.]
Of course, "easy" is in the eye of the beholder -- what's easy for the professionals at MIT who
created Exhibit might not be that simple for a user whose comfort level stops at Excel. Like
most JavaScript libraries, Exhibit requires more hand-coding than services such as Many
Eyes and Google Fusion Tables. On the other hand, Exhibit has clear documentation for
beginners, even those with no JavaScript experience.
What's cool: For those who are comfortable coding, Exhibit offers a number of views --
maps, charts, timeplots, calendars and more -- as well as customized lenses (ways to format
an individual record) and facets (properties that can be searched or sorted). You're much
more likely to get the exact presentation you want with Exhibit than, say, Many Eyes. And
your data stays local unless and until you decide to publish.
Drawbacks: For newcomers unused to coding visualizations, it takes time to get familiar with
coding and library syntax.
Skill level: Expert.
Learn more: There are a number of examples you can look at, including Red Sox-Yankees
Winning Percentages Through the Years, U.S. Cities by Population and others.
Note: There are numerous other JavaScript libraries to help create visualizations, such as the
recently released Data-Driven Documents and the jQuery Visualize plug-in. Six Revisions' list
of 20 Fresh JavaScript Data Visualization Libraries gives you an idea of how many there are
to choose from.
Google Chart Tools
What it does: Unlike Google Fusion Tables, which is a full-fledged, self-contained
application for uploading and storing data, and generating charts and maps, Chart Tools is
designed to visualize data residing elsewhere, such as your own website or within Google
Docs.
[Figure: Google Chart Tools offers both a wizard and an API for creating Web graphics from data.]
Google offers both a Chart API using a "simple URL request to a Google chart server" for
creating a static image and a Visualization API that accesses a JavaScript library for creating
interactive graphics. Google offers a comparison of data size, page load, skills needed and
other factors to help you decide which option to use.
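To show what a "simple URL request" for a static chart can look like, here is a sketch that assembles one in Python. The endpoint and the parameter names (cht, chs, chd, chl) are given as I recall them from Google's image chart documentation; treat them as assumptions to verify, and note that the service may have changed or been retired since this article was written. The code only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Endpoint and parameter names as recalled from the image chart documentation;
# treat them as assumptions and verify before relying on them.
BASE_URL = "https://chart.googleapis.com/chart"

params = {
    "cht": "p3",            # chart type: 3-D pie
    "chs": "250x100",       # image size in pixels, width x height
    "chd": "t:60,40",       # data series in plain-text encoding
    "chl": "Hello|World",   # one label per slice
}

# Keep ':', ',' and '|' readable instead of percent-encoding them.
chart_url = BASE_URL + "?" + urlencode(params, safe=":,|")
print(chart_url)
```

The appeal of this approach is that the resulting URL can be dropped straight into an image tag, with no JavaScript required.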
For the simpler static graphics, there's a wizard to help you create a chart from some sample
formats; it goes as far as helping you input data row by row, although for any decent-size
data set -- say, more than half a dozen or so entries -- it makes more sense to format it in a
text file.
The visualization API includes various types of charts, maps, tables and other options.
What's cool: The static image chart is reasonably easy to use and features a Live Chart
Playground, which allows you to tweak code and see your results in real time.
The more robust API lets you pull data in from a Google spreadsheet. You can create icons
that mix text and images for visualizations, such as this weather forecast note, and what it
calls a "Google-o-meter" graphic. The Visualization API also has some of the best
documentation I've seen for a JavaScript library.
Drawbacks: The static charts tool requires a bit more work than some of the other Web-
based services, and it doesn't always offer lots of extras in return. And for the API, as with
other JavaScript libraries, coding is required, making this more of a programming tool than an
end-user business intelligence application.
Skill level: Advanced beginner to expert.
Runs on: Any Web browser.
Learn more: See Getting Started With Charts and Interactive Charts. There are also
samples in the Google Visualization API Gallery.
JavaScript InfoVis Toolkit
What it does: InfoVis is probably not among the best known JavaScript visualization
libraries, but it's definitely worth a look if you're interested in publishing interactive data
visualizations on the Web. The White House agrees: InfoVis was used to create the Obama
administration's Interactive Budget graphic.
What sets this tool apart from many others is the highly polished graphics it creates from just
basic code samples. InfoVis creator Nicolas García Belmonte, senior software architect at
Sencha Inc., clearly cares as much about aesthetic design as he does about the code, and it
shows.
What's cool: The samples are gorgeous and there's no extra coding involved to get nifty fly-in
effects. You can choose to download code for only the visualization types you want to use to
minimize the weight of Web pages.
Drawbacks: Since this is not an application but a code library, you must have coding
expertise in order to use it. Therefore, this might not be a good fit for users in an organization
who analyze data but don't know how to program. Also, the choice of visualization types is
somewhat limited. Moreover, the data should be in JSON format.
Skill level: Expert.
Runs on: JavaScript-enabled Web browsers.
Learn more: See demos with source code.
Protovis
What it does: Billed as a "graphical toolkit for visualization," this project from Stanford
University's Visualization Group is one of the more popular JavaScript libraries for turning data
into visuals; it's designed to balance simplicity with control over the display.
What's cool: One of the best things about Protovis is how well it's documented, with plenty of
examples featuring visualization and sample code. There are also a large number of sample
visualization types available, including maps and some statistical analyses. This is a
robust tool, capable of building graphics like this color-coded U.S. map with timeline slider.
Drawbacks: As is the case with
other JavaScript libraries, it's pretty much essential for users to have knowledge of
JavaScript (or at least some other programming language). While it's possible to copy, paste
and modify code without really understanding what it's doing, I find it difficult to recommend
that approach for nontechnical end users.
Skill level: Expert.
Runs on: JavaScript-enabled Web browsers.
Learn more: Try the How-to: Get Started Guide. You can also find examples of the types of
graphics you can build with Protovis at the Protovis Gallery.
GIS/mapping on the desktop
There's a wide range of business uses for geographic information systems (GIS), ranging
from oil exploration to choosing sites for new retail stores. Or, as The Miami Herald did for its
Pulitzer Prize-winning coverage of Hurricane Andrew, you can compare maximum wind
speeds with damage reports and building information (and perhaps discover, for example,
that the worst damage didn't happen in the areas suffering the heaviest winds, but in areas
with a lot of new, shoddy construction).
Quantum GIS (QGIS)
What it does: This is full-fledged GIS software, designed for creating maps that offer
sophisticated, detailed data-based analysis of geographic regions.
The best-known desktop GIS software is probably Esri's ArcView, a robust, well-supported
application that costs quite a bit of money. The open-source QGIS is an alternative to
ArcView.
This sunburst of a directory tree shows some of the visualization capabilities of the JavaScript InfoVis Toolkit. You can see a larger, interactive version on the InfoVis website.
As OpenOffice is to Microsoft Office, QGIS is to ArcView. ArcView enthusiasts argue that
Esri's offering is a couple of years ahead of open-source alternatives, has a better-developed
interface, enjoys commercial support and is better suited for print output. But QGIS users say
the open-source alternative is an excellent program that does a great deal of useful GIS work
-- and may even be better than ArcView when it comes to generating maps for the Web,
thanks to a plug-in dedicated to generating HTML image maps.
What's cool: QGIS has an enormous amount of GIS functionality, including the ability to
create maps, overlay various types of data, do spatial analysis, publish to the Web and more.
It can also be enhanced with plug-ins that add support for numerous undertakings, including
geocoding, managing underlying table data, exporting to MySQL and generating HTML
image maps.
Drawbacks: As with any sophisticated GIS application, learning to use this software entails a
serious commitment of time and training. Even in hour-long hands-on sessions with first
ArcView and then QGIS, I noticed things that were easier to do in the commercial option. For
example, ArcView had a one-click "normalize" function to immediately calculate, say, the
percentage of people 65 and over versus the total population from a data table with both
columns; in QGIS, I needed to pull up a "field calculator" and create a new column with the
formula to do that calculation myself.
Runs on: Linux, Unix, Mac OS X, Windows. (This is one case where installation is more
complicated on OS X, since it requires manual installation of several dependencies. There's
a one-click installer for Windows.)
Skill level: Intermediate to expert.
Learn more: Timothy Barmann of The Providence Journal posted two very useful tutorials
for the CAR conference that are still available: Introduction to QGIS and The Latest in
Mapping With JavaScript and jQuery. Barmann also offers a sample: Rhode Island's Ethnic
Mosaic. Another resource to help you get started: QGIS Tutorial Labs from Richard E. Plant,
professor emeritus at the University of California, Davis.
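The field-calculator step described under Drawbacks amounts to adding one derived column: the share of one column relative to another. Here is a minimal Python sketch of that arithmetic, outside of any GIS program; the column names and figures are placeholders, not real census data.

```python
def add_percentage(rows, part_key, total_key, out_key):
    """Return copies of rows with part/total expressed as a percentage."""
    result = []
    for row in rows:
        total = row[total_key]
        # Guard against empty areas so a zero population does not raise an error.
        pct = round(100.0 * row[part_key] / total, 1) if total else None
        result.append({**row, out_key: pct})
    return result


# Placeholder figures for illustration only.
table = [
    {"area": "Tract A", "age_65_plus": 250, "total_pop": 1000},
    {"area": "Tract B", "age_65_plus": 90, "total_pop": 1200},
    {"area": "Tract C", "age_65_plus": 0, "total_pop": 0},
]
normalized = add_percentage(table, "age_65_plus", "total_pop", "pct_65_plus")
for row in normalized:
    print(row["area"], row["pct_65_plus"])
```

Mapping the percentage rather than the raw count keeps heavily populated areas from dominating the color scale simply because more people live there.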
Note: If you're interested in GIS and want to consider other free software options, download
this PDF listing of Open Source/Non-Commercial GIS Products. And if you're looking for a
free open-source desktop GIS program that might be fairly easy to use, Jacob Fenton,
director of computer-assisted reporting at American University's Investigative Reporting
Workshop, recommends taking a look at the System for Automated Geoscientific Analyses
(SAGA) site. Finally, if analyzing geographic data in a conventional database sounds
interesting, PostGIS "spatially enables" the PostgreSQL relational database, according to the
site.
Quantum GIS (QGIS) offers full-fledged geospatial visualization and analysis on the desktop.
Web-based GIS/mapping
Most of us are familiar with mapping tools from major companies like Google (which has a
number of third-party front ends such as Map A List, an add-on that adds info to a Google
Map from a spreadsheet). There's also Yahoo Maps Web Services and Bing Maps -- all with
APIs. But there are numerous other options from smaller organizations or lone open-source
enthusiasts that were designed from the ground up to map geographic data.
OpenHeatMap
What it does: This user-friendly website generates color-coded maps; the colors change
depending on underlying info such as population change or average income. It can also
place markers on a map, varying the size of the markers based on a data table.
In addition to providing the Web-based service, author Pete Warden has also packaged
OpenHeatMap as a jQuery plug-in for those who don't want to rely on hosting at
OpenHeatMap.com. However, not all data formats work correctly when hosted locally. "My
recommended way is to embed the maps from the site," Warden wrote via Skype chat.
What's cool: It is astonishingly easy to create a color-coded map from many types of
location data -- even IP addresses (just use the column header ip_address).
It took me about 60 seconds to create a basic map from a spreadsheet of magnitude 7 or
higher earthquakes around the world since Jan. 1, 2000, then a couple of minutes more to
customize the rollover box to display both date and magnitude. (You can see a larger version
on OpenHeatMap.com.)
OpenHeatMap is extremely easy to use for creating data-based maps, although there are still occasional bugs in this
well-thought-out service.
Marker transparency, size and color are extremely simple to customize; you can also upload
your own marker image, and customize what appears in the tooltips rollover by adding a
tooltip column to your data source.
OpenHeatMap automatically figures out and maps locations based on a wide range of place
definitions, relying on how the location columns are named -- "address," "country,"
"fips_code" (used by the U.S. Census Bureau), "zip_code_area" (for five-digit ZIP codes),
"lat" (latitude), "lon" (longitude) and so on.
This is a well-thought-out interface from a onetime Apple engineer. (Warden said he worked
on several software projects at Apple, including Final Cut Studio.)
Drawbacks: There's no way to delete data once it's been uploaded (you can get around this
by using a Google Spreadsheet as a data source), and editing time is limited to as long as
your browser is open and you haven't started a new map. Embedded OpenHeatMap.com-
hosted maps may be slow to load.
The documentation doesn't make it clear whether you can set where the map is centered or
what the default zoom level should be; Warden told me by e-mail that the system remembers
where you last positioned and zoomed the map before saving. And this feature still can
occasionally be buggy, although Warden is responsive to bug reports.
Skill level: Beginner.
Runs on: Web browsers enabled for Flash or HTML 5 Canvas.
Learn more: Its title notwithstanding, the four-minute video "How OpenHeatMap Can Help
Journalists" offers a clear explanation for anyone interested in using the service. You can
also view samples on the OpenHeatMap Gallery and check out this Guardian interactive map
of where Facebook is used.
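Because the service relies on column names to work out locations, much of the preparation is simply getting the headers right. The Python sketch below writes a small CSV using the "lat," "lon" and "tooltip" column names mentioned above; the "value" header for the number being mapped is an assumption that should be checked against the OpenHeatMap documentation, and the rows are placeholders rather than real earthquake records.

```python
import csv
import io

# Placeholder rows for illustration only; these are not real earthquake records.
events = [
    {"date": "2001-01-01", "lat": 10.0, "lon": 120.0, "magnitude": 7.1},
    {"date": "2005-06-15", "lat": -20.5, "lon": -70.25, "magnitude": 7.8},
]


def to_upload_csv(rows):
    """Build CSV text whose headers follow the column-naming scheme."""
    buffer = io.StringIO()
    # "lat", "lon" and "tooltip" are named in the article; "value" is assumed.
    writer = csv.DictWriter(
        buffer,
        fieldnames=["lat", "lon", "value", "tooltip"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({
            "lat": row["lat"],
            "lon": row["lon"],
            "value": row["magnitude"],
            "tooltip": f"{row['date']}: magnitude {row['magnitude']}",
        })
    return buffer.getvalue()


csv_text = to_upload_csv(events)
print(csv_text)
```

The same text could be pasted into a spreadsheet, which also covers the Google Spreadsheet workaround mentioned under Drawbacks.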
OpenLayers
What it does: OpenLayers is a JavaScript library for displaying map information. It's aimed
at providing functionality similar to those big companies' code libraries -- but with open-
source code. OpenLayers works with OpenStreetMap and other maps, as this tutorial about
use with Google shows.
Other projects build on it to add functionality or ease of use, such as GeoExt, which adds
more GIS capabilities. For users who are comfortable hand-coding JavaScript and prefer not
to use a commercial platform such as Google or Bing, this can be a compelling option.
Drawbacks: OpenLayers is not yet as developed or as easy to use as, say, Google Maps.
The project page notes that it is "still undergoing rapid development."
Skill level: Expert.
Runs on: Any Web browser.
Learn more: Try this OpenLayers Simple Example. A good sample is Ushahidi's Haiti map.
There are other JavaScript libraries for overlaying information on maps, such as Polymaps.
And there are a number of other mapping platforms, such as Google Maps, which offers
numerous mapping APIs; Yahoo Maps Web Services, with its own APIs; the Bing Maps
platform and APIs; and GeoCommons.
What's cool: TimeFlow makes it incredibly easy to interact with data in various ways, such
as switching views or filtering by criteria such as date ranges or earthquakes of magnitude 8
or more. The timeline view offers a slider so you can zero in on a time period. While many
applications can plot bar graphs, fewer also offer calendar views. And unlike Web-based
Google Fusion Tables, TimeFlow is a desktop application that makes it quick and painless to
edit individual entries.
Drawbacks: This is an alpha release designed to help individual reporters doing
investigative work. There are no facilities for publishing or sharing results other than taking a
screen snapshot, and additional development appears unlikely in the near future.
Skill level: Beginner.
Runs on: Desktop systems running Java 1.6, including Windows and Mac OS X.
Learn more: Check out Top tips.
Note: If you're looking to publish visualized timelines, better options include Google Fusion
Tables, VIDI or the SIMILE Timeline widget.
Text/word clouds
Some data visualization geeks think word clouds are either not very serious or not very
original. You can think of them as the tiramisu of visualizations -- once trendy, now overused.
But I still enjoy these graphics that display each word from a text file once, with the size of
the words varying depending on how often each one appears in the source.
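The idea behind a word cloud, one entry per distinct word with size tied to frequency, can be sketched in a few lines of Python. This is an illustration of the general technique, not the method used by any tool in this article; the linear scaling, the point sizes and the sample sentence are all arbitrary choices.

```python
import re
from collections import Counter


def word_sizes(text, min_pt=10, max_pt=48):
    """Map each distinct word to a font size scaled linearly by its count."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        word: round(min_pt + (count - low) * (max_pt - min_pt) / span)
        for word, count in counts.items()
    }


sample = "data tools turn data into graphics and data into maps"
sizes = word_sizes(sample)
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size}pt")
```

Real generators typically add layout, color and stop-word filtering on top of this counting step.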
IBM Word-Cloud Generator
What it does: Several tools mentioned previously can create word clouds, including Many
Eyes and the Google Visualization API, as well as the website Wordle (which is a handy tool
for making word clouds from websites instead of text files). But if you're looking for easy
desktop software dedicated to the task, IBM's free Word-Cloud desktop application fits the
bill.
What's cool: This is a quick, fun and easy way to find frequency of words in text.
Drawbacks: Because it's trying to ignore words such as "a" and "the," the basic configuration
can miss some important terms. In my tests, it didn't know the difference between "it" and
"IT," and completely missed "AT&T."
Skill level: Advanced beginner. This app runs on the command line, so users should be able
to find file paths and plug them into a sample command.
Runs on: Windows, Mac OS X and Linux running Java.
Learn more: Check the examples that come with the download.
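The "it" versus "IT" problem noted under Drawbacks is a common side effect of stop-word filtering in general. The Python sketch below is not how IBM's tool works internally (that is not documented here); it only illustrates how a naive approach that lowercases everything loses such terms, and one possible workaround. The stop-word list and sample sentence are made up.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "the", "it", "at", "and", "in", "is"}  # illustrative list only


def naive_counts(text):
    """Lowercase everything, split on non-letters, then drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def case_aware_counts(text):
    """Keep all-caps tokens and '&' names intact; filter the rest in lowercase."""
    tokens = re.findall(r"[A-Za-z]+(?:&[A-Za-z]+)*", text)
    kept = []
    for token in tokens:
        if (token.isupper() and len(token) > 1) or "&" in token:
            kept.append(token)  # likely an acronym or company name
        elif token.lower() not in STOP_WORDS:
            kept.append(token.lower())
    return Counter(kept)


sample = "The IT budget at AT&T is growing, and it shows in IT hiring."
naive = naive_counts(sample)
aware = case_aware_counts(sample)
print("naive:", dict(naive))
print("case-aware:", dict(aware))
```

The workaround has its own limits: text written entirely in capital letters would bypass the stop-word list altogether.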
Social and other network analysis
These tools use a pre-Facebook/Twitter definition of "social network analysis" (SNA),
referring to the discipline of finding connections between people based on various data sets.
Investigative journalists have used such tools to, for example, find links between people who
are involved in development projects or who are members of various boards of directors.
TimeFlow offers a number of different ways to easily visualize data with an important time component.
An understanding of statistical theories of network node analysis is necessary in order to use
this category of software. Since I've only had a very basic introduction to that discipline, this
is one category of tools I did not test hands-on. But if you're seeking software to do such
analysis, one of these might meet your needs.
Gephi
What it does: Billed as a Photoshop for data, this open-source beta project is designed for
visualizing statistical information, including relationships within networks of up to 50,000
nodes and half a million edges (connections or relationships) as well as network analyses of
factors such as "betweenness," closeness and clustering coefficient.
Runs on: Windows, Linux, Mac OS X running Java 1.6.
Learn more: Try this Quick Start tutorial (PDF).
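For readers new to the metrics named above, here is a small Python sketch of two of them on a toy network. It is not Gephi's implementation; it uses one common textbook definition of closeness (reachable nodes divided by the sum of shortest-path lengths) and the local clustering coefficient, and the names and connections are hypothetical. Betweenness is omitted because it needs a longer algorithm.

```python
from collections import deque

# Hypothetical undirected network: who sits on a board with whom.
EDGES = [("Ann", "Bob"), ("Ann", "Cal"), ("Bob", "Cal"), ("Cal", "Dee"), ("Dee", "Eve")]


def build_graph(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def closeness(graph, source):
    """Reachable nodes divided by the sum of shortest-path lengths (via BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def clustering(graph, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in graph[nbrs[i]]
    )
    return links / (k * (k - 1) / 2)


graph = build_graph(EDGES)
for person in sorted(graph):
    print(person, round(closeness(graph, person), 2), round(clustering(graph, person), 2))
```

In this toy network Cal has the highest closeness because everyone else is at most two steps away, while Ann and Bob have the highest clustering because their only two contacts also know each other.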
NodeXL
What it does: This Excel plug-in displays network graphs from a given list of connections,
helping you analyze and see patterns and relationships in the data.
NodeXL merges the older and current definitions of SNA. It's "optimized for analyzing online
social media -- it includes built-in connections to query the APIs of Twitter, Flickr and
YouTube, allowing you to draw networks of users and their activity," according to Peter
Aldhous, San Francisco bureau chief for New Scientist magazine.
It also handles e-mail and conventional network analysis files (including data created by the
popular -- but not free -- analysis tool UCINET).
Runs on: Excel 2007 and 2010 on Windows.
Learn more: Download this detailed free NodeXL tutorial (PDF) or these basic step-by-step
instructions on analyzing your own Facebook social network (PDF). One Facebook app for
downloading your own friend information for use in NodeXL is Name Gen Web.
Gephi can visualize networks of up to 50,000 nodes.
Sharon Machlis is online managing editor at Computerworld. Her email address is [email protected]. You can follow her on Twitter @sharon000, on Facebook
or by subscribing to her RSS feeds:
articles | blogs.