Pcaso poster-gi2015

Pcaso: Share and fluidly explore point-cloud data

Nathaniel M. Pearson (1), Robert Aboukhalil (2), Carmel Dudley (3), John Greally (4)

(1) New York Genome Center, New York, USA; (2) Cold Spring Harbor Laboratory, Cold Spring Harbor, USA; (3) Weizmann Institute of Science, Rehovot, Israel; (4) Albert Einstein College of Medicine, New York, USA

Background

People see and think in too few dimensions to easily grok data with many variates. Flattening and freezing such data, to view as dots in a plane or box, can reveal some key patterns, but hide others. To help plumb many kinds of point-cloud data, in genomics and other fields, we built the Point cloud analysis stereopticon (Pcaso), a mobile device-friendly way to collaboratively explore plane-projected (e.g., PCA, MDS, comparative abundance) point clouds. Distinctively, Pcaso lets users

• post and explore point-cloud datasets as interactive small-multiplots with stable URLs (e.g., in preprint or publication)
• switch smoothly among orthogonal views, to track how each point shifts relative to others, among potentially many dimensions
• smoothly zoom, to resolve point clumping
• highlight points by hand, stably colored metaclass (e.g., deme, tissue, platform, sex), or metadata text search.

These and coming features, elaborated by us and others via CC0-licensed d3+ code, may help interpret diverse complex, plane-projected datasets.
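To make the small-multiplot and highlighting ideas above concrete, here is a minimal Python sketch. It is not Pcaso's own code (which the poster describes as d3-based); the sample records, the field names, and the helpers `planar_views`, `project`, and `search_highlight` are all hypothetical. It lists every axis-aligned plane for a set of numeric dimensions, projects points onto one such plane, and picks out points whose metadata match a text query.

```python
from itertools import combinations

# Hypothetical sample: each point has numeric dimensions plus free-form metadata.
POINTS = [
    {"id": "s1", "dims": {"PC1": 0.8, "PC2": -1.2, "PC3": 0.1},
     "meta": {"tissue": "liver", "sex": "F"}},
    {"id": "s2", "dims": {"PC1": -0.4, "PC2": 0.3, "PC3": 2.0},
     "meta": {"tissue": "brain", "sex": "M"}},
    {"id": "s3", "dims": {"PC1": 1.5, "PC2": 0.9, "PC3": -0.7},
     "meta": {"tissue": "liver", "sex": "M"}},
]


def planar_views(dim_names):
    """Return every unordered pair of dimensions: one small plot per pair."""
    return list(combinations(dim_names, 2))


def project(points, x_dim, y_dim):
    """Project each point onto the plane spanned by two chosen dimensions."""
    return {p["id"]: (p["dims"][x_dim], p["dims"][y_dim]) for p in points}


def search_highlight(points, query):
    """Return ids of points with any metadata value containing the query
    (case-insensitive substring match)."""
    q = query.lower()
    return [
        p["id"]
        for p in points
        if any(q in str(value).lower() for value in p["meta"].values())
    ]


print(planar_views(["PC1", "PC2", "PC3"]))   # all pairwise views
print(project(POINTS, "PC1", "PC3"))         # coordinates in one view
print(search_highlight(POINTS, "liver"))     # ids to highlight
```

With n numeric dimensions there are n(n-1)/2 such planes, which is why a grid of small plots plus one enlarged view is a natural layout as the number of dimensions grows.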

Pcaso beats paper.

(And the original beats a poster.)

Point your device to pcaso.io/mundo

Using Pcaso

To explore posted data
At the URL for a posted dataset, click any small plot to see it big. Hover on a point in the big plot to see its metadata. Search by text, or click a metaclass, to highlight points by metadata.

To post your own data
At pcaso.io, upload a .csv file, then pick fields to show as axes or metadata, and save. You can then explore away.

To share your post for others to explore
Share the post’s URL by email, IM, (pre)print text, tweet, carrier pigeon, &c.

What you’ll see
[Figure not reproduced in this transcript.]
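For the posting step described under "To post your own data" (upload a .csv file, then pick fields to show as axes or metadata), it can help to check beforehand which columns are numeric. The Python sketch below is only an illustration under assumed column names: the poster does not specify the CSV layout Pcaso expects, and `classify_columns` is a hypothetical helper, not part of Pcaso.

```python
import csv
import io

# Hypothetical CSV content; real column names and layout are up to you.
SAMPLE_CSV = """sample,PC1,PC2,PC3,tissue,sex
s1,0.8,-1.2,0.1,liver,F
s2,-0.4,0.3,2.0,brain,M
s3,1.5,0.9,-0.7,liver,M
"""


def is_number(text):
    """True if the cell parses as a float."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def classify_columns(csv_text):
    """Split columns into axis candidates (every cell numeric)
    and metadata fields (anything else)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    axes, metadata = [], []
    for name in reader.fieldnames or []:
        if rows and all(is_number(row[name]) for row in rows):
            axes.append(name)
        else:
            metadata.append(name)
    return axes, metadata


axes, metadata = classify_columns(SAMPLE_CSV)
print("axis candidates:", axes)
print("metadata fields:", metadata)
```

Treating a column as an axis candidate only when every cell is numeric is a deliberately strict choice; a column with missing values would land in the metadata list and need cleaning first.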

Next features

We hope to soon let users zoom within the main plot, search with autofill, update and/or delete posts, and do other handy stuff. Please send any thoughts to [email protected], and look soon for source code and updates to Pcaso, along with other tools, via the budding Open Genomics Visualization Initiative (OGVI).

Anticipated questions

What kinds of data does Pcaso help show?
Basically, anything with many numeric dimensions. Could be principal components, of course, but lots of other data can benefit from smoothly switchable, metadata-responsive planar views. If you’ve got such data, try it out.

What about parallel-coordinate plots?
They too can help a lot, by showing many dimensions (especially qualitative or naturally ordered ones) at once. But dense, criss-crossing point-specific curves can, of course, be hard to follow.

Why not show a 3D box instead of planes?
We started with planes that mean something real from underlying data, and that our brains can easily track when switching axes. (Plus, for data with many axes, a box shows little more of the whole than a plane does.)

But doesn't a stereopticon show 3D images?
That’s a stereoscope ;-) Back in steampunk days, stereopticons just let people switch from one slide to another. We were also inspired by cubists (like Picasso, of course) who splintered complex real-world shapes into many-planar views.

How robust is Pcaso?
We’ve started with a beta hosted on a good standard web server, meant first to show key features and spark collaborative insights and refinement, not (yet) to manage private or versioned data, or stop determined bad actors. Please use Pcaso responsibly.