Asking “Why?”
A lesson for Data Scientists and those who manage them
Adapted from a post by Mike Stringer & Dean Malmgren, founders of Datascope
The other day we had a conversation with a bespectacled senior data scientist at another organization (named X to protect the innocent).
Data scientists, being curious individuals, are often drawn to projects because:
☑ they’re interesting
☑ they’re fun
☑ they’re technically challenging
☑ their boss heard about “big data” in the Wall Street Journal
Important problems in business are often daunting to data scientists because they don’t strictly require data to solve…
…and there are established experts already working on them.
Operations, Product Development, Strategy, Human Resources, Marketing, IT, R&D, Sales
Yet these roles increasingly have an opportunity to use data in innovative ways, to make dents in long-standing problems where quantitative approaches have previously been impossible.
Going in any different order is a recipe for disillusionment about big data’s true potential.
Starting with a real problem instead of starting with some interesting dataset often leads data scientists down a completely different—and much more fruitful—path.
In 2010, Brian Uzzi introduced us to Daegis, an e-discovery services provider.
When a company gets sued, it has to provide all documents relevant to the case.
E-discovery companies like Daegis use a combination of technology and lawyers to help sued companies provide these documents, without providing anything they don’t need to.
Early conversations circled around “social network analysis”.
Daegis’ client datasets contained millions of emails we could parse, study and visualize!
Instead of social networks, we devoted the first phase of our project to building a quick prototype using data from the Text Retrieval Conference (TREC).
We demonstrated that our transductive learning algorithms could reduce the number of documents that needed to be reviewed by 80-99%.
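To make a figure like "80-99% fewer documents to review" concrete, here is a minimal sketch of how such a reduction could be measured. It is not Datascope's algorithm: the function name `review_fraction`, the relevance scores, and the toy labels are all hypothetical, and it simply assumes some model has already scored each document for likely relevance.

```python
def review_fraction(scores, relevant, target_recall=0.9):
    """Fraction of documents a reviewer must read, in ranked order,
    to find target_recall of the relevant ones."""
    total_relevant = sum(relevant)
    if total_relevant == 0:
        raise ValueError("no relevant documents in the collection")
    # Rank documents from most to least likely relevant.
    ranked = sorted(zip(scores, relevant), key=lambda pair: pair[0], reverse=True)
    found = 0
    for reviewed, (_, is_relevant) in enumerate(ranked, start=1):
        found += is_relevant
        if found >= target_recall * total_relevant:
            return reviewed / len(ranked)
    return 1.0


# Hypothetical toy collection: 10 documents, 2 of them relevant.
scores = [0.95, 0.10, 0.80, 0.05, 0.20, 0.15, 0.30, 0.02, 0.12, 0.08]
relevant = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

fraction = review_fraction(scores, relevant, target_recall=1.0)
print(f"Reviewed {fraction:.0%} of documents, a reduction of {1 - fraction:.0%}")
```

In this toy case both relevant documents sit at the top of the ranking, so reviewing 2 of 10 documents is enough. A poor ranking would push the fraction toward 100%, which is why the quality of the scoring model drives the savings.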
This was huge!
We were going to help Daegis gain a tremendous advantage and Daegis’ clients would be able to defend themselves from frivolous lawsuits.
+1 for the good guys. Right?
There’s that “why” again.
Had we asked about this at the beginning of the project, we would’ve known the importance of defensibility.
After more design iterations (see our Strata presentation or slides if you’re interested), we arrived at some insights: what we developed needed to be educational, transparent, and understandable.
By the end, if you had to summarize the project, it would be closer to “educating attorneys about information retrieval” than “social network analysis.”
The final result is a product that Daegis sells under the name Acumen.
But beware.
The answers to this deceptively simple question may surprise you, take you into challenging uncharted territory, and inspire you to think about problems in completely different ways.