SDS PODCAST
EPISODE 363:
INTUITION,
FRAMEWORKS, AND
UNLOCKING THE
POWER OF DATA
Kirill Eremenko: This is episode number 363 with President and CEO at
Aryng Analytics, Piyanka Jain.
Kirill Eremenko: Welcome to the SuperDataScience podcast. My name
is Kirill Eremenko, Data Science Coach and Lifestyle
Entrepreneur. And each week we bring you inspiring
people and ideas to help you build your successful
career in data science. Thanks for being here today
and now, let's make the complex simple.
Kirill Eremenko: Welcome back to SuperDataScience podcast
everybody. Super excited to have you back here on the
show. Are you ready for a rollercoaster of knowledge?
This is going to be a lot of fun. I just got off the phone
with Piyanka Jain, and you will be overloaded with
information about analytics and data science. Literally,
I have so many notes and so much in my head. I
probably need to sit down and process this for quite a
bit of time.
Kirill Eremenko: So Piyanka is the founder, president, and CEO of
Aryng Analytics, an analytics consulting company
where they provide services to enterprises and
businesses on how to be better with data science, data
driven decision making. Also Piyanka is an author of
several books now, of multiple books, best selling
books which you can find on Amazon. We'll talk about
one of the books during the podcast. Piyanka's also a
writer for publications like Forbes, Harvard Business
Review, Inside HR. She has keynoted many
conferences around the world, and also, Piyanka is an
educator. So they have data science courses on
Aryng.com. They have a whole academy of data science
where they provide certifications and help people get
into the space of data science. So as you can tell,
Piyanka is involved in many aspects of data science.
Kirill Eremenko: And what exactly are we going to be talking about in
this episode? There was so much to choose from.
There was so many questions I had, so many topics we
could have gone into. There was virtually, or literally...
Impossible. It was virtually impossible to cover
everything. So what did we cover? Well, we talked
about, among other things, a very important
framework called BADIR, a framework that Piyanka
developed herself. It's B-A-D-I-R. And this is a
framework that allows you to do data science in a very
thought through way. According to Piyanka, with this
framework, you can do lean data science. You can do
data science much quicker than normal. You can
deliver results faster because you're thinking things
through. Not often do you hear about data science
frameworks. I found this one very interesting,
especially how it uses hypothesis based data science.
In this podcast, you'll get an acquaintance with this
framework, and if you'd like to learn more about it,
you can always follow up and check out the book or
other resources.
Kirill Eremenko: In addition to that, we'll talk about putting courses
into context and what that means and how you can do
that for yourself, and why you would need to do that.
SWAT teams in data science and how to know if your
team is a SWAT data science team, how to do lean
data science and what percentage of data science
projects fail and why. You'll actually be very surprised
at the number. In addition to that, we talked about the
four components of data culture and how they come
hand-in-hand, how do they enable each other, and we
discuss the difference between decision science and
data science. So there we go, a podcast full of value.
Can't wait to get started. So without further ado, I
bring to you, president and CEO of Aryng, Piyanka
Jain.
Kirill Eremenko: Welcome back to the SuperDataScience podcast
everybody. Super excited to have you back here on the
show. And today's guest is calling in from California.
Welcome to the show Piyanka Jain. How are you
today?
Piyanka Jain: I'm great and excited to be here.
Kirill Eremenko: Very excited to have you. And this was a first for me
because before the start of the episode, you asked me
a ton of questions about the audience, about how you
can help our listeners better, and all these other
things. I normally don't have that. So very excited. I
can tell right away you have an inquisitive mind, and
that probably serves you quite well in your career in
data science, doesn't it?
Piyanka Jain: It does. A curious mind is a good data science mind.
Kirill Eremenko: For sure. That's a great motto. How are things going in
California these days?
Piyanka Jain: Things are good. We're all sheltered in and it's a good
thing, and hopefully, we are able to contain COVID
soon. But yeah, no. Otherwise things are good. We're
all doing what we need to be doing with social
distancing and so on.
Kirill Eremenko: Yeah. That's right. That's right. Yeah, hopefully it does
go past quite quickly with these new measures. But
I've had a look at your career background, and it's
extremely impressive, from having a published book to
being a CEO of a company that does both consulting
for companies like, as far as I understood, Google,
Box, Apple, General Electric, and many others. Also
you do education in the space of data science. You are
everywhere in the space of data science. Tell us a bit
about yourself. For somebody who hasn't met you
before, how would you describe what you do?
Piyanka Jain: Thank you so much for your kind words. I feel like I'm
just getting started, but for those who are listening in
and want to know a little bit more about me, I am all
about practical data science. I really believe in the
power of data, and for me, data plus intuition, because
we are all intuitive beings, and if we can marry data to
that, we can really optimize our decisions. And that
goes all the way from corporate decisions as a
marketing manager, as a business manager to data
scientists to all the way to our internal as personal
human beings. You want to achieve something. You
want to climb Mount Everest. You have to use data,
and that's how you're going to be able to manage and
optimize your progress and your decisions. So I really
believe in that, and I think that's what I evangelize,
and that is what I hope to share... I have to be
infectious about my passion for data science today.
Kirill Eremenko: Love it. Love it. Interesting thing just ran through my
mind when you were saying that. Indeed, if you're
going to climb Mount Everest, you've got to use data. It
might sound strange at the start but when I think
about it, you've got to use data on, okay, I've climbed
this other mountain. Maybe you'll be doing training.
You'll be measuring your pulse. You'll be measuring
how tired you get, how much endurance you have,
how much water you consume, how much oxygen you
breathe, and data will definitely get you there. The
interesting thing that went through my mind was, some
people might say that if you just use data in everything
from business to personal life to sports, eventually,
you'll be like a robot. You won't have any emotion,
empathy, any kind of random chance that comes with
life that is normal. What would you say to people who
have that opinion?
Piyanka Jain: I have a lot to say about that. One is that when we talk
about data, we don't talk about data driven as in just
believe in data. We always talk about hypothesis
driven, data driven decision making. What does that
mean? What that means is you want to bring your
whole self, your intuition and the intuition of your
colleagues, of your stakeholders to bear and
then form what is needed from the data, and then
prove your hypotheses. So for us, data science, or this
aspect of being able to apply, putting data to work is
all about marrying data to intuition that you already
have.
Piyanka Jain: So for example, if you're a marketing manager, you
probably have some really good intuition about your
audience, about what works for them, what products
they like, and so on. Let's use that intuition you have,
the context you have as well as your team members,
as well as your stakeholders, and then form a
hypothesis driven... We teach a framework called
BADIR, that's also there in my book, Behind Every
Good Decision, for those who are interested in
knowing more about it. It's called B-A-D-I-R is the
name of the framework, and we talk about how... For
you to be effective and efficient in data science and
analytics, you basically lay out a hypothesis driven
plan.
Piyanka Jain: So even before you touch data, you lay a hypothesis
driven plan. You think what are the things... If you are
solving a problem, going back to our personal goal,
let's say you were going to go look for treasure in the
Pacific Ocean. There are two approaches to it, right? One
is, I'm going to be just... just going to jump in because
I want to experience the world so I'm just going to
jump in and start swimming in the ocean, and
hopefully, one day, I will run into a treasure. How
likely is that, Kirill?
Kirill Eremenko: 0%.
Piyanka Jain: Yeah, right? You have to be super lucky. You're relying
on luck and you're relying on... And you're actually
giving up your power because for anybody who sets
sail in Pacific Ocean, you have limited resource. Maybe
you have one-month's supply. So you're basically
saying, "Oh, I have one-month's supply but I'm going
to set sail," you're risking that one-month supply
because you don't even know. Maybe if you had
infinite supply and infinite time, maybe someday you
will run into a treasure. But on the other hand, if
you're like Sherlock Holmes, and you are a detective,
and you lay out hypotheses, what does hypotheses
mean? Basically, you have good ideas of where the
treasure could be.
Piyanka Jain: So you look up past shipwrecks, past routes, the
depth of the ocean, all of that, and you figure out, you
narrow down, these are the most likely spots and the
news reports of where debris was collected and so on.
And then you narrow down, these are the five most
important, or 10 most important spots, most likely
spots for treasure. And you go there and then you
send your deep sea divers or your submarines down
there, you're more likely to find that, and at least you
would fail faster as well. You would have looked at
those 10 spots. You would know within 10 days or
whatever else, "Okay. I don't have it. Now what's my
next plan forward?" Versus just kind of going, right?
So that is what we talk about.
Piyanka Jain: When we talk about data driven, we talk about
hypothesis based, data driven. So going back to your
will we become a robot? We are human beings and we
are special beings. We can't quite become robot.
However, you also don't want to just rely on data. And
I have seen people who are so data driven that they
leave their intuition behind, and they come up with
these results sometimes, in the business as well as in
personal, and we look at it, and you're like, "This does
not even make sense." My entire being rejects what
you are concluding, and that's my intuition, right?
Piyanka Jain: So you never want to leave the intuition. Intuition is a
big part of us. Intuition is what keeps us safe.
Intuition is when you're going up the Mount Everest
and you're beginning to feel not so good. You look up
your VOX meter and you say, "Oh, what's my oxygen
level? What I'm absorbing is dropping," right? If you
didn't have intuition, if you're not paying attention,
you won't even know. And before you know, you will
have fainted before you even can look at your data,
right? So you need to bring your whole self to this
game. Data science is not about just data. It's about
bringing your entire self to the table.
Kirill Eremenko: Fantastic. Thank you for the rundown. It was a great
way to see how data science can be combined with
intuition. I think a lot of us would agree that both have
a place. And I actually want to talk a bit more about
your BADIR framework. So I read about it. So the B-A-
D-I-R. What do those letters stand for and what is this
framework all about?
Piyanka Jain: Yeah. So the BADIR is an acronym for these five steps,
and they stand for business question, analysis plan,
data collection, insights, and recommendation. And if
you notice, the part about data collection is step
number three. So many people think, when they think
about data science, they think about, "Oh, let's start
with data." But it doesn't start with data. Good
analytics, good data science project doesn't start with
data. It starts with business question, refining and
fleshing out what you really want to find out. What is
it your question... For example, a question could be
why are our sales dropping, or why are our customers
churning, or why is our conversion down, or can we
optimize our conversion? Can we improve our user
acquisition? In what ways can we improve our loss
ratios, and so on? So those can be the questions.
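As a minimal sketch of the ordering described here, the five steps can be written down as an ordered enumeration in Python. The names `BadirStep`, `acronym`, and `steps_before_data` are illustrative assumptions; they do not come from the episode or from the book.

```python
from enum import IntEnum


class BadirStep(IntEnum):
    """The five BADIR steps, numbered in the order given in the episode."""
    BUSINESS_QUESTION = 1
    ANALYSIS_PLAN = 2
    DATA_COLLECTION = 3
    INSIGHTS = 4
    RECOMMENDATION = 5


def acronym():
    """Build the acronym from the first letter of each step name."""
    return "".join(step.name[0] for step in BadirStep)


def steps_before_data():
    """Return the planning steps that come before any data is touched."""
    return [step.name for step in BadirStep if step < BadirStep.DATA_COLLECTION]


if __name__ == "__main__":
    print(acronym())
    print(steps_before_data())
```

The point the sketch makes is the one in the transcript: data collection is the third step, so two planning steps sit in front of it.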
Piyanka Jain: And the point about having a full step for it is
basically, there's an ask that comes in to the
stakeholder, or the first thought if you're a marketing
manager and yourself, a citizen analyst, which means
part of your work, you are doing data science which
almost all of us are right now. In the world these days,
in the business world, the language of business is
data, and so everybody is speaking the language of
data. And if they're not, they're being left behind. So
everybody sort of is sort of having some access to data.
And the first thought that comes to your mind as you
think about, "Oh. I need to do my next campaign, or I
need to figure out whether this feature works or not,"
that's an early question and even if you are a data
scientist, if somebody asks you a question, it's an early
question. You need to define it through the business
question framework to come to the real business
question. And that refinement has many aspects about
what actions somebody's ready to take. If you find the
insights with it, there is the, who are the stakeholders?
And very many, many aspects both on the data science
side as well as the decision science side. And so, that's
business question.
Piyanka Jain: Then you lay out a hypothesis driven analysis plan.
You ask yourself and the stakeholders, what is the
solution? So if the question was why are our
customers churning, then many people will have some
good ideas. Your stakeholders may have good ideas
like, "Oh, we have recently increased our price and
that's causing some churn. Our ticketing system is not
working that well or our policies have changed recently
or the customers are churning because, post 90 days,
these things that we do are just not working
well," and so on and so forth. You have lots of good
hypotheses. It's like good spots that you would, going
back to our Pacific Ocean [inaudible 00:15:44], it's
going back to where you're going to look deeper. So
these are good hypotheses. And then you lay out
analysis plan with hypotheses, with your methodology,
with your assumptions, with your data and criteria to
prove this. There's a bunch of stuff.
Piyanka Jain: And from there comes out your data specification
which means if this is the question, and these are my
hypotheses and these are my assumptions, facts, and
my methodology and so on... And by methodology, I
mean your specific approach to data science. So are
you going to use aggregate analysis? Are you going to
go correlation analysis? Are you going to go deeper into
using probably some predictive analytics like
statistical methods? Or are you going to go even
deeper and you'll use machine learning, whatever else?
So you're laying out, is it a classification problem? Is it a
regression problem? You're laying it out right there
and saying how far am I going to go. And that's also a
function of... It's a function of data. It's a function of the
time you have. It's a function of the precision you need
and so on. So there's a lot of things going on. These
are all planning stages.
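One way to picture how a data specification falls out of the plan is the sketch below, which uses the churn hypotheses mentioned above. It is an assumption-laden illustration: the classes `Hypothesis` and `AnalysisPlan`, the method `data_specification`, and every field name such as `price_paid` are hypothetical and are not taken from the episode or the book.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    data_needed: list  # hypothetical field names needed to test the statement


@dataclass
class AnalysisPlan:
    business_question: str
    methodology: str
    hypotheses: list = field(default_factory=list)

    def data_specification(self):
        """Collect, in order and without duplicates, only the fields the hypotheses need."""
        fields = []
        for hypothesis in self.hypotheses:
            for name in hypothesis.data_needed:
                if name not in fields:
                    fields.append(name)
        return fields


# Hypotheses paraphrased from the churn example; field names are made up.
plan = AnalysisPlan(
    business_question="Why are our customers churning?",
    methodology="aggregate analysis",
    hypotheses=[
        Hypothesis("A recent price increase is causing churn",
                   ["customer_id", "churn_date", "price_paid"]),
        Hypothesis("The ticketing system is not working well",
                   ["customer_id", "churn_date", "open_tickets"]),
        Hypothesis("Recent policy changes are causing churn",
                   ["customer_id", "churn_date", "policy_version"]),
    ],
)

if __name__ == "__main__":
    print(plan.data_specification())
```

The design choice worth noting is that the data request is derived from the hypotheses rather than written first, which is what keeps it limited to the data that is needed.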
Piyanka Jain: Remember, you have not yet touched the data, right?
And most people, most data scientists and others,
when they think about data science, they think about,
"Oh. Where's the data? Let me pull that data in Excel,
or let me [inaudible 00:16:54] in Python," whatever
else. But that's not where data science starts from.
Data science starts from question, or fleshing out the
question, laying out a hypothesis driven plan. And
when you're laying out a hypothesis driven plan, it also
means you are aligning with your stakeholders to say,
"This is what my plan is going to be. This is how much
time it'll take. This is blah, blah, blah. Are we in
alignment?" When you have a handshake, that's when
you go to the most time consuming step of getting data
and then validating it, and triangulating and cleaning
it up. That's all time consuming.
Piyanka Jain: Then start doing your analysis. So the insight step is
also, if you have a recipe for doing insights versus
what many people do which is they set sail in the
ocean of data and they start looking for treasure,
which is a pretty bad idea because it takes you a long
time, and your likelihood of finding treasure is also
really low. So what we recommend is now that you've
done all this work, you have laid out a hypothesis, you
have collected data, now... And collected data meaning
you have collected only the data that you need versus
saying, give me all the data you have. Now because
you have hypothesis, you've used that to know where
exactly you're going to dive deeper. Then the next step
is use recipes to derive insights.
Piyanka Jain: You know if you're going to do correlation analysis,
these are the steps. If you're going to build a linear
regression model, these are the steps. Or if you're
going to go into gradient boosting, these are the steps.
This is what you're going to do, and so on, right? So
you know what the steps are. Follow those steps and
follow the recipe in a structured manner and come to
your insights. At the end of it, share your early
insights with your stakeholder and see if that's making
sense.
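The episode does not spell out the steps of any one recipe, so the following is only a sketch of what a small correlation recipe might look like, using the standard Pearson correlation coefficient. The function name `pearson_r` and the sample numbers are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Compute the Pearson correlation coefficient between two equal-length series."""
    # Step 1: validate the inputs before doing any arithmetic.
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, with at least two points")

    # Step 2: center both series on their means.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 3: accumulate the co-variation and each series' own variation.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)

    # Step 4: a constant series has no variation, so the coefficient is undefined.
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined for a constant series")

    return co_variation / math.sqrt(variation_x * variation_y)


if __name__ == "__main__":
    # Made-up numbers: price paid versus support tickets opened.
    print(pearson_r([10, 20, 30, 40], [1, 3, 2, 5]))
```

A coefficient near 1 or -1 suggests a strong linear relationship and a value near 0 a weak one; it says nothing by itself about cause, which is why the early insights still go back to the stakeholders for a sense check.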
Piyanka Jain: And the last step of this BADIR framework is
recommendation. So you make recommendations or
you instrument your model, you productionalize your
model, you instrument your insights, whatever have
you. And that's also very important because I can't tell
you how many good models I've seen sitting on the shelf
because people didn't know how to align with
stakeholders, how to communicate your findings to the
right folks in the right way so that you can basically,
inspire them into action. So that in a nutshell, is the
BADIR framework, and for folks who are interested,
they can learn more about that in my book, Behind
Every Good Decision, as well as on our website. If they
want to go and look at aryng.com, they can find a lot
of use cases and case studies on why we believe this
works, and many, many organizations, many Fortune
1000 have already adopted it, this framework as their
common language.
Kirill Eremenko: And I actually wanted to ask you about that. So I'm
seeing on your website that this framework is adopted
by Apple, Google, GE, PayPal, Adobe, SAP, eBay, and
many, many more companies. How did you get this
framework into these companies?
Piyanka Jain: So not all companies that you spoke about and not
100% of them are adopting it, as you say, but many
organizations are adopting it much more widely and some
organizations are adopting it within for example,
customer support group or marketing group and so
on. But the way the... So I mean, one is that we, after
many years of being pestered by our students, I wrote
this book, and basically put in the BADIR
framework and made it open-source. So many data
scientists and business users are picking up the book
and it has a step-by-step guide, so they are picking it
up and they're adopting it. And then as and when they
need further detail, more detailed help, they reach out
to us. So even our non-customers are using it and we
may not be even aware of them, so that's-
Kirill Eremenko: Got you. Got you.
Piyanka Jain: The other thing is, it's a very... I mean, it's a
recipe-based approach. I don't know, Kirill. Do you cook?
Kirill Eremenko: Yes, love to cook.
Piyanka Jain: Loves to cook, okay. So do you know how to make
falafel?
Kirill Eremenko: Falafel? No, I don't know how to make falafel.
Piyanka Jain: Okay. So that's a tricky one. So let's say you are
thinking about, "Oh, I'm going to make falafel." What
would help you the most now that you have to make
falafel?
Kirill Eremenko: A recipe on how to make falafel, I think.
Piyanka Jain: There you go, right? So a recipe. And then the first
time you make, do you think you'll get it perfect?
Kirill Eremenko: No, of course not. First time always not perfect.
Piyanka Jain: Because you're still understanding, "Hey. I'm going to
salt chickpea. And I'm going to grind it, how fine I'd
grind it. And then what would be the consistency of
that as I drop it into... To fry in oil, as I drop those
balls, how thick they need to be, how viscous or how
liquidy they need to be." So there's a lot of details that
you're going to get. The first time you're going to make,
you're going to get the detail and you'll see the output.
The same way, if you have to learn data science, what
would help you the most? A recipe.
Kirill Eremenko: A course. A book. A guide. A learning path.
Piyanka Jain: Yes, and a recipe. Whatever, a course is about recipe.
A book is about recipe. A recipe. Something that tells
me, do this and then do this and then do this, and
these are the ingredients. And do this and then do
this, right? And the first time I do that, I'll get it
somewhat, and I get some more understanding the second and
the third time. So the way we have structured our
courses, and for your listeners who are interested,
they can go on academy.aryng.com and find these
courses, we are all about how to bake a cake, how to
make a falafel kind of recipe. So we start, we share
this whole BADIR framework and by the time they're
done with even the level one course which is the
business analytics course, they have done this
framework. They have baked the cake, and they have
cooked the falafel at least three times.
Piyanka Jain: And then following that, they work on a project which
means, okay great, you have done this in simulated
data, or you have done that in data which was fairly
clean. Now do this in real world, in your real world. So
for current data scientists who are currently employed,
we tell them, "Okay, pick up a project within your own
work flow." Or for future data scientists who enroll
with us, we give them one of our client projects. And
thereby, they get to practice, again, the same
framework. So they know exactly what they're doing.
As we put them in a client situation or they pick up a
project, they know how to follow the BADIR
framework, and we are there as their mentor at
different points, at the analysis plan stage, at the
insight stage, at the recommendation stage because
they know what they're doing. We know exactly what
they should be following so we can course correct. And
that's the fastest way I have found to learn data
science is using some kind of recipe, some kind of... a
step-by-step method of this is how it works.
Piyanka Jain: Now, as you get advanced in it, you can start using
shortcuts. You can start using iterations and so on
and so forth, right? And so you can think about, you
start with your common, simple vanilla cake, and then
you can start adding some... I'm going to, today, add
some nuts and raisins, and I'm going to, today, make
some icing, and I'm going to layer it up. And maybe
one day I'm going to be able to make tiramisu and all
of that, right? So you're going to be able to advance
your skills. And this step-by-step way of learning is
recipe-based, and then step-by-step use case based
approach is what I recommend for people who want to
learn data science.
Kirill Eremenko: Got you. Wow. Thank you for the rundown. So let's
talk a bit more about your courses. So I noticed you
have... For those by the way, for those interested, the
website is Aryng, A-R-Y-N-G. And the courses are at
academy.aryng.com. I noticed you have quite a few
interesting courses, and what I wanted to find out is...
These are high ticket items, so over $1,000 per course.
What is your X-factor? So what is it that students can
pick up from this course that will really make it
worthwhile for them?
Kirill Eremenko: Are you subscribed to the Data Science Insider?
Personally, I love the Data Science Insider. It is
something that we've created so I'm biased, but I do
get a lot of value out of it. Data Science Insider, if you
don't know, is an absolutely free newsletter which we
send out into your inbox every Friday. Very easy to
subscribe to. Go to SuperDataScience.com/DSI. And
what do we put together there? Well, our team goes
through the most important updates over the past
week or maybe several weeks, and finds the news
related to data science and artificial intelligence. You
can get swamped with all the news, even if you filter it
down to just AI and data science. And that's why our
team does this work for you.
Kirill Eremenko: Our team goes through all this news and finds the top
five, simply five articles that you will find interesting
for your personal and professional growth. They are
then summarized, put into one email, and at a click of
a button, you can access them, look through the
summaries. You don't even have to go and read the
whole article. You can just read the summary and be
up to speed with what's going on in the world, and if
you're interested in what exactly is happening in
detail, then you can click the link and read the original
article itself. I do that almost every week myself. I go
through the articles and sometimes, I find something
interesting. I dig into it. So if you'd like to get the
updates of the week in your inbox, subscribe to the
Data Science Insider absolutely free at
SuperDataScience.com/DSI. That's
SuperDataScience.com/DSI and now, let's get it back
to this amazing episode.
Piyanka Jain: Yeah. So there are courses and there are certifications,
and our certifications are... For example, let's pick up
one which is the future data scientist certification. And
what it has is a complete [inaudible 00:27:16] of how
you can transition your career to data science. And so,
it'll have the underlying courses, and it's self-paced so
you come in, and you log in and you... We recommend
one section a week, or if you have more time, one
section a day, and make progress. And then after
you're done with that... And while you're doing that,
we have communities so you're posting questions in
Facebook community. And you also have monthly
mentoring sessions directly with us live on Zoom, and
thereby, you are able to log in and ask your questions
live, as well as post your questions non-live, 24 by 7
on Facebook community.
Piyanka Jain: So lots of interaction. Students are helping each other.
So there's a community that you have. There's a
learning that you're doing of the fundamental
framework, BADIR, and you're learning it in a context
of marketing of product. If something happens in this
[inaudible 00:28:10] in hospital, how are you going to
do it? If this is happening in winery, how are you going
to think about optimizing and so on? Lots of different
use cases. We are opening their blinders and we are
giving them a toolkit of tools that they can use. Then
the next part of it is-
Kirill Eremenko: Putting it into... Sorry, putting it into context, putting
education into context. I'm just thinking of what can
students take away that they can enact in their own
learning, and it sounds like putting education, data
science into context like you said, in a hospital, in
winery or somewhere else. That helps probably
retention. Also helps understanding the topic better.
Piyanka Jain: Yeah. And then follow that up with a real project. So
they all work, all of these certifications have a project
at the end. So it's all fine and dandy when you learn
something. How many of us have gone and done this
in the corporate world, really? We are taking classes
all the time. You come in. You even do a half-day
offsite for leadership, and you go out there and you
say, "Wow. That was amazing. That's so inspiring."
You come back and it's business as usual, right?
Kirill Eremenko: That's true.
Piyanka Jain: So for the business to be not as usual, for you to
interrupt that way of thinking and to really change
manage, you need to bring it home with you, which
means you need to tie it to a project. And I can't tell
you how many people... I mean, I've seen people just
flower from, "Oh. I'm very, very nervous about data
science," to, "Okay. I've done the course. I'm still not
sure," to, "Now I'm doing a live project with a client
and oh, I get this part. Oh, I can go review. I'm stuck
in this part. I can go review this video," or whatever
else. And then when they're done with the course, they
have delivered a final model, final insights to the client
and the client is really happy. I've seen people go from,
I'm so nervous to all of like, "I get it. I can do it," right?
So that's what it-
Kirill Eremenko: Gotcha... So in the courses, they would have actual
live projects with clients. Is that the case?
Piyanka Jain: In the certifications. People have options of just taking
courses a la carte and they can learn on their own
time and do courses. What we recommend is the
certifications which has projects at the end with us as
live mentors and with live clients, again, all working
remotely. We have students logging in from Nigeria to
Australia to of course, big percentage of them are in
US, as well as all throughout Europe. And so, they're
working remotely but with live client and with us live
in the mentoring session. And once they're done with
that, they have the confidence, "Hey. I understand the
fundamentals of data science and I can apply it to
solve problem and I've seen the end-to-end of at least
one project all by myself or with someone from my
team."
Piyanka Jain: And then, for people who are looking to transition, we
have mentoring sessions of step number one, do your
targeting of your job. All the things that you've
learned, now let's apply it to job search. Targeting of
your job. Making your resume to your target profile.
Because a lot of times people think, "Oh. Now I have
done the certification. Let me add this one line item to
my resume, and now I'm an analyst." If you're looking
to transition your career, your resume needs to
transition as well. It needs to now tell a story of you as
an analyst, you as a data scientist. So that's the
second mentoring session we have. And the third one
is how do you interview and how do you ace that,
right? So we have follow on, end-to-end process where
we're holding hand and making sure that the people
cross over to the other side. And that basically
increases the success rate. So for people who are
looking to transition, that's a huge success rate.
Piyanka Jain: We also have a similar certification for current data
scientists. Again, with the project and with this
kind of learning and hand-holding, they get the
confidence that they can do it and then they are able
to do it, and then they see the stakeholder alignment.
They see what happens to people once they deliver the
kind of project the way we are talking about. And we
have gotten so many letters, I can't tell you, like,
"Piyanka, you won't believe. I got invited to this
meeting where I would never be invited after this
project." And yes, if you're going to align with
stakeholders, if you're going to use this framework,
and make sure that you're doing the decision science
part, you start to appear as a partner versus as a
downstream somebody who takes order. So it changes
the world altogether when you start doing things in the
way that can engage people in the right way. And same
for [inaudible 00:32:21] analysts.
Piyanka Jain: So our approach is sort of end-to-end. I'm all about
results. So for me, any algorithm, any math, any
statistics is useless until it gets me results. And so for
me, again, as I guided thousands of students with this
transition... I also have another book. Sorry, I'm
bombarding folks with another book, Acing Your
Analytics Career Transition, which is right now
because of COVID being made free. It's on Amazon. It's
called Acing Your Analytics Career Transition. And it's
a very quick read on Kindle. So it's like a 40-page read
or something. And it lays out these steps, step-by-step,
and whichever program they choose to go, whatever
else, you need to follow a step-by-step method of really
transitioning. You can't follow a haphazard path and
expect that your career would be that of a data
scientist by just taking courses here, courses there. I
mean, take courses but in the context of identifying
what your target is. Back calculate. Look at your own
resume. Figure out the gaps. So all of that is there in
the book, and hopefully, that will be a good guide for
some of your listeners.
Kirill Eremenko: Amazing. Thank you. Very cool that you made it free.
That is very admirable and maybe, probably will help
lots of people.
Piyanka Jain: Yes, hopefully.
Kirill Eremenko: So talking about these courses. Very interesting. So
the certification is something that you are actually
organizing internally. I haven't seen that before, so
that is very cool. Tell me a little bit about your SWAT
data science. So SWAT, I know the SWOT framework
as S-W-O-T. Strengths, weaknesses, opportunities,
threats for business, but you have another framework
in addition to your BADIR framework called S-W-A-T.
What does that stand for and how does it work?
Piyanka Jain: Sorry. So that's not a framework. So this is going back
to my days in PayPal. I was heading up business
analytics there for North America, and before that, I
was part of leading product analytics for merchant
consumer on the product side at PayPal. And at that
time, it was some series of projects that I did and the
credibility I won. I and my team became like a SWAT
team. You know the SWAT team who come in when
things are not going... When things are complex
situation, a SWAT team is parachuted into that
situation and they can control it and they can get stuff
done. So we came to be known as a SWAT team, and it
was a pretty small team that I led. It was here as well
as international. And yet, we came to be known as a
SWAT team, and the reason was... And what that
meant was, even if we were part of product analytics, if
there's a problem in Omaha in customer support, I
would be called in. And we'd be saying, "Leave
everything. Drop everything. This is urgent. Come into
this meeting and take this over, and for the next one
month, this is what you're focusing and I need results
by Monday, April 22nd," right? So that was how it
used to be.
Piyanka Jain: And I recognized and I used to wonder, what is it that
made us a SWAT team? What is it that we got... I
mean, there were lots of data scientist team at that
point. And what was it that got us that much
credibility that got us... We didn't have any extra or
special tool. We used the same tool that most other
data scientists had. And we recognized, the power was
us, our hypothesis driven method. This BADIR
framework that after I left PayPal, I sort of formalized,
was what I was doing internally in my head, and my
team was doing it because I was teaching them. As I
onboarded my data scientists, I would teach them this
method, not in this framework the way it is, but I
would inherently teach them this framework. And
what this does is, it gets results quicker.
Piyanka Jain: So for example, today, for our clients, we can get a
really high end, very good accuracy, highly functioning
machine learning model in about eight to nine weeks,
and no other consulting companies can. And that's
very lean like consulting team of two to three people,
we can produce machine learning models so quickly,
same for AI or deep learning models. And the reason is
because we are hypothesis driven, and we do a fair bit
of the same BADIR framework. We do a fair bit of work
upfront aligning stakeholders. So not only does our...
We produce work fast, but the percentage of time our
work gets used is also really high. And that was the
same thing for me. When I was at PayPal, almost every
model or every project we worked on went on to
produce millions of dollars of impact, and had an
amazing shelf life, meaning amazing... Some of the
models were operational after three years or four
years, and the reason was that-
Kirill Eremenko: You mean still operational after three or four-
Piyanka Jain: They were operational and they still form the
foundation of many things because we did a lot of
work on the decision science aspect, the stakeholder
alignment, really understood the question, and then
we were hypothesis driven. So all this brain work that
we followed, this gave us... One is it gave us
acceleration. Second is because we were so success
metric driven because that is inbuilt into our
framework, we were almost always... The stakeholders
could not wait to act on our insights instead of us
having to influence and go after them and say, "This is
what we need to do." They could not wait. It's like a
relay race. They were jumping up and down, ready to
take the baton from us which rarely happens. And the
third thing was, because we did a whole lot of work...
because we were hypothesis driven, we were really,
really fast.
Piyanka Jain: So that same analogy then I took over when I went to...
Eight and a half years ago when I started Aryng, I took
that same analogy and I basically framed the team,
our whole philosophy is similar. It's all about rapid
ROI and also practical data science. We're not about
pie in the sky, fairytale data science. Give us all your
data. We are going to help you monetize it and it will
take over months and months and we'll keep trickling
some insights to you. For us, it's all about practical
data science. What data do you have right now? What
are the decisions you're looking to make? And how can
we get you the fastest go-to-market with that?
Kirill Eremenko: Got you. Well, Piyanka, this is one of the most
saturated podcasts. I can't keep up with you. You have
so many ideas, so many things. I'm just going to jump
to the next question I had for you. So in one of your
videos you talked about analytics projects, and this
ties in quite well with what we just discussed that
having a hypothesis at the start of your analysis,
before you even start your analysis by asking those
business questions and doing analysis plan according
to this BADIR framework, come out with the
hypothesis and then only moving to data collection
deriving insights and recommendations. So doing
those first two steps, coming up with hypothesis helps
your analytics projects be more relevant and creates
success. In one of your videos you said that a huge
percentage of data science and analytics projects
actually fail. I could not believe how low the percentage
of successful projects was. Could you walk us
through that again, please?
Piyanka Jain: Yeah. And it's dismal. Gartner published a report I
think two years ago in 2017, or three years ago, which
then they later corrected to even... It basically said
that 85% of big data or data science projects... And by
the way, this whole space is about $200 trillion
investment that goes in, and of that-
Kirill Eremenko: Trillion. 200 trillion.
Piyanka Jain: Trillion dollar.
Kirill Eremenko: Into analytics?
Piyanka Jain: Yeah. Meaning I'm talking about the big space of data
science and big data. All the infrastructure investment
and so on. Let me correct. There's $200 billion-plus
investment that goes in overall, world over, globally.
And of that, Gartner predicted less than 15% of them
actually drive an impact. So 85%-plus projects
actually fail meaning they get instrumented or they sit
on a shelf somewhere. Nobody uses them, or they get
abandoned halfway through, whatever else. They just
fail.
Kirill Eremenko: They could even look like a success. We derive the
insights but nobody's using those insights.
Piyanka Jain: Nobody's using it because you build the best of the
model. And this by the way, is the biggest pet peeve. I
keynote at many conferences and one of the
conferences I keynoted at is Predictive Analytics
World and some of their top data scientists come to
these conferences. And their biggest... And when I ask,
I often start my keynote with, "Oh, how many
of you have worked on a data science project and it didn't
go anywhere?" And almost all hands go up.
Piyanka Jain: It's one of the biggest pet peeves of data scientists. I
built the most amazing, lowest misclassification, high
accuracy model, but my stakeholders are not listening.
They have moved onto something else, or whatever
reason. So as for Gartner, it was 85%, right? And then
some other experts came in and they did the...
CIO.com for example, published some other reports
which said of the ones which even get finished and get
out, you would deem successful, less than 15% of
them actually drive any significant impact. So by the
time you do all this math, it's looking like of the 200
billion-plus investment, we're talking about 2%.
Kirill Eremenko: 2%?
Piyanka Jain: 2% is actually driving any impact. It's abysmal. This is
horrible. But it is real, and I have seen this live and
that again, goes back to my world when I used to go
back... going back to my PayPal world and now also in
my role at Aryng and our client work. I mean, clients
pay for data science work so you would think our
probability of success would be higher, but I can't tell
you how many times we are coming to project halfway
through or somewhere, even end where it's going
nowhere from some other consulting companies or
whatever else, and the companies come and say, "This
is going nowhere." They've said, "We have already
attempted it. It has failed. Can you correct it now?"
And often, one project that has failed has taken
months to fail also. So it's not like you fail fast, and
the stakeholders have found out you've failed. It's like,
"Oh, we were looking at using NLP. We were looking at
improving our refund or return rates. This is
automotive part. And we did this large scale analysis.
It took us six to eight months, and we realized that
because of this and this and this and this and this
reason, we can't get any lift and the model which was
built is not operational. We asked them to recollect
that." All of that. There's issues galore. And it took
them eight months.
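[Editor's note: the "about 2%" estimate quoted earlier in this exchange can be reproduced with simple arithmetic. The sketch below uses only the figures as stated in the conversation (a $200 billion-plus investment, an 85% failure rate, and "less than 15%" of finished projects driving significant impact). The figures themselves are the speaker's and are not independently verified here, and the variable names are illustrative.]

```python
# Back-of-the-envelope check of the "about 2%" figure from the conversation.
# All inputs are the numbers as quoted by the speaker, not verified data.
total_investment_usd = 200e9      # "$200 billion-plus" invested globally
failure_rate = 0.85               # share of projects said to fail
impact_rate_of_finished = 0.15    # upper bound: "less than 15%" of finished projects

# Share of projects that are finished at all.
finished_share = 1 - failure_rate

# Share of all projects that are both finished and drive significant impact.
impactful_share = finished_share * impact_rate_of_finished

# Investment that corresponds to that share.
impactful_usd = total_investment_usd * impactful_share

print(f"Share driving impact: {impactful_share:.2%}")
print(f"Investment driving impact: ${impactful_usd / 1e9:.1f} billion")
```

Under these quoted inputs the share works out to roughly 2.25%, or about $4.5 billion of the $200 billion, which is consistent with the "2%" mentioned in the conversation.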
Piyanka Jain: So there's so much failure and so much money getting
wasted, so much time getting wasted, all because...
And there's a lot of main reasons. The main reasons for
failure that I have seen is lack of data maturity, which
means people are not even believing the data, or
getting the data out is itself a... It's like putting your
hand in a lion's mouth and getting something out of
that. It's really, really hard. Data science rigor is often
low. People often have data scientists who start from
the step D and do part of I, and they call it done,
right? So they don't do the end-to-end. Often
organizations don't have an engagement model
between business and data science. Sometimes they
don't even have... Well, this is more common. I mean,
lack of enterprise data literacy is a big one where the
data science team is good. They are producing
insights. But the organization, the marketing, the
product, they don't understand data science. They are
wary of data. And so, you give them a machine
learning model, they're not believing it. And so-
Kirill Eremenko: Could I just jump in here? This is an interesting topic
on data literacy because according to you, data culture
consists of three things. Data literacy is one of them.
What are the two others? I was just curious.
Piyanka Jain: So there are four Ds of data culture.
Kirill Eremenko: Oh, four actually.
Piyanka Jain: Four Ds, yeah. So data literacy is one of them. But the
most important or the foundation on which data
culture sits is data maturity, right? The data maturity
being do I have easy and appropriate access to single
source of truth for all, right? So our data scientist
needs a different access. A marketing manager needs a
different access. But do I have appropriate access to
single source of truth, or an easy access? Or does it
take me forever like, "Oh, I click this button. I wait 10
minutes." Do you think your marketing manager is
going to wait 10 minutes to get that report? No.
They're going to start finding shortcuts. You know this
Excel report that comes in from the other system?
Maybe I'm going to look at that, whatever else. So data
maturity is critical. It's the foundation. If you're
looking to establish a culture of data, you need to get
some degree of data maturity. And on a scale of 0 to
10, at least 7 and above and then you'll be functional.
Piyanka Jain: Second part is data literacy. Now that you have access
to data, do the people know what to do with data?
You've given access to the marketing managers, the
product managers, the operations people, the
customer support people. They have access to data.
When the customer calls in, they have access to data,
not only about what this customer's history is, how
much they have spent and all of that. Now, do they
know what to do with it? Do they even know what...
The customer support call center agent, they see their
own call performance data, their average hold time,
average speed of answers, whatever else. Do they know
what to do with it? Do the supervisors know what to
do with it? So that's data literacy.
Piyanka Jain: When appropriate level have appropriate... When
people have appropriate level of data literacy at the
right level for them, and they're able to use the data
effectively to make decision at their level to be able to
use that in discussions to drive conclusion, then your
organization has appropriate level of data literacy. And
currently, data literacy's really low in organizations.
We have gone into organizations where data literacy...
Less than 2%, less than 3% of people have the right
level of data literacy. And in the best case scenario,
maybe 10%, 15% of the people are going to be at the
right level of data literacy. Still 80% don't have the
right skills at their level.
Kirill Eremenko: Wow. Wow. That's crazy. What are the other two Ds?
Piyanka Jain: The other two Ds are data driven leadership. So if the
leadership does not have a vision for a data driven
organization, they don't or they're not holding their
team accountable to use data to drive decision, they're
not using something like zero-based budgeting, you
give me the money, that's when you get the money and
so on, then that organization cannot have a data
driven culture because the leadership itself is not
embodying it, and they don't necessarily see it as an
asset.
Piyanka Jain: And the last one which ties all of this together is data
driven decision making process. So if you don't have
data in a structured decision making process, if data
is not part of the decision making process, then you
can do data all you like. You can build models all you
like. But the decisions are getting made in a parallel
track almost independent of data. And of course, your
organization will not be able to leverage the data, and
will not be data driven. So these are the four Ds. I'm
going to summarize it. Data maturity, data literacy,
data driven leadership, and decision making process.
Kirill Eremenko: Wow. Very, very interesting. I know we're going to have
to wrap up soon, but I have one more question for you.
Piyanka Jain: Sure.
Kirill Eremenko: You mentioned decision science. But what is the
difference between decision science and data science?
Piyanka Jain: That's a great question. Again, I'm going back to the
power of BADIR framework or power of why a SWAT
team works wherever they work, is analytics or putting
data to work has two components. There's data
science and decision science. Data science is all the
algorithmic aspects of the things you need to do to... or
technical aspects to do the technical analysis,
collection of data, identification of the data type you
need, setting up your null hypothesis and all of that.
That is all data science.
Piyanka Jain: Decision science is all the things that you need to do
to make sure that those insights that you produce
goes towards impacts, which means the business
considerations, the stakeholder constraints and
communications, timelines, the realities of the
business, that is all decision science. So the science
that addresses all of those and incorporates that into
data science is decision science. And when you marry
the data science with decision science, you get the
power of data.
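[Editor's note: to make the "setting up your null hypothesis" step mentioned above concrete, here is a minimal sketch of one common way to test a hypothesis stated before looking at the data: a two-sided, two-proportion z-test. This is a generic statistical technique chosen for illustration, not a method described in the episode or taken from the BADIR framework. The function name, the scenario, and every number below are hypothetical.]

```python
from math import erfc, sqrt


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for H0: groups A and B share the same underlying rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Under H0 both groups share one rate, so estimate it from the pooled data.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided tail probability under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical business question: does a new checkout flow (B) change the
# conversion rate compared with the current flow (A)?
# H0: the two conversion rates are equal. All counts are made up.
z, p_value = two_proportion_z_test(successes_a=120, n_a=2400,
                                   successes_b=156, n_b=2400)

alpha = 0.05  # significance level agreed with stakeholders before the analysis
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

The point of the sketch is the ordering: the question, the hypothesis, and the decision threshold are fixed before any data is examined, which is the data science half of the pairing described above. Whether stakeholders then act on the result is the decision science half.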
Kirill Eremenko: Fantastic. Love it. All right, we'll end on that. I think
this was an amazing excursion into the world of data
science. I have a lot to process after this. Before we go,
Piyanka, could you please help our listeners, where
can they find you, follow you, get to know more about
Aryng, if they'd like to explore this space further?
Piyanka Jain: Sure. So they can connect with me on LinkedIn, or
follow me on LinkedIn. And then my name, if they look
up Piyanka Jain and Aryng. Aryng is the company
name, A-R-Y-N-G. They can find us either through
LinkedIn or Aryng. Or they can also follow me on
Twitter. My hashtag is analyticsqueen.
Kirill Eremenko: Fantastic. And of course, pick up the book. Sounds
very exciting. Behind Every Good Decision. Great
reviews on Amazon. Love it. What inspired you to write
the book?
Piyanka Jain: I wish it was an inspiration but it was more of a
forcing factor. At that point, we were doing lots of
public workshops. And every workshop that we would
do or we would conclude, people would be by the time
the... The day four, day five people, our students
would be pestering us that, "Do you have a book? Do
you have a book?" And I said, "We don't have a book."
And I of course, for one reason, I always thought,
"Well, who has time to write a book?" I mean, I
wouldn't even know where to start. And I'm a natural
speaker but writing is not that easy for me. So I said,
"Well, I don't think... I'm not sure." But then that kept
repeating over and over, and somebody planted a seed
and it starts...
Piyanka Jain: And then, right around that time, Wiley have called us.
Wiley reached out, out of the blue. And also [inaudible
00:52:12]. And they said, "Oh, we're thinking of
publishing a book. Would you like to write something
along this line?" And I was like... It was all coming
together. So I said, "Okay. Well, I don't know what it
takes but we can attempt it." And by golly, I mean...
Because I'm all about practical, it took us a while to
get to the level that I wanted-
Kirill Eremenko: It's a big process, right, writing a book. It's a big job.
Piyanka Jain: It was a big process because... And I had a team. I had
my co-author, Puneet Sharma, who was my colleague at
PayPal and who's now at Google. Great guy. We
collaborated together. And he and I are very much in
alignment with how we see the power of data, so that
was great. But then, none of us are writers and so we
had to find some really good editor who could edit
out... really put content in perspective for users to
understand because we were saying a lot of things but
if we are technical, somebody has to call us out on it
like, "Hey. It's not making sense."
Piyanka Jain: And so my dear friend, Laxmi, came about on a hike,
one of the hikes we were doing up PG&E here. She
started talking to me and by vocation, she is not a
writer so I had never thought of her. But as we hiked
that steep four-mile up which... It's a very tricky hike
because every turn, you think, "Okay. I'm almost to
the top." But it takes you about [inaudible 00:53:37]
pretty steep hike. And she got the entire gist of what
we were trying to do, and this one chapter... I was
basically kind of whining to her that, "Hey, I hired this
editor and they are correcting our English but they
keep taking the content out. It's not working well."
Piyanka Jain: And since she started talking to me, and then the way
she sort of reframed what I was saying, I was like, "Oh.
Do you have time to work on this project with us?"
And she thankfully did. And that time, I was pregnant,
and also, she was pregnant and it was so funny. And
then Puneet was stuck in between two pregnant
ladies who were like... Our hormones are high and
we're trying to collaborate on this project over phone,
over live. And then we hired a graphic design team
because we are both very-
Kirill Eremenko: I love the images in your book. They're so good.
Piyanka Jain: Isn't it?
Kirill Eremenko: So the one with the sharks, hello data science. That is
so funny. You got some really cool illustrations.
Piyanka Jain: Yeah, thank you. And we hired one of the best teams
because I am already visual. And I said, "I don't want
to write a dry book. I want to make it fun." And so we
got this team together, and finally what came out, I
was happy with and then it got published. So I know
you asked me a short question and I gave a long
answer.
Kirill Eremenko: No, no. Love it. Love it. I highly recommend. I'm a big
believer in this. My own book is also about helping
people to get into data science. This sounds like it's a
different perspective. You introduce the BADIR
framework there. I think it's a fantastic book for people
to pick up. Definitely check it out. It's available on
Amazon. Yeah, looks like a great book.
Piyanka Jain: Thank you. Thank you so much, Kirill. It was a
pleasure talking to you, and it was such a joy having
this conversation.
Kirill Eremenko: Thank you, Piyanka. Yeah, lots to process. I think we
might need to do a second podcast sometime down the
line.
Piyanka Jain: Absolutely. Would love to.
Kirill Eremenko: So there you have it ladies and gentlemen. That was
Piyanka Jain, president and CEO of Aryng. I hope you
enjoyed this podcast and got a lot of value out of it. I
know it probably felt like drinking out of a fire hose.
Piyanka has so much knowledge, so much information
on the space of analytics. That's why I said at the end
that probably, we need to do a second episode to dive
deep into specific topics here. I had so many
interesting favorite parts here. I loved the discussion
about what data culture is, the four components, the
difference between data science and decision science,
always an interesting topic. Probably my biggest
favorite out of all of them was the hypothesis based
approach to data science. I think that is a very
refreshing approach rather than just diving in and
trying to solve everything, trying to boil the ocean. We
all know that you need to ask the right questions, but
this hypothesis based data science actually takes it to
a whole new level. So if you're interested in learning
more, check out the BADIR framework.
Kirill Eremenko: As usual, you'll find the show notes at
SuperDataScience.com/363. That's
SuperDataScience.com/363. There you'll find any
links and materials we mentioned on the episode,
including Piyanka's book, or books I should say, one
which you can purchase on Amazon. I think one she
said is free. Then you can find Piyanka's courses there.
You can find Piyanka's company if you want to do
any consulting projects with her, and of course
LinkedIn, Twitter, everywhere else where you can
follow Piyanka. Piyanka does quite a bit of keynotes
around the world, probably also in virtual events. So
make sure to follow her and maybe you can attend one
of the upcoming events with her as well.
Kirill Eremenko: And on that note, if you know anybody who would
benefit from this podcast, make sure to send them the
link, SuperDataScience.com/363. Very easy to share,
and maybe you can help somebody become an even
better data scientist by applying some of the methods
that we spoke about today. Thank you so much for
being here today. Really appreciate you spending this
hour with us and taking the time to tune into the
SuperDataScience podcast. Hope we delivered on
bringing you an amazing guest once again, and I will
see you back here next time. Until then, happy
analyzing.