Transcript of SDS PODCAST EPISODE 207 WITH KRISTEN KEHRER
Kirill Eremenko: This is episode number 207 with founder of Data
Moves Me, Kristen Kehrer. Welcome to the Super Data
Science podcast. My name is Kirill Eremenko, Data
Science Coach and lifestyle entrepreneur. And each
week, we bring you inspiring people and ideas to
help you build your successful career in data science.
Thanks for being here today. And now, let's make the
complex simple.
Kirill Eremenko: Welcome to the Super Data Science podcast, ladies
and gentlemen. Super excited to have you back on this
show. And today, we've got a very inspiring guest,
Kristen Kehrer, a data scientist with 10 years of
experience, a data science influencer, a future data
science author, a co-host of Data Science podcast and
many, many more exciting roles that Kristen fulfills in
the space of data science in the way she gives back to
the community. And in fact, Kristen was one of the
speakers at DataScienceGo 2018, and her talk was full
of energy.
Kirill Eremenko: There was lots of excitement, lots of people came up to
Kristen after her talk. And today, she is here on the
podcast to share her journey in the space of data
science with us. And in this podcast, you'll find a lot of
valuable tips. You'll find out how and why Kristen uses
certain data science tools from SQL, to R, to Python, to
big data tools, visualization tools. You'll also find out
why Kristen uses R sometimes, and why she uses
Python sometimes, and why Kristen recommends to
make sure that you know both of these tools and what
each one of them is good for.
Kirill Eremenko: You'll also hear some valuable career hacks and tips,
whether you're just starting out into data science or
whether you're an advanced data scientist. You'll find
hacks on what technical skills actually add value to
businesses and are quite easy to learn. You will find
out what to do about your soft skills, how to give back
to the community, and in fact, how to better structure
your resume. And in fact, in terms of that last one,
you'll find a special surprise waiting for you towards
the end of this podcast, Kristen shared something
exciting with us in terms of her course on building a
resume.
Kirill Eremenko: So lots and lots of value for all levels of data scientists
and lots of energy from Kristen Kehrer right here on
the show coming up just now. So without further ado,
I bring to you Kristen Kehrer, founder of Data Moves
Me.
Kirill Eremenko: Welcome, ladies and gentlemen, to the Super Data
Science podcast. This is going to be fun because we
were just recording this podcast with Kristen and then
my computer crashed, so this has got to be our second
attempt at it. Kristen, my huge apologies for that, but
it was so much fun. It was this great energy. So let's
recreate that from the start. How are you feeling about
that?
Kristen Kehrer: I'm feeling great. Let's do it.
Kirill Eremenko: Awesome. Okay. All right. I believe we started off by
me complimenting your amazing energy at
DataScienceGo and how you were inspiring everybody.
You brought in so much positivity to the event. How
did you feel about everything that happened there?
Kristen Kehrer: Oh my God. I thought it was amazing. It was fantastic
to meet people who I've been building relationships
online with for the last couple months, to meet them
and give them a hug or other people that I've been
interacting with on their posts and really get an
opportunity to meet them and connect, and everyone
was warm and friendly and the energy was incredible
the whole weekend.
Kirill Eremenko: Yeah. Thank you for the compliments. The amount of
energy you brought was just incredible. I think your
talk was one of the ones where people are like laughing
the most and having a really, really great time. That
was really cool to hear and see. Just for our listeners,
for the sake of our listeners, Kristen does a lot of
things in the space of giving back to the community of
data scientists. Kristen writes her own blog posts,
you've got webinars that you run. You've got these
sessions with Favio Vázquez. You appear on podcasts,
you're writing a book with Kate Strachnyi, who was on the
Super Data Science podcast not that long ago.
Kirill Eremenko: You're just generally helping people, speaking at
conferences, and you have your own website with a
course on it. So that is very, very exciting. I want to
ask you, where do you find the motivation and energy
to do that?
Kristen Kehrer: Yeah, it comes from true passion. I work 9:00 to 5:00,
and everyone says that after you have kids you have
less time, but that just hasn't been the case for me
because my kids go to bed at 8:00 pm and then I have
from 8:00 to 11:00 to work on other initiatives, and I'm
not a TV watcher so that's what I'm doing. Like last
night, I was turning a video into an image data set so
that I can start doing some object detection in my free
time. The people are just so much fun and I like
taking part in data science office hours with Tarry
Singh and Kate Strachnyi and Favio and some others.
Kristen Kehrer: I'm building these amazing relationships, and it's not
like I'm coming from a place of I need to give back. It's
that I just am because it's so much fun. It's
really like my purpose.
Kirill Eremenko: So it's something that you enjoy and you actually want
to do?
Kristen Kehrer: Yes. 100%.
Kirill Eremenko: That's very cool. Would you say that, I believe I asked
you this question and I think it's an important one so
I'll ask it again. Would you say that you have to have
like in your case, 10 years of experience and be an
expert at something to be able to give back? Or do you
think that anybody who's even starting out in the field
has the capacity to give back and help others?
Kristen Kehrer: Anyone in the field absolutely has, or not even in the
field. If you are in school and you learn something
cool, share that with others. There are people who
want to read it. The LinkedIn community is incredibly
welcoming, put yourself out there and you're going to
be so pleasantly surprised with the response that you
get. It is a little bit making yourself vulnerable to put
yourself out there, but you absolutely have something
to share with those who are not learning the same
stuff.
Kristen Kehrer: You could be studying at a different school than
somebody else and learning the material in another
way, and you may help somebody better understand
an algorithm or something else, or you may give them
that aha moment that really helps someone.
Kirill Eremenko: Got you. What I like about your approach is that you
work in collaboration with other data scientists as
well. So in addition to giving back on your own, you've
taken it to the next level and you have these
webinars with Favio and you're writing a book with
Kate. Tell us a bit about that, how do you go about
finding these partnerships, working to maintain them
and create projects together?
Kristen Kehrer: It's all been pretty organic. It's like Kate Strachnyi had
posted on LinkedIn forever ago, "Hey, I'm doing Humans
of Data Science. Comment here if you want to take
part." So I commented and when it was my turn to be
on Humans of Data Science, which was open to
anyone. You could have been a first year data science
student. I met Kate and we have an incredible
friendship now, I'm not overselling it. I'm actually
traveling to her house in New York next weekend and
I'm going to spend the night.
Kristen Kehrer: I've made friendships on LinkedIn. And with Favio we
were in this group chat and we just started talking
about similar things. We started talking some more
and decided to launch our webinar series together. I
come from that mentality of, put it out there. I don't
overthink anything too much, like if somebody has a
great idea, I'm just typically like I'm in, within the
construct of like I definitely set healthy boundaries for
myself and that way I'm always able to meet my
deadlines, but if I have time for something and I can fit
it in and it's exciting, I'm just the type of person who's
going to go for it.
Kirill Eremenko: That's a very cool way of putting it. So just like
network and connect with people online, chat, and
when you find someone with similar interests, grab the
opportunity by its horns and give it a chance, right?
Like if somebody is suggesting something, you don't
have to commit to a year of work together, but like give
it a go and see how it works out. And if the first time
you guys are able to create something that gives value
to other people, why not continue, right?
Kristen Kehrer: Yeah. Actually, that's totally how my blog started. So
my friend Jonathan Nolis, he's also a data scientist, I
noticed him getting active on LinkedIn and I texted
him and I was like, "What are you up to Mr. LinkedIn
social guy?" And he was like, "You should write a blog
article." And I was like, "Okay." I launched my first
blog article in March and now I get a lot of shares and
a lot of likes on my blog articles and I haven't been
doing it for very long.
Kirill Eremenko: That's amazing. Just for all listeners, we're actually
talking like massive, massive growth and impact.
That's how in demand this space is, and that's how
much people are hungry for help and knowledge in
this space. Kristen started in March blogging, and now
she has, Kristen you have 13,000 followers on
LinkedIn. In fact, congratulations. It just went over
13,000 as we were speaking. That's awesome.
Kristen Kehrer: Thanks.
Kirill Eremenko: That's so cool. And so when you say blogs, you don't
have to have ... I think you have your own ... You blog
on your website, but people can just blog on LinkedIn
as well. Is that correct?
Kristen Kehrer: Absolutely. You can create LinkedIn articles. My first
article I ever posted was actually on Medium first. I
sent that article into Towards Data Science, it got
rejected and then so I just submitted my next article
to Towards Data Science and that one got accepted. And
so now anything I write can go in Towards Data
Science, and I would like to say that my first article
was awesome.
Kirill Eremenko: Nice. What was it about?
Kristen Kehrer: It was about using segmentation to learn, and how in
business, oftentimes you'll hear people say, "I want us
to do a segmentation. Can these be the segments?"
And it's like, "No, we should use an unsupervised
algorithm." At least that's absolutely my preference.
The algorithm decides what the natural groupings are,
or at least have an understanding of what your natural
groupings are. There's also a lot of times where I've
heard in multiple companies that I've worked for, "Hey,
we've done this market segmentation, can you tie
these people back to our actual customers?"
Kristen Kehrer: And the answer is, "No. Like, I don't have the survey
questions that you asked for my whole dataset. But I
can build you a segmentation on your internal data,
and if you want we can append third party data." And I
was also talking about like really thinking about
creative ways that you can come up with new
variables. And in one of those I mentioned, one of the
variables that I mentioned was like, "Can you
determine customers with a seasonal usage
pattern?" And then after I wrote that blog article, I
went on to find customers in our database who had
seasonal usage so that we can message to them
differently.
Kristen Kehrer: So instead of looking at somebody who has used less
than normal over the last couple months and thinking
that they're a retention risk, we're now able to market
to them differently, and say, "Hey, here's how you can
build your business in the off season." And that's
really helpful. I work for Constant Contact, they sell an
email marketing solution for small, medium, large
businesses, anyone who would benefit from email
marketing, but there's certainly people who, if you are
a ski resort or something, I don't know if a ski resort
would need it, but in the off season, it would be useful
to their business to continue thinking about list
building and continue thinking about how they can
stay front of mind in the eyes of their customers.
Kristen Kehrer: And so I feel like we're able to add value for these
people who sometimes go dormant for months at a
time.
Kirill Eremenko: That's a very, very interesting approach. I think that's
really very valuable. We actually were talking before
about techniques. Maybe this is a good opportunity for
us to revive the conversation. What would you say, you
have a whole array of techniques that you have
expertise in, from time series analysis to forecasting,
cluster analysis, segmentation, neural networks, text
analytics, survival analysis, full factorial MVT. What
would you say is the most valuable? And I really liked
how you mentioned previously that you've got, what
you like yourself, what you enjoy and what's useful to
the business? Do you mind sharing that with us again,
please?
Kristen Kehrer: Yeah, sure. So what I was saying is exactly ... like
there's things that I find super fun, and of course
when I was identifying seasonal customers that was
sort of like an off label use case for the model. And so
things that are a little bit more innovative and fun, like
that's really exciting to me, but a lot of the times where
I'm able to add the most value is in things like
multivariate test analysis, which isn't a skill that most
people have. I don't know that it's taught in a lot of the
data science programs. That's me just conjecturing, I
don't have any factual information on that.
Kristen Kehrer: But I haven't really found too many other people who
are well versed in MVT, so I'm able to teach that to
other analysts at Vistaprint and I'm able to teach that
to other analysts and data scientists at Constant
Contact and that allows them to do multivariate tests
on their website and really be able to understand the
interactions that are going on there instead of doing
iterative A/B testing where of course you'd be like
losing some information. And so that's teaching other
people how to read ANOVAs and do this analysis and
that opens up more possibilities in terms of testing.
Kirill Eremenko: Yeah. Could you give us like a sample application of
MVT. I don't know, some of us maybe who are not
familiar with the technique might be able to see the
value and then start learning it.
Kristen Kehrer: Yeah, sure. So like I said, if you do iterative A/B
testing, you're not able to see the interaction of certain
variables. So in a multivariate test, it might be
something like you are promoting a sale on the
website, and in what areas should you promote that
sale, right? Because all of the real estate on the
website is important, and if you are not promoting the
sale you could be mentioning other copy or promoting
other products. So let's just say this is a site wide sale
or something, and maybe you'll have that in the
marquee, which is like the header, maybe you have
that on a product page in like a little box.
Kristen Kehrer: There's just so many different areas of a website.
There's the header, the footer, the marquee, different
product tiles. And any of those tiles could be swapped
out. And so you'd basically be looking for what is the
optimal placement or combination of placements that
is best for promoting a sale that's either going to lead
to a higher conversion rate or a higher revenue.
Kirill Eremenko: So, you're kind of like testing multiple changes at the
same time rather than one by one, or like two versus
two, like one versus one many times?
Kristen Kehrer: Right. Exactly. If the sale is either on or off in a certain
placement and there's four different placements you're
considering, then that's two to the four. And so whatever
that is, 16, so you're testing 16 different things. So it's
on in placement one and off in the other three
placements, and all those permutations up until you
get to having all four on. Intuitively a lot of people
think like, "Oh, okay, if I have the sale on in four
different placements, that's going to be better than
only having it on in three." And that's actually not true
a lot of the time.
Kristen Kehrer: You can find an optimal way of placing that message
and freeing up other space to message to other things.
But the benefit of the MVT is that you learn of the
combinations and the interactions. Whereas in a split
test or even if your split test has multiple cells, so if
you have cell A, B, C and D and you're doing different
things, you're not understanding the interaction
between A, B and C. Whereas in a multivariate test,
you can actually get at what's the effect of the
interaction of these three things, having all three
things on at once versus having four independent
cells.
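As a rough illustration of the full factorial design described above, the sketch below enumerates every on/off combination of four placements. The placement names are assumptions chosen for the example, not the ones used in any actual test.

```python
from itertools import product

# Hypothetical placement names, used only for illustration.
PLACEMENTS = ["marquee", "header", "footer", "product_tile"]


def full_factorial(placements):
    """Return every on/off combination of the given placements."""
    return [
        dict(zip(placements, flags))
        for flags in product([False, True], repeat=len(placements))
    ]


cells = full_factorial(PLACEMENTS)
print(len(cells))  # 2 ** 4 = 16 test cells

# Show a few cells: each maps a placement to whether the sale is shown there.
for cell in cells[:3]:
    print(cell)
```

Because every combination appears as its own cell, the analysis can estimate interactions between placements, which a sequence of separate A/B tests cannot.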
Kirill Eremenko: Yup. Makes sense. Thank you very much for the
example. You mentioned as well that the two things,
that there's something that is really valuable to the
business and I can see how this would be an extremely
valuable skill to bring into the business. But then you
said that there are things in data science that you are
most excited about. So what would you say out of
these skills that you have, out of these different
algorithms that you use, what would you say is the
one that you are most excited about?
Kristen Kehrer: Yeah. I honestly get excited to build any type of model.
Kirill Eremenko: That's a good thing to get excited about if you're a data
scientist.
Kristen Kehrer: Right now I'm working on a large cluster analysis that
I'm really excited about. For me, it really is. Data
science is both an art and a science and being able to
... the added complexity comes in when you think
about your output and does this make sense in terms
of the business question and really like trying iterating
and trying different things and finding that answer
that truly gets at the business question that's
actionable, that people will ... we can automate this
and tag people and build campaigns off of it. I just
enjoy it all, and I'm really enjoying the segmentation
that I'm currently working on.
Kirill Eremenko: Fantastic. In addition to a lot of different algorithms
and skills, techniques that you have, you know quite a
bit of tools. You're a very technical person from my
perspective. You know SQL, R, Python, Tableau,
Hadoop as well. In fact two types of SQL. Could you
tell us, what would you say is the most important
foundational skill or tool out of all of those?
Kristen Kehrer: Yeah, so I always say SQL because even though every
day now I'm in Python and I'm writing my SQL queries
in Python, day one, if you're a data scientist and you
walk into a new company, they're going to say, "Here's
our data warehouse. This is where you're getting your
data from," and you can have all of the techniques in
the world to build models, but if you're not able to
access the data and pull it correctly in a way that
makes sense, then you're sort of stopped at the
starting point.
Kirill Eremenko: Totally, totally agree. When I was starting out at
Deloitte, that was the single most valuable skill that I
had. And I brought into the business, I think I actually
studied SQL before the interview quite extensively to
make sure like I know how to get the data out of their
databases to work with it. And SQL isn't that hard,
right? It doesn't take that long to learn.
Kristen Kehrer: No, absolutely not. I taught it, not taught. Well, I have
taught it, but I learned originally on the job, and it was
something where it was a skill I didn't have, it was on
the job description, and I reached out to the company
and I said, "You know, I don't have any experience
with SQL but I'm competent and I can learn." And I got
the job and they taught me SQL, and it wasn't very
long before I was up and running. Even at Vistaprint
where I was managing people, I'd have reports that
would also come with no SQL experience, and there we
didn't have people come in and teach us.
Kristen Kehrer: So I learned with like an external consultant that
literally came in and taught a group of us SQL. But at
Vistaprint, I was teaching people SQL and it was
literally just like sitting down, and it's, "Here's these
tables and this is the schema, and this is how you
read the schema and now we're going to do some
joins." And people get up and running really quickly.
It's not a huge barrier. Like if you're somebody who's
listening right now and you don't know SQL, like you
can go and take an online course and do some
Googling, and with some effort you can pick it up
relatively quickly.
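For listeners who have not seen a join, here is a minimal sketch of the "tables, schema, joins" exercise, run from Python with the standard-library sqlite3 module. The table names, columns, and rows are toy assumptions made up for the example; they do not reflect any real company schema.

```python
import sqlite3

# In-memory database with two hypothetical tables and a few toy rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE email_sends (
        send_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        send_month TEXT
    );
    INSERT INTO customers VALUES (1, 'Ski Shop'), (2, 'Church Group');
    INSERT INTO email_sends VALUES
        (1, 1, '2018-01'), (2, 1, '2018-02'), (3, 2, '2018-06');
""")

# Join each customer to their sends and count them.
# LEFT JOIN keeps customers who have no sends at all.
query = """
    SELECT c.name, COUNT(s.send_id) AS sends
    FROM customers AS c
    LEFT JOIN email_sends AS s ON s.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.customer_id
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

The same query text would run against most SQL databases with little or no change, which is part of why the skill transfers so easily between companies.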
Kirill Eremenko: Oh, fantastic. Yeah, I totally agree with that. SQL, a
very good skill. And also, I see that SQL, I'm assuming
Microsoft SQL and PostgreSQL. So it's a good idea to
know at least two types of SQL because there are four
dominant types of SQL in the world. There's also
Oracle and there is also MySQL. And so out of the
four, it's good to know at least two, to get you through a
lot of situations. And then I also know that you used
both R and Python. Can you tell us a bit about how
and why you used the two tools rather than sticking to
just one of them?
Kristen Kehrer: Yeah. I had started with just one tool, I started with R
in 2004, and this was before RStudio.
Kirill Eremenko: Whoa, before RStudio. I can't even imagine R without
RStudio.
Kristen Kehrer: RStudio didn't come out until like 2010.
Kirill Eremenko: Wow. That must have been hectic to type in all that
code into a word editor.
Kristen Kehrer: The editor. Yeah, it was definitely. R has gotten so
much easier. Like if you're new to R, you should be
really grateful that you're jumping in at this time
because the learning curve was rough back in the day.
That's where I started with all my modeling, but in my
master's degree, there wasn't as much ... I wasn't
working with a database, so there wasn't as much
manipulation to do. So my core strength in R is really
the modeling piece, and then I started picking up
Python only about six months ago and so I'm sort of in
the middle of this identity crisis where I will do a lot of
my manipulation and cleaning and automating
different things in pandas and NumPy.
Kristen Kehrer: And then if I'm building a model, I will call rpy2 and
run a model in R through Python after I do the data
cleaning in Python.
Kirill Eremenko: That's definitely a bit of an identity crisis. But I would
say it's beneficial that you are constantly interacting
with the tools because like I've met people who are
very proficient in R, and then they start learning
Python, and then two years later they haven't used R
that much and they don't really remember how to use
it and they're not as confident. Like even if there's
something that ... Because some tools are good for
some things, other tools are good for others. R and
Python both have their advantages. And so in
those cases, people would know even that R might
have an advantage of doing something, but because
they haven't used it for two years, they will still stick to
Python.
Kirill Eremenko: Would you agree that like by using them constantly,
both at the same time, you are maintaining this high
level of acumen and you can jump into either tool
whenever you need it?
Kristen Kehrer: Oh yeah. I have both open on my work laptop right
now and I will just go back and forth. Or if somebody
mentions a new R package on LinkedIn, I'm checking
that out. I want to use the coolest, newest, shiniest
thing and it doesn't matter which tool it's in.
Kirill Eremenko: Definitely. And I hope this serves as an inspiration to
our listeners that ... A lot of time we get asked the
question, R versus Python, which one to learn? Well,
learn both. Start with the one that's ... try out both,
see the one that you feel better about and then just
learn them both. I would personally say that probably
Python is a bit easier to learn. What do you think?
Kristen Kehrer: Oh, absolutely. In terms of data manipulation,
Python's very intuitive to pick up, but at the same
time, R has some modeling capabilities that are tried
and true, and those packages have been around for
a while and Python's starting to catch up. But even just
a couple of months ago, they released auto ARIMA in
Python, but it had been available for a long time in R.
And so there are certain times where just the depth,
it's the breadth and the depth of statistical modeling in
R that can just land you in R sometimes.
Kirill Eremenko: Yeah, totally agree. So another skill that you have, an
interesting one on your list of skills, which as we can
see, is already building up quite a diverse list of skills
in terms of data science, is Hadoop and Hive. So that's
us moving into the space of big data. Could you tell us
like how valuable is it to have those skills? How
valuable has it been for your career to know how to
deal with big data?
Kristen Kehrer: I think it's been super valuable in a number of
different ways, and one of them is just simply that I
don't need to speak to the big data team if I think of a
variable that, or someone asks a question, if one of my
stakeholders asks a question and I know that that
data is available in the big data environment, I don't
need to ask somebody else to get it for me. I'm not
waiting on somebody else, or nothing's going to hold
me up when I'm trying to access all the data that I
might need for a model.
Kristen Kehrer: So that's been super useful. And then I think part of it
too is we hear big data and I had been hanging out in
the regular data world for a while and these things
become sort of big in your head like, "Oh, that person
..." Everyone's talking about big data, and so you think
it's going to be this like thing that's scary or
intimidating and it's not. Like Hive is very similar to
SQL once you figure out how to access the big data
environment, like you can really easily start querying
that and getting results back in and it intuitively
makes sense if you already have the SQL knowledge.
Kirill Eremenko: That's very inspiring to hear. If people are interested in
big data, it's probably a good idea to check it out to at
least as you say, have that level of knowledge that
allows you to go in and get the data that you're looking
for and deal with these tools and learn them on the go.
So once you have that initial interaction with big data,
you see that it's not actually that scary, it's not that
different to SQL then that'll be helpful. Like personally,
I've worked with big data on the job using Greenplum
and with one of their consultants, we were going
through these things and indeed, it has its own
specifics, but at the same time, you can quite quickly
get your head around, not in extreme depths of the
topic, becoming a big data expert, but to have that
skill, to be confident that you won't get lost when you
need it. I think that's very useful for everybody.
Kristen Kehrer: I'm not setting up a cluster or anything.
Kirill Eremenko: Yeah. Got you. And then let's quickly chat about
visualization. So that's another skill that you highlight
that you have in terms of data science and indeed your
talk at DataScienceGo was on killer presentations,
bringing model output to life with data storytelling. I
don't know, it was almost an hour talk, so we're not
going to go through the whole thing now, but can you
give us some of the biggest takeaways? Why is data
storytelling such an important skill for data scientists?
Kristen Kehrer: We get this reputation that we are the person who's
going to try and solve this problem. We go and hide in
the corner for six months and then we emerge and we
try and explain our results to the business and to our
stakeholders in a way that they don't understand. And
a lot of these algorithms that we're building, the first
one that I start with is a neural net that I had built in
2011, and how I presented it to the business. And that
was showing them a bunch of functions that wasn't
going to land with the audience because these were
people who were nontechnical.
Kristen Kehrer: And instead of explaining it to the business in terms of
functions that they don't know what a Sigmoid
function is or maybe they've seen the graphs, but they
certainly don't need to see the function, I can bring
that to life by showing them examples of certain days
that I had forecasted, and what days the forecast fell
apart because there was a popup thunderstorm or
what days the model performed particularly well. And
really bring to light like, "Okay, I built this model and
this is when it works the best. Here's some things that
we need to consider and when it's not going to work as
optimally."
Kristen Kehrer: And I can just show them nice intuitive graphs, or
even when I was just talking about identifying customers
with seasonal usage patterns, I wasn't talking about
Fourier transforms. It was, "Here, look, here's a
customer," and I went into the database and I found a
person who was seasonal. And it was clear that their
business was going to be seasonal and I showed a
picture of that person and their logo and gave them an
understanding of this specific person and what their
needs might be. And then you're able to see their
usage pattern in a really simple graph.
Kristen Kehrer: And it's like the model said, this person was seasonal.
And I can also show a picture of Joan, this woman
runs a church group and churches are typically
looking for donations year round. And so you see that
this woman's usage pattern isn't going to come up as
seasonal because regardless of the month, if I plot year
over year data, in any month, she could have sent zero
times or she could have sent one time. And in some
months, there was a spike, but there was no way for
the model to say that she was seasonal because there
was no definitive pattern to the way that she was using
the product.
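To make the seasonal-versus-not distinction concrete, here is a minimal sketch of one possible seasonality score based on a plain discrete Fourier transform. This is an assumption made for illustration, not the model described in the episode: the function name, the score definition, and the toy monthly counts are all hypothetical.

```python
import cmath


def annual_seasonality_share(monthly_counts):
    """Share of non-constant signal power that sits at yearly harmonics.

    Expects at least two whole years of monthly values. Returns a value
    between 0.0 and 1.0; a pattern that repeats exactly every 12 months
    scores close to 1.0.
    """
    n = len(monthly_counts)
    years = n // 12
    if n % 12 != 0 or years < 2:
        raise ValueError("need at least two whole years of monthly data")

    # Power at each non-zero frequency up to the Nyquist frequency.
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(monthly_counts)
        )
        power[k] = abs(coeff) ** 2

    total = sum(power.values())
    if total < 1e-9:
        return 0.0  # flat usage: no variation to attribute to seasonality

    # Frequencies that complete a whole number of cycles per year.
    seasonal = sum(p for k, p in power.items() if k % years == 0)
    return seasonal / total


# Toy data: a summer-only sender versus a single one-off spike.
summer_sender = [0, 0, 0, 0, 0, 5, 9, 9, 5, 0, 0, 0] * 2
one_off_spike = [0] * 5 + [3] + [0] * 18

print(round(annual_seasonality_share(summer_sender), 3))
print(round(annual_seasonality_share(one_off_spike), 3))
```

A score like this would only be the internal machinery; as the conversation points out, the stakeholder-facing version is a simple usage graph for a recognizable customer.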
Kristen Kehrer: And so, even if you're building a model that is
complex, there are ways that you should be able to
talk to the business and to create those visualizations
in a way that doesn't set the person off. Not set off,
that's not the right wording, but like showing model
output. If I show logistic regression model output, and
I had an example in my presentation where in 2013, I
thought I was doing better and I had this logistic
regression model output and I had converted log odds
to odds because of course, who knows what log odds
are like intuitively when they look at it?
Kristen Kehrer: And the thing is that slide did nothing for the audience
because first of all, I would have had to explain that
the coefficients were multiplicative and I would have
had to explain what the p-values meant. And that
totally detracts away from the fact that the model that
I had built said, "Okay, these customers are more
likely to come back. We should target them." And sort
of what makes up this group of customers that are
more likely to come back and on the flip side, who are
the people who are less likely to come back and why is
that?
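The log-odds-to-odds conversion mentioned above is a single exponentiation, and it is also why the coefficients become multiplicative. A minimal sketch follows; the feature names and coefficient values are hypothetical, chosen only to show the arithmetic.

```python
import math


def odds_ratio(log_odds_coefficient):
    """Convert a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(log_odds_coefficient)


# Hypothetical coefficients, not taken from any real model.
coefficients = {"opened_last_email": 0.7, "months_inactive": -0.2}

for name, beta in coefficients.items():
    ratio = odds_ratio(beta)
    direction = "more" if ratio > 1 else "less"
    print(f"{name}: odds multiplied by {ratio:.2f} ({direction} likely to come back)")
```

Since exp(a + b) = exp(a) * exp(b), effects that add on the log-odds scale multiply on the odds scale, which is the point that would have needed explaining to a non-technical audience.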
Kirill Eremenko: I totally agree. And I think it also takes time. If you
find yourself explaining what log odds are and how
p-values work, then that's going to take like 20 minutes
at least of your audience's time, and by the time you're
finished, they've already forgotten what the whole
conversation was at the start and half of them are
already asleep. I'll say you really need to take into
consideration the technicality of your audience, the
average or the minimal technical level in your
audience and tailor your presentation to that.
Kristen Kehrer: Absolutely. Because if you show them model output
and you lose them in the beginning, you're not going to
get them back either for like your heavy-hitter slides
at the end, they're already like, "Oh, this AI mumbo
jumbo," even though it's not AI, you know. But we're
throwing that term around all the time.
Kirill Eremenko: By the way, what do you think of AI? You use neural
networks in your work, how powerful have you found
them to be?
Kristen Kehrer: The model that I built had a MAPE of 0.85. I was
building this neural net to forecast hourly electric
load, and this was super instrumental in determining
capacity, like whether or not we had to move over
energy from one subsystem to another. I forget all of
the terminology in terms of what they did, but it was
so that they could manage the capacity of the load.
And originally, I had had some ARIMA models that I
had built to do this, but realistically, the relationship
between load and the weather is nonlinear. We were
able to get much better accuracy, which actually
had business implications in terms of making sure
that people's lights don't go off.
Kristen Kehrer: And that wasn't scary either, it was a whole lot of just
data, it was making sure that we took into
consideration daylight savings time and dummies for
holidays, dummies for the day of the week, dummies
for everything, dummies everywhere. And temperature
and dew point and humidity and amount of snow fall.
So there was like a lot of data, but it was way more
accurate than when we were using an ARIMA model.
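As a rough illustration of the "dummies everywhere" feature engineering described above, the following Python sketch builds dummy (one-hot) variables for day of week, hour, and holidays, and computes a mean absolute percentage error (MAPE), a common accuracy measure for load forecasts. The holiday list, weather values, and function names are assumptions made for the example; this is not Kristen's actual pipeline, and daylight saving handling is left out.

```python
from datetime import date, datetime

# Hypothetical holiday list; a real model would use the utility's own calendar.
HOLIDAYS = {date(2018, 12, 25), date(2019, 1, 1)}


def calendar_features(ts: datetime) -> dict:
    """Dummy-encode day of week and hour of day, plus a holiday flag."""
    features = {f"dow_{d}": int(ts.weekday() == d) for d in range(7)}
    features.update({f"hour_{h}": int(ts.hour == h) for h in range(24)})
    features["is_holiday"] = int(ts.date() in HOLIDAYS)
    return features


def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    pairs = list(zip(actual, forecast))
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# One input row: calendar dummies plus made-up weather readings.
row = calendar_features(datetime(2018, 12, 25, 17))
row.update({"temperature_f": 28.0, "dew_point_f": 20.0})
print(len(row))

# Compare made-up actual and forecast hourly loads.
print(mape([100.0, 200.0], [99.0, 202.0]))
```

Rows like this would be fed to whatever model is being fitted; the same feature table can serve both a neural net and a simpler baseline, which makes the accuracy comparison fair.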
Kirill Eremenko: Wow. That's really the power of AI right there. It's an
inspiring example of how you can take one approach,
replace it with deep learning, artificial intelligence, and
all of a sudden, you're taking so much more into
consideration. The price you pay ties into this whole
visualization and presentation. The price you pay is
that, it's harder to explain these models. A lot of people
see them as black box models. What are your thoughts
on that?
Kristen Kehrer: Obviously, the coefficients aren't as easily
interpretable as if we had a regression model or if we
had a CART decision tree where you can say, "Okay,
we're maximizing entropy and this guy is the most
important." But at the same time, you're still able to
take a step back and say, "Okay, I know that this
model isn't going to perform well when we all of a
sudden have a thunder storm or it's dependent on the
weather forecast. If the weather forecast for the day is
crap, then I'm not going to be able to accurately
forecast the load. I'm dependent on the weather
forecast." Those things are very conceptually easy to
understand, and I can explain those things.
Kristen Kehrer: The problem with stakeholders is that they just get
nervous when there's a black box, and you can calm
their nerves by showing them, like taking their hand
and saying, "It's okay. When the weather forecast is
good, this is what we can expect in terms of our
average error and on certain days, we're going to see
this behavior, but that's okay." And really spell it out
for them. So it can still be something that's difficult to
understand, but you can still explain it in a way that
makes people trust you, makes people become an
advocate of your work.
Kristen Kehrer: And that's what we're really trying to get to, is a point
where you're considered a thought partner and you're
not just the person who the business is going to come
to and say, "We need a model for this, build it."
Kirill Eremenko: Got you. That's a great way of putting it, that as long
as you can calm the people down and then be their
partner, that's what they're looking for. And yeah,
that's a great way of putting it.
Kristen Kehrer: It's that trust.
Kirill Eremenko: And you've got to build that so that they can ... And
that ties into like storytelling and presentation skills.
These are all people skills; you can't build trust if
you're just focused on technical, technical, technical.
I'm really enjoying how this podcast is unraveling
because there are people who need to build out their
technical side of things, especially if you're starting out
as a data scientist, you've got some valuable, super
valuable tips here on what things to focus on, where to
start in SQL, Python, R, and how to build up your
technical expertise. But at the same time, if you are
already an advanced data scientist, you want to
upskill, up-level your technical things.
Kirill Eremenko: And you've mentioned a couple of things like
multivariate testing that people don't often think
about. But also you want to be thinking about your
soft skills, your people skills, your presentation skills.
How are you going to show yourself, not just as a
person who can crunch numbers and get the outputs
and build a model, but a person who can bridge the
distance between the technical world and the business
decision makers' world, because those are the data
scientists that ultimately become the most in demand,
that thrive the most, that become the most useful
data scientists to the business, who can not just derive
insights, but actually communicate them and help the
business decision makers implement those insights to
help drive the business forward.
Kirill Eremenko: So it's been so great so far. What I want to talk about
now is you have a blog, it's called Data Moves Me, if
people haven't seen it, it's datamovesme.com. It's not
just a blog, it's a website. And I think you're doing
some great things there. So you have the blog, and if
anybody wants to invite you to a conference or work
with you on a project, there's a great "work with me"
part. But also I specifically wanted to touch on your
"up-level your resume" part. Tell us about ... You have a
course there, you have a course on how people can up
level their resumes. Tell us a bit about that. How did
that start?
Kirill Eremenko: Because I believe you only started this website in
August this year. Tell us a bit about this journey and
why you started and how you are helping people with
their resumes?
Kristen Kehrer: A couple of the first blog articles that I had published
were around what a job in data science looks like and
how to effectively interview and what a successful job
hunt looks like. And I had worked with a career coach,
I think I had already mentioned that I had gotten laid
off at one point and had the opportunity to work with a
career coach, and it taught me a lot. And so I shared
that through my blog and as a result, people started
sending me their resumes. And they'd say, "Can you
take a look at this? I need help. I'm not hearing from
the companies that I apply to."
Kristen Kehrer: And so for a while, I was just, if somebody sent me
their resume, I'd just review it in my spare time and
send it back. And I saw a number of common themes,
and after I saw a number of common themes, I was
like, "I want to create a course so that I can help
people to effectively promote their skills and be more
targeted and communicate their value to the business
in a way that the business is going to be more
receptive to." And so I created that course and made it
available, and it was one of those things that if you do
something a couple of times, you're supposed to
automate it. So that's what I did, is I automated it.
Kirill Eremenko: Nice. And now you can reach more people and help
more people, right?
Kristen Kehrer: Absolutely. Absolutely.
Kirill Eremenko: What's the feedback been so far on the course?
Kristen Kehrer: Oh my God, the feedback has been incredible and it's
really difficult to put reviews up because a lot of these
people that are going through the course currently
have a job. And so they want to remain anonymous,
but nothing feels better than when somebody emails
me and they're like, "I got a job today." And of course,
the resume does not get you a job, I just want to be
clear that the resume opens the door to the interview
and then once you go into the interview, you need to
take it from there. But for those people who aren't
getting ...
Kristen Kehrer: I had one guy who has a PhD, and a ton of experience,
he's an older gentleman and he had been applying so
many places and not hearing back from anyone. And
he went through my course and in applying to 20
companies, he heard back from five and one of them
was Google. And so, it feels really great to get that
feedback in the way that I'm able to help people, it's
like a really special-
Kirill Eremenko: Fantastic. Can you give us like a tip, like an insight
from your course, something that, already on this
podcast, people can get value from just by knowing this
one thing? What would you say is one of the most ... I
don't want you to share the whole course here, I'm
sure, we won't even have enough time for that, but
give us like one thing that would bring value to our
listeners.
Kristen Kehrer: Definitely in terms of being able to get past the
automated systems and being able to get into the
hands of an actual person, it's really important that
your resume is parsable. Of the medium to large
companies, the majority are going to use these
automated systems, and if you're Tableauizing your
resume, which I didn't even realize was a term
until I'd seen it on LinkedIn, people creating their
resumes in Tableau, or if you're putting charts on your
resume to show that your skill with SQL is five stars,
those things aren't parsable, so you're not going to be
able to get through the automated systems. And then
again, I really push people to think about the value
that they're adding. Because you hear, you're
supposed to start with a verb and you're supposed to
end with a result.
Kristen Kehrer: And a lot of times, people are like, "Well, I don't have a
concrete, 'I added 5% in revenue.'" And so they leave
that out. But if you automated the process and that
saved man hours, that's value. There's a lot of things
that are value, that aren't necessarily as quantifiable.
Kirill Eremenko: Got you. Well, those are some valuable tips. So you
recommend actually including in your experiences
those kinds of value as much as possible, and
highlighting them.
Kristen Kehrer: Absolutely. Absolutely. Even on my own resume, like
with the neural net, it was, "I built a neural net to
forecast hourly electric load." "Okay, cool story, bro.
What was it used for?" "Oh, well, okay. Actually, this
was imperative during heatwaves to make sure that we
could manage capacity." That's value. Or, "I helped to
automate A/B test analysis through writing an R
package that saved four hours per test that we ran
because we didn't need to have an analyst doing the
same thing over and over and over again." And that's
not a machine learning algorithm, that's just
automating a process.
Kristen Kehrer: And it's like, "But I'm saving four hours of somebody's
time," and a business is going to see that and be like,
"Wow, this person gets it." They can explain the value
that they're providing, and it's not always just, "I
increased revenue by 3%." Or, "I increased conversion
by 2%."
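To give a sense of what automating a repeated A/B test analysis can look like, here is a minimal Python sketch that wraps a two-sided two-proportion z-test in a single reusable function. Kristen's own tool was an R package whose contents are not described in the episode; the function name, inputs, and choice of test here are assumptions for illustration only.

```python
import math


def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> dict:
    """Two-sided two-proportion z-test on conversion counts."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function; p-value is two-sided.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return {
        "rate_a": p_a,
        "rate_b": p_b,
        "lift": p_b / p_a - 1,
        "z": z,
        "p_value": 2 * (1 - cdf),
    }


# Made-up counts: 4,000 visitors per arm.
result = ab_test(conv_a=200, n_a=4000, conv_b=250, n_b=4000)
print(result)
```

Once the analysis lives in one function, each new test is a single call instead of a repeated manual exercise, which is where the saved analyst hours come from.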
Kirill Eremenko: Yeah. And also the business, as you say, the business
sees that this person gets it, like they see that you
think, not in terms of just like, "I like doing data
science work. I like cool projects," which is a
valuable attitude in itself, but they also see that you
are thinking about how are you bringing value to the
business, how you've brought value to your past
business or your current business and therefore,
you're going to be thinking in the future about their
business as well. And they want people like that on
board. They want partners, as you said.
Kristen Kehrer: Absolutely.
Kirill Eremenko: Awesome. Fantastic. And can you comment on that tip
again? I found that really valuable. You shared with me
before that it's better to send Word versions of your
resume rather than PDF versions. Why is that?
Kristen Kehrer: Oh yeah. I actually have three blog articles on Data
Moves Me. One is around just getting past the ATS, the
applicant tracking system. One is around
positioning yourself during a career change and the
other one is about writing like crisp, concise content
that makes an impact on your resume. And in that
first blog article on getting past the applicant tracking
system, I put a link to Indeed where it shows you the
number of applicant tracking systems that are in use,
the number is in the hundreds.
Kristen Kehrer: And a lot of the older systems have difficulty parsing
PDFs. And so to hedge your bets, it's better to submit
your resume as a .DOC because you know that it will
be parsable by the applicant tracking system. So the
newer systems can parse PDFs, but not all of them
can.
Kirill Eremenko: Wow. Is it DOC, or is DOCX also okay?
Kristen Kehrer: DOCX is also okay.
Kirill Eremenko: Okay. Well, you see, I didn't know that and I found
that very insightful. When I was applying for jobs, I
would always send PDFs because I thought they look
prettier and the person, when they open it up they
can't see all the underlines, in case there's some comma
that I didn't put in on purpose, or formatting stuff. But
that's very insightful too, I hope it's very helpful to our
listeners here. And this is going to come completely
spontaneous, Kristen, I'm sorry to put you on the spot,
but I have a question for you. Would you be willing to
create a special coupon for our listeners on the
podcast in case there are listeners who are interested
in taking your course and would like to participate,
would you be willing to help out with, like, some special
discount for our listeners here?
Kristen Kehrer: Oh, absolutely.
Kirill Eremenko: Awesome. Thank you. So we'll discuss that after the
show, and everybody we'll include it in the show notes
and I'll mention the show notes at the outro of this
episode. So make sure to check out Data Moves Me
and we'll get some wonderful coupon from Kristen.
Thank you so much for that.
Kristen Kehrer: Yeah, no problem.
Kirill Eremenko: Okay. Well on that note, I think we've covered off quite
a lot. I'm sure there's a lot more. I have a whole ton of
questions like how you've managed teams, the
importance of building up brands, neural nets, which
you talked a little bit about, but probably one last
thing I wanted to cover off before letting you go is, the
book that you're working on with Kate Strachnyi called
Mothers of Data Science. Could you tell us a bit about
that and how the idea came to be, and what is this
book going to be all about when it's released?
Kristen Kehrer: Oh my God, it's so exciting. And I understand that it's
super niche, not everybody is a woman and certainly,
not a mother. So it's not necessarily for everyone
because it's super niche, but it was just an
opportunity. We interviewed Cathy O'Neil, Carla
Gentry, Lillian Pierson, Natalie Evans Harris, just like
a bunch of amazing women. And so it was really an
opportunity for Kate and I to have fantastic
conversations with women that we admire, who are
badasses in data science, who have been doing it for a
while, but also talk about the fact that a lot of times,
we're working in teams where we're the only woman.
Kristen Kehrer: And when you have a child and you're working in an
all male team, things can get a little hairy in terms of
just trying to balance everything. And I understand,
I'm not saying that fathers don't have a lot to balance.
My husband is absolutely a 50/50 partner in
everything that we do in the home, but it was just an
opportunity to get personal with some people that I
really look up to and share our experiences as mothers
of data science.
Kirill Eremenko: So cool. I can already feel the excitement in your voice.
What would you say some of the biggest
highlights are that you've had in these conversations?
Kristen Kehrer: It's funny. I just love talking to Cathy O'Neil and she
just makes you think about everything. I had met her
at ODSC in May of this year, and I absolutely fell in
love with her personality, it's very straight and to the
point and she absolutely brought that to the interview.
And it's not necessarily something that will make it in
the book, but after reading her book, Weapons of Math
Destruction and all the things that she's thinking
about in terms of ethics and bias in modeling and how
we're perpetuating these biases, and then to talk to
her.
Kristen Kehrer: And I'm like, "I'm really excited about the work I do."
And she's like, "Yeah, but I don't want your emails."
And it's just like ... I just loved moments like that and
hearing about the struggles of Carla Gentry who, she
didn't have an incredibly easy time. And she talks
about some of her regrets in terms of choices that she
made putting work in front of her family life. And it
was just fascinating to watch, because she's been in
the industry now for 21 years. So to just hear from
someone who has just so much experience in ...
Kristen Kehrer: And actually, we also talked to Olivia Parr-Rud, who is
a grandmother of data science, and she was talking
about how she built a logistic regression model on
45,000 rows of data and would have to run it over the
weekend, and it would take that long to run. Oh man,
it was just so much fun, so interesting to connect
with these people.
Kirill Eremenko: Even though you say it's a niche book, it sounds like
an interesting read. I would totally be interested in
reading about that. Obviously, I'm not at the stage
where I have kids, I'm not even a father of data
science, but to me, it sounds quite exciting, these
journeys. It's always interesting to hear somebody's
journey through their career, through data science and
the struggles they had. And, like, family time is family
time for everybody, not only if you have children.
So I'm very, very impressed and I'm grateful that
you're working on this project, I think it will help many
people. And personally, I will pick up one of these
copies. When is it coming out?
Kristen Kehrer: Oh man, Kate and I have a goal of making sure that it
gets out this year.
Kirill Eremenko: This year. Okay, good. Maybe some Christmas
presents for some people.
Kristen Kehrer: Yeah, Christmas presents.
Kirill Eremenko: Nice.
Kristen Kehrer: In your stocking, Mothers of Data Science.
Kirill Eremenko: Awesome. Okay. Well on that note, Kristen. Thank you
so much for coming on the show, it has been an
incredible pleasure. And before I let you go, could you
let us know, the listeners on the podcast, where are
some of the best places to find you online and follow
you and your career?
Kristen Kehrer: Oh yeah. Absolutely on LinkedIn. That's I think where
I'm the most active. And so you can absolutely follow
me there. I'm also on Twitter @Datamovesher and I'm
on Instagram @Datamovesher. My Instagram is
definitely more personal, I posted a picture of my kids
tonight, but I'm around and you can find me.
Kirill Eremenko: Awesome. Fantastic. And obviously, we also have the
website, Datamovesme.com.
Kristen Kehrer: Yes.
Kirill Eremenko: Great. And as mentioned, we'll include the coupon for
the course in the show notes. One last question.
What's the book that you would recommend to our
listeners to inspire and help their careers in data
science?
Kristen Kehrer: Oh man. Obviously, you need to pick up
Mothers of Data Science, and the book that I read most
recently that I just mentioned was Weapons of Math
Destruction, and it really did push me to think about
these models that I'm building and to think about the
effect that they have on society.
Kirill Eremenko: Thanks so much. I've heard of that book, Weapons of
Math Destruction, I haven't read it yet, but that's
another reason to pick it up, your recommendation.
There you go. On that note, thanks so much, Kristen,
for being on the show today here and sharing this time
and your expertise in the space with us. I'm
sure a lot of people got a lot of inspiration and insights
from this. Thank you so much.
Kristen Kehrer: Oh my God. Thank you so much for having me. This
was so much fun.
Kirill Eremenko: For sure. The pleasure is mine.
Kirill Eremenko: There you have it. That was Kristen Kehrer from Data
Moves Me. I hope you've enjoyed this episode as much
as I did. My personal favorite part was when Kristen
mentioned that there's two types of valuable skills in
data science, the ones that are useful, something that
you enjoy, that are useful to you personally, that
you're learning a lot through. And there's those other
ones that are useful to the business. Sometimes they
will match up and that's amazing, sometimes they
won't, but it's good to know both. It's good to know
which skills are great to explore and have fun with and
potentially find new ways of applying.
Kirill Eremenko: And it's good to know which skills are solid ones that
you want to go to and you know that there's a high
chance that they will bring value to the business,
because a lot of time, that's something that data
scientists miss. You need to know how to add value to
businesses. And it was very nice of Kristen, of course,
to share a coupon with us for her course. If you'd like
to take the course, definitely use the coupon; you can
find it at
www.superdatascience.com/207. That's where you'll
find the link to Kristen's course and the coupon that
she mentioned.
Kirill Eremenko: It's for you to take her course on building your
resume, and also you'll find all the show notes there,
all the things that we've talked about, the materials,
link or URL to Kristen's LinkedIn, and the books that
we've mentioned. And make sure to, even if you don't
take the course, make sure to connect with Kristen
and follow her on LinkedIn, because there's going to be
lots of exciting announcements. And personally, I'm
looking forward to the book, Mothers of Data Science
coming out, hopefully later this year. So I can pick it
up, and I highly encourage you to check out a copy as
well.
Kirill Eremenko: The show notes are once again, at
www.superdatascience.com/207. Hope you've enjoyed
this and maybe you will be at DataScienceGo 2019
next year, to meet inspiring people like Kristen and
other speakers that we've had. On that note, thank
you so much for being here today and I look forward to
seeing you next time. Until then, happy analyzing.