SDS PODCAST EPISODE 121 WITH ALEX ANTIC · Kirill: This is episode number 121 with Senior Data...
Kirill: This is episode number 121 with Senior Data Scientist at the
Australian Federal Government, Dr Alex Antic.
(background music plays)
Welcome to the SuperDataScience podcast. My name is Kirill
Eremenko, data science coach and lifestyle entrepreneur.
And each week we bring you inspiring people and ideas to
help you build your successful career in data science.
Thanks for being here today and now let’s make the complex
simple.
(background music plays)
Welcome back to the SuperDataScience podcast, ladies and
gentlemen. And today I've got a very interesting and
insightful episode for you. On the show, I have Dr Alex
Antic. Now, Alex started out in the space of data science
with a PhD in Applied Mathematics. And then his career
took him on an incredible whirlwind of journeys. He's been a
quantitative analyst, or a quant, in banks and investment
organisations. He's been in the space of customer analytics.
He's been a consultant at PriceWaterhouseCoopers, and he
has also worked with the Australian Federal Government. So
a very, very diverse background and in the first half of the
podcast, we will walk through all of it and you will find some
very interesting insights and applications that he's seen in
his career. And then in the second half of the podcast - well,
in the second half of the podcast, Alex really surprised us
with some special gifts that he shared on this podcast.
So Alex has a huge wealth of knowledge and experience in
the space of data science, and he actually runs a meetup
group in Canberra for data scientists, and he constantly
helps and mentors other data scientists in this space. And
so Alex was kind enough to actually prepare something for
this podcast. He prepared two guides for data scientists, and
he shared them with us on the podcast. In the second half,
you will find them there. So the first guide is how to become
an effective data scientist. And there, we don't just talk
about technical skills. We talk about technical skills, the
business side of things, communication, and attitude. Well
in fact, Dr Alex just shares all these things, all of his wealth
of knowledge in that space.
And the second guide is for those who want to build a
successful data science practice. So whether you are a
person who wants to get into data science and be the most
effective data scientist that you possibly can, or whether
you're looking to build a successful data science practice, in
both cases you will get incredible value from what Alex
shared on this podcast. In fact, the insights were so amazing
that we couldn't just leave them as audio, and together with
Alex and the design team at SuperDataScience, we put
together two infographics for you. So one for each of those
guides, and you can get those infographics if you go to
www.superdatascience.com/121. You can just download
and keep them in order to help you remember what Alex
mentioned on the podcast, the steps that he outlines,
whether it is for becoming an effective data scientist or
whether it is to build a successful career in data science.
So if you have an opportunity, then go and download these
infographics before you listen so you can follow along. If
you're on the go, if you're in the car or you're running or
you're on a bicycle, or you're on public transport, that's ok,
listen to the podcast and then still make sure to download
those infographics so you can keep something tangible that
you can always reference just to refresh on how to do either
of those things.
On that note, you can already hear that I'm very excited
about this podcast, so on that note, without further ado, I
bring to you Dr Alex Antic.
(background music plays)
Welcome ladies and gentlemen to the SuperDataScience
podcast. Today I've got a very special guest calling in from
Australia, Dr Alex Antic. Welcome, Alex, to the show, how
are you going today?
Alex: Thank you, Kirill. Yeah, great to be here. Looking forward to
our discussion.
Kirill: Me too, very much so. And where are you right now?
Alex: Based in Canberra at the moment, so dialling in from home,
doing some errands and speaking to you before heading off
to work.
Kirill: Awesome, awesome. And how is the weather down there in
Canberra?
Alex: It's lovely. Nice, hot day today, quite warm. Nice change from
the recent rain we've had. That should be good.
Kirill: And a lot of people don't know this, but Canberra is actually
the capital of Australia. When I was a kid, I used to think it
was Sydney. Do you correct people often about that?
Alex: I haven't for a while, given most people I deal with these
days are Canberra-based, and hopefully they've figured that
out by now. I have on occasion, when I've travelled around.
Yes, that is a good point.
Kirill: For those listening, there's a bit of geography there.
Canberra is the capital of Australia. How big is Canberra?
Alex: I'm not sure size-wise, the population's about 400,000. It's
quite small in terms of relative, physical size to the rest of
the country. And as you know, it's located almost halfway
between Sydney and Melbourne.
Kirill: And there's a lot of government facilities there?
Alex: Yes, it's very much the heart of the country when it comes to
politics and government departments and agencies.
Kirill: And just looking through your background, I think that
information will be very relevant to our discussion. But let's
start with the beginning. You've got a very interesting and
diverse background, a PhD in mathematics if I'm not
mistaken, applied mathematics, and then you've done lots of
different consulting work and in fact, for those who are
listening, Alex was recommended for the podcast by one of
our previous guests, by Ot Ratsaphong, who heard one of
your talks, Alex, at I think the R User Group in Canberra, or
the Data Science User Group in Canberra, and he found it
really fascinating. So tell us a bit more about that. You run
these user groups for data scientists in Canberra, is that one
of your passions?
Alex: Yes, promoting analytics and data science overall is
definitely a passion of mine. So when an opportunity came
up a few years ago to host the Canberra R User Group and
Data Science Canberra locally, I thought it would be a
fantastic way to not only meet up with the large number of
data scientists and analysts we have throughout the
government space here in Canberra, but also to I guess
spend a bit more time mentoring aspiring and junior data
scientists who often come to me for advice, technical, career
advice, whatever the case may be, I thought it would be a
fantastic forum to actually get everyone together and just
share ideas and speak about what we're doing, which often
wouldn't be so easy to share those ideas and to see one
another outside of conferences that may occur.
So Canberra is quite unique in the sense that we have a lot
of really great people working in different departments and
agencies, but sometimes they're working on their own, or in
small teams, so they have very little oversight on what
others are doing, and sometimes I'm quite surprised to hear
that someone else is working on a similar problem, or is
working on something that's exciting and that they'd like to
get into and want to ask about.
So I try to invite speakers who are doing something quite
interesting that I think will apply and appeal to most people,
invite them along to have a chat, and share their ideas, and
it normally works quite well. People seem very happy to
attend and to reach out to one another to share stories, war
stories.
Kirill: That’s awesome. And how often do you have these groups?
How many times a month?
Alex: Yeah, it varies. I try and do one every month or two,
depending on my own availability. In my previous role I was
travelling quite a lot, so that made it difficult, but now that
I’ll be spending a lot more time in Canberra I’ll try and do
it every month or two, at least have one of them running
every month, if not more frequently. That would be ideal
actually.
Kirill: And so when these groups get together, you have a speaker
or a couple of speakers who present or do you do some
exercises? How do these groups run?
Alex: Normally it would involve one main speaker and then myself
or someone else doing I guess a small introduction
beforehand to just give a small oversight of what they’re
currently working on and what may be of interest, and that
would lead into the main speaker and then have a lot of
questions. Question/answer session after that. And then
people would tend to mingle before and after just to catch up
with people they know or just to ask them informal
questions on what they’re working on. It’s quite relaxed and
casual in that sense. That tends to work well. A lot of us
tend to be introverts and we prefer these more informal
sessions to talk to one another and to get some advice or
just to share our own views. And it’s often a lot of fun, yes.
Kirill: Oh, fantastic. I’m going to play the devil’s advocate here. In
this day and age where everything is interconnected online
and there are plenty of resources and plenty of forums where
people can go online to find mentors or to connect with
others, find out about their work, talk and so on, how is
catching up in person better, how is it more beneficial, and
why do people get more out of it than just online interactions
that are readily available to them at any time of the day?
Alex: That’s a good question. I think in reality people use both,
both methods of communication, to learn. They’re both
fantastic and have a lot to offer. 24/7 access via the digital
platform is incredible, you can’t knock that at all, but I
guess being human and social creatures we love to actually
be able to speak face to face with people, get that immediate
response. And speaking to someone, asking them questions,
you can read their body language, which sometimes is very
helpful when someone is trying to answer a question about,
“Should I take this particular job that I’ve been offered?” or
“I’m having a problem with some technical issue. Do I have a
chance to solve it?” I think people are a bit more receptive to
the human element than they may be on a forum where you
can get a lot of negative feedback at times which isn’t always
helpful, a lot of criticism depending on the forums, so I think
there’s space for both and as humans, we appreciate both
streams. I think it’s a good thing having both open to us.
Kirill: Okay. I totally agree with that. I think you’re right. And that
human element, I don’t know, it has some magic to it that
you just can’t get online sometimes.
Alex: That’s right. When you see someone speak about a topic that
you’re interested in and passionate about, I think being
there can excite you and inspire you a lot more than just
reading about it online or hearing a recording sometimes.
Feeding off the people around you and the vibe in the room
can be quite powerful.
Kirill: Totally. So what would you then recommend to people who
are not yet attending meetups? I’m certain there are people
listening to this podcast who haven’t ever attended a data
science meetup. They like their profession, they go to work,
they do their job, they meet people at work, but they’ve never
gone out of their way to actually connect and meet others
through a meetup like this. What would your advice be for
them?
Alex: I highly recommend that they give it a go, maybe start one
themselves if there isn’t anything like that in their local area
or community. I think it’s quite easy to set up a meetup site
online, get a mailing list together, use your contacts and
networks. Otherwise definitely attend. There’s a lot of
specialty ones I’ve noticed within data science overall –
there’s deep learning ones, ones on machine learning, R,
Python, whatever the case may be. Pick one that you’d be
interested in. You may have a lot to offer that you don’t
realize. You may be able to learn a lot from your peers.
They’re normally quite short and very informal sessions, so
go along and you might be surprised by how much you enjoy
them.
Kirill: Fantastic. Any recommendations on where to find these
meetups if one is not arranging their own?
Alex: I think just look up the website, the Meetup website, and
have a look in your local area, do a search, or reach out to
your contacts and ask if they know of any as well.
Kirill: Meetup.com, yeah?
Alex: Yes.
Kirill: I was actually surprised at how very interesting and broad
that website is. I was in San Diego a few months ago and I
had nothing to do in the morning and I wanted to go do
some yoga. I looked up ‘yoga meetup’ and literally the next
morning I went to a yoga meetup and it was amazing.
Alex: It’s incredible.
Kirill: Yeah, it’s a really cool place. Okay, we jumped straight into
the meetups discussion. But now let’s rewind a little bit and
talk about your background. Walk us a little bit through it.
You started with a Bachelor’s degree in Math and Computer
Science. Let’s go from there.
Alex: I did a double degree, Mathematics and Computer Science,
which was quite new at the time, there weren’t many
universities in Australia offering an actual double degree
versus a double major.
Kirill: What’s the difference?
Alex: Double degree is you walk out with two degrees. I guess they
synthesize the six years of the two separate 3 year Bachelor
degrees into one 4 year degree. That has its own challenges
obviously, taking extra credits, but the reason I did that was
I really enjoyed mathematics a lot, worked hard and did well
at school, so I wanted to pursue that to learn more. I had no
specific career aspirations in mind when I did that. And the
computer science element, I was getting into the
programming and I thought it would make a great mix. I
thought math on its own wasn’t enough, I wanted to do
something else so it was either maths and physics, or maths
and computer science, and I thought, “Yeah, I’d love to learn
a bit more about coding and that might come in handy one
day,” which it very much has.
But throughout that, I have to be honest and say maths was
more my passion. After that I did an honours degree in pure
mathematics, which was very interesting, especially some of
the advanced algebra theory I did, some of the more
complicated stuff I’d studied in my life. And then after that I
was considering a doctorate and some of the applied maths
that I studied, I really enjoyed the element of applying maths
to the real world using both the mathematics and
computer science elements of my degree, and I ended up
doing a doctorate in applied mathematics, which is actually
with the CSIRO - Commonwealth Scientific and Industrial
Research Organisation, Australia’s premier science
organization, and that entailed looking at heat transfer in
grain silos, which was a fascinating topic.
Kirill: Sorry, what was that in grain silos?
Alex: Heat transfer throughout grain silos.
Kirill: Wow. That’s very applied for sure.
Alex: Very much.
Kirill: Okay. Any interesting discoveries there?
Alex: The aim of the research was to look at regions within the
grain silo where particular insect infestations were
occurring. And given the insects are quite small, the thermal
devices they had at the time to measure the heat in those
areas were too large to pick up the heat distribution, so they
weren’t sensitive enough. So the only way we could actually
try to determine what the heat was and [indecipherable
16:00] was to actually do some mathematical modelling, so
hence the heat transfer component, and the idea was that
we discovered the insects were localized within certain
regions, which meant that you could, at a lower cost, only
heat those regions to kill the insects using either microwave
heating or just high heat methods, and that way you
wouldn’t have to invest in heating up the whole grain bulk to
disinfest because the chemical methods they were using
were being phased out globally.
So we determined that you could use microwave heating or
just use large heating elements to kill the insects on the
outside periphery without damaging any other properties
within the grain bulk. Destroy the insects, keep your costs
down, and then go forth and export your grain throughout
the world, which was the main driver in this case.
Kirill: Wow, that’s really cool. So was that research applied in the
end?
Alex: Yes, it was. The government was using that to determine
which regions of grain silos and grain bulk structures they
could disinfest at a lower cost, which was great for them and
for the farmers that were actually looking at disinfesting
their grain holdings.
Kirill: Wow! Congratulations! There you go. People eating bread in
Australia, you might have been influenced by Alex’s
research.
Alex: Also around the world, because what was happening is we
would be exporting to a country and we would disinfest with
a chemical method which would only have say, a 99% or
99.5% rate of killing those insects, so by the time the grain
has been shipped to another country, that bulk could be re-
infested. So we needed better methods to kill a higher
volume of those insects. And what the biologists were
finding was that, depending on the ambient temperature, the
insects would move into different portions of the grain bulk,
where the chemicals weren’t always at a high enough dosage,
so being able to just target
those areas using heat was quite an efficient way to actually
exterminate all the insects.
Kirill: That’s so cool. What I really like about this example is—so
this was back in early 2000s?
Alex: Yeah. Quite a while ago, yes.
Kirill: So what I like about it is, right now I think most people
would agree that we would call that data science, very data
science kind of problem, solution and so on, but back then,
it was applied mathematics. Don’t you find it interesting how
the field of data science didn’t exist back then, but you were
already doing data science?
Alex: That’s true. The label in some ways has changed, and also,
as I’m sure you’re very well aware, the computing power we
have at our disposal these days which has really shaped the
world of data science and given us a lot more freedom and
power as to how we tackle these problems. That was
probably more mathematical in the sense that the equations
I was solving, the methods were semi-analytical and
numerical, whereas a lot of the work we’re doing these days
in data science is very much numerical.
That’s the shift I’ve noticed as I’ve progressed throughout my
career and tackled different problems, I’m doing less of the
analytical and semi-analytical solutions to problems and
much more now on the numerical side given the power we
have, the beauty and incredible availability of libraries and
functions through machine learning and deep learning. So
that has been the big shift I’ve seen, and quite an interesting
one for me too.
Kirill: Those are very interesting comments. I don’t often stop to think
about that, that back in the day you had to come up with
ingenious approaches to minimize your computational cost,
whereas now you don’t really care, you just go for it.
Alex: Exactly. Distributed systems, parallel processing, it’s
fascinating.
Kirill: And with the advancement of quantum computing, do you
have any comments on that, on how we’re going to move
even further into that space where we’re just going to throw
machine learning at anything and just brute force the
results out of it?
Alex: I think we really don’t know what we’re going to discover
with that revolution. It’s going to be amazing to see.
Hopefully it occurs in my lifetime. I think it will open up a lot
of doors in terms of the problems we can tackle and how we
can solve them, and more importantly I think it will allow
the broader public or the broader industry to really see how
they can apply the power of data science and analytics to
their own problems to find innovative solutions. In the
health space, I think there’s a lot more being done in that
world, of course physics – the traditional areas where
analytics was heavy. Computational power is being used in
astronomy, theoretical and practical physics. Yeah, I think
that will be quite interesting to see what happens there.
Kirill: Yeah. And do you think that we will have quantum
computing laptops in the next decade or two decades?
Alex: Hopefully. You need to speak to a quantum computing
expert on that. I would love to know. I’m hoping that does
eventuate. That actually reminds me of one project I once
worked on during my undergrad. It was in optical sciences,
so we were looking at creating circuit boards using light
effectively to transmit information on the circuit board
rather than etching copper circuits to make them much
faster. That was I think some of the early work being done
heading towards quantum computing. So you would alter
the refractive indices on these parts on a circuit board
effectively, you’d use light to transmit the information, so I
think that was a precursor to a lot of stuff that will happen
in the future. That was fascinating.
Kirill: Yeah. And I think I’ve heard of similar approaches. They
were maybe 10-15 years ago and now we’re heading into
quantum computing space.
Alex: Yeah, almost 20 years ago. It brings back memories.
Kirill: That’s really cool. Okay, so then what happened after your
PhD?
Alex: Sure. So, I spent a brief stint as an academic deciding will I
use my powers for good or evil. Do I stay in academia or do I
go into the real world? I had a couple of professors pull me
aside and say, “The world of academia is changing. You
may want to think about heading off and doing something
different.”
Kirill: On to the dark side. (Laughs)
Alex: On to the dark side before coming back in the future. So I
thought I did the right thing morally and went out to make
some money. I guess there were two main reasons for that.
One was I’ve been in academia for almost a decade doing
undergrad and postgrad studies and some teaching. I was
enjoying that, but I felt like I needed a change. And two, a
guy who I’d done a PhD with, he was a couple of years
ahead, he went over to the real dark side, he went over into
investment banking, and he talked to me about these
fascinating problems that they were solving and how you
could use mathematics and computer science to actually do
something meaningful and I thought, “Okay, that sounds
really interesting.” I’d done a course in derivatives pricing
and I thought that was quite cool, you know, I get to use my
maths and computer science skills to do something
interesting.
So off I went into the world of financial services. Initially I
spent about a year as a lead quant in a fund of hedge funds
when hedge funds were quite sexy and the rage, which was a
great way of using my skills to look at portfolio optimization
and trying to understand how to actually make more money
for the organization I worked with, help them make more
money by looking at the distribution of your own
investments – in this case it was investing in hedge funds, so
that posed some really interesting challenges.
And then after a year I was approached by an investment
bank to go and actually do some front office quant work of
derivatives pricing, share pricing, and I spent almost 6 years
doing that, which was incredible. I learned a lot. That was
probably the highlight of my career in many ways in terms
of—from a technical viewpoint it was very challenging, but
also very exciting.
Kirill: I’d like just to pause you for a second, because I’m looking at
your LinkedIn and you mentioned that you used some
modelling techniques including Monte Carlo simulations.
Alex: Yes.
Kirill: I’m not an expert on Monte Carlo simulations, but I’ve done
some work with them, and I find that approach so
interesting. Would you care to share some insights about
Monte Carlo with us?
Alex: I guess in a simple way it’s very much like rolling the dice. I
used to explain to people it’s looking at a brute force
approach to try and solve an equation. So you might do a
million or 10 million simulations on all possible results that
you can have and you effectively average them out in the
end. So at the time, before we had the power of machine
learning that we do today, we had to try and solve some
quite complex problems numerically, some of them we could
solve analytically and semi-analytically, which was great,
but the others we had to take a numerical approach and
often in that space, in the derivatives pricing and the
financial world, Monte Carlo was quite a popular way and an
effective way to actually come up with those solutions.
I found it quite interesting to use because you were looking
at some of the fundamental mathematics through a
computational solution. And I think in some ways, if I can be
so vague, it was kind of a precursor to a lot of the machine
learning we’re doing today, especially the more brute force
approaches. It’s something I haven’t touched since then, to
be honest. I predominantly used it in that career and haven’t
had to think about it for many years. It’s interesting that you
bring it up. It’s good that you bring it up. Actually, I wonder
how much it’s used these days given the power we have of
machine learning.
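(The brute-force averaging Alex describes can be sketched in a few lines of Python. This is only an illustration, not the code used at the bank: it assumes a plain European call option on a stock that follows geometric Brownian motion, and the function names and parameter values are hypothetical. It simulates many possible end prices, averages the discounted payoffs, and compares the result with the Black-Scholes closed-form price, which is the kind of analytical solution that exists only for the simpler products.)

```python
import math
import random


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form price of a European call, used here only as a reference."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)

    def norm_cdf(x):
        # Standard normal cumulative distribution via the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def monte_carlo_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of the same price.

    Assumption: the stock follows geometric Brownian motion, so the end
    price can be sampled directly from one standard normal draw per path.
    """
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)              # "roll the dice"
        s_t = s0 * math.exp(drift + vol * z)  # simulated price at expiry
        payoff_sum += max(s_t - k, 0.0)       # call payoff on this path
    # Average the payoffs, then discount back to today.
    return math.exp(-r * t) * payoff_sum / n_paths


if __name__ == "__main__":
    # Hypothetical inputs: spot 100, strike 100, 5% rate, 20% vol, 1 year.
    estimate = monte_carlo_call(100.0, 100.0, 0.05, 0.2, 1.0, 200_000, seed=42)
    reference = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"Monte Carlo estimate: {estimate:.4f}")
    print(f"Closed-form price:    {reference:.4f}")
```

(With more simulated paths the estimate settles closer to the closed-form value. The error shrinks roughly with the square root of the number of paths, which is one reason runs of a million or more simulations are common.)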
Kirill: Yeah, it’s interesting. I’ve talked to a few people who’ve used
Monte Carlo, but not as much. Still some use it in finance,
but I’ve discovered that some biologists use it in modelling
evolutionary—
Alex: [indecipherable 26:10]
Kirill: Yeah. So one TED Talk I was listening to, what they did I
think is they were modelling—okay, so do we see the world
as it is, the world around us? Is the table I’m sitting at, is it
actually white, in reality does it exist the same way I see it?
So what they were modelling, the theory they were trying to
prove, was that this table or whatever we see is actually a
mind projection and in reality these things might be
completely different. Like, a tomato might not be a tomato, it
might be something else, but our brain makes it look at it as
it’s red, it’s this form, it’s this smell, whatever, because it’s
good for us to eat it.
So they were modelling, like, “Let’s see if there’s a species
that sees the world as it is versus a species that sees the
world as the brain tells you to see it, and which one will
outperform the other one.” And they used Monte Carlo to
run those simulations to see on average who is going to win.
Alex: That’s fascinating. It reminds me of Schrödinger's cat. That’s
incredible.
Kirill: Yeah, very interesting examples of that. Any idea why it’s
called Monte Carlo? I don’t think I’ve ever answered that
question.
Alex: I once heard it came from being done in Monte Carlo itself.
I’m not sure if that’s true or not. It could have been based on
some of the techniques people were using. Yeah, it’s heavily
based around repeated random sampling, the process itself,
so maybe it does come from the gambling world, I’m not
sure.
Kirill: Okay. Well, there we go. If anybody is interested, Monte
Carlo is a pretty interesting averaging method. But let’s
move on. You spent almost 6 years in the commodities
space.
Alex: Fixed income, currencies and commodities, yeah. What was
interesting about that is I was there during the GFC, so it
was pre- and post-GFC.
Kirill: Did you notice any change?
Alex: I did. Pre-GFC, the appetite within the investment sector
from our clients leading up to it was looking at more
complex, more intricate, exotic options in derivatives that we
were pricing, so that was very challenging for us in the
quant space, for my team, actually looking at more and more
complex problems to solve, so we were often having to
reach out to the academic world and read journal articles to
try and look at inventions being made in that space and how
to turn those often theoretical solutions into practical
solutions.
That was really challenging and interesting, but the problem
was we couldn’t publish any of these because of the IP. That was a
shame because we came up with some really interesting
solutions to a lot of these problems that we were facing that
others would have benefited from as well, but of course we
didn’t want the competition to get ahead of us.
Kirill: That’s the price you pay for going to the dark side.
Alex: Very much so. And then post-GFC, that appetite waned.
What we were pricing were more of the vanilla-based
products, the simpler options, so that became I guess for me
less challenging. Having spent quite a bit of time there, I felt
like I needed a change. I was doing more mentorship, so the
management side was interesting me a little bit more, but
not just doing the daily grind of just hacking away at
problems and coming up with intricate solutions. I wanted
to share my knowledge and experience a bit more.
Yeah, that made me decide to take a short break and then
move on. I felt in some ways burnt out after that. Even
though I greatly enjoyed it, I worked with some fascinating
people, some of the best quants in the country, I just really
wanted a change so it was great to have a chance to have a
short break and then move into another role.
Kirill: That’s very admirable, you know, for anyone to have the
courage to say no to a senior quant position, because that’s
such a sought-after position, and a lot of people even in the
space of data science think that that’s a dream job, to be a
quant or even a senior quant at a bank. The perception is,
“Once I get that job, I am going to be set for life, I will be
happy and so on.” But as you say, sometimes you just want
a change. You only have one life, right? You want to grow all
the time.
Alex: Yes. And I grappled with the moral issues as well. I wanted
to use my powers for good. I wanted to be able to—a long
time ago I wanted to share some of that knowledge and
experience to move into a government space, which
happened years later. But that was something that I was
thinking about at the time, and it was a difficult decision to
make. It’s a very difficult role to get into. Once you worked
up and built that reputation and experience, it’s very hard to
turn your back to that. And I still get asked to this day,
“Why the hell did you leave? What were you thinking?” But
as you said, life is about more than just one job or one
career. I think it’s important for me in particular to move
around, learn new things, meet new people.
Kirill: Exactly. And not to say there aren’t companies that provide
that. You know, in one company you might grow and learn
and do different things, but if you feel you’re stagnating,
then why not?
Alex: Exactly.
Kirill: Awesome. We’re going to have 500 people quit their jobs
after this. (Laughs)
Alex: I’ll be blamed for the next GFC!
Kirill: (Laughs) Yeah. By the way, with the GFC, I wanted to ask
you, why do you think they made the shift from complex
products to simpler products?
Alex: Risk appetite wanes. People weren’t willing to take as much
risk given the heat on the banks at the time and what was
happening in the banking sector, especially in the U.S. with
the large banks folding and struggling. Investors wanted
something a bit safer, a bit more secure. As you know, in
that space, high risk is high return, so people wanted a bit
more stability and safety so there was less focus on quick
wins and a real large appetite for risk. And also, some of
those products were based on short-selling, with which a lot of
issues occurred, so with less interest in that, there was in
some ways less complexity with some of the models we were
actually trying to price. At least in Australia, that’s what I
saw happen for quite a while.
Kirill: It totally makes sense. I can see how people wouldn’t want to
buy those—what are they called, credit default swaps?
Alex: Yeah.
Kirill: Yeah, the main cause of the whole crisis in the first place.
Alex: Yes, unfortunately.
Kirill: Okay, so what happened after that? How did you get out of
the dark side? Where did you go?
Alex: I didn’t have a specific goal in mind. I just got to a point
where I thought I’d like to take a short break, travel a bit,
and just unwind from that. I hadn’t really taken any leave
during that period. So I took a short break and had many
offers, a lot of similar roles immediately come up, which I
thought, “I better not be swayed into that, I really want to try
something different.”
And I was told about an interesting role to move over into
the insurance space, which is a sector I’ve been thinking
about because I’ve worked with a couple of actuaries, or
people who are former actuaries. I didn’t want to necessarily
do something as technical initially. I wanted something that
was a bit more varied, so the role that I ended up taking was
a management position where I was managing a marketing
analytics team.
There were two elements to it. There was managing that
team, helping them with their marketing, looking at
customer churn, acquisition, the usual things you look for
in that space, but also what I guess is termed these days as
a lead data scientist role and also helping actuaries with the
more technical problems.
But what’s particularly interesting is that’s when I first
became aware of a lot of these techniques that are these
days known as machine learning techniques. That’s when I
first heard about that. People approached me and said,
“We’re looking into this. We’re trying to understand some of
the statistics behind this. Can you just help us out and we
can bounce some ideas and learn together?” And that’s how
I got into what is formally machine learning and data
science, moving away from the traditional analytics, from all
the mathematical modelling, stats modelling, into the more
computational side. So I did a little bit of that in that role
along with looking at customer insights and marketing
strategies, and then from there I transitioned into the
government space where I did most of my machine learning
and data science.
Kirill: It’s interesting to follow your career because on LinkedIn I
can see the dates and it’s like this transition happened for
you as data science started becoming more popular, around
2010/2011.
Alex: Yeah, that’s an interesting point. I guess I can be held
responsible for that.
Kirill: (Laughs) There we go.
Alex: Yeah, it’s funny that it happened that way. I guess I could
see what was happening and the different opportunities that
were coming up, and it just sounded very interesting to me
so I pursued it more from that point of view, as I’ve done
anything in my career, out of interest and what I’d like to
work on as opposed to having a specific career goal in mind.
Kirill: Interesting. So that brings us to Canberra, to the
government work. As we mentioned at the start, Canberra is
400,000 people and 399,000 of them work in the
government.
Alex: (Laughs) It does feel that way, doesn’t it?
Kirill: Yeah. So, what kind of work were you doing? Again,
whatever you can disclose, because I know there is some
probably sensitive topics there.
Alex: Sure. That’s understandable. So, the first role was effectively
around risk profiling, looking at predictive modelling
techniques primarily to try and find bad people, categorize
them in some way, whatever the department or agency
categorized as bad. In that case it was with the Department
of Immigration and Border Protection, so trying to find, pick
out that small number of people that are trying to get into
the country illegally or maybe they’re part of some drug
syndicate or whatever the case may be. So looking at some
advanced techniques to try and pick them out, moving away
from more traditional approaches, looking at traditional
techniques, looking at people using their own tacit
knowledge or anecdotal evidence to try and profile a person.
They are looking at high-powered analytical methods to
actually target them better, to minimize how many good
people you actually catch on the border to interrogate and
interview and then to increase the number of actual bad
people you end up catching. So, yeah, that was fascinating. I
spent over two years there and worked with some really
interesting problems, helping develop systems that are used
to this day at our borders to protect our country, which I’m
very proud of.
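Editor’s sketch: the trade-off Alex describes here (stopping fewer legitimate travellers while catching more of the genuine risks) is commonly measured with precision and recall. The snippet below is a minimal illustration on made-up numbers. The function name, scores, labels and thresholds are all hypothetical and are not drawn from any real border system.

```python
def screening_metrics(scores, is_bad, threshold):
    """Count outcomes when everyone scoring >= threshold is stopped."""
    tp = fp = fn = 0
    for score, bad in zip(scores, is_bad):
        stopped = score >= threshold
        if stopped and bad:
            tp += 1          # genuine risk caught
        elif stopped and not bad:
            fp += 1          # legitimate traveller stopped unnecessarily
        elif bad:
            fn += 1          # genuine risk missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "stopped_good": fp,
        "caught_bad": tp,
        "missed_bad": fn,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical risk scores from some model, with made-up ground truth.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
is_bad = [True, True, False, True, False, False, False, False]

loose = screening_metrics(scores, is_bad, threshold=0.25)
strict = screening_metrics(scores, is_bad, threshold=0.70)
print(loose)
print(strict)
```

A lower threshold catches every bad case in this toy data but stops more good people; a higher one stops no good people but misses one bad case. Choosing the threshold is a business decision as much as a modelling one.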
Kirill: That’s fantastic, very exciting to hear. I’m sure a lot of
Australians listening to this will be excited to hear that data
science is making our country safer.
Alex: Yeah, that’s right.
Kirill: I hope all other countries are following in the same way.
Alex: I’m sure they are, yeah. We did a lot of liaison work with our
fellows in other Commonwealth governments and there was
a lot of interesting work that was being shared and that was
great to see.
Kirill: Fantastic. And then I noticed you moved into a different role
with the government. What caused you to move?
Alex: Once again, that was looking at new opportunities and in
some ways career progression, because the next move
involved going into a department that was much more
immature in terms of their data science and analytics
capability. So I was brought in to try and help push that
agenda forward and to help set up a platform, a cloud-based
platform looking at Hadoop and R, integrating that to really
increase the power of their analytics capability. They
needed someone to help build that up, to promote that, to
get people inspired as to what can be done with that, come
up with some proof of concepts. That was really interesting.
A lot more management in that role, managing staff,
projects, dealing with senior execs, and really helping spread
the word of data science and the power of data science,
which is something I do a lot of these days. It’s not just the
technical work which I find interesting and challenging, it’s
really promoting what can be done. There’s a long way to go,
I think, especially in the government space. I’m sure in this
country, like many others, it’s getting people to feel
comfortable with what you can do with analytics and not to
be scared of it. They see it as a black box often. It’s building
trust so they can trust the methods, the systems. And that
can often be a challenge, but quite a rewarding one when
you see people have their aha moment, “I can see why you
do it this way or why this works,” so a lot of my time is spent
educating these days, which I really enjoy.
Kirill: That’s fantastic. You mentioned on this podcast that you
would be happy to share about building a successful data
science practice from a management perspective. I think
maybe this is the right time to go a little bit into that
discussion. What tips or advice can you share for people
trying to build a successful data science practice?
Alex: Sure. I think there’s three key elements to that, which I’m
happy to go into the detail. For me it revolves around people,
the value you can add, and communication. On the people
side, it’s most imperative to have the right people, the right
mix of people in your team, to have the data scientists,
quants, whatever the case may be with whatever industry
you’re working in. They’d have to have the right technical
skills, people that have proven themselves throughout their
career to not just be able to do the technical work, but to
communicate it to yourself, to the broader department and
agency. In some roles it’s enough to have the people that are
stronger in a technical way, that can just sit at their desk
and hack away at code. We definitely need those people and
they’re highly valuable, but sometimes you need people that
can engage with the stakeholders to collect the functional
requirements effectively and then translate them to technical
requirements.
It’s having the right people with the skills to do the data
wrangling, the modelling, the engineering side to embed the
models into the enterprise-wide system. You need a mix
of people who can do all that. You can’t build a successful
practice without the right people.
On the people side, you definitely need a willing coalition of
support from within the department, agency, organization,
whatever the case may be. You need support from your
peers and from senior management. Without that, you really
won’t be effective. Because the goal of data science and
analytics is to initially develop insight from data, but more
importantly it’s to make that actionable. Without that you
are pretty much just doing academic research in many ways.
Unless you have support from senior management to turn
those insights into actions, I don’t think the actual data
science practice will be effective or successful in the end. So
it’s really important to have that.
I’ve noticed many cases where people just take on data
scientists just to build teams more for the vanity reasons
rather than actual need or actually wanting to support it,
and that’s where it often fails and where a lot of challenges
occur. So if you’re going into a job, building up a practice,
make sure you have that support from higher up above.
Otherwise your efforts may be wasted in the end.
On the people side, it’s important to be a data science
evangelist to really show the benefits, to educate people
about what it can actually provide to them, how they can
personally benefit. I think that’s very important. People don’t
always see that connection so I think it’s important to take
on that inspirational role, which can be hard for some
people. They don’t feel as comfortable talking about what
they do, but it’s important to share your ideas, to
communicate, to always become a marketer of analytics and
data science, so inspire others and create meetups or
informal groups within your own organization, attend
meetups to see what other people are doing and get advice
from them. I think that’s quite imperative to the success.
Also the hierarchy can be an issue. Areas where it’s very
much hierarchical, I’ve found don’t work as effectively in these
technical teams as opposed to having a more flat structure
where there’s more autonomy and flexibility. I think that
tends to work much better and data scientists prefer to work
in those environments. And also, you need to know how to
manage—for management to know how to manage data
scientists, because their career aspirations can be quite
different to the rest or to more generalists, say.
So keeping them interested, engaged, giving them access to
the right people, tools, software, whatever the case may be,
is very important. Normally it comes around making sure
there’s data for them to work on and challenging problems
for them to solve. Without that, you’ll lose those people,
you’ll have such a high churn rate, which I’ve seen many
times where I’ve managed teams and tried to hold on to
people or to bring people on. It really depends on the
organization, the problems they’re working on, and that’s
sometimes what drives me out as well. If I don’t have
interesting problems or enough data, then I’ll move on
myself. So that’s on the people side.
On the value side, the most important thing there is to be
seen as a trusted professional and not just a technical
genius. A lot of people that build these teams or work in
these teams, and rightly so, they want to be seen as the
technical gurus, that they can be approached to solve these
problems. You don’t really get the practice off the ground
unless senior management and your peers actually trust you
as well. So part of that is to share knowledge, to educate
those around you, inform them and be transparent about
what you’re doing and why you’re doing it, don’t just be a
black box, show them how you can help them and that you
want to support them and how you’ll go about doing that. I
think that’s very important.
And also what is very much key is to link the outcomes, the
work you’re doing, to the strategic goals of the organization.
Sometimes you have to make that connection quite clear,
what you’re doing and how will that benefit the organization,
how it will reduce customer churn, increase money, increase value to
the public or private sector, whatever the case may be.
Making that connection clear, always having that connection
in the back of your mind so you can use that when you’re
speaking to senior management, I think is paramount to
actually having them trust you and believe in you and throw
money and resources at you to actually try and solve their
problem.
On the more technical side around adding value, I think it’s
important for the analysts to develop software development
practices, which I see less of these days with people coming
into data science without having the more computational IT
background. They’re not used to doing things like unit
testing, peer code reviews. These days, with things like
GitHub, that’s all becoming more popular, but for a while the
source control was something no one had ever thought of.
They’re just developing models on their own, systems stored
where no one else can see the code or debug it or do
anything. So I think those practices are very important to
having the right team and for the team to add value to
the organization.
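Editor’s sketch: as a small illustration of the unit-testing habit Alex mentions, here is a test for a tiny analytics helper using Python’s standard `unittest` module. The `churn_rate` function and its inputs are hypothetical, chosen only because customer churn came up earlier in the conversation.

```python
import unittest


def churn_rate(customers_at_start, customers_lost):
    """Fraction of customers lost over a period (hypothetical helper)."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


class ChurnRateTest(unittest.TestCase):
    def test_typical_value(self):
        self.assertAlmostEqual(churn_rate(200, 30), 0.15)

    def test_rejects_impossible_inputs(self):
        with self.assertRaises(ValueError):
            churn_rate(0, 0)
        with self.assertRaises(ValueError):
            churn_rate(100, 150)


# Run the tests programmatically so the snippet also works in a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChurnRateTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping tests like these under source control next to the model code means a colleague can review, rerun and debug the work, which is the point being made above.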
And I think using prototypes to overcome doubt and
resistance is very important, because we often face a lot
of doubt and resistance from people around us as to how
we’re building this model or improving on what they do: “We’ve
been doing it for years.” A traditional one is, “Why is your
method any better or any different?” So building a POC to
show people, “Look, this is the insight we’re getting from the
data. This is what we can do quickly. This is the actual value
that we can add. How about we invest more time and money
to actually develop a full-blown system?”
Another important thing is to put people before technology.
It’s more important to have the right people rather than first
investing in some software solution that a vendor is pushing
and then getting people to adapt to that, which I’ve seen
happen a lot in my career. Getting the right people, let them
choose what is possible – it’s not always possible depending
on what you’re working on for security reasons, funding,
whatever, but ideally you want to get that first.
[indecipherable 46:37], worry about technology if you’re
building the system from ground up because the good people
will tell you what they need, they want flexibility. These days
it’s really moving towards open source, even in the
government space, which is great to see. We’re using R,
Python, Hadoop platforms rather than the traditional SAS-
based systems, IBM, etc. So that’s very good to see. It gives
you a lot more flexibility and it’s cheaper for the organization
or department.
Another key thing is don’t be afraid to fail, or fail fast and
cheap. I think that’s important as well, don’t be shy, try new
ideas, developing the mindset like a hacker’s mindset of just
giving something a go, see if it works, work in a more agile
way, and then move on. Don’t just put all your eggs in one
basket or just have this one solution at the end. Work in a
more iterative fashion and make sure that the people around
you are comfortable doing that, and that senior management
understands and your stakeholders understand that you
want to interact with them a lot more. That interaction, I
think is very important, interacting with the business. You
can’t be isolated. You need to be constantly engaging with
them, understanding their business world.
That is key, to understand the business and then to go back
and develop solutions. And what often helps in those cases
where I’ve led those teams is to have some of my analysts
embedded in some of those teams, either in IT, in the
business unit we’re working with, in some of our stakeholder
teams, just to make sure there is a constant flow of
information back and forth. That’s what often really
increases the chance of success.
And the final one is around communication. So, one key
point there is to focus on the outcomes, not the methods
and tools used. So when you’re communicating as a data
scientist or a leader of the data scientists to other people,
talking about what the real outcomes are, something that
they can understand, use their terminology, understand
their jargon, and don’t worry so much about the tools and
methods you’ve used. That’s important to you, but may not
be the main focus from their point of view.
Try to communicate those ideas very clearly, limit the jargon
you’re using, but use their business lingo so they
understand. You all want to be speaking one language.
Visualization is often important with this. These days,
having tools like Tableau and Qlik and SAS VA is a great
way to show people some of the solutions you’re actually
coming up with. Visualization, I’ve noticed, works very well,
especially when you’re talking to people from a less technical
background.
And building trust within them also helps sometimes when
you’re doing roadshows. I used to do a lot of stakeholder
roadshows within an organization, go around to the different
teams and show them what my team can do for them, what
we can do. Let them pose particular problems and we’d say
to them, “Okay, give us a week or two, depending on our
own timing, to try and come up with a simple solution or a
roadmap that we can work together on doing a proof of
concept for you.” And often that’s great, because people then
are more open in those informal settings to discuss ideas, to
come up with questions, you know, “We’ve thought about
this. Is this possible? Does this fall into your realm?” You
open up that dialogue, that communication. It shows them
that you’re keen to learn about them and passionate about what you
do, and it gives them a chance to ask questions face to face.
It goes back to what we were saying earlier about the
meetups. We as humans are a lot more comfortable,
especially with these technical issues, talking face to face.
And what you find sometimes is people can be embarrassed
about asking a question. They’re not sure if it’s a stupid
question or does this technical question mean anything, but
once you allay those fears and they see you as just another
human as well, not just a technical genius or a geek, that
dialogue opens up and then often it becomes quite
successful from that point on. So hopefully that ramble
answers those questions and provides some insight as to
how I’ve managed to, at times, build quite successful teams.
Kirill: Wow. Alex, that was amazing. I was listening and at first I
was writing down everything you’re saying, so much value. I
was writing down from the people side and the value add
side and then I just ran out of space on my paper. (Laughs)
But I think it’s so valuable. If you don’t mind – the good
news is this has all been recorded – I’ll ask someone in the
team to put all of this into an infographic and then we’ll
share it on the page. So, guys listening to this, you just go to
the page of the podcast which you will hear at the end of the
session or at the start of the session, and you can download
the infographic absolutely free and we’ll share it on
LinkedIn. I think this is super valuable for people building a
data science team.
Alex: Yes, it is. If we have time, I can run through something
similar on actually becoming an effective data scientist,
which is something that I’ve often been asked by people. I’ve
got a similar list that I tend to work through in my head.
Kirill: Please, let’s do that. We definitely have time and let’s do that
again. I’m sure this is going to be super valuable on the
flipside, for those who want to not just build the data
science team but be the data scientist. So, here we go. How
to be an effective data scientist?
Alex: In my view, anyway. So, there’s four particular areas I like to
break it down to: that’s looking at skills, the business side,
communication, and attitude, which is quite an important one, I
feel. So from the skills side, you need to have strong
quantitative skills, of course, to become a great data
scientist. The main point there is to build up those analytic
capabilities, not so much on the tool side, but how to think
about problems, the problem-solving logic involved. A key
element of that is to ignore the math and stats at your own
peril. I don’t expect people necessarily to go out there and
get a PhD in mathematics and statistics and to understand
all the fundamentals in great detail. An important thing, as
I’m sure you’re going to agree, Kirill, is to understand at an
intuitive level. I think that really makes or breaks a person
as an analyst in general throughout their career.
I’ll give you one quick reason or an anecdote as to why that
happens. I was once mentoring a junior staff member
looking at solving a particular semi-analytical solution to a
problem. And they had the answer and they said, “Look, I’ve
got this answer now. How do I know if it’s right? How do I
actually test this?” And I said, “Well, first of all, you should
be using common sense, and I can tell you your answer is
wrong.”
And he looked at me and he said, “How do you know? I’ve
gone through the mathematics, I’ve done the computations
and everything seems to make sense and I’ve done this
many times before. How do you know the answer is wrong?”
I said, “Well, first of all, if you understood the business
problem, you will see that we’re out by an order of
magnitude. The answer was 32 and I was expecting 320. So
obviously there’s a problem there.” That’s one thing. That’s
the business side. They weren’t really engaging with the
business enough to understand. They just took on the
problem and thought, “This is very much now an academic
problem and I’ll go away and solve it,” which was something
they’re comfortable with and they’ve done many times
before. And I get that. We all fall into that trap at times.
But the important thing is, if they were also to look at the
structure of the mathematical equation underneath, they
would see why we were out by an order of magnitude. So having
that intuitive grasp of what the model is doing helps you
then understand are you getting the right answer, are you in
the right ballpark, which is often very important. And also,
the structure tells you, “How do I actually go and debug it or
how do I find out where a small change in my input is giving
me this large change in the output which I’m not expecting?”
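Editor’s sketch: the two checks in this anecdote (is the answer in the right ballpark, and how strongly does the output react to a small change in the input) can be written down in a few lines. The function names, the tolerance and the toy cubic model are illustrative assumptions, not the model from the story.

```python
import math


def same_order_of_magnitude(result, expected, tolerance=0.5):
    """True when result and expected differ by a factor of less than 10**tolerance."""
    if result <= 0 or expected <= 0:
        raise ValueError("this simple check assumes positive quantities")
    return abs(math.log10(result / expected)) < tolerance


def sensitivity(model, x, rel_step=1e-4):
    """Approximate relative change in output per relative change in input at x."""
    base = model(x)
    bumped = model(x * (1 + rel_step))
    return ((bumped - base) / base) / rel_step


# The numbers from the anecdote: the computed answer was 32, the business
# context suggested something around 320.
print(same_order_of_magnitude(32, 320))    # False: out by a factor of ten
print(same_order_of_magnitude(300, 320))   # True: in the right ballpark


def toy_model(x):
    """Hypothetical model whose cubic structure amplifies input changes."""
    return x ** 3


# A 1% change in the input moves the output of a cubic by roughly 3%.
print(sensitivity(toy_model, 2.0))
```

Reading the structure of the equation tells you what sensitivity to expect; a numerical check like this confirms whether the code agrees.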
So after we worked through that, we were quite quickly able
to determine what was happening. And it gave them just a
better insight for how to actually go about solving those
problems. That’s one thing, I guess. I’m used to these days
having less of an opportunity to work on those analytical
and semi-analytical solutions, to get lost in the beauty of
mathematics given that now most things I work on are so
empirical and very much computationally-based. I’m used to
that at times, but…
So, the intuitive grasp still holds true. If you’re working on—
let’s say we’re looking at machine learning, artificial neural
networks. You’re looking at forward propagation, backward
propagation, there’s gradient descent methods, cross-
entropy, cost functions, whatever the case may be. People
are looking at all these complex-looking equations and then
they’re thinking, “I don’t understand what’s going on here.”
Someone asked me recently a question about—they were
looking at a derivation. They were trying to gauge from me
what does this mean mathematically. One point was to
explain it from an intuitive level, to try and explain what’s
happening. But an element of that that was really important
was going back to fundamental principles they would have
studied, such as calculus, in this case the chain rule, you
know, understanding how forward propagation and gradient
descent work is all around the chain rule.
So once you understand the concept of the chain rule and
what it’s doing, looking at the underlying derivatives, then
you can quickly understand, “Okay, I can see that the rate
at which the weight learns is controlled by the error in the
output, so larger errors mean I’m getting faster learning in
the neuron itself.” So, that really helped the person grasp
the concept, even if they didn’t look at all the derivation and
do it themselves – I did that for them – but the ability to look
at it and say, “Okay, now I know what’s happening
intuitively with this calculation,” means I have a better grasp
not only of how does the method work, but how can I test
them, my own solutions, is it the right method to use for a
particular problem. That’s all paramount and really
important to actually becoming a strong data scientist.
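Editor’s sketch: the intuition above can be checked numerically for the simplest case, a single sigmoid neuron trained with the cross-entropy cost. Applying the chain rule, dC/dw = dC/da * da/dz * dz/dw, which simplifies to x * (a - y), so the gradient is proportional to the output error. The setup (one neuron, one input, the specific weights) is an assumption made for illustration, since the transcript does not say which derivation was discussed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(w, b, x, y):
    """Cross-entropy cost for one training example and one sigmoid neuron."""
    a = sigmoid(w * x + b)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def grad_w_chain_rule(w, b, x, y):
    """dC/dw assembled factor by factor with the chain rule."""
    a = sigmoid(w * x + b)
    dC_da = -(y / a) + (1 - y) / (1 - a)
    da_dz = a * (1 - a)
    dz_dw = x
    return dC_da * da_dz * dz_dw


def grad_w_numeric(w, b, x, y, h=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    return (cross_entropy(w + h, b, x, y) - cross_entropy(w - h, b, x, y)) / (2 * h)


x, y = 1.0, 0.0
for w, b in [(2.0, 2.0), (0.6, 0.9)]:
    a = sigmoid(w * x + b)
    print(
        "error:", a - y,
        "chain rule:", grad_w_chain_rule(w, b, x, y),
        "numeric:", grad_w_numeric(w, b, x, y),
    )
```

The first starting point has the larger output error and, correspondingly, the larger gradient, which is the "bigger error, faster learning" behaviour described in the conversation.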
That goes back to how do things work intuitively, which I
think people forget sometimes, but it’s incredibly important
to focus on that as you learn. It helps you learn and
understand how any problem works in life, not just the
mathematical ones.
On the quant side, often I’m asked, “Should I be doing a
Master’s or a PhD in data science?” And my question to
them is, “Why? Do you just like the idea of having those
extra letters after your name, or do you think it’s important
for your actual career?” What I think is more important is to do
further studies, if any, in the fundamental sciences, in the
maths, the stats, econometrics, physics, whatever, I think
the maths and stats skills (or actuarial studies as well) that
you gain from that gives you a better grasp of what’s
happening intuitively in the modelling sense than just doing
say a Master’s in data science, which is becoming all the
rage these days that I’ve seen. That might have merit as well,
but I think understanding the fundamentals at a deeper
level will take you further in your career, especially if you’re
moving into a more technical area of machine learning, AI,
whatever. I think that’s much more important.
If you just want to work more on the periphery and
understand what’s happening, then a Master’s has a lot of
benefit, but to go deeper, you need to go deeper with the
fundamentals. There’s a lot of great courses online,
SuperDataScience and others, and I try and tell people
about some of that stuff. And going back to my earlier point,
when you’re building an effective data science practice, I
think trying to do some of that education internally is
important. Like, if you were to run some introductory R
courses or data science courses internally, you’ll find that
there’s lots of analysts that are interested and that people
want to build up their skills, so not only do you share your
passion, but now you have extra people you can use within
your organization and department to help spur your cause,
and also to help you with your modelling. You know, you
have extra staff now that you can use to help you with your
workload. I think that’s great. I’ve seen that happen many
times and there’s a lot of people that are very keen to learn.
They may not all go and become data scientists, but having
a greater understanding of R or Python or some simple
predictive modelling really helps set them up for a new
element of their career they never would have done before.
So through you, they’re now going to learn more and share
that passion, which is fantastic.
Another great way to learn and to build your skills is to do a
lot of hands-on on-the-job training, which I think is a
fantastic way to learn. Kaggle, of course, and things like that
are great. At times, new areas that I’ve wanted to learn has
happened in two ways: either putting myself in a situation
where I’m trying to solve a problem in a technical field that I
haven’t really done before, it’s a great way to learn. And the
other one is to actually be asked to teach a course that I’ve
never done before. So what better way to learn than through
teaching?
So, there’s a lot of ways for people to learn these days, as we
discussed earlier: go to meetups, ask informal questions, do
a lot of online training, do formal courses… So many options
these days, but on-the-job training, I think, can be quite
important, especially if you’re working with strong analysts
around you, people that are willing to share. And when
you’re forced to work on a problem, that forces you to
quickly learn a technique and try things out, not to be
scared or just to sit back and think theoretically about a
problem but actually get your hands dirty.
The coding side, on the skills side, is very important. You
can’t get anywhere without SQL or the Hadoop equivalent
these days. You need to be able to extract the data before
you do any analysis on it, of course. R and Python are the
go-to languages. I’ve seen a lot more growth in the
government space in Australia, which has been great, not
just moving towards open source, but all the libraries and
functionality available through R and Python has been great
and opening up opportunities on the types of problems that
people can solve – predictive modelling, natural language
processing, AI, ANNs. It’s just fantastic. Even if it doesn’t go
anywhere, people are learning it and trying new things and
they’re moving away from more traditional stuff like SAS and
C#, C++, etc., so more towards the functional programming
languages, which is good.
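Editor’s sketch: to make the "extract with SQL first, then analyse" point concrete, here is a minimal example using Python’s built-in `sqlite3` module with an in-memory database. The `policies` table, its columns and its rows are invented for illustration only.

```python
import sqlite3

# Build a tiny throwaway database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policies ("
    "customer_id INTEGER, state TEXT, premium REAL, churned INTEGER)"
)
conn.executemany(
    "INSERT INTO policies VALUES (?, ?, ?, ?)",
    [
        (1, "ACT", 500.0, 0),
        (2, "ACT", 700.0, 1),
        (3, "NSW", 400.0, 0),
        (4, "NSW", 600.0, 0),
        (5, "NSW", 800.0, 1),
    ],
)

# Extract a per-state summary with SQL before doing any modelling.
rows = conn.execute(
    """
    SELECT state,
           COUNT(*)     AS customers,
           AVG(premium) AS avg_premium,
           AVG(churned) AS churn_rate
    FROM policies
    GROUP BY state
    ORDER BY state
    """
).fetchall()
conn.close()

for state, customers, avg_premium, churn in rows:
    print(state, customers, avg_premium, round(churn, 3))
```

The same query shape carries over to larger databases; from here the extracted rows would typically be handed to R or Python for the actual analysis.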
And I guess for people it’s important on the skill side to just
be familiar with most techniques. Even if you don’t use
them, just be aware of the different things that exist, you
know, natural language processing space, convolutional
neural networks and their implications, text mining, new
advances in predictive modelling, other analytical
techniques. Just be aware at least what exists and
something you may turn to to help you solve a problem at
some point in the future. Just be aware of it. I think it’s good
going to meetups, reading stuff online. It’s a great way to
broaden your knowledge base. That’s on the skill side.
On the business side, the fundamental thing you have to
keep in the back of your mind and to constantly strive for is
to understand the business problem. Understand the
business you’re working with, assuming you’re working in
an area that I’ve—many times when you work with a
business unit, you need to understand their world. You need
to feel their pain, know what their pain points are, where
can you really add value, and then start thinking, “Okay,
how can I use my skill and experience as an analyst to
actually help these guys?”
It can be something as simple as helping them transition
away from doing some sort of high-level analytics and
reporting in Excel to a more advanced system, or do they
actually have a problem that lends itself to a predictive
modelling solution. You won’t know that until you really
focus on their area, their business area, to understand what
problems they face. So that should be at the forefront of
your mind rather than “What techniques should I be using?”
or “What do I want to work on today?” Understand the
business. That adds value to you and it adds value to the
organization.
You have to work with stakeholders, open up
communication and engage with them. You don’t want to be
isolated. As part of that, where often people get caught out is
they don’t have clear objectives defined. You know, someone
will talk about a problem at a very high-level and you think,
“Okay, I have a potential solution to this.” You go away and
develop something after a few weeks, you come back and
they’ll say, “Well, that’s not really what we meant.” “Well,
that’s what you told me it was. Hold on, where are the actual
objectives? Nothing is being written down.” So it’s important
to try from an early point to clearly stipulate what the
problem is and what the approach is to actually getting a
solution. And what will the solution look like? Will it be a
one-off report? Will it be predictive modelling that goes into
an enterprise-wide system that’s run continuously for risk
scoring? Whatever the case may be, try to have that set up
early. Sometimes it’s an iterative process. They don’t know what
they want and you don’t know what you’re going to come up
with until you see the data. So work towards that.
And part of that, what’s really fundamental, is make sure
they have data. I’ve been in these situations where people
call me in and they want me to solve their problems using
the fancy world of data science, but they don’t really have
much data available, or they don’t have a large historical
database. I’m quite limited in what I can do in those cases
when there isn’t much data. Or you have data, you have
access to limited data, but you have to wait a month to go
through the security clearance, get all the data sorted. Or
one group gives you the data, but another group, because
their data is always sitting in a disparate system, it’s always
dirty, it’s always messy to work with – how are you going to
join these different datasets?
So understanding the business, understanding where the
data which is important to an organisation lies, who are the
guardians of that data, how to win their confidence to share
it with you? Because sometimes people aren’t happy to
share the data with you, I’ve noticed. They want to hold onto
it, they think the data is their property, as opposed to
belonging to the organization. So often there’s a lot of these
internal battles that you’ll have to convince people that they
should relinquish the data, that you want to help them, and
that you’re there for the greater good of the organization.
So often, when you’re working with a business to solve their
problems, I think it’s important to validate the models and
the analytics you’re doing with the people, not just by
statistics and evaluation techniques. It goes back to this
iterative process of engaging with your stakeholders. Take
them on a journey, tell them a story of what you’re working
on, why you’re doing something, show them interim results,
does this make sense, educate them why it may not make
sense, or what we’re expecting to get at this point. So that
communication and education with your stakeholders is
really important, it should never stop as you work on these
projects. I think it’s very valuable.
That takes us on to the next point, which is on
communication. So when you’re trying to communicate
with internal stakeholders, senior management, whatever
the case may be, I think it’s important to try and excite these
people about what you’re doing, share the passion, tell a
story, show them how you can help remove the pain that
they’re facing, increase efficiencies, whatever the case may
be. Link it back to the strategic goals of the organization,
as I mentioned earlier, and try to make it as clear as
possible that what you’re doing will help them. As opposed
to what you’re doing is cool or exciting or is the right thing to
do, but how does it actually help them? I think that’s really
important.
Often I found what works best is to use demos, not so much
leverage the power of PowerPoint slides, which is great but it
can be a bit boring and stale to people. So, try to show them
a demo, get them involved, get them to interact with it,
“Here, change this value,” you know, “Put in the value that
you think is realistic. Let’s see what happens with the
outcome. Let’s look at some ‘What if’ analysis or try and
predict something three months down the line if we change
this particular data point now.”
And as part of that, when you’re talking to them, as I
mentioned earlier, try and adopt as much business jargon
and terminology as possible rather than just revert back to
jargon we tend to use in the data science space. If people are
keen and want to learn about that, that’s great. But if not,
it’s probably bad to try and force it on them.
Because what often happens is people get a bit more
sceptical when they hear some of these big terms that we
use, and they feel intimidated in some ways, they feel
uncomfortable. So try and steer away from a lot of that
technical jargon, use it when you have to, but speak more in
business terms and in a way that really helps them
understand how it’s going to benefit them. So that’s why I
think visualization is often important in that space.
Visualizing something as a human is sometimes easier to
grasp rather than the words that we use, which aren’t
common to everyone.
And that takes me to my final point, which is around
attitude. And I think an important aspect is don’t be scared
to fail, as I mentioned earlier, as a data scientist. Try new
ideas, talk to people. If a particular method doesn’t work, it’s
okay – try something else and no one is going to think any
less of you. It’s a dynamic world, things are changing,
there’s new techniques to try, data can be very difficult and
complex to work with, understanding the business rules
around it is very hard. Once again, having good
communication with the business owners really helps in
understanding those underlying business rules, which is
something that can catch you out, especially in the
government space, where there’s a lot of complexity in these
old legacy systems, data sitting all over the place. How does
it all hang together? What’s used and why, depending on
particular legal policy implications that sit around it? You
really need to understand that when you’re working with the
data and coming up with a model, hence the business is
your go-to for that.
Also along those similar lines with attitude is to adopt, as I
call it, a hacker’s mindset, just try new ideas and don’t be
scared if you’ve never done something in Python but you’ve
done it in R, give it a go in Python, build up your own skills.
You may find another library that you think is helpful, that
may be better or faster, whatever the case may be. Don’t be
scared to get your hands dirty and play around. And on
that, don’t be scared to ask for help when you need it, either
from a peer, from your manager, whatever. Don’t let it go too
long and you think, “Oh, I’ll finally get the answer, it will be
okay,” when this deadline is looming. If you need help with
something, either a business or a technical problem, just
yell out. I’ve noticed a lot of people, and myself at times, are
afraid to ask, think people will think we’re stupid or we don’t
know enough. You quickly realize people do want to help you,
they’re there for the greater good, and everyone goes through
these situations where you just don’t know the answer to
something, so just ask.
And on that too, as I’ve also alluded to, always focus on the
outcomes, not the methods that you’re using. So, it’s
important to build a simple model first, not just go to
something that’s complex and exciting, which is harder to
understand, harder to debug, harder to explain to people. So
just focus on those outcomes and then worry about the
methods used to deliver on those outcomes. And part of that
is really understanding the business problem to help you
strive towards what the correct outcome is. On attitude, you
need to be curious and a problem solver, you need to enjoy
problems, and be an evangelist to really inspire others and
to share the passion for analytics and data science. So that
can be through your own work in formal gatherings, create
your own meetup, write a blog, whatever the case may be. I
think the attitude is an important one, yeah. So, yeah,
hopefully that sums up some of my ideas that I’ve come up
with over the past few years. I hope that helps someone with
their own career.
Kirill: Fantastic. I’ve been listening to this and I’ve learned quite a
few things from here and I really like how you broke it down
into those different parts, about skills, the business side,
communication, attitude. I think they’re all very valuable.
Once again, with this one we’ll also aim to do an infographic
and share, and that way for any listener, whether you’re
trying to create a data science practice or you are trying to
be the most effective data scientist that you can be, you’ll
have something to follow along.
Thank you so much, Alex, for sharing. I’m just curious, how
did you come up with these? You mentioned ‘over the years,’
but did you have a system that helped you develop these
bullet points?
Alex: I guess when I gave a meetup a little while ago, I had to force
myself to think about how to synthesize what I’ve learned in
my career in a way that would help those aspiring to become
better data scientists or to transition to the area, because
I’ve been asked many times, you know, “How do I actually go
about becoming a data scientist?” or “I’ve been doing a
different type of analytics for a while. I want to move into it.
Why should I, what skills do I need?”
And in terms of building effective teams, I guess as I was
building more and more teams, I’d often think about “What
do I need to do in the next role I move into to actually make
sure it’s as successful as it was before or better than it was
before?” So then I started thinking, “Well, I think I better
start jotting down what’s worked and what hasn’t, and then
of course bouncing ideas off peers, reflecting on what’s
worked in the past with managers I’ve had and what hasn’t
worked, what could have been done better.” And just try to
quantify everything, put it down into a nice little framework
that helps me revert back to it or just hand it to someone
and say, “Try to focus on this and then see how you go.
Come back to me if you have any questions. This should
help you get started.”
Kirill: Yeah. That’s very admirable.
Alex: Thank you.
Kirill: It’s great to see how you’re giving back to people who are
starting out and getting there. Hopefully we will help you
spread the word through this podcast.
Alex: I hope so. That would be great.
Kirill: Awesome. Well, we’re out of time, but thank you for coming.
What is the best way people can follow you? If there’s
somebody maybe in Canberra who might want to get in
touch, or somebody who wants to follow your career online,
what are the best ways to do that?
Alex: LinkedIn would be great. It has links to my Meetup groups
as well through there, so they can link to that and then
register as members and then come along to the next
meetups. But LinkedIn is a great way to stay in touch and
ask any questions anyone may have. I’m always happy to
offer advice and help out in any way I can.
Kirill: Okay, fantastic. Thank you so much. And I just have one
last question for you. Is there a book that you can
recommend to our listeners to help them in their careers?
Alex: One book I think would be great for people is written by
someone I know, a friend of mine who is now Director of
Data Science at Microsoft. His name is Graham Williams
and the book is “Data Mining with Rattle and R.” It’s a great
book in helping people understand how to actually go
through the data mining, data science process. It particularly
uses R and a package called Rattle, which Graham invented
himself. It’s a really great GUI for doing predictive modelling
within R rather than having to do all the modelling through
a command line. It’s got a great interface, makes it easy to
learn, and a great way to interrogate and understand the
data, so I highly recommend working through his book and
some examples for people who haven’t done much and
maybe look at some techniques. That’s “Data Mining with
Rattle and R” by Graham Williams. Graham is the guy who
is always happy to help people as well.
Kirill: Yeah, it says a lot that now he’s working as a data scientist,
is that right, at Microsoft?
Alex: Yeah, and for a long time he was the key data scientist at
the Australian Tax Office and he did some fantastic work
there, so he’s highly respected throughout the Australian
data science sector, one of the leads there.
Kirill: Fantastic. Well, there you go. Graham Williams: “Data
Mining with Rattle and R.” Once again, Alex, thank you so
much for coming on the show. This has been invaluable and
I’m sure lots and lots of people have and will get a lot of
value out of the insights you shared today.
Alex: I hope so. It’s been great chatting to you, Kirill. I really
enjoyed it.
Kirill: So there you have it. That was Dr Alex Antic, senior data
scientist at the Australian Federal Government. I really hope
you enjoyed today’s episode. There was lots and lots of
information to share. First of all, make sure you go to
www.superdatascience.com/121 and download those two
infographics. We put in quite a bit of effort into those and, as
you can see, Alex put all of his life’s and career’s
experience into those, so you definitely don’t want to miss
out on those. And moreover, you might know somebody who
they can help, so feel free to share them around. That’s our
mission, that’s our goal, to help people into the space of data
science, to help spread the word, and we’ll only be excited
and happy if you can contribute to that as well.
The other question I wanted to ask you is, what was your
favourite takeaway from this podcast? Again, there was lots
and lots of information, but personally for me, I was most
curious about the discussion that we had with the shift
that’s happened in the world from analytical to numerical. I
think that was a very philosophical conclusion from what we
can observe in the world right now. Before, you had to be
very smart and cunning about the mathematical equations
you develop in order to solve problems. And now you can
just be less like that and just throw things into machine
learning algorithms and get them to churn the numbers and
it’s still going to work. It’s just like a brute force approach
instead of a very elegant mathematical approach.
That’s exactly what gave rise to the space of data science. If
we didn’t have a machine that can churn that many
numbers and brute force through things like that, it’d still
be called just mathematics, just applied mathematics, and
that’s what we’d be doing. But now we have the power of
data, the power of analysing lots of data and that’s called
data science. Interestingly, it’s continuing, that trend is
continuing. With the rise of quantum computing, we will be
able to brute force even more, we’ll have to think even less
about how to approach problems, just throw everything into
the computer and let it spit out the results and you, I guess,
will be able to use more and more sophisticated algorithms
like deep learning, which require a lot of computational
power simply because you will have that computational
power.
So there we go, a very interesting way in which the world is
going. And again, your takeaways from this episode might be
different, but in any case, I hope you enjoyed this
conversation today. If you did, make sure to share it around
with others who might benefit from it as well. Don’t forget to
connect with Dr Antic on LinkedIn. You can find his
LinkedIn URL and the link to his Meetup group at
www.superdatascience.com/121. You can also get all the
show notes, including the two infographics there as well. On
that note, thank you so much for being here. I really
appreciate you taking the time to join us for our discussion.
I can’t wait to see you back here next time. Until then,
happy analysing.