How I’d start learning machine learning again (3 years in)

Putting the engineer back in machine learning engineer

Daniel Bourke

15 Aug 2020 • 10 min read

I’m underground, back where it all started. Sitting at the hidden cafe where I first met Mike. I’d been studying in my bedroom for the past 9 months and decided to step out of the cave. Half of me was concerned about having to pay $19 for breakfast (unless it’s Christmas, driving Uber on the weekends isn’t very lucrative), the other half about whether any of this study I’d been doing online meant anything.

In 2017, I left Apple, tried to build a web startup, failed, discovered machine learning, fell in love, signed up to a deep learning course with zero coding experience, emailed the support team asking what the refund policy was, didn’t get a refund, spent the next 3 months handing in the assignments four to six days late, somehow passed, decided to keep going and created my own AI Masters Degree.

9 months into my AI Masters Degree, I met Mike, we had coffee, I told him my grand plan: use AI to help the world move more and eat better, he told me I should meet Cam, I met Cam, I told Cam I’m going to the US, he said why not stay here, come in on Thursday, okay, went in on Thursday for a one-day-a-week internship and two weeks later was offered a role as a junior machine learning engineer at Max Kelsen.

14 months into my machine learning engineer role, I decided to leave and try it on my own. I wrote an article about what I’d learned, Andrei found it, emailed me asking if I wanted to build a beginner-friendly machine learning course, I said yes, we built the course and 6 months in we’ve got the privilege of teaching 27,177 students in 150+ countries.

Add it up and you get about 3 years. About the time my original undergraduate degree was supposed to take (due to several failures, I took 5 years to do a 3-year degree).

So as it stands, I feel like I’ve done a machine learning undergraduate degree.

Someone looking from the outside in might think I know a fair bit about machine learning, and I do. I know a lot more than when I started, but I also know how much I don’t know. That’s the thing with knowledge.

1 year in: The honeymoon phase, also known as the noob gains period. You’re much better than a beginner, perhaps even a little too confident (though this isn’t a bad thing).
2 years in: The oh, maybe I’m not as good as I thought phase. Your beginner skills are starting to mature but now you realise getting better is going to take some effort.
3 years in: The wow, there’s still so much to learn phase. Not a beginner anymore but now you know enough to realise how much you don’t know (I’m here).

Learning is non-linear (not a straight line). You may study for an entire month and feel like you’ve made zero progress. Then seemingly out of nowhere, a discovery appears. If you want an example of how we fool ourselves, did you catch the error? It seems I still forget how to spell.

But enough about me. That’s my story. Yours might be similar or you might be starting out today.

If you’re getting started, this article is for you. If you’re a veteran, you can offer your advice or critique my ideas.

Let’s get into it, shall we?

If you came for a list of courses, you’re in the wrong place

I’ve done a bunch of online courses. I’ve even created my own.

And guess what?

They’re all remixes of the same thing.

Instead of worrying about which course is better than another, find a teacher who excites you.

Learning anything is 10% material and 90% being excited to learn.

How many of your school teachers do you remember?

My guess is, regardless of what they taught, you remember the teacher themselves more than the material. And if you remember the material, it’s because they sparked a fire in you enough for it to be burned into your memory.

What then?

Dabble in a few resources, you’re smart enough to find the best ones. See which ones spark your interest enough to keep going and stick with those.

It isn’t an unpleasant task to learn a skill if the teacher gets you interested in it.

The curse of the engineer (and technology nerd)

Show me an engineer who proclaims her use of the latest and greatest tools and I’ll show you an amateur.

I’ll confess. I’m guilty. Every new shiny framework which comes out, every new state of the art model, I’m onto it.

Often I’ll catch myself trying to invent a problem to use whatever new tool is on the market. A classic cart before the horse scenario.

A chef’s entire work centres around two tools, the controlled use of fire and a knife.

This is embodied in the best programming advice I’ve ever received: learn the language, not the framework.

If you’re just starting out and can’t count the number of tools you’re learning on one hand, you’re trying to use too many.

“I want to build things”

If you want to build things, such as web applications or mobile applications, learn software engineering before (or at least alongside) machine learning.

Too many models live and die within Jupyter Notebooks.

Why?

Because machine learning is an infrastructure problem (infrastructure means all the things which go around your model so others can use it; the hot new term you’ll want to look up is MLOps).

And deployment, as in getting your models into the hands of others, is hard.

But that’s exactly why I should’ve spent more time there.

If I was starting again today, I’d find a way to deploy every semi-decent model I build (with exceptions for the dozens of experiments leading to the one worth sharing).

How?

Don’t be afraid to make something simple. A basic front-end which someone can interact with is far more interesting than a notebook in a GitHub repo.

No really, how?

Train a model, build a front-end application around it with Streamlit, get the application working locally (on your computer), once it’s working wrap the application with Docker, then deploy the Docker container to Heroku or another cloud provider.
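To make the first two steps concrete, here is a minimal sketch of what such an app could look like. Everything in it is an assumption for illustration: the file name `app.py`, the toy "model" (a per-label average standing in for a real trained model), the made-up training data and the `train` and `predict` helpers. Streamlit is imported optionally so the prediction logic can still be run and tested where Streamlit isn't installed.

```python
# app.py: a hypothetical, minimal front-end around a toy model.
# Run locally with: streamlit run app.py

try:
    import streamlit as st  # UI layer; optional so predict() works without it
except ImportError:
    st = None


def train(samples):
    """Toy 'model': the mean feature value per label (a stand-in for a real model)."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, value):
    """Return the label whose mean is closest to the input value."""
    return min(model, key=lambda label: abs(model[label] - value))


# Made-up training data: (hours of exercise per week, label).
MODEL = train([(0.5, "low"), (1.0, "low"), (5.0, "high"), (7.0, "high")])

if st is not None:
    # The whole front-end: a title, one input and the model's answer.
    st.title("Activity level demo")
    hours = st.number_input("Hours of exercise per week", min_value=0.0, value=2.0)
    st.write("Prediction:", predict(MODEL, hours))
```

Keeping `predict` separate from the UI code is a deliberate choice: the same function can later sit behind a different front-end, and it is the part you would swap for a real model. Once this runs locally, the Docker and cloud steps are about packaging this one file and its dependencies.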

Sure, we’re going against the rule here by using a few too many tools, but pulling this off a few times will get you thinking about what it’s like to get your machine learning model into people’s hands.

Deploying your models will raise the questions you don’t get to ask when your machine learning model lives its life in a Jupyter Notebook, like:

How long does inference take (the time for your model to make a prediction)?
How do people interact with it (maybe the data they send to your image classifier is different to your test set, data in the real world changes often)?
Would someone actually use this?
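The first question is the easiest one to start answering yourself. Below is a rough sketch, using only the Python standard library, of how you might time a model's predictions. The names `time_inference`, `summarise` and `dummy_predict` are hypothetical, and `dummy_predict` is a placeholder for your own model's prediction function.

```python
import math
import statistics
import time


def time_inference(predict_fn, example, runs=100):
    """Call predict_fn(example) `runs` times; return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(example)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def summarise(latencies):
    """Median, 95th percentile (nearest-rank) and worst-case latency."""
    ordered = sorted(latencies)
    p95_index = math.ceil(0.95 * len(ordered)) - 1
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def dummy_predict(features):
    """Placeholder for a real model's predict call."""
    return sum(v * v for v in features) > 10


stats = summarise(time_inference(dummy_predict, [1.0, 2.0, 3.0]))
print(stats)
```

Looking at the slowest calls as well as the median matters because the person using your app experiences the slow ones too. Numbers from your own machine are only a starting point, since the hardware you deploy to will likely differ.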

“I want to do research”

Building things becomes research. You’ll want your models to work faster, better. To achieve this, you’ll need to research alternative ways of doing things. You’ll find yourself reading research papers, replicating them and improving upon them.

I’m often asked, “how much math should I know before I start machine learning?”

To which I usually reply, “how much walking should I know before I go for a run?”

I don’t really say this. I’m usually nicer and say something like, “Can you solve the problem you’re currently working on?” If so, you know enough. If not, learn more.

As a side note, I’ve just ordered the Mathematics for Machine Learning book. I’m going to be spending the next month or two reading it cover to cover. Having read the free text online, I think it’s more than enough to cover the fundamentals.

Skill before certificates

I’ve got online course certificates coming out of my ass.

I got caught thinking more certificates equals more skills.

I’d burn through lectures on 1.75x speed just to get to the end, pass the automated exam and share my progress online.

I optimised for completing courses instead of creating skills. Because watching someone else explain it was easier than learning how to do it myself.

Idiot.

Here’s the thing. Everything I learned for an exam, I’ve forgotten. Everything I learned through experimenting, I remember.

Now, this isn’t to say online certifications and courses aren’t worth your time. Courses help to build foundational skills. But working on your own projects helps to build specific knowledge (knowledge which can’t be taught).

Instead of stacking certificates, stack skills (and prove your skill through sharing your work, more on this later).
Instead of doing more courses, repeat the ones you’ve already done.
Instead of looking for the newest tools, improve your use of the ones which have been around the longest.
Instead of looking for more resources, reread the best books on your shelf.

Learning (anything) isn’t linear, better to read the same book twice (as long as it’s got some substance) than to add more to the pile.

I often tell my students, despite the immense pride I feel when I see someone share a graduation certificate, I’d prefer them not to finish my course and instead take the parts they need and use them for their own work.

Before you add something, ask yourself, “have I sucked the juice out of what I’ve already covered?”

How I’d start again

First of all, more important than any resource is to get rid of the “I can’t learn it” mentality. That’s bullsh*t. You’ve got the internet. You can learn anything.

The internet has given rise to a new kind of hunter-gatherer. And if you decide to take on the challenge you can gather resources to create your own path.

The following path isn’t set either. It’s designed to be a compass rather than a map. And guess what? It’s all accessible online.

Let’s lay some foundations.

An excerpt of the 2020 Machine Learning Roadmap. **Note:** This curriculum is heavily code-first, Python code in particular. It also neglects mobile or embedded device development. However, it contains more than enough resources to get an outstanding grounding in the field.

The beginner path (6–12+ months)

If I was starting again I’d learn far more software engineering practices intertwined with machine learning.

My main goal would be to build more things people could interact with.

The machine learning specific parts would be:

Machine learning concepts — understand what kind of problems machine learning can and should be used for. Elements of AI is great for this.
Python — the language itself, along with the machine learning specific frameworks, NumPy, pandas, matplotlib, Scikit-Learn. Check out pythonlikeyoumeanit or the official documentation for each of these.
Machine learning tools — the main one being Jupyter Notebooks.

[Note: The Zero to Mastery Machine Learning course (the course I teach) teaches the above 3 topics.]

Machine learning math — linear algebra from 3Blue1Brown or Khan Academy, matrix manipulation and calculus from Khan Academy, or just read the Mathematics for Machine Learning book.

Alongside these, I’d go through:

freeCodeCamp — for web development skills.
CS50 + CS50 artificial intelligence — for foundational computer science and artificial intelligence skills.
The Missing Semester of Your CS Education — for the parts CS50 misses out on and for coverage of all the tools you’ll end up using here and there anyway.
Hands-On Machine Learning with Scikit-Learn and TensorFlow Part 1 — covers the vast majority of the most useful and time-tested machine learning techniques.

There’s a lot here. So to consolidate my knowledge I’d build 1–2 milestone projects using Streamlit or the web development skills I’d learned from freeCodeCamp. And of course, these would be shared on GitHub.

The advanced path (6–12+ months/ongoing)

Once I’d gotten some foundational machine learning skills, I’d build upon them with the following.

All of fast.ai’s curriculum(s) — practical use cases of many deep learning and machine learning techniques. Watching one fast.ai lecture turned into a solution we built for a client.
Any of deeplearning.ai’s curriculum(s) — choose the one which sparks your interest the most. Complements fast.ai’s practical approach with theory.
Full-stack deep learning curriculum — this is where you’re going to tie together the machine learning knowledge you’ve got with the web development knowledge you’ve been learning.
Replicate a research paper (or multiple).
Hands-On Machine Learning with Scikit-Learn and TensorFlow Part 2 — TensorFlow focused but the concepts bridge to many different applications.

Again, after going through these, I’d consolidate my knowledge by building a project people can interact with.

An example would be a web application powered by a machine learning model.

Example curriculums

Two of the biggest things you pay for with a college degree are accountability and structure.

Good news is, you can get both of these yourself.

I created my own AI Masters Degree as a form of accountability and structure. You can do something similar.

In fact, if I was starting again, I’d follow something closer to Jason Benn’s How I learned web development, software engineering & ML. It’s similar to mine but includes more software engineering practices.

If you can find a (small) community to learn with others, that’s a big bonus. I’m still not quite sure how to do this.

A billion-dollar idea is to develop a platform where people can create their own self-driven curriculums and interact with others who are on similar paths. I say self-driven here because all knowledge is largely self-taught. Rather than hand-feed knowledge, the role of an instructor is more to excite, guide and challenge.

Friends, does exist a platform which allows students to:

- Create their own curriculum’s (e.g. collecting various online resources)
- Find and interact with students on similar paths
- Ask questions in a shared knowledge base

If not, why not?

Has this been tried before?
— Daniel Bourke (@mrdbourke) August 10, 2020

Learning and reading is inhaling. Building and creating is exhaling. Don’t hold your breath.

Balance your consumption of materials with creations of your own.

For example, you might spend 6 weeks learning, then 6 weeks putting your knowledge together in the form of shared work.

Your shared work is your new resume.

Where?

GitHub and your own blog. Use the other platforms when needed.
For machine learning projects, a runnable Colab notebook is your minimum requirement.

What’s missing?

Everything here is biased by my own experience of graduating from a nutrition degree, spending 9 months studying machine learning in my bedroom whilst driving Uber on the weekends to pay for courses, getting a machine learning job, leaving the job and building a machine learning course.

I have no experience of going to a coding bootcamp or university to learn technological skills, so I can’t compare the differences.

Though, since we’re talking about code and math, it either works or it doesn’t. Knowing this, the content of the materials you choose doesn’t matter as much as how you learn it.

Video version of this article

I put together some clips from the last three years as well as riffed on a few points to go along with this article. Not all points are the same but they stick with the theme.