Learning to share my progress as an amateur data scientist

May 15, 2021
Amateur data scientist Motivation Frustration

It has been over half a year since my last blog post and it’s time to end the post drought. Ever since I started this blog, my goal has been to write a post every time I finish a course from the Data Science specialization or have exciting results to share. In August 2020, I started the “Practical Machine Learning” course. The material is challenging for someone who encounters the algorithms for the first time: the course can’t be done in 4 weeks by a data scientist newbie, progress comes slowly, and that somehow produced in me some dissatisfaction and demotivation. Throughout all this time, I kept telling myself I’d wait a little bit longer until I finished the course and then write my next blog post. However, many months have passed, and little did I know how long it would really take me to complete the course. I’m still not done with it and still need to finish the last project. So, what about this blog post?

About two months ago I ran into the YouTube channel of Ali Abdaal, who in this YouTube video talks about 3 books that changed his life, one of them being “Show Your Work!” by Austin Kleon. Kleon writes in his book that sharing your progress is equally valuable, especially at the stage where there’s still lots of learning to do and the results of any project are difficult to report. Additionally, in this digital age, not only experts contribute to the information pool; amateurs can contribute something too: “Amateurs might lack formal training, but they’re all lifelong learners, and they make a point of learning in the open so that others can learn from their failures and successes (…). Amateurs are not afraid to make mistakes or look ridiculous in public. They’re in love, so they don’t hesitate to do work that others think of as silly or just plain stupid (…). Because they have little to lose, amateurs are willing to try anything and share the results.”

Thus, I’ll redefine myself as an amateur data scientist who in this blog post shares her experience of learning the basics of machine learning and how her motivation fluctuated throughout the course. This post is addressed to my future me or some other amateur out there who is going through a stage in the learning process where progress seems to be very slow and demotivation is just around the corner.

The Practical Machine Learning course is the 8th course of the Data Science Specialization and so far the most interesting one. Week 1 introduces the concepts of prediction, types of errors and cross-validation, followed by week 2, which explains the caret package in R, used to preprocess the data and then train models to make predictions. Weeks 3 and 4 cover the most interesting and challenging material, introducing a large variety of algorithms such as trees, model-based predictions, regularized regressions, some forecasting with time series and even an introduction to unsupervised learning. As usual, at the end of each week there is a small quiz with 5 questions. In most questions, you won’t be tested on the things you learned during the week; rather, you’ll be asked about new concepts. Therefore, be prepared to do some googling and researching during the quiz, which could take you some time depending on your previous experience with machine learning. The course squeezes numerous machine learning concepts and algorithms into 4 weeks, so the math behind the algorithms and the output of the R code are not explained thoroughly. The course does a good job of referencing many external resources, such as books and other websites we could read if we want to “know more” about a topic. However, the course doesn’t do a good job of elucidating the basic groundwork of the algorithms, and for me the “know more” part in this context meant understanding the math and the R output of the algorithms.
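To make the week 2 workflow a bit more concrete, here is a minimal sketch of the kind of caret pipeline the course walks through: split the data, preprocess, train with cross-validation and predict on a held-out set. The built-in iris dataset and the random-forest method are just stand-ins I picked for illustration, not the course’s own data.

```r
library(caret)

# Built-in iris dataset as a stand-in for the course data
data(iris)
set.seed(123)

# Split into training and testing sets (70/30)
inTrain  <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
training <- iris[inTrain, ]
testing  <- iris[-inTrain, ]

# Center/scale the predictors and train a random forest with 5-fold cross-validation
fit <- train(Species ~ ., data = training,
             method = "rf",
             preProcess = c("center", "scale"),
             trControl = trainControl(method = "cv", number = 5))

# Predict on the held-out set and look at the accuracy
pred <- predict(fit, newdata = testing)
confusionMatrix(pred, testing$Species)
```

The nice thing about caret, and the reason the course builds on it, is that swapping `method = "rf"` for another algorithm keeps the rest of this pipeline unchanged.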

In my opinion, the course is great for someone who already has experience working with different types of algorithms and wants a refresher on the topics. Nevertheless, for someone learning this material for the very first time, it is hard to follow the videos and understand everything at once. I googled a lot and looked into the books referenced in the course, such as “An Introduction to Statistical Learning” by James, Witten, Hastie and Tibshirani and “The Elements of Statistical Learning” by Hastie, Tibshirani and Friedman. Both are really great for learning the math behind the algorithms, but the mathematical notation is easier in the first one, since it doesn’t use matrix notation. The former also provides code and exercises at the end of each chapter for applying the algorithms in R. If you find these books too demanding, Josh Starmer uses little math to provide amazing explanations and intuition for the algorithms on his YouTube channel StatQuest. His videos are just great and I can highly recommend them.

The whole learning process wasn’t easy. I particularly felt unskilled when the course scheduled one week for four different types of algorithms and I needed one week just to grasp the basics of one algorithm. On some days I was making steady progress, while on others I felt overwhelmed by all the stuff I wasn’t able to comprehend. Those were the days I got demotivated, fell into the demotivation loop, was disappointed with Coursera and lost momentum. I am not sure at which point I came out of the disappointment and demotivation loop; I think it fluctuated throughout the course. I learned that you won’t come out of the demotivation loop by doing nothing. That won’t happen for sure. Furthermore, momentum comes and goes, so if you want to make steady progress, don’t wait around for motivation or momentum to come back. As I once read in a book whose name I don’t quite remember, waiting for inspiration to happen is just a terrible habit. Make a habit out of the work instead. I am still working on building my habit of learning every day, so I don’t have any good tips for it. However, for my future me or any other amateur out there overwhelmed by challenging material, I’ve summarized some points below that you could or should tell your brain next time you’re about to give up.

  1. It’s totally fine not to understand anything at the beginning. You don’t have to.
  2. Break down the material you’re learning into small chunks.
  3. Start with the easiest chunk. For instance, if you’re working with a book, read one page a day.
  4. Google the concepts. A lot of amateurs went through the same problems when learning, and they too have posted something online that could help you understand the chunks, or maybe little pieces of the small chunks you made in step 2.
  5. You’ll feel demotivated and disappointed in yourself for not understanding everything at once. That’s also perfectly fine! Be compassionate with yourself and treat yourself as if you were a 10-year-old child trying to understand complex algorithms. You wouldn’t get mad at them for not understanding everything at once.
  6. Repeat steps 3 and 4 (at least 100 times).
  7. After the 100th time, you’ll understand a little bit more… maybe, or maybe not :D, there are no guarantees in life. Either way, it is also fine.
  8. Start again from step 2 or 3, if you still feel stuck.
  9. Be patient and don’t lose hope! Rome wasn’t built in a day.

Finally, as James Clear says: “If you get one percent better each day for one year, you’ll end up thirty-seven times better by the time you’re done.” So, you’re on the right track by just getting one percent better each day.
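For the curious, the arithmetic behind that figure is just a one-percent improvement compounded over 365 days, which you can check in one line of R:

```r
# Compounding a 1% improvement every day for a year
1.01^365
#> [1] 37.78343
```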

I’m really excited to be finishing the “Practical Machine Learning” course very soon and hope that my next blog post will cover one of the many algorithms I’ve learned in this time. Stay safe, and I hope you enjoyed this blog post!