How to avoid information overload - initial Data Science learning

Posted on Wed 02 March 2022 in Data Science General • 3 min read

When a sudden urge to make a new batch of flourless brownies strikes, the internet can be a very quick and efficient way of learning. Some Google searches, a couple of YouTube videos and even a cheeky peek or two at social media and you're good to bake. If, however, the goal is to learn a whole new field (such as Data Science), this strategy can lead to weaving bits and pieces of information together from different articles, videos, courses and books. It can be overwhelming and slow, and it can leave foundational holes that need to be continuously filled. There is often a mish-mash of very surface-level introductory content and advanced, deep-level information. Whenever I’m asked, I suggest a kind of framework that can aid with navigating this file-dump overload.

Start with the big big picture:

Understand roughly what there is in the field and approximately where you would like to be heading. Are you interested in solving business problems in your current role using machine learning techniques? Is there an interest in becoming an analytics engineer who mixes statistics, machine learning and other problem-solving tools on a daily basis? Are you interested in some sub-field of Deep Learning such as Computer Vision, Natural Language Processing or Robotics? Or maybe Deep Learning as a whole? Maybe you would like to experiment with all of the above. Having this picture will help with where and how to focus, because it is all too easy to bury your head too deep too fast and completely miss the wood for the trees.

Move on to the big picture (within your level of interest identified above):

Get a high-level picture of what this field looks like. This can be a course, some articles, some YouTube videos. At this point, what you're not doing is picking one niche and watching every video you can find on it. You are also not watching and reading every random thing that comes along. Here, you are getting an idea of what exists and what is used for what - make a mind-map. Most crucially, you'll be building an understanding of what the daily bag of tools in this field looks like. Things will start repeating: the jargon will start to sound familiar, and the same technologies, frameworks and resources will be mentioned. This will form the basis of the "language" of this field. This bird's-eye view enables you to see what you are already familiar with, what is important and where you should spend your time.

Cut up the big picture into many tiny steps:

It is now probably clear what the 'big tools' are and what is used often. Start with that. Practice that. Apply practically over and over until none of it feels new. Now you can go one level deeper. How do these black boxes work? How do we test these black boxes better? How can we understand more about what's going on?

This type of approach will let theory stick because it has some base already. Learn one tiny new thing at a time. Try not to “get stuck” on any one problem. It is far too easy to go down a rabbit hole, going too deep too fast, or to go off on a tangent with irrelevant and (for initial purposes) obscure things. The opportunity cost of getting stuck on one project is many solid small learnings, given up in exchange for the vague idea you would get by staying in the rabbit hole. Using this type of method, you fill out your foundational knowledge.

This probably sounds counter to what most people suggest - which is to start with all the maths, statistics and computer science first and then move on to the machine learning; however, this is a faster, more direct way of doing it. For some people, beginning with the maths will feel confusing, as it is not clear where and how it will be used. Because of this, it becomes unclear when your knowledge and skill in these areas are good enough. You can go on learning Calculus, Linear Algebra, Optimization, Statistics and other foundational fields forever! Go for what you want directly instead. Now that you have the application and the theory, you’re pretty much good to go. Explore, niche down and, most importantly, play.

Hope this helps and happy learning.