The requirements to get started in Machine Intelligence

Machine learning is a cross-disciplinary field that includes computer science, mathematics, and sometimes domain knowledge. You can’t focus on just one part and expect to be great at it, and no, you don’t have to be a master of everything to get started; being a jack of all trades is fine at this stage. You need a basic understanding of the topics below.

The maths

[Figure: math_for_ml. Picture credit: towardsdatascience]

Let’s make this clear: machine learning is math. To be capable and comfortable with machine learning, I advise starting by brushing up on your maths. Not all maths, of course, but linear algebra, single-variable and multivariable calculus, statistics, and probability are required, along with programming.

As we can see in the picture above, linear algebra makes up 35% of the maths behind ML. You need to understand what a matrix is and the basic operations on matrices, such as addition, subtraction, multiplication, and inversion (the matrix counterpart of division). On top of that, topics such as principal component analysis (PCA), singular value decomposition (SVD), eigendecomposition of a matrix, LU decomposition, QR decomposition/factorization, symmetric matrices, orthogonalization and orthonormalization, projections, eigenvalues and eigenvectors, vector spaces, and norms are needed for understanding the optimization methods used in machine learning. An excellent course on linear algebra is the MIT linear algebra courseware found here.
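To make eigenvalues and eigenvectors a little more concrete, here is a minimal sketch of power iteration in plain Python. The helper names (`mat_vec`, `norm`, `power_iteration`) and the example matrix are made up for illustration; in practice you would likely reach for a linear algebra library instead.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def norm(vector):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))

def power_iteration(matrix, steps=100):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix."""
    # Arbitrary non-zero start; it must not be orthogonal to the
    # dominant eigenvector, otherwise the method cannot find it.
    vector = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(steps):
        product = mat_vec(matrix, vector)
        length = norm(product)
        vector = [x / length for x in product]
    # For a unit-length v, the Rayleigh quotient v^T A v estimates the eigenvalue.
    eigenvalue = sum(v * w for v, w in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector

# Illustrative symmetric matrix; its exact eigenvalues are 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue, eigenvector)
```

Repeatedly multiplying by the matrix and renormalizing pulls the vector towards the direction the matrix stretches the most, which is the same idea that PCA relies on when it looks for directions of largest variance.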

Next is single-variable and multivariable calculus. Some of the essential topics include differential and integral calculus, partial derivatives, vector-valued functions, directional derivatives, the gradient, the Hessian, the Jacobian, the Laplacian, and the Lagrangian. You can find a tutorial on this at Khan Academy here.
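As a small illustration of partial derivatives and the gradient, the sketch below approximates a gradient with central differences. The function `f` and the helper `numerical_gradient` are hypothetical examples chosen for this post, not part of any library.

```python
def f(x, y):
    """Example function: f(x, y) = x^2 + 3xy."""
    return x ** 2 + 3 * x * y

def numerical_gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point using central differences."""
    gradient = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # nudge only the i-th coordinate up
        backward[i] -= h  # and down
        partial = (func(*forward) - func(*backward)) / (2 * h)
        gradient.append(partial)
    return gradient

# By hand, the partial derivatives are 2x + 3y and 3x,
# so the gradient at (1, 2) should be close to (8, 3).
gradient = numerical_gradient(f, [1.0, 2.0])
print(gradient)
```

Each entry of the gradient is one partial derivative: how fast the function changes when only that one input moves. Comparing the numerical result with the derivatives you work out by hand is a good way to check your calculus.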

Statistics and probability are also vital, since machine learning and statistics aren’t very different fields. Some of the fundamental statistics and probability theory needed for ML includes combinatorics, probability rules and axioms, Bayes’ theorem, random variables, variance and expectation, conditional and joint distributions, standard distributions (Bernoulli, binomial, multinomial, uniform, and Gaussian), moment generating functions, maximum likelihood estimation (MLE), priors and posteriors, maximum a posteriori estimation (MAP), and sampling methods. Check out this MIT statistics courseware here and this Harvard probability course here.
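To give a feel for maximum likelihood estimation, here is a minimal sketch that fits a Gaussian to a handful of numbers. The data is made up and the function names are hypothetical; the point is only that the MLE of a Gaussian is the sample mean and the variance computed by dividing by n.

```python
import math

def gaussian_mle(samples):
    """Maximum likelihood estimates of a Gaussian's mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    # The MLE divides by n, not n - 1 (so it is the biased estimator).
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance

def gaussian_log_likelihood(samples, mean, variance):
    """Log-likelihood of the samples under a Gaussian(mean, variance)."""
    n = len(samples)
    squared_error = sum((x - mean) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_error / (2 * variance)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample
mean_hat, variance_hat = gaussian_mle(data)

# Any other mean should score a lower log-likelihood than the MLE.
best = gaussian_log_likelihood(data, mean_hat, variance_hat)
shifted = gaussian_log_likelihood(data, mean_hat + 0.5, variance_hat)
print(mean_hat, variance_hat, best > shifted)
```

MAP estimation follows the same pattern, except that the log of a prior over the parameters is added to the log-likelihood before maximizing.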

Programming

The other part of ML is programming. As of the end of 2018, the most popular language for writing machine learning code is Python, because its syntax is easy to understand. Hear me right: I am NOT saying that you can only write machine learning in Python. You can write ML in JavaScript, Java, or even assembly language if you want to. What I mean is that Python is the best suited, since so many ML frameworks are available for it and it is relatively easy for beginners to learn. Here is an excellent course on Udemy by Tim Buchalka that will take you from beginner to upper-intermediate level; it is worth every cent. The course is found here.

It is also vital to learn data structures and algorithms. This is not mentioned in many resources online, but I would say it is probably the most crucial part of programming, since data structures and algorithms are the backbone of any computer science field. They are also vital for understanding the computational efficiency and scalability of our machine learning algorithms and for exploiting sparsity in our datasets. Knowledge of data structures (binary trees, hashing, heaps, stacks, etc.), dynamic programming, randomized and sublinear algorithms, graphs, gradient and stochastic gradient descent, and primal-dual methods is needed. I recommend the book Cracking the Coding Interview here and Elements of Programming Interviews in Python here.
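As one small illustration of how a data structure can exploit sparsity, the sketch below stores only the non-zero entries of a vector in a dictionary (a hash table) and computes a dot product from them. The name `sparse_dot` and the example vectors are hypothetical.

```python
def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # loop over the vector with fewer non-zero entries
    return sum(value * b[index] for index, value in a.items() if index in b)

# Two vectors with a million possible positions but only three non-zeros each.
u = {0: 1.0, 1000: 2.0, 500000: 3.0}
v = {1000: 4.0, 500000: 0.5, 999999: 7.0}

result = sparse_dot(u, v)  # only indices 1000 and 500000 overlap
print(result)
```

The work done depends on the number of non-zero entries rather than on the full length of the vectors, which is why knowing how hashing behaves matters once datasets get large.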

Good to know

The rest of the skills are good to have but not required. These are data engineering, which is the ability to get insight from large amounts of data using Hadoop, Spark, SQL, and NoSQL, and data analysis and visualization using Tableau or Excel.

Only after acquiring an understanding of the topics above should you take the Andrew Ng course on Coursera found here. After finishing that course, you can consider specializing in an ML algorithm or area such as linear regression, k-means, logistic regression, clustering, or reinforcement learning, or start using ML frameworks like TensorFlow, scikit-learn, or Keras.
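To show how the maths and programming come together in one of these algorithms, here is a rough sketch of linear regression fitted with gradient descent in plain Python. The function `fit_line`, the learning rate, the number of epochs, and the data are all assumptions picked for this illustration; a framework such as scikit-learn would normally do this work for you.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=5000):
    """Fit y = slope * x + intercept by minimizing the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every data point with the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step against the gradient to reduce the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The update rule is calculus (partial derivatives), the error is statistics (mean squared error), and the loop is programming, which is why the earlier topics are worth learning first.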

Last but not least, you must practice! And keep on practicing to sharpen your skills. You will understand machine learning better if you work on real projects. Start by using free datasets found on websites like this or by competing in Kaggle challenges. Another thing you can do, at least once a week, is read research papers found on websites like Papers with Code to stay up to date with the latest trends in machine learning. You should also improve your googling skills, because chances are that you will run into problems or errors and will need to search for how someone else fixed them. Another area to improve is your communication skills: if you are applying for machine learning jobs, you will likely be working in a team where you will be required to communicate and present your findings effectively to non-technical staff or the management board.

Conclusion

I know this list of maths and programming skills might seem daunting at first, and I can’t say it will be easy, because I am not an expert at all of them either. But I can promise that if you are determined and persevere, you can learn this and gain an understanding of the foundations of ML. After that, it will be much more comfortable to debug errors (yes, you are going to encounter many of those) while using sophisticated machine learning algorithms.

Thank you for reading this tutorial. I hope you have learned a thing or two. If you like this post, please subscribe to stay updated with new posts, and if you have a thought or a question, I would love to hear it in the comments below.