Women Who Code is a fantastic non-profit organization whose goal is to expose more women to tech-related careers. Their chapter Women Who Code Data Science is putting on a six-week introduction to machine learning course on Saturdays from 5/16/20 – 6/20/20. Each week, on the following Monday, I will be posting a recap of the course on my blog. Here’s the recap of Part 1; the full video is available on YouTube.
Part 1 Focus: Intro to Supervised Learning
We don’t implement algorithms because they exist, we implement algorithms because they fit the data we are working with. - Sumana Ravikrishnan
1. Machine Learning History
- Believe it or not, the idea of AI and machine learning first came onto the scene in the 1950s, when Alan Turing introduced the concept of the Turing test. The time and money invested in AI have fluctuated since then, but interest in the subject is at an all-time high.
2. Types of Machine Learning
- Supervised Learning: Your model is learning from example data that comes with the outcome you are trying to predict (think of x values and y values). This means you already have an idea of your target and how it behaves with the input data.
- Unsupervised Learning: Your model is learning from example data that only has input data (basically, x values with no y values). Thus, the model has to rely on its own algorithm to detect patterns among the observations.
- Semi-supervised Learning: Your model is learning from example data that is only partly labeled (some x values have y values and some don’t). The split does not have to be an even half; in practice the labeled portion is often the smaller one.
- Reinforcement Learning: Your model learns by trial and error. Rather than being handed the correct outcome for every example up front, it makes decisions, receives positive or negative feedback on them, and adjusts its behavior to earn better feedback over time.
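To make the difference between the first two types concrete, here is a minimal sketch in plain Python. It is not from the course: the numbers are made up, and the helper names `predict_nearest` and `split_at_largest_gap` are hypothetical, chosen only for illustration. The supervised half uses the known labels to predict a new one; the unsupervised half only sees the x values and has to find the groups on its own.

```python
# Toy data: hours studied (x). The values are invented for illustration.
hours = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]

# Supervised: every x comes with a known outcome y.
labels = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass"]


def predict_nearest(x_new, xs, ys):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return ys[nearest]


def split_at_largest_gap(xs):
    """Unsupervised: no labels, so cut the sorted values at the widest gap."""
    ordered = sorted(xs)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


supervised_prediction = predict_nearest(5.5, hours, labels)
group_a, group_b = split_at_largest_gap(hours)

print(supervised_prediction)  # a label, because the model saw labels
print(group_a, group_b)       # two unnamed groups, because it did not
```

Notice that the unsupervised half can tell you the data falls into two groups, but it cannot tell you which group is "Pass" and which is "Fail". That information only exists in the labels.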
3. Types of Supervised Learning
- There are two main types of supervised learning: classification and regression. It is important to honor the advice given by Sumana above and pick the algorithm that fits the data you are working with. That is what will drive your decision on which to use.
- Classification: The target response, or y, is categorical. For example, this could be a model that needs to predict a test outcome (Pass/Fail), a weather type (Sunny/Rainy/Windy), or a college year (Freshman/Sophomore/etc.).
- Regression: The target response, or y, is continuous (numerical). For example, this could be a model that predicts annual salary in USD or weight in kilograms.
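Here is a small sketch of how the type of y changes the kind of model you reach for. It is an illustration under my own assumptions, not code from the course: the data is invented, and I picked a nearest-centroid rule for classification and a least-squares line for regression simply because both fit in a few lines of standard-library Python.

```python
from statistics import mean


def nearest_centroid(x_new, xs, ys):
    """Classification: return the class whose average x is closest to x_new."""
    centroids = {
        label: mean([x for x, y in zip(xs, ys) if y == label])
        for label in set(ys)
    }
    return min(centroids, key=lambda label: abs(centroids[label] - x_new))


def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# Categorical target: test score -> Pass/Fail (made-up data).
scores = [35.0, 42.0, 48.0, 61.0, 70.0, 88.0]
outcomes = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass"]
predicted_outcome = nearest_centroid(65.0, scores, outcomes)

# Continuous target: years of experience -> annual salary in USD (made-up data).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salaries = [40000.0, 45000.0, 50000.0, 55000.0, 60000.0]
slope, intercept = fit_line(years, salaries)
predicted_salary = intercept + slope * 6.0

print(predicted_outcome)  # one of a fixed set of categories
print(predicted_salary)   # a number on a continuous scale
```

The classifier can only ever answer with one of the categories it was shown, while the regression line can return any number, including values that never appeared in the training data.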
4. Hypothesis Testing
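- Hypothesis testing starts with two competing hypotheses: the null hypothesis (the default assumption, usually that there is no effect or difference) and the alternate hypothesis (the claim we are looking for evidence of).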
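- When we conduct the hypothesis test, the quantifiable result that drives our ultimate decision is called the p-value. It is the probability of seeing data at least as extreme as ours if the null hypothesis were true.
- The significance level, or the threshold the p-value must fall below before we reject the null hypothesis, is called alpha. It is the risk we are willing to accept of rejecting a null hypothesis that is actually true; 0.05 is a common choice.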
- The purpose of this type of test is to explore whether the data in question provides enough evidence to reject the null hypothesis in favor of the alternate hypothesis. If the p-value is below alpha, we reject the null hypothesis; otherwise, we fail to reject it.
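The pieces above (null hypothesis, p-value, alpha) can be sketched with a coin-flip example. This is my own toy illustration, not something shown in the course: the null hypothesis is "the coin is fair", the observation of 16 heads in 20 flips is made up, and `two_sided_p_value` is a hypothetical helper that computes an exact two-sided p-value using only the standard library.

```python
from math import comb


def two_sided_p_value(heads, flips):
    """Exact p-value for the null hypothesis 'the coin is fair'.

    Adds up the probability of every outcome at least as far from the
    expected count (flips / 2) as the observed number of heads.
    """
    expected = flips / 2
    observed_distance = abs(heads - expected)
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_distance
    )
    return extreme_outcomes / 2 ** flips


alpha = 0.05  # significance level, chosen before looking at the data
p_value = two_sided_p_value(heads=16, flips=20)  # made-up observation
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("Reject the null hypothesis" if reject_null
      else "Fail to reject the null hypothesis")
```

With these numbers the p-value comes out to roughly 0.012, which is below an alpha of 0.05, so we would reject the idea that the coin is fair. With 10 heads in 20 flips the p-value would be 1.0 and we would fail to reject it.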
Data alone is not interesting. It is the interpretation of the data that we are really interested in. - Sumana Ravikrishnan
Now you’re all caught up from Part 1 – which means you can totally just click this link now and sign up for Part 2: Classification, and all of the rest of the courses while you’re at it! Trust me, you don’t want to miss out on all of the insightful, valuable, and free information being presented by top data scientists in the industry.
I can’t wait to see you all (virtually) at next Saturday’s course. If for some reason you can’t make it next week, don’t fret! You can catch a recap every Monday right here on my website 🙂