Chapter 1: What Is Machine Learning
Explicit Programming vs Machine Learning
Traditional Programming
Humans explicitly write rules, and the computer executes processing according to those rules.
Input: Data + Program (Rules) -> Output: Result
Machine Learning
The computer automatically discovers rules (patterns) from data.
Input: Data + Result (Labels) -> Output: Program (Model)
Example: Handwritten Digit Recognition
- Traditional: Humans write rules such as "a 3 consists of two semicircles stacked vertically" -> the approach breaks down because real handwriting produces too many exceptions
- Machine Learning: Patterns are automatically learned from a large collection of handwritten digit images and their correct labels
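The contrast can be sketched on a toy version of the digit problem. Everything below is an illustrative assumption rather than a real recognizer: the 3x3 "images", the hand-written rule in `rule_based`, and the nearest-centroid learner (`train`, `predict`) that stands in for machine learning.

```python
# Hypothetical 3x3 "images", flattened row by row (1 = ink, 0 = blank).
# Labels: 0 = the digit zero (a ring), 1 = the digit one (a vertical stroke).
TRAIN = [
    ([1, 1, 1,  1, 0, 1,  1, 1, 1], 0),
    ([0, 1, 1,  1, 0, 1,  1, 1, 1], 0),  # ring with a missing corner
    ([1, 1, 1,  1, 0, 1,  1, 1, 0], 0),
    ([0, 1, 0,  0, 1, 0,  0, 1, 0], 1),
    ([0, 0, 1,  0, 0, 1,  0, 0, 1], 1),  # stroke written off-center
    ([1, 0, 0,  1, 0, 0,  1, 0, 0], 1),
]

# Unseen images: a ring missing a different corner, and an off-center
# stroke with a small foot.
NEW_ZERO = [1, 1, 0,  1, 0, 1,  1, 1, 1]
NEW_ONE = [0, 0, 1,  0, 0, 1,  0, 1, 1]


def rule_based(image):
    """Traditional approach: a human-written rule.

    Rule: a one has ink in the center pixel, a zero does not.
    """
    return 1 if image[4] == 1 else 0


def train(examples):
    """Machine learning approach: learn one average image per label."""
    sums, counts = {}, {}
    for image, label in examples:
        acc = sums.setdefault(label, [0.0] * len(image))
        for i, pixel in enumerate(image):
            acc[i] += pixel
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(model, image):
    """Return the label whose average image is closest to the input."""
    def sq_dist(centroid):
        return sum((p - c) ** 2 for p, c in zip(image, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


model = train(TRAIN)
for name, image in [("zero", NEW_ZERO), ("one", NEW_ONE)]:
    print(name, "-> rule:", rule_based(image),
          "learned:", predict(model, image))
```

In this sketch the hand-written rule labels the off-center stroke as a zero, because its author did not anticipate that exception. The learned model labels it as a one, because similar strokes appeared in its training data.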
Discovering Patterns
Pattern Recognition
The essence of machine learning is discovering regularities (patterns) hidden within data.
Why Is Machine Learning Effective?
Machine learning is most useful for problems where hand-written rules fall short:
- Patterns too complex for humans to articulate as rules (face recognition, speech recognition, etc.)
- Problems involving a vast number of variables (recommendation systems, financial prediction, etc.)
- Problems where the environment keeps changing, so fixed rules go stale (spam filters, anomaly detection, etc.)
Prediction and Generalization
Prediction
Using learned patterns (a model) to estimate outputs for new data.
Generalization
The ability to make correct predictions not only on training data but also on unseen data.
This is the most important goal in machine learning.
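One way to see the difference between fitting the training data and generalizing is to compare two models on examples they have never seen. The sketch below is a minimal illustration under stated assumptions: the data (which follows y = 2x) is made up, and `fit_memorizer`, `fit_slope`, and `mean_abs_error` are hypothetical helper names.

```python
# Hypothetical data following y = 2x, split into training and unseen examples.
train_set = [(1, 2), (2, 4), (3, 6), (4, 8)]
unseen_set = [(5, 10), (6, 12)]


def fit_memorizer(examples):
    """Memorize every training pair; answer unseen inputs with the mean label."""
    table = dict(examples)
    fallback = sum(table.values()) / len(table)
    return lambda x: table.get(x, fallback)


def fit_slope(examples):
    """Least-squares fit of y = w * x (a line through the origin)."""
    w = sum(x * y for x, y in examples) / sum(x * x for x, _ in examples)
    return lambda x: w * x


def mean_abs_error(model, examples):
    """Average absolute difference between predictions and true values."""
    return sum(abs(model(x) - y) for x, y in examples) / len(examples)


memorizer = fit_memorizer(train_set)
line = fit_slope(train_set)
for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name,
          "train error:", mean_abs_error(model, train_set),
          "unseen error:", mean_abs_error(model, unseen_set))
```

Both models have zero error on the training data, but only the fitted line keeps that error on the unseen examples. The memorizer has recorded the data without capturing the pattern behind it.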
Example: House Price Prediction
- Training: Learn patterns from historical housing data (area, age, price)
- Prediction: Estimate the price of a new property from its area and age
What matters is that the model can make reasonable predictions even for properties not in the training data (generalization).
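The house price example can be sketched as a small linear regression. This is only an illustration under stated assumptions: the training rows are synthetic, generated from a made-up rule (price = 50 + 3 * area - 2 * age, in arbitrary units), and the helpers `solve`, `fit`, and `predict` are hypothetical names. Real housing data is noisy and would not fit a line exactly.

```python
# Synthetic training data: (area, age, price), generated from the made-up
# rule price = 50 + 3 * area - 2 * age. Units are arbitrary.
TRAINING = [
    (50, 10, 180),
    (70, 5, 250),
    (60, 20, 190),
    (90, 15, 290),
    (80, 30, 230),
]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit(rows):
    """Training: least-squares fit of price = w0 + w1 * area + w2 * age."""
    xs = [[1.0, area, age] for area, age, _ in rows]
    ys = [price for _, _, price in rows]
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(3)] for i in range(3)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(3)]
    return solve(xtx, xty)  # normal equations: (X^T X) w = X^T y


def predict(weights, area, age):
    """Prediction: estimate the price of a property from its area and age."""
    w0, w1, w2 = weights
    return w0 + w1 * area + w2 * age


weights = fit(TRAINING)
# A property that does not appear in the training data.
print(round(predict(weights, area=75, age=12), 2))
```

Because the synthetic data follows the rule exactly, the estimate for the new property comes out at about 251, the value the rule itself gives. The property with area 75 and age 12 is not in the training data, so this is a prediction on unseen data.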
Definition of Machine Learning
Mitchell's Definition (1997)
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
In essence: if more experience makes a program better at its task, the program is learning. This is the same idea as humans improving through study and practice, applied to computer programs.
- Task T (what to do): The goal the program should achieve
- Performance measure P (how well): The metric for evaluating task performance
- Experience E (what to learn from): The data or accumulated trials given to the program
A program whose performance (P) on a task (T) improves as it gains experience (E) is therefore said to be "learning." The following examples make this concrete.
Example: Chess Program
- Task T: Playing chess
- Performance measure P: Win rate in games
- Experience E: Data from past games
-> If the win rate (P) improves as the program learns from more past games (E), the program is "learning."
Example: Email Classification
- Task T: Classifying emails as spam or non-spam
- Performance measure P: Classification accuracy (the fraction of emails classified correctly)
- Experience E: Email data labeled as "spam" or "non-spam"
-> If classification accuracy (P) improves as more labeled email data (E) is provided, the program is "learning."
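The email example can be turned into a small experiment that measures P as E grows. The sketch below is a minimal illustration, not a practical spam filter: the emails are invented, and the word-counting learner (`train`, `classify`, `accuracy`) is a hypothetical stand-in for a real classifier.

```python
from collections import Counter

# Experience E: hypothetical labeled emails, in the order they arrive.
TRAINING = [
    ("win money now", "spam"),
    ("meeting at noon", "non-spam"),
    ("free prize claim now", "spam"),
    ("lunch meeting tomorrow", "non-spam"),
    ("cheap loans claim money", "spam"),
    ("project report tomorrow", "non-spam"),
]

# Hypothetical held-out emails used to measure performance P.
HELD_OUT = [
    ("free prize inside", "spam"),
    ("report for the meeting", "non-spam"),
    ("claim your money", "spam"),
    ("lunch tomorrow", "non-spam"),
    ("cheap loans today", "spam"),
]


def train(examples):
    """Count how often each word appears in spam and in non-spam emails."""
    counts = {"spam": Counter(), "non-spam": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def classify(counts, text):
    """Task T: label an email spam if its words were seen more often in spam."""
    score = sum(counts["spam"][w] - counts["non-spam"][w]
                for w in text.split())
    return "spam" if score > 0 else "non-spam"


def accuracy(counts, examples):
    """Performance measure P: fraction of emails classified correctly."""
    hits = sum(classify(counts, text) == label for text, label in examples)
    return hits / len(examples)


# Train on the first 2, 4, and 6 emails and measure P on the held-out set.
accuracies = [accuracy(train(TRAINING[:n]), HELD_OUT) for n in (2, 4, 6)]
print(accuracies)
```

With these toy emails the script prints `[0.6, 0.8, 1.0]`: accuracy on the held-out emails rises as the program sees more labeled examples, which is what Mitchell's definition calls learning. Real data rarely improves this smoothly.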
Summary
- Machine Learning: An approach that automatically learns patterns from data
- Difference from traditional methods: Rules are discovered from data, not written by humans
- Pattern discovery: Finding regularities hidden in data
- Generalization: The ability to predict correctly on unseen data, which is the most important goal of machine learning
Frequently Asked Questions
Q. What is machine learning?
Machine learning is a technology in which computers automatically discover patterns from data and make predictions on unseen data. In traditional programming, humans explicitly write rules, whereas in machine learning, the computer learns rules (a model) on its own when given data and correct answers.
Q. What is the difference between traditional programming and machine learning?
In traditional programming, the flow is "Data + Rules -> Result," and humans write the rules. In machine learning, the flow is "Data + Labels -> Model," and the computer automatically discovers rules (a model) from the data. This is especially effective for problems where humans find it difficult to articulate rules, such as handwritten digit recognition.
Q. What is generalization?
Generalization is the ability to make correct predictions not only on training data but also on unseen data. It is the most important goal of machine learning. The phenomenon in which a model fits the training data too closely, resulting in poor generalization performance, is called "overfitting."
References
- Mitchell, T. M. (1997). Machine Learning. McGraw-Hill. Source of the definition of machine learning introduced in this chapter.
- Wikipedia. Machine learning.
- Wikipedia. Pattern recognition.
- Wikipedia. Generalization error.