Core Machine Learning Concepts
Understand what machine learning is, the main types of ML, the basic supervised learning pipeline, and key terminology, with a simple linear regression example in Python.
What is Machine Learning?
Machine Learning (ML) is about teaching computers to learn patterns from data instead
of programming them with hard-coded rules. A model learns a mapping from inputs
X to an output y.
Types of Machine Learning
Supervised Learning
Data has both inputs and labels.
- Regression: predict a number (price, temperature).
- Classification: predict a class (spam vs non-spam).
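To make the classification case concrete, here is a minimal sketch using scikit-learn's LogisticRegression. The toy data (number of suspicious words per message) and the variable names are made up for illustration; a real spam filter would use many more features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: number of suspicious words in each message
X = np.array([[0], [1], [2], [3], [7], [8], [9], [10]])
# Labels: 0 = non-spam, 1 = spam
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Train a classifier on the labeled examples
clf = LogisticRegression()
clf.fit(X, y)

# Classify two new messages: one with 1 suspicious word, one with 9
new_messages = np.array([[1], [9]])
predictions = clf.predict(new_messages)
spam_probability = clf.predict_proba(new_messages)[:, 1]

print("Predicted classes:", predictions)
print("Estimated spam probability:", spam_probability)
```

Unlike regression, the output is a class label. predict_proba additionally reports how confident the model is in each class.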
Unsupervised Learning
Data has inputs only, no labels.
- Clustering: group similar items/customers.
- Dimensionality reduction: compress features.
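Both unsupervised tasks can be sketched with scikit-learn's KMeans and PCA. The customer data below is hypothetical, and the choice of two clusters is an assumption made for this example, since unlabeled data does not say how many groups exist.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical customers: [annual spend, visits per month], with no labels
X = np.array([
    [200, 1], [220, 2], [210, 1],    # low-spend customers
    [900, 9], [950, 10], [920, 8],   # high-spend customers
])

# Clustering: group similar customers into 2 clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)
print("Cluster of each customer:", cluster_ids)

# Dimensionality reduction: compress 2 features into 1
pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)
print("Shape before:", X.shape, "after:", X_reduced.shape)
```

The cluster numbers themselves are arbitrary; what matters is which customers end up sharing a cluster.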
Reinforcement Learning
An agent learns by trial and error to maximize rewards (e.g., game playing, robotics).
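A very small illustration of trial-and-error learning is the epsilon-greedy strategy on a multi-armed bandit, sketched below with NumPy only. The environment (three actions with hidden reward probabilities) and all names are assumptions for this example; real reinforcement learning problems also involve states and delayed rewards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: 3 actions, each pays reward 1 with a hidden probability
true_reward_prob = np.array([0.2, 0.5, 0.8])
n_actions = len(true_reward_prob)

estimates = np.zeros(n_actions)  # agent's running estimate of each action's value
counts = np.zeros(n_actions)     # how many times each action has been tried
epsilon = 0.1                    # fraction of steps spent exploring at random

for step in range(5000):
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))  # explore: try a random action
    else:
        action = int(np.argmax(estimates))     # exploit: pick the best-looking action
    reward = float(rng.random() < true_reward_prob[action])
    counts[action] += 1
    # Incremental average: move the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = int(np.argmax(estimates))
print("Estimated values:", np.round(estimates, 2))
print("Action the agent prefers:", best_action)
```

The agent is never told which action is best. It discovers this from the rewards it collects.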
Supervised Learning Pipeline
- Collect and clean data.
- Split data into train and test sets.
- Choose a model (e.g., Linear Regression).
- Train the model on training data.
- Evaluate performance on test data.
- Improve by tuning or choosing a better model.
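The steps above can be sketched end to end with scikit-learn. The data here is synthetic (price set to roughly 0.2 × size plus random noise), and the 80/20 split and the choice of metrics are assumptions for illustration rather than fixed rules.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Collect data (synthetic here: price is roughly 0.2 * size plus noise)
rng = np.random.default_rng(42)
sizes = rng.uniform(500, 2000, size=100)
prices = 0.2 * sizes + rng.normal(0, 10, size=100)
X = sizes.reshape(-1, 1)  # 2D: one row per house, one column per feature
y = prices

# Split data into train and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Choose a model and train it on the training data only
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out test data
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean absolute error (thousands):", round(mae, 2))
print("R^2 on test data:", round(r2, 3))
```

Evaluating on data the model never saw during training is what makes the score a fair estimate of performance on new examples.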
Example: Linear Regression in scikit-learn
Linear regression tries to fit a straight line that best describes the relationship between a numeric input (e.g., house size) and a numeric output (e.g., price).
import numpy as np
from sklearn.linear_model import LinearRegression
# Features (X): house sizes in square feet
# Must be 2D: each row is one example, each column a feature
X = np.array([[500], [750], [1000], [1250], [1500]])
# Target (y): house prices in thousands of dollars
y = np.array([100, 150, 200, 250, 300])
# Create the model
model = LinearRegression()
# Train the model on the data
model.fit(X, y)
# Predict price for a 1200 sq ft house
new_size = np.array([[1200]])
predicted_price = model.predict(new_size)
print("Predicted price (in thousands):", predicted_price[0])
# View learned parameters (slope and intercept)
print("Weight (slope):", model.coef_[0])
print("Bias (intercept):", model.intercept_)
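As a sanity check on what the model above learns, the prediction is just weight × size + bias. The sketch below recovers the same line with NumPy's np.polyfit (a different least-squares routine, used here only for comparison) and applies the line equation by hand.

```python
import numpy as np

sizes = np.array([500, 750, 1000, 1250, 1500])
prices = np.array([100, 150, 200, 250, 300])

# Fit a degree-1 polynomial (a straight line) by least squares
slope, intercept = np.polyfit(sizes, prices, 1)

# Apply the line equation by hand: price = slope * size + intercept
manual_prediction = slope * 1200 + intercept

print("Slope:", round(slope, 4))
print("Intercept:", round(intercept, 4))
print("Manual prediction for 1200 sq ft:", round(manual_prediction, 2))
```

Because this toy data lies exactly on the line price = 0.2 × size, the slope is about 0.2, the intercept is about 0, and the prediction for 1200 sq ft is about 240 (thousand dollars), up to floating-point rounding.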
Key Terms
- Feature: input variable (e.g., size, rooms, location).
- Label / Target: what we want to predict (e.g., price).
- Model: function that maps features to a prediction.
- Parameters: internal values learned by the model during training (e.g., the weight and bias of a linear regression).
- Overfitting: model memorizes the training data, including its noise, and performs poorly on new data.
- Underfitting: model is too simple and misses important patterns.
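Overfitting and underfitting can be observed by comparing training and test error for models of different complexity. The sketch below fits polynomials of degree 1, 2, and 10 to noisy data with NumPy. The ground truth (y = x² plus noise), the sample sizes, and the degrees are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical ground truth: y = x^2 plus a little random noise
    x = rng.uniform(-1, 1, size=n)
    y = x**2 + rng.normal(0, 0.1, size=n)
    return x, y

x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # unseen data for evaluation

def mse(coeffs, x, y):
    # Mean squared error of the polynomial with these coefficients
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 2, 10):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = {
        "train": mse(coeffs, x_train, y_train),
        "test": mse(coeffs, x_test, y_test),
    }
    print(
        f"degree {degree:2d}: "
        f"train MSE = {results[degree]['train']:.4f}, "
        f"test MSE = {results[degree]['test']:.4f}"
    )
```

The straight line (degree 1) is too simple for curved data, so its error is high on both sets: underfitting. A degree-10 polynomial fits the 15 training points at least as closely as degree 2, but it typically does worse on the test set because it also fits the noise: overfitting.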