AI & Machine Learning – Complete Self-Learning Syllabus

This syllabus is an introduction to self-learning AI and Machine Learning. If you are a newcomer or do not have a strong foundation in programming, complete the programming fundamentals (C, Java, and OOP) before starting ML.


0. Orientation & Foundations


0.1 Artificial Intelligence (AI)

Overview

  • What is Artificial Intelligence

  • Ability of machines to mimic human intelligence

  • Real-world AI examples


0.2 Machine Learning (ML)

Overview

  • What is Machine Learning

  • Subset of Artificial Intelligence

  • Uses data to solve tasks

  • Learns patterns from past data


0.3 Deep Learning (DL)

Overview

  • What is Deep Learning

  • Subset of Machine Learning

  • Uses neural networks inspired by the human brain


0.4 Comparison of Concepts

AI vs ML vs DL

  • AI as a broad concept of intelligent systems

  • ML as data-driven statistical learning

  • DL as neural-network-based learning

ML vs Data Roles

  • Machine Learning vs Data Science

  • Machine Learning vs Data Analyst


0.5 Traditional Programming vs Machine Learning

Traditional Programming

  • Rules + Data → Output

  • Fixed logic

  • No learning from data

Machine Learning

  • Data + Output → Model

  • Model learns rules automatically

  • Improves with experience
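
The contrast above can be sketched in a few lines of plain Python. This is only an illustration: the pass mark of 50, the sample scores, and the function names are hypothetical. The first function has its rule written by hand, while the second derives a threshold from labeled examples.

```python
# Traditional programming: the rule (a pass mark of 50) is written by hand.
def passes_rule_based(score):
    return score >= 50


# Machine learning (toy version): the threshold is learned from labeled data.
def learn_threshold(scores, labels):
    """Return the candidate threshold that classifies the most examples correctly."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(scores)):
        correct = sum((s >= candidate) == y for s, y in zip(scores, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Hypothetical training data: exam scores and whether the student passed.
scores = [20, 35, 42, 55, 61, 78]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(scores, labels)
print("learned threshold:", threshold)
```

If the labeled data changes, the learned threshold changes with it, whereas the hand-written rule stays fixed until someone edits the code.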


0.6 Usage of Machine Learning

Why Machine Learning is Used

  • Handles large amounts of data

  • Works with structured and unstructured data

  • Learns automatically without explicit rules

  • Improves performance over time

When Machine Learning Should NOT Be Used

  • Very small datasets

  • Simple rule-based problems

  • No clear objective

Real-World ML Systems Overview

  • Recommendation systems

  • Fraud detection systems

  • Quality control systems

  • Autonomous decision systems


1. Introduction to Machine Learning


1.1 Basics

Definition & Concept

  • Definition of Machine Learning

  • How machines learn from data

  • Learning from historical data

  • Predictive capability

  • Examples of ML in daily life

ML Models

  • Trained using data

  • Based on probability, statistics, and linear algebra


1.2 Data in Machine Learning

Why ML handles Data

  • Handles large amounts of data

  • Improves performance with experience

Real-World Data Examples

  • Structured data (Rows, columns, databases)

  • Unstructured data (Text, images, audio)

  • E-commerce data (Sales reports)

  • Customer datasets (Age, Gender, Location)


2. Types of Machine Learning & Algorithms


2.1 Supervised Learning

Overview

  • Definition: Uses labeled data where input and output are known

  • Target variable is known

2.1.1 Regression

  • Definition: Predicts continuous values

  • Examples: House price prediction, Temperature prediction, Stock prices

  • Algorithms:

    • Linear Regression

    • Multiple Linear Regression

  • Evaluation Metrics:

    • Mean Squared Error (MSE)

    • R² Score
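
As a rough sketch of how the two regression metrics above are computed, the following plain-Python example fits a straight line by ordinary least squares and then scores it. The data points and helper names are made up for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical data: one input feature and a continuous target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
mse = mean_squared_error(ys, predictions)
r2 = r2_score(ys, predictions)
print(f"slope={slope:.2f} intercept={intercept:.2f} MSE={mse:.4f} R2={r2:.4f}")
```

A lower MSE means smaller errors, and an R² close to 1 means the line explains most of the variation in the target.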

2.1.2 Classification

  • Definition: Predicts categorical values

  • Examples: Spam detection, Disease diagnosis, Pass/Fail

  • Algorithms:

    • Logistic Regression

    • K-Nearest Neighbors (KNN)

    • Decision Tree

    • Random Forest

    • Support Vector Machine (SVM)

    • Naive Bayes

  • Evaluation Metrics:

    • Accuracy

    • Precision

    • Recall

    • F1-Score

    • Confusion Matrix

    • Classification Report
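
The classification metrics above all derive from the four counts of a confusion matrix. A minimal sketch for a binary problem, using hypothetical labels and predictions (1 is treated as the positive class):

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical ground truth and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(tp, fp, fn, tn, accuracy, precision, recall, f1)
```

This sketch assumes at least one predicted positive and one actual positive; otherwise the precision or recall division would need a guard against zero.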


2.2 Unsupervised Learning

Overview

  • Definition: Uses unlabeled data to find hidden patterns

2.2.1 Clustering

  • Definition: Groups similar data points

  • Examples: Customer segmentation, Product grouping

  • Algorithms:

    • K-Means Clustering

    • Hierarchical Clustering

2.2.2 Dimensionality Reduction

  • Definition: Reduces the number of features to aid visualization and improve performance

  • Algorithms:

    • Principal Component Analysis (PCA)


2.3 Semi-Supervised Learning

Overview

  • Uses a combination of labeled and unlabeled data

  • Used when labeling is expensive


2.4 Reinforcement Learning

Overview

  • Learns using trial and error

Components

  • Agent

  • Environment

  • Actions

  • Rewards

  • Policy

Applications

  • Game playing

  • Robotics

  • Autonomous systems


2.5 Comparison of ML Types

Analysis

  • Differences between supervised, unsupervised, semi-supervised, and reinforcement learning

  • Use-cases for each ML type


3. Applications of Machine Learning


3.1 Industry Use-Cases

Key Areas

  • Recommendation systems

  • Image recognition

  • Speech recognition

  • Natural Language Processing (NLP)

  • Fraud detection

  • Healthcare

  • Manufacturing & quality control

  • Autonomous systems

  • Chatbots & virtual assistants


4. Machine Learning Workflow


4.1 End-to-End ML Pipeline

Steps

  1. Data Collection

  2. Data Pre-processing

    • Cleaning the data after collecting it

      • Handling missing values

      • Removing duplicate values

      • Handling other anomalies such as skewed data, outliers, noise, etc.

  3. Exploratory Data Analysis (EDA)

    • Understanding and studying the data

    • Gaining strong knowledge about the dataset

    • Analyzing data distributions, relationships, and patterns

  4. Feature Engineering

    • Creating new columns (features) in the dataset where required

    • Feature Encoding:

      • The data may contain categorical values (for example, strings). These must be converted into numerical values.

      • Many encoding methods are available for this conversion

        • One-Hot Encoding

        • Dummy Encoding

        • Label Encoding

        • etc.

  5. Feature Selection

    • The dataset may contain many unnecessary or unwanted columns

      • Feature selection is the process of selecting only the necessary columns

      • Many machine learning algorithms are available to perform feature selection

  6. Split into Training and Testing Sets

    • A common split uses 80% of the data for training

    • The remaining 20% of the data is held out for testing

    • The same data must not be used for both training and testing

  7. Feature Scaling

    • Features may be measured in different units or ranges

      • Bringing these values onto a comparable scale is called feature scaling

    • Techniques used for feature scaling in machine learning include

      • Standard Scaling

      • Min-Max Scaling

      • etc.

  8. Building the Machine Learning Model

    • Machine learning offers many algorithms for tasks such as regression, classification, and clustering

      • Linear Regression (regression)

      • Logistic Regression (classification)

      • K-Means Clustering (clustering)

      • etc.

    • After understanding the problem and task, an appropriate ML algorithm is selected to build the model

  9. Model Evaluation

    • After building the model, it must be tested or evaluated

      • Various model evaluation metrics are used to measure performance

  10. Hyperparameter Tuning

    • If the model performance is not sufficient, it is improved by adjusting the model's hyperparameters or by training on more data

  11. Model Saving

    • Once evaluation confirms that the model performs well, the trained model is saved

      • Machine learning libraries provide methods to save models

  12. Testing with Unseen Data

    • After completing all processes, the model is tested with unseen data

    • Training and evaluation are done using available data. Fresh, new data used for testing is called unseen data

  13. Model Deployment

    • This is the final stage of the machine learning workflow

    • The trained model is implemented in a real-world application
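
Steps 6 and 7 of the pipeline above can be sketched in plain Python. This is a simplified illustration, not a full pipeline: the data, the seed, and the helper names are hypothetical. It assumes the common practice of computing the scaling statistics from the training portion only, so that no information from the test set leaks into training.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a share of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def fit_standard_scaler(values):
    """Return the mean and standard deviation used for standard scaling."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5 or 1.0  # fall back to 1.0 for a constant feature
    return mean, std


def transform(values, mean, std):
    """Apply standard scaling: subtract the mean, divide by the standard deviation."""
    return [(v - mean) / std for v in values]


# Hypothetical dataset with a single numeric feature.
data = [float(v) for v in range(1, 11)]

train, test = train_test_split(data)          # 80% / 20%
mean, std = fit_standard_scaler(train)        # statistics from training data only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)      # test data reuses the training statistics

print(len(train), len(test))
```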


4.2 Data Preprocessing

Tasks

  • Handling missing values

  • Handling outliers

  • Removing duplicates

  • Encoding categorical variables

  • Feature scaling

    • Normalization

    • Standardization
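
The preprocessing tasks above can be illustrated on a tiny table of records. The rows, column names, and the choice of the median as the fill value are assumptions made for this sketch; libraries such as Pandas provide equivalent operations for real datasets.

```python
from statistics import median

# Hypothetical raw records with one missing value and one duplicate row.
rows = [
    {"age": 25, "city": "Chennai"},
    {"age": None, "city": "Delhi"},
    {"age": 40, "city": "Chennai"},
    {"age": 40, "city": "Chennai"},
]

# Removing duplicates while keeping the original order.
seen = set()
unique_rows = []
for row in rows:
    key = (row["age"], row["city"])
    if key not in seen:
        seen.add(key)
        unique_rows.append(dict(row))

# Handling missing values: fill with the median of the known ages.
known_ages = [r["age"] for r in unique_rows if r["age"] is not None]
fill_value = median(known_ages)
for r in unique_rows:
    if r["age"] is None:
        r["age"] = fill_value

# Encoding categorical variables: one 0/1 column per city (one-hot encoding).
categories = sorted({r["city"] for r in unique_rows})
for r in unique_rows:
    for category in categories:
        r[f"city_{category}"] = 1 if r["city"] == category else 0

# Feature scaling by normalization (min-max): map ages onto the range 0 to 1.
ages = [r["age"] for r in unique_rows]
lowest, highest = min(ages), max(ages)
for r in unique_rows:
    r["age_scaled"] = (r["age"] - lowest) / (highest - lowest)

for r in unique_rows:
    print(r)
```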


4.3 Exploratory Data Analysis (EDA)

Techniques

  • Dataset overview

  • Data types and shape

  • Statistical summary

  • Data distribution

  • Central tendency

  • Data spread

  • Correlation analysis

  • Visualization


5. Python Programming for Machine Learning


5.1 Python Basics

Fundamentals

  • Python introduction

  • Variables

  • Keywords

  • Comments

  • Indentation


5.2 Data Types

Types

  • Integer

  • Float

  • String

  • Boolean

  • Type conversion


5.3 Data Structures

Structures

  • List

  • Tuple

  • Set

  • Dictionary

  • Differences between data structures

  • Use-cases of data structures in ML


5.4 Operations on Data Structures

Operations

  • Insert

  • Update

  • Delete

  • Indexing

  • Slicing
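
A short sketch of the operations above on each built-in data structure. The variable names and values are hypothetical and only loosely styled after ML data.

```python
# List: ordered and mutable, often used for feature values.
features = [5.1, 3.5, 1.4]
features.append(0.2)        # insert at the end
features[0] = 5.0           # update by index
del features[1]             # delete by index
first_two = features[:2]    # slicing returns a new list

# Tuple: ordered and immutable, suitable for fixed values such as a dataset shape.
shape = (150, 4)
n_rows = shape[0]           # indexing works, assignment does not

# Set: unordered collection of unique items, useful for distinct labels.
labels = {"setosa", "versicolor", "setosa"}   # the duplicate is dropped
labels.add("virginica")

# Dictionary: key-value pairs, useful for counts or configuration.
counts = {"setosa": 50}
counts["versicolor"] = 50   # insert a new key
counts["setosa"] += 1       # update an existing key
counts.pop("versicolor")    # delete a key

print(features, first_two, n_rows, sorted(labels), counts)
```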


5.5 Operators & Control Flow

Operators

  • Arithmetic operators

  • Relational operators

  • Logical operators

  • Assignment operators

  • Membership operators

Control Statements

  • if

  • if-else

  • elif

  • for loop

  • while loop

  • break

  • continue

  • pass

  • Difference between for loop and while loop


5.6 Functions & IO

Built-in Functions

  • len()

  • sum()

  • min()

  • max()

  • sorted()

  • type()

Input & Output

  • input()

  • print()

  • Formatted output


5.7 Logical Practice Problems

Practice

  • Prime number

  • Palindrome

  • Armstrong number

  • Fibonacci series

  • Factorial

  • Unique elements in list

  • Frequency counting

  • Largest and smallest element

  • Pattern problems

  • Array and string problems
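
Possible solutions to a few of the practice problems above are sketched below. They are one approach among many, and the function names are chosen only for this example.

```python
def is_prime(n):
    """A number is prime if it is at least 2 and has no divisor up to its square root."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def is_palindrome(text):
    """A palindrome reads the same forwards and backwards."""
    return text == text[::-1]


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting from 0."""
    series = []
    a, b = 0, 1
    for _ in range(count):
        series.append(a)
        a, b = b, a + b
    return series


def factorial(n):
    """Product of all integers from 1 to n (1 when n is 0)."""
    result = 1
    for value in range(2, n + 1):
        result *= value
    return result


def frequency(items):
    """Count how often each item occurs."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


print(is_prime(13), is_palindrome("level"), fibonacci(7), factorial(5), frequency("banana"))
```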


6. Python Libraries for Machine Learning


6.1 NumPy

Concepts

  • Arrays

  • Array operations

  • Vectorized operations

  • Mathematical functions
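
A brief sketch of the NumPy concepts above, assuming NumPy is installed. The array contents are hypothetical.

```python
import numpy as np

# Arrays: a one-dimensional array of hypothetical heights in centimetres.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

# Vectorized operations: arithmetic applies to every element without a loop.
heights_m = heights_cm / 100
centered = heights_cm - heights_cm.mean()

# Array operations: reshape a range into 2 rows and 3 columns, then sum each column.
matrix = np.arange(6).reshape(2, 3)
column_sums = matrix.sum(axis=0)

# Mathematical functions operate element-wise as well.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))

print(heights_m, centered, column_sums, roots)
```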


6.2 Pandas

Concepts

  • Series

  • DataFrame

  • Data loading

  • Data cleaning

  • Data manipulation


6.3 Data Visualization

Concepts

  • Matplotlib

  • Seaborn

Plots

  • Line plot

  • Bar plot

  • Histogram

  • Box plot


6.4 Scikit-Learn

Concepts

  • Introduction to scikit-learn

  • Datasets

  • Model training

  • Model prediction

  • Model evaluation
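
The concepts above typically appear together in one short script. The following is a minimal sketch, assuming scikit-learn is installed; the choice of the built-in Iris dataset, logistic regression, the 80/20 split, and the random seed are all illustrative assumptions rather than recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Datasets: load a small built-in dataset as feature matrix X and target vector y.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Model training: fit the classifier on the training portion only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Model prediction: predict labels for rows the model has not seen.
y_pred = model.predict(X_test)

# Model evaluation: compare predictions with the true test labels.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```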


7. Statistics for Machine Learning


7.1 Descriptive Statistics

Measures

  • Mean

  • Median

  • Mode

  • Range

  • Variance

  • Standard deviation

  • Quartiles

  • Interquartile range (IQR)


7.2 Statistical Equations

Formulas

  • Mean formula

  • Variance formula

  • Standard deviation formula

  • Quartile calculation
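
For reference, the formulas above in standard notation, for a dataset of n values x_1, ..., x_n. Both the population form (divide by n) and the sample form (divide by n - 1) of the variance are shown, since either may be meant. Quartile conventions differ slightly between textbooks and libraries, so only the interquartile range is given as a formula.

```latex
% Mean
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

% Variance (population form, dividing by n)
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2

% Variance (sample form, dividing by n - 1)
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

% Standard deviation is the square root of the variance
\sigma = \sqrt{\sigma^2}

% Interquartile range, where Q_1 and Q_3 are the first and third quartiles
\mathrm{IQR} = Q_3 - Q_1
```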


7.3 Usage of Statistics in ML

Applications

  • Mean for normalization

  • Median for outlier handling

  • Variance for feature selection (for example, dropping near-constant features)

  • Standard deviation for scaling

  • Quartiles for data distribution analysis
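
Two of the applications above can be sketched with Python's standard `statistics` module (Python 3.8 or later). The values are hypothetical, and the 1.5 × IQR fences are a common rule of thumb rather than the only possible choice.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical measurements; the last value looks suspicious.
values = [12, 14, 15, 15, 16, 18, 19, 95]

# Quartiles for data distribution analysis and outlier detection.
q1, q2, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# The median is far less affected by the extreme value than the mean.
print("mean:", mean(values), "median:", median(values))
print("outliers:", outliers)

# Standard deviation for scaling: z-scores have mean 0 and standard deviation 1.
mu = mean(values)
sigma = pstdev(values)
z_scores = [(v - mu) / sigma for v in values]
```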


8. Mathematics for Machine Learning


8.1 Vectors

Concepts

  • Vector definition

  • Vector representation

  • Vector addition

  • Scalar multiplication

  • Dot product

  • Vector usage in ML
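
The vector operations above can be written in a few lines of plain Python. The weights and feature values are hypothetical. The final line hints at one use in ML: the prediction of a linear model is the dot product of a weight vector and a feature vector.

```python
def add(u, v):
    """Vector addition: add matching components."""
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Scalar multiplication: multiply every component by c."""
    return [c * a for a in v]


def dot(u, v):
    """Dot product: sum of the products of matching components."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical weight and feature vectors.
weights = [0.5, -1.0, 2.0]
features = [2.0, 1.0, 3.0]

total = add(weights, features)
doubled = scale(2, features)
prediction = dot(weights, features)  # 0.5*2.0 + (-1.0)*1.0 + 2.0*3.0

print(total, doubled, prediction)
```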


8.2 Matrices

Concepts

  • Matrix representation

  • Matrix addition

  • Matrix multiplication

  • Matrix transpose

  • Identity matrix

  • Matrix usage in ML
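
A plain-Python sketch of the matrix operations above, with matrices stored as lists of rows. The example matrix is hypothetical; NumPy would normally be used for real work.

```python
def transpose(m):
    """Swap rows and columns."""
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    """Matrix product: each entry is the dot product of a row of a and a column of b."""
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*b)] for row in a]


def identity(n):
    """Square matrix with 1 on the diagonal and 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def add(a, b):
    """Matrix addition: add matching entries of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical data matrix: 3 samples (rows) with 2 features (columns).
A = [[1, 2], [3, 4], [5, 6]]

A_t = transpose(A)                    # 2 rows, 3 columns
unchanged = matmul(A, identity(2))    # multiplying by the identity returns A
gram = matmul(A_t, A)                 # 2 x 2 matrix of feature dot products
doubled = add(A, A)

print(A_t, unchanged, gram, doubled)
```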


9. Probability for Machine Learning


9.1 Probability Basics

Concepts

  • Probability definition

  • Sample space

  • Events


9.2 Types of Events

Categories

  • Independent events

  • Dependent events

  • Conditional probability


9.3 Probability in Machine Learning

Applications

  • Classification problems

  • Prediction confidence

  • Risk and uncertainty

  • Naive Bayes intuition


9.4 Advanced Probability Topics

Topics

  • Bayes theorem

  • Random variables

  • Probability distributions

  • Normal distribution

  • Binomial distribution
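
A small worked example of Bayes theorem, P(A | B) = P(B | A) · P(A) / P(B). All numbers below are invented for illustration: a condition affecting 1% of people, a test that detects it 90% of the time, and a 5% false-positive rate.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Probability of the condition given a positive test, by Bayes theorem."""
    # P(positive) = P(positive | condition) * P(condition)
    #             + P(positive | no condition) * P(no condition)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical values, not taken from any real test.
p_condition = 0.01          # prior: P(condition)
p_pos_given_cond = 0.90     # likelihood: P(positive | condition)
p_pos_given_healthy = 0.05  # false-positive rate: P(positive | no condition)

p_cond_given_pos = posterior(p_condition, p_pos_given_cond, p_pos_given_healthy)
print(f"P(condition | positive) = {p_cond_given_pos:.3f}")
```

With these made-up numbers the posterior is roughly 15%, far below the 90% detection rate, because the condition is rare to begin with. The same reasoning underlies the Naive Bayes classifier.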


10. Model Evaluation & Optimization


10.1 Evaluation Concepts

Topics

  • Training data vs testing data

  • Overfitting

  • Underfitting

  • Bias vs variance

  • Cross-validation

  • Hyperparameter tuning
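
Cross-validation, listed above, can be sketched as follows: the data is divided into k folds, and each fold serves once as the test set while the remaining folds are used for training. The helper below is a hypothetical plain-Python version that only produces index lists; it does not shuffle the data, which real workflows often do first.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Averaging the evaluation score over all folds gives a steadier estimate than a single train/test split, which helps when comparing hyperparameter settings.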


11. Machine Learning Projects


11.1 Project List

Projects

  • Student performance prediction

  • House price prediction

  • Customer segmentation

  • Spam detection

  • Recommendation system (basic)

  • Quality defect prediction

  • End-to-end ML project workflow


12. MLOps & Deployment (Introductory)


12.1 Deployment Basics

Concepts

  • Model saving and loading

  • Basic deployment concepts

  • ML lifecycle overview

  • Monitoring models
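
Model saving and loading can be illustrated with Python's standard `pickle` module. In this sketch a plain dictionary of parameters stands in for a trained model; the parameter values, the `predict` helper, and the file name are hypothetical. Pickle files should only be loaded from trusted sources, because loading one can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: the parameters of a fitted line.
model = {"slope": 1.99, "intercept": 0.05}


def predict(params, x):
    """Apply the stored linear model to a new input."""
    return params["slope"] * x + params["intercept"]


# Saving: serialize the model to a file.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as file:
    pickle.dump(model, file)

# Loading: restore the model later, for example inside a deployed application.
with open(path, "rb") as file:
    loaded_model = pickle.load(file)

print(predict(loaded_model, 2.0))
```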


13. Ethics & Responsible AI


13.1 Responsible AI

Topics

  • Bias in ML models

  • Fairness

  • Explainability

  • Privacy concerns
