Introduction to Artificial Intelligence (AI)
Artificial Intelligence (AI) is the field of computer science that focuses on creating systems that can perform tasks that typically require human intelligence. These tasks include problem-solving, learning, decision-making, and understanding natural language. AI is used across many industries to improve efficiency and automate complex processes.
Applications of AI
AI is widely applied in different sectors, including:
1. Healthcare – AI assists in diagnosing diseases, predicting patient outcomes, and automating administrative tasks.
2. Finance – Used for fraud detection, risk assessment, and automated trading.
3. Education – AI-powered chatbots and adaptive learning help personalize education for students.
4. Transportation – Self-driving cars and traffic management systems use AI for navigation and safety.
5. Customer Service – AI chatbots handle customer inquiries efficiently.
6. Entertainment – AI recommends music, movies, and games based on user preferences.
Setting Up Python for AI Projects
To work on AI projects, we need to set up a Python environment. Below are three popular tools:
1. Installing Anaconda
Anaconda is a distribution of Python that includes essential libraries for data science and AI.
- Download Anaconda from [anaconda.com](https://www.anaconda.com/).
- Install it by following the on-screen instructions.
- Open Anaconda Navigator and launch Jupyter Notebook to start coding.
2. Using Jupyter Notebook
Jupyter Notebook is an interactive coding environment commonly used for AI and machine learning.
- It allows you to write and execute Python code in cells.
- You can install additional libraries using commands like:

```python
!pip install numpy pandas
```
3. Google Colab
Google Colab is a cloud-based platform that allows you to run Python code without installing anything on your computer.
- Visit [colab.research.google.com](https://colab.research.google.com/) and sign in with your Google account.
- You can create a new notebook and start coding immediately.
Python Basics for AI
To build AI applications, understanding basic Python concepts is important.
1. Variables and Data Types
Variables store data values, and Python has different data types such as integers, floats, strings, and booleans.

```python
name = "AI Learning"  # string
age = 25              # integer
is_smart = True       # boolean
```
2. Loops
Loops help in executing repetitive tasks.

```python
for i in range(5):
    print("AI is powerful!")
```
3. Functions
Functions are used to organize code into reusable blocks.

```python
def greet(name):
    return f"Hello, {name}!"

print(greet("AI Student"))
```
Introduction to NumPy and Pandas for Data Handling
AI projects involve handling large amounts of data. NumPy and Pandas are Python libraries designed for efficient data processing.
1. NumPy – For numerical computing

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
print(arr * 2)  # Multiply each element by 2
```
2. Pandas – For data analysis and manipulation

```python
import pandas as pd

data = {"Name": ["Alice", "Bob"], "Age": [25, 30]}
df = pd.DataFrame(data)
print(df)
```
These tools help process and analyze data, which is essential for training AI models.
Assignment:
- Write a Python program to analyze simple data (e.g., sales data).
- Create a NumPy array and perform basic operations.
Week 2: Machine Learning Basics & Data Preprocessing
In this week, we will explore the fundamentals of Machine Learning (ML) and learn how to prepare data for building ML models.
1. Introduction to Machine Learning (ML)
Machine Learning is a subset of Artificial Intelligence that allows computers to learn from data and make predictions or decisions without being explicitly programmed. It is widely used in applications such as:
- Fraud detection in banking
- Recommendation systems (Netflix, YouTube)
- Self-driving cars
- Medical diagnosis
2. Types of Machine Learning
There are three main types of Machine Learning:
1. Supervised Learning
- The model learns from labeled data (input-output pairs).
- Example: Predicting house prices based on size, location, and number of rooms.
- Algorithms: Linear Regression, Decision Trees, Neural Networks.
2. Unsupervised Learning
- The model finds patterns in data without labels.
- Example: Customer segmentation in marketing.
- Algorithms: K-Means Clustering, PCA (Principal Component Analysis).
3. Reinforcement Learning
- The model learns by interacting with an environment and receiving rewards.
- Example: Training a robot to walk or play chess.
- Algorithms: Q-Learning, Deep Q-Networks (DQN).
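To make the first two types concrete, here is a minimal sketch (using scikit-learn, with toy numbers made up for illustration): a supervised model trained on labeled pairs, and an unsupervised model that groups the same inputs without ever seeing the labels.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

# Toy data: house sizes (sq ft) and prices (illustrative values only)
X = np.array([[500], [800], [1000], [1500], [2000]])
y = np.array([100, 150, 190, 280, 370])  # labels: prices in $1000s

# Supervised: learn the size -> price mapping from labeled pairs
reg = LinearRegression().fit(X, y)
print(reg.predict([[1200]]))  # estimate the price of an unseen house

# Unsupervised: group the same inputs with no labels at all
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # cluster assignment for each house
```

The regression model cannot train without `y`; K-Means never uses it.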
3. Understanding Datasets (CSV, JSON formats)
Before training a machine learning model, we need to understand how data is stored.
1. CSV (Comma-Separated Values)
A CSV file is a simple text file where data is stored in rows and columns.
Example:

```
Name, Age, Score
Alice, 25, 90
Bob, 30, 85
```

Reading CSV files in Python using Pandas:

```python
import pandas as pd

df = pd.read_csv("data.csv")
print(df.head())  # Display the first 5 rows
```
2. JSON (JavaScript Object Notation)
JSON stores data in a structured format, often used in web applications.
Example:

```json
{
  "students": [
    {"name": "Alice", "age": 25, "score": 90},
    {"name": "Bob", "age": 30, "score": 85}
  ]
}
```

Reading JSON files in Python:

```python
df = pd.read_json("data.json")
print(df)
# Note: pd.read_json suits flat JSON; for nested records like the
# "students" list above, pd.json_normalize flattens them into columns.
```
4. Data Cleaning using Pandas
Raw data often contains errors, missing values, or duplicates. Data cleaning is a crucial step in ML.
1. Handling Missing Values

```python
# Pick one strategy: filling first leaves nothing for dropna to remove
df.fillna(0, inplace=True)   # Replace missing values with 0
df.dropna(inplace=True)      # Or: remove rows with missing values
```

2. Removing Duplicates

```python
df.drop_duplicates(inplace=True)
```

3. Converting Data Types

```python
df["Age"] = df["Age"].astype(int)  # Convert age to integer
```
5. Data Visualization with Matplotlib & Seaborn
Data visualization helps us understand patterns in data.
1. Matplotlib for Basic Charts

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 50]
plt.plot(x, y, marker='o')
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Simple Line Graph")
plt.show()
```

2. Seaborn for Advanced Visualization

```python
import seaborn as sns

sns.histplot(df["Age"], bins=5)
plt.show()
```
This lesson covered the basics of Machine Learning, dataset formats, data cleaning, and visualization.
Assignment:
- Download a dataset (e.g., the Titanic dataset) and clean it using Pandas.
- Create basic charts to visualize the data.
Week 3: Supervised Learning - Regression
This week, we will explore *Regression*, a fundamental technique in Supervised Learning used for predicting continuous values.
1. Introduction to Regression Models
Regression models help predict a numerical outcome based on input features. Common applications include:
- House price prediction (based on location, size, etc.)
- Stock price forecasting
- Sales prediction
Types of Regression Models
1. Linear Regression – Predicts a straight-line relationship between input (X) and output (Y).
2. Multiple Regression – Uses multiple features to make predictions.
3. Polynomial Regression – Fits a curve instead of a straight line.
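Linear regression is built step by step in the next section; for the third type, here is a minimal sketch of polynomial regression (assuming scikit-learn, with toy data invented to follow a quadratic trend):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Toy data following y = x^2 (illustrative values)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 4, 9, 16, 25])

# Degree-2 features expand x into [1, x, x^2]; fitting them linearly yields a curve in x
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
poly_model.fit(X, y)
print(poly_model.predict([[6]]))  # close to 36
```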
2. Linear Regression using Scikit-Learn
Step 1: Import Libraries

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
```

Step 2: Load Dataset

```python
data = pd.read_csv("house_prices.csv")
print(data.head())  # Display first 5 rows
```

Step 3: Preprocess Data

```python
X = data[["Size"]]  # Feature (independent variable)
y = data["Price"]   # Target (dependent variable)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

Step 4: Train Model

```python
model = LinearRegression()
model.fit(X_train, y_train)
```

Step 5: Make Predictions

```python
y_pred = model.predict(X_test)
```
3. Evaluating Regression Models
To measure how well our model performs, we use the following metrics:
1. R² Score (Coefficient of Determination)

```python
from sklearn.metrics import r2_score

print("R² Score:", r2_score(y_test, y_pred))
```

- R² close to 1 = good model
- R² close to 0 = poor model
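For intuition, R² compares the model's squared error against that of always predicting the mean. A minimal hand computation (assuming the `y_test` and `y_pred` arrays from above) that should match `r2_score`:

```python
import numpy as np

y_true = np.asarray(y_test)
ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
print("R² (manual):", 1 - ss_res / ss_tot)
```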
2. Mean Squared Error (MSE)

```python
from sklearn.metrics import mean_squared_error

print("MSE:", mean_squared_error(y_test, y_pred))
```

- Lower MSE = better predictions
Project: House Price Prediction using Linear Regression
Objective:
Build a model that predicts house prices based on size.
Steps:
1. Load and clean data
2. Train a *Linear Regression* model
3. Evaluate performance using *R² Score & MSE*
4. Predict prices for new house sizes
Plotting Results

```python
plt.scatter(X_test, y_test, color='red', label="Actual Prices")
plt.plot(X_test, y_pred, color='blue', label="Predicted Prices")
plt.xlabel("House Size (sq ft)")
plt.ylabel("Price ($)")
plt.legend()
plt.show()
```

This lesson covered Linear Regression, model training, and evaluation.
Assignment:
- Train a Linear Regression model on house price data.
- Evaluate model accuracy and improve it.
Week 4: Supervised Learning - Classification
This week, we will explore Classification, a key technique in Supervised Learning used for predicting categories.
1. What is Classification?
Classification is a machine learning task where the goal is to categorize data into predefined groups. Examples include:
- Spam Detection: Classifying emails as spam or not spam.
- Medical Diagnosis: Identifying diseases based on symptoms.
- Sentiment Analysis: Classifying text as positive, neutral, or negative.
Common Classification Algorithms:
1. Logistic Regression – Used for binary classification (e.g., spam vs. non-spam).
2. Decision Trees – Uses tree structures to make decisions.
3. Random Forest – An ensemble of multiple decision trees for better accuracy.
2. Logistic Regression, Decision Trees, and Random Forest
Logistic Regression
- Used when the target variable has two classes (e.g., spam vs. not spam).
- Uses the *sigmoid function* to predict probabilities.
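As a quick illustration, the sigmoid squashes any real-valued score into the (0, 1) range, which is what lets logistic regression output probabilities (a minimal NumPy sketch):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # maps any real number into (0, 1)

print(sigmoid(np.array([-4.0, 0.0, 4.0])))  # ≈ [0.018, 0.5, 0.982]
```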
Decision Trees
- Splits data based on feature conditions.
- Easy to interpret but may overfit the data.
Random Forest
- Uses multiple decision trees and averages their predictions.
- More accurate and less prone to overfitting than a single decision tree.
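To compare all three on real numbers, here is a minimal sketch (assuming scikit-learn and its built-in breast-cancer toy dataset, chosen purely for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))
```

On most splits the random forest edges out the single decision tree, matching the intuition above.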
3. Implementing a Spam Email Classifier
We will use Scikit-Learn to build a spam classifier using the Naïve Bayes algorithm, a common choice for text classification.
Step 1: Import Libraries

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, precision_score, recall_score
```

Step 2: Load Dataset

```python
# Load the dataset (example dataset with 'text' and 'label' columns)
data = pd.read_csv("spam.csv")
print(data.head())
```
Step 3: Data Preprocessing

```python
X = data["text"]  # Features (email content)
y = data["label"].map({"spam": 1, "ham": 0})  # Convert labels to numerical values

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

Step 4: Convert Text to Features

```python
vectorizer = TfidfVectorizer(stop_words="english")
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
```

Step 5: Train the Model

```python
model = MultinomialNB()
model.fit(X_train_tfidf, y_train)
```

Step 6: Make Predictions

```python
y_pred = model.predict(X_test_tfidf)
```
4. Model Evaluation: Accuracy, Precision, Recall
1. Accuracy – Measures overall correctness.

```python
print("Accuracy:", accuracy_score(y_test, y_pred))
```

2. Precision – Measures how many predicted spam emails are actually spam.

```python
print("Precision:", precision_score(y_test, y_pred))
```

3. Recall – Measures how many actual spam emails were correctly classified.

```python
print("Recall:", recall_score(y_test, y_pred))
```
Conclusion
This lesson covered:
✅ *Classification & its applications*
✅ *Logistic Regression, Decision Trees, Random Forest*
✅ *Spam Email Classifier using Naïve Bayes*
✅ *Model evaluation with Accuracy, Precision, and Recall*
Assignment:
- Train a spam classifier using the SMS Spam dataset.
- Compare the accuracy of different models (Logistic Regression vs. Random Forest).
Week 5: Unsupervised Learning - Clustering & NLP
This week, we will explore Unsupervised Learning, focusing on Clustering and Natural Language Processing (NLP).
1. What is Unsupervised Learning?
Unsupervised learning is a type of machine learning where the algorithm finds patterns in data without labeled outputs.
Example Applications:
- Customer Segmentation: Grouping similar customers based on behavior.
- Anomaly Detection: Identifying fraud in financial transactions.
- Document Clustering: Organizing news articles into topics.
2. Clustering Techniques
K-Means Clustering
- A popular algorithm that groups data points into K clusters.
- Each data point is assigned to the nearest cluster center.
- Used in: market segmentation, image compression.
Implementation in Python:

```python
from sklearn.cluster import KMeans
import numpy as np

# Sample data
data = np.array([[1, 2], [1, 4], [1, 0],
                 [10, 2], [10, 4], [10, 0]])

# Apply K-Means with 2 clusters
kmeans = KMeans(n_clusters=2, random_state=0).fit(data)
print("Cluster Centers:", kmeans.cluster_centers_)
print("Labels:", kmeans.labels_)
```
Hierarchical Clustering
- Creates a tree-like structure of clusters.
- Does not require specifying the number of clusters beforehand.
Implementation in Python:

```python
from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt

# Perform hierarchical clustering (reusing `data` from the K-Means example)
linked = linkage(data, method='ward')

# Plot the dendrogram
plt.figure(figsize=(8, 5))
dendrogram(linked)
plt.show()
```
3. Natural Language Processing (NLP) Basics
NLP enables machines to understand, interpret, and generate human language.
*Common NLP tasks:*
✅ *Text Classification* (spam detection)
✅ *Named Entity Recognition* (identifying names and places in text)
✅ *Sentiment Analysis* (determining the emotion behind text)
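Text classification and sentiment analysis are demonstrated below; for Named Entity Recognition, here is a minimal sketch using spaCy (a library not otherwise used in this lesson, shown only as an illustration; it assumes `pip install spacy` plus the `en_core_web_sm` model):

```python
import spacy

# One-time setup: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("Ada Lovelace worked with Charles Babbage in London.")
for ent in doc.ents:
    print(ent.text, "->", ent.label_)  # e.g., PERSON, GPE (location)
```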
Preprocessing Text Data with NLP

```python
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import string

nltk.download('punkt')
nltk.download('stopwords')

text = "Natural Language Processing (NLP) is amazing!"
tokens = word_tokenize(text.lower())  # Tokenization
filtered_words = [word for word in tokens
                  if word not in stopwords.words('english')
                  and word not in string.punctuation]
print("Processed Text:", filtered_words)
```
4. Sentiment Analysis using NLP
We will analyze *Twitter data* to classify sentiments as *positive, negative, or neutral*.
Implementation in Python:

```python
from textblob import TextBlob

# Sample tweets
tweets = ["I love Python!", "This is so frustrating.", "I am feeling okay today."]

# Perform sentiment analysis
for tweet in tweets:
    sentiment = TextBlob(tweet).sentiment.polarity
    if sentiment > 0:
        print(f"'{tweet}' - Positive 😊")
    elif sentiment < 0:
        print(f"'{tweet}' - Negative 😠")
    else:
        print(f"'{tweet}' - Neutral 😐")
```
Project: Twitter Sentiment Analysis
Step 1: Install Required Libraries

```bash
pip install tweepy textblob
```

Step 2: Authenticate with the Twitter API

```python
import tweepy

# Set up API keys (get these from the Twitter Developer Portal)
api_key = "your_api_key"
api_secret = "your_api_secret"
access_token = "your_access_token"
access_secret = "your_access_secret"

auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
```
Step 3: Fetch and Analyze Tweets

```python
public_tweets = api.search_tweets(q="AI", count=10)  # Search for "AI" tweets

for tweet in public_tweets:
    analysis = TextBlob(tweet.text)
    sentiment = ("Positive" if analysis.sentiment.polarity > 0
                 else "Negative" if analysis.sentiment.polarity < 0
                 else "Neutral")
    print(f"Tweet: {tweet.text}\nSentiment: {sentiment}\n")
```
Conclusion
This week, we learned:
✅ *Clustering with K-Means & Hierarchical Clustering*
✅ *NLP Basics & Sentiment Analysis*
✅ *Twitter Sentiment Analysis Project*
Assignment:
- Use NLP to classify tweets as positive or negative.
- Visualize sentiment trends using word clouds.
Week 6: Deep Learning & Neural Networks
This week, we will dive into *Deep Learning* and explore how *Neural Networks* work. We will also implement a project on *Handwritten Digit Recognition* using *TensorFlow and Keras*.
1. What is Deep Learning?
Deep Learning is a subset of *Machine Learning* that uses *Artificial Neural Networks (ANNs)* to learn from large amounts of data.
Key Features of Deep Learning:
✅ *Learns from raw data* (images, text, audio)
✅ *Reduces the need for manual feature engineering*
✅ *Performs well on complex tasks* like image recognition and NLP
2. Building a Simple Neural Network
We will use *TensorFlow* and *Keras* to create a *basic neural network*.
Step 1: Install Required Libraries

```bash
pip install tensorflow keras numpy matplotlib
```

Step 2: Create a Neural Network

```python
import tensorflow as tf
from tensorflow import keras
import numpy as np

# Sample dataset (X: inputs, Y: outputs)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
Y = np.array([[0], [1], [1], [0]], dtype=np.float32)  # XOR logic

# Define the model
model = keras.Sequential([
    keras.layers.Dense(4, activation='relu', input_shape=(2,)),
    keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, Y, epochs=100, verbose=1)

# Test the model
predictions = model.predict(X)
print("Predictions:", predictions)
```
3. Convolutional Neural Networks (CNNs)
CNNs are a special type of neural network designed for *image recognition*.
*Key Components of CNNs:*
✅ *Convolution Layer* – Extracts features from images
✅ *Pooling Layer* – Reduces the size of the feature maps
✅ *Fully Connected Layer* – Makes final predictions
CNN Architecture Example

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define CNN model
cnn_model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile model
cnn_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
cnn_model.summary()  # summary() prints the architecture itself; no print() needed
```
4. Project: Handwritten Digit Recognition (MNIST Dataset)
We will build a CNN model to recognize handwritten digits from the *MNIST dataset*.
Step 1: Load the MNIST Dataset

```python
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt

# Load dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Display a sample image
plt.imshow(X_train[0], cmap='gray')
plt.show()
```

Step 2: Preprocess the Data

```python
# Normalize pixel values
X_train = X_train / 255.0
X_test = X_test / 255.0

# Reshape for CNN input
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)
```
Step 3: Train the CNN Model

```python
cnn_model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))
```

Step 4: Evaluate and Test

```python
test_loss, test_acc = cnn_model.evaluate(X_test, y_test)
print("Test Accuracy:", test_acc)

# Predict a sample image
import numpy as np
sample = np.expand_dims(X_test[0], axis=0)
prediction = np.argmax(cnn_model.predict(sample))
print("Predicted Label:", prediction)
```
Conclusion
This week, we learned:
✅ *How Neural Networks work*
✅ *Building a simple ANN with Keras*
✅ *Understanding CNNs for image classification*
✅ *Handwritten Digit Recognition with MNIST*
Assignment:
- Train a CNN model to recognize handwritten digits.
- Test your model with new images.
Final Project Ideas (Choose One)
✅ Chatbot using NLP
✅ Face Recognition System
✅ Movie Recommendation System
✅ Stock Market Price Prediction