AI & ML Lab Manual1_BCE.pdf artificial intelligence

KaviShetty 7 views 29 slides Feb 13, 2025


Artificial Intelligence and Machine Learning Laboratory

Computer Science and Engineering

Bahubali College of Engineering, Shravanabelagola




















Department of Computer Science and Engineering


Lab Manual

Artificial Intelligence and Machine Learning Laboratory


Subject Code: 18CSL76


CONTENTS



Sl.No  Experiments
1      Implement A* algorithm.
2      Implement AO* algorithm.
3      For a given set of training data examples stored in a .CSV file, implement and demonstrate the
       Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent
       with the training examples.
4      Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an
       appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
5      Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same
       using appropriate data sets.
6      Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a
       .CSV file. Compute the accuracy of the classifier, considering few test data sets.
7      Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering
       using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of
       clustering. You can add Java/Python ML library classes/API in the program.
8      Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both
       correct and wrong predictions. Java/Python ML library classes can be used for this problem.
9      Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select
       appropriate data set for your experiment and draw graphs.


1. Implement A* Search algorithm.

def aStarAlgo(start_node, stop_node):
    open_set = {start_node}  # discovered nodes that are not yet expanded
    closed_set = set()       # nodes that have already been expanded
    g = {}                   # g[v]: cost of the best known path from start_node to v
    parents = {}             # parents[v]: predecessor of v on that best known path

    # distance of the starting node from itself is zero
    g[start_node] = 0
    # start_node is the root node, i.e. it has no parent,
    # so start_node is set as its own parent
    parents[start_node] = start_node

    while len(open_set) > 0:
        n = None

        # find the open node with the lowest f(v) = g(v) + heuristic(v)
        for v in open_set:
            if n is None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v

        if n == stop_node or get_neighbors(n) is None:
            pass
        else:
            for (m, weight) in get_neighbors(n):
                # a node m that is in neither the open set nor the closed set
                # is added to the open set, with n as its parent
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight

                # otherwise compare the known distance g(m) with the
                # distance from the start to m through n
                else:
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n

                        # if m is in the closed set, move it back to the open set
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)

        if n is None:
            print('Path does not exist!')
            return None

        # if the current node is the stop_node,
        # reconstruct the path from it back to the start_node
        if n == stop_node:
            path = []

            while parents[n] != n:
                path.append(n)
                n = parents[n]

            path.append(start_node)
            path.reverse()

            print('Path found: {}'.format(path))
            return path

        # remove n from the open set and add it to the closed set,
        # because all of its neighbors have been inspected
        open_set.remove(n)
        closed_set.add(n)

    print('Path does not exist!')
    return None

# function to return the neighbors of the passed node and their distances
def get_neighbors(v):
    if v in Graph_nodes:
        return Graph_nodes[v]
    else:
        return None

# for simplicity the heuristic distances are given,
# and this function returns the heuristic distance of a node
def heuristic(n):
    H_dist = {
        'A': 11,
        'B': 6,
        'C': 99,
        'D': 1,
        'E': 7,
        'G': 0,
    }
    return H_dist[n]

# Describe your graph here
Graph_nodes = {
    'A': [('B', 2), ('E', 3)],
    'B': [('C', 1), ('G', 9)],
    'C': None,
    'E': [('D', 6)],
    'D': [('G', 1)],
}
aStarAlgo('A', 'G')
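The program above scans the whole open set on every iteration to find the node with the lowest f value. As a minimal sketch of an alternative, the same search can keep the open nodes in a priority queue from the standard heapq module. The function name a_star and the variables graph and h are illustrative names chosen here; the graph and heuristic values are copied from the program above.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (path, cost) from start to goal, or (None, inf) if no path exists."""
    # Heap entries are (f, g, node, path); heappop returns the smallest f first.
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost from start to each node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float('inf')):
            continue  # stale entry: a cheaper route to this node was pushed later
        for neighbor, weight in graph.get(node) or []:
            new_g = g + weight
            if new_g < best_g.get(neighbor, float('inf')):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h[neighbor], new_g, neighbor, path + [neighbor]),
                )
    return None, float('inf')

# Same graph and heuristic values as in the lab program.
graph = {
    'A': [('B', 2), ('E', 3)],
    'B': [('C', 1), ('G', 9)],
    'C': None,
    'E': [('D', 6)],
    'D': [('G', 1)],
}
h = {'A': 11, 'B': 6, 'C': 99, 'D': 1, 'E': 7, 'G': 0}

path, cost = a_star(graph, h, 'A', 'G')
print(path, cost)
```

Storing the path inside each heap entry keeps the sketch short; a parents dictionary, as in the lab program, uses less memory on larger graphs.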


2. Implement AO* Search algorithm.

class Graph:
    def __init__(self, graph, heuristicNodeList, startNode):
        # instantiate graph object with graph topology, heuristic values, start node
        self.graph = graph
        self.H = heuristicNodeList
        self.start = startNode
        self.parent = {}
        self.status = {}
        self.solutionGraph = {}

    def applyAOStar(self):  # starts a recursive AO* algorithm
        self.aoStar(self.start, False)

    def getNeighbors(self, v):  # gets the neighbors of a given node
        return self.graph.get(v, '')

    def getStatus(self, v):  # return the status of a given node
        return self.status.get(v, 0)

    def setStatus(self, v, val):  # set the status of a given node
        self.status[v] = val

    def getHeuristicNodeValue(self, n):
        return self.H.get(n, 0)  # always return the heuristic value of a given node

    def setHeuristicNodeValue(self, n, value):
        self.H[n] = value  # set the revised heuristic value of a given node

    def printSolution(self):
        print("FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE START NODE:", self.start)
        print("------------------------------------------------------------")
        print(self.solutionGraph)
        print("------------------------------------------------------------")

    def computeMinimumCostChildNodes(self, v):
        # computes the minimum cost of the child nodes of a given node v
        minimumCost = 0
        costToChildNodeListDict = {}
        costToChildNodeListDict[minimumCost] = []
        flag = True
        for nodeInfoTupleList in self.getNeighbors(v):  # iterate over all the sets of child node/s
            cost = 0
            nodeList = []
            for c, weight in nodeInfoTupleList:
                cost = cost + self.getHeuristicNodeValue(c) + weight
                nodeList.append(c)

            if flag:
                # initialize minimum cost with the cost of the first set of child node/s
                minimumCost = cost
                costToChildNodeListDict[minimumCost] = nodeList  # set the minimum cost child node/s
                flag = False
            else:
                # compare the current cost with the minimum cost found so far
                if minimumCost > cost:
                    minimumCost = cost
                    costToChildNodeListDict[minimumCost] = nodeList  # set the minimum cost child node/s

        # return minimum cost and minimum cost child node/s
        return minimumCost, costToChildNodeListDict[minimumCost]

    def aoStar(self, v, backTracking):
        # AO* algorithm for a start node and a backTracking status flag
        print("HEURISTIC VALUES :", self.H)
        print("SOLUTION GRAPH :", self.solutionGraph)
        print("PROCESSING NODE :", v)
        print("-----------------------------------------------------------------------------------------")

        if self.getStatus(v) >= 0:  # if status of node v >= 0, compute minimum cost nodes of v
            minimumCost, childNodeList = self.computeMinimumCostChildNodes(v)
            self.setHeuristicNodeValue(v, minimumCost)
            self.setStatus(v, len(childNodeList))

            solved = True  # check whether the minimum cost nodes of v are solved
            for childNode in childNodeList:
                self.parent[childNode] = v
                if self.getStatus(childNode) != -1:
                    solved = False

            if solved:
                # if the minimum cost nodes of v are solved, set the current node status as solved (-1)
                self.setStatus(v, -1)
                # update the solution graph with the solved nodes which may be a part of the solution
                self.solutionGraph[v] = childNodeList

            if v != self.start:
                # the current node is not the start node, so backtrack the current node
                # value to its parent with the backtracking status set to True
                self.aoStar(self.parent[v], True)

            if not backTracking:  # the current call is not for backtracking
                for childNode in childNodeList:  # for each minimum cost child node
                    self.setStatus(childNode, 0)  # set the status of child node to 0 (needs exploration)
                    # the minimum cost child node is further explored with backtracking status False
                    self.aoStar(childNode, False)


h1 = {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1, 'T': 3}
graph1 = {
    'A': [[('B', 1), ('C', 1)], [('D', 1)]],
    'B': [[('G', 1)], [('H', 1)]],
    'C': [[('J', 1)]],
    'D': [[('E', 1), ('F', 1)]],
    'G': [[('I', 1)]]
}
G1 = Graph(graph1, h1, 'A')
G1.applyAOStar()
G1.printSolution()

# h2 = {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}  # Heuristic values of Nodes
# graph2 = {                                    # Graph of Nodes and Edges
#     'A': [[('B', 1), ('C', 1)], [('D', 1)]],  # Neighbors of Node 'A', B, C & D with respective weights
#     'B': [[('G', 1)], [('H', 1)]],            # Neighbors are included in a list of lists
#     'D': [[('E', 1), ('F', 1)]]               # Each sublist indicates an "OR" node or "AND" nodes
# }

# G2 = Graph(graph2, h2, 'A')  # Instantiate Graph object with graph, heuristic values and start Node
# G2.applyAOStar()             # Run the AO* algorithm
# G2.printSolution()           # Print the solution graph as output of the AO* algorithm search
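The cost that computeMinimumCostChildNodes assigns to one outgoing arc is the sum, over the children on that arc, of the child's heuristic value plus the edge weight. An AND arc lists several children and an OR arc lists one. As a minimal sketch of that arithmetic for the start node, using the values of h1 and graph1 from the program above (arc_costs is a hypothetical helper name, not part of the lab program):

```python
def arc_costs(graph, h, node):
    """Return a list of (cost, children) pairs, one per outgoing arc of node."""
    costs = []
    for arc in graph.get(node, []):
        # An arc is a list of (child, weight) tuples; all of them must be solved together.
        cost = sum(h[child] + weight for child, weight in arc)
        costs.append((cost, [child for child, _ in arc]))
    return costs

h1 = {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1, 'T': 3}
graph1 = {
    'A': [[('B', 1), ('C', 1)], [('D', 1)]],
    'B': [[('G', 1)], [('H', 1)]],
    'C': [[('J', 1)]],
    'D': [[('E', 1), ('F', 1)]],
    'G': [[('I', 1)]],
}

costs_at_A = arc_costs(graph1, h1, 'A')
print(costs_at_A)       # [(10, ['B', 'C']), (13, ['D'])]
print(min(costs_at_A))  # (10, ['B', 'C'])
```

With the initial heuristic values the AND arc to B and C costs (6 + 1) + (2 + 1) = 10 and the OR arc to D costs 12 + 1 = 13, so the first expansion of A follows the arc to B and C. The values change as AO* revises the heuristics during backtracking.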



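3. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of the
set of all hypotheses consistent with the training examples.

import numpy as np
import pandas as pd

# Note: read_csv treats the first row of the file as column headers.
# P2.csv has no header row, so pass header=None to keep that row as a training example.
data = pd.DataFrame(data=pd.read_csv('P2.csv'))

Concepts = np.array(data.iloc[:, 0:-1])  # attribute values
target = np.array(data.iloc[:, -1])      # class labels ("yes"/"no")

def learn(Concepts, target):
    # start with the first example as the most specific hypothesis
    specific_h = Concepts[0].copy()
    # one fully general hypothesis per attribute
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]

    for i, h in enumerate(Concepts):
        if target[i] == "yes":
            # positive example: generalize specific_h where it disagrees
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'

        if target[i] == "no":
            # negative example: specialize general_h where the example disagrees with specific_h
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'

    # drop the hypotheses that are still fully general
    fully_general = ['?'] * len(specific_h)
    indices = [i for i, val in enumerate(general_h) if val == fully_general]
    for i in indices:
        general_h.remove(fully_general)

    return specific_h, general_h

s_final, g_final = learn(Concepts, target)

print("final S:", s_final)
print("final G:", g_final)

data.head()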


P2.csv:
sunny,warm,normal,strong,warm,same,yes
sunny,warm,high,strong,warm,same,yes
rainy,cold,high,strong,warm,change,no
sunny,warm,normal,strong,cool,change,yes

OUTPUT:
final S: ['sunny' 'warm' '?' 'strong' '?' '?']
final G: [['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]

sunny  warm  normal  strong  warm.1  same    yes
sunny  warm  high    strong  warm    same    yes
rainy  cold  high    strong  warm    change  no
sunny  warm  normal  strong  cool    change  yes
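A hypothesis is consistent with the training data when it covers every positive example and no negative example. As a minimal sketch, the check below applies that definition to the final S and final G shown in the output, using the four rows of P2.csv. The helper names covers and consistent are illustrative and not part of the lab program.

```python
def covers(hypothesis, example):
    """True if every constraint is '?' or equals the example's attribute value."""
    return all(h == '?' or h == x for h, x in zip(hypothesis, example))

def consistent(hypothesis, examples):
    """True if the hypothesis covers exactly the examples labelled 'yes'."""
    return all(covers(hypothesis, x) == (label == 'yes') for x, label in examples)

# The four rows of P2.csv, split into attributes and label.
examples = [
    (['sunny', 'warm', 'normal', 'strong', 'warm', 'same'], 'yes'),
    (['sunny', 'warm', 'high', 'strong', 'warm', 'same'], 'yes'),
    (['rainy', 'cold', 'high', 'strong', 'warm', 'change'], 'no'),
    (['sunny', 'warm', 'normal', 'strong', 'cool', 'change'], 'yes'),
]

final_S = ['sunny', 'warm', '?', 'strong', '?', '?']
final_G = [['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]

results = [consistent(h, examples) for h in [final_S] + final_G]
print(results)  # [True, True, True]
```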


4. Write a program to demonstrate the working of the decision tree based ID3
algorithm. Use an appropriate data set for building the decision tree and apply
this knowledge to classify a new sample.

import math
import csv

def load_csv(filename):
    # the first row of the file holds the column headers; blank rows are skipped
    with open(filename, "r") as f:
        dataset = [row for row in csv.reader(f) if row]
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    # split data into one sub-table per distinct value found in column col
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))
    counts = [0] * len(attr)
    r = len(data)
    c = len(data[0])

    for x in range(len(attr)):
        for y in range(r):
            if data[y][col] == attr[x]:
                counts[x] += 1

    for x in range(len(attr)):
        dic[attr[x]] = [[0 for i in range(c)] for j in range(counts[x])]
        pos = 0
        for y in range(r):
            if data[y][col] == attr[x]:
                if delete:
                    del data[y][col]
                dic[attr[x]][pos] = data[y]
                pos += 1
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:
        return 0

    # the target is assumed to take two values (e.g. yes/no)
    counts = [0, 0]
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)

    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attValues, dic = subtables(data, col, delete=False)
    total_entropy = entropy([row[-1] for row in data])

    for x in range(len(attValues)):
        # weight the entropy of each sub-table by its share of the rows
        ratio = len(dic[attValues[x]]) / (len(data) * 1.0)
        entro = entropy([row[-1] for row in dic[attValues[x]]])
        total_entropy -= ratio * entro
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if len(set(lastcol)) == 1:
        # all rows share one label: create a leaf
        node = Node("")
        node.answer = lastcol[0]
        return node

    n = len(data[0]) - 1
    gains = [0] * n
    for col in range(n):
        gains[col] = compute_gain(data, col)

    # split on the attribute with the highest information gain
    split = gains.index(max(gains))
    node = Node(features[split])
    fea = features[:split] + features[split + 1:]

    attr, dic = subtables(data, split, delete=True)
    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" " * level, node.answer)
        return

    print(" " * level, node.attribute)
    for value, n in node.children:
        print(" " * (level + 1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return

    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

dataset, features = load_csv("dataset.csv")
node = build_tree(dataset, features)

print("the decision tree for the dataset using ID3 Algorithm is ")
print_tree(node, 0)

testdata, features = load_csv("data3_test.csv")
for xtest in testdata:
    print("The test instance : ", xtest)
    print("The predicted label: ", end="")
    classify(node, xtest, features)

dataset.csv
Outlook,Temparature,Humidity,Wind,Target
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no

data3_test.csv
Outlook,Temparature,Humidity,Wind
overcast,cool,normal,strong
sunny,hot,high,strong

OUTPUT:

the decision tree for the dataset using ID3 Algorithm is
 Outlook
  rain
   Wind
    weak
     yes
    strong
     no
  sunny
   Humidity
    normal
     yes
    high
     no
  overcast
   yes

The test instance : ['overcast', 'cool', 'normal', 'strong']
The predicted label: yes

The test instance : ['sunny', 'hot', 'high', 'strong']
The predicted label: no
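ID3 places Outlook at the root because it has the highest information gain on the full data set. As a minimal sketch of that computation, the block below recomputes the entropy of the target column and the gain of each attribute for the rows of dataset.csv. The function names entropy and information_gain are written for this illustration and are independent of the program above.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the target minus the weighted entropy after splitting on column col."""
    gain = entropy([row[-1] for row in rows])
    for value in set(row[col] for row in rows):
        subset = [row[-1] for row in rows if row[col] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

# Rows of dataset.csv without the header line.
rows = [line.split(',') for line in """sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no""".splitlines()]

names = ['Outlook', 'Temparature', 'Humidity', 'Wind']
target_entropy = entropy([row[-1] for row in rows])
gains = {name: information_gain(rows, col) for col, name in enumerate(names)}

print(round(target_entropy, 3))  # 0.94
for name, gain in gains.items():
    print(name, round(gain, 3))
```

The target column holds 9 yes and 5 no labels, which gives an entropy of about 0.940. Outlook has the largest gain, about 0.247, which agrees with the root of the tree in the output.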


5. Build an Artificial Neural Network by implementing the
backpropagation algorithm and test the same using appropriate data
sets.

import numpy as np

x = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
x = x / np.amax(x, axis=0)  # scale each input column by its maximum
y = y / 100                 # scale the targets to the range [0, 1]

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# derivative of the sigmoid, written in terms of the sigmoid output x
def derivatives_sigmoid(x):
    return x * (1 - x)

epoch = 7000  # number of training iterations
lr = 0.1      # learning rate
inputlayer_neurons = 2
hiddenlayer_neurons = 3
output_neurons = 1

# random initial weights and biases
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(x, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    # how much the hidden layer weights contributed to the error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # dot product of the next layer's error and the current layer's output
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += x.T.dot(d_hiddenlayer) * lr
    # the hidden bias is updated in the same way as the output bias
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print("Input:\n", str(x))
print("Actual output:\n", str(y))
print("Predicted output:\n", output)


OUTPUT:
Input:
 [[0.66666667 1.        ]
 [0.33333333 0.55555556]
 [1.         0.66666667]]
Actual output:
 [[0.92]
 [0.86]
 [0.89]]
Predicted output:
 [[0.89480823]
 [0.87861171]
 [0.89560203]]
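The backpropagation step evaluates the sigmoid derivative as x*(1-x), where x is already the output of the sigmoid rather than its input. As a minimal sketch, the identity can be checked numerically against a central finite difference. The function names, the test point z and the step eps are arbitrary choices made for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative_from_output(s):
    """Derivative of the sigmoid expressed through its output s = sigmoid(z)."""
    return s * (1.0 - s)

z = 0.5     # arbitrary input value
eps = 1e-6  # step of the finite difference

# Slope of the sigmoid at z estimated from two nearby function values.
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
# Slope obtained from the closed form, using only the sigmoid output.
analytic = sigmoid_derivative_from_output(sigmoid(z))

print(numeric, analytic)
```

The two values agree to many decimal places, which is why the program can reuse the stored activations output and hlayer_act instead of recomputing the derivative from the pre-activation inputs.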


6. Write a program to implement the naïve Bayesian classifier for a sample training
data set stored as a .CSV file. Compute the accuracy of the classifier,
considering few test data sets.

import csv
import random
import math

def loadCsv(filename):
    # every value in the file, including the class label, is read as a float
    with open(filename, "r") as f:
        dataset = list(csv.reader(f))
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    # move randomly chosen rows into the training set; the rest form the test set
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    # group the rows by their class value (last column)
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    # sample standard deviation (divides by n-1)
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    # (mean, stdev) of every column; the summary of the class column is dropped
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    # Gaussian probability density of x for the given mean and standard deviation
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            x = inputVector[i]
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    # return the class with the largest probability
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testSet))) * 100.0

def main():
    filename = 'data.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))

    # prepare model
    summaries = summarizeByClass(trainingSet)

    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

main()


OUTPUT :

Split 306 rows into train=205 and test=101 rows
Accuracy: 72.27722772277228%
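calculateClassProbabilities multiplies one Gaussian density per attribute, and with many attributes such a product can underflow to zero. A common variation is to add log densities instead, which leaves the winning class unchanged. The block below is a minimal sketch of that variation; the function names, the per-class summaries and the sample are made-up illustration values and are not taken from data.csv.

```python
import math

def gaussian_log_pdf(x, mean, stdev):
    """Natural log of the Gaussian density used by calculateProbability."""
    return -((x - mean) ** 2) / (2 * stdev ** 2) - math.log(stdev * math.sqrt(2 * math.pi))

def class_log_scores(summaries, vector):
    """Sum the per-attribute log densities for every class."""
    return {
        label: sum(gaussian_log_pdf(x, m, s) for x, (m, s) in zip(vector, stats))
        for label, stats in summaries.items()
    }

# Hypothetical (mean, stdev) pairs for two attributes and two classes.
summaries = {
    0.0: [(1.0, 0.5), (10.0, 2.0)],
    1.0: [(3.0, 0.5), (20.0, 2.0)],
}
sample = [2.8, 19.0]

scores = class_log_scores(summaries, sample)
best = max(scores, key=scores.get)
print(scores, best)
```

Because the logarithm is monotonic, the class with the largest sum of log densities is also the class with the largest product of densities.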


6. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to write
the program. Calculate the accuracy, precision, and recall for your data set.

import pandas as pd

msg = pd.read_csv('data6.csv', names=['message', 'label'])  # tabular form data
print('Total instances in the dataset:', msg.shape[0])

msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
Y = msg.labelnum

print('\nThe message and its label of first 5 instances are listed below')
X5, Y5 = X[0:5], Y[0:5]
for x, y in zip(X5, Y5):
    print(x, ',', y)

# Splitting the dataset into train and test data
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(X, Y)
print('\nDataset is split into Training and Testing samples')
print('Total training instances :', xtrain.shape[0])
print('Total testing instances :', xtest.shape[0])

# CountVectorizer performs the feature extraction (one word-count column per word);
# its output is a sparse matrix
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)  # sparse matrix
xtest_dtm = count_vect.transform(xtest)

print('\nTotal features extracted using CountVectorizer:', xtrain_dtm.shape[1])
print('\nFeatures for first 5 training instances are listed below')
# get_feature_names_out() replaces get_feature_names(), which newer scikit-learn releases no longer provide
df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names_out())
print(df[0:5])  # tabular representation
# print(xtrain_dtm)  # same as above but sparse matrix representation

# Training Naive Bayes (NB) classifier on training data.
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

print('\nClassification results of testing samples are given below')
for doc, p in zip(xtest, predicted):
    # map the numeric prediction back to its label;
    # msg.label[p] would look up row p of the data set instead
    pred = 'pos' if p == 1 else 'neg'
    print('%s -> %s ' % (doc, pred))

# printing accuracy metrics
from sklearn import metrics
print('\nAccuracy metrics')
print('Accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('Recall :', metrics.recall_score(ytest, predicted), '\nPrecision :', metrics.precision_score(ytest, predicted))
print('Confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))

data6.csv
I love this sandwich,pos
This is an amazing place,pos
I feel very good about these beers,pos
This is my best work,pos
What an awesome view,pos
I do not like this restaurant,neg
I am tired of this stuff,neg
I can't deal with this,neg
He is my sworn enemy,neg
My boss is horrible,neg
This is an awesome place,pos
I do not like the taste of this juice,neg
I love to dance,pos
I am sick and tired of this place,neg
What a great holiday,pos
That is a bad locality to stay,neg
We will have good fun tomorrow,pos
I went to my enemy's house today,neg

OUTPUT:
Total instances in the dataset: 18

The message and its label of first 5 instances are listed below
I love this sandwich , 1
This is an amazing place , 1
I feel very good about these beers , 1
This is my best work , 1
What an awesome view , 1

Dataset is split into Training and Testing samples
Total training instances : 13
Total testing instances : 5

Total features extracted using CountVectorizer: 41

Features for first 5 training instances are listed below
   about  amazing  an  awesome  beers  best  boss  can  dance  deal  ...  \
0      0        0   0        0      0     0     0    0      0     0  ...
1      0        0   0        0      0     0     0    0      0     0  ...
2      0        0   0        0      0     0     1    0      0     0  ...
3      0        0   0        0      0     0     0    0      0     0  ...
4      0        0   0        0      0     1     0    0      0     0  ...

   these  this  to  today  very  view  went  what  with  work
0      0     1   0      0     0     0     0     0     0     0
1      0     0   0      0     0     0     0     1     0     0
2      0     0   0      0     0     0     0     0     0     0
3      0     0   1      1     0     0     1     0     0     0
4      0     1   0      0     0     0     0     0     0     1

[5 rows x 41 columns]

Classification results of testing samples are given below
I am tired of this stuff -> pos
I am sick and tired of this place -> pos
We will have good fun tomorrow -> pos
I love this sandwich -> pos
That is a bad locality to stay -> pos

Accuracy metrics
Accuracy of the classifier is 0.6
Recall : 1.0
Precision : 0.5
Confusion matrix
[[1 2]
 [0 2]]
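The accuracy, precision and recall in the output follow directly from the confusion matrix. As a minimal sketch, assuming the scikit-learn layout for a binary problem (rows are the actual classes, columns are the predicted classes, class 0 first), the three figures can be recomputed by hand. binary_metrics is a hypothetical helper written for this illustration.

```python
def binary_metrics(cm):
    """Return (accuracy, precision, recall) for cm = [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)  # share of predicted positives that are truly positive
    recall = tp / (tp + fn)     # share of actual positives that were found
    return accuracy, precision, recall

# Confusion matrix from the output above.
result = binary_metrics([[1, 2], [0, 2]])
print(result)  # (0.6, 0.5, 1.0)
```

Here 3 of the 5 test messages are classified correctly (accuracy 0.6), 2 of the 4 messages predicted as positive are really positive (precision 0.5), and both positive messages are found (recall 1.0).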


7: Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data
set for clustering using k-Means algorithm. Compare the results of these two algorithms
and comment on the quality of clustering. You can add Java/Python ML library
classes/API in the program.

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans
import pandas as pd
import numpy as np

# import some data to play with
iris = datasets.load_iris()

X = pd.DataFrame(iris.data)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
y = pd.DataFrame(iris.target)
y.columns = ['Targets']

# Build the K Means Model
model = KMeans(n_clusters=3)
model.fit(X)  # model.labels_ gives the cluster number to which each sample belongs

# Visualise the clustering results
plt.figure(figsize=(14, 14))
colormap = np.array(['red', 'lime', 'black'])

# Plot the Original Classifications using Petal features
plt.subplot(2, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Clusters')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')

# Plot the Models Classifications
plt.subplot(2, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K-Means Clustering')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')

# General EM for GMM
from sklearn import preprocessing

# transform the data so that each feature has
# a mean value of 0 and a standard deviation of 1
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)

from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
gmm_y = gmm.predict(xs)

plt.subplot(2, 2, 3)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[gmm_y], s=40)
plt.title('GMM Clustering')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()

print('Observation: The GMM using EM algorithm based clustering matched the true labels more closely than the Kmeans.')

OUTPUT:

[Figure: three scatter plots of Petal Length against Petal Width, titled 'Real Clusters', 'K-Means Clustering' and 'GMM Clustering']

Observation: The GMM using EM algorithm based clustering matched the true labels more closely than the Kmeans.
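The observation above is read off the plots. Cluster numbers are arbitrary, so cluster 0 need not correspond to class 0, and a numeric comparison first has to find the best relabelling. The block below is a minimal sketch of one such measure. The function best_match_accuracy and the toy label lists are hypothetical, and the sketch assumes there are no more clusters than classes.

```python
from itertools import permutations

def best_match_accuracy(true_labels, cluster_labels):
    """Fraction of samples that agree under the best cluster-to-class relabelling."""
    clusters = sorted(set(cluster_labels))
    classes = sorted(set(true_labels))
    best_hits = 0
    # Try every assignment of distinct classes to the clusters.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t for c, t in zip(cluster_labels, true_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)

# Toy labels: the clustering is right up to renaming, except for one sample.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
cluster_labels = [2, 2, 2, 0, 0, 1, 1, 1, 1]

score = best_match_accuracy(true_labels, cluster_labels)
print(score)
```

In the program above the same function could be called with iris.target and model.labels_ for k-Means, and with iris.target and gmm_y for the Gaussian mixture, to compare the two clusterings with a number instead of by eye.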



8: Write a program to implement k-Nearest Neighbour algorithm to classify the iris
data set. Print both correct and wrong predictions. Java/Python ML library classes can
be used for this problem.

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import datasets

iris = datasets.load_iris()
print("Iris Data set loaded...")

x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.1)
print("Dataset is split into training and testing samples...")
print("Size of training data and its label", x_train.shape, y_train.shape)
print("Size of testing data and its label", x_test.shape, y_test.shape)

for i in range(len(iris.target_names)):
    print("Label", i, "-", str(iris.target_names[i]))

k = 15  # number of neighbours that vote on each prediction
classifier = KNeighborsClassifier(n_neighbors=k)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)

print("Results of Classification using K-nn with K=%d" % k)
for r in range(0, len(x_test)):
    print(" Sample:", str(x_test[r]), " Actual-label:",
          str(y_test[r]), " Predicted-label:", str(y_pred[r]))
print("Classification Accuracy :", classifier.score(x_test, y_test))

OUTPUT:
Iris Data set loaded...
Dataset is split into training and testing samples...
Size of training data and its label (135, 4) (135,)
Size of testing data and its label (15, 4) (15,)
Label 0 - setosa
Label 1 - versicolor
Label 2 - virginica
Results of Classification using K-nn with K=15
Sample: [5.6 2.5 3.9 1.1] Actual-label: 1 Predicted-label: 1
Sample: [4.8 3.4 1.6 0.2] Actual-label: 0 Predicted-label: 0
Sample: [6.9 3.1 5.1 2.3] Actual-label: 2 Predicted-label: 2
Sample: [7.7 3.8 6.7 2.2] Actual-label: 2 Predicted-label: 2
Sample: [4.8 3. 1.4 0.1] Actual-label: 0 Predicted-label: 0
Sample: [5.7 2.6 3.5 1. ] Actual-label: 1 Predicted-label: 1
Sample: [6.7 3.1 4.7 1.5] Actual-label: 1 Predicted-label: 1
Sample: [4.9 3.1 1.5 0.1] Actual-label: 0 Predicted-label: 0
Sample: [4.9 3.1 1.5 0.1] Actual-label: 0 Predicted-label: 0
Sample: [6.3 2.5 4.9 1.5] Actual-label: 1 Predicted-label: 1
Sample: [5.1 3.8 1.5 0.3] Actual-label: 0 Predicted-label: 0
Sample: [5.5 4.2 1.4 0.2] Actual-label: 0 Predicted-label: 0
Sample: [6.4 2.9 4.3 1.3] Actual-label: 1 Predicted-label: 1
Sample: [5.8 4. 1.2 0.2] Actual-label: 0 Predicted-label: 0
Sample: [5.8 2.6 4. 1.2] Actual-label: 1 Predicted-label: 1
Classification Accuracy : 1.0
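The library class hides the voting step. As a minimal sketch of what a k-nearest-neighbour prediction does, assuming Euclidean distance and a simple majority vote, the block below classifies two query points. The function knn_predict and the six training points are made-up illustration data, not iris samples.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points closest to query."""
    # Pair every training label with the distance of its point to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Two small, well separated groups of 2-D points.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = [0, 0, 0, 1, 1, 1]

near_first_group = knn_predict(train_points, train_labels, (1.1, 1.0), k=3)
near_second_group = knn_predict(train_points, train_labels, (5.1, 5.0), k=3)
print(near_first_group, near_second_group)  # 0 1
```

math.dist requires Python 3.8 or later. With k equal to the size of the training set every point votes, so the prediction degenerates to the most frequent label, which is one reason the choice of k matters.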


9: Implement the non-parametric Locally Weighted Regression algorithm in order to
fit data points. Select appropriate data set for your experiment and draw graphs.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    # Gaussian kernel: rows of xmat close to point get a weight near 1
    m, n = np.shape(xmat)
    weights = np.asmatrix(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp((diff * diff.T).item() / (-2.0 * k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    # weighted least squares solution: W = (X^T wei X)^-1 (X^T wei y)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    # fit one local model per data point and predict at that point
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = (xmat[i] * localWeight(xmat[i], xmat, ymat, k)).item()
    return ypred

def graphPlot(X, ypred):
    sortindex = X[:, 1].argsort(0)  # argsort: indices that sort the bill amounts
    xsort = X[sortindex][:, 0]
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.scatter(bill, tip, color='maroon')
    ax.plot(xsort[:, 1], ypred[sortindex], color='green', linewidth=5)
    plt.xlabel('Total bill')
    plt.ylabel('Tip')
    plt.show()

# load data points
data = pd.read_csv('data10_tips.csv')
bill = np.array(data.total_bill)  # we use only the bill amount and the tips data
tip = np.array(data.tip)

mbill = np.asmatrix(bill)  # convert the 1-D array to a 2-D (1 x m) matrix
mtip = np.asmatrix(tip)
m = np.shape(mbill)[1]
one = np.asmatrix(np.ones(m))
X = np.hstack((one.T, mbill.T))  # 244 rows, 2 cols

# increase k to get smoother curves
ypred = localWeightRegression(X, mtip, 9)
graphPlot(X, ypred)


OUTPUT:

[Figure: scatter plot of Tip against Total bill with the fitted locally weighted regression curve]
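For a single input variable the matrix expression in localWeight reduces to a weighted straight-line fit whose 2x2 normal equations can be solved by hand. The block below is a minimal sketch of that special case using only the standard library. The function lwr_predict and the sample points are hypothetical and are not taken from data10_tips.csv.

```python
import math

def lwr_predict(xs, ys, x0, k):
    """Predict y at x0 from a straight line fitted with Gaussian weights around x0."""
    # Same kernel as the lab program: weight = exp(-(x - x0)^2 / (2 k^2)).
    w = [math.exp(-((x - x0) ** 2) / (2.0 * k ** 2)) for x in xs]
    # Weighted sums that make up the 2x2 normal equations.
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x0

# Points that lie exactly on y = 2x + 1, so every local fit recovers that line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]

prediction = lwr_predict(xs, ys, 2.5, k=1.0)
print(prediction)
```

A small k makes the weights fall off quickly, so each prediction depends on only a few nearby points and the curve follows the data closely. A large k approaches an ordinary least squares line, which is why the lab program suggests increasing k to get smoother curves.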
