ML-2 Folder

Overview

The ml-2 folder contains scripts for machine learning tasks, focusing on exploratory data analysis and classification.

File: 5.py

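Description: Performs EDA on the Iris dataset: reports missing values per column, fills numeric gaps with column means, draws a pairplot colored by species, and writes a ydata-profiling HTML report (Iris_EDA_Report.html).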

Dependencies: pandas, seaborn, matplotlib, ydata_profiling

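Code:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from ydata_profiling import ProfileReport

# Load the Iris dataset through seaborn (fetched online on first use, then cached)
data = sns.load_dataset('iris')

# Report the number of missing values per column
print("Missing Values:\n", data.isnull().sum())

# Fill missing numeric values with the mean of their column
numeric_cols = data.select_dtypes(include='number').columns
data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].mean())

# Pairwise scatter plots of the numeric features, colored by species
sns.pairplot(data, hue='species', markers=["o", "s", "D"])
plt.show()

# Write an HTML profiling report to the working directory
profile = ProfileReport(data, title="Iris Dataset EDA Report", explorative=True)
profile.to_file("Iris_EDA_Report.html")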
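
Note: the Iris dataset loaded through seaborn has no missing values, so the mean-imputation step in 5.py leaves the data unchanged. The sketch below shows what the same pattern does on data that does have gaps. The toy frame, its column names, and its values are made up for illustration and do not come from the script.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value per numeric column
toy = pd.DataFrame({
    "length": [1.0, np.nan, 3.0],
    "width": [0.5, 1.5, np.nan],
    "label": ["a", "b", "a"],
})

# Same pattern as 5.py: select numeric columns, fill NaNs with column means
numeric_cols = toy.select_dtypes(include="number").columns
toy[numeric_cols] = toy[numeric_cols].fillna(toy[numeric_cols].mean())

print(toy)
```

The non-numeric label column is skipped by select_dtypes, so only length and width are filled; mean() ignores NaN by default, so each gap receives the mean of the observed values in its column.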
            

File: 6.py

Description: Trains logistic regression and random forest classifiers for binary classification on the Iris dataset. Only classes 0 and 1 are used; samples with target 2 are filtered out.

Dependencies: scikit-learn, matplotlib, seaborn
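
To see what the binary setup amounts to, the following sketch applies the same iris.target != 2 filter that 6.py uses and inspects what remains. The names mask, kept_classes, and kept_names are illustrative and are not part of the script.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Same filter as 6.py: drop every sample labeled 2
mask = iris.target != 2
X, y = iris.data[mask], iris.target[mask]

# Which labels and species names remain after filtering
kept_classes = np.unique(y)
kept_names = [str(iris.target_names[c]) for c in kept_classes]

print(X.shape, kept_classes, kept_names)
```

The filter leaves 100 samples with 4 features each, 50 per class, covering setosa and versicolor.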

Code:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Keep only classes 0 and 1 so the task becomes binary
iris = load_iris()
X = iris.data[iris.target != 2]
y = iris.target[iris.target != 2]

# Hold out 30% of the samples for testing (fixed seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit both classifiers on the training split
log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)