Principal Components Analysis (PCA) Implementation

Applies PCA to the scikit-learn breast cancer dataset to reduce feature dimensionality and visualize principal component structure.

Category
Machine Learning
Completion Date
April 2025
Technologies Used
Python 3 & Jupyter Notebook, NumPy & pandas, Matplotlib
Project File
Downloading is only permitted with permission from Ameen Qahtan. Contact him to get permission.

Project Overview

The notebook loads the Breast Cancer dataset via <code>load_breast_cancer()</code>, scales all features with <code>StandardScaler</code>, then fits <code>PCA</code> to extract the principal components. It examines explained-variance ratios, projects the data onto the first two components for scatter-plot visualization, and discusses how much variance is retained by successive components.
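
The steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `run_pca` and `plot_first_two` are hypothetical, and the choice to keep all components (`n_components=None`) is an assumption made so that the full explained-variance curve is available.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def run_pca(n_components=None):
    """Scale the breast cancer features and fit PCA (hypothetical helper)."""
    data = load_breast_cancer()
    # Standardize each feature to zero mean and unit variance so that
    # features measured on large scales do not dominate the components.
    X_scaled = StandardScaler().fit_transform(data.data)
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(X_scaled)  # samples projected onto the components
    return pca, scores, data.target


def plot_first_two(scores, target):
    """Scatter plot of the first two principal components (hypothetical helper)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(scores[:, 0], scores[:, 1], c=target, s=12)
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.set_title("Breast cancer data projected onto the first two PCs")
    return fig


pca, scores, target = run_pca()

# Fraction of total variance explained by each component, and the running total
# that shows how much variance is retained as components are added.
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)

for i, (r, c) in enumerate(zip(ratios[:5], cumulative[:5]), start=1):
    print(f"PC{i}: explained={r:.3f}, cumulative={c:.3f}")
```

Calling `plot_first_two(scores, target)` in a notebook cell would draw the two-component scatter plot, colored by the dataset's class labels.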
