t-Distributed Stochastic Neighbor Embedding

Definition:

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a non-linear dimensionality reduction technique primarily used for visualizing high-dimensional datasets. It maps high-dimensional data points into a lower-dimensional space (typically 2D or 3D) in a way that preserves the local structure of the data: points that are similar in the high-dimensional space remain close together in the low-dimensional map.

Unlike linear methods such as PCA, t-SNE can capture complex, non-linear relationships, which often makes it effective at revealing clusters and patterns that a linear projection would blur together.
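To make "preserving local structure" concrete, here is a minimal pure-Python sketch of the quantities t-SNE compares: Gaussian similarities between points in the original space, Student-t similarities between points in the map, and the KL divergence between the two. It rests on simplifying assumptions: a single fixed bandwidth `sigma` stands in for the per-point, perplexity-tuned bandwidths the full algorithm uses, no gradient descent is run, and the toy data and all function names are made up for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gaussian_affinities(points, sigma=1.0):
    """Joint similarities P in the original space.

    Simplification: one fixed sigma for every point, instead of a
    per-point bandwidth tuned to match a target perplexity.
    """
    n = len(points)
    cond = []
    for i in range(n):
        weights = [
            0.0 if i == j
            else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        cond.append([w / total for w in weights])
    # Symmetrize the conditional probabilities so that P sums to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]

def student_t_affinities(points):
    """Joint similarities Q in the map, using a heavy-tailed Student-t kernel."""
    n = len(points)
    weights = [
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(points[i], points[j]))
         for j in range(n)]
        for i in range(n)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q): the cost that t-SNE minimizes over map positions."""
    return sum(
        p * math.log(p / max(q, eps))
        for row_p, row_q in zip(P, Q)
        for p, q in zip(row_p, row_q)
        if p > 0
    )

# Toy data (hypothetical): two well-separated groups of three points in 4D.
high_dim = [
    (0.0, 0.0, 0.0, 0.0), (0.5, 0.0, 0.5, 0.0), (0.0, 0.5, 0.0, 0.5),
    (5.0, 5.0, 5.0, 5.0), (5.5, 5.0, 5.5, 5.0), (5.0, 5.5, 5.0, 5.5),
]
# A 2D layout that keeps each group together ...
good_map = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
            (5.0, 5.0), (5.5, 5.0), (5.0, 5.5)]
# ... and one that places neighbors from the original space far apart.
bad_map = [(0.0, 0.0), (5.0, 5.0), (0.0, 0.5),
           (5.5, 5.0), (0.5, 0.0), (5.0, 5.5)]

P = gaussian_affinities(high_dim)
kl_good = kl_divergence(P, student_t_affinities(good_map))
kl_bad = kl_divergence(P, student_t_affinities(bad_map))
print(f"KL for neighbor-preserving map: {kl_good:.3f}")
print(f"KL for scrambled map:           {kl_bad:.3f}")
```

The layout that keeps original neighbors together gets the lower cost. The full algorithm starts from an initial layout and moves the map points by gradient descent to reduce this same cost.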

Why Use t-SNE?

Real-World Example: Handwritten Digits

This is a t-SNE visualization of hundreds of handwritten digits. Each image is a 64-dimensional data point, reduced to 2D. The algorithm places similar-looking digits close together.
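A comparable embedding can be sketched with scikit-learn. This is an assumption-laden example rather than the code behind the plot: it uses scikit-learn's bundled digits dataset (8×8 images, so 64 features each) as a stand-in for the data described here, and the subset size, `perplexity`, `init`, and `random_state` values are illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Load the bundled digits dataset; each row is a flattened 8x8 image.
digits = load_digits()
X = digits.data[:500]    # a few hundred samples keeps the run fast
y = digits.target[:500]  # digit labels, useful for coloring a scatter plot

# Reduce 64 dimensions to 2. Perplexity roughly controls how many
# neighbors each point pays attention to; 30 is a common starting value.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)  # one (x, y) coordinate pair per image
```

Plotting `embedding[:, 0]` against `embedding[:, 1]` and coloring each point by `y` gives a scatter plot in the spirit of the one described above. Results vary with `perplexity` and the random seed, so it is worth trying a few settings before reading much into any single picture.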

t-SNE Visualization Simulation 🧠

This plot simulates the output of a t-SNE algorithm. Notice how distinct classes are **tightly clustered** in the 2D space, demonstrating t-SNE's ability to preserve **local data structure** from high dimensions.