CSU Capstone Project

Explore songs through learned similarity.

This project uses machine learning embeddings and nearest-neighbor search to surface songs that are acoustically or stylistically related. Browse the catalog, search by song title, view spectrograms, and inspect each track’s 20 most similar songs.

Tracks with similarity neighbors: 106,401
Nearest neighbors per track: 20
Embedding model family: PANNs
Similarity retrieval engine: FAISS

About the project

Dataset

The catalog metadata comes from the Free Music Archive. The frontend uses static assets hosted in the music-sim-capstone-data S3 bucket.

Similarity

Tracks are embedded into a shared vector space. A FAISS index is then used to retrieve the 20 nearest neighbors for each track.
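To make the retrieval step concrete, here is a minimal sketch of exact nearest-neighbor search over a handful of toy vectors. It is a plain-Python stand-in for what a similarity index computes, not the project's FAISS code. The function name `top_k_neighbors`, the toy track IDs, and the choice of inner product as the similarity score are all assumptions made for illustration.

```python
import heapq

def top_k_neighbors(query_id, embeddings, k=20):
    """Return the k track IDs most similar to query_id, best first.

    Similarity is the inner product, which equals cosine similarity when
    the vectors are unit length (an assumption of this sketch).
    """
    query = embeddings[query_id]
    scored = (
        (sum(a * b for a, b in zip(query, vec)), track_id)
        for track_id, vec in embeddings.items()
        if track_id != query_id  # a track is not its own neighbor
    )
    return [track_id for _, track_id in heapq.nlargest(k, scored)]

# Toy unit-length embeddings keyed by hypothetical track IDs.
toy = {
    "a": [1.0, 0.0],
    "b": [0.8, 0.6],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}
neighbors = top_k_neighbors("a", toy, k=2)
print(neighbors)  # ['b', 'c']
```

A brute-force scan like this compares the query against every track, which is the work a dedicated index is designed to do efficiently at catalog scale.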

Visualization

Each song page can display a precomputed spectrogram and a short audio clip so that visual patterns and audible texture can be compared side by side.

How it works

Audio features are embedded

Each track is passed through the trained audio pipeline to produce a normalized embedding vector.
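As an illustration of the normalization step, the sketch below scales a vector to unit Euclidean length. The page does not say which normalization the pipeline applies, so L2 normalization is an assumption here, and `l2_normalize` is a hypothetical helper name.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length.

    A zero vector has no direction, so it is returned unchanged.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

embedding = l2_normalize([3.0, 4.0])
print(embedding)  # approximately [0.6, 0.8]
```

With unit-length vectors, the inner product of two embeddings equals their cosine similarity, which is one common reason to normalize before indexing.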

Nearest neighbors are precomputed

For every track, the top 20 closest songs are stored so the application can return results quickly.
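The benefit of precomputing can be sketched as a simple lookup table: the expensive search runs once offline, and serving a request becomes a dictionary access. The table contents, the JSON storage format, and the `similar_tracks` helper are hypothetical; the page does not describe how the project stores its neighbor lists.

```python
import json

# Hypothetical precomputed table: track ID -> neighbor IDs, closest first.
neighbor_table = {
    "track_1": ["track_7", "track_3"],
    "track_3": ["track_1", "track_7"],
    "track_7": ["track_1", "track_3"],
}

# Offline step: serialize the table once.
payload = json.dumps(neighbor_table)

# Request time: load the table and answer queries without any search.
loaded = json.loads(payload)

def similar_tracks(track_id, table, limit=20):
    """Return up to `limit` stored neighbors, or an empty list if unknown."""
    return table.get(track_id, [])[:limit]

print(similar_tracks("track_1", loaded))  # ['track_7', 'track_3']
```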

The browser connects to the API

Static pages call the backend for search, genre listings, track metadata, and similar-song results.
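The request pattern can be sketched by building the URLs a client might call. The base URL, paths, and query parameter name below are invented for illustration, since the page does not document the actual API routes.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.example.com"  # hypothetical base URL

def search_url(title):
    """URL for a title search; `q` is an assumed parameter name."""
    return f"{API_BASE}/search?{urlencode({'q': title})}"

def similar_url(track_id):
    """URL for a track's similar songs; the path layout is assumed."""
    return f"{API_BASE}/tracks/{quote(str(track_id), safe='')}/similar"

print(search_url("blue moon"))  # https://api.example.com/search?q=blue+moon
print(similar_url(42))          # https://api.example.com/tracks/42/similar
```

Encoding user input with `urlencode` and `quote` keeps spaces and special characters in song titles from breaking the request.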