vdb

CUDA K-Means Clustering with GPU Acceleration for High-Dimensional Data

This project uses CUDA-based GPU acceleration to speed up vector database operations, with a focus on index-building performance.


Features


Implementation Details

The core implementation uses CUDA to parallelize key operations such as:


KMeans Data Structure

The original CUDA K-means implementation parallelizes the traditional K-means algorithm on the GPU.
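
As a rough illustration of what parallelizing K-means with CUDA can look like, the sketch below shows only the assignment step, with one thread per data point. It is not taken from this project: the names `assign_naive` and `assign_labels`, the row-major layouts, and the block size of 256 are all assumptions made for the example.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not this project's code): one thread per data point.
// Assumed layouts: points is n x d, centroids is k x d, both row-major.
__global__ void assign_naive(const float* points, const float* centroids,
                             int* labels, int n, int k, int d) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float best = FLT_MAX;
    int best_c = 0;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int j = 0; j < d; ++j) {
            // Both operands are read through global memory here.
            float diff = points[i * d + j] - centroids[c * d + j];
            dist += diff * diff;
        }
        if (dist < best) { best = dist; best_c = c; }
    }
    labels[i] = best_c;  // index of the nearest centroid (squared L2)
}

// Hypothetical host helper: copies inputs to the device, runs the kernel,
// and returns one label per point. Error checking is omitted for brevity.
std::vector<int> assign_labels(const std::vector<float>& points,
                               const std::vector<float>& centroids,
                               int n, int k, int d) {
    float* d_points = nullptr;
    float* d_centroids = nullptr;
    int* d_labels = nullptr;
    cudaMalloc(&d_points, points.size() * sizeof(float));
    cudaMalloc(&d_centroids, centroids.size() * sizeof(float));
    cudaMalloc(&d_labels, n * sizeof(int));
    cudaMemcpy(d_points, points.data(), points.size() * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_centroids, centroids.data(), centroids.size() * sizeof(float),
               cudaMemcpyHostToDevice);

    int threads = 256;  // assumed block size
    int blocks = (n + threads - 1) / threads;
    assign_naive<<<blocks, threads>>>(d_points, d_centroids, d_labels, n, k, d);

    std::vector<int> labels(n);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(labels.data(), d_labels, n * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_points);
    cudaFree(d_centroids);
    cudaFree(d_labels);
    return labels;
}
```

Calling `assign_labels(points, centroids, n, k, d)` returns, for each point, the index of its nearest centroid. A full K-means iteration would follow this with a centroid update step, which is left out of the sketch.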

Main Characteristics

This implementation provides significantly improved clustering performance compared to CPU-based approaches.


KMeans_dim Data Structure

In GPU-based parallel K-means, each thread (one per data point) repeatedly reads centroid data stored in global memory.
The cost of these global memory accesses grows with data dimensionality, so the access pattern needs to be optimized.
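
One common way to cut this global memory traffic is to stage centroids in shared memory, so that each block loads a centroid once and all of its threads reuse it. The sketch below illustrates that general technique only. It is not necessarily what `KMeans_dim` does, and the names `assign_tiled`, `assign_labels_tiled`, and `TILE_K`, as well as the row-major layouts, are assumptions made for the example.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical tile size: number of centroids staged in shared memory per pass.
constexpr int TILE_K = 8;

// Hypothetical kernel: one thread per point, centroids read via shared memory.
// Assumed layouts: points is n x d, centroids is k x d, both row-major.
__global__ void assign_tiled(const float* points, const float* centroids,
                             int* labels, int n, int k, int d) {
    extern __shared__ float tile[];  // holds up to TILE_K * d floats
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    float best = FLT_MAX;
    int best_c = 0;
    for (int base = 0; base < k; base += TILE_K) {
        int count = min(TILE_K, k - base);
        // Cooperative load: the block's threads copy the tile together,
        // so each centroid value is fetched from global memory once per block.
        for (int t = threadIdx.x; t < count * d; t += blockDim.x)
            tile[t] = centroids[base * d + t];
        __syncthreads();

        if (i < n) {
            for (int c = 0; c < count; ++c) {
                float dist = 0.0f;
                for (int j = 0; j < d; ++j) {
                    float diff = points[i * d + j] - tile[c * d + j];
                    dist += diff * diff;
                }
                if (dist < best) { best = dist; best_c = base + c; }
            }
        }
        __syncthreads();  // keep the tile intact until every thread is done
    }
    if (i < n) labels[i] = best_c;
}

// Hypothetical host helper. Error checking is omitted for brevity.
std::vector<int> assign_labels_tiled(const std::vector<float>& points,
                                     const std::vector<float>& centroids,
                                     int n, int k, int d) {
    float* d_points = nullptr;
    float* d_centroids = nullptr;
    int* d_labels = nullptr;
    cudaMalloc(&d_points, points.size() * sizeof(float));
    cudaMalloc(&d_centroids, centroids.size() * sizeof(float));
    cudaMalloc(&d_labels, n * sizeof(int));
    cudaMemcpy(d_points, points.data(), points.size() * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_centroids, centroids.data(), centroids.size() * sizeof(float),
               cudaMemcpyHostToDevice);

    int threads = 256;  // assumed block size
    int blocks = (n + threads - 1) / threads;
    // Dynamic shared memory for one tile; must fit the per-block limit
    // (48 KB by default on many GPUs), which bounds TILE_K * d.
    size_t shared_bytes = static_cast<size_t>(TILE_K) * d * sizeof(float);
    assign_tiled<<<blocks, threads, shared_bytes>>>(d_points, d_centroids,
                                                    d_labels, n, k, d);

    std::vector<int> labels(n);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(labels.data(), d_labels, n * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_points);
    cudaFree(d_centroids);
    cudaFree(d_labels);
    return labels;
}
```

In this sketch the tile footprint is `TILE_K * d` floats, so higher dimensionality forces a smaller tile. That trade-off is one reason high-dimensional data needs extra care, though the best tile size depends on the GPU and the workload.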

Key Features