DBSCAN - Space complexity of O(n)?
According to Wikipedia, "the distance matrix of size $\frac{(n^2-n)}{2}$ can be materialized to avoid distance recomputations, but this needs $O(n^2)$ memory, whereas a non-matrix based implementation of DBSCAN only needs $O(n)$ memory."
$\frac{(n^2-n)}{2}$ is the number of entries in a triangular distance matrix, one per unordered pair of points. However, Wikipedia says that a non-matrix based implementation only requires $O(n)$ memory. How does that work? Regardless of the data structure you use, don't you always need all $\frac{(n^2-n)}{2}$ distance values, which would still be $O(n^2)$ space? Is there something I'm missing here? I'm working with a huge dataset and I would really like to cut down on memory usage.
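To make the question concrete, here is a minimal sketch of one possible reading of "non-matrix based": every neighborhood query recomputes its distances and throws them away, so only the labels and a work queue are kept. This is an illustration under that assumption, not the implementation of any particular library; the names `region_query` and `dbscan` are made up for the example, and it uses a brute-force scan (so the running time is still quadratic even though the extra memory is $O(n)$).

```python
import math
from collections import deque

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself.

    Distances are computed on demand and discarded; nothing is cached.
    """
    p = points[i]
    return [j for j, q in enumerate(points) if math.dist(p, q) <= eps]


def dbscan(points, eps, min_pts):
    """Brute-force DBSCAN sketch that never stores a distance matrix."""
    labels = [UNVISITED] * len(points)  # O(n) memory
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)  # at most n indices
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        queued = set(neighbors)  # each index is enqueued at most once
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: expand
                for k in j_neighbors:
                    if k not in queued:
                        queued.add(k)
                        queue.append(k)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this sketch the only structures that grow with the input are `labels`, `queue`, `queued`, and one neighbor list at a time, each holding at most $n$ entries. The price is that a distance may be computed more than once instead of being looked up.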
Topic dbscan clustering scalability
Category Data Science