Separating Gaussians


How to tell whether points come from the same Gaussian or from different Gaussians.

Let us say we are given a set of points and asked which of them belong to the same Gaussian and which belong to different Gaussians. The natural answer is that points near each other belong to the same Gaussian, and points far apart belong to different Gaussians.

Good, but that is vague. To make it precise, we need to specify how far apart two points must be before we can say they do not belong to the same Gaussian.

Let us get into it.

In a high-dimensional ($d$-dimensional) space, if we fix one direction and call it north, the points look as if they are all clustered near the equator. That looks something like the figure below.

Points Clustered Near the Equator
In high dimensions, most of the volume of a sphere is concentrated near its equator.

Let us take a point $x$ at random from a spherical unit-variance Gaussian. Now rotate the coordinate system so that $x$ is aligned with the north pole, and independently generate another point $y$. We know that the mass is concentrated in an annulus of width $O(1)$ at a distance of roughly $\sqrt{d}$ from the center, and that $y$ falls near the equator with high probability. Hence $x$ and $y$ each lie at a distance of about $\sqrt{d}$ from the center, $y$ is nearly perpendicular to $x$, and the distance between them satisfies

$$|x - y| \approx \sqrt{|x|^2 + |y|^2} \approx \sqrt{2d}$$
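Let us check this numerically. The snippet below is only a rough sketch, not part of the original argument: the function name `same_gaussian_distances`, the dimension `d = 1000`, the number of pairs, and the seed are all choices made for illustration. It draws pairs of points from one spherical unit-variance Gaussian and compares the observed norms and distances with $\sqrt{d}$ and $\sqrt{2d}$.

```python
import numpy as np

def same_gaussian_distances(d, n_pairs=1000, seed=0):
    """Draw pairs (x, y) from N(0, I_d); return |x|, |y| and |x - y| per pair.

    Illustrative helper: the name and the defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_pairs, d))
    y = rng.standard_normal((n_pairs, d))
    return (
        np.linalg.norm(x, axis=1),
        np.linalg.norm(y, axis=1),
        np.linalg.norm(x - y, axis=1),
    )

d = 1000  # assumed dimension, large enough for the concentration to show
norm_x, norm_y, dist = same_gaussian_distances(d)

print("mean |x|       :", norm_x.mean(), "  sqrt(d)  =", np.sqrt(d))
print("mean |x - y|   :", dist.mean(), "  sqrt(2d) =", np.sqrt(2 * d))
print("spread of |x-y|:", dist.std())
```

If the picture above is right, the mean distance should sit close to $\sqrt{2d}$, and the spread should stay of order 1 even though the distances themselves grow with $d$.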

Now let us consider points from different Gaussians. Take two spherical Gaussians of unit variance whose centers are separated by a distance $\Delta$. The point $x$ is drawn from the left Gaussian and the point $y$ from the right Gaussian. The distance between them is now close to $\sqrt{\Delta^2 + 2d}$. The following figure is taken directly from the book "Foundations of Data Science".

Points from two different Gaussians whose centers are a distance $\Delta$ apart
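The same kind of numerical check works for two Gaussians. Again, this is a sketch under assumptions: the helper `different_gaussian_distances`, the values `d = 1000` and `delta = 30.0`, and the choice to put the two centers at the origin and at `delta` along the first axis are all illustrative and do not come from the book.

```python
import numpy as np

def different_gaussian_distances(d, delta, n_pairs=1000, seed=0):
    """Draw x from N(0, I_d) and y from N(delta * e_1, I_d); return |x - y|.

    Illustrative helper: placing the centers along the first axis is an
    assumption. Only their separation delta matters for a spherical Gaussian.
    """
    rng = np.random.default_rng(seed)
    center_shift = np.zeros(d)
    center_shift[0] = delta
    x = rng.standard_normal((n_pairs, d))                 # "left" Gaussian
    y = rng.standard_normal((n_pairs, d)) + center_shift  # "right" Gaussian
    return np.linalg.norm(x - y, axis=1)

d, delta = 1000, 30.0  # assumed values for illustration
dist = different_gaussian_distances(d, delta)

print("mean |x - y|      :", dist.mean())
print("sqrt(delta^2 + 2d):", np.sqrt(delta**2 + 2 * d))
print("sqrt(2d)          :", np.sqrt(2 * d))
```

The observed mean should track $\sqrt{\Delta^2 + 2d}$ and sit clearly above the same-Gaussian value $\sqrt{2d}$. That gap is what a distance threshold would have to exploit.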

To understand why the distance between the points $x$ and $y$ is $\sqrt{\Delta^2 + 2d}$, let us rotate the picture and look at another image.