The 2D Kernel Density plot is a smoothed color density representation of the scatter plot, based on kernel density estimation, a nonparametric technique for estimating probability density functions. The goal of density estimation is to take a finite sample of data and infer the underlying probability density function everywhere, including where no data points are present. In kernel density estimation, the contribution of each data point is smoothed out from a single point into the surrounding region. The resulting smoothed density plot shows the average trend of the scatter plot.
To create a 2D Kernel Density plot:
Input Data | Specify the input data. |
Settings | |
Density Estimation data | This determines where the calculated data for the graph is stored. |
Displaying Data | This determines where the data of the displayed scatter plot is stored. Only available when Number of Points to Display is not 0. |
Kernel density estimation is a nonparametric technique for estimating the density of scatter points. The goal of density estimation is to estimate the underlying probability density function everywhere, including where no data are observed, from the existing scatter points. A kernel function is centered on each datum, which ensures that the kernel is symmetric about the datum. Kernel density estimation then smooths the contributions of the data points to give an overall picture of their density.
Specify the method used to calculate the kernel density on the defined XY grid.
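For exact estimation, density values are calculated from the equation below:
$$\mathrm{Density}(x, y) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2\pi w_x w_y} \exp\left\{ -\frac{(x - vX_i)^2}{2 w_x^2} - \frac{(y - vY_i)^2}{2 w_y^2} \right\}$$
where $n$ is the number of elements in vector vX or vY, $vX_i$ is the ith element of vector vX, and $vY_i$ is the ith element of vector vY. $w_x$ and $w_y$ are the optimal bandwidth values.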
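As an illustration only, exact estimation on a grid can be sketched in a few lines of Python. The sketch assumes a Gaussian product kernel with bandwidths wx and wy. The function names `exact_density` and `density_grid` are hypothetical and are not part of the dialog.

```python
import math


def exact_density(x, y, vX, vY, wx, wy):
    """Density at (x, y), assuming a Gaussian product kernel (an assumption)."""
    n = len(vX)
    norm = 1.0 / (2.0 * math.pi * wx * wy)
    total = 0.0
    for xi, yi in zip(vX, vY):
        # Each data point contributes a kernel centered on that point.
        total += math.exp(-((x - xi) ** 2) / (2.0 * wx ** 2)
                          - ((y - yi) ** 2) / (2.0 * wy ** 2))
    return norm * total / n


def density_grid(vX, vY, wx, wy, grid_x, grid_y):
    """Evaluate the density at every grid node (rows follow grid_y, columns grid_x)."""
    return [[exact_density(gx, gy, vX, vY, wx, wy) for gx in grid_x]
            for gy in grid_y]
```

The cost grows with the number of data points times the number of grid nodes, which is why an approximate method is useful for large datasets.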
Speeds up the density calculation by approximating the exact 2D kernel density estimate.
First, 2D binning is performed on the (x, y) points to obtain a matrix of bin counts. Then a 2D Fast Fourier Transform is used to perform the discrete convolution that yields the density value at each grid point.
The 4th root of the density values is then taken to map the density scale to the color scale.
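The binning, FFT convolution, and 4th-root steps can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the program's actual implementation: it assumes a Gaussian kernel, pads the grid by three bandwidths on each side, and uses the hypothetical names `binned_density` and `to_color_scale`.

```python
import numpy as np


def binned_density(vX, vY, wx, wy, nx=64, ny=64):
    """Approximate 2D kernel density by binning followed by FFT convolution."""
    vX = np.asarray(vX, dtype=float)
    vY = np.asarray(vY, dtype=float)
    n = vX.size
    # Assumption: extend the grid 3 bandwidths beyond the data range.
    x_edges = np.linspace(vX.min() - 3 * wx, vX.max() + 3 * wx, nx + 1)
    y_edges = np.linspace(vY.min() - 3 * wy, vY.max() + 3 * wy, ny + 1)
    # Step 1: 2D binning gives a matrix of counts with shape (nx, ny).
    counts, _, _ = np.histogram2d(vX, vY, bins=[x_edges, y_edges])
    dx = x_edges[1] - x_edges[0]
    dy = y_edges[1] - y_edges[0]
    # Gaussian kernel sampled at every possible bin-to-bin offset.
    kx = np.arange(-(nx - 1), nx) * dx
    ky = np.arange(-(ny - 1), ny) * dy
    kernel = np.exp(-kx[:, None] ** 2 / (2 * wx ** 2)
                    - ky[None, :] ** 2 / (2 * wy ** 2)) / (2 * np.pi * wx * wy)
    # Step 2: linear (zero-padded) convolution via the 2D FFT.
    shape = (3 * nx - 2, 3 * ny - 2)
    conv = np.fft.irfft2(np.fft.rfft2(counts, s=shape)
                         * np.fft.rfft2(kernel, s=shape), s=shape)
    # Keep the part aligned with the bin centers; clip FFT round-off below 0.
    density = np.clip(conv[nx - 1:2 * nx - 1, ny - 1:2 * ny - 1] / n, 0.0, None)
    x_centers = 0.5 * (x_edges[:-1] + x_edges[1:])
    y_centers = 0.5 * (y_edges[:-1] + y_edges[1:])
    return x_centers, y_centers, density  # density[i, j] is at (x_centers[i], y_centers[j])


def to_color_scale(density):
    """Step 3: 4th root of the density, used to map density to the color scale."""
    return density ** 0.25
```

Because the convolution works on bin counts rather than on individual points, the cost depends mainly on the grid size and not on the number of source points.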
Calculates the bandwidth based on a linear diffusion process.
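The estimates of $w_x$ and $w_y$ can be calculated simply by:
$$w_x = \sigma_x \left(\frac{1}{n}\right)^{1/6}, \qquad w_y = \sigma_y \left(\frac{1}{n}\right)^{1/6}$$
where $n$ is the size of vector vX or vY, $\sigma_x$ is the sample standard deviation of dataset vX, and $\sigma_y$ is that of dataset vY.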
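A short sketch of this calculation is given below. It assumes the rule has the common two-dimensional form, in which each bandwidth is the sample standard deviation multiplied by n^(-1/6). The function name `rule_of_thumb_bandwidths` is hypothetical.

```python
import statistics


def rule_of_thumb_bandwidths(vX, vY):
    """Return (wx, wy), assuming w = sigma * n ** (-1/6) for 2D data."""
    n = len(vX)
    factor = n ** (-1.0 / 6.0)
    # statistics.stdev is the sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(vX) * factor, statistics.stdev(vY) * factor
```

Under this assumption the bandwidths shrink slowly as the number of points grows, so larger datasets produce a less heavily smoothed density.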
Specify the calculation method to decide which points to superimpose on the density image.
If the option is selected, the kernel density of each point is calculated by interpolating the density matrix computed on the defined XY grid. If the number of source data points is very large, selecting this option can greatly improve speed.
If the option is not selected, the density values are calculated by the Exact Estimation method.
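The interpolation option can be pictured with the sketch below, which looks up the density of a point from an already computed density matrix. Bilinear interpolation is used here as an assumption, since the interpolation scheme is not stated. The function name `interpolate_density` is hypothetical.

```python
import bisect


def interpolate_density(px, py, grid_x, grid_y, density):
    """Bilinear lookup of density[j][i] (row j at grid_y[j], column i at grid_x[i])."""
    # Find the grid cell containing the point, clamped to the grid limits.
    i = min(max(bisect.bisect_right(grid_x, px) - 1, 0), len(grid_x) - 2)
    j = min(max(bisect.bisect_right(grid_y, py) - 1, 0), len(grid_y) - 2)
    # Fractional position inside the cell, clamped so points outside the grid
    # take the value at the nearest edge.
    tx = min(max((px - grid_x[i]) / (grid_x[i + 1] - grid_x[i]), 0.0), 1.0)
    ty = min(max((py - grid_y[j]) / (grid_y[j + 1] - grid_y[j]), 0.0), 1.0)
    bottom = density[j][i] * (1.0 - tx) + density[j][i + 1] * tx
    top = density[j + 1][i] * (1.0 - tx) + density[j + 1][i + 1] * tx
    return bottom * (1.0 - ty) + top * ty
```

Each lookup touches only four grid values, so the cost per point does not depend on the size of the source dataset.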