1-D Kernel Density Estimation For Image Processing

#### Introduction

In the article we will look at the basics methods for Kernel Density Estimation.#### Non Parametric Methods

The idea of the non-parametric approach is to avoid restrictive assumptions about the form of f(x) and to estimate this directly from the data rather than assuming some parametric form for the distribution eg gaussian,expotential,mixture of gaussian etc.#### Kernel Density Estimation

kernel density estimation (KDE) is a non-parametric way to estimate the probability density function where the estimation about the population/PDF is performed using a finite data sample.A general expression for non parametric density estimation is \[ p(x) = \frac{k}{NV} \]

- where k is number of examples inside V
- V is the volume surrounding x
- N is total number of examples

To construct a histogram, we divide the interval covered by the data values into equal sub-intervals, known as bins. Every time, a data value falls into a particular sub-interval/bin the count associated with bin is incremented by 1.

For histogram V can be defined $WxH$ where W is bin width and H is unbounded

In the figure fig:image1 the hue histogram of rectangular region of image is shown.

Histograms are described by bin width and range of values. In the above the range of Hue values is $0-180$ and the number of bins are 30

We can see that histograms are discontinuous ,which may not necessarily be due to underlying discontinuity of underlying PDF but also due to discretization due to bins and Inaccuracies may also exist in the histogram due to binning . Histograms are not smooth and depend on endpoints and width of the bins This can be seen in figure fig:image1 b.

Typically estimate becomes better as we increase the number of points and shrink the bin width and this is true in case of general non parametric estimation as seen in figure fig:image1 c.

In practice the number of samples are finite,thus we not observe samples for all possible values,in such case if the bin width is small,we may observe that bin does no enclose any samples and estimate will exhibit large discontinuties. For histogram we group adajcent sample values into a bin.

#### Kernel Density Estimation

Kernel density estimation provides another method to arrive at estimate of PDF under small sample size.The density of samples about a given point is proportional to its probability. It approximate the probability density by estimating the local density of points as seen in figure fig:image3#### Parzen window technique

Parzen-window density estimation is essentially a data-interpolation technique and provide a general framework for kernel density estimation.Given an instance of the random sample, ${\bf x}$, Parzen-windowing estimates the PDF $P(X)$ from which the sample was derived It essentially superposes kernel functions placed at each observation so that each observation $x_i$ contributes to the PDF estimate.

Suppose that we want to estimate the value of the PDF $P(X)$ at point $x$. Then, we can place a window function at $x$ and determine how many observations $x_i$ fall within our window or, rather, what is the contribution of each observation $x_i$ to this windowing

The PDF value $P (x)$ is then the sum total of the contributions from the observations to this window

Let $(x_1,x_2,\ldots, x_n)$ be an iid sample drawn from some distribution with an unknown density $\mathcal{f}$. We are interested in estimating the probability distribution $\mathcal{f}$. Its parzen window estimate is defined as \[ \hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) \quad = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big) \] Where $\mathcal{K}$ is called the kernel,$h$ is called its bandwidth,$k_h$ is called a scaled kernel

Kernel density estimates are related to histograms,but possess properties like smoothness or smoothness by using a suitable kernel.

Commonly used kernel functions are uniform,gaussian,Epanechnikov etc

Superposition of kernels centered at each data point is equivalent to convolving the data points with the kernel.we are smoothing the histogram by performing convolution with a kernel. Different kernels will produce different effects.

#### Rectangular windows

For univariate case the rectangular windows encloses k examples about a region of width h centered about x on the histogram.To find the number of examples that fall within this region ,the kernel function is defined as

\begin{equation*} k(x) = \begin{cases} 1 & |x| \le h,\\ 0 & otherwise \end{cases} \end{equation*} hence total number of bins of histogram be 180,hence bin width is 1.Let us apply a window function with bandwidth 6,12,18 etc and observe the effect on histogram

The kernel density estimate using parzen window of bandwidth 6,12 and 18 are shown in figure fig:image2.

#### Gaussian Windwos

The kernel function for the gaussian window is defined as \begin{eqnarray*} k(x) = C*exp\Big(-\frac{x^2}{2*\sigma^2}\Big) \end{eqnarray*} Instead of a parze rectangular window let us apply a gaussian window of width 6,12 and 18 and observe the effects on the histogramThe bandwidth of the kernel is a free parameter which exhibits a strong influence on estimate of the PDF.Selecting bandwidth is a tradeoff between accuracy and generality.