EXTRACTING ARCHIVE FILES IN LINUX

ZIP : unzip a.zip

.gz : gunzip a.gz

.tar : tar -xvf a.tar

.tar.gz : tar -xzvf a.tar.gz

.tar.bz2 : tar -xjvf a.tar.bz2

Extracting a single file : tar -xvf a.tar a.file

Histogram Comparision

The histogram of image shows the frequency distribution of different pixel intensities.

The value of histogram for pixel level/intensity $\mathcal{g}$ shows the total count of pixels in the image with pixel intensity $\mathcal{g}$.Let this be denoted by $h_a(g)$

\[ h_a(g) = \sum_{i,j \in x} \delta(g-A(i,j)) \] \subsection{Minkowski norm} The histogram can be viewed as vectors and similariy is calculated using Minkowski norm $(L_p)$.

The manhattan distance $p=1$ and euclidean distance $p=2$ are special cases of minkowski norm.

\begin{eqnarray*} d_{L_1}(h_a,h_b) = \sum_{g \in G} | h_a(g) - h_b(g) |

d_{L_2}(h_a,h_b) = \sum_{g \in G} | h_a(g) - h_b(g) |^2 \end{eqnarray*} A high value indicates a mismatch while low value indicates a better match.0 indicates a perfect mismatch while total mismatch is 1 for normalized histogram. \subsection{Histogram Intersection} For 2 histograms $h_a$ and $h_b$ the histogram intersection is defined as \begin{eqnarray*} d_I(h_a,h_b) = \sum_{g \in G} min(h_a(g),h_b(g)) \end{eqnarray*} A high score indicates a good match while low score indicates bad match.if both histograms are normalized then 1 is a perfect match while 0 indicates total mismatch indicating no regions of histogram overlap. \subsection{Correlation} For 2 histograms $h_a$ and $h_b$ the histogram correlation is defined as ,where histograms are simple considered as discrete 1-D signals. \begin{eqnarray*} d_C(h_a,h_b) = \frac{\sum_{g \in G} (h_a(g)-\bar{h_a})(h_b(g)-\bar{h_b})}{\sqrt{ \sum_{g \in G} (h_a(g)-\bar{h_a})^2 \sum_{g \in G} (h_b(g)-\bar{h_b})}}

where \bar{h_k} = \frac{1}{N} \sum_j H_k(j)

N is the total number of bins \end{eqnarray*} If the histogram is normalized ,then \[ \sum_j H_k(j) = 1 ,\bar{h_k} = \frac{1}{N} \] The value of 1 represents a perfect match ,-1 indicates a ideal mismatch while 0 indidates no correlation.

For correlation a high score represents a better match than a low score. \subsection{Chi-Square} For 2 histograms $h_a$ and $h_b$ the histogram similarity using Ch-Squared method is defined as \[ d(h_a,h_b) = \sum_{g\in G} \frac{\left(H_a(g)-H_b(g)\right)^2}{H_a(g)} \] A low score represents a better match than a high score.0 is a perfect match while mismatch is unbounded.

The chi-square test is testing the null hypothesis, which states that there is no significant difference between the expected and observed result \subsection{Bhattacharya Distance} In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. It is closely related to the Bhattacharyya coefficient which is a measure of the amount of overlap between two statistical samples or populations \[ d(h_a,h_b) = \sqrt{1 - \frac{1}{\sqrt{\bar{h_a} \bar{h_b} N^2}} \sum_{g\in G} \sqrt{h_a(g) \cdot h_b(g)}} \] For Bhattacharyya matching , low scores indicate good matches and high scores indicate bad matches. A value of 0 is perfect match while 1 indicates a perfect mismatch.

The code is present in files histogram.cpp and histogram.hpp which are wrapper classes for calling opencv histogram functions and performing comparison operations etc.

Let us consider the Hue histogram for following images and below are the similarity score using the bhattachrya distance method,chi-square,intersection and correlation methods.

Consider the sets of image

1-2 | 0.5912 | 12.221 | 0.2798 | 0.2342 |

1-3 | 0.9841 | 3209.8 | 0.0027 | -0.0775 |

1-4 | 0.7662 | 288.80 | 0.1975 | 0.0814 |

We can see that images 1 and 2 will have similar histograms ,for correlation and intersection for image pair 1-2 will be greater than 1-3 .Indicating the similarity of histograms 1-2 incomparison to 2-3.

Chi square mismatch is unbounded as can be seen by high value.A large value can be observed in 2-3 compared to 1-2

The bhattachrya distance is close to 0 for better match,and nearly 1 for mismatch, for 1-3 value is almost 1 indicating a ideal mismatch.

In case of image pair 1-2 and 1-4 are similar,but coefficients indicate that histogram of 1-2 is more similar than 1-4.

Color Constancy Algorithms : Minkowski P Norm Method

The gray world algorithm is a special case of with norm =1.

\[ e_i = \left( \frac{1}{N} \sum_i p_i \right)^{1/p} \] where summation if over all the pixels.

The average illuminatio Gray world algorithm is obtained by setting p=1 The shades of gray, is given by Finlayson and Trezzi concluded that using Minkowski norm with p = 6 gave the best estimation results on their data set. One of the methods of normalization is that the mean of the three components is used as illumination estimate of the image. To normalize the image of channel i ,the pixel value is scaled by $s_1 = \frac{avg}{avg_i} $ where $avg_i$ is the channel mean and $avg$ is the illumination estimate .

Another method of normalization is normalizing to the maximum channel by scaling by $s_i$ \[ r_i = \frac{max(avg_R,avg_G,avg_B)}{avg_i} \]

Another method of normalization is normalizing to the maximum channel by scaling by norm $m_i$ \[ m_i = \sqrt{(avg_r*avg_r+avg_g*avg_g+avg_b*avg_b)} \] \[ r_i = \frac{max(m_R,m_G,m_B)}{m_i} \]

Below are some output images on which norm 6 was applied.

The class for applying color constancy using minkowski p norm technique

The norm factor is $p=1$ for gray world algorithm and $p=6$ for shades of gray algorithm . Various normalization techniques can be passed as $m=(1,2,3)$.

Subscribe to:
Posts (Atom)