C++ Virtual Functions, Abstract Classes, Templates, Overloading, Special Member Functions
Introduction
This article describes the basic concepts of C++ virtual functions and abstract classes.
Virtual Functions
By default, C++ matches a function call with the correct function definition at compile time. This is called static binding.
You can specify that the compiler match a function call with the correct function definition at run time; this is called dynamic binding.
You declare a function with the keyword virtual if you want the compiler to use dynamic binding for that specific function.
The virtual keyword indicates to the compiler that it should choose the appropriate definition of f() not by the type of reference, but by the type of object that the reference refers to.
A virtual function is a member function you may redefine in derived classes; the compiler will call the redefined virtual function for an object of the corresponding derived class, even if you call that function through a pointer or reference to a base class of the object.
A class that declares or inherits a virtual function is called a polymorphic class.
You redefine a virtual member function, like any member function, in any derived class.
A member function in a derived class with the same name and parameter list as a virtual function in a base class is implicitly virtual, even if it is not explicitly declared with the virtual keyword. A function with the same name but a different parameter list does not override the virtual function; it hides it.
A virtual function cannot be global or static because, by definition, a virtual function is a member function
of a base class and relies on a specific object to determine which implementation of the function is called.
You can declare a virtual function to be a friend of another class.
If a function is declared virtual in its base class, you can still access it directly using the scope resolution (::) operator.
In this case, the virtual function call mechanism is suppressed and the function implementation defined in the base class is used.
In addition, if you do not override a virtual member function in a derived class,
a call to that function uses the function implementation defined in the base class.
The return type of an overriding virtual function may differ from the return type of the overridden virtual function (a covariant return type).
The overriding function in the derived class may return a pointer or reference to a class that is derived, directly or indirectly, from the class returned by the base class function.
You cannot override one virtual function with two or more ambiguous virtual functions.
This can happen in a derived class that inherits from two nonvirtual bases that are derived from a virtual base class.
As long as the ambiguous function is not called, the compiler does not report an error.
The access for a virtual function is specified when it is declared.
The access rules for a virtual function are not affected by the access rules for the function
that later overrides the virtual function.
If a virtual function is called with a pointer or reference to a class object,
the type of the class object is not used to determine the access of the virtual function.
Instead, the type of the pointer or reference to the class object is used.
Abstract Classes
An abstract class is a class that is designed to be specifically used as a base class. An abstract class contains at least one pure virtual function.
You declare a pure virtual function by using a pure specifier (= 0) in the declaration of a virtual member function in the class declaration.
You cannot use an abstract class as a parameter type, a function return type,
or the type of an explicit conversion, nor can you declare an object of an abstract class.
You can, however, declare pointers and references to an abstract class.
A function declaration cannot have both a pure specifier and a definition. A pure declaration forces a definition in any non-abstract derived class.
Virtual member functions are inherited.
A class derived from an abstract base class will also be abstract unless you override each pure virtual
function in the derived class.
Note that you can derive an abstract class from a nonabstract class,
and you can override a non-pure virtual function with a pure virtual function.
You can call member functions from a constructor or destructor of an abstract class.
However, the results of calling (directly or indirectly) a pure virtual function from its constructor are undefined.
If you declare a base class destructor as virtual,
a derived class destructor will override that base class destructor, even though destructors are not inherited.
This article describes basic concepts of the C++ inheritance mechanism.
Inheritance
Inheritance is a mechanism of reusing and extending existing classes without modifying them, thus producing hierarchical relationships between them.
Inheritance is implemented in C++ through the mechanism of derivation. Derivation allows you to derive a class, called a derived class, from another class, called a base class.
The class whose members you want to include in your new class is called a base class. Your new class is derived from the base class.
You can also add new data members and member functions to the derived class.
When you derive a class, the derived class inherits class members of the base class.
You can refer to inherited members (base class members) as if they were members of the derived class.
You can modify the implementation of existing member functions or data by
overriding base class member functions or data in the newly derived class.
If a member of a derived class has the same name as a member of a base class, the base class member is hidden in the derived class.
You may derive classes from other derived classes, thereby creating another level of inheritance.
The number of levels of inheritance is only limited by resources.
The new class contains a subobject of the type of the base class.
Multiple Inheritance
Multiple inheritance allows you to create a derived class that inherits properties from more than one base class.
Because a derived class inherits members from all its base classes, ambiguities can result.
In the case of ambiguous data members, one solution is to override the base class members in the derived class.
We can still access the overridden members of the base class using the scope resolution operator.
A direct base class cannot appear in the base list of a derived class more than once:
A derived class can inherit an indirect base class more than once. However, this may result in ambiguities, since two subobjects of the parent class then exist, both accessible through the derived class.
Virtual Base Class
A derived class can have virtual and nonvirtual base classes.
If a derived class inherits from the same base class via multiple inheritance, it will have two subobjects of that base class, leading to ambiguity. The virtual keyword in the base list specifies that the derived object contains only one copy of the inherited base class.
In an inheritance graph containing virtual base classes, the name of a class member that can be reached through more than one path of inheritance is accessed through the path that gives the most access.
Ambiguous base classes
The order of derivation is relevant only to determine the order of default initialization by
constructors and cleanup by destructors.
Using Keyword Declaration
A member function in the derived class hides all base class members with the same name, regardless of the argument types or return type.
A using declaration in the definition of a class A allows you to introduce the name of a data member or member function from a base class of A into the scope of A.
If a function with the same name as the one introduced by the using declaration is also present in the derived class, the base class function is hidden and no conflict arises.
A using declaration cannot resolve ambiguities due to identically named inherited members.
The declaration of a member with an ambiguous name in a derived class is not an error. The ambiguity is only flagged as an error if you use the ambiguous member name.
Pointers to Derived and Base Classes
A pointer to a derived class can be converted to a pointer to its base class.
A pointer to a base class cannot be implicitly converted to a pointer to a derived class.
Members and friends of a class can implicitly convert a pointer to an object of that class to a pointer to either a protected or public direct or indirect base class, or to a direct private base class.
A direct base class is a base class that appears directly as a base specifier in the declaration of its derived class.
An indirect base class is a base class that does not appear directly in the declaration of the derived class but is available to the derived class through one of its base classes.
For a given class, all base classes that are not direct base classes are indirect base classes.
Classes that are declared but not defined are not allowed in base lists.
Inherited member access
While inheriting members from a base class, we can hide some information of the base class. This is done by specifying an access specifier in the base list.
An access specifier can be public, private, or protected.
With public base class access, the public and protected members of the base class remain public and protected members of the derived class, respectively.
With protected base class access, the public and protected members of the base class become protected members of the derived class.
With private base class access, the public and protected members of the base class become private members of the derived class.
The private members of the base class remain private to the base class in all cases.
Private members of the base class cannot be used by the derived class unless friend declarations within the base class explicitly grant access to them.
The default access specifier is private
Increasing Access
Suppose class B is a direct base class of class A, that is, A is derived from B. To restrict the access that A grants to the members inherited from B, derive A from B using the access specifier protected or private.
To increase the access of a member x of class B inherited by class A, use a using declaration. You cannot restrict the access to x with a using declaration.
Access can be increased for a member inherited as private, or declared or inherited as protected.
Access to members declared as private in the base class cannot be increased, and access to a member cannot be reduced with a using declaration.
This article describes basic concepts of C++ storage class specifiers.
Storage Class Specifiers
A storage class specifier is used to refine the declaration of a variable, a function, or a parameter.
A storage class specifier does not specify scope, but combines with the scope to determine the storage duration and visibility of items.
The storage class specifier used within the declaration determines whether:
The object is to be stored in memory or in a register, if available
The object has internal, external, or no linkage
The object can be referenced throughout a program or only within the function, block, or source file where the variable is defined
The storage duration for the object is static (storage is maintained throughout program run time) or automatic (storage is maintained only during the execution of the block where the object is defined)
- In C/C++ there are five storage class specifiers available: auto, extern, register, static, and mutable.
How these specifiers affect objects also depends on the scope in which they appear.
Storage class specifiers are keywords you can use in declarations to control storage duration and linkage.
The linkage determines whether declarations in different scopes can refer to the same object.
- Storage class specifiers tell the compiler the duration and visibility of the variables or objects declared, as well as where the variables or objects should be stored.
Static Storage Specifier
Class members can be declared using the storage class specifier static in the class member list.
The declaration of a static data member in the member list of a class is not a definition; it must be defined outside the class declaration.
When you declare an object of a class having a static member, the static member is not part of the class object.
Only one copy of the static member is shared by all objects of a class in a program.
You can access a static member of a class by qualifying the class name using the :: (scope resolution) operator
Once you define a static data member, it exists even though no objects of the static data member's class exist.
The program will print the value 10.
Static data members of a class in namespace scope have external linkage
The following command shows the symbols with external linkage, and we can find the static data member in this list. The -g flag indicates that only symbols with external linkage are displayed. We can also see that the symbol is initialized in the data section, which is indicated by D.
A static data member can be initialized where it is defined at namespace scope.
A static data member can also be initialized at its declaration inside the class, but only if it is also declared as const.
A static data member cannot be declared as mutable
Local classes cannot have static data members.
A static data member can be of any type except void, or void qualified with const or volatile.
Static data members and their initializers can access other static private and protected members of their classes
Static Member function
You cannot have static and nonstatic member functions with the same names and the same number and type of arguments.
Like static data members, you may access a static member function of a class without any object of the class.
A static member function does not have a this pointer
A static member function cannot be declared with the const or volatile type qualifiers.
A static member function cannot be declared as virtual function
A static member function can access only the names of static members, enumerators,
and nested types of the class in which it is declared.
A static member function cannot access the nonstatic members of the class or of a base class.
class X {
public:
    int b;
    static int a;
    enum mode { a1, a2, a3 };
    static int inc()
    {
        return b; // error: a static member function cannot access the nonstatic member b
    }
    X() {}
};
Static data members of a class have external linkage; they can be accessed from any translation unit, and their duration is the lifetime of the program.
This is useful for class members that are required in all the files of a program with their value retained across them, for example in a logging or monitoring class.
If a class object at namespace scope is defined as static, the static specifier gives the object file scope, a duration of the whole program execution, and internal linkage.
This article describes basic concepts of the C++ const and volatile type qualifiers.
Type Qualifiers
Type specifiers indicate the type of the object or function being declared.
Type qualifiers add refinement to the type of the object or function being declared.
C++ recognizes const and volatile type qualifiers.
Const Type Qualifiers
The const qualifier explicitly declares a data object as something that cannot be changed.
A const object or variable must be initialized and its value is set at initialization.
In the case of a class object, all the member variables should be initialized in the constructor; a compiler-generated default constructor leaves members of built-in type with indeterminate values.
When an object is defined as constant, no data members of the class can be modified and no non-constant member function can be called on it.
You cannot use const data objects in expressions requiring a modifiable lvalue, i.e., on the left side of an assignment.
An object that is declared const is guaranteed to remain constant for its lifetime, not throughout the entire execution of the program.
Thus such a const variable cannot be used in place of a constant in a constant expression.
The const variable k is const for the lifetime of the function; it is initialized to the value of j at the start of the function and remains const only until the function returns.
A const variable has internal linkage by default; a storage specifier needs to be given explicitly for external linkage. Const objects can be used in header files, since they have internal linkage, and can be used instead of #define for constant values.
For a const pointer, you must put the keyword const between the * and the identifier.
In this case the value of the pointer, the address, cannot be changed, but the value the pointer points to can be changed.
To define a pointer to a const value, the keyword must be placed before the type specifier.
We can change the values of a, b, and c, and since in the last line the pointer points to the address of b, we can dereference the pointer and obtain the value of the new variable.
We can see that the pointer itself can be changed; however, changing the value that the pointer y points to gives an error, since it is a pointer to a const integer.
Any expression of the form *y = 1 will give an error; we cannot change the value of the variable via the pointer.
In a declaration of the form const int* const, neither the value pointed to nor the pointer can be changed.
Declaring a member function as const means that the this pointer is a pointer to a const object.
There are cases where you want the compiler to prevent a member function from being able to change any member variables of the calling object.
The data members of the class are treated as constant within that function.
A const member function cannot call any non-const member function.
If an object of a class is declared as const, only const member functions of the class can be called on it, not non-const ones; declaring the object as const means that its this pointer is a pointer to a const object.
If a member function is declared as const, we can still change a member's value using const_cast, but a better approach is to declare the member that needs to change as mutable.
The mutable storage class specifier is used only on a class data member to make it modifiable even though the member is part of an object declared as const, or is accessed from a const member function.
- Pointers to constant data can be used as function parameters to prevent the function from modifying a parameter passed through a pointer or reference.
The member function can also return a const value or reference to const value.
Returning an object invokes the copy constructor, while returning a reference does not; returning a reference can therefore be more efficient. However, return by reference supplies the address of a variable; if the owning object has been deallocated, the value at that address is undefined.
The reference should be to an object that still exists.
We can see that the value pointed to by x4 is different after the object has been deallocated.
This is an inherent drawback of return by reference: the user must ensure that the referenced object is still alive.
However, in applications like streaming camera frames or sensor data, we do not want to allocate a new buffer for every new frame of data.
In such cases, returning a constant reference or pointer provides an efficient way of providing the data without copying it or allocating it every time.
Volatile Type Qualifier
The volatile qualifier maintains consistency of memory access to data objects
Volatile objects are read from memory each time their value is needed, and written back to memory each time they are changed.
The volatile qualifier declares a data object that can have its value changed in ways outside the control or detection of the compiler
and the compiler is thereby notified not to apply certain optimizations to code referring to the object.
When applied to a class object, all members of the object have the same type qualifier.
An item can be both volatile and const, in which case the item may be modified by some asynchronous process but not by the program itself.
You can define or declare any function to return a pointer to a volatile or const function.
You can declare or define a volatile or const function only if it is a nonstatic member function.
This article describes how a class can grant access to its non-public members through the friend mechanism.
Friend Mechanism
A friend of a class X is a function or class that is not a member of X, but is granted the same access to X as the members of X
Functions declared with the friend specifier in a class member list are called friend functions of that class
Classes declared with the friend specifier in the member list of another class are called friend classes of that class.
A class Y must be defined before any member of Y can be declared a friend of another class.
We can define a function as a friend or an entire class as friend.
The function can be a member of a class or declared in the global namespace.
The function needs to be declared before it can be named as a friend.
If a friend function is a member of another class, a scope resolution operator needs to be used, for example Y::print(X& x).
However, a friend function can be defined within the friend declaration provided the function is unqualified and has namespace scope (it is not a member function of another class and is not declared in another namespace) and the class is non-local.
If a class F is a friend of class A,then every member function and static data member definition of
class F has access to class A.
To declare a class as a friend, an elaborated class name must be used in the declaration, for example friend class F.
A friend class can be defined after it is declared as a friend in the other class.
A friend cannot be declared with a storage class specifier.
A class cannot be defined within a friend declaration.
The scope of a friend class name is the first nonclass enclosing scope.
In the example, the scope of the friend class name is class Z1 and not the nested class in which it is declared.
Whether friends of a base class are inherited by classes derived from that base class may be compiler dependent. The g++ compiler on Linux was observed to support this behavior, while many references state that friendship is not inherited.
Similarly, if a class is derived from a friend class, whether it is also a friend of the original base class may be compiler dependent; g++ on Linux was observed to support it, while many references suggest this inheritance does not hold.
Friendship is not transitive: if A is a friend of B, and C is a friend of A, it does not follow that C is a friend of B unless explicitly specified.
In this article we will look at the application of mean shift tracking for color-based tracking.
Mean shift
The object model used in mean shift tracking is a color probability distribution (a color histogram).
Given this object model and an image, we can compute a likelihood image: each pixel in the likelihood image represents the likelihood that the pixel belongs to the object model/histogram.
\[
L(y_i) = \sum_{u} p_u\,\delta(b(y_i)-u)
\]
This likelihood image assigns to each pixel a similarity measure with respect to the object model.
It is reasonable to assume that the region with the highest similarity measure, or highest density, is a good estimate of the object location.
Thus if we consider a small window and move towards the mean value, i.e. along the mean shift vector, we should eventually reach the region of maximum similarity.
The likelihood surface is not smooth, but we can give it smoothness properties using kernel density estimation.
Now we can find the modes of the similarity surface using standard
mean shift algorithm.
Let us assume that the current estimate of the mode is at $y$.
We consider a small rectangular window about $y$, compute the mean shift vector, and take a small step along it.
In principle this should enable us to find the local maximum.
The similarity surface is discontinuous, however, and to smooth it we would need to perform KDE over the entire image on a dense grid of points, which is an expensive operation (a convolution at each point with a Gaussian of suitably large aperture).
Using Similarity for tracking
The concept of a similarity surface can be made useful in a tracking application.
Let us consider a small region of interest about the present location y: we compute the similarity score over this region, perform KDE on it, obtain a similarity surface, and compute the mean shift vector.
If the object is not present in the region, the similarity surface will be flat and the mean shift vector will be zero.
If the object is present in some part of the region, it will correspond to modes of the similarity surface, and the mean shift vector will give us a direction to move along.
Now, instead of trying to estimate the mode, say we translate the region of interest along the direction provided by the mean shift vector.
This would typically bring a larger portion of the object into view and expose the region of the global similarity surface where a large maximum lies.
This is the basis of mean shift tracking: keep translating the region of interest until we reach a local maximum of the similarity surface.
For tracking applications, since fast computation is required, we can use a rectangular window with bandwidth equal to that of the region of interest, centered at the present location.
Implementation
A video of mean shift tracking is shown below. A naive object model based on color probability in HS color space is built using the first frame of the video.
Another issue is that if the object is moving too fast and a significant part of the object moves out of the ROI between successive frames, the object will not be tracked.
This can be seen in the following videos.
If we encounter a larger object, or an object that exhibits higher density, tracking will be lost. In the case below, tracking is lost when the object passes over a large blue background that is similar in color to the object.
There are many other cases where mean shift tracking will fail
As with all tracking approaches, the performance heavily depends on the object model.
The better we are able to model the object and obtain a likelihood/similarity map that does not show high probability for the background or other objects in the scene, the more accurate the tracking will be.
Code
For further image processing applications, a library consisting of a high-level interface to OpenCV will be used. The library is called OpenVision:
https://github.com/pi19404/OpenVision
The project cmake file is included in the repository; the build will create the library and test binaries in the bin directory.
To run the demo program for mean shift, run the binary meanShiftTest.
The files for the mean shift algorithm are meanshift.cpp and meanshift.hpp in the
https://github.com/pi19404/OpenVision/tree/master/ImgProc/
directory of the repository.
To run the test program:
Select the region of interest and click the build model button to start tracking.
For a video file, initially only the first frame is shown; select the ROI in the first frame and then click on the build model button to start the tracking.
The button is shown upon clicking on the Display properties button on the window.
In this article we will look at the basics of the mean shift algorithm.
Kernel Density Estimation
Let us first consider a univariate Gaussian PDF and data sampled from it.
Kernel density estimation uses the fact that the density of samples about a given point is proportional to the probability at that point.
It approximates the probability density by estimating the local density of points, which, as seen in fig:image1, is reasonable:
a large density of points is observed near the maximum of the PDF.
[Figures: original PDF, sampled data, and the density estimate (fig:image3)]
The KDE estimates the PDF by a superposition of kernels, one centered at each data point, which is equivalent to convolving the data points with a Gaussian kernel.
Mean Shift
Mean shift is a tool for finding the modes in a set of data samples drawn from an underlying PDF. The aim of the mean shift algorithm is to find the densest region in a given set of data samples; the density of data points implies the PDF value.
Let us consider a 2D region. The points in the 2D region are sampled from an underlying unknown PDF.
Let $X=[x,y]$ be the random variable associated with a multivariate PDF $P(X)$.
Sampling from $P(X)$ gives us a vector $X'=[x',y']$.
For example, let us consider a multivariate Gaussian distribution where the random variables x and y take values in the range -3 to 3.
Modes of a Smooth Function
Let us say we want to find the modes of the PDF, which is approximated using kernel density estimation. Modes are the points at which the PDF exhibits a local maximum; dense regions in the data correspond to modes, or local maxima.
Since the kernel is smooth it is differentiable, and it gives a smooth PDF. The density estimate and its gradient are given by
\begin{eqnarray*}
\hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)\\
\nabla \hat{f}_h(x) = \frac{1}{nh}\sum_{i=1}^n \nabla K\Big(\frac{x-x_i}{h}\Big) \\
\text{for a Gaussian kernel} \\
\nabla \hat{f}_h(x) = \frac{C}{nh}\sum_{i=1}^n \nabla \exp\Big(\frac{-(x-x_i)^2}{2h^2}\Big) \\
\nabla \hat{f}_h(x) = \frac{C}{nh^3}\sum_{i=1}^n \exp\Big(\frac{-(x-x_i)^2}{2h^2}\Big)(x_i-x) \\
\nabla \hat{f}_h(x) = \frac{1}{nh^3}\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)(x_i-x) \\
\text{equating the gradient to 0} \\
\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)(x-x_i)=0 \\
\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)\,x = \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)x_i\\
x'=\frac{\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)x_i}{\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)}
\end{eqnarray*}
The estimate $x'=m(x)$ is called the sample mean at x with kernel K.
It is always biased towards the region of higher density.
Thus if we move along the vector $m(x)-x$, we reach a region of higher density; the density at $m(x)$ will be greater than the density at $x$.
This forms the basis of the mean shift algorithm.
The vector $m(x)-x$ is called the mean shift vector; it always points in the direction of the local maximum, or mode.
\begin{eqnarray*}
m(x)-x = \frac{\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)(x_i-x)}{\sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)} \\
m(x)-x = \frac{h^2\,\nabla \hat{f}_h(x)}{\hat{f}_h(x)}
\end{eqnarray*}
This is an estimate of the normalized gradient of $\hat{f}_h(x)$.
Given any point x, we can take a small step in the direction of the vector $m(x)-x$ to reach the local maximum.
Let the present estimate of the mode be $x$; we then compute $m(x)$ at this point.
For example, let the initial estimate of the location of the mode be $(-0.96,-2.01)$.
The density at this point can be approximated by interpolation or computed again using non-parametric density estimation.
The plot for this is shown in fig:image5; the estimate clearly does not lie at the maximum.
To find the direction of the mean shift vector we compute the gradient of the normalized density estimate and take a small step in that direction. This is performed until the gradient magnitude is very small.
[Figure: mean shift iterations (fig:image5)]
A video of the mean shift algorithm using KDE is shown below.
In this case we scale the gradient by the estimated PDF value to obtain the normalized gradient.
\begin{eqnarray*}
m(x)-x = \frac{h^2\,\nabla \hat{f}_h(x)}{\hat{f}_h(x)}
\end{eqnarray*}
This enables us to adaptively change the step size based on the estimated PDF value; the step size magnitude is inversely proportional to the estimated PDF value.
If the estimated PDF value is small, we are far away from the maximum and the step size will be large.
If the estimated PDF value is large, we are close to the maximum and the step size will be small.
Using the Gradient
To find the modes of the PDF, we do not actually need to estimate the PDF itself; we only require the gradient of the PDF in order to move in the direction of the mean shift vector.
The gradient of a superposition of kernels centered at each data point is equivalent to convolving the data points with the gradient of the kernel.
Thus, instead of convolving with the Gaussian kernel, we can convolve with the gradient of the Gaussian kernel.
\begin{eqnarray*}
k(X) = C \exp\Big(-\frac{x^2+y^2}{2h}\Big) \\
\nabla k(X) = \Big[-\frac{x}{h}\,k(X) \;;\; -\frac{y}{h}\,k(X) \Big]
\end{eqnarray*}
Thus, given an initial point $X$, we estimate the gradient at $X$ using the kernels $-\frac{x}{h}\,k(X)$ and $-\frac{y}{h}\,k(X)$, which gives us the direction of the gradient at the point $X$.
Since we do not actually estimate the PDF at a point, but only the gradient of the PDF, each mean shift iteration takes a step in the direction of the mean shift vector. In the earlier case we scaled the gradient by the estimated PDF value to obtain a normalized measure; in the present case we do not adaptively change the step size, but take a step of fixed size in the direction of the gradient.
This still incorporates some adaptive behavior, since the mean shift vector magnitude
depends on the gradient magnitude.
If the gradient magnitude is large, the step taken will be large; otherwise
the step taken will be small and refined, near the maximum.
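A minimal 2-D sketch of this fixed-step variant, using only the superposed gradient-of-Gaussian kernels and never forming the density itself (illustrative; not the repository's kgde2x code, and all names are hypothetical):

```python
import numpy as np

def grad_ascent_mode(x0, samples, h=0.5, step=1.0, max_iter=5000, tol=1e-5):
    """Fixed-step ascent on a 2-D Gaussian KDE using only its gradient.

    The gradient is a superposition of gradient-of-Gaussian kernels
    centered at each sample; the density itself is never evaluated.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = samples - x                                     # (n, 2) offsets
        k = np.exp(-np.sum(d * d, axis=1) / (2.0 * h * h))  # Gaussian kernel
        grad = np.mean(d * k[:, None], axis=0) / (h * h)    # prop. to grad of KDE
        if np.linalg.norm(grad) < tol:                      # near-flat: stop
            break
        x = x + step * grad                                 # fixed step size
    return x

# 2-D samples clustered near the origin; ascent from (1, 1) reaches ~(0, 0).
rng = np.random.default_rng(1)
pts = rng.normal(0.0, 1.0, size=(2000, 2))
mode = grad_ascent_mode([1.0, 1.0], pts)
```

Although the step factor is fixed, the actual displacement per iteration shrinks with the gradient magnitude, giving the refined behavior near the maximum described above.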
A video of the mean shift algorithm using gradient estimates is shown in
This iterative algorithm is a standard gradient ascent algorithm,
and convergence is guaranteed for an infinitesimally small step size.
Since the algorithm depends on the kernel density estimate,
the bandwidth of the kernel plays an important role in the mean shift
algorithm as well.
Local Maxima
If we reach a region where the local density is flat, or we have
reached a local maximum, the algorithm will terminate.
This is a problem for all algorithms trying to reach a global
maximum. The animation for the same is shown in
The code is written in MATLAB and
available in the repository https://github.com/pi19404/m19404/tree/master/meanshift
The file mean_shift.m is the main file. The file kgde2 implements a kernel density estimator
using bivariate Gaussian windows for 2-D distributions. The file kgde2x implements
estimation of the gradient of the KDE; the dim parameter selects computation of the gradient
along the x or y direction.
1-D Kernel Density Estimation For Image Processing
Introduction
In this article we will look at the basic methods for Kernel Density Estimation.
Non Parametric Methods
The idea of the non-parametric approach is to avoid restrictive assumptions about the
form of $f(x)$ and to estimate it directly from the data, rather than assuming some
parametric form for the distribution, e.g. Gaussian, exponential, mixture of Gaussians, etc.
Kernel Density Estimation
Kernel density estimation (KDE) is a non-parametric way to estimate the probability density function,
where the estimate of the population PDF is computed from a finite data sample.
A general expression for non-parametric density estimation is
\[
p(x) = \frac{k}{NV}
\]
where $k$ is the number of examples inside $V$,
$V$ is the volume surrounding $x$, and
$N$ is the total number of examples.
Histograms are the simplest form of non-parametric method to estimate the PDF.
To construct a histogram, we divide the interval covered by the data values into equal sub-intervals, known as bins.
Every time a data value falls into a particular sub-interval/bin, the count associated with that bin is incremented by 1.
For a histogram, $V$ can be defined as $W \times H$, where $W$ is the bin width and $H$ is unbounded.
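As a quick sketch of $p(x) = \frac{k}{NV}$ with NumPy (illustrative values; the variable names are our own):

```python
import numpy as np

# Histogram as a density estimate: p(x) = k / (N * V), where k is the count
# in the bin containing x, N the number of samples, and V the bin width.
rng = np.random.default_rng(2)
data = rng.uniform(0.0, 1.0, 1000)            # N = 1000 samples of U(0, 1)
counts, edges = np.histogram(data, bins=10)   # k per bin; V = 0.1 per bin
density = counts / (data.size * np.diff(edges))
# density integrates to 1 and hovers around the true value p(x) = 1
```

Dividing the raw counts by $NV$ turns the histogram into a proper density estimate that integrates to one.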
Figure fig:image1: Object model. Panels: original image; Hue histogram with bin width 6; bin width 1.
In figure fig:image1 the hue histogram of a rectangular region of the image is shown.
Histograms are described by their bin width and range of values.
In the above, the range of Hue values is $0-180$ and the number of bins is 30.
We can see that histograms are discontinuous; this may not necessarily be due
to discontinuity of the underlying PDF, but due to discretization into bins,
and inaccuracies may also exist in the histogram due to binning.
Histograms are not smooth and depend on the endpoints and width of the bins,
as can be seen in figure fig:image1 b.
Typically the estimate becomes better as we increase the number of points
and shrink the bin width; this is true of non-parametric
estimation in general, as seen in figure fig:image1 c.
In practice the number of samples is finite, so we do not observe
samples for all possible values. In such a case, if the bin width is small, we may find
that a bin does not enclose any samples and the estimate will exhibit large discontinuities.
For a histogram, we group adjacent sample values into a bin.
Kernel Density Estimation
Kernel density estimation provides another method to arrive at an estimate of the PDF
for small sample sizes. The density of samples about a given point is proportional to its probability.
It approximates the probability density by estimating the local density of points,
as seen in figure fig:image3.
Parzen-window density estimation is essentially a data-interpolation technique
and provides a general framework for kernel density estimation.
Given an instance of the random sample, ${\bf x}$, Parzen windowing estimates the PDF $P(X)$ from which the sample was derived.
It essentially superposes kernel functions placed at each observation, so that each observation
$x_i$ contributes to the PDF estimate.
Suppose that we want to estimate the value of the PDF $P(X)$ at point $x$.
Then we can place a window function at $x$ and determine how many observations $x_i$ fall within our window or,
rather, what the contribution of each observation $x_i$ to this window is.
The PDF value $P(x)$ is then the sum total of the contributions of the observations to this window.
Let $(x_1,x_2,\ldots, x_n)$ be an iid sample drawn from some distribution with an unknown density $f$.
We are interested in estimating $f$. Its Parzen-window estimate is defined
as
\[
\hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) \quad = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x-x_i}{h}\Big)
\]
where $K$ is called the kernel, $h$ is called its bandwidth, and $K_h$ is called a scaled kernel.
Kernel density estimates are related to histograms, but can possess properties
such as smoothness and continuity by using a suitable kernel.
Commonly used kernel functions are the uniform, Gaussian, Epanechnikov, etc.
Superposition of kernels centered at each data point is equivalent to convolving the data points
with the kernel; we are smoothing the histogram by performing convolution with a kernel.
Different kernels will produce different effects.
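As a sketch, the Parzen-window formula above with a Gaussian kernel can be implemented in a few lines (illustrative; not the repository code, and the function name is our own):

```python
import numpy as np

def kde(x, samples, h):
    """Parzen-window estimate f_h(x) = (1/(n h)) sum_i K((x - x_i) / h),
    here with the Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 pi)."""
    x = np.atleast_1d(x).astype(float)
    u = (x[:, None] - samples[None, :]) / h          # scaled distances
    K = np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)  # kernel responses
    return K.sum(axis=1) / (samples.size * h)        # average, scaled by 1/h

rng = np.random.default_rng(3)
data = rng.normal(0.0, 1.0, 5000)
grid = np.linspace(-3.0, 3.0, 61)
f = kde(grid, data, h=0.3)   # smooth estimate peaking near 0
```

Each sample contributes one kernel bump; summing and normalizing by $nh$ yields a smooth, continuous estimate wherever the kernel is smooth.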
Rectangular windows
For the univariate case, the rectangular window encloses $k$ examples in a region of half-width $h$
centered about $x$ on the histogram.
To find the number of examples that fall within this region, the kernel function is defined as
\begin{equation*}
k(x) = \begin{cases}
1 & |x| \le h,\\
0 & \mbox{otherwise}
\end{cases}
\end{equation*}
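A minimal sketch of this rectangular-window count, assuming 1-D data (the function name is hypothetical):

```python
import numpy as np

def rect_kde(x, samples, h):
    """Rectangular-window estimate at x: the fraction of samples with
    |x - x_i| <= h, divided by the window volume V = 2h."""
    k = np.count_nonzero(np.abs(samples - x) <= h)  # samples inside window
    return k / (samples.size * 2.0 * h)             # p(x) = k / (N * V)

rng = np.random.default_rng(4)
data = rng.uniform(0.0, 1.0, 10000)
est = rect_kde(0.5, data, h=0.1)   # true density of U(0, 1) is 1
```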
Let the total number of bins of the histogram be 180, so that the bin width is 1. Let us apply a window function with bandwidths
6, 12, 18, etc. and observe the effect on the histogram.
The kernel density estimate
using parzen window of bandwidth 6,12 and 18 are shown in figure fig:image2.
The kernel function for the gaussian window is defined as
\begin{eqnarray*}
k(x) = C\exp\Big(-\frac{x^2}{2\sigma^2}\Big)
\end{eqnarray*}
Instead of a Parzen rectangular window, let us apply a Gaussian window
of width 6, 12, and 18 and observe the effects on the histogram.
Figure fig:image4: Gaussian window, bandwidths 6, 12, and 18.
It can be seen that the estimate of the PDF is smooth; however, the bandwidth plays an important role
in the estimated PDF. A small bandwidth of 6 estimates a bimodal PDF with peaks well separated.
A bandwidth of 12 is still bimodal, but the peaks are no longer well separated.
A larger bandwidth of 18 estimates a unimodal PDF.
The bandwidth of the kernel is a free parameter which exhibits a strong influence on the
estimate of the PDF. Selecting the bandwidth is a tradeoff between accuracy and generality.
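The convolution view of kernel smoothing, and the bandwidth effect just described, can be sketched as follows (synthetic bimodal counts; the function name is hypothetical):

```python
import numpy as np

def smooth_hist(counts, sigma):
    """Smooth a histogram by convolving with a normalized Gaussian kernel;
    equivalent to superposing a Gaussian window at every sample."""
    t = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    g = np.exp(-t * t / (2.0 * sigma * sigma))
    g /= g.sum()                                   # keep total mass unchanged
    return np.convolve(counts, g, mode="same")

# Synthetic bimodal counts over 180 bins (peaks at 80 and 100):
bins = np.arange(180)
counts = np.exp(-(bins - 80) ** 2 / 50.0) + np.exp(-(bins - 100) ** 2 / 50.0)
narrow = smooth_hist(counts, sigma=6)    # small bandwidth: still bimodal
wide = smooth_hist(counts, sigma=18)     # large bandwidth: peaks merge
```

With the small bandwidth the two modes survive; with the large bandwidth they are blurred into a single mode, mirroring the behavior seen in figure fig:image4.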
Histogram Comparison
In this section we will look at techniques for histogram comparison.
In many applications a histogram is used as an object model. Thus, to compare
an unknown object with a known object, we can compute the similarity between their
histograms.
The histogram of an image shows the frequency distribution of the different pixel intensities.
The value of the histogram for pixel level/intensity $g$ is the total
count of pixels in the image with pixel intensity $g$. Let this be denoted
by $h_a(g)$:
\[
h_a(g) = \sum_{i,j \in x} \delta(g-A(i,j))
\]
\subsection{Minkowski norm}
The histograms can be viewed as vectors and similarity calculated using the
Minkowski norm $(L_p)$.
The Manhattan distance $(p=1)$ and Euclidean distance $(p=2)$ are special cases of the Minkowski norm.
\begin{eqnarray*}
d_{L_1}(h_a,h_b) = \sum_{g \in G} | h_a(g) - h_b(g) | \\
d_{L_2}(h_a,h_b) = \sum_{g \in G} | h_a(g) - h_b(g) |^2
\end{eqnarray*}
A high value indicates a mismatch while a low value indicates a better match; 0 indicates a perfect
match, while for normalized histograms a total mismatch gives the maximum value of 2.
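For two small normalized histograms (illustrative values), the two norms can be computed directly:

```python
import numpy as np

h_a = np.array([0.2, 0.5, 0.3])   # normalized 3-bin histograms
h_b = np.array([0.3, 0.4, 0.3])

d_l1 = float(np.sum(np.abs(h_a - h_b)))   # Manhattan (L1) distance
d_l2 = float(np.sum((h_a - h_b) ** 2))    # squared Euclidean (L2) distance
```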
\subsection{Histogram Intersection}
For 2 histograms $h_a$ and $h_b$ the histogram intersection is defined as
\begin{eqnarray*}
d_I(h_a,h_b) = \sum_{g \in G} min(h_a(g),h_b(g))
\end{eqnarray*}
A high score indicates a good match while a low score indicates a bad match. If both histograms are normalized,
then 1 is a perfect match while 0 indicates a total mismatch, i.e. no regions of the histograms overlap.
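A sketch of histogram intersection on the same kind of toy normalized histograms (illustrative values):

```python
import numpy as np

h_a = np.array([0.2, 0.5, 0.3])
h_b = np.array([0.3, 0.4, 0.3])

# Bin-wise minimum, summed: the shared mass of the two histograms.
d_i = float(np.sum(np.minimum(h_a, h_b)))
d_self = float(np.sum(np.minimum(h_a, h_a)))  # self-intersection: all mass
```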
\subsection{Correlation}
For two histograms $h_a$ and $h_b$, where the histograms are simply considered as
discrete 1-D signals, the histogram correlation is defined as
\begin{eqnarray*}
d_C(h_a,h_b) = \frac{\sum_{g \in G} (h_a(g)-\bar{h_a})(h_b(g)-\bar{h_b})}{\sqrt{ \sum_{g \in G} (h_a(g)-\bar{h_a})^2 \sum_{g \in G} (h_b(g)-\bar{h_b})^2}}
\end{eqnarray*}
where
\[ \bar{h_k} = \frac{1}{N} \sum_j h_k(j) \]
and $N$ is the total number of bins.
If the histogram is normalized ,then
\[
\sum_j h_k(j) = 1 ,\quad \bar{h_k} = \frac{1}{N}
\]
A value of 1 represents a perfect match, $-1$ indicates an ideal mismatch,
while 0 indicates no correlation.
For correlation, a high score represents a better match than a low score.
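A sketch of the correlation measure (illustrative; the function name is our own):

```python
import numpy as np

def correlation(h_a, h_b):
    """Pearson correlation of two histograms viewed as 1-D signals."""
    a = h_a - h_a.mean()   # subtract the mean bin value
    b = h_b - h_b.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))

# Identical shapes correlate at 1; a reversed shape correlates at -1.
d_same = correlation(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0]))
d_flip = correlation(np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0]))
```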
\subsection{Chi-Square}
For two histograms $h_a$ and $h_b$, the histogram similarity using the Chi-Square method is defined as
\[
d(h_a,h_b) = \sum_{g\in G} \frac{\left(h_a(g)-h_b(g)\right)^2}{h_a(g)}
\]
A low score represents a better match than a high score; 0 is a perfect match while
a mismatch is unbounded.
The chi-square test tests the null hypothesis,
which states that there is no significant difference between the expected and observed results.
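A sketch of the chi-square distance, with a small `eps` guard added for empty bins (the guard is our own addition, not part of the formula above):

```python
import numpy as np

def chi_square(h_a, h_b, eps=1e-12):
    """Chi-square histogram distance; eps guards against empty bins in h_a."""
    return float(np.sum((h_a - h_b) ** 2 / (h_a + eps)))

h_a = np.array([0.2, 0.5, 0.3])
h_b = np.array([0.3, 0.4, 0.3])
d = chi_square(h_a, h_b)        # low values indicate a better match
d_same = chi_square(h_a, h_a)   # identical histograms give 0
```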
\subsection{Bhattacharyya Distance}
In statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions.
It is closely related to the Bhattacharyya coefficient which is a measure of the amount of overlap between two statistical samples or populations
\[
d(h_a,h_b) = \sqrt{1 - \frac{1}{\sqrt{\bar{h_a} \bar{h_b} N^2}} \sum_{g\in G} \sqrt{h_a(g) \cdot h_b(g)}}
\]
For Bhattacharyya matching, low scores indicate good matches
and high scores indicate bad matches. A value of 0 is a perfect match while 1 indicates
a perfect mismatch.
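A NumPy sketch following the normalized form given above (illustrative; not OpenCV's implementation, and the function name is our own):

```python
import numpy as np

def bhattacharyya(h_a, h_b):
    """Bhattacharyya distance for histograms, following
    d = sqrt(1 - sum(sqrt(h_a * h_b)) / sqrt(mean(h_a) * mean(h_b) * N^2))."""
    N = h_a.size
    coef = np.sum(np.sqrt(h_a * h_b)) / np.sqrt(h_a.mean() * h_b.mean() * N * N)
    return float(np.sqrt(max(0.0, 1.0 - coef)))  # clamp round-off below 0

h_a = np.array([0.2, 0.5, 0.3])
h_b = np.array([0.3, 0.4, 0.3])
d_same = bhattacharyya(h_a, h_a)                              # perfect match
d_disj = bhattacharyya(np.array([1.0, 0.0]), np.array([0.0, 1.0]))  # mismatch
```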
Multi-channel images
For histograms with independent channels, the similarity scores can be computed
for each channel separately and added.
Implementation
The OpenCV libraries are used to perform histogram-related operations. All the histogram-related
functions are encapsulated in a class named histogram.
The code is present in the files histogram.cpp and histogram.hpp,
which are wrapper classes for calling OpenCV histogram functions and performing
comparison operations.
Let us consider the Hue histograms of the following images; below are the similarity scores
for each image pair using the Bhattacharyya distance, chi-square, intersection, and correlation methods.

Image pair   Bhattacharyya   Chi-square   Intersection   Correlation
1-2          0.5912          12.221       0.2798         0.2342
1-3          0.9841          3209.8       0.0027         -0.0775
1-4          0.7662          288.80       0.1975         0.0814
We can see that images 1 and 2 have similar histograms: the correlation and intersection scores
for image pair 1-2 are greater than for 1-3, indicating the similarity of histograms 1-2 in comparison
to 1-3.
The chi-square mismatch is unbounded, as can be seen from the high values; a much larger value
is observed for 1-3 compared to 1-2.
The Bhattacharyya distance is close to 0 for a better match and nearly 1 for a mismatch;
for 1-3 the value is almost 1, indicating an ideal mismatch.
Image pairs 1-2 and 1-4 are both similar, but the coefficients indicate that the histograms
of 1-2 are more similar than those of 1-4.
Histogram Comparison
These similarity measures are crude at best; they may be useful for a
discriminative task, but for an identification task they appear to be
poor measures.
Color Constancy Algorithms : Minkowski P Norm Method
Color Constancy
Color constancy is the detection of color independent of the light source.
The light source may introduce color casts in captured digital images.
A standard method to solve the color constancy problem is to estimate the color of the prevailing light
and then, in a second stage, remove it.
Once the color of the light in the individual channels is obtained, each color pixel is normalized by a scaling factor.
In the previous article we looked at the gray world algorithm. The present article describes another method
to achieve color constancy, using the normalized Minkowski p-norm.
Normalized Minkowski p-norm
Another way to estimate the illumination vector is to calculate the normalized Minkowski p-norm
for each color channel: for each channel, the Minkowski p-norm is computed and the normalized result
forms the estimated illumination vector.
\[ e_i = \left( \frac{1}{N} \sum_k p_k^p \right)^{1/p} \]
where the summation is over all the pixels of channel $i$.
The gray world algorithm is obtained by setting $p=1$.
The shades-of-gray method is given by Finlayson and Trezzi, who concluded that using the
Minkowski norm with $p = 6$ gave the best estimation results on their data set.
One method of normalization uses the mean of the three components as the illumination estimate of the image.
To normalize channel $i$ of the image, the pixel values are scaled by
$s_i = \frac{avg}{avg_i} $
where $avg_i$ is the channel estimate
and $avg$ is the mean illumination estimate.
Another method of normalization normalizes to the maximum channel, scaling by
\[ r_i = \frac{max(avg_R,avg_G,avg_B)}{avg_i} \]
A third method normalizes by the norm $m_i$, scaling by
\[ m_i = \sqrt{avg_R \cdot avg_R + avg_G \cdot avg_G + avg_B \cdot avg_B} \]
\[ r_i = \frac{max(m_R,m_G,m_B)}{m_i} \]
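A minimal NumPy sketch of the Minkowski p-norm estimate with normalization method 1, i.e. scaling by the mean of the channel estimates (an illustrative sketch, not the article's color_constancy.cpp; all names are hypothetical):

```python
import numpy as np

def minkowski_illuminant(img, p=6):
    """Per-channel illumination estimate e_c = (mean(img_c^p))^(1/p);
    p = 1 reduces to the gray world estimate."""
    flat = img.reshape(-1, img.shape[-1]).astype(float)
    return np.mean(flat ** p, axis=0) ** (1.0 / p)

def correct(img, p=6):
    """Scale each channel so its estimate matches the mean of the three
    channel estimates (normalization method 1)."""
    e = minkowski_illuminant(img, p)
    return img * (e.mean() / e)   # per-channel von Kries-style scaling

# A gray scene under a color cast: after correction the three channels
# should be (nearly) identical again.
rng = np.random.default_rng(5)
gray = rng.uniform(0.2, 0.8, size=(64, 64, 1)) * np.ones(3)
cast = gray * np.array([1.4, 1.0, 0.7])   # simulated cast per channel
out = correct(cast, p=6)
```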
Below are some output images on which norm 6 was applied.
Example 2.a1: Minkowski norm 6 (original; normalization methods 1, 2, and 3).
For code refer to site http://www.github.com/pi19404/m19404/master/ColorConstancy/
The files are color_constancy.cpp and color_constancy.hpp.
The class applies color constancy using the Minkowski p-norm technique.
The norm factor is $p=1$ for the gray world algorithm and $p=6$ for the shades-of-gray algorithm.
The various normalization techniques can be selected by passing $m=(1,2,3)$.