
Java Image Processing Cookbook

Rafael Santos

How do I compare two images to see if they are equal?

Introduction

Often people ask how they can compare images (usually meaning the contents of images) to see whether they are equal, hoping for a simple function or operator that returns true, or a value between 0 and 1 where 0 means different, 1 means equal, and intermediate values correspond to degrees of similarity. Many applications would benefit from this kind of high-level comparison, for example, detection of image copyright violations, face identification in crowds for security purposes, semantic image searching, visual navigation for robots, etc.

Unfortunately, comparison of image contents is not a simple task, and in most cases an ill-defined one.

For example, consider the images below. Most of them depict the same animal, so they could be considered equal for some purposes (e.g. while searching the web for mammals behind fences), or completely different for others (e.g. looking for squirrels facing up or large mammals behind fences).

Squirrel, saved (using Gimp) as a jpg with 85% quality.

Squirrel, saved (using Gimp) as a gif. The colors are quantized.

Squirrel, cropped and scaled (using Gimp) and saved as a jpg.

Squirrel, saved (using Gimp) as a jpg with 5% quality.

The same squirrel, other position.

A different animal behind a green fence.

Most people think about the contents of an image without realizing how complex its processing by the human visual system and brain is. Try an experiment: show the first image to different people and ask "what's in this image?" -- most will probably describe the animal (since it can be perceived as the subject of the image), but the descriptions will vary in completeness (number of relevant objects perceived in the image) and detail (amount of relevant information about each object). If not even humans can come up with a simple, concise description of what is in an image (so that we could compare descriptions of the contents), how can we expect a computer to do it with a simple algorithm?

The problem is that computers cannot "see" an image and "compare" it to another to determine whether they are equal, where "equal" refers to the images' contents. Granted, advances in image processing and computer vision may one day make this task feasible under controlled conditions, but not now, and definitely not with simple algorithms. Consider how many variables we have to deal with: each object in an image has a shape, size, orientation, color (maybe more than one!), texture, etc.; the same object may appear different between images, e.g. in a different position, scale, orientation or illumination condition, or a combination of those; it can be partially occluded. Should we consider these variations irrelevant for the comparison, or are they important? Should we assign different "weights" to the objects, so that similarity in some of them counts more than in others? Small objects may be missing from one of the images; should we ignore them? Is the "background" an object that should be considered in the comparison, and what, exactly, is the background?

Comparing images in regard to their contents

If you want to compare images semantically, so that some of the images in the example above will be considered equal (or very similar), you will face a hard task that cannot be solved with simple algorithms. You will probably have to process the images to extract a set of features that can be mapped to semantic objects, even if those features differ in some aspects (e.g. position, scale), and compare the objects' features instead of the pixel or image features.

You can see that image content comparison is not a simple task, but it can be made simpler if we reduce the problem to a more modest one -- instead of asking "are the objects in these images equal?" we could ask "are some of the regions in this image similar, in some aspect, to regions in the other image?". Now we can deal with the problem using some image processing (and artificial intelligence) techniques.

Some steps that could solve the problem are:

  1. If needed, pre-process the image (e.g. to enhance contrast, filter noise, etc.).
  2. Segment the image, a process in which the image is partitioned into regions whose pixels are similar to the other pixels in the same region and different from the pixels in other regions. This can be done using region-growing, mathematical morphology, clustering or classification algorithms. There are many algorithms for this; search for "image segmentation" and related keywords for more information.
  3. With the regions, create descriptors for them. Descriptors are calculated from the region and can include shape, area, perimeter, number of holes, general color of the region, texture, orientation, position, etc.
  4. If needed, re-segment the image, a process in which regions are merged if they can be considered as belonging to the same object. Note that this step may require some high-level knowledge of the objects and the task in general, seldom being fully automatic and often being task-dependent.
  5. If needed, filter the regions that seem relevant to the task in hand, eliminating small regions or regions which are deemed unrelated to the task (again this may require some knowledge about the task).
  6. Store the image's region descriptors for further processing. Repeat these steps for the other images.
  7. Use the descriptors for comparison of the contents of the images, using some of many algorithms for pattern matching, classification, clustering, artificial intelligence and data mining in general.
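As a toy illustration of the segmentation step above, the sketch below grows regions on a tiny grayscale array: a pixel joins a region when its value differs from the region's seed by at most a threshold. This is not part of the cookbook code; the class name, sample grid and threshold are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Minimal region-growing segmentation sketch (illustrative only). */
public class RegionGrowing {
  /**
   * Labels each pixel with a region id (1, 2, ...); a pixel joins a region
   * when its gray value differs from the region's seed by at most 'threshold'.
   */
  public static int[][] segment(int[][] gray, int threshold) {
    int h = gray.length, w = gray[0].length;
    int[][] label = new int[h][w]; // 0 = not yet labeled
    int next = 0;
    for (int y = 0; y < h; y++)
      for (int x = 0; x < w; x++)
        if (label[y][x] == 0) {
          next++;
          int seed = gray[y][x];
          // Flood-fill from the seed over the 4-connected neighborhood.
          Deque<int[]> stack = new ArrayDeque<>();
          stack.push(new int[] {y, x});
          label[y][x] = next;
          while (!stack.isEmpty()) {
            int[] p = stack.pop();
            int[][] n4 = {{p[0]-1,p[1]}, {p[0]+1,p[1]}, {p[0],p[1]-1}, {p[0],p[1]+1}};
            for (int[] q : n4)
              if (q[0] >= 0 && q[0] < h && q[1] >= 0 && q[1] < w
                  && label[q[0]][q[1]] == 0
                  && Math.abs(gray[q[0]][q[1]] - seed) <= threshold) {
                label[q[0]][q[1]] = next;
                stack.push(q);
              }
          }
        }
    return label;
  }

  /** Number of regions found (largest label used). */
  public static int countRegions(int[][] label) {
    int max = 0;
    for (int[] row : label) for (int v : row) max = Math.max(max, v);
    return max;
  }

  public static void main(String[] args) {
    // A tiny "image" with a dark left half and a bright right half.
    int[][] img = {
      {10, 12, 200, 201},
      {11, 13, 198, 202},
      {9, 10, 199, 200},
    };
    System.out.println("regions: " + countRegions(segment(img, 20))); // regions: 2
  }
}
```

Real segmentation algorithms are far more elaborate, but the essential idea of grouping neighboring pixels by a similarity criterion is the same.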

Note that the steps above are as generic as possible and do not ensure success for a particular task. Note also that there are many possible algorithms for some of the steps, each with many variations and parameters, so blind application of an off-the-shelf algorithm will probably lead to failure -- it is very important to understand what you are doing and why, so you can expect useful, consistent results.

Note also that the process the human brain uses to extract semantic objects is very, very complex and flexible, able to map not only the semantic objects but their general behaviour in the scene, and even their categories. It is very easy for us to point to the images that contain a "squirrel facing up" or a "small rodent hanging on a fence", but those tasks cannot be done with image processing algorithms alone.

A very simple algorithm to compare images for possible similarity without considering the high-level contents

In some applications a very superficial similarity measure between images can be useful. For example, consider the task of locating images approximately similar to one you have, without any guarantee whatsoever that they represent or contain the same objects -- something like a rough, passing similarity. An obvious application would be to locate all images on a disk that may be similar to a chosen one.

There are several possible ways to do this, but all of them require extracting some features from the image, comparing them with the features from other images, and calculating a similarity measure. Both the feature extraction and the similarity calculation algorithms can be as simple or as complex as we want.

As an example, let's choose a simple feature extraction method and similarity measure, both shown below:

Regions for similarity calculation.

The features for our test will be 25 RGB triples, corresponding to the average of the RGB values on the 25 regions marked in the figure on the left. The image will be normalized to 300x300 pixels. No texture or variance feature will be stored, only the color averages. Each region has 30x30 pixels.
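For readers who want to experiment without JAI, averaging the pixels of one region can be sketched with the standard BufferedImage API instead of the JAI iterator used in the application below. The class name, the window parameter and the clamping at the image borders are choices made for this illustration, not part of the cookbook code.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

/** Average color of a square window around a proportional point (sketch). */
public class RegionAverage {
  /**
   * Averages the RGB values in a (2*half)x(2*half) window centered at the
   * proportional coordinates (px, py), clamping samples to the image bounds.
   */
  public static Color averageAround(BufferedImage img, double px, double py, int half) {
    int cx = (int) (px * img.getWidth());
    int cy = (int) (py * img.getHeight());
    long r = 0, g = 0, b = 0, n = 0;
    for (int x = cx - half; x < cx + half; x++)
      for (int y = cy - half; y < cy + half; y++) {
        // Clamp to the image bounds so border regions are still averaged.
        int xx = Math.max(0, Math.min(img.getWidth() - 1, x));
        int yy = Math.max(0, Math.min(img.getHeight() - 1, y));
        int rgb = img.getRGB(xx, yy);
        r += (rgb >> 16) & 0xFF;
        g += (rgb >> 8) & 0xFF;
        b += rgb & 0xFF;
        n++;
      }
    return new Color((int) (r / n), (int) (g / n), (int) (b / n));
  }

  public static void main(String[] args) {
    // A uniform red 300x300 image: every window average must be pure red.
    BufferedImage img = new BufferedImage(300, 300, BufferedImage.TYPE_INT_RGB);
    for (int x = 0; x < 300; x++)
      for (int y = 0; y < 300; y++)
        img.setRGB(x, y, 0xFF0000);
    Color c = averageAround(img, 0.5, 0.5, 15);
    System.out.println(c.getRed() + "," + c.getGreen() + "," + c.getBlue()); // 255,0,0
  }
}
```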

Each image will then be represented by a 25x3 feature vector. To calculate the similarity measure between two images A and B, we take each of the 25 regions, calculate the Euclidean distance between the corresponding RGB averages, and accumulate those distances.

The distance from A to A will be, by definition, zero. The upper bound (the maximum possible distance between two images using this similarity measure) is 25*(Math.sqrt( (255-0)*(255-0) + (255-0)*(255-0) + (255-0)*(255-0) )), or a little over 11041.
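That upper bound is easy to check numerically. The sketch below (an illustration with an invented class name, not the cookbook code) computes the accumulated distance between two 5x5 signatures: identical signatures are at distance zero, and an all-black versus an all-white signature reaches 25 * sqrt(3 * 255^2), approximately 11041.8.

```java
import java.awt.Color;

/**
 * Signature distance as described in the text: the sum of 25 per-region
 * Euclidean distances between RGB averages (illustrative sketch).
 */
public class SignatureDistance {
  public static double distance(Color[][] a, Color[][] b) {
    double dist = 0;
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++) {
        int dr = a[x][y].getRed() - b[x][y].getRed();
        int dg = a[x][y].getGreen() - b[x][y].getGreen();
        int db = a[x][y].getBlue() - b[x][y].getBlue();
        dist += Math.sqrt(dr * dr + dg * dg + db * db);
      }
    return dist;
  }

  public static void main(String[] args) {
    Color[][] black = new Color[5][5], white = new Color[5][5];
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++) {
        black[x][y] = Color.BLACK;
        white[x][y] = Color.WHITE;
      }
    // Identical signatures are at distance zero; black versus white hits
    // the upper bound 25 * sqrt(3 * 255^2).
    System.out.println(distance(black, black)); // 0.0
    System.out.printf("%.1f%n", distance(black, white)); // 11041.8
  }
}
```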

This method was chosen because it is simple to understand and implement and can be easily modified by the reader. It combines color (spectral) information with spatial (position/distribution) information, and is expected to be more robust (i.e. tolerant to differences) than comparing pixel by pixel or comparing the average of the whole image. But, again, it is very simple and cannot be expected to perform well in all circumstances; it is shown only as an example.

To test the feature extraction and similarity measure we will use a set of 24 test images. Some of those images contain similar objects (wall, trees, sky), but we are not considering the meaning of the images, just patches of similar colors. The images come in different sizes.

The 16 images on the first two rows are photos of objects, while the last row contains images from the first two rows distorted in scale, color, position, etc.


The application NaiveSimilarityFinder, shown below, has a GUI which allows the user to select a reference image A. It then takes all images in the same directory, normalizes them to the same size (300x300), extracts their features, calculates their distances from the features of A, and shows them ordered from least to most distant.

NaiveSimilarityFinder.java
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package howto.compare;

import java.awt.BorderLayout;
import java.awt.Color;
import java.awt.Container;
import java.awt.Font;
import java.awt.GridLayout;
import java.awt.image.RenderedImage;
import java.awt.image.renderable.ParameterBlock;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;
import javax.media.jai.InterpolationNearest;
import javax.media.jai.JAI;
import javax.media.jai.iterator.RandomIter;
import javax.media.jai.iterator.RandomIterFactory;
import javax.swing.JFileChooser;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JOptionPane;
import javax.swing.JPanel;
import javax.swing.JScrollPane;

import com.sun.media.jai.widget.DisplayJAI;

/**
 * This class uses a very simple, naive similarity algorithm to compare an image
 * with all others in the same directory.
 */
public class NaiveSimilarityFinder extends JFrame
  {
  // The reference image "signature" (25 representative pixels, each in R,G,B).
  // We use instances of Color to make things simpler.
  private Color[][] signature;
  // The base size of the images.
  private static final int baseSize = 300;
  // Where are all the files?
  private static final String basePath =
    "/home/rafael/Pesquisa/ImageSimilarity";

 /*
  * The constructor, which creates the GUI and starts the image processing task.
  */
  public NaiveSimilarityFinder(File reference) throws IOException
    {
    // Create the GUI.
    super("Naive Similarity Finder");
    Container cp = getContentPane();
    cp.setLayout(new BorderLayout());
    // Put the reference, scaled, in the left part of the UI.
    RenderedImage ref = rescale(ImageIO.read(reference));
    cp.add(new DisplayJAI(ref), BorderLayout.WEST);
    // Calculate the signature vector for the reference.
    signature = calcSignature(ref);
    // Now we need a component to store X images in a stack, where X is the
    // number of images in the same directory as the original one.
    File[] others = getOtherImageFiles(reference);
    JPanel otherPanel = new JPanel(new GridLayout(others.length, 2));
    cp.add(new JScrollPane(otherPanel), BorderLayout.CENTER);
    // For each image, calculate its signature and its distance from the
    // reference signature.
    RenderedImage[] rothers = new RenderedImage[others.length];
    double[] distances = new double[others.length];
    for (int o = 0; o < others.length; o++)
      {
      rothers[o] = rescale(ImageIO.read(others[o]));
      distances[o] = calcDistance(rothers[o]);
      }
    // Sort those vectors *together*.
    for (int p1 = 0; p1 < others.length - 1; p1++)
      for (int p2 = p1 + 1; p2 < others.length; p2++)
        {
        if (distances[p1] > distances[p2])
          {
          double tempDist = distances[p1];
          distances[p1] = distances[p2];
          distances[p2] = tempDist;
          RenderedImage tempR = rothers[p1];
          rothers[p1] = rothers[p2];
          rothers[p2] = tempR;
          File tempF = others[p1];
          others[p1] = others[p2];
          others[p2] = tempF;
          }
        }
    // Add them to the UI.
    for (int o = 0; o < others.length; o++)
      {
      otherPanel.add(new DisplayJAI(rothers[o]));
      JLabel ldist = new JLabel("<html>" + others[o].getName() + "<br>"
          + String.format("% 13.3f", distances[o]) + "</html>");
      ldist.setFont(new Font(Font.MONOSPACED, Font.PLAIN, 36));
      otherPanel.add(ldist);
      }
    // More GUI details.
    pack();
    setVisible(true);
    setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
    }

 /*
  * This method rescales an image to 300x300 pixels using the JAI scale
  * operator.
  */
  private RenderedImage rescale(RenderedImage i)
    {
    float scaleW = ((float) baseSize) / i.getWidth();
    float scaleH = ((float) baseSize) / i.getHeight();
    // Scales the original image.
    ParameterBlock pb = new ParameterBlock();
    pb.addSource(i);
    pb.add(scaleW);
    pb.add(scaleH);
    pb.add(0.0F);
    pb.add(0.0F);
    pb.add(new InterpolationNearest());
    // Creates a new, scaled image to be used by the DisplayJAI component.
    return JAI.create("scale", pb);
    }

 /*
  * This method calculates and returns signature vectors for the input image.
  */
  private Color[][] calcSignature(RenderedImage i)
    {
    // Get memory for the signature.
    Color[][] sig = new Color[5][5];
    // For each of the 25 signature values average the pixels around it.
    // Note that the coordinates of the central pixel are in proportions.
    float[] prop = new float[]
      {1f / 10f, 3f / 10f, 5f / 10f, 7f / 10f, 9f / 10f};
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++)
        sig[x][y] = averageAround(i, prop[x], prop[y]);
    return sig;
    }

 /*
  * This method averages the pixel values around a central point and returns the
  * average as an instance of Color. The point coordinates are proportional to
  * the image.
  */
  private Color averageAround(RenderedImage i, double px, double py)
    {
    // Get an iterator for the image.
    RandomIter iterator = RandomIterFactory.create(i, null);
    // Get memory for a pixel and for the accumulator.
    double[] pixel = new double[3];
    double[] accum = new double[3];
    // The size of the sampling area.
    int sampleSize = 15;
    int numPixels = 0;
    // Sample the pixels.
    for (double x = px * baseSize - sampleSize; x < px * baseSize + sampleSize; x++)
      {
      for (double y = py * baseSize - sampleSize; y < py * baseSize + sampleSize; y++)
        {
        iterator.getPixel((int) x, (int) y, pixel);
        accum[0] += pixel[0];
        accum[1] += pixel[1];
        accum[2] += pixel[2];
        numPixels++;
        }
      }
    // Average the accumulated values.
    accum[0] /= numPixels;
    accum[1] /= numPixels;
    accum[2] /= numPixels;
    return new Color((int) accum[0], (int) accum[1], (int) accum[2]);
    }

 /*
  * This method calculates the distance between the signatures of an image and
  * the reference one. The signatures for the image passed as the parameter are
  * calculated inside the method.
  */
  private double calcDistance(RenderedImage other)
    {
    // Calculate the signature for that image.
    Color[][] sigOther = calcSignature(other);
    // There are several ways to calculate distances between two vectors;
    // we will calculate the sum of the distances between the RGB values of
    // pixels in the same positions.
    double dist = 0;
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++)
        {
        int r1 = signature[x][y].getRed();
        int g1 = signature[x][y].getGreen();
        int b1 = signature[x][y].getBlue();
        int r2 = sigOther[x][y].getRed();
        int g2 = sigOther[x][y].getGreen();
        int b2 = sigOther[x][y].getBlue();
        double tempDist = Math.sqrt((r1 - r2) * (r1 - r2) + (g1 - g2)
            * (g1 - g2) + (b1 - b2) * (b1 - b2));
        dist += tempDist;
        }
    return dist;
    }

 /*
  * This method gets all image files in the same directory as the reference.
  * Just for kicks it also includes the reference image itself.
  */
  private File[] getOtherImageFiles(File reference)
    {
    File dir = new File(reference.getParent());
    // List all the image files in that directory.
    File[] others = dir.listFiles(new JPEGImageFileFilter());
    return others;
    }

 /*
  * The entry point for the application, which opens a file with an image that
  * will be used as reference and starts the application.
  */
  public static void main(String[] args) throws IOException
    {
    JFileChooser fc = new JFileChooser(basePath);
    fc.setFileFilter(new JPEGImageFileFilter());
    int res = fc.showOpenDialog(null);
    // We have an image!
    if (res == JFileChooser.APPROVE_OPTION)
      {
      File file = fc.getSelectedFile();
      new NaiveSimilarityFinder(file);
      }
    // Oops!
    else
      {
      JOptionPane.showMessageDialog(null,
          "You must select one image to be the reference.", "Aborting...",
          JOptionPane.WARNING_MESSAGE);
      }
    }

  }
 

The application NaiveSimilarityFinder needs an auxiliary class, JPEGImageFileFilter, shown below.

JPEGImageFileFilter.java
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package howto.compare;

import java.io.File;

import javax.swing.filechooser.FileFilter;

/*
 * This class implements a generic file name filter that allows the listing/selection
 * of JPEG files.
 */
public class JPEGImageFileFilter extends FileFilter implements java.io.FileFilter
  {
  public boolean accept(File f)
    {
    if (f.getName().toLowerCase().endsWith(".jpeg")) return true;
    if (f.getName().toLowerCase().endsWith(".jpg")) return true;
    return false;
    }
  public String getDescription()
    {
    return "JPEG files";
    }

  }
 

Some test runs of the application with that image data set are shown below. The first image in each row is the reference image; the other images in the row are its best seven matches (excluding the match with the reference image itself), followed by its two worst matches.

Original, best 7 matches, worst 2 matches (distance values):

Naive similarity comparison results (1): 0.000, 1871.073, 1886.521, 1889.454, 1937.989, 2118.453, 2220.072, 4045.627, 4747.496
Naive similarity comparison results (2): 0.000, 107.113, 1727.322, 2038.248, 2193.063, 2298.502, 2309.045, 3845.437, 4613.132
Naive similarity comparison results (3): 0.000, 1581.717, 1749.935, 2002.253, 2038.248, 2403.926, 2475.314, 3841.409, 5091.601
Naive similarity comparison results (4): 0.000, 164.477, 1871.073, 1871.564, 1879.582, 2177.755, 2415.764, 4583.513, 5617.889
Naive similarity comparison results (5): 0.000, 247.391, 1724.108, 2084.940, 2541.069, 2723.918, 2780.514, 4527.742, 4661.249
Naive similarity comparison results (6): 0.000, 1041.201, 1315.953, 2067.587, 2070.584, 2086.411, 2118.453, 3765.444, 4096.909
Naive similarity comparison results (7): 0.000, 338.169, 2086.411, 2356.526, 2365.054, 2377.610, 2459.092, 4705.509, 5408.958

We can see that some results are as expected (e.g. the dissimilarity measure between two equal images being zero). The algorithm performed quite acceptably, assigning a relatively low measure to the distorted versions when the originals were used as references, as long as the distortion does not make the versions very different and the colors are not changed much.

This application could successfully help organize a large collection of digital photographs when there are many repeated shots, or when the photographer took many shots of the same subject with small adjustments to the camera (e.g. photos of groups, to make sure no one has blinked).

It should be noted that the application could be modified to calculate the distance from each image to all the others, which could then be used to create a dendrogram through a hierarchical clustering algorithm. The dendrogram could then be used to separate the images into groups that are similar considering the features and measure used.
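A minimal sketch of such a grouping, assuming the pairwise distances have already been computed into a matrix, could use single-linkage agglomerative clustering. The class name and the sample distance values below are invented for the illustration; a real dendrogram would also record the merge order and heights.

```java
import java.util.ArrayList;
import java.util.List;

/** Single-linkage agglomerative clustering over a precomputed distance matrix (sketch). */
public class SingleLinkage {
  /** Merges the closest pair of clusters repeatedly until 'k' clusters remain. */
  public static List<List<Integer>> cluster(double[][] dist, int k) {
    // Start with each item in its own cluster.
    List<List<Integer>> clusters = new ArrayList<>();
    for (int i = 0; i < dist.length; i++) {
      List<Integer> c = new ArrayList<>();
      c.add(i);
      clusters.add(c);
    }
    while (clusters.size() > k) {
      int bi = 0, bj = 1;
      double best = Double.MAX_VALUE;
      // Find the pair of clusters with the smallest single-linkage
      // (minimum pairwise) distance.
      for (int i = 0; i < clusters.size(); i++)
        for (int j = i + 1; j < clusters.size(); j++) {
          double d = Double.MAX_VALUE;
          for (int a : clusters.get(i))
            for (int b : clusters.get(j))
              d = Math.min(d, dist[a][b]);
          if (d < best) { best = d; bi = i; bj = j; }
        }
      // Merge the closest pair (bj > bi, so removing bj does not shift bi).
      clusters.get(bi).addAll(clusters.remove(bj));
    }
    return clusters;
  }

  public static void main(String[] args) {
    // Hypothetical distances between four images: 0-1 and 2-3 are close pairs.
    double[][] d = {
      {0, 100, 4000, 4100},
      {100, 0, 3900, 4200},
      {4000, 3900, 0, 150},
      {4100, 4200, 150, 0},
    };
    System.out.println(cluster(d, 2)); // [[0, 1], [2, 3]]
  }
}
```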

Many improvements could be made to this simple algorithm: filtering of the regions to reduce the influence of noise on the average values, different weights for different regions (e.g. central regions could be considered more important than those near the edges), a distance measure in a color space other than RGB, etc.
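For instance, region weighting could be sketched as below. The particular weight matrix (1 at the borders, 2 on the inner ring, 3 in the center) and the class name are arbitrary assumptions for the illustration, not a recommendation.

```java
import java.awt.Color;

/**
 * A center-weighted variant of the signature distance: each per-region
 * Euclidean distance is multiplied by a weight before being accumulated.
 */
public class WeightedDistance {
  // Border regions weight 1, inner ring 2, center 3 (assumed values).
  static final double[][] W = {
    {1, 1, 1, 1, 1},
    {1, 2, 2, 2, 1},
    {1, 2, 3, 2, 1},
    {1, 2, 2, 2, 1},
    {1, 1, 1, 1, 1},
  };

  public static double distance(Color[][] a, Color[][] b) {
    double dist = 0;
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++) {
        int dr = a[x][y].getRed() - b[x][y].getRed();
        int dg = a[x][y].getGreen() - b[x][y].getGreen();
        int db = a[x][y].getBlue() - b[x][y].getBlue();
        dist += W[x][y] * Math.sqrt(dr * dr + dg * dg + db * db);
      }
    return dist;
  }

  public static void main(String[] args) {
    Color[][] ref = new Color[5][5], other = new Color[5][5];
    for (int x = 0; x < 5; x++)
      for (int y = 0; y < 5; y++) {
        ref[x][y] = Color.BLACK;
        other[x][y] = Color.BLACK;
      }
    // Changing only the central region counts three times as much as
    // changing only a corner region.
    other[2][2] = Color.WHITE;
    double central = distance(ref, other);
    other[2][2] = Color.BLACK;
    other[0][0] = Color.WHITE;
    double corner = distance(ref, other);
    System.out.println(central / corner); // 3.0
  }
}
```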

References

This list is far from complete. For related information, see A Brief Tutorial on Supervised Image Classification.
