Rayleigh: search images by multiple colors

Introduction

We present an open-source system for quickly searching large image collections by multiple colors given as a palette, or by color similarity to a query image. The system is running at multicolorsearch.com.

In brief

  • Images are represented as histograms over a fixed palette of colors in CIELab space.
  • Sets of colors used for searching are similarly represented as histograms.
  • Similarity of images to other images and to a query set of colors is formulated in terms of distance between histograms.
  • To make matching more robust to small differences in color, histograms are smoothed.
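The pipeline above can be sketched in a few lines of Python. The palette below is a hypothetical four-color toy (the real system uses a much larger palette covering CIELab space), and the Gaussian bandwidth `sigma` is an illustrative choice:

```python
import math

# Toy palette of CIELab colors (L, a, b) -- illustrative only.
PALETTE = [(50, 60, 40), (50, -60, 40), (70, 0, -50), (90, 0, 0)]

def nearest_index(lab, palette):
    """Index of the palette color closest (Euclidean) to `lab`."""
    return min(range(len(palette)),
               key=lambda i: math.dist(lab, palette[i]))

def histogram(pixels, palette=PALETTE):
    """Fraction of pixels assigned to each palette color."""
    counts = [0] * len(palette)
    for px in pixels:
        counts[nearest_index(px, palette)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def smooth(hist, palette=PALETTE, sigma=10.0):
    """Spread each bin's mass to perceptually nearby palette colors,
    making matching robust to small differences in color."""
    out = [0.0] * len(hist)
    for i, mass in enumerate(hist):
        weights = [math.exp(-math.dist(palette[i], palette[j]) ** 2
                            / (2 * sigma ** 2))
                   for j in range(len(palette))]
        z = sum(weights)
        for j, w in enumerate(weights):
            out[j] += mass * w / z
    return out
```

Both images and query palettes become histograms of this form, so one distance function serves both search modes.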

System components

  • The back end processes image URLs to extract color information to store in a searchable database.
  • The web server provides REST access to the back end.
  • The client provides the UI for selecting a color palette to search with, or an image, and requests data from the web server.

Documentation

In addition to this high-level presentation, the code is documented here.

A public Trello board contains current and archived tasks.

Screenshots

<img src="doc/images/black_yellow.png" width="640px" /><br /> Rayleigh in the search-by-palette view.


<img src="doc/images/image_similarity.png" width="640px" /><br /> Rayleigh in the search-by-image view.

Prior Art


The Content-Based Image Retrieval (CBIR) field has done much work on image similarity over the past couple of decades. My brief impression of the field is that color similarity was initially seen as a proxy for general visual similarity, not as an end in itself. More recent research focuses on closing the "semantic gap": retrieving images by the objects and actions (the nouns and verbs) they depict, rather than by low-level features.

I was able to find two implementations of multi-color searching, both closed-source.

There is an open-source project on top of Apache Lucene for general image retrieval, called Lire. I have not tried it.

Additionally, it was useful to see a simple open-source implementation of single-color search.

Perception of Color

All visual perception is highly complex. As countless optical illusions show,

  • we perceive objects of the same actual size in the image as being different sizes;
  • we see lines where there aren't any;
  • and most relevantly for this project, we see things that are the same color in the image as being different colors.

To get a machine to "see" the illusory contour of the triangle in the second example is an unsolved problem in practice. If we were developing a system to search images by shape, we would struggle to return the Kanizsa Triangle above as a result for "triangle."

So it is with color.

  • As artists have long understood, how we see a color depends on what other colors are around it.

  • As in the linked illusions above, the same color values in an image can be perceived as two different colors.

  • Or conversely, we may perceive an object as being the same color in different images where its actual color is different (for example, you perceive your face to be the same color in an outdoor photograph, a bathroom mirror, a fluorescent-lit office, and in the darkness of evening).

  • On top of all this mess, there is a layer of language: different languages delineate colors in slightly different ways (for example, Russian has two distinct words for "blue"), and this has been shown to actually affect color perception.

Representing Color

So the problem of searching images by color is hard, but we will still attempt it. How should we represent "color" in the search engine?

Human eye

In our eyes, there are two types of photoreceptive cells: rods and cones. Rods respond only to the intensity of light, not its color, and are far more numerous. Cones come in three distinct types ('S', 'M', 'L'), each responding most strongly to a specific wavelength of light. Our perception of color is derived from the response rates of the photoreceptors, as well as from our experience with colors and our expectations.

<img src="http://upload.wikimedia.org/wikipedia/commons/thumb/6/65/Cone-response.svg/500px-Cone-response.svg.png" width="320px" /><br /> Human retinal cell response curves. Image from Wikimedia Commons.

RGB color space

On your computer, color is represented as three values, representing intensity of three color channels: red, green, and blue. The pixels in the display are composed of these three basic lights. When all are fully on, the color is white; when all are off, the color is black. All of the millions of colors that a modern computer is able to display come from mixing the three intensities.

The RGB system can be thought of as describing a three dimensional space, with the Red, Green, and Blue dimensions. A point in that space, given by the three coordinates, is a color. We can begin to think of distances between colors in this way, as a distance in 3D space between two points.

<img src="http://upload.wikimedia.org/wikipedia/commons/thumb/8/83/RGB_Cube_Show_lowgamma_cutout_b.png/320px-RGB_Cube_Show_lowgamma_cutout_b.png" width="320px" /><br /> RGB Color Cube. Image from Wikimedia Commons.
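Treating RGB triples as points in this cube, a color distance is just the Euclidean distance between coordinates. A minimal sketch:

```python
import math

def rgb_distance(c1, c2):
    """Euclidean distance between two RGB colors, each an (r, g, b)
    triple in 0-255, treating the RGB cube as ordinary 3D space."""
    return math.dist(c1, c2)

# Identical colors are at distance zero; pure red and pure green sit
# at opposite corners of a face of the cube.
rgb_distance((10, 10, 10), (10, 10, 10))   # 0.0
rgb_distance((255, 0, 0), (0, 255, 0))     # ≈ 360.6
```

As discussed next, this distance is easy to compute but does not line up well with perceptual judgements, which is what motivates the move to other color spaces.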

HSV color space

An additive mixture of three distinct colors does not match our intuitive model of how color works: we can't easily visualize the effect of adding red, green, or blue to a color.

Additionally, the distances in RGB spaces do not match up to perceptual judgements. That is, a color may be quite far away from another one in terms of RGB coordinates, but humans will judge the two colors as quite similar. Or the other way: two colors may look very different but be close together in RGB space.

The rainbow is what we usually visualize when we think of color: hues of the visible spectrum from the almost-infrared to the almost-ultraviolet, roughly divided into fewer than a dozen color words. A given hue can be imagined as more vibrant than baseline---deep red, midnight blue---or as more pastel-like---pink lipstick, robin's egg blue.

The Hue-Saturation-Value color space is informed by this mental model, and strives to have one dimension corresponding to our intuitive notion of hue, and two dimensions which set the vibrancy and lightness. A point in HSV space is therefore more easily interpretable than a point in RGB space.

<img src="http://upload.wikimedia.org/wikipedia/commons/f/f1/HSV_cone.jpg" width="320px" /><br /> HSV Color Cone. Image from Wikimedia Commons.
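Python's standard library can convert between the two spaces; `colorsys` expects channels in the 0-1 range, so 8-bit values must be scaled first. A small sketch:

```python
import colorsys

def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue, saturation, value), each in 0-1.
    colorsys works in the 0-1 range, so channels are scaled first."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

# Pure red maps to hue 0.0 (the red end of the hue circle),
# fully saturated, at full value.
h, s, v = rgb_to_hsv(255, 0, 0)
```

The hue coordinate is what makes a point in this space interpretable: nudging `h` walks around the color wheel, while `s` and `v` control vibrancy and lightness independently.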

Perceptually uniform color spaces

Although it better matches our mental model of color, HSV space still suffers from a misalignment of perceptual judgements to 3D distance of colors.

The international standards body for color, the CIE, has formulated alternative color spaces with the explicit goal of making distances in the color space correspond to human judgements of color similarity. Actually, it couldn't decide between two: CIELab and CIELuv.

CIELab, roughly, is formed by an oval plane with two axes, a* and b*, which correspond to the "opponent" color pairs of Yellow-Blue and Red-Green. The third dimension of L*a*b* space is lightness, which is approximately self-describing.

As an aside, the "opponent" colors are so named because of the opponent process theory, which posits that color perception comes not from the absolute values but from the differences in activation rates of the three types of cones in the retina. The opponent colors have no in-between point: we can imagine a point between blue and red, but not between blue and yellow; between red and yellow, but not between red and green; and so on.

In the L*a*b* space, simple Euclidean distance between two colors, \(\sqrt{(\Delta L)^2 + (\Delta a)^2 + (\Delta b)^2}\), which corresponds to the intuitive notion of a distance between 3D points, is a good approximation to perceptual judgements of their difference.

<img src="http://www.couleur.org/spaces/Labspace.jpg" width="320px" /><br /> CIELab color space. Image from couleur.org.
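Concretely, this color difference (known as CIE76) is nothing more than Euclidean distance between L*a*b* triples; a minimal sketch:

```python
import math

def delta_e(lab1, lab2):
    """CIE76 color difference: plain Euclidean distance between two
    (L, a, b) triples in CIELab space."""
    return math.dist(lab1, lab2)

# Identical colors have zero difference; a small delta-E means the two
# colors are hard to tell apart, a large one means clearly different.
delta_e((50, 20, -10), (50, 20, -10))   # 0.0
```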

Representing Multiple Colors

At this point, we understand that we should use the CIELab space to represent colors as 3D points, and can treat distances between them as corresponding to perceptual judgements.

But how do we represent a whole image, composed of many colors?

Our solution is to introduce a canonical palette of colors: a small set of colors that approximately cover the color space. The color content of any image can then be represented as a histogram over the palette colors. To construct the histogram, for each color in the palette we find the percentage of pixels in the image that are nearest (in terms of Euclidean distance) to that color.

We can represent this information in a slightly different way, by showing the top colors present in the image in a type of "palette image," with the area of each color proportional to its weight in the histogram.
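Given a palette histogram, extracting the data for such a palette image is a simple ranking; a sketch using an illustrative palette of named colors:

```python
def top_colors(hist, palette, k=3):
    """Return the k palette colors with the largest histogram mass,
    paired with their proportions: the data behind a 'palette image'."""
    ranked = sorted(zip(palette, hist), key=lambda pair: pair[1],
                    reverse=True)
    return ranked[:k]
```

Each returned proportion then sets the area that the color occupies in the rendered palette image.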
