Bayer Decoding Techniques
Notes for IBM
Initial development tasks should target the WaspTerrapix camera subsystem.
We have several years of flight with this camera,
so we can provide you with as many samples of raw data as you
could possibly want.
I strongly recommend that the first round of work focus on the nearest neighbor downsample.
It's an excellent "first" Bayer decoder. It's simple,
easy to understand, easy to test, and easy to replace with other algorithms
later.
But, to get it working, all the other problems of structuring the
Cell implementation (parceling up the data, moving it onto the Cell,
pulling the results off the Cell) have to be completed. With this
skeleton in place, other CFA decoding techniques can be attacked.
At least, that's my $0.02...
Background
Start with BayerPattern.
It's very important to realize that, in effect,
Bayer encoding deletes 2/3 of the available
color information from every pixel.
Each photon well in the sensor still records a spatially different
location in the target scene, but each well is limited to only one
of the usual three color values due to the filter in front of the sensor.
The whole point of decoding Bayer CFA images is to somehow recover
the missing 2/3 of this information by analyzing surrounding pixels.
There's a certain amount of "fiction" introduced, then, by this
process; the goal is to introduce the "right" fiction so that it looks
pleasing and acceptable to the human eye.
Assume:
- A WaspTerrapix camera, which uses an RG/GB Bayer CFA
- Rows are indexed by i, counting from zero at the top
- Columns are indexed by j, counting from zero at the left
- Pixels are referred to as (i,j)
Thus:
- red values are available at (0,0), (0,2), (0,4), ... (0,4094), (2,0), (2,2), (2,4) ...
- green values are available at (0,1), (0,3), (0,5), ... (0,4095), (1,0), (1,2), (1,4), ... (1,4094), (2,1), (2,3) ...
- blue values are available at (1,1), (1,3), (1,5), ... (1,4095), (3,1), (3,3), (3,5) ...
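The layout above can be sketched in a few lines of Python. This is only an illustration of the RG/GB indexing, not part of any existing codebase; the names cfa_color and mosaic are made up for this example, and the image is assumed to be a plain list of rows of (R, G, B) tuples.

```python
def cfa_color(i, j):
    """Return 'R', 'G', or 'B' for the filter over pixel (i, j) in an RG/GB CFA."""
    if i % 2 == 0:
        # Even rows alternate red, green, red, green, ...
        return "R" if j % 2 == 0 else "G"
    # Odd rows alternate green, blue, green, blue, ...
    return "G" if j % 2 == 0 else "B"


def mosaic(rgb):
    """Simulate Bayer capture: keep only the filtered channel of each pixel.

    rgb is a list of rows, each row a list of (R, G, B) tuples
    (a hypothetical layout chosen for this sketch).
    """
    channel = {"R": 0, "G": 1, "B": 2}
    return [
        [px[channel[cfa_color(i, j)]] for j, px in enumerate(row)]
        for i, row in enumerate(rgb)
    ]
```

Running mosaic over a known full-color image is a cheap way to produce test input for a decoder, since the two discarded channels of every pixel are still on hand for comparison.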
Nearest Neighbor Downsample
This is probably the easiest and fastest decoding technique for
Bayer arrays.
But it sacrifices 3/4 of your physical resolution,
and that's why everyone hates it.
It produces images that are a quarter the size of the raw sensor data.
It's typically used in image preview applications; nobody in their right
mind wants to get quarter-scale imagery out of their spiffy megapixel sensor.
Each cluster of four source pixels is taken to provide a red, green, and blue value
for one result pixel.
This halves the resolution in each
direction, yielding a quarter sized image. The 4096 x 4096 detector in the
Terrapix yields a 2048 x 2048 image through this technique.
For every destination pixel D_{i,j}, using source pixels S_{r,c}:
R_{i,j} = S_{2i,2j}
G_{i,j} = S_{2i+1,2j}
B_{i,j} = S_{2i+1,2j+1}
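As a rough sketch, the three equations above translate directly into Python. The function name and the list-of-rows representation of the raw frame are assumptions made for this example; a Cell implementation would work on tiles of the frame rather than the whole thing at once.

```python
def nearest_neighbor_downsample(raw):
    """Decode an RG/GB Bayer mosaic into a half-width, half-height RGB image.

    raw is a list of rows of sample values with even dimensions
    (a hypothetical in-memory layout for this sketch).
    """
    rows, cols = len(raw), len(raw[0])
    if rows % 2 or cols % 2:
        raise ValueError("raw dimensions must be even")
    out = []
    for i in range(rows // 2):
        out_row = []
        for j in range(cols // 2):
            r = raw[2 * i][2 * j]          # R(i,j) = S(2i, 2j)
            g = raw[2 * i + 1][2 * j]      # G(i,j) = S(2i+1, 2j)
            b = raw[2 * i + 1][2 * j + 1]  # B(i,j) = S(2i+1, 2j+1)
            out_row.append((r, g, b))
        out.append(out_row)
    return out
```

Each output pixel depends only on its own 2x2 source cluster, so the frame can be cut into independent tiles of any even size.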
Note: There is a common minor optimization of this algorithm.
Instead of picking one of the two green values available in every 2x2 pixel set,
both values are averaged together and the result used as the green value in
the result pixel. This is a special case of bilinear interpolation (described
below) that can be performed at fairly low cost.
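A minimal sketch of the green-averaging variant, assuming the same list-of-rows raw frame as before; the function name is hypothetical. The only change from the plain downsample is that both green samples in each 2x2 cluster contribute.

```python
def downsample_avg_green(raw):
    """Quarter-size RG/GB decode that averages the two greens in each 2x2 cluster."""
    out = []
    for i in range(0, len(raw) - 1, 2):
        out_row = []
        for j in range(0, len(raw[i]) - 1, 2):
            r = raw[i][j]                              # red at (even, even)
            g = (raw[i][j + 1] + raw[i + 1][j]) / 2    # mean of both greens
            b = raw[i + 1][j + 1]                      # blue at (odd, odd)
            out_row.append((r, g, b))
        out.append(out_row)
    return out
```

The division produces a float here; an integer pipeline would need to pick a rounding rule, which these notes do not specify.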
Bilinear Interpolation
Bilinear interpolation is a way of generating the missing color components
by interpolating nearby instances of the desired color across the target pixel.
The source pixels used in these computations are those that are adjacent
to the target pixel. Thus, for each color being computed, either two or four
source pixels are used in the interpolation. Since all pixels of any color are
equidistant from each other and from the target pixel, these interpolations
become simple averages.
For each destination pixel, one color is known and two are unknown.
The equations below apply to an RG/GB CFA; other layouts follow by analogy.
Where an equation offers a choice of interpolation axis, make the same choice everywhere.
For pixels with a known red component,
R = P_{i,j}
G = ( P_{i,j-1} + P_{i,j+1} ) / 2  or  ( P_{i-1,j} + P_{i+1,j} ) / 2
B = ( P_{i-1,j-1} + P_{i+1,j+1} ) / 2  or  ( P_{i+1,j-1} + P_{i-1,j+1} ) / 2
For pixels with a known blue component,
R = ( P_{i-1,j-1} + P_{i+1,j+1} ) / 2  or  ( P_{i+1,j-1} + P_{i-1,j+1} ) / 2
G = ( P_{i,j-1} + P_{i,j+1} ) / 2  or  ( P_{i-1,j} + P_{i+1,j} ) / 2
B = P_{i,j}
For pixels with a known green component on odd rows (that is, rows also containing blue components),
R = ( P_{i-1,j} + P_{i+1,j} ) / 2
G = P_{i,j}
B = ( P_{i,j-1} + P_{i,j+1} ) / 2
And, for pixels with a known green component on even rows (that is, rows also containing red components),
R = ( P_{i,j-1} + P_{i,j+1} ) / 2
G = P_{i,j}
B = ( P_{i-1,j} + P_{i+1,j} ) / 2
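The four cases above can be pulled together into one hedged Python sketch. Two choices here are assumptions rather than anything these notes prescribe: greens at red and blue pixels are interpolated along the row, the opposite color along the (i-1,j-1), (i+1,j+1) diagonal; and samples outside the frame are mirrored (index -1 maps to 1, index n maps to n-2), which keeps each neighbor on the same filter color. The function name and list-of-rows layout are also illustrative.

```python
def bilinear_demosaic(raw):
    """Full-resolution bilinear decode of an RG/GB mosaic, one axis per case.

    raw is a list of rows of samples, at least 2x2.
    Assumption: out-of-frame neighbors are mirrored about the border pixel.
    """
    rows, cols = len(raw), len(raw[0])

    def p(i, j):
        # Mirror out-of-range indices; parity (and so CFA color) is preserved.
        i = -i if i < 0 else (2 * (rows - 1) - i if i >= rows else i)
        j = -j if j < 0 else (2 * (cols - 1) - j if j >= cols else j)
        return raw[i][j]

    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            horiz = (p(i, j - 1) + p(i, j + 1)) / 2
            vert = (p(i - 1, j) + p(i + 1, j)) / 2
            diag = (p(i - 1, j - 1) + p(i + 1, j + 1)) / 2
            if i % 2 == 0 and j % 2 == 0:      # known red
                rgb = (raw[i][j], horiz, diag)
            elif i % 2 == 1 and j % 2 == 1:    # known blue
                rgb = (diag, horiz, raw[i][j])
            elif i % 2 == 1:                   # green on a blue (odd) row
                rgb = (vert, raw[i][j], horiz)
            else:                              # green on a red (even) row
                rgb = (horiz, raw[i][j], vert)
            out_row.append(rgb)
        out.append(out_row)
    return out
```

A useful sanity check is a mosaic of a flat color: every decoded pixel should come back as exactly that color, borders included.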
This is easy, and is pretty much the "fuzziest" looking algorithm.
Also, when you look closely at a long vertical or horizontal
edge in the resulting image, you'll notice a tell-tale "checkerboarding"
artifact in the colors.
Whether you see these artifacts on vertical or on horizontal
edges is determined by which axis you interpolated along in the
equations.
Note: Some implementations average all four available pixels when
they are adjacent, rather than choosing a single axis to interpolate
along. This isn't necessarily better, and makes the resulting image
"too fuzzy" for general use. On the other hand, it does get rid
of some of the asymmetric edge processing, which tends to stand out
to the human eye. That is to say, edges within the image are
processed the same way whether they are vertical or horizontal.
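For comparison, a small sketch of the four-pixel averages at an interior red or blue pixel. The helper name is hypothetical and border handling is left out. At a red or blue pixel, the cross average is the green estimate and the diagonal average is the estimate of the opposite color (blue at a red pixel, red at a blue pixel). Green pixels have only two neighbors of each missing color, so nothing changes for them.

```python
def four_neighbor_averages(raw, i, j):
    """Return (cross, diagonal) four-pixel averages around interior pixel (i, j).

    Intended for red or blue sites of an RG/GB mosaic; raw is a list of rows.
    """
    cross = (raw[i][j - 1] + raw[i][j + 1] + raw[i - 1][j] + raw[i + 1][j]) / 4
    diagonal = (
        raw[i - 1][j - 1] + raw[i - 1][j + 1]
        + raw[i + 1][j - 1] + raw[i + 1][j + 1]
    ) / 4
    return cross, diagonal
```

Because both axes contribute equally, the result no longer depends on edge orientation, at the price of the extra smoothing described above.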
Bicubic Interpolation
VNG
ECW
AHD
ACC
Developed by Frank Holub,
http://www.my-spot.com/