Steerable pyramids decomposition (based AI studio)
The steerable filter can provide the image gradient in any orientation/direction.
Image pyramids
The image pyraminds is an efficient multi-scale representation of an image. The core idea is to decompose the original image, which is the base of the pyramid, at multiple resolution by performing a serious of down-sampling operations. This will generate a sequence of iamges with progressively lower resolution, forming a “pyramid” shape from btm to top.
Gaussian pyramid
This decomposition is used for “down sampling”, which is built by Gaussian blurring and downsampling at each level. Each level is an approximation of the previous level but with smaller dimensions and fewer details.
Algorithm steps:
- The bottom layer: $ G_0 = original\ image $
- Apply Gaussian blur: $ G_i^{\prime} = G_i \bigotimes \omega$, where $ \omega $ is the Gaussian kernel, preventing aliasing during downsampling.
- Downsample: (one possible operation) $ G_{i+1}(x,y) = G_i^{\prime}(2x,2y) $, obtain an image is $1/4$ of the original image.
- Repeat until the image becomes small enough, for example, reaching a minimum size of $ 4\times 4 $ pixels.
Laplacian pyramid
This is built from Gaussian pyramid, each Laplacian level is the difference between a Gaussian level and an upsampled version of the next level.
The image pyramid can be applied for image blending (图像融合), image compression, texture analysis, and object detection at various scales