Convolution PCA: Engineering independent intensity and texture features
2025-11-05, Reston ABC

Raster texture is an important attribute for sample design and for raster-based classification, regression, and clustering. We present an automated approach and companion Jupyter notebook that use principal component analysis and convolution kernels to quantify orthogonal intensity and texture metrics.


Remotely sensed raster datasets of landscapes are commonly used as predictor variables (features) in many geospatial analyses. The information contained within these datasets ranges from cell values related to the Earth’s surface (e.g., reflectance, temperature, and elevation) to patterns in cell values at coarser resolutions (texture) [1-5]. Both cell and texture values have been useful in sample design, classification, segmentation, and regression analyses [1-6], with a recent emphasis on how texture can improve modeling results [1,3].
Within remote sensing, texture is quantified using convolution-type analyses, which require specifying weights that are multiplied by the cell values within a defined spatial window and summed to attribute the center (focal) cell of that window [5]. While many common image filters (kernels) have been defined and are used to enhance, blur, and identify edges and patterns in surrounding cell values [5], the number of kernels that can be defined is infinite, making it difficult to know which kernels best quantify texture. Moreover, many cell and texture values are highly correlated and covary across bands and across neighboring cells within a neighborhood window, potentially making those features redundant and less desirable for modeling.
The classical approach to using texture in an analysis is to apply well-known filters to input raster surfaces and use the convolved outputs as features in a predictive model [7-9]. Alternatively, kernel weights can be optimally determined (learned) for a given task and applied to the underlying remotely sensed data [3, 10, 11]. However, neither approach addresses covariance among band and neighboring cell values, which can have adverse effects on modeling and can substantially increase the total amount of processing.
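As a minimal sketch of the weighted-window operation described above, the following NumPy example multiplies a kernel by each 3 x 3 window of a small single-band raster and sums the products for the focal cell. The function name `convolve_focal`, the toy raster, and the two kernels are illustrative assumptions, not part of the companion notebook.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def convolve_focal(band, kernel):
    """Weighted sum of every window, attributed to that window's focal cell.

    Weights are applied without flipping the kernel (cross-correlation),
    matching the multiply-and-sum description in the text. Edge cells that
    lack a full window are dropped, so the output is smaller than the input.
    """
    # windows has shape (rows - k + 1, cols - k + 1, k, k)
    windows = sliding_window_view(band, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


# Hypothetical 5 x 5 single-band raster with a smooth gradient
band = np.arange(25, dtype=float).reshape(5, 5)

# Two common kernels: a focal mean (blur) and a horizontal-gradient (edge) filter
mean_kernel = np.full((3, 3), 1.0 / 9.0)
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-2.0, 0.0, 2.0],
                        [-1.0, 0.0, 1.0]])

focal_mean = convolve_focal(band, mean_kernel)
focal_edge = convolve_focal(band, edge_kernel)
```

Any other set of weights with the same window shape can be passed in the same way, which is why the space of possible kernels is unbounded.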
One common approach to addressing covariance in continuous data is to project the data along shared axes of covariance and create independent features using principal component analysis (PCA) [12]. In many remote sensing projects, PCA has been used successfully to project multiband raster cell data onto orthogonal axes using component scores. Some of these projects have also reduced the dimensionality of the data while retaining the majority of the variation within the raster surface by selecting subsets of components that account for a known amount of variation in the data [13-16]. However, few remote sensing projects have addressed covariance in kernel cell values [9] and across image bands. Moreover, no studies have leveraged PCA component scores, derived from both band and neighboring cell values, to define kernel weights.
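The band-wise projection described above can be sketched with plain NumPy. The synthetic four-band raster, the deliberately correlated second band, and the 95% variance threshold are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-band raster; band 1 is nearly a copy of band 0
# to mimic the strong between-band correlation common in imagery.
bands = rng.normal(size=(4, 50, 50))
bands[1] = bands[0] + 0.1 * bands[1]

# One row per cell, one column per band
X = bands.reshape(4, -1).T
Xc = X - X.mean(axis=0)

# Eigen-decomposition of the band covariance matrix, largest variance first
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component scores are the cell values projected onto the orthogonal axes
scores = Xc @ eigvecs
explained = eigvals / eigvals.sum()

# Keep the smallest set of components that explains at least 95% of the variation
n_keep = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
component_rasters = scores[:, :n_keep].T.reshape(n_keep, 50, 50)
```

Because two of the bands are nearly redundant, fewer components than bands are needed to reach the threshold, and the retained component rasters are mutually uncorrelated.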
In this study, we evaluate the use of PCA to project multispectral imagery along orthogonal axes derived from both band and neighboring cell values. Our procedure automates the selection of optimal kernel weights for multidimensional convolution kernels based on principal component scores and the proportion of the variation (information in the data) explained by each component. To evaluate the utility of these components for modeling, we compare model fit and complexity for models derived from our components and models derived from common band and texture transformations. The spatial extent of our study includes portions of the Custer Gallatin National Forest located in southeastern Montana, USA.
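One possible reading of this procedure, offered only as a sketch and not as the authors' implementation, treats every band value in every window position as a variable, runs a PCA on those variables, and reshapes each retained eigenvector into a multiband kernel. The function name `pca_kernels`, its parameters, and the random test image are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def pca_kernels(image, size=3, var_kept=0.95):
    """Derive multiband kernels from a PCA over band and neighbor values.

    image: array of shape (n_bands, rows, cols).
    Returns (kernels, components, explained), where kernels has shape
    (n_keep, n_bands, size, size) and components holds the score rasters.
    This is an illustrative assumption about the approach, not the
    authors' code.
    """
    n_bands = image.shape[0]
    # windows: (n_bands, rows - size + 1, cols - size + 1, size, size)
    windows = sliding_window_view(image, (size, size), axis=(1, 2))
    out_shape = windows.shape[1:3]
    # One row per focal cell, one column per (band, window row, window col)
    X = windows.transpose(1, 2, 0, 3, 4).reshape(-1, n_bands * size * size)
    Xc = X - X.mean(axis=0)

    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()

    # Smallest number of components reaching the requested share of variation
    n_keep = int(np.searchsorted(np.cumsum(explained), var_kept) + 1)
    n_keep = min(n_keep, eigvecs.shape[1])

    # Each retained eigenvector becomes one (n_bands, size, size) kernel
    kernels = eigvecs[:, :n_keep].T.reshape(n_keep, n_bands, size, size)
    scores = Xc @ eigvecs[:, :n_keep]
    components = scores.T.reshape(n_keep, *out_shape)
    return kernels, components, explained[:n_keep]


rng = np.random.default_rng(0)
image = rng.normal(size=(3, 20, 20))  # hypothetical 3-band raster
kernels, components, explained = pca_kernels(image, size=3, var_kept=0.9)
```

Under this reading, convolving the image with `kernels[i]` across all bands reproduces `components[i]` up to a constant offset from mean-centering, so the component rasters are orthogonal features that mix intensity and texture information.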

Dr. Hogland is a Research Forester working for the Rocky Mountain Research Station. His research interests revolve around quantitative methods within geographic information systems (GIS) and understanding the relationships between landscape patterns and forested ecosystem processes. Current projects include: 1) quantifying forest characteristics at fine spatial scales, 2) designing, developing, and building new procedures that integrate machine learning and statistical modeling with fast raster processing (Function Modeling) to streamline spatial modeling and reduce the storage space associated with GIS analyses, and 3) developing sampling strategies focused on reducing the cost of sampling while maintaining the characteristics of a representative sample.
