This paper was submitted on May 14, 1999 as a white paper to NASA HQ's Earth Science Enterprise in response to the Advanced Information Technology Request For Information (RFI) 10-00007.

Utilization of New Data Products based on Hierarchical Segmentation of Remotely Sensed Imagery Data in Earth Science Enterprise Applications

James C. Tilton
National Aeronautics and Space Administration
Goddard Space Flight Center, Code 935
Tel: 301-286-9510; FAX: 301-286-1776
E-mail: James.C.Tilton.1@gsfc.nasa.gov

1. Background, Current State, and Categorization of Proposed Technology

A recently developed fast implementation of a very general Hierarchical Segmentation (HSEG) algorithm on parallel computers has made practical the utilization of this analysis software in a wide variety of Earth Science Enterprise applications. This new software technology has wide application in two technology classes listed in the Advanced Information Systems Technology Request For Information (RFI): #3. Data and Information Production and #4. Analysis, Search, and Display. It also can be used as a component of an image compression scheme, making the technology also applicable to a third technology category: #2. Transmission. This new software is a widely applicable, crosscutting technology.

In combination with a recently developed "Region Labeling Tool," the hierarchical image segmentations produced by the new HSEG algorithm can find immediate application in many Earth science analysis problems where landcover classes need to be identified and mapped. The labeling tool greatly facilitates the identification of region segments and the selection of appropriate levels of segmentation detail. More general applications will be possible later in the near-term and into the mid-term (after 2 or more years of additional development). These more general applications would be keyed on making a set of hierarchical image segmentations a generally available standard data product. These applications include compression (the standard data product can serve as a compressed form of the imagery data), data archival and browse (based on the coarser segmentations from the standard product), as well as application to content-based search and retrieval, data mining, image registration, data fusion and visualization.

Image segmentation is a partitioning of an image into sections or regions. These regions may be later associated with ground cover type or land use, but the segmentation process simply gives generic labels (region 1, region 2, etc.) to each region. The regions consist of groupings of multispectral or hyperspectral image pixels that have similar data feature values. These data feature values may be the multispectral or hyperspectral data values themselves and/or they may be derived features such as band ratios or textural features.
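As a minimal illustration of these definitions, the following Python sketch represents a segmentation as a map of generic integer region labels alongside a small two-band image, and computes the mean feature vector of each region. The function name, the nested-list data layout, and the pixel values are assumptions made for illustration only; they are not taken from the HSEG software.

```python
from collections import defaultdict

def region_means(image, labels):
    """Mean feature vector for each generic region label.

    image:  rows of pixels, each pixel a tuple of band values
    labels: rows of integer region labels, same shape as image
    """
    sums = {}
    counts = defaultdict(int)
    for img_row, lab_row in zip(image, labels):
        for pixel, lab in zip(img_row, lab_row):
            if lab not in sums:
                sums[lab] = [0.0] * len(pixel)
            for band, value in enumerate(pixel):
                sums[lab][band] += value
            counts[lab] += 1
    return {lab: tuple(s / counts[lab] for s in sums[lab]) for lab in sums}

# Hypothetical 2x3 two-band image and its segmentation into regions 1 and 2.
image = [
    [(10, 50), (12, 52), (90, 20)],
    [(11, 51), (88, 22), (92, 18)],
]
labels = [
    [1, 1, 2],
    [1, 2, 2],
]
means = region_means(image, labels)
print(means)
```

The labels carry no meaning by themselves; an association with ground cover type would be a separate, later step.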

Image segmentation is a key first step in a number of approaches to image analysis and image compression. In image analysis, the group of pixels contained in each region provides a good statistical sampling of data values for more reliable labeling based on multispectral or hyperspectral feature values. In addition, the region shape can be analyzed as an additional clue for the appropriate labeling of the region. In image compression, the regions form a basis for a compact representation of the image data.

Most image segmentation approaches can be placed in one of three classes: (i) characteristic feature thresholding or clustering, (ii) boundary detection, or (iii) region growing.

Characteristic feature thresholding or clustering is often ineffective because it does not exploit spatial information. Boundary detection does exploit spatial information through examining local edges found throughout the image. For simple noise-free images, detection of edges results in straightforward boundary delineation. However, edge detection on noisy, complex images often produces missing edges and extra edges, so that the resulting region boundaries do not necessarily form a set of closed connected curves that surround connected regions. We prefer region growing for application to remotely sensed imagery data because it exploits spatial information and guarantees the formation of closed connected regions. However, region growing is not without its problems.

With region growing, spectrally similar but spatially disjoint regions are never associated together, and it is often not clear at what point the region growing process should be terminated. Also, region growing tends to be a computationally intensive process.
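The first two problems can be seen in a very simple form of region growing. The sketch below is a generic seeded region grower on a single-band image, using 4-connectivity and a fixed similarity threshold. It is an illustrative assumption, not the region growing used in HSEG, and the image values and threshold are made up.

```python
from collections import deque

def grow_region(image, seed, threshold):
    """Grow a 4-connected region from seed, adding neighbours whose
    value is within threshold of the seed pixel's value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Hypothetical image: dark pixels on the left and in the right-hand column,
# separated by a band of bright pixels.
image = [
    [10, 11, 80, 10],
    [10, 12, 82, 11],
    [81, 80, 83, 10],
]
left = grow_region(image, (0, 0), threshold=5)
everything = grow_region(image, (0, 0), threshold=100)
print(sorted(left), len(everything))
```

The grown region is connected by construction, but the spectrally similar right-hand column is never joined to it, and the result depends entirely on where the threshold happens to be set.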

We have developed a hybrid region growing and spectral clustering approach that overcomes these problems. The hybridization with spectral clustering allows association of spectrally similar but spatially disjoint regions. The approach also includes the detection of natural convergence points. The region label maps produced at each natural convergence point are combined to form the hierarchical image segmentation product. Finally, the recursive version of this approach is very effectively implemented on MIMD (Multiple Instruction, Multiple Data stream) or SIMD (Single Instruction, Multiple Data stream) parallel computers, which greatly reduces the amount of time required to segment large images with this approach.
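The general idea of combining region growing with spectral clustering can be sketched as follows. This is a simplified illustration under stated assumptions, not the HSEG algorithm itself: it uses a single band, the absolute difference of region means as the dissimilarity measure, and a rule that spatially non-adjacent regions are merged when they are at least as similar as the adjacent pair just merged. All function names are hypothetical, and the detection of natural convergence points is not attempted because its criterion is not described here.

```python
def adjacent_pairs(labels):
    """Pairs (a, b), a < b, of region labels that touch under 4-connectivity."""
    pairs = set()
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r + 1, c), (r, c + 1)):
                if nr < rows and nc < cols:
                    a, b = labels[r][c], labels[nr][nc]
                    if a != b:
                        pairs.add((min(a, b), max(a, b)))
    return pairs

def region_stats(image, labels):
    """Map each label to (sum of values, pixel count)."""
    stats = {}
    for img_row, lab_row in zip(image, labels):
        for value, lab in zip(img_row, lab_row):
            total, count = stats.get(lab, (0.0, 0))
            stats[lab] = (total + value, count + 1)
    return stats

def dissimilarity(a, b):
    """Assumed criterion: absolute difference of region means."""
    return abs(a[0] / a[1] - b[0] / b[1])

def relabel(labels, old, new):
    return [[new if lab == old else lab for lab in row] for row in labels]

def hybrid_step(image, labels):
    """One iteration: region growing merge, then spectral clustering merges."""
    stats = region_stats(image, labels)
    pairs = adjacent_pairs(labels)
    if not pairs:
        return labels, None
    # Region growing: merge the most similar pair of adjacent regions.
    a, b = min(pairs, key=lambda p: (dissimilarity(stats[p[0]], stats[p[1]]), p))
    threshold = dissimilarity(stats[a], stats[b])
    labels = relabel(labels, b, a)
    # Spectral clustering: merge non-adjacent regions that are at least as similar.
    merged = True
    while merged:
        merged = False
        stats = region_stats(image, labels)
        pairs = adjacent_pairs(labels)
        names = sorted(stats)
        for i, x in enumerate(names):
            for y in names[i + 1:]:
                if (x, y) not in pairs and dissimilarity(stats[x], stats[y]) <= threshold:
                    labels = relabel(labels, y, x)
                    merged = True
                    break
            if merged:
                break
    return labels, threshold

# Hypothetical image: dark areas at both ends separated by a bright strip.
image = [[10, 12, 50, 52, 11, 11],
         [10, 12, 50, 52, 11, 11]]
start = [[0, 1, 2, 3, 4, 4],
         [0, 1, 2, 3, 4, 4]]
first, first_threshold = hybrid_step(image, start)
second, second_threshold = hybrid_step(image, first)
print(first, second)
```

In the first step the two dark columns on the left merge by region growing, and the disjoint dark area on the right is then associated with them by the spectral step. Saving the label maps from selected iterations would give a sequence of progressively coarser segmentations.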

The algorithm is currently implemented on two MIMD computers: the 1024 processor SGI (Cray) T3E supercomputer and a Beowulf class multi-computer, the 130 processor HIVE, at NASA's Goddard Space Flight Center. Because of the nature of the calculations performed, the HIVE implementation is somewhat faster for the recursive portion of the algorithm (the initial stages), while the SGI T3E is very much faster for the final stage in which detection of the natural convergence points is performed. A moderately sized 1536x1536 4-band multispectral image can be processed in about 85 minutes with an implementation using the HIVE for the initial stages and the SGI T3E for the final stage. A faster implementation (on the order of 10 to 100 times faster, depending on the level of coding) is possible on a SIMD parallel computer, such as the 4096 processor Gamma II Plus offered by Cambridge Parallel Processing.

A production version of the current HIVE/SGI T3E research implementation is acceptable for immediate near-term applications (0 to 2 year timeframe) such as described in scenarios one and two below. A faster implementation will be required, however, for later near-term and mid-term applications (2 to 6 year timeframe) such as envisioned in scenario three below.

2. Applicability of Proposed Technology to Earth Science Systems

The HSEG algorithm enables the reorientation of the analysis of remotely sensed imagery data from a pixel oriented mode to a region oriented mode. This allows the use of more robust labeling and analysis schemes that are less affected by minor data variations due to noise and the natural variability of the scenes imaged, since the labeling and analysis can now be carried out in a region-by-region manner, that is, with a coherent sample of pixels, rather than being carried out in a pixel-by-pixel manner. In addition, it enables archival storage of the imagery data in a hierarchical progressive scheme that facilitates image data browsing and acquisition from the data archives.
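A small numerical illustration of why region-by-region labeling is more robust: in the Python sketch below, a nearest-mean classifier is applied first to each pixel of one region and then to the region's mean value. The class means, the pixel values, and the classifier itself are hypothetical and chosen only to show the effect of a single noisy pixel.

```python
def nearest_class(value, class_means):
    """Name of the class whose mean is closest to value."""
    return min(class_means, key=lambda name: abs(value - class_means[name]))

# Assumed single-band class means and one water region with a noisy pixel (45).
class_means = {"water": 20.0, "forest": 60.0}
region_pixels = [18, 22, 25, 45, 15, 19]

# Pixel oriented mode: each pixel is labeled independently.
pixel_labels = [nearest_class(v, class_means) for v in region_pixels]

# Region oriented mode: the whole region is labeled from its mean value.
region_mean = sum(region_pixels) / len(region_pixels)
region_label = nearest_class(region_mean, class_means)

print(pixel_labels, region_mean, region_label)
```

The noisy pixel is mislabeled as forest in the pixel oriented mode, while the region mean of 24 still falls clearly within the water class.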

To best describe the applicability of the hierarchical image segmentation product to Earth science analysis, we offer three example scenarios. The first two scenarios are specific multispectral analysis problems, and the third is a more general data production and analysis scenario.

The first example scenario is of the use of our Region Labeling tool in labeling the hierarchical set of image segmentations produced by processing a Landsat TM data set with our HSEG algorithm. The Landsat TM data set we are analyzing covers an area from the mid-eastern shore of Maryland on the East, to the eastern edge of West Virginia on the West, just south of the Pennsylvania border on the North, and mid-Virginia on the South. The area covered includes the upper Chesapeake Bay and the Baltimore, MD and Washington, DC metropolitan areas.

With the Region Labeling tool, the analyst displays an RGB rendition of the image data, and chooses to start by selecting a pixel in the middle of the Chesapeake Bay, and requests the region from the finest segmentation from the segmentation hierarchy containing the selected point. In a few seconds, this region is highlighted in purple for the analyst. After a brief inspection, the analyst is satisfied that the highlighted region contains only deep water, so he/she labels the region "Deep Water" and colors it dark blue.

The analyst then selects a pixel from the Chesapeake Bay bridge and requests the region from the finest segmentation from the segmentation hierarchy containing this point. The analyst finds that the region he selected contains many other bridge and highway pixels throughout the scene, plus many other urban pixels (mainly central urban areas). He labels this region "Urban/Roads" and colors it bright yellow.

Similarly, by selecting other previously unlabeled pixels from the Chesapeake Bay bridge, other recognizable regions are found and labeled through an inspection of regions formed from the finest segmentation from the segmentation hierarchy containing the selected pixels. In this manner the analyst finds a region that contains other bridge and highway pixels plus other urban pixels, with a large number of the urban pixels in industrial areas. He/she labels this region "Industrial/Roads" and colors it a lighter yellow. Another region so found contains a large number of lake and river pixels in addition to a significant number of bridge pixels. This region turns out to be predominantly water, with a large number of mixed water/bridge pixels. The analyst labels this region "Water/Mixed" and colors it light blue.

The analyst then selects a pixel in the middle of a heavily wooded area in the western portion of the scene, and requests the region from the finest segmentation from the segmentation hierarchy containing this point. He then explores coarser segmentations from the segmentation hierarchy, and finds that the segmentation at the seventh level from the twenty-one level hierarchy provides a more complete labeling of the wooded areas. He labels the resulting region "Wooded" and colors it green.

The analyst then notes that the imprints of several highways are evident through the wooded areas. He selects a pixel from one of these highways and requests the region from the finest segmentation from the segmentation hierarchy containing this point. He notes that the highlighted region actually is predominantly grassy areas (the pixel he selected along the highway primarily covered the grassy right-of-way rather than the road itself), and that the eighth level from the twenty-one level hierarchy provides a more complete labeling of these grassy areas. He labels the resulting region "Grasslands" and colors it light green.

The analyst continues in the same manner to label agricultural fields, marshlands, etc. The analyst can "correct" mislabelings caused by selecting too coarse a level from the segmentation hierarchy by selecting a pixel from a mislabeled area and requesting a finer segmentation from the segmentation hierarchy to define a region for relabeling. Also, if the finest segmentation from the hierarchy is not detailed enough, the analyst can either relabel a local group of spatially disjoint regions by drawing a region of interest around the set of regions to be relabeled, or relabel individual spatially disconnected regions.
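The core interaction in this scenario, selecting a pixel, requesting the region that contains it at a chosen level of the hierarchy, and attaching a class name, can be sketched in a few lines of Python. The data layout (a list of label maps ordered from finest to coarsest), the function names, and the tiny example hierarchy are assumptions for illustration; they do not describe the internals of the Region Labeling tool.

```python
def region_containing(hierarchy, level, pixel):
    """All pixel coordinates that share the selected pixel's region label
    at the given hierarchy level (level 0 is assumed to be the finest)."""
    label_map = hierarchy[level]
    target = label_map[pixel[0]][pixel[1]]
    return {(r, c)
            for r, row in enumerate(label_map)
            for c, lab in enumerate(row)
            if lab == target}

def assign_label(class_map, region, name):
    """Record the analyst's class name for every pixel in the region,
    overwriting any earlier label."""
    for coord in region:
        class_map[coord] = name

# Hypothetical two-level hierarchy for a 2x4 image.
hierarchy = [
    [[1, 1, 2, 3],   # level 0: finest segmentation
     [1, 4, 2, 3]],
    [[1, 1, 2, 2],   # level 1: regions 2 and 3 merged, region 4 merged into 1
     [1, 1, 2, 2]],
]
class_map = {}

# Label a region taken from the coarser level.
coarse = region_containing(hierarchy, 1, (0, 2))
assign_label(class_map, coarse, "Wooded")

# Correct part of it by going back to the finest level.
fine = region_containing(hierarchy, 0, (0, 3))
assign_label(class_map, fine, "Grasslands")

print(sorted(class_map.items()))
```

Because a region is selected by label rather than by connectivity, a region made up of spatially disjoint pieces is highlighted as a whole, as with the bridge and highway pixels in the example above.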

The second example scenario involves leveraging limited ground reference data into more widespread "ersatz"-ground reference data. In this scenario, a detailed analysis along the lines described in the first example is applied to high spatial resolution imagery producing a highly accurate labeling of the scene. This labeling is then used as ersatz-ground reference data for validating global scale classifications of moderate resolution imagery data. In this way, for example, a limited amount of conventional ground reference can be extended through a detailed analysis of data from high resolution sensors such as Landsat TM, SPOT or ASTER to produce sufficient ersatz-ground reference data to validate the analysis of data from moderate resolution sensors such as AVHRR or MODIS.
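One simple way such a validation could be carried out is sketched below in Python: the high resolution labeling is aggregated to the coarser grid by majority vote within each block, and the result is compared cell by cell with the moderate resolution classification. The majority rule, the aggregation factor, the class codes, and the agreement measure are all assumptions made for illustration; the scenario does not prescribe a particular procedure.

```python
from collections import Counter

def majority_aggregate(fine_map, factor):
    """Reduce a fine class map by taking the most common class
    in each factor x factor block."""
    rows, cols = len(fine_map), len(fine_map[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [fine_map[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse

def agreement(reference, classified):
    """Fraction of cells on which two class maps of equal shape agree."""
    pairs = [(a, b)
             for ref_row, cls_row in zip(reference, classified)
             for a, b in zip(ref_row, cls_row)]
    return sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical high resolution labeling (W = water, F = forest, U = urban).
fine_map = [["W", "W", "F", "F"],
            ["W", "F", "F", "F"],
            ["W", "W", "U", "U"],
            ["W", "W", "U", "F"]]
# Hypothetical moderate resolution classification of the same area.
moderate_map = [["W", "F"],
                ["W", "F"]]

ersatz_reference = majority_aggregate(fine_map, 2)
score = agreement(ersatz_reference, moderate_map)
print(ersatz_reference, score)
```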

The third example is a more general data production and analysis scenario. In this scenario all imagery data from a particular instrument (e.g., Landsat TM or MODIS) is processed with the HSEG algorithm to produce segmentation hierarchies for each data set. Each data set is then stored (losslessly) in a progressively compressed form based on the segmentation hierarchies. Users then can browse the data by viewing the region mean images produced from a user selected level from the segmentation hierarchy. Once the user finds the desired data set(s), the data can then be downloaded, progressively, until the full resolution data is obtained, or until the desired level of spatial detail is obtained, whichever comes first. The user can then analyze the data with something similar to the Region Labeling tool described in the first example, or with more sophisticated analysis tools that not only utilize the spectral information from the multispectral imagery data, but also utilize spatial information through such measures as the shapes and sizes of the regions. In addition, other spatial information measures, such as local texture measures, can now be calculated for individual, arbitrarily shaped, regions, rather than the commonly used square windows.
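The browse and progressive download ideas can be illustrated with region mean images. In the Python sketch below, a coarse and a finer region mean image serve as successive browse products, and a residual image added to the finer one restores the original data exactly. The rounding of means to integers, the residual representation, and the example values are assumptions; the actual coding scheme is not specified in this scenario.

```python
def region_mean_image(image, labels):
    """Replace every pixel by the rounded mean of its region."""
    totals, counts = {}, {}
    for img_row, lab_row in zip(image, labels):
        for value, lab in zip(img_row, lab_row):
            totals[lab] = totals.get(lab, 0) + value
            counts[lab] = counts.get(lab, 0) + 1
    means = {lab: round(totals[lab] / counts[lab]) for lab in totals}
    return [[means[lab] for lab in row] for row in labels]

def subtract(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Hypothetical single-band image with a coarse and a finer segmentation.
image = [[10, 12, 50, 54],
         [11, 11, 52, 53]]
coarse_labels = [[1, 1, 1, 1],
                 [1, 1, 1, 1]]
fine_labels = [[1, 1, 2, 2],
               [1, 1, 2, 2]]

coarse_browse = region_mean_image(image, coarse_labels)  # first browse stage
fine_browse = region_mean_image(image, fine_labels)      # more spatial detail
residual = subtract(image, fine_browse)                  # final, lossless stage
restored = add(fine_browse, residual)

print(coarse_browse, fine_browse, residual)
```

A user who only needs the finer browse image can stop the download before the residual is transferred.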

3. Timeline, Cost to Develop, and Commercial Application

A prototype generic Region Labeling tool has been implemented under the Khoros Pro 2000 software system, making it portable across a wide range of UNIX workstations. This tool has been developed using Landsat TM and MSS data, with urban land use change detection as the target application. Further development of this tool is necessary to make it useful in a wider range of applications. It may be advantageous to develop separate versions of the tool for each imaging sensor.

A prototype version of the HSEG algorithm is also already implemented on the HIVE and on the SGI T3E in "C" using "PVM" for MIMD parallel control. As noted earlier, a combined implementation of HSEG on the HIVE and SGI T3E provides the best processing time performance. A full Landsat MSS scene can be processed in about 6 hours with this combination of computing platforms. A processing time of about 24 hours is projected for a full Landsat TM scene. This processing time is acceptable for handling only a selected number of data sets from each particular imaging sensor, and would be acceptable for analysis scenarios one and two given above.

Some minor upgrading of the currently implemented Region Labeling Tool code and HSEG algorithm code would be required to transform this "research" quality code into "production" quality code. The production versions of the Region Labeling Tool and the HSEG program would be available for immediate near-term applications (0 to 2 year timeframe).

A faster implementation is required to enable the more widespread use of this analysis technology as envisioned in scenario three above, where hierarchical segmentations are routinely produced as standard data products for a number of imaging sensors. Dr. Stewart Reddaway of Cambridge Parallel Processing projects that one to two orders of magnitude improvement in processing time (depending on the extent to which high performance implementation techniques are employed) is obtainable on CPP's Gamma II Plus 4096 processor SIMD parallel computer (personal communication). This would bring the processing time down to between 14 and 140 minutes for a full Landsat TM scene, making near real-time processing of remotely sensed imagery data sets possible. This would require the procurement of a SIMD parallel computer and a significant, but well defined, programming effort to achieve the required processing speed. This development could be completed in time for later near-term and mid-term applications (2 to 6 year timeframe).
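The projected range follows directly from the 24 hour estimate for a full Landsat TM scene and the one to two orders of magnitude speedup. A short check in Python (the function name is hypothetical):

```python
def projected_minutes(baseline_hours, speedup):
    """Processing time in minutes after dividing the baseline by a speedup factor."""
    return baseline_hours * 60 / speedup

slow = projected_minutes(24, 10)    # one order of magnitude faster
fast = projected_minutes(24, 100)   # two orders of magnitude faster
print(slow, fast)
```

This gives about 144 and 14.4 minutes, consistent with the rounded range quoted above.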

Another development that could be made available for the later near-term and mid-term is software for producing progressively compressed forms of multispectral or hyperspectral imagery data, based on hierarchical image segmentations produced by the HSEG algorithm, so that this representation can be used in imagery data archives. In addition, software for browsing and progressively downloading imagery data stored in such a fashion could be developed in the mid-term timeframe.

Other tools for data mining (searching for particular shapes of objects with certain spectral characteristics), and for data fusion (based on matching region features between data sets from different times and/or different sensors) can also be developed for the mid-term to exploit the hierarchical image segmentation representation.
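A data mining query of this kind could operate on simple per-region descriptors. The Python sketch below computes the area, mean value, and a crude bounding-box elongation for each region, then searches for elongated regions within a given spectral range. The descriptors, the elongation measure, the function names, and the example data are illustrative assumptions, not a description of planned tools.

```python
def region_descriptors(image, labels):
    """Area, mean value and bounding-box elongation for each region."""
    desc = {}
    for r, (img_row, lab_row) in enumerate(zip(image, labels)):
        for c, (value, lab) in enumerate(zip(img_row, lab_row)):
            d = desc.setdefault(lab, {"area": 0, "total": 0.0,
                                      "rows": [r, r], "cols": [c, c]})
            d["area"] += 1
            d["total"] += value
            d["rows"] = [min(d["rows"][0], r), max(d["rows"][1], r)]
            d["cols"] = [min(d["cols"][0], c), max(d["cols"][1], c)]
    for d in desc.values():
        height = d["rows"][1] - d["rows"][0] + 1
        width = d["cols"][1] - d["cols"][0] + 1
        d["mean"] = d["total"] / d["area"]
        # Assumed shape measure: long side of the bounding box over the short side.
        d["elongation"] = max(height, width) / min(height, width)
    return desc

def find_regions(desc, min_elongation, mean_range):
    """Labels of regions that are elongated enough and lie in the spectral range."""
    low, high = mean_range
    return sorted(lab for lab, d in desc.items()
                  if d["elongation"] >= min_elongation and low <= d["mean"] <= high)

# Hypothetical image with a bright, narrow strip (region 1) above two blocks.
image = [[90, 92, 91, 91],
         [20, 22, 60, 61],
         [21, 21, 62, 61]]
labels = [[1, 1, 1, 1],
          [2, 2, 3, 3],
          [2, 2, 3, 3]]

desc = region_descriptors(image, labels)
hits = find_regions(desc, min_elongation=3, mean_range=(80, 100))
print(hits)
```

The same descriptors, computed for data sets from different times or sensors, could be matched region to region as a starting point for data fusion.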

Commercial applications of this technology would arise wherever detailed, accurate image analysis results are required. Examples include a number of medical applications (analysis of various imagery from body scans) and image analysis for nondestructive evaluation in manufacturing quality control. A particularly timely military application would be land mine detection.


James C. Tilton/NASA's GSFC (James.C.Tilton.1@gsfc.nasa.gov), May 14, 1999.