SECTION II - TECHNICALLY COMPLETE AND EASILY UNDERSTANDABLE DESCRIPTION OF INNOVATION DEVELOPED TO SOLVE THE PROBLEM OR MEET THE OBJECTIVE

SECTION II. TECHNICALLY COMPLETE AND EASILY UNDERSTANDABLE DESCRIPTION OF INNOVATION DEVELOPED TO SOLVE THE PROBLEM OR MEET THE OBJECTIVE

A. - Purpose and description of innovation:

The purpose of the software/innovation described herein is to provide hierarchical segmentations of data elements. While the initial application is to Earth science analysis of single-band, multispectral or hyperspectral remotely sensed imagery data of the Earth, the software/innovation also has applications to image data compression for image data archives, data mining (searching for particular shapes of objects with certain feature vector characteristics), data fusion (based on matching region features between data sets from different times and/or different sensors). Applications outside of remote sensing are the analysis of imagery for medical applications, and for nondestructive evaluation in manufacturing quality control. A particularly timely military application is land mine detection.

B. - Identification of component parts or steps, and explanation of mode of operation of innovation/software preferably referring to drawings, sketches, photographs, graphs, flow charts, and/or parts or ingredient lists illustrating the components.

Given here is a high level description of the hybrid hierarchical stepwise optimal region growing and spectral clustering approach called HSEG. This description is slightly different from the description previously given by Tilton [1].

HSEG Basic Algorithm Description:

Give each image pixel a region label and set the global criterion value, critval, equal to zero. If a pre-segmentation is provided, label each image pixel according to the pre-segmentation. Otherwise, label each image pixel as a separate region.
Calculate the dissimilarity criterion value between each spatially adjacent region.
Find the smallest dissimilarity criterion value, and merge all pairs of spatially adjacent regions with this criterion value.
Calculate the dissimilarity criterion value between all pairs of non-spatially adjacent regions.
Merge all pairs of non-spatially adjacent regions with dissimilarity criterion value less than or equal to the criterion value found in step 3.
If the number of regions remaining is less than the preset value chkregions, go to step 7. Otherwise go to step 2.
Let prevcritval = critval. Calculate the current global criterion value and set critval equal to this value. If prevcritval = zero, go to step 2. Otherwise calculate cvratio = critval/prevcritval. If cvratio is greater than the preset threshold convfact, save the region label map from the previous iteration as a "raw" segmentation result. Also store the region number of pixels list, region mean vector list and region criterion value list for this previous iteration. (Note: The region criterion value is the portion of the global criterion value contributed by the image pixels covered by the region.) If the number of regions remaining is two or less, save the region label map from the current iteration as the coarsest instance of the final hierarchical segmentation result, and stop. Otherwise go to step 2.

Dissimilarity Criterion: The dissimilarity criterion can be any measure of distance between two vectors. The widely used vector norms, 1-norm, 2-norm and Infinity-norm [2], are implemented as options. Discussed later will be some additional innovative variations on these widely used norms that are also implemented as options. Other possible dissimilarity criterion based on minimizing image mean square error, minimizing change in image entropy, and other criterion are discussed in Section II. E. below.

Global Criterion: The global criterion is used to identify significant changes in the segmentation results from one iteration to the next. This criterion is same as the dissimilarity criterion, except that it compares the original image data with the region mean image from the current segmentation. The value of this criterion is calculated at each image point and averaged over the entire image.

Recursive Implementation: The recursive, divide-and-conquer, implementation of HSEG, which called RHSEG, is as follows (modified slightly from Tilton [1]):

Recursive HSEG Algorithm (RHSEG) Description:

Specify the number of levels of recursion required (nblevels), and pad the input image, if necessary, so that the width and height of the image can be evenly divided by 2^nblevels-1. (A good value for nblevels results in an image section at level = nblevels consisting of roughly 500 to 2000 pixels.) Set level = 1.
Call recur_hseg(level,image).

Outline of recur_hseg(level,image):

If level < nblevels, divide the image data into quarters (in half in the width and height dimensions) and call recur_hseg(level+1,image/4) for each image quarter (represented as image/4). Otherwise go to step 3.
After all four calls to recur_hseg() from step 1 complete processing, reassemble the image segmentation results.
Execute the HSEG algorithm as described in the HSEG Basic Algorithm Description above (except that the reassembled segmentation results are used as the pre-segmentation when level < nblevels), but with the following modification: If level > 1, terminate the algorithm when the number of regions reaches the present constant minregions, and do not check for critval or output any "raw" segmentation results.

The above divide-and-conquer approach limits the number of regions that are processed at any time in step 4 of the HSEG algorithm. This limit leads to a significant reduction in processing time versus the non-recursive approach for even relatively small data sets (see Section I, Table 2 above). An efficient parallel implementation of RHSEG leads to additional significant reduction in processing time. Such a parallel implementation is the subject of a patent application under NASA Case No. GSC 14,305-1.

The high-level description of the HSEG and RHSEG above is similar to descriptions previously disclosed (see Tilton [1]). The following full description of the algorithm should be sufficient for duplication of the algorithm by individuals with adequate background in image processing and "C" language programming. The detail disclosed here has never before been disclosed outside of NASA GSFC Code 935.

Both HSEG and RHSEG require the specification of the name of an input parameter file as a UNIX command line argument (fully qualified with path, if not in the local directory). This input parameter file must specify of a number of file names and several parameters. A user's guide-like description of these file names and parameters is given below:

input

(required input file)

Input image data file

The input image data file from which a hierarchical image segmentation is to be produced. This image data file is assumed to be a headerless binary image file in band sequential format. The number of columns, rows, spectral bands and the data type are specified by other required parameters (see below) Data types "unsigned char" and "unsigned short" are supported.

mask

(optional input file)

Input data mask

(default = {none})

The optional input data mask must match the input image data in number of columns and rows. Even if the input image data has more than one spectral band, the input data mask need only have one spectral band. If the input data mask has more than one spectral band, only the first spectral band is used, and is assumed to apply to all spectral bands for the input image data. If the data value of the input data mask is "1" (TRUE), the corresponding value of the input image data object is taken to be a valid data value. If the data value of the input data mask object is "0" (FALSE), the corresponding value of the input image data object is taken to be invalid, and a region label of "0" is assigned to that spatial location in the output region label map data. The input data mask data type is assumed to be "unsigned char."

rlblmap_in

(optional input file)

Input region label map

(default = {none})

The optional region label map must match the input image data in number of columns and rows. If provided, the image segmentation is initialized according to the input region label map instead of the default of each pixel as a separate region. Wherever a region label of "0" is given by the input region label map, the region labeling is assumed to be unknown, and the region label map is initialized to one pixel regions at those locations. The input region label map data type is assumed to be "unsigned short."

rlblmap

(required output file)

Output region label map data

(default = {none})

The hierarchical set of region label maps which are the main output of HSEG and RHSEG. Region label values of "0" correspond to invalid input data values in the image data object. Valid region label values range from 1 through 65535. The data is of data type "unsigned short" and is stored in band sequential format, where band corresponds to the segmentation hierarchy level. If the optional region merges list, regmerges, is specified, only the first, most detailed, region label map from the segmentation hierarchy is stored.

rnpixlist

(required output file)

Output region number of pixels list

The region number of pixels list is a required output of HSEG and RHSEG. This list consists of the number of pixels (of data type "unsigned int") in each region stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).

regmerges

(optional output file)

Output region merges list

(default = {none})

The region merges list is an optional output of HSEG and RHSEG. This list consists of the renumberings of the region label map required to obtain the region label map for the second through the last (coarsest) level of the segmentation hierarchy from the region label map for the first (most detailed) level of the segmentation hierarchy (data type "unsigned short"). The data is stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value in the first (most detailed) level of the segmentation hierarchy and the row location corresponding to the segmentation hierarchy level (the l^th row is the renumberings required to obtain the (l+1)^th level of the segmentation hierarchy).

rmeanlist

(optional output file)

Output region mean list

(default = {none})

The region mean list is an optional output of HSEG and RHSEG. This list consists of the region mean value (of data type "float") of each region stored as rows of values and groups of rows, with the column location (with counting starting at 1) corresponding to the region label value, the row location (in each row group) corresponding the spectral band, and row group corresponding to the segmentation hierarchy level (with counting starting at 0).

rcvlist

(optional output file)

Output region criterion
value list

(default = {none})

The region criterion value list is an optional output of HSEG and RHSEG. This list consists of the region's contribution to the global criterion value (of data type "float") stored as rows of values, with the column location (with counting starting at 1) corresponding to the region label value and the row location corresponding to the segmentation hierarchy level (with counting starting at 0).

oparam

(required output file)

Output parameter file

The required output parameter file contains (in binary form) the number of hierarchical segmentation levels (nslevels) in the hierarchical segmentation output, and the number of regions (onregions) in the hierarchical segmentation with the finest segmentation detail. These values are required to interpret the rnpixlist, regmerges, rmeanlist, and rcvlist output files.

log_file

(required output file)

Output log file

The required output log file records program parameters and the number of regions and global criterion ratio value for each level of the region segmentation hierarchy.

ncols

(required integer)

Number of columns in input image data

nrows

(required integer)

Number of rows in input image data

nbands

(required integer)

Number of spectral bands in input image data

dtype

(required integer)

Data type of input image data
dtype = 4 designates "unsigned char"
dtype = 16 designates "unsigned short"

normind	(optional list selection)	Image normalization type
	"No Normalization", "Normalize Across Bands", "Normalize Bands Separately"	[default: 2. "Normalize Across Bands"]

Let x_bi be the original value for the i^th pixel (out of N pixels) in the b^th band (out of B bands). The mean and variance of the b^th band are

, and

(1)

respectively. To normalize the data to have mean = M_b and variance = S², set

(2)

For convenience, the data is normalized so that S² (= S) = 1. Since the entropy criterion requires that all data values be strictly positive, we set the mean value, M_b, of the normalized data to be the value that will produce a minimum value of 2 (so as to avoid computational problems calculating ln(y_bi)). That is,

(3)

The above description of image normalization applies to option 3: "Normalize Bands Separately." For option 2: "Normalize Across Bands," use for in Equations (2) and (3), and perform the minimization in Equation (3) across all bands as well as across all image pixels.

maxmdir

(optional integer)

Value equals number of nearest neighbors
(Permissible values: 4, 8, 12 or 20; default = 8)

simcrit	(optional list selection)	Dissimilarity criterion
	"1-Norm", "2-Norm", "Infinity Norm"	[default: 1. "1-Norm"]

Criterion for evaluating the dissimilarity of one region from another. The 1-Norm for regions j and k is (for B spectral bands)

1-Norm =

(4)

where and are the mean values for regions j and k, respectively, in spectral band b. Similarly, the 2-Norm is

2-Norm =

(5)

and the Infinity-Norm is

Infinity-Norm = max {

: b=1,2, ...,B}.

(6)

NOTE: Other dissimilarity criterion can be included as additional options without changing the nature of the HSEG or RHSEG implementation.

extmean

(optional Boolean)

Flag for extracting mean

(default = TRUE)

If this flag is set TRUE, the vector mean is subtracted from the spectral vectors before the dissimilarity criterion is calculated between two spectral vectors. The absolute difference between the two vector means is subsequently added back to the dissimilarity criterion. In the case of the 1-Norm, this is represented mathematically for regions j and k as follows (for B spectral bands):

Let

and

, where

and

Then

1-Norm (extmean == TRUE) =

(7)

The extracted mean versions for the 2-Norm, Infinity-Norm and other dissimilarity criterion are defined similarly.

spclust

(optional Boolean)

Spectral clustering flag

(default = TRUE)

If the spectral clustering flag is TRUE, spectral clustering is performed in-between each iteration of region growing. Otherwise, no spectral clustering is performed.

spatial_wght

(optional float)

Weight for spatial feature
(spatial_wght >= 0.0, default = 0.0)

If the value of spatial_wght is more than 0.0, the spatial_type parameter (below) determines whether the band maximum variance or standard deviation is used as a feature in the dissimilarity criterion in combination with the spectral band features. spatial_wght = 1.0 weights the spatial feature equally with the spectral band features, spatial_wght < 1.0 weights the spatial feature less and spatial_wght > 1 weights the spatial feature more. If D is the dissimilarity function value before combination with the spatial feature value, the combined dissimilarity function value (comparing regions j and k), D^c, is:

(8)

where sf_j and sf_k are the spatial feature values for regions j and k, respectively.

spatial_type	(optional list selection)	Spatial feature type
	"Variance", "Standard Deviation"	[default: 2 "Standard Deviation"]

The (optional) spatial feature is either the spectral band maximum region variance or spectral band maximum region standard deviation. This parameter is ignored if spatial_wght = 0.0.

For regions consisting of 9 or more pixels, the region variance for spectral band b is:

(9)

where N is the number of pixels in the region, and is the region mean for spectral band b:

The region spatial feature value is then defined as:

(10)

where B is the number of spectral bands.

For regions consisting only 1 pixel, the minimum local band maximum variance () calculated over all possible 3x3 windows containing the pixel is used as a substitute for the band maximum region variance for spectral band b.

For regions consisting of 2 up through 8 pixels, a weighted average of the minimum local band maximum variance and the band maximum region variance is substituted for the band maximum region variance as follows:

2 pixel regions:

3 pixel regions:

4 pixel regions:

5 pixel regions:

6 pixel regions:

7 pixel regions:

8 pixel regions:

(11)

If the spatial_type "Standard Deviation" is chosen, substitute the region standard deviation () for the region variance () in the above discussion.

rnb_levels

(optional integer)

Total number of recursive levels (only for RHSEG)
((1 <= rnb_levels <= 9, default = 4)

minregions

(optional integer)

Number of regions for convergence in recursive stages (only for RHSEG)
((2 <= minregions <= 4096, default = 384)

chkregions

(optional integer)

Number of regions at which convergence factor checking is initiated
((2 <= minregions <= 4096, default = 512)

convfact

(optional float)

Convergence factor
(1 <= convfact <= 100, default = 1.01)

tmpdir

(optional string)

Path name to directory in which temporary files are to be stored (default = $TMPDIR)

tempfile

(optional string)

Prefix for temporary file names (default = HSEG########## for HSEG and RHSEG######### for RHSEG, where the # are random characters)

Previous disclosures also did not reveal the variations on the dissimilarity criterion discussed above under the extmean, spatial_type and spatial_wght optional parameters. These dissimilarity criterion variations facilitate a more effective separation of regions of interest in certain Earth science applications.

An optional graphical user interface (GUI) program, implemented under the Khoros Pro 2000 Software Developer's Kit is available. This GUI program runs on any workstation running Unix or many of its variants. It creates the parameter file and the headerless binary input data files required by the HSEG and RHSEG programs. This GUI program is totally optional, as the parameter file and the headerless binary input data files required by HSEG and RHSEG can be created manually. The optional user interface and data reformatting program outputs the results in the form of Khoros data objects. These Khoros data objects can be input to separate programs for further analysis. (See, for example, the "Region Labeling Tool" described under NASA Case No. 14,331-1. A description of an earlier version of the Region Labeling Tool can be found at http://code935.gsfc.nasa.gov/code935/tilton/index.html#REGLBL_TOOL.) This optional GUI program is not a part of this "Disclosure of Invention and New Technology."

C. - Functional operation:

The HSEG and RHSEG programs were implemented in "C" under the gcc version 2.8.1 compiler. They were developed under the Solaris 7 operating system on a SUN Workstation, but they should both compile and run with other "C" compilers and under other UNIX-type operating systems, possibly with minor modification.

In the this section, the HSEG and RHSEG programs main programs are described, followed by description of functions called by the HSEG and RHSEG main programs. RHSEG calls all the functions called by HSEG, but not vice versa.

Functional Description of the HSEG Main Program: The HSEG main program initially determines the name of the parameter file from the command line input, and reads in the parameters (see Section II. B. above) from the file using the function read_param(). If a prefix for the temporary filenames in not provided in the parameter file, the program generates a character string to used at a prefix for file names for temporary files required by the program. The "C" library function tempnam() is used for this purpose. For the discussion below, this prefix is assumed to be "HSEG0000."

At this point the HSEG main program opens the log file and outputs to it the program parameters using the function print_param(). The following parameters are declared as external variables, and can be utilized in the main program and any "C" functions called subsequently.

inputf	A char array containing the name of the input image data file.
maskf	A char array containing the name of the mask data file, if it exists. If no mask data file exists, maskf has the value "NULL."
rlblmap_inf	A char array containing the name of the input region label map data file, if it exists. If no mask data file exists, rlblmap_inf has the value "NULL."
rlblmapf	A char array containing the name of the output region label map data file.
rnpixlistf	A char array containing the name of the output region number of pixels list data file.
regmergesf	A char array containing the name of the output region merges list data file, if requested. If the output region merges list data file is not requested, regmergesf has the value "NULL."
rmeanlistf	A char array containing the name of the output region mean list data file, if requested. If the output region mean list data file is not requested, rmeanlistf has the value "NULL."
rcvlistf	A char array containing the name of the output region criterion value list data file, if requested. If the output region criterion value list data file is not requested, rcvlistf has the value "NULL."
oparamf	A char array containing the name of the output parameter file.
log_file	A char array containing the name of the output log file.
tmpdir	A char array containing the path name to the directory where temporary files are to be stored.
tempfile	A char array containing the prefix for temporary file names.
mask_flag	An int variable which has value TRUE if mask data exist (FALSE otherwise).
rlblmap_in_flag	An int variable which has value TRUE if input region label map data is given (FALSE otherwise).
regmerges_flag	An int variable which has value TRUE if the output region merges list is requested (FALSE otherwise).
rmeanlist_flag	An int variable which has value TRUE if the output region mean list is requested (FALSE otherwise).
rcvlist_flag	An int variable which has value TRUE if the output region criterion value list is requested (FALSE otherwise).
spclust_flag	An int variable that is TRUE if spectral clustering is to be performed (FALSE otherwise).
nbands	An int variable containing the number of spectral bands in the input image data.
dtype	An int variable containing an indicator of the data type of the input image data. (dtype == 4 designates unsigned char and dtype == 16 designates unsigned short).
maxmdir	An int variable containing the number of nearest neighbors considered.
normind	An int variable containing an indicator of the type of normalization requested. (normind == 1 designates "No Normalization," normind == 2 designates "Normalization Across Bands," and normind == 3 designates "Normalize Bands Separately").
simcrit	An int variable indicating the dissimilarity criterion to be used.
extmean	An int variable that is TRUE if the vector mean is subtracted from the spectral vectors before the dissimilarity criterion is calculated, FALSE otherwise.
spatial_type	An int variable indicating the spatial feature type employed (1=> "Variance", and 2=> "Standard Deviation").
chkregions	An int variable containing the number of regions at which convergence factor checking is initiated.
max_nregions	An int variable containing the maximum number of regions that will be encountered in processing the data.
spatial_wght	A float variable which contains the value of the weight for the spatial feature (a value of 0.0 indicates the spatial feature is not employed).
convfact	A float variable containing the value of the convergence factor for detecting natural convergence points.
scale	A float array containing the multispectral scale factor(s) for normalization.
offset	A float array containing multispectral offset factor(s) for normalization.

The HSEG main program now allocates memory for the input_data array and, if necessary, for the mask_data and rlblmap_data arrays. These data arrays are declared and allocated as follows:

Data Array	Data Type	Size	To Contain:
input_data	unsigned short	ncolsnrowsnbands	Input image data
mask_data	unsigned char	ncols*nrows	Input mask data (if any)
rlblmap_data	unsigned short	ncols*nrows	Input region label map data (if any) and output region label map data

The HSEG main program then calls the read_data() function with the following input variables:

proc_nsecs	An int variable containing the number of sections in which the input data is to be processed in memory. Since HSEG will always be used with relatively small data sets, this variable is set to "1".
ncols	An int variable containing the number of columns in the input image data.
nrows	An int variable containing the number of rows in the input image data.
pad_ncols	An int variable containing the number of columns in the padded input image data. Since padding is not required for HSEG, pad_ncols = ncols.
pad_nrows	An int variable containing the number of rows in the padded input image data. Since padding is not required for HSEG, pad_nrows = nrows.

The global variables inputf, mask_flag, rlblmap_in_flag, nbands, dtype and spatial_wght are also used as inputs by read_data(). The following arrays are output by read_data():

input_data	An unsigned short array containing the input image data.
mask_data	An unsigned char array containing the mask data (if any).
rlblmap_data	An unsigned short array containing the input region label map data (if any).
spatial_init	A float array containing the input spatial feature data (local image variance or standard deviation - if any). Since no spatial feature data has been generated at this point in the program, this is just a NULL array here.

The purpose of the read_data() function is to read the input data into the allocated data arrays.

The HSEG main program next computes image statistics. If image normalization is requested, the image sum, sum of squares and minimum values in each band are computed, along with the number of image pixels. If image normalization is not requested, only the number of image pixels is found. If image normalization is requested, the sum_stat, sumsq_stat and min_stat arrays are declared and allocated as follows:

Data Array	Data Type	Size	To Contain:
sum_stat	double	nbands	Sum of the input image data in each band.
sumsq_stat	double	nbands	Sum of the squares of input image data in each band.
min_stat	double	nbands	The minimum input image data value in each band.

If required, the sum_stat and sumsq_stat arrays are initialized to "0", and the min_stat array is initialized to "MAXFLOAT" (the maximum floating point value on the system). Then the find_stats() function is called with the following parameters and array variables:

input_data	An unsigned short array containing the input image data.
mask_data	An unsigned char array containing the mask data (if any).
ncols	An int variable containing the number of columns in the input image data.
nrows	An int variable containing the number of rows in the input image data.

The global variables mask_flag, nbands and normind are also used as inputs by find_stats(). The following arrays are output by find_stats():

sum_stat	A double array containing the sum of the (possibly masked) input image data in each band.
sumsq_stat	A double array containing the sum of squares of the (possibly masked) input image data in each band.
min_stat	A double array containing the minimum value of the (possibly masked) input image data in each band.

The number of input pixels, npixels, is returned as the value of the function find_stats(). If image normalization is not requested, and no mask data is provided, the find_stats() function simply calculates the image number of pixels as ncols*nrows and returns. If image normalization is not requested, and mask data is provided, the find_stats() function finds the image number of pixels by counting the number of non-zero entries in mask_data and returns. If image normalization is requested, the function find_stats() scans through the input image data, taking the image masking in account if provided, and finds the sum, sum of squares and minimum data values for each band, along with the image number of pixels, and then returns.

The HSEG main program now allocates memory for the scale and offset arrays as follows:

Data Array	Data Type	Size	To Contain:
scale	float	nbands	Scale factors for the input image data in each band.
offset	float	nbands	Offset values for the input image data in each band.

If image normalization is not requested each element of the scale array is set to "1" and each element of the offset array is set to "0." If image normalization is requested, the find_scale_offset() function is called with the following parameter and array variables:

npixels	An int variable containing the number of pixels in the input image data (after masking, if provided).
sum_stat	A double array containing the sum of the (possibly masked) input image data in each band.
sumsq_stat	A double array containing the sum of squares of the (possibly masked) input image data in each band.
min_stat	A double array containing the minimum value of the (possibly masked) input image data in each band.

The global parameters nbands and normind are also used as inputs to find_scale_offset(). The find_scale_offset() function first computes the image variance, var_stat (), for each image band through the formula:

(12)

where ss is sum of the squares of the data (sumsq_stat), s is the sum of the data (sum_stat) and N is the number of pixels in the data (npixels).

If normind == 2 ("Normalize Across Bands") is chosen, the minimum of the band minimum value (min_stat) is set as the minimum for each band, and the maximum of the band variance value (var_stat) is set as the variance value for each band. Then the values for the scale and offset arrays are calculated such that the normalized image data will have unitary variance and minimum value of "2." (Minimum value of "2" is used rather than "0" because certain dissimilarity criterion require the computation of logarithms, which encounter computational difficulties for arguments near the value "1".) This is done for each band through the equations:

scale[band] = ((float) (1.0/sqrt(var_stat[band])));

(13a)

offset[band] = ((float) (2.0 - (min_stat[band]/sqrt(var_stat[band]))));

(13b)

(The normalized input image data is not calculated at this time.) The main outputs of the find_scale_offset() function are the scale and offset global array parameters, which are the factors required to normalize the input data as designated by the normind global parameter.

Next, the input variable spatial_wght is multiplied by nbands so that the spatial feature will have the same weighting and the spectral features when the initial input value of spatial_wght is 1.0. If spatial_wght > 0.0, the spatial_init array and the following temporary arrays are allocated:

Data Array	Data Type	Size	To Contain:
spatial_init	float	ncols*nrows	Spatial feature data (local image variance or standard deviation, if any).
input_sub	unsigned short	3ncolsnbands	Temporary array to buffer 3 lines of input data.
mask_sub	unsigned char	3*ncols	Temporary array to buffer 3 lines of mask data.
spatial_init_sub	float	3*ncols	Temporary array to buffer 3 lines of spatial feature data.

If spatial_wght > 0.0, the maximum over spectral bands of the local spatial feature values over 3x3 pixel areas is now calculated by calling the function compute_spatial() with the following input arrays and parameters:

input_data	An unsigned short array containing the input image data.
input_sub	An unsigned short array to buffer 3 lines of input image data.
mask_data	An unsigned char array containing the mask data (if any).
mask_sub	An unsigned char array to buffer 3 lines of mask data (if any).
init_flg	An int variable that is TRUE if this is the first section of data processed, and FALSE otherwise. TRUE in this case.
last_flg	An int variable that is TRUE if this is the last section of data processed, and FALSE otherwise. TRUE in this case.
ncols	An int variable containing the number of columns in the input image data.
maxrow	An int variable containing the number of rows processed in this section of the input image data. Equal to nrows in this case.
num_points	An int variable containing the size of data element offset between bands of input image data. Equal to ncols*nrows in this case.

The global parameters mask_flag, nbands, scale, and offset are also used as inputs to compute_spatial(). (The compute_spatial() function is designed to process the data in sections as required for large input data sets. Since practical processing times for HSEG and can only be obtained with small and moderately sized data sets, we assume the entire data is processed in one section in this case.) The following data array is output by compute_spatial():

spatial_init

A float array containing the maximum over spectral bands of the spatial feature calculated over a 3x3 window.

Note that when mask_data is provided, the value of spatial_init is set equal to MAXFLOAT (the maximum floating point value) whenever ANY data point in the 3x3 window is masked out. Also, spatial_init is set equal to MAXFLOAT for the first and last data rows and columns.

If spatial_wght > 0.0, the minimum over 3x3 pixel areas of maximum over spectral bands of the local spatial feature values over 3x3 pixel areas is found by calling the function find_min() with the following input arrays and parameters:

spatial_init	A float array containing the maximum over spectral bands of the spatial feature calculated over a 3x3 window.
spatial_init_sub	An float array to buffer 3 lines of spatial feature data.
init_flg	An int variable that is TRUE if this is the first section of data processed, and FALSE otherwise. TRUE in this case.
last_flg	An int variable that is TRUE if this is the last section of data processed, and FALSE otherwise. TRUE in this case.
ncols	An int variable containing the number of columns in the input image data.
maxrow	An int variable containing the number of rows processed in this section of the input image data. Equal to nrows in this case.

(The find_min() function is designed to process the data in sections as required for large input data sets. Since practical processing times for HSEG and can only be obtained with small and moderately sized data sets, we assume the entire data is processed in one section in this case.) The following data array is output by find_min():

spatial_init

A float array containing the minimum over a 3x3 window of the maximum over spectral bands of the spatial feature calculated over a 3x3 window.

The maximum number of regions, max_nregions, that will be encountered in processing any subsection of data is now computed as equal to ncols*nrows.

The HSEG main program continues by allocating various additional data arrays as listed below:

Data Array	Data Type	Size	To Contain:
rlabel_data	unsigned short	max_nregions	Region label values.
npix_data	unsigned int	max_nregions	Region number of pixels.
sum_data	double	max_nregions*nbands	Region sum of normalized input data values in each spectral band.
sumsq_data	double	max_nregions*nbands	Region sum of the square of the normalized input data values in each band (if spatial_wght > 0.0).
spatial_data	double	max_nregions	Region spatial feature value (if spatial_wght > 0.0).
nghbr_mask	unsigned char	max_nregions²	Region adjacency mask.
nghbr_tmp	unsigned char	max_nregions	Temporary region adjacency mask.
relabel_list	unsigned short	2*max_nregions	Region label relabelings (renumberings).
region_store	reg_struct	max_nregions	Structure containing information about each region.
region_list	reg_list	max_nregions+1	Structure for linking region_store structures together.
sort_store	reg_list	max_nregions	Structure for ordered linking of region_store structures
nbsort_store	reg_list	max_nregions	Structure for ordered linking of region_store structures

The reg_struct (REGION) and reg_list (R_LIST) structures listed above have the following declarations:

typedef struct REGION
{
        unsigned short  label;
        struct REGION   *bnghbr;
        float            bncomp;
        struct REGION   *bregion;
        float            brcomp;
        struct REGION   *mgreg;
} reg_struct;
 

typedef struct R_LIST
{
        struct REGION   *link;
        struct R_LIST   *next;
        unsigned char    active_flag;
} reg_list;

The members of the reg_struct (REGION) structure are:

label	region label
bnghbr	pointer to structure containing information for the most similar neighboring region
bncomp	dissimilarity value between the region and the most similar neighboring region
bregion	pointer to structure containing information for the most similar non-neighboring region
brcomp	dissimilarity value between the region and the most similar non-neighboring region
mgreg	if the region was merged into another region, this is a pointer to the structure containing information for the region this region was merge with

The members of the reg_list (R_LIST) structure are:

link	pointer to REGION structure containing information for a particular region
next	pointer to R_LIST structure for next region in the list
active_flag	TRUE only if the R_LIST element is valid.

Upon declaration, the link members of the region_list structure variable are initialized to point one after the other to the elements of the region_store structure (i.e., region_list[j].link = &region_store[j]), and the next members of the region_list structure variable are initialized to point to the next member of the region_list structure (i.e., region_list[j].next = &region_list[j+1]). The elements region_list[max_nregions].link and region_list[max_nregions].next are set to "NULL" initially. All of the active_flag members of region_list are set initially to "FALSE."

The function region_data_init() is now called with the following input variable and array data:

input_data	An unsigned short array containing the input image data
mask_data	An unsigned char array containing the mask data (if any)
rlblmap_data	An unsigned short array containing the input region label map data (if any). (Will be overwritten by the output region label map data.)
spatial_init	A float array containing the input spatial feature data (local image variance or standard deviation - if any)
ncols	An int variable containing the number of columns in the input image data
nrows	An int variable containing the number of rows in the input image data

The global variables mask_flag, rlblmap_in_flag, nbands, max_nregions, scale, offset, maxmdir, spatial_wght and spatial_type are also used as inputs by region_data_init(). The following variables and arrays are output by region_data_init():

nregions	An int variable containing the initial number of regions in the section of data processed.
rlblmap_data	An unsigned short array containing the initial output region label map. This region label map specifies the initial spatial distribution of the regions.
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The array rlabel_data is also passed to the function region_data_init() for use as work space. See a later section for a functional description of the region_data_init() function. The purpose of the region_data_init() function is to initialize the variables and arrays output by the region_data_init() function based on the input variable and array data.

After the call to region_data_init() is completed, the following parameters are set: init_flg = TRUE, onregions = nregions (as output from region_data_init()) and cvnbregs = chkregions (note that chkregions is a user set parameter). If nregions <= chkregions, the calls below to the functions lhseg() and read_region_list() are skipped and, if nregions > 2, the program proceeds to call the function fhseg() (see below). Otherwise, call the function lhseg() with the following input variable and array data:

init_flag	An int variable which indicates whether or not the region_list has just been initialized (in this case init_flag = TRUE).
onregions	An int variable containing the number of regions existing when lhseg() is initially called.
cvnbregs	An int variable indicating the number of regions at which that this call to lhseg() will converge (exit).
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The global variables nbands, max_nregions, simcrit, extmean, spclust, spatial_wght, and spatial_type are also used as inputs to lhseg(). The following variables and arrays are modified and output by lhseg():

nregions	An int variable containing the number of regions in the section of data processed.
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The arrays sort_store, and nbsort_store are also passed to the lhseg() function to use as work space. See a later section for a functional description of the lhseg() function. The purpose of the lhseg() function is to perform multiple iterations of region growing intermixed with spectral clustering (if requested) until the number of regions becomes less than or equal to cvnbregs (in this case, chkregions).

After the lhseg() function exits, the parameter init_flg is set to FALSE, and the function read_region_list() is called with the following input variable and array data:

nregions	An int variable containing the number of regions in the section of data processed.
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The global variables nbands, max_nregions and spatial_wght are also used as input by read_region_list(). The following variables and arrays are modified and/or output by read_region_list():

npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.
relabel_list	An unsigned short array consisting of pairs of region labelings representing a renumbering of the region map labeling.

The array nghbr_tmp is also passed to the read_region_list() function for use as work space. The read_region_list() function finds a renumbering (relabel_list) of the current region map labeling into the smallest possible range of labels starting at label "1," and adjusts the various data arrays and structures accordingly (npix_data, sum_data, nghbr_mask, region_list, and, if necessary, sumsq_data and spatial_data).

After the read_region_list() function exits, the variable nelements is found as the number of non-zero elements in the relabel_list array output by read_region_list(), and the function update_rlblmap() is called with the following input variable and array data:

ncols	An int variable containing the number of columns in the input image data
nrows	An int variable containing the number of rows in the input image data
nelements	An int variable containing the number of non-zero elements in the relabel_list array
relabel_list	An unsigned short array consisting of pairs of region labelings representing a renumbering of the region map labeling.
rlblmap_data	An unsigned short array containing the region label map data as it existed prior to the call to lhseg().

The following data array is modified by update_rlblmap():

rlblmap_data

An unsigned short array containing the region label map data renumbered according to relabel_list.

The update_rlblmap() function renumbers the rlblmap_data array according to the input relabel_list array. This updates the rlblmap_data to reflect the merges performed by the call to lhseg() and the compacting of the region labeling representation performed by the function read_region_list().

After update_rlblmap() exits, the variable cvnbregs is set equal to 2. Then if cvnbregs > nregions, the function fhseg() is called with the following input variable and array data:

init_flag	An int variable which is TRUE if the region_list has just been initialized and FALSE otherwise.
input_data	An unsigned short array containing the input image data
mask_data	An unsigned char array containing the mask data (if any)
rlblmap_data	An unsigned short array containing the region label map. This region label map specifies the initial spatial distribution of the regions.
spatial_init	A float array containing the input spatial feature data (local image variance or standard deviation - if any)
ncols	An int variable containing the number of columns in the input image data
nrows	An int variable containing the number of rows in the input image data
onregions	An int variable containing the number of regions existing when lhseg() is initially called.
cvnbregs	An int variable indicating the number of regions at which that this call to fhseg() will converge (exit).
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The global variables mask_flag, regmerges_flag, nbands, scale, offset, simcrit, extmean, spclust, spatial_wght, spatial_type, convfact, max_nregions, tmpdir and tempfile are also used as inputs to fhseg(). The following variable is returned by fhseg():

nslevels

An int variable containing the number of hierarchical segmentation levels found by the fhseg() function.

The arrays rlabel_data, sort_store, and nbsort_store as also passed to the fhseg() functions for use as work space. The fhseg() function also outputs data to a number of files. These data output are the region number of pixels list (npix_data), the region means list, the region criterion value list, and, if requested, the region merges list for the iterations corresponding to the detected natural convergence points. If the region merges list is requested, the region label map data (rlblmap_data) is output only for the first detected natural convergence point. Otherwise the region label map data is output for all the detected natural convergence points.

See a later section for a functional description of the fhseg() function. The purpose of the fhseg() function is to perform multiple iterations of region growing intermixed with spectral clustering (if requested) until the number of regions becomes less than or equal to cvnbregs (in this case, 2), and output segmentation results when natural convergence points are detected.

The HSEG main program is now nearly finished. After the fhseg() function exits, the program writes into a output parameter file, HSEG0000.oparam, the number of hierarchical segmentation levels, nslevels, and the number of regions that existed prior to the call to the fhseg() function. These are the values that define the dimensions of the output files. Finally, the output files rlblmap, rnpixlist, regmerges (if requested), rmeanlist (if requested), and rcvlist (if requested) are created by concatenating together the corresponding output files from fhseg() for all the hierarchical segmentation levels. The output log_file is now closed and the program exits.

Functional Description of the RHSEG Main Program: The RHSEG main program initially determines the name of the parameter file from the command line input, and reads in the parameters (see Section II. B. above) from the file using the function read_param(). If a prefix for the temporary filenames in not provided in the parameter file, the program generates a character string to used at a prefix for file names for temporary files required by the program. The "C" library function "tempnam" is used for this purpose. For the discussion below, this prefix is assumed to be "RHSEG0000."

At this point the RHSEG main program opens the log file and outputs to it the program parameters using the function print_param(). In addition to the parameters declared as external variables for the HSEG program, the following parameters as declared as external variables for the RHSEG program:

rnb_levels	An int variable containing the total number of recursive levels to be utilized
minregions	An int variable containing the number of regions for convergence in the recursive stages
sub_ncols	An int variable containing the number of columns of input data to be processed at the highest level of recursion
sub_nrows	An int variable containing the number of rows of input data to be processed at the highest level of recursion

The program next sets the number of columns (sub_ncols) and number of rows (sub_nrows) of the image data sections to be processed at the highest level of recursion, based on the value of rnb_levels. If the full image size (ncols by nrows) is not an integer multiple of sub_ncols by sub_nrows, the values of pad_ncols ( > ncols) and pad_nrows ( > nrows) are set to be the number of columns and rows the full image would have to have in order to be an integer multiple size of the subimage size at the highest level of recursion. If the image size is already an integer multiple size of the subimage size at the highest level or recursion, pad_ncols and pad_nrows are set to ncols and nrows, respectively. If sub_ncols = pad_ncols, this particular run of RHSEG degenerates to a run of HSEG (see above).

The RHSEG main program now continues in the same manner as the HSEG main program by reading in the input data. If an input data mask is not provided and ncols != pad_ncols, the mask_data array is initialized to all "1's" (TRUE) for upper left ncols*nrows section of the array, and mask_flag is set to TRUE. If ncols != pad_ncols, the mask_data array and all other input arrays are padded with "0" values for columns greater than ncols and rows greater than nrows.

The RHSEG main program continues on as in the HSEG main program by finding npixels and values for the scale and offset arrays with calls to the find_stats() and find_scale_offset() functions. If spatial_wght > 0.0, calls are also made to the compute_spatial() and find_min() functions.

The maximum number of regions, max_nregions, that will be encountered in processing any subsection of data is now computed. This is the maximum of 4*minregions and sub_ncols*sub_nrows.

The various data arrays are now allocated as for the HSEG main program, except that pad_ncols and pad_nrows replaces ncols and nrows in the allocations. Also, the arrays npix_data, sum_data, sumsq_data, and spatial_data are allocated with size rnb_levels*max_nregions rather than just max_nregions. In the RHSEG main program, the following additional data arrays are allocated:

Data Array	Data Type	Size	To Contain:
input_sub	unsigned short	(pad_ncolspad_nrowsnbands)/2	Subset of input image data
mask_sub	unsigned char	(pad_ncols*pad_nrows)/2	Subset of input mask data (if any)
rlblmap_sub	unsigned short	(pad_ncols*pad_nrows)/2	Subset of input region label map data (if any) and output region label map data
spatial_init_sub	float	(pad_ncols*pad_nrows)/2	Subset of input spatial feature data (local image variance or standard deviation, if any)
nghbr_sub	unsigned char	(rnb_levels-1)*max_nregions²	Region adjacency mask at other levels of recursion
rlblmap_4cols	unsigned.short	4*pad_nrows	Selected four columns of region label map data
rlblmap_4rows	unsigned short	4*pad_ncols	Selected four rows of region label map data

The function lrhseg() is now called with the following input variable and array data:

input_data	An unsigned short array containing the current section of input image data
mask_data	An unsigned char array containing the current section of mask data (if any)
rlblmap_data	An unsigned short array containing the current section of input region label map data (if any). (Will be overwritten by the output region label map data.)
spatial_init	A float array containing the current section of input spatial feature data (local image variance or standard deviation - if any)
ncols	An int variable containing the number of columns in the current section of input image data (initially pad_ncols)
nrows	An int variable containing the number of rows in the current section of input image data (initially pad_nrows)
level	An int variable containing the value of the current level of recursion (initially equal to 1)

The global variables mask_flag, rlblmap_in_flag, nbands, sub_ncols, sub_nrows, max_nregions, scale, offset, maxmdir, simcrit, extmean, spclust, spatial_wght, spatial_type, rnb_levels and minregions are also used as inputs by lrhseg(). Certain variables and arrays are output by lrhseg(). They are as follows:

nregions	An int variable containing the number of regions in the segmentation of the section of data processed.
rlblmap_data	An unsigned short array containing the output region label map for the current section of data. This region label map specifies the spatial distribution of the regions.
npix_data	An unsigned int array containing the current region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the current sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the current sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the current region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating, for the current section of data, whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.

Other data arrays are passed into lrhseg() for use as work space. These are input_sub, mask_sub, rlblmap_sub, spatial_init_sub, rlabel_data, region_list, relabel_list, nghbr_sub, nghbr_tmp, rlblmap_4cols, rlblmap_4rows, sort_store and nbsort_store.

The purpose of the lrhseg() function is to divide up the input data into four equal quadrants and to call the function recur_hseg() for each quadrant. The results from recur_hseg() are assembled together by lrhseg() and returned to the calling function. The internal operations of the function lrhseg() are described in a later section.

The arrays rlblmap_4cols and rlblmap_4rows are now loaded with values from rlblmap_data. The array rlblmap_4cols contains the first, middle two and last columns from rlblmap_data, and the array rlblmap_4rows contains the first, middle two and last rows from rlblmap_data. The function update_nghbr_mask() is now called with the following input variables and data:

ncols	An int variable containing the number of columns in the current section of input image data
nrows	An int variable containing the number of rows in the current section of input image data
nregions	An int variable containing the current value of the number of regions is the current section of segmented data
rlblmap_4cols	An unsigned short array containing the first, middle two and last columns of the current region label map data.
rlblmap_4rows	An unsigned short array containing the first, middle two and last rows of the current region label map data.
nghbr_mask	An unsigned char array of numbers designating, for the current section of data, whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.

The function update_nghbr_mask() modifies the input nghbr_mask array so that it includes the neighbor relationship across the middle two columns and rows of the current region label map. The nghbr_mask array had been previously lacking this information. See a later section for a functional description of update_nghbr_mask().

The function region_list_init() is now called with the following input variable:

nregions

An int variable containing the number of regions in the current section of segmented data

The following data array is output by region_list_init():

region_list

A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

See a later section for a functional description of the region_list_init() function. The purpose of the region_list_init() function is to initialize the region_list structure.

After the call to region_list_init() is completed, the following parameters are set: init_flg = TRUE, onregions = nregions and cvnbregs = chkregions (note that chkregions is a user set parameter). As in the HSEG program, if nregions <= chkregions, the calls to the functions lhseg() and read_region_list() are skipped and, if nregions > 2, the program proceeds to call the function fhseg(). Otherwise, call the function lhseg() is called in exactly same manner it was called in the HSEG program. From this point the RHSEG program proceeds in exactly the same manner as the HSEG program.

Functional description of the region_data_init()function: The first operation performed by this function is to initialize the region label map, rlblmap_data. If input region label map data is provided, the maximum label, max_reglbl, in the input region label map data is found, and the elements of rlblmap_data with input value "0," that are at non-masked out locations (according to mask_data), are sequentially given unique labels higher than max_reglbl. The region label map is then renumbered so that the regions have unique labels running from 1 to nregions (the number of regions in this section of data). If input region label map data is not provided, each non-masked out location in rlblmap_data is given a unique label, running from 1 to nregions.

Next the first nregions elements of region_list, and the reg_struct structures linked to these elements of region_list, are initialized such that the active_flag is set to "TRUE," the label's run sequentially from 1 to nregions, the pointers bnghbr, bregion and mgreg are set to "NULL," and the values bncomp and brcomp are set to "MAXFLOAT" (the maximum floating point value). In addition, the first nregions elements of the npix_data (and spatial_data, if spatial_wght > 0.0) array(s) are initialized to 0.0 and the first nregions of the nbands sum_data (and sumsq_data, if spatial_wght > 0.0) array(s) are initialized to 0.0.

Next the npix_data, sum_data, spatial_data (if spatial_wght > 0.0), and sumsq_data (if spatial_wght > 0.0) are calculated by scanning through rlblmap_data, input_data, spatial_init (if spatial_wght > 0.0) and mask_data (if provided). The value of npix_data[j] is the number of pixels in region j+1 and the value of sum_data[b+j*nbands] is the sum of the normalized input data for band b and region j+1. If (spatial_wght > 0.0), the value of spatial_data[j] is the sum of the input spatial_init values for region j+1 and the value of sumsq_data[b+j*nbands] is the sum of the squares of the normalized input data for band b and region j+1.

In the above initialization, if spatial_wght > 0.0, spatial_data is set to be the sum of the input spatial_init values in each region. However, this is not the correct final value for spatial_data. If npix_data[j] >= 9, spatial_data[j] should be the maximum over the spectral bands of the variance or standard deviation of region j+1, depending on the user specified value of the spatial_type flag. If npix_data[j] < 9, spatial_data[j] should be a weighted average of the maximum over the spectral bands of the variance or standard deviation of region j+1 with the average spatial_init values for the pixels in the region.

For spatial_wght > 0.0, npix_data[j] >= 9, and spatial_type == 1 ("Variance"), spatial_data[j] for region j+1 is calculated as follows (see also Equations (9) and (10) above):

(14)

where B is the number of spectral bands. If spatial_type == 2 ("Standard Deviation"), the square root is taken before the band maximum is found. Note that Equation (14) is mathematically equivalent to the combination of Equations (9) and (10). For npix_data[j] == 1, spatial_data[j] is set equal to the input spatial_init value for the pixel in question. For npix_data[j] > 1 and npix_data[j] < 9, spatial_data[j] is set equal to a weighted sum of the spatial_data[j] value calculated from Equation (14) and the average spatial_init values for the pixels in region j+1 as per Equation (11).

Finally, the ngbr_mask is initialized by scanning rlblmap_data. nghbr_mask[j + k*max_nregions] is set to the value TRUE (or 1) if and only if region j is a neighbor of region k. Whether of not a pixel is considered to be a neighbor of another pixel is determined by the value of maxmdir. If a pixel is one of the maxmdir nearest neighbors of another pixel, then it is a neighbor of that pixel.

Functional description of the lhseg()function: If init_flg == TRUE upon entering the lhseg() function, the dissimilarity function values for all neighboring regions of each region are calculated, and bnghbr and bncomp are updated for each region. Then a sorted structure of type R_LIST, pointed to by the pointer head_nbsorted_list, is created such that is ordered by the value of bncomp. If there is more than one region with the same value of bncomp, the regions are secondarily ordered by region label value (lower to higher). The value of the variable minbcomp is set equal to the value of bncomp of the first element of this sorted list. If init_flg == FALSE upon entering the lhseg() function, the dissimilarity function values for all neighboring regions need not be calculated (they had been previously calculated), and sorted lists of type R_LIST, pointed to by the pointers head_nbsorted_list and head_sorted_list, are sorted by the value of bncomp and brcomp, respectively. When ties occur in the values of bncomp and brcomp, the regions are secondarily ordered by region label value (lower to higher).

Next the region pointed to by the head_nbsorted_list, and its most similar neighboring region, are merged. The merged region is given the label of the region with the lower valued label. The values for the new region in the npix_data, sum_data, and the nghbr_mask arrays are updated, as are the sumsq_data and spatial_data arrays, if necessary. The value of mgreg is updated for the region with the higher region label value. The list pointed to by the pointer head_nbsorted_list is resorted as necessary. If the region pointed to by the resorted head_nbsorted_list has bncomp <= minbcomp, the indicated regions are also merged. This continues until the region pointed to by the resorted head_nbsorted_list has bncomp > minbcomp. This constitutes an iteration of region growing.

If init_flg == TRUE, the dissimilarity function values for all non-neighboring regions of each region are now calculated, and bregion and brcomp are updated for each region. Then a sorted structure of type R_LIST, pointed to by the pointer head_sorted_list, is created such that is ordered by the value of brcomp. If there is more than one region with the same value of brcomp, the regions are secondarily ordered by region label value (lower to higher). The value of init_flg is now set to FALSE. If init_flg was previously equal to FALSE, these dissimilarity function values were previously calculated and sorted.

If the region pointed to by the head_sorted_list, and its most similar neighboring region, are merged if its value of brcomp < minbcomp (minbcomp was obtained from the merges of neighboring regions). The merged region is given the label of the region with the lower valued label. The values for the new region in the npix_data, sum_data, and the nghbr_mask arrays are updated, as are the sumsq_data and spatial_data arrays, if necessary. The value of mgreg is updated for the region with the higher region label value. The lists pointed to by the pointer head_sorted_list and head_nbsorted_list are resorted as necessary. If the region pointed to by the resorted head_sorted_list has brcomp <= minbcomp, the indicated regions are also merged. This continues until the region pointed to by the resorted head_nbsorted_list has brcomp > minbcomp. This constitutes an iteration of spectral clustering.

If the current number of region (nregions) is less than cvnbregs, the lhseg() function exits. Otherwise, additional alternate iterations of region growing (based on bncomp and bnghbr) and spectral clustering (based on brcomp and bregion) are performed until nregions <= minregions. In these additional iterations, the exhaustive calculation of dissimilarity function values for neighboring and non-neighboring regions is not required, as these values are updated at each merge. Note that lhseg() does not exit if nregions < cvnbregs after a region growing iteration. The function can only exit after a spectral clustering iteration.

Functional description of the fhseg()function: The fhseg() function is identical to the lhseg() function except for additional code for detecting the natural convergence points and code to output the segmentation results at the natural convergence points.

In Section II, part B the basic HSEG algorithm was described. In step 7, it was noted that at certain iterations, the region information from the previous iteration must be saved. Accordingly, fhseg() allocates and initializes and updates data arrays to store this information. The data array prev_rlblmap is initialized to equal the current value of rlblmap_data, and prevcritval is initialized to 0.0, and the current value of the global criterion value, critval, is calculated through a call to the function compute_cv().

In lhseg() the rlblmap_data is not updated. However, in fhseg() the rlblmap_data is updated after each group of region growing and spectral clustering merges (steps 2 through 5 of the basic HSEG algorithm description in Section II, part B). In addition, the region number of pixels and region sum vectors are updated. If spatial_wght > 0.0, the region sum square vectors and region spatial feature values are also updated.

After each region growing and spectral clustering (if requested) iteration, the value of cvratio = critval/prevcritval is calculated. If cvratio > convfact, then the region information for the previous iteration (prior to the current group of region growing and spectral clustering merges) is sent to files. This information includes the number of regions, the region number of pixels list, the region mean vector list and the region criterion value list, and, if requested, the region merges list. If the region merges list is not requested, the region label map is output each time. If the region merges list is requested, the region label map is output only the first time region information is output.

If after an iteration of spectral clustering merges, the number of regions is less than cvnbregs, the fhseg() function outputs the current region segmentation information to files and then exits.

Functional description of the lrhseg()function: Upon entering the lrhseg() function, the ncols and nrows variables are divided by 2 and the level variable is incremented by one. (This sets these variables to the proper values for the next set of recursive calls to recur_hseg(), described below.) Then the nghbr_mask array is initialized to all FALSE values, and the npix_sub, sum_sub, sumsq_sub, and spatial_sub arrays are initialized to npix_data, sum_data, sumsq_data, and spatial_data, respectively. Next, the data array starting points for the next level of recursion are saved into "_sub_sub" arrays as follows:

        input_sub_sub = input_sub + ncols*nrows*nbands;
        if (mask_flag)
          mask_sub_sub = mask_sub + ncols*nrows;
        if (spatial_wght > 0.0)
          spatial_init_sub_sub = spatial_init_sub + ncols*nrows;
        rlblmap_sub_sub = rlblmap_sub + ncols*nrows;
        nghbr_sub_sub = nghbr_sub + max_nregions*max_nregions;

Processing is now initiated for the first subsection of data (out of four). The data arrays input_sub, mask_sub (if provided), spatial_init_sub (if provided), and rlblmap_sub (if provided) and loaded with data from the upper left quadrant of the input data arrays input_data, mask_data, spatial_init, and rlblmap_data, respectively.

The function recur_hseg() is now called with the following input variable and array data:

input_sub	An unsigned short array containing the current section of input image data. Is renamed input_data internal to recur_hseg().
mask_sub	An unsigned char array containing the current section of mask data (if any). Is renamed mask_data internal to recur_hseg().
rlblmap_sub	An unsigned short array containing the current section of input region label map data (if any). (Will be overwritten by the output region label map data.) Is renamed rlblmap_data internal to recur_hseg().
spatial_init_sub	A float array containing the current section of input spatial feature data (local image variance or standard deviation - if any). Is renamed spatial_init internal to recur_hseg().
ncols	An int variable containing the number of columns in the current section of input image data
nrows	An int variable containing the number of rows in the current section of input image data
level	An int variable containing the value of the current level of recursion
roffset	An int variable containing the offset value required for the region labels to keep them distinct from the region labels in the previous quarter(s) of data (for the first quarter, roffset = 0)

The global variables mask_flag, rlblmap_in_flag, nbands, sub_ncols, sub_nrows, max_nregions, scale, offset, maxmdir, simcrit, extmean, spclust, spatial_wght, spatial_type, rnb_levels and minregions are also used as inputs by recur_hseg(). Certain variables and arrays are output by recur_hseg(). They are as follows:

sub_nregions[q]	An int variable containing the number of regions in the segmentation of the section of data processed. (For the first quarter of data, q = 0.)
rlblmap_sub	An unsigned short array containing the output region label map for the current section of data. This region label map specifies the spatial distribution of the regions.
npix_sub	An unsigned int array containing the current region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_sub	A double array containing the current sum of the normalized input data values over all pixels in each region. Element j+sub_nregions[q]*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_sub	A double array containing the current sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+sub_nregions[q]*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_sub	A double array containing the current region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_sub	An unsigned char array of numbers designating, for the current section of data, whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.

Other data arrays are passed into recur_hseg() for use as work space. These are input_sub_sub, mask_sub_sub, rlblmap_sub_sub, spatial_init_sub_sub, rlabel_data, region_list, relabel_list, nghbr_sub_sub, nghbr_tmp, rlblmap_4cols, rlblmap_4rows, sort_store and nbsort_store. The arrays input_sub_sub, mask_sub_sub, rlblmap_sub_sub, spatial_init_sub_sub, and nghbr_sub_sub, are renamed to input_sub, mask_sub, rlblmap_sub, spatial_init_sub, and nghbr_sub internal to recur_hseg().

If level == rnb_levels, the function recur_hseg() performs hierarchical segmentation on the specified section of data. Otherwise, it makes a further recursive call to lrhseg(), and then performs hierarchical segmentation on the resulting data. The internal operations of the function recur_hseg() are described in a later section.

The outputs from recur_hseg() are now translated into the data arrays for the current reassembled section of data. The nghbr_mask is updated as follows (for the first quarter of data, q = 0 and roffset = 0):

        for (index = 0; index < sub_nregions[q]; index++)
          for (subindex = 0; subindex < sub_nregions[q]; subindex++)
            nghbr_mask[(index+roffset)*max_nregions + (subindex+roffset)] = 
                                        nghbr_sub[index*max_nregions + subindex];

The rlblmap_sub data is also loaded into the upper left quarter of the rlblmap_data array. The npix_data, sum_data, sumsq_data and spatial_data, arrays are updated automatically by the way the npix_sub, sum_sub, sumsq_sub and spatial_sub arrays are specified.

Processing is now initiated for the second subsection of data (out of four). The data arrays input_sub, mask_sub (if provided), spatial_init_sub (if provided), and rlblmap_sub (if provided) and loaded with data from the upper right quadrant of the input data arrays input_data, mask_data, spatial_init, and rlblmap_data, respectively, and the npix_sub, sum_sub, sumsq_sub, and spatial_sub arrays are incremented as follows (here q = 0):

        npix_sub += sub_nregions[q];
        sum_sub += sub_nregions[q]*nbands;
        if (spatial_wght > 0.0)
        {
          sumsq_sub += sub_nregions[q]*nbands;
          spatial_sub += sub_nregions[q];
        }

The function recur_hseg() is now called in the same manner as for the first quarter of data except that q = 1 and roffset = sub_nregions[0].

The outputs from recur_hseg() are now translated into the data arrays for the current reassembled section of data. The nghbr_mask is updated as follows (for the second quarter of data, q = 1 and roffset = sub_nregions[0]):

    for (index = 0; index < sub_nregions[q]; index++)
          for (subindex = 0; subindex < sub_nregions[q]; subindex++)
            nghbr_mask[(index+roffset)*max_nregions + (subindex+roffset)] =
                                        nghbr_sub[index*max_nregions + subindex];

The rlblmap_sub data is also loaded into the upper right quarter of the rlblmap_data array. The npix_data, sum_data, sumsq_data and spatial_data, arrays are updated automatically by the way the npix_sub, sum_sub, sumsq_sub and spatial_sub arrays are specified.

Processing is now initiated for the third subsection of data (out of four). The data arrays input_sub, mask_sub (if provided), spatial_init_sub (if provided), and rlblmap_sub (if provided) and loaded with data from the lower left quadrant of the input data arrays input_data, mask_data, spatial_init, and rlblmap_data, respectively, and the npix_sub, sum_sub, sumsq_sub, and spatial_sub arrays are incremented as follows (here q = 1):

        npix_sub += sub_nregions[q];
        sum_sub += sub_nregions[q]*nbands;
        if (spatial_wght > 0.0)
        {
          sumsq_sub += sub_nregions[q]*nbands;
          spatial_sub += sub_nregions[q];
        }

The function recur_hseg() is now called in the same manner as for the first quarter of data except that q = 2 and roffset = sub_nregions[0] + sub_nregions[1].

The outputs from recur_hseg() are now translated into the data arrays for the current reassembled section of data. The nghbr_mask is updated as follows (for the third quarter of data, q = 2 and roffset = sub_nregions[0] + sub_nregions[1]):

        for (index = 0; index < sub_nregions[q]; index++)
          for (subindex = 0; subindex < sub_nregions[q]; subindex++)
            nghbr_mask[(index+roffset)*max_nregions + (subindex+roffset)] =
                                        nghbr_sub[index*max_nregions + subindex];

The rlblmap_sub data is also loaded into the lower left quarter of the rlblmap_data array. The npix_data, sum_data, sumsq_data and spatial_data, arrays are updated automatically by the way the npix_sub, sum_sub, sumsq_sub and spatial_sub arrays are specified.

Processing is now initiated for the fourth subsection of data (out of four). The data arrays input_sub, mask_sub (if provided), spatial_init_sub (if provided), and rlblmap_sub (if provided) and loaded with data from the lower right quadrant of the input data arrays input_data, mask_data, spatial_init, and rlblmap_data, respectively, and the npix_sub, sum_sub, sumsq_sub, and spatial_sub arrays are incremented as follows (here q = 2):

        npix_sub += sub_nregions[q];    
        sum_sub += sub_nregions[q]*nbands;
        if (spatial_wght > 0.0)
        {
          sumsq_sub += sub_nregions[q]*nbands;
          spatial_sub += sub_nregions[q];
        }

The function recur_hseg() is now called in the same manner as for the first quarter of data except that q = 3 and roffset = sub_nregions[0] + sub_nregions[1] + sub_nregion[2].

The outputs from recur_hseg() are now translated into the data arrays for the current reassembled section of data. The nghbr_mask is updated as follows (for the fourth quarter of data, q = 3 and roffset = sub_nregions[0] + sub_nregions[1] + sub_nregion[2]):

        for (index = 0; index < sub_nregions[q]; index++)
          for (subindex = 0; subindex < sub_nregions[q]; subindex++)
            nghbr_mask[(index+roffset)*max_nregions + (subindex+roffset)] =
                                        nghbr_sub[index*max_nregions + subindex];

The rlblmap_sub data is also loaded into the lower right quarter of the rlblmap_data array. The npix_data, sum_data, sumsq_data and spatial_data, arrays are updated automatically by the way the npix_sub, sum_sub, sumsq_sub and spatial_sub arrays are specified.

Finally, the function lrhseg() sets nregions = sub_nregions[0] +sub_nregions[1] +sub_nregions[2] +sub_nregions[3] and returns to the calling function.

Functional description of the recur_hseg()function: If level == rnb_levels, the function recur_hseg() calls the region_data_init() function with the following input variable and array data:

input_data	An unsigned short array containing the current section of input image data
mask_data	An unsigned char array containing the current section mask data (if any)
rlblmap_data	An unsigned short array containing the current section of input region label map data (if any). (Will be overwritten by the output region label map data.)
spatial_init	A float array containing the current section of input spatial feature data (local image variance or standard deviation - if any)
ncols	An int variable containing the number of columns in the current section of input image data
nrows	An int variable containing the number of rows in the current section of input image data

nregions	An int variable containing the initial number of regions in the current section of data.
rlblmap_data	An unsigned short array containing the initial output region label map for the current section of data. This region label map specifies the initial spatial distribution of the regions.
npix_data	An unsigned int array containing the initial region number of pixels list stored as a vector for the current section of data. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the initial sum of the normalized input data values over all pixels in each region for the current section of data. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the initial sum of the square of the normalized input data values over all pixels in each region for the current section of data. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the initial region spatial feature value list stored as a vector for the current section of data. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors) for the current section of data. The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region for the current section of data.

The array rlabel_data is also passed to the function region_data_init() for use as work space. See above for a functional description of the region_data_init() function. The purpose of the region_data_init() function is to initialize the variables and arrays output by the region_data_init() function based on the input variable and array data.

The function lrhseg() is now called with the following input variable and array data:

input_data	An unsigned short array containing the current section of input image data
mask_data	An unsigned char array containing the current section of mask data (if any)
rlblmap_data	An unsigned short array containing the current section of input region label map data (if any). (Will be overwritten by the output region label map data.)
spatial_init	A float array containing the current section of input spatial feature data (local image variance or standard deviation - if any)
ncols	An int variable containing the number of columns in the current section of input image data
nrows	An int variable containing the number of rows in the current section of input image data
level	An int variable containing the value of the current level of recursion

nregions	An int variable containing the number of regions in the segmentation of the section of data processed.
rlblmap_data	An unsigned short array containing the output region label map for the current section of data. This region label map specifies the spatial distribution of the regions.
npix_data	An unsigned int array containing the current region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the current sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the current sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the current region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating, for the current section of data, whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.

ncols	An int variable containing the number of columns in the current section of input image data
nrows	An int variable containing the number of rows in the current section of input image data
nregions	An int variable containing the current value of the number of regions is the current section of segmented data
rlblmap_4cols	An unsigned short array containing the first, middle two and last columns of the current region label map data.
rlblmap_4rows	An unsigned short array containing the first, middle two and last rows of the current region label map data.
nghbr_mask	An unsigned char array of numbers designating, for the current section of data, whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.

The function update_nghbr_mask() modifies the input nghbr_mask array so that it includes the neighbor relationship across the middle two columns and rows of the current region label map. The nghbr_mask array had been previously lacking this information. See a previous section for a functional description of update_nghbr_mask().

The function region_list_init() is now called with the following input variable:

nregions

An int variable containing the number of regions in the current section of segmented data

The following data array is output by region_list_init():

region_list

A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

See a previous section for a functional description of the region_list_init() function. The purpose of the region_list_init() function is to initialize the region_list structure.

After the call to either region_data_init() or lrhseg(), update_nghbr_mask() and region_list_init() are completed, the following parameters are set: init_flg = TRUE, onregions = nregions and cvnbregs = minregions (note that minregions is a user set parameter). If nregions <= minregions, the calls below to the functions lhseg() and read_region_list() are skipped. Otherwise, the function recur_hseg() calls the function lhseg() with the following input variable and array data:

init_flag	An int variable which indicates whether or not the region_list has just been initialized (in this case init_flag = TRUE).
onregions	An int variable containing the number of regions existing when lhseg() is initially called.
cvnbregs	An int variable indicating the number of regions at which that this call to lhseg() will converge (exit).
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

nregions	An int variable containing the initial number of regions in the section of data processed.
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The arrays sort_store, and nbsort_store are also passed to the lhseg() function to use as work space. See a previous section for a functional description of the lhseg() function. The purpose of the lhseg() function is to perform multiple iterations of region growing intermixed with spectral clustering (if requested) until the number of regions becomes less than or equal to cvnbregs (in this case, minregions).

After the lhseg() function exits, the function read_region_list() is called with the following input variable and array data:

nregions	An int variable containing the number of regions in the section of data processed.
npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.

The global variables nbands, max_nregions and spatial_wght are also used as input by read_region_list(). The following variables and arrays are modified and/or output by read_region_list():

npix_data	An unsigned int array containing the region number of pixels list stored as a vector. The j^th element of the vector is the number of pixels in region j+1.
sum_data	A double array containing the sum of the normalized input data values over all pixels in each region. Element j+nregions*b is the sum value for region j+1 and band b (first band is band number 0).
sumsq_data	A double array containing the sum of the square of the normalized input data values over all pixels in each region. (Actually contains values only if spatial_wght > 0.0.) Element j+nregions*b is the sum square value for region j+1 and band b (first band is band number 0).
spatial_data	A double array containing the region spatial feature value list stored as a vector. (Actually contains values only if spatial_wght > 0.0.) The j^th element of the vector is the spatial feature value for region j+1.
nghbr_mask	An unsigned char array of numbers designating whether or not a pair of regions are spatially adjacent (i.e., neighbors). The value of the mask at array element j+max_nregions*k is TRUE if and only if regions j+1 and k+1 are spatially adjacent.
region_list	A reg_list (R_LIST) structure array pointing to a sequence of reg_struct (REGION) structure elements containing label and comparative information about each region.
relabel_list	An unsigned short array consisting of pairs of region labelings representing a renumbering of the region map labeling.

ncols	An int variable containing the number of columns in the input image data
nrows	An int variable containing the number of rows in the input image data
nelements	An int variable containing the number of non-zero elements in the relabel_list array
relabel_list	An unsigned short array consisting of pairs of region labelings representing a renumbering of the region map labeling.
rlblmap_data	An unsigned short array containing the region label map data as it existed prior to the call to lhseg().

The following data array is modified by update_rlblmap():

rlblmap_data

An unsigned short array containing the region label map data renumbered according to relabel_list.

At this point the function recur_hseg() exits returning the number of regions, nregions, to the calling function.

E. - Supportive theory:

Definition of image segmentation

In Section I, Part C we gave the following classic definition of image segmentation (following [3]):

Let X be a two-dimensional array representing an image. A segmentation of X can be defined as a partition of X into disjoint subsets X₁, X₂, ..., X_N, such that

X_i, i = 1, 2, ..., N is connected.
for i = 1, 2, ..., N, and
for i != j, where X_i and X_j are adjacent.

P(X_i) is a logical predicate which assigns the value TRUE or FALSE to X_i, depending on the image data values in X_i.

Zucker [4] summarized the above definition as follows: The first condition requires that every picture element (pixel) must be in a region. The second condition requires that each region must be connected, i.e. composed of contiguous image pixels. The third condition determines what kind of properties each region must satisfy, i.e. what properties the image pixels must satisfy to be considered similar enough to be in the same region. The fourth condition specifies that, in the final segmentation result, any merging of any adjacent regions would violate the third condition.

A problem with this classic definition of image segmentation is that the segmentation so defined is not unique. The number, N, and shape of the partitions, X₁, X₂, ..., X_N, depend on the order in which the image pixels are processed. In addition, there is no concept of optimality contained in this definition of image segmentation. Under this classic definition, all partitions that satisfy the conditions represent equally good or valid segmentations of the image.

An ideal definition of image segmentation would be as follows:

Let X be a two-dimensional array representing an image. A segmentation of X into N regions can be defined as a partition of X into disjoint subsets X₁, X₂, ..., X_N, such that

X_i, i = 1, 2, ..., N is connected.
over all partitions into N regions, and
for i != j, where X_i and X_j are adjacent.

G(X_i) is a function that assigns a cost to partition X_i, depending on the image data values in X_i.

These conditions can be summarized as follows: The first condition requires that every picture element (pixel) must be in one of N regions. The second condition requires that each region must be connected, i.e. composed of contiguous image pixels. The third condition states that the partition must produce a minimum cost aggregated over all N regions. The fourth condition specifies that, in the final segmentation result, any merging of adjacent regions increases the minimum cost obtained in the third condition.

As a result of these conditions, the order dependence problem is eliminated because the global minimum solution is found, and this solution is the optimal solution. In practice, this ideal image segmentation is difficult, if not impossible, to find. The third condition implies that all possible image partitions consisting of N regions must be searched to find the minimum cost. Further, the question of the proper value for N is left undetermined.

Schachter, et al [5] suggest that an iterative parallel region growing process be used to eliminate the order dependence problem. Kettig and Landgrebe [6] suggest an alternative partitioning logic in which the most similar neighboring region is merged first, but found this approach too difficult to implement in a sequential manner with the computing resources they had at that time. Tilton and Cox [7] propose implementing an iterative parallel approach to region growing on parallel processors in order to overcome the computational demands of this approach. In their approach, the most similar pair(s) of spatially adjacent regions are merged at each iteration. This approach solved the order dependence problem (assuming a deterministic tie-breaking method is employed), but did not fully address the optimal segmentation problem. Merging the most similar pair(s) of spatially adjacent regions at each iteration does not guarantee that the segmentation result at a particular iteration is the optimal partition of the image data for the number of partitions obtained at that iteration. Beaulieu and Goldberg [8] provide a theoretical basis for Tilton and Cox's iterative parallel region growing approach in their theoretical analysis of their similar Hierarchical Stepwise Optimization algorithm (HSWO). They show that the HSWO algorithm produces the globally optimal segmentation result if each iteration is statistically independent. Even though each iteration will generally not be statistically independent for natural images, the HSWO approach is shown to still produce excellent results. Beaulieu and Goldberg also point out that the sequence of partitions generated by this iterative approach reflect the hierarchical structure of the imagery data: the partitions obtained in the early iterations preserve the small details and objects in the image, while the partitions obtained in the latter iterations preserve only the most important components of the image. They further note that these hierarchical partitions may carry information that may help in identifying the objects in the imagery data.

The definition of image segmentation as followed by the HSWO algorithm is defined recursively as follows:

Let X be a two-dimensional array representing an image, and let X₁, X₂, ..., X_N-1, X_N be a partition of X into N regions such that

, and
X_i , i = 1, 2, ..., N is connected.

Let G(X_i) be a function that assigns a cost to partition X_i, depending on the image data values in X_i. Reorder the partition X₁, X₂, ..., X_N-1, X_N such that for all i != j. The segmentation of X into N-1 regions is defined as the partition where for i = 1, 2, ..., N-2 and .

The initial partition may assign each image pixel to a separate region, in which case the initial value of N is the number of pixels in the image (N_p). Any other initial partition may be used, such as a partitioning of the image into nxn blocks, where n² << N_p.

The region growing approach utilized by the hierarchical image segmentation algorithm, HSEG, is the same as that employed by Beaulieu and Goldberg's HSWO algorithm except when HSEG alternates spectral clustering iterations with region growing iterations to merge non-adjacent regions. We have found that such spectral clustering adds robustness to the segmentation result, and eliminates the bookkeeping overhead of separately accounting for essentially identical non-adjacent regions.

Review of region growing strategies.

See Section I, Part C for a review of region growing strategies.

Review of region comparison strategies.

All region growing approaches require some criterion to determine whether or not pixels and/or regions should be combined together to form a new region. Muerle and Allen [9] and Kettig and Landgrebe [10] devised similarity criteria that compared aggregated statistical characterizations of the regions. In this approach, the values of the pixels making up regions are assumed to be manifestations of populations from random processes. Under this assumption, statistical measures are devised to compare pairs of regions.

Pavlidis [11] approximated image regions by polynomial functions. Regions were merged if the coefficients of their polynomial function approximation were similar. This approach allows for the possibility of regions in which the pixel values gradually vary.

Brice and Fennema [12] based their similarity criterion on a combination of the "weakness" of the boundary between two regions and the comparative region boundary lengths. They merged two adjacent regions if the boundary between them was weak, and the resulting new region had a shorter boundary than the previous two.

The HSEG algorithm is implemented in a region oriented fashion that requires minimal reference to data in the original full spatial resolution. The polynomial function approximation approach of Pavlidis and the boundary weakness approach of Brice and Fennema are not amenable to this type of implementation since they require repeated reference to the input data at full spatial resolution. However, the type of region comparison strategy that compares aggregated characteristics of regions is an ideal approach for the HSEG implementation. However, since Muerle and Allen, numerous region comparison strategies of this form have been proposed, which amount to different forms for a region similarity or dissimilarity criterion. We discuss many of these region similarity or dissimilarity criteria in the following. Any of these region similarity or dissimilarity criteria can be used in the HSEG algorithm, without changing the nature of the algorithm. Which criteria is best will depend on the characteristics of the image being segmented.

The first dissimilarity criterion tested by Muerle and Allen [9] is the Kolmogorov-Smirnov two-tailed hypothesis test [13]. This is a test of whether two independent sample sets have been drawn from populations having the same statistical distribution. In this test the maximum absolute difference is found between the cumulative distributions of the two data populations. If this maximum absolute difference is below a pre-set threshold, the two populations are taken to be from the same statistical distribution, and the two regions should be merged. However, this test can give inappropriate results. For example, when the two regions consists of data values of constant value differing by only one gray level value, this test will reject the hypothesis that the two regions consist of data from the same population, even for a very high threshold. For this reason, Muerle and Allen abandoned this test for an ad hoc criterion in which two regions are merged when the average or area under the absolute value of the difference between the two distributions is below a pre-set threshold.

Kettig and Landgrebe [6,10] used the Student's t-statistic assuming equal region variances as a dissimilarity criterion. Tilton and Cox [7] also used the Student's t-statistic as a region means comparison test, but allowed for unequal region variances. This version of the Student's t-statistic with unequal region variances is also referred to in the statistical literature (e.g. [14,15]) as the Behrens-Fisher problem. This problem has no exact solution, but according to Sachs [15], it is usually adequate to calculate an approximate degree of freedom value for the Student's t-statistic and perform the standard Student's t-test. For regions 1 and 2, the Student's t-statistic is:

(15)

where m_i is the mean of region i, is the variance of region i, and d_i is the degrees of freedom of region i. The approximate degree of freedom value (rounded to the nearest integer) for the test is:

(16)

Tilton and Cox combined the above region means comparison test with the F-ratio test as a region variance comparison test. The F-ratio for regions 1 and 2 is simply . Tilton and Cox combined the region means and variance comparison tests by taking the geometric mean (or, equivalently, the product) of the probabilities of random occurrence of the calculated t- and F-values. This is different than the usual practice in statistical hypothesis testing. Usually, one selects in advance a particular significance level and accepts or rejects the null hypothesis at that significance level. The tests would be performed separately for each statistic and the two regions would be merged only if both tests indicate the regions are from the same statistical population. Tilton and Cox instead threshold on the geometric mean, which allows for a trade off of similarity in mean with similarity in variance.

Vector norms have long been used to compare vectors such as those contained in multispectral imagery data (e.g. [2]). The 1-Norm for regions j and k is (for B spectral bands)

1-Norm =

(17)

where and are the mean values for regions j and k, respectively, in spectral band b. Similarly, the 2-Norm is

2-Norm =

(18)

and the Infinity-Norm is

Infinity-Norm = max {

: b=1,2, ...,B}.

(19)

The 1-, 2- and Infinity-Norm are currently implemented as dissimilarity criterion in the HSEG algorithm. Any of the other aggregated criteria described in this section could also be implemented as dissimilarity criterion in the HSEG algorithm.

The band average and band maximum normalized mean squared error (BAMSE and BMMSE) criteria were first defined by Tilton [16]. The normalization, performed on the input data prior to performing the mean squared error calculation, is defined as follows:

Let x_bi be the original value for the i^th pixel (out of N pixels) in the b^th band (out of B bands). The mean and variance of the b^th band are

, and

(20)

respectively. To normalize the data to have mean = M_b and variance = S², set

(21)

For convenience, the data is normalized so that S² (= S) = 1. Since an entropy-based criterion (see below) requires that all data values be strictly positive, we set the mean value, M_b, of the normalized data to be the value that will produce a minimum value of 2 (so as to avoid computational problems calculating ln(y_bi)). That is,

(22)

The above description of image normalization is applied to each spectral band separately. Normalization can also be defined to apply equally across all spectral bands. In this case use for in Equations (21) and (22), and perform the minimization in Equation (22) across all bands as well as across all image pixels.

The Mean Squared Error (MSE) of the b^th band of an N-pixel multiband region mean image partitioned into R regions is defined as

(23)

where the y_bi are the normalized values (see Equation (21) above) of the i^th pixel of the b^th band in region r, and is the normalized mean value of the b^th band of region r containing pixel i.

For a particular pair of regions, is the change in MSE_b for band b for the region mean image formed after a pair of regions (originally labeled j and k, respectively) is merged as compared to the region mean image before the merge. The change in mean squared error from merging regions j and k is calculated as follows:

(24)

where n_j and n_k are the number of pixels in regions j and k, respectively, and are the means of the b^th band of regions j and k, respectively, before the regions are merged, and is the mean of the b^th band of region formed from combining regions j and k.

Since

(25)

an alternate form for Equation (24) is:

(26)

Based on the above derivation of (Equation (24) or (26)), the Band Average Normalized Mean Squared Error criterion comparing regions j and k is then

(27)

Based on the derivation of culminating in Equation (23) or (26), the Band Maximum Normalized Mean Squared Error criterion comparing regions j and k is defined as

BMMSE^jk = max {

: b=1,2, ...,B}.

(28)

Cost or dissimilarity functions based on unnormalized are used by many other authors, including Beaulieu and Goldberg [8] and Haris, et al [17].

The Entropy (summed over bands) criterion was first defined in Tilton [16]. In order to avoid computational problems with this criterion, the input data must be normalized when using this criterion (either across bands or over bands separately). See the above description of the BAMSE and BMMSE criteria for a description of the required normalization process.

The entropy of the b^th band of an N-pixel multiband image (B spectral bands) is defined as (from Frieden [18]):

(29)

where y_bi is the normalized data value from band b at pixel i, and M_b is defined in Equation (22) above. We take the total multispectral entropy to be

(30)

This summation is strictly true only if all spectral bands are uncorrelated, which is generally not the case.

For a particular pair of regions, labeled j and k, is the change in H for the multispectral region mean image formed after the pair of regions is merged as compared to the region mean image before the merge. The change in entropy from merging regions j and k is

(31)

This is the Entropy (summed over bands) criterion.

The Normalized Vector Distance (NVD) is taken from Baraldi and Parmiggiani [19]. They define the factors and based on the match, respectively, between the modulus and the angle of the feature vectors being compared. Let and be the mean vectors from class j and k, respectively. Then

(32)

where and are the moduli of and , respectively. Let be defined as follows:

(33)

where is the scalar product between and . Assuming that and always have non-negative valued elements (as is the case with remotely sensed multispectral imagery data), will always be in the range from 0 to , is defined as the normalized value of :

(34)

is defined as the product of and . Finally, NVD is defined as 1-.

Region Growing Threshold Value or Stopping Criteria

All region growing approaches must face the question of how to determine how many regions should exist in the final result. For approaches based on the classic logical predicate definition of image segmentation, this question amounts to determining the appropriate threshold value for the logical predicate test. For approaches based on hierarchical stepwise optimization (HSWO), this question can also depend on the determination of an appropriate threshold value for, in this case, the cost function. However, for HSWO, this question can also be answered in other ways. The simplest is to stop the iterative region growing process when a certain preset number of regions is obtained. A more flexible approach would be to base the stopping point on the dynamical behavior of the cost function. When segmenting a noisy checkerboard image, Beaulieu and Goldberg [8] observed a steep drop in the value of their cost function when a segmentation directly corresponding to the checkerboard squares was obtained. They looked for similar drops in their cost function when segmenting a Landsat satellite picture, but did not find any.

An advantage of the HWSO approach is that the final product need not be a single segmentation. Instead, a hierarchical set of segmentations can be produced. Tilton [20] found that an appropriate hierarchical set of segmentations could be selected by observing the behavior of a global cost function. In this case, the cost function was a function measuring the dissimilarity between the original image data and the region mean image resulting from the current segmentation. Tilton selected a segmentation for inclusion in the output segmentation hierarchy if the ratio of the global dissimilarity for that segmentation to that of the next segmentation exceeded a preset threshold value. The threshold was selected so that the number of segmentation included in the segmentation hierarchy came within a certain range of values (e.g., from 20 to 30). This is the approach used in the current version of the HSEG program discussed in this patent application.

REFERENCES

J. C. Tilton, "A recursive PVM implementation of an image segmentation algorithm with performance comparisons in-between the HIVE and the Cray T3E," Proceedings of the Seventh Symposium on the Frontiers of Massively Parallel Computation (Frontiers ‘99), Annapolis, MD, pp. 146-153, Feb. 21-25, 1999.
G. W. Stewart, Introduction to Matrix Computations, p. 164, Academic Press: New York, NY, 1973.
S. L. Horowitz and T. Pavlidis, "Picture segmentation by a directed split-and-merge procedure," Proc. Second Int. Joint Conf. Pattern Recognition, pp. 424-433, 1974.
S. W. Zucker, "Region growing: childhood and adolescence," Computer Graphics and Image Processing, Vol. 5, pp. 382-399, 1976.
B. J. Schachter, L. S. Davis and A. Rosenfeld, "Some experiments in image segmentation by clustering of local feature vectors," Pattern Recognition, Vol. 11, No. 1, pp. 19-28, 1979.
R. L. Kettig and D. A. Landgrebe, "Computer classification of remotely sensed multispectral iamge data by extraction and classification of homogeneous objects," LARS Information Note 050975, Laboratory for Applications of Remote Sensing, Purdue University, West Lafayette, IN, 1975.
J. C. Tilton and S. C. Cox, "Segmentation of remotely sensed data using parallel region growing," Digest of the 1983 International Geoscience and Remote Sensing Sympoisum, San Francisco, Ca, pp. 9.1-9.6, August 31 – September 2, 1983.
J.-M. Beaulieu and M. Goldberg, "Hierarchy in picture segmentation: A stepwise optimization approach," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 11, No. 2, pp. 150-163, Feb. 1989.
J. L. Muerle and D. C. Allen, "Experimental evaluation of techniques for automatic segmentation of objects in a complex scene," in Pictorial Pattern Recognition, G. C. Cheng, R. S. Ledley, D. K. Pollack and A. Rosenfeld, editors, pp. 3-13, Thompson: Washington, DC, 1968.
R. L. Kettig and D. A. Landgrebe, "Automatic boundary and sample classification of remotely sensed multispectral data," LARS Information Note 041773, The Laboratory for Applications of Remote Sensing, Purdue Univ., W. Lafayette, IN, 1973.
T. Pavlidis, "Segmentation of pictures and maps through functional approximation," Computer Graphics Image Processing 1, pp. 360-372, 1972.
C. Brice and C. Fennema, "Scene analysis using regions," Artificial Intelligence 1, pp. 205-226, 1970.
S. Siegel, Nonparametric statistics for the behavioral sciences, pp. 127-136, McGraw-Hill Book Company, Inc., New York, 1956.
G. W. Snedecor and W. G. Cochran, Statistical Methods, The Iowa State University Press, Ames, IA, 1967.
L. Sachs, Applied Statistics, Springer-Verlag, New York,NY, 1982.
J. C. Tilton, "Experiences using TAE-Plus command language for am image segmentation program interface," Proceedings of the TAE Ninth Users' Conference, New Carrollton, MD, pp. 297-312, 1991.
K. Haris, S. N. Efstratiadis, N. Maglaveras and A. K. Katsaggelos, "Hybrid image segmentation using watersheds and fast region merging," IEEE Transactions on Image Processing, Vol. 7, No. 12, pp. 1684-1699, December 1998.
B. R. Frieden, "Image enhancement and restoration," in Picture Processing and Digital Filtering, ed. T. S. Huang, Springer-Verlag, New York, 1975.
A. Baraldi and F. Parmiggiani, "A neural network for unsupervised categorization of multivalued input patterns: an application to satellite image clustering," IEEE Transactions on Geoscience and Remote Sensing, vol. 33, no. 2, pp. 305-316, 1995.
J. C. Tilton, "Image segmentation by region growing and spectral clustering with a natural convergence criterion," Proc. of the 1998 International Geoscience and Remote Sensing Symposium, Seattle, WA, July 6-10, 1998.

SECTION II. TECHNICALLY COMPLETE AND EASILY UNDERSTANDABLE DESCRIPTION OF INNOVATION DEVELOPED TO SOLVE THE PROBLEM OR MEET THE OBJECTIVE

A. - Purpose and description of innovation:

B. - Identification of component parts or steps, and explanation of mode of operation of innovation/software preferably referring to drawings, sketches, photographs, graphs, flow charts, and/or parts or ingredient lists illustrating the components.

C. - Functional operation:

E. - Supportive theory:

REFERENCES

Contact Us