University of Houston
Department of Computer Science

In partial fulfillment of the requirements for the Degree of
Master of Science

Jing Wang
will present her thesis

Region Discovery Using Hierarchical Supervised Clustering

Abstract
 

The discovery of interesting regions in spatial datasets is an important data mining task. In particular, we are interested in identifying disjoint, contiguous regions that are unusual with respect to the distribution of a given class, e.g., a region that contains an unusually low or high number of instances of a particular class. This thesis centers on techniques, methodologies, and algorithms to discover such regions. Measures of interestingness and a supervised clustering framework are introduced for this purpose. Two hierarchical supervised clustering algorithms are designed, implemented, and evaluated in this thesis: an agglomerative hierarchical supervised clustering algorithm named SCAH, and an algorithm named SCMRG, which searches a multi-resolution grid structure top-down for interesting regions. Finally, experimental results are discussed for applying the proposed techniques to the problem of identifying hotspots in spatial datasets, using a benchmark consisting of artificial, geological, and US census datasets. SCAH does well at finding small, pure regions but does not do well at discovering larger regions; the algorithm is also quite slow and unsuitable for large spatial datasets. SCMRG, on the other hand, can be applied to very large spatial datasets and obtains reasonable results in discovering regional patterns for some datasets, but it does poorly at identifying local patterns and when patterns do not coincide with the rectangular grid structure that the algorithm employs.

Date: April 27th, 2006
Time: 10:00am
Place: 550-PGH

Faculty, students, and the general public are invited.
Advisor: Dr. Christoph F. Eick