Proximity based dynamic hierarchical clustering of high dimensional data for efficient searching

Date

2014-08

Abstract

The rapid advancement of application domains such as computational biology, e-commerce, bioinformatics, genetic engineering, and big data emphasizes the need for analyzing high dimensional data. A high dimensional data set is one whose entities have numerous characteristics, each treated as a dimension. For example, a point in a JPEG image has dimensions such as color, brightness, hue, and sharpness. To analyze such large data sets, we need to arrange the data so that searching is efficient in terms of both time and hardware. There is a growing body of research on optimizing search over large data sets; however, efficiency and support for dynamic data have not been up to par.

We propose arranging these data in a hierarchical manner. We use the proximity of one data item to another to arrange the data in multiple levels, where each level consists of numerous clusters. We also consider overlap between clusters and try to obtain the best result with minimal time and resources. We further consider the dynamic nature of the data set, which is another contribution to this field.
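
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the method of this work. It assumes a simple threshold ("leader"-style) clustering: each level has a distance radius, a point joins every cluster whose center lies within that radius (overlap), and a new cluster is opened when none is close enough (dynamic insertion). The class name `Node`, the `radii` parameter, and the Euclidean distance are all illustrative assumptions.

```python
import math


class Node:
    """One cluster in a hypothetical proximity-based hierarchy (illustrative only)."""

    def __init__(self, center, radii):
        self.center = center    # representative point of this cluster (None for the root)
        self.radii = radii      # assumed distance thresholds for the levels below this node
        self.children = []      # child clusters (stay empty at the leaf level)
        self.points = []        # data points, stored only at the leaf level

    def insert(self, point):
        """Add a point dynamically, without rebuilding the hierarchy."""
        if not self.radii:      # leaf level: store the point itself
            self.points.append(point)
            return
        radius = self.radii[0]
        # Overlap: the point joins every child cluster whose center is close enough.
        near = [c for c in self.children
                if math.dist(c.center, point) <= radius]
        if not near:
            # No cluster is close enough: open a new one centered at this point.
            near = [Node(point, self.radii[1:])]
            self.children.extend(near)
        for child in near:
            child.insert(point)

    def search(self, query):
        """Approximate nearest neighbor: follow the closest center at each level."""
        if not self.radii:
            return min(self.points,
                       key=lambda p: math.dist(p, query), default=None)
        if not self.children:
            return None
        best = min(self.children,
                   key=lambda c: math.dist(c.center, query))
        return best.search(query)
```

A usage example under the same assumptions: `root = Node(None, [5.0, 1.0])` builds a two-level hierarchy with coarse clusters of radius 5.0 and fine clusters of radius 1.0; points are added with `root.insert((x, y))` and looked up with `root.search((x, y))`. The search inspects only one cluster per level instead of the whole data set, which is where the time saving comes from, at the cost of returning an approximate rather than exact nearest neighbor.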

Keywords

High dimensional data, Hierarchical clustering, Dynamic data