[IEEE 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA) -...


2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA), Istanbul, Turkey, 2012.10.15-2012.10.18

A New Method for Finding Clusters Embedded in Subspaces Applied to Medical Tomography Scan Image

Amel Boulemnadjel¹, Fella Hachouf¹

¹Laboratoire d’automatique et de robotique, Département d’électronique, Université Mentouri de Constantine, Route Ain El-Bey, 25000 Constantine, Algerie

e-mail: [email protected], [email protected]

Abstract: In this paper a new subspace clustering algorithm is proposed. The method has two levels. The first is an iterative algorithm based on the minimization of an objective function. Cluster density is introduced into this objective function because distances between points become relatively uniform in high-dimensional spaces; in such cases, the cluster density may give better results. The second level finds the clusters in each subspace individually. We applied the proposed method to a medical tomography scan image acquired without intravenous (IV) contrast dye and compared the results with the same image acquired with IV contrast. In some cases this injection carries risks; the mortality risk is low but not null. The proposed method can reduce the use of this injection. Experimental results on synthetic and real datasets show that the proposed method gives good results on medical tomography images.

Keywords: subspace clustering, cluster, medical tomography image, IV contrast

I. INTRODUCTION

In computed tomography (CT) scans, doctors always need tools that help differentiate normal structures from abnormal ones such as tumors, aneurysms and infections. One of these tools is intravenous (IV) contrast dye. The contrast highlights different tissue types, bringing out vessels, tumors, inflammation, cysts, etc.

The risk of serious complications exists regardless of the type of contrast used. Patients should inform the radiologist or technologist if they have a history of allergies (especially to medications, previous iodine injections, or shellfish), diabetes, asthma, a heart condition, kidney problems, or thyroid conditions. These conditions may indicate a higher risk of iodine reactions or problems with eliminating the iodine after the exam.

The most common side effects of iodine are a warm or hot "flushed" sensation during the injection and a "metallic" taste in the mouth, which usually last less than a minute or so. They can vary depending on the type of iodine used, the rate at which it is administered and individual patient sensitivity. No treatment is necessary for this sensation. Another mild reaction that can follow the administration of iodine is itching over various parts of the body with hives (bumps on the skin). This reaction can last from several minutes to several hours after the injection and is usually treated with medication administered by the radiologist, nurse, technologist or other physician. More serious reactions, although much less likely, may include breathing difficulty, swelling of the throat, or swelling of other parts of the body. These reactions can become more serious if not treated immediately [1, 2]. Several studies discuss and compare the risks of improved contrast products [3, 4], but safer products are always more expensive.

To overcome the disadvantages of CT contrast, we propose a clustering method that identifies all possible clusters in an image and apply it to CT images to visualize internal organs. Clustering has been extensively used as a primary data mining tool for discovering groups and identifying interesting distributions and patterns in the underlying data. The clustering problem consists of partitioning a given data set into groups (clusters) such that the data points in a cluster are more similar to each other than to points in different clusters. In high-dimensional data sets, however, it is difficult to differentiate similar data points from dissimilar ones. The clusters are embedded in subspaces of the high-dimensional data space, and different clusters may exist in different subspaces of different dimensions. The difficulty that conventional clustering algorithms encounter with high-dimensional data sets motivates the concept of subspace clustering [5]. Subspace clustering [6] is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. A number of approaches to subspace clustering are presented in [7].

The two main categories of subspace clustering algorithms are hard subspace clustering and soft subspace clustering. Hard subspace clustering methods [8] have been extensively studied for the clustering of high-dimensional data; a review of these methods can be found in [9]. In soft subspace clustering, a weight is assigned to each dimension to measure its contribution to the formation of a particular cluster. Soft subspace clustering can be divided into two main categories: fuzzy weighting subspace clustering and entropy weighting subspace clustering [10]. Many subspace clustering algorithms have been developed and applied to different areas [11-13], but their performance can be further enhanced.

In this paper, we propose a clustering method based on the optimization of an objective function. It is an iterative algorithm that minimizes this objective function, which is obtained by extending the objective function of the enhanced soft subspace clustering (ESSC) algorithm [10]. Most clustering techniques use the distance or similarity between objects as the measure to build clusters. A major weakness of subspace clustering algorithms is that almost all of them rely on within-cluster information only, or on both within-cluster and between-cluster information, so the cluster density is lost. In high-dimensional spaces, however, distances between points become relatively uniform, and in such cases the cluster density may give better results [14, 15]. The proposed objective function contains two terms: the weighted within-cluster compactness and the weighted between-cluster separation. The cluster density is introduced in the compactness term. This is achieved by a suitable preprocessing step that extracts the data features to classify. In this paper we use the first six parameters of the co-occurrence matrix, namely contrast, homogeneity, correlation, energy, angular second moment and entropy, together with edge detection and the local standard deviation of the image. After the iterative step, a new clustering step is proposed: each subspace finds its clusters individually, using the cluster center obtained for each feature by the iterative algorithm.

Image Processing Theory, Tools and Applications

978-1-4673-2584-4/12/$31.00 ©2012 IEEE
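The co-occurrence features named above can be illustrated with a minimal sketch. The paper does not give its exact feature definitions, so the conventions below are assumptions: a single pixel offset `(dx, dy)`, homogeneity weighted by `1 / (1 + (i - j)^2)`, energy taken as the square root of the angular second moment, and correlation set to 0 when a marginal has zero variance. The function names `glcm` and `glcm_features` are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy).

    image is a list of rows of integer gray levels in [0, levels).
    Assumes the image has at least one valid pixel pair for the offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[count / total for count in row] for row in counts]

def glcm_features(p):
    """Six texture descriptors of a normalized co-occurrence matrix p."""
    idx = [(i, j) for i in range(len(p)) for j in range(len(p))]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in idx)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in idx)
    asm = sum(p[i][j] ** 2 for i, j in idx)  # angular second moment
    entropy = -sum(p[i][j] * math.log(p[i][j]) for i, j in idx if p[i][j] > 0)
    mu_i = sum(i * p[i][j] for i, j in idx)
    mu_j = sum(j * p[i][j] for i, j in idx)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p[i][j] for i, j in idx))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p[i][j] for i, j in idx))
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i, j in idx)
    # Assumed convention: correlation is 0 when a marginal is constant.
    correlation = cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0
    return {"contrast": contrast, "homogeneity": homogeneity,
            "correlation": correlation, "energy": math.sqrt(asm),
            "asm": asm, "entropy": entropy}

# Tiny 3-level image, horizontal neighbor offset.
features = glcm_features(glcm([[0, 0, 1], [0, 1, 1], [2, 2, 2]], levels=3))
```

In practice such descriptors would be computed over a sliding window around each pixel so that every pixel receives its own feature vector; the window size is not stated in the paper.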

The paper is organized as follows. Section I introduces the subject of the paper. Section II describes the new subspace clustering method. Experimental results are presented and discussed in Section III. The last section concludes the paper.

II. SUBSPACE CLUSTERING ALGORITHM

A. Objective function

We treat the clustering process as an optimization problem whose goal is to minimize an objective function. The objective function developed for the proposed algorithm, which extends that of [10], is given in eq. 1.

$$J_{fm}(v,u) = \sum_{i=1}^{c}\sum_{j=1}^{N} u_{ij}^{m} \sum_{k=1}^{D} w_{ik}\,\frac{(x_{jk}-v_{ik})^{2}}{n(i)+1} \;-\; \eta \sum_{i=1}^{c}\sum_{j=1}^{N} u_{ij}^{m} \sum_{k=1}^{D} w_{ik}\,(v_{ik}-v_{0k})^{2} \qquad (1)$$

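where c is the number of clusters, N the number of data points, D the number of features, v the cluster center matrix, w the weight matrix, u the fuzzy partition matrix, m the fuzzy weighting exponent, n(i) the density of cluster i, and η a parameter of the algorithm.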
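As an illustration, the objective can be evaluated directly from its definition. The sketch below assumes the form in which the compactness term is divided by n(i) + 1 and the separation term is scaled by η, and it assumes that v0 is a single reference center of length D, as in ESSC [10]; the function name `objective` and the argument layout (u[i][j] for cluster i and point j) are hypothetical choices for this example.

```python
def objective(x, v, v0, u, w, n, eta, m=2.0):
    """Density-weighted compactness minus eta times separation.

    x: N x D data, v: c x D centers, v0: length-D reference center (assumed),
    u: c x N memberships, w: c x D weights, n: length-c cluster densities.
    """
    total = 0.0
    for i in range(len(v)):
        # Separation of center i from the reference center, weighted per feature.
        separation = sum(w[i][k] * (v[i][k] - v0[k]) ** 2 for k in range(len(v0)))
        for j in range(len(x)):
            # Compactness of point j around center i, scaled by the density term.
            compactness = sum(w[i][k] * (x[j][k] - v[i][k]) ** 2
                              for k in range(len(v0))) / (n[i] + 1)
            total += u[i][j] ** m * (compactness - eta * separation)
    return total

# Two 1-D points sitting exactly on their centers: only the separation term remains.
value = objective(x=[[0.0], [2.0]], v=[[0.0], [2.0]], v0=[1.0],
                  u=[[1.0, 0.0], [0.0, 1.0]], w=[[1.0], [1.0]],
                  n=[0, 0], eta=0.5)
```

Because the separation term enters with a negative sign, pushing the centers away from v0 lowers the objective, which is what makes the minimization favor well-separated clusters.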

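Using Lagrange multipliers, we deduce the u and v that minimize the objective function of eq. 1. They are given by equations 2 and 3, respectively.

$$u_{ij} = \frac{\left[\sum_{k=1}^{D} w_{ik}\left(\frac{(x_{jk}-v_{ik})^{2}}{n(i)+1} - \eta\,(v_{ik}-v_{0k})^{2}\right)\right]^{-1/(m-1)}}{\sum_{l=1}^{c}\left[\sum_{k=1}^{D} w_{lk}\left(\frac{(x_{jk}-v_{lk})^{2}}{n(l)+1} - \eta\,(v_{lk}-v_{0k})^{2}\right)\right]^{-1/(m-1)}} \qquad (2)$$

with the constraint

$$\sum_{i=1}^{c} u_{ij} - 1 = 0$$

$$v_{ik} = \frac{\sum_{j=1}^{N} u_{ij}^{m}\left(\frac{x_{jk}}{n(i)+1} - \eta\,v_{0k}\right)}{\sum_{j=1}^{N} u_{ij}^{m}\left(\frac{1}{n(i)+1} - \eta\right)} \qquad (3)$$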
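The two update rules can be sketched as follows. This is a minimal illustration, assuming that the membership u_ij is proportional to the density-weighted distance raised to the power -1/(m-1), and that the center v_ik is a ratio of sums weighted by u_ij^m, with v0 taken as one reference center of length D. The names `weighted_distance`, `update_u` and `update_v` are hypothetical, and the clamp `tiny` is an added safeguard that is not part of the paper: the weighted distance can become non-positive when η is large.

```python
def weighted_distance(x_j, v_i, v0, w_i, n_i, eta):
    """Sum over features of w * ((x - v)^2 / (n + 1) - eta * (v - v0)^2)."""
    return sum(w * ((a - b) ** 2 / (n_i + 1) - eta * (b - c0) ** 2)
               for w, a, b, c0 in zip(w_i, x_j, v_i, v0))

def update_u(x, v, v0, w, n, eta, m=2.0, tiny=1e-12):
    """Fuzzy memberships u[i][j]; each column j sums to 1."""
    c, num_points = len(v), len(x)
    u = [[0.0] * num_points for _ in range(c)]
    for j in range(num_points):
        # Clamp to a small positive value so the negative power is defined.
        d = [max(weighted_distance(x[j], v[i], v0, w[i], n[i], eta), tiny)
             for i in range(c)]
        inv = [di ** (-1.0 / (m - 1.0)) for di in d]
        norm = sum(inv)
        for i in range(c):
            u[i][j] = inv[i] / norm
    return u

def update_v(x, u, v0, n, eta, m=2.0):
    """Cluster centers v[i][k]; undefined if 1 / (n[i] + 1) equals eta."""
    centers = []
    for i in range(len(u)):
        um = [u_ij ** m for u_ij in u[i]]
        scale = 1.0 / (n[i] + 1)
        denom = sum(um) * (scale - eta)
        centers.append([
            sum(um[j] * (scale * x[j][k] - eta * v0[k]) for j in range(len(x))) / denom
            for k in range(len(v0))
        ])
    return centers

x = [[0.0], [1.0], [9.0], [10.0]]
u = update_u(x, v=[[0.5], [9.5]], v0=[5.0], w=[[1.0], [1.0]], n=[0, 0], eta=0.0)
v = update_v(x, [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]], v0=[5.0], n=[0, 0], eta=0.1)
```

With η = 0 the center update reduces to the usual membership-weighted mean; with η > 0 the centers move away from v0, as the example with η = 0.1 shows.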

Once the objective function has been minimized in this processing step, a new clustering step is applied.

B. New clustering step

In this step, the cluster centers obtained in the previous step for each subspace are used. The clustering of each subspace is performed individually using its own cluster centers. All clusters obtained for the different subspaces are superposed for the final classification. The proposed method can be described by the following algorithm:

Algorithm

Step 1. Initialization:
Input:
- Number of clusters C and parameters ε, η.
- Randomly initialize the cluster centers v0 and set the initial weight matrix w0.
- Extract the features of the image: matrix G of size (N x D).

Step 2. Processing step:
While ||u(t+1) - u(t)|| > ε do
    1. Compute the partition matrix u using (2).
    2. Compute the cluster center matrix v using (3).
    3. Update the density n(i), the cluster center v0 and the weight matrix w using equation (4):
       w(c,k) = Entp(j,k) / ∑ Entp(k), k = 1…D (4)
       /* Entp denotes the entropy */
End while

Step 3. New clustering step:
For j = 1 : D do
    For i = 1 : N do
        Assign each pixel i in the subspace j to its nearest cluster center v.
    End for
End for
End.
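The new clustering step at the end of the algorithm can be sketched as a nearest-center assignment carried out one feature at a time. This is a minimal illustration under the assumption that each subspace is a single feature column and that "nearest" means the smallest absolute difference in that column; the function name `cluster_per_subspace` is hypothetical.

```python
def cluster_per_subspace(features, centers):
    """Assign every pixel to its nearest center, separately in each feature.

    features: N x D matrix (one row per pixel), centers: c x D matrix.
    Returns D label lists, each of length N.
    """
    num_pixels, num_features = len(features), len(features[0])
    labels = []
    for j in range(num_features):          # one subspace per feature
        column_labels = []
        for i in range(num_pixels):        # every pixel in that subspace
            distances = [abs(features[i][j] - center[j]) for center in centers]
            column_labels.append(distances.index(min(distances)))
        labels.append(column_labels)
    return labels

labels = cluster_per_subspace(
    features=[[0.1, 5.0], [0.9, 5.1], [0.2, 9.0]],
    centers=[[0.0, 5.0], [1.0, 9.0]],
)
```

In this example the second pixel is separated from the others in the first feature, while the third pixel is separated in the second feature, which mirrors the observation that a cluster can be visible in one subspace and hidden in another.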

III. RESULTS AND DISCUSSIONS

In this section we evaluate the proposed method using synthetic and medical images. In each subspace, some clusters are detected while others are hidden and appear only in other subspaces. Among the D features used in the iterative step, we consider only the subspaces in which different clusters are detected.

Fig. 1: a: original image; b, c, d: subspace clustering; e: final clustering.

η


Fig. 2: a: tomography image without IV contrast; b: tomography image with IV contrast (1: liver, 2: kidney, 3: spleen, 4: vertebra, 5: rib, 6: aorta).

In general, the number of subspaces can be lower than or equal to the number of clusters contained in each subspace. In this paper, 8 features have been used. For example, cluster number 1 was detected in subspace 8 (Fig.1.b) and cluster number 2 in subspace 3 (Fig.1.c). In subspace 5, the edges of the original image are detected (Fig.1.d). The final clustering is given by the superposition of the clusters of all subspaces in one space (Fig.1.e).
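The paper does not specify how the per-subspace clusters are superposed. One possible reading, shown here purely as an assumption, is that two pixels end up in the same final cluster only if they carry the same label in every retained subspace. The function name `superpose` is hypothetical.

```python
def superpose(label_maps):
    """Combine per-subspace labelings into one final labeling.

    label_maps: list of label lists, one per retained subspace, all of the
    same length. Pixels share a final label only when their labels agree in
    every subspace (assumed rule, not stated in the paper).
    """
    final_ids = {}
    final = []
    for combo in zip(*label_maps):
        if combo not in final_ids:
            final_ids[combo] = len(final_ids)  # next unused final label
        final.append(final_ids[combo])
    return final

final_labels = superpose([[0, 1, 0], [0, 0, 1]])
```

Under this rule, a structure that is isolated in any single subspace stays isolated in the final result, so each subspace can only add detail to the final clustering.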

We applied this method to an abdominal CT image acquired without intravenous contrast and compared the results with the same image acquired with IV contrast. The images used are shown in Fig.2: for the same CT slice, the first image was obtained without intravenous contrast and the second with IV contrast. The contrast used in Fig.2.b brightens internal organs, arteries, veins and tissues as it courses through them. Notice that some organs, such as the kidney (2), are clearer in Fig.2.b than in Fig.2.a. The images obtained with the proposed method are given in Fig.3 and Fig.4. Different clusters are detected in each subspace, and each subspace provides different information for the diagnosis. The superposition of the clusters obtained in the different subspaces gives the final clustering. Fig.3.b shows the kidney and liver detected in subspace number 3; the other organs, such as the vertebra, are hidden there but are detected in other subspaces. The superposition of the two clustering results (Fig.3.b,c) is given in Fig.3.e, where all organs are detected and their contours are well delimited. The clustering results show that the organs are well detected and separated from one another. In the original image (Fig.3.a) the contours of the kidneys are not clear, whereas in Fig.3.e the kidneys take the same shape as in Fig.3.d, which was obtained with the contrast product. Fig.4 shows another example of the method.

Fig. 3: a: tomography image without IV contrast; b, c: subspace clustering results; d: tomography image with IV contrast; e: final clustering.


Fig. 4: a: tomography image without IV contrast; b, c, d: subspace clustering results; e: tomography image with IV contrast; f: final clustering.

Note that a cluster may be detected in one subspace but hidden in another. Subspace number 4 detects the edges of the original image. The superposition of all these clusters is shown in Fig.4.f, where all organs that take up the contrast in Fig.4.e appear completely.

IV. CONCLUSION

We have presented a new subspace clustering method based on an iterative algorithm followed by a new clustering step. Each subspace finds different clusters individually, which allows all clusters in the full space to be separated correctly. It is not necessary to use all subspaces, which would increase the runtime. The method can find clusters embedded in subspaces of the original data. Subspace clustering algorithms are efficient at dealing with high-dimensional data sets. The new method can reduce the use of IV contrast.

REFERENCES

[1] J. Liotier et al. Injections de produits iodés : les complications, les préventions possibles. Médecine thérapeutique, 14(2), 67-76, mars-avril 2008, Revue.
[2] C. Lejus, Y. Blanloeil, B. Dupas. Complications des produits de contraste. Conférences d'actualisation 1998, 277-297. © 1998 Elsevier, Paris, et SFAR.
[3] C. Heinrich et al. Nephrotoxicity of Iso-osmolar Iodixanol Compared with Nonionic Low-osmolar Contrast Media: Meta-analysis of Randomized Controlled Trials. Radiology, 250(1), 68-86, January 2009.
[4] C. Vernassiere et al. Low negative predictive value of skin tests in investigating delayed reactions to radio-contrast media. 50(6), 359-366, June 2004.
[5] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. In SIGMOD Record, ACM Special Interest Group on Management of Data, 94-105, 1998.
[6] L. Parsons, E. Haque, H. Liu. Subspace clustering for high dimensional data: A review. SIAM Int. Conf. on Data Mining, 6(1), 90-105, 2004.
[7] René Vidal. Subspace clustering: Applications in motion segmentation and face clustering. IEEE Signal Processing Magazine, 52-68, 2011.
[8] R. Agrawal et al. Automatic subspace clustering of high dimensional data for data mining applications. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, ACM Press, 94-105, 1998.
[9] L. Parsons, E. Haque, H. Liu. Evaluating subspace clustering algorithms. In Workshop on Clustering High Dimensional Data and its Applications, SIAM Int. Conf. on Data Mining, 48-56, 2004.
[10] Z. Deng et al. Enhanced soft subspace clustering integrating within-cluster and between-cluster information. Pattern Recognition, (43), 767-781, 2010.
[11] R. Vidal et al. An algebraic geometric approach to the identification of a class of linear hybrid systems. In Proc. Conf. Decision and Control, 167-172, 2003.
[12] A. Yang et al. Unsupervised segmentation of natural images via lossy data compression. Comput. Vis. Image Understand., 110(2): 212-225, 2008.
[13] R. Vidal, R. Tron, and R. Hartley. Multiframe motion segmentation with missing data using power factorization and GPCA. Int. J. Comput. Vis., 79(1): 85-105, 2008.
[14] J. Sunita, K. Parag. Intelligent Subspace Clustering, A Density based Clustering approach for High Dimensional Dataset. World Academy of Science, Engineering and Technology, 55: 69-73, 2009.
[15] R.W. Sembiring, J.M. Zain. Cluster Evaluation of Density Based Subspace Clustering. Journal of Computing, 2(11): 2151-9617, Nov. 2010.