
Cellular Oncology 26 (2004) 45–55, IOS Press

Classification of incidental carcinoma of the prostate using learning vector quantization and support vector machines

Torsten Mattfeldt a,∗, Danilo Trijic a, Hans-Werner Gottfried b and Hans A. Kestler c,d

a Department of Pathology, University of Ulm, Ulm, Germany
b Department of Urology, University of Ulm, Ulm, Germany
c Department of Neuroinformatics, University of Ulm, Ulm, Germany
d Department of Internal Medicine I, University of Ulm, Ulm, Germany

* Corresponding author: Prof. Dr. T. Mattfeldt, Department of Pathology, Oberer Eselsberg M23, D-89081 Ulm, Germany. Fax: +49 731 58738; E-mail: [email protected].

Received 7 July 2003

Accepted 29 November 2003

Abstract. The subclassification of incidental prostatic carcinoma into the categories T1a and T1b is of major prognostic and therapeutic relevance. In this paper an attempt was made to find out which properties mainly predispose to these two tumour categories, and whether it is possible to predict the category from a battery of clinical and histopathological variables using newer methods of multivariate data analysis. The incidental prostatic carcinomas of the decade 1990–99 diagnosed at our department were reexamined. Besides acquisition of routine clinical and pathological data, the tumours were scored by immunohistochemistry for proliferative activity and p53-overexpression. Tumour vascularization (angiogenesis) and epithelial texture were investigated by quantitative stereology. Learning vector quantization (LVQ) and support vector machines (SVM) were used for the purpose of prediction of tumour category from a set of 10 input variables (age, Gleason score, preoperative PSA value, immunohistochemical scores for proliferation and p53-overexpression, 3 stereological parameters of angiogenesis, 2 stereological parameters of epithelial texture). In a stepwise logistic regression analysis with the tumour categories T1a and T1b as dependent variables, only the Gleason score and the volume fraction of epithelial cells proved to be significant as independent predictor variables of the tumour category. Using LVQ and SVM with the information from all 10 input variables, more than 80% of the cases could be correctly predicted as T1a or T1b category, with specificity, sensitivity, negative and positive predictive values ranging from 74% to 92%. Using only the two significant input variables, Gleason score and epithelial volume fraction, the accuracy of prediction was not worse. Thus, descriptive and quantitative texture parameters of tumour cells are of major importance for the extent of propagation in the prostate gland in incidental prostatic adenocarcinomas. Classical statistical tools and neuronal approaches led to consistent conclusions.

Keywords: Artificial neural networks, bioinformatics, classification, immunohistochemistry, incidental carcinoma, learning vector quantization, logistic regression, pathology, pattern recognition, prostatic cancer, stereology, support vector machine

1. Introduction

Incidental prostatic cancer is prostatic adenocarcinoma discovered by chance. In contrast to the categories pT2–pT4, it is not a pathologically defined tumour stage but includes a heterogeneous population of cancers with different extents of invasion, and is characterized only by the mode of its clinical presentation. In most cases such tumours are found at the occasion of transurethral resections (TUR) or surgical adenomectomies of the prostate after a diagnosis of benign prostatic hyperplasia (BPH) [2]. Moreover, incidental carcinomas are found in radical cystoprostatectomy specimens removed because of bladder cancer [29]. In the course of time, many clinical data have been accumulated for the T1 tumour [2,5]. However, the number of studies in which modern methods of structural cell biology, quantitative stereology and genetics have been applied to this collective is scarce. These techniques have also rarely been applied in a multivariate approach in order to assess the relative importance of the factors. Recently, the first findings on T1 prostate cancer using comparative genomic hybridization (CGH) have been reported [52].

We attempted to study all cases of incidental prostatic cancer that were diagnosed in our department in one decade (1990–1999) as a retrospective clinical and histopathological investigation. Additionally, immunohistochemical and stereological studies on the tumour texture, proliferation and angiogenesis were performed, and the data were studied statistically by a stepwise logistic regression analysis. Specifically, it was attempted to find out which variables mainly predispose to the subcategories T1a and T1b, as T1b is known to be associated with a worse prognosis and may lead to more aggressive therapies such as radical prostatectomy or radiation, especially in younger patients [10,40]. We follow the definitions of T1a and T1b indicating ≤5% and >5% area fraction of the resected prostate occupied by prostate cancer tissue, according to the current tumour classification of the UICC [36]. For the purpose of classification (pattern recognition) we used learning vector quantization (LVQ), which has been applied to the classification of prostatic cancer before [23–25]. Additionally, support vector machines (SVM) were used for classification [3,4,17,45].

2. Materials and methods

2.1. Pathology

We proceeded from the total number of prostate specimens of the Department of Urology of the University of Ulm which had been examined histopathologically at the Department of Pathology of the University of Ulm in the years 1990–1999. This material included prostate tissue that had been removed by TUR because of BPH, and prostate specimens surgically operated because of BPH (adenomectomies). Cystoprostatectomies with incidental prostatic carcinomas were not included in the study. On the whole, we found 59 incidental carcinomas in the TUR specimens and 7 incidental carcinomas in the adenomectomies when strict criteria of inclusion were applied. In particular, the original documents of each case were studied again, and a search was performed for previous histological findings of that patient. A case was only included when there was no prior diagnosis of PCA by biopsy or antecedent TUR, and when in the written clinical diagnosis only benign changes of the prostate gland (adenoma, BPH) were mentioned. For all cases with incidental carcinomas, the preoperative PSA value in the serum, estimated with the Hybritech kit (Hybritech Inc., Beckman Coulter, Fullerton, CA; see [23]), was evaluated.

2.2. Histology and immunohistochemistry

The Gleason score was reevaluated microscopically for all cases by the first author, without knowledge of the other variables. Additionally, immunohistochemical stains were performed from the technically most suitable paraffin block of each case. We used a monoclonal antibody against the human Ki-67 antigen (Mib-1) and a monoclonal antibody against human p53 protein. For the study of vascularization, an antibody against factor VIII-related antigen (FVIII-ra, von Willebrand factor) was used [43]. All antibodies were obtained from DAKO, Glostrup, Denmark. The reactions were judged semiquantitatively. In the evaluation of the Mib-1 reaction, a subjective estimate of the fraction of positively labelled nucleus profiles was obtained in intervals of 10% (0, >0–10%, >10–20%, . . . , >90–100%, i.e. 0, >0–0.1, >0.1–0.2, . . . , >0.9–1.0 expressed as fractions). The p53 reaction was judged in the same manner.

2.3. Image analysis and quantitative stereology

Stereology means that data obtained from sections are extrapolated to the true three-dimensional properties of a structure using unbiased mathematical methods [8]. Stereological principles were used to study the epithelial texture and the capillary vascularization. For both evaluations, paraffin sections with an immunohistochemical stain using an antibody against FVIII-ra were used. Such sections show simultaneously the epithelial cells, the stroma, the glandular lumina and the capillaries (Fig. 1). To characterize the epithelial tissue texture, systematic series of visual fields containing tumour tissue were evaluated with a random start point [21,22,24]. Using the well-known fundamental stereological equations:

VV = AA (1)

SV = (4/π)BA (2)

we estimated the volume fraction VV(epi) of epithelium from the area fraction of epithelial tissue per unit reference area, AA(epi), whereas the surface area of epithelium per unit tissue volume, SV(epi), was estimated from the mean boundary length of epithelial tissue per unit reference area, BA(epi) (see, e.g., [8]). Note that VV(epi) is different from the fraction of tumour tissue in the whole specimen that determines the categories T1a and T1b: VV(epi) refers to the volume fraction of tumour cells within the tumour, whereas the latter is the fraction of tumour tissue in the total sample. Image analysis was performed by means of PC programs after interactively segmenting the images with the Kontron system IBAS 2000, using a CCD camera connected to the microscope and converting the images to the TIF format [21,22,24]. The final magnification corresponded to a width of 0.4 mm of the quadratic visual field at the scale of the tissue (Fig. 1).

Fig. 1. Left panel: Quadratic visual field of a selected T1b prostatic adenocarcinoma. Immunohistochemical stain with an antibody against factor VIII-related antigen. The walls of capillary profiles are positively stained (brown). The tumour cells of the adenocarcinoma are shown with light blue cytoplasm, dark blue nuclei and optically empty gland lumina. Between the neoplastic epithelia one sees the faintly stained stroma. The edge length of the quadrat corresponds to 0.4 mm at the tissue level. Right panel: Digitized binary image of the colour image on the left side. The epithelia are coded as white, stroma and gland lumina are coded as black, and the capillaries are shown in gray. From such binary images, the epithelial texture parameters and the capillary vascularization parameters (see Tables) were estimated using Eqs (1)–(3).

To characterize tumour vascularization, the volume fraction and the surface area of vessels per unit tissue volume, VV(cap) and SV(cap), were estimated by using Eqs (1), (2) as above. A further important parameter in the context of vascularization is the length of the capillary network per unit tissue volume, LV(cap) [18,20]. It was estimated from the fundamental stereological equation:

LV = 2QA (3)

where QA denotes the mean number of vessel profiles per unit area, a planar parameter sometimes denoted as capillary density or microvessel density [8,18,20,50,51]. A case was completed with the evaluation of that field in which the 200th capillary profile was found.
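The three estimators of Eqs (1)–(3) can be computed directly from binary images such as the right panel of Fig. 1. The following Python sketch illustrates the principle for a single visual field; it is not the Kontron/IBAS pipeline used in the study, and the function name, the scikit-image perimeter estimator and the pixel-size argument are assumptions introduced only for illustration.

    import numpy as np
    from skimage import measure

    def stereology_estimates(binary_field, pixel_size_mm):
        """Estimate VV, SV and LV from one binary image (phase of interest = True)."""
        field_area_mm2 = binary_field.size * pixel_size_mm ** 2
        # Eq. (1): the volume fraction VV equals the area fraction AA of the phase.
        AA = binary_field.mean()
        VV = AA
        # Eq. (2): surface area per unit volume from the boundary length per unit area BA.
        BA = measure.perimeter(binary_field) * pixel_size_mm / field_area_mm2
        SV = (4.0 / np.pi) * BA
        # Eq. (3): length per unit volume from the number of profiles per unit area QA.
        QA = measure.label(binary_field).max() / field_area_mm2
        LV = 2.0 * QA
        return VV, SV, LV

In practice the counts and areas from all systematically sampled fields of a case would typically be pooled before the equations are applied; the epithelial phase yields VV(epi) and SV(epi), the capillary phase VV(cap), SV(cap) and LV(cap).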

2.4. Logistic regression analysis

Logistic regression is a mathematical modeling approach that can be used to describe the relationship of several independent variables to a dichotomous (binary) variable f(z). Such a variable assumes only two values; here it is the tumour category, i.e. T1a or T1b, which was coded as 0 and 1. We consider now the linear function z = f(X) of a k-dimensional input vector X = (x1, . . . , xk), where the xi are the k independent variables: z = α + β1x1 + β2x2 + · · · + βkxk. The function z = f(X) can be interpreted as an index of combined risk factors. The logistic model can be written as

f(z) = 1 / (1 + e^(−z)) = 1 / (1 + e^(−(α + β1x1 + · · · + βkxk)))   (4)

where the parameters α and βi are obtained (fitted) according to the principle of maximum likelihood from the data [11]. Essentially, this leads to a mapping of a linear combination of the independent variables onto the interval [0, 1]. In the present study we used the procedure of logistic regression as implemented in the statistical software package SAS/STAT (proc logistic) with the stepwise selection option [33]. In this case the independent variables are entered into and removed from the model in such a way that each forward selection step is followed by one or more backward elimination steps. The stepwise selection process terminates if no further variable can be added to the model. Simplifying, this means that significant prognostic (predictive) factors are identified: all other variables are eliminated and only significant predictors remain in the model.
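For readers who want to reproduce this kind of analysis without SAS, the following Python sketch performs a simple forward selection of predictors by Wald p-value using statsmodels. It is a simplified stand-in for proc logistic with the stepwise option (the backward elimination steps are omitted); the function name, the 5% entry level and the use of statsmodels are assumptions made for illustration.

    import numpy as np
    import statsmodels.api as sm

    def forward_select_logistic(X, y, names, alpha=0.05):
        """Greedy forward selection: add the predictor with the smallest Wald p-value
        as long as that p-value stays below the entry level alpha."""
        selected, remaining = [], list(range(X.shape[1]))
        while remaining:
            pvals = []
            for j in remaining:
                exog = sm.add_constant(X[:, selected + [j]])
                fit = sm.Logit(y, exog).fit(disp=0)
                pvals.append(fit.pvalues[-1])  # p-value of the candidate variable
            best = int(np.argmin(pvals))
            if pvals[best] >= alpha:
                break
            selected.append(remaining.pop(best))
        return [names[j] for j in selected]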

3. Methods of data classification

3.1. Learning vector quantization (LVQ)

LVQ is an artificial neural network with a supervised learning rule [12,14]. It can be seen as the supervised counterpart of self-organizing maps (SOM) (for references on SOMs see [9,13–15,26,27,38]), which function according to an unsupervised learning rule. While SOMs are fed only with input variables, e.g., gene expression data or CGH data, LVQ obtains not only the input vectors but also an output vector. If this is a single variable with two values only, e.g., 0 and 1 as here, we enter the domain of binary pattern recognition. When one has decided on a set of k variables to evaluate, the data of a single case can be represented as a vector or point in the corresponding k-dimensional input space. Using LVQ, further vectors are randomly placed into this space and are moved until the distance of these vectors to the input vectors is minimized [12,14]. In this manner, the algorithm attempts to spread the new vectors as uniformly as possible across the input vectors, very similarly to SOMs. The new vectors are called codebook vectors and are marked as 0 or 1 for the two classes. The input vectors are classified by the system according to the mark of their nearest neighbour among the codebook vectors. On the whole, we classified the data with LVQ networks with 1–20 codebook vectors, and for each number of codebook vectors 46 different combinations of LVQ parameters were applied. More details about LVQ can be found in [12,14] and in our previous publications [23–25]. LVQ can be downloaded with documentation as academic software from http://www.cis.hut.fi/research/som-research/nnrc-programs.shtml as a set of source files for Unix or DOS, and as binary (executable) files for Windows. For our study the source files were compiled and executed under Linux.
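The nearest-neighbour update performed by LVQ can be written down compactly. The sketch below is a minimal LVQ1 variant in Python/NumPy, not the LVQ_PAK package used in the study; the function names, the learning rate and the random initialization of the codebook from training samples are assumptions for illustration.

    import numpy as np

    def train_lvq1(X, y, n_codebook=7, lr=0.05, epochs=100, seed=0):
        """Minimal LVQ1: codebook vectors are initialized from training samples and moved
        towards samples of their own class and away from samples of the other class."""
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(X), size=n_codebook, replace=False)
        codebook, marks = X[idx].copy(), y[idx].copy()
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                j = np.argmin(np.linalg.norm(codebook - xi, axis=1))  # nearest codebook vector
                step = lr * (xi - codebook[j])
                codebook[j] += step if marks[j] == yi else -step
        return codebook, marks

    def predict_lvq(codebook, marks, X):
        """Classify each case by the mark (0 or 1) of its nearest codebook vector."""
        d = np.linalg.norm(np.asarray(X, dtype=float)[:, None, :] - codebook[None, :, :], axis=2)
        return marks[np.argmin(d, axis=1)]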

3.2. Support vector machines

Support vector machines (SVM) were developed by Vapnik from the end of the seventies onwards and are a further well-known supervised method for data classification [3,4,45]. In the simplest case, an SVM in binary pattern recognition mode has the task of classifying a separable dataset of vectors by a linear decision function (hyperplane) into two halfspaces. Each input vector, which is marked as −1 or +1 according to its class, is classified according to the halfspace in which its coordinates are located. The hyperplane is constructed by the SVM in such a manner that the width of the margin between the separating hyperplane and the nearest neighbours among the input vectors of both classes becomes a maximum. Only these input vectors have an influence on the final equation of the hyperplane, and the vectors of this subset of input vectors are denoted as "support vectors". For more complicated tasks, such as classifying overlapping vector sets from two classes, or in case of nonlinearity of the data, variations of the aforementioned fundamental algorithm are available. Instead of complete separation, an incomplete separation can be accepted, where overlapping is allowed but penalized. Thus, it is possible to select a kernel function K(x, y) and parameters which determine the penalty function for misclassification. We used 5 types of kernels: the function K(x, y) = x · y, i.e. the simple dot product of the vectors (= inner or scalar product of vectors); radial basis functions with K(x, y) = exp(−γ|x − y|^2); the regularized Fourier kernel K(x, y) = ∏_{k=1}^{n} K_k(x_k, y_k); and the polynomial kernels K(x, y) = (x · y + 1)^d (Vapnik polynomial, simple polynomial) and K(x, y) = [(x · y)/a + b]^d (full polynomial) [34]. This is only a very limited selection of the full range of kernel functions available for SVM, see [34]. As implementation we used the program package svm under Linux, which can be downloaded from http://svm.dcs.rhbnc.ac.uk/dist/index.shtml as a binary program or as source code [34].
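A present-day equivalent of this setup can be obtained with any SVM library that accepts custom kernels. The sketch below uses scikit-learn rather than the Royal Holloway svm package of [34]; the regularized Fourier kernel is not built in there, so only the dot-product, RBF and full polynomial kernels are shown, and the parameter values (gamma, a, b, d and the penalty constant C) are illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC

    # Linear (dot-product) kernel and radial basis function kernel are built in.
    linear_svm = SVC(kernel="linear", C=10.0)
    rbf_svm = SVC(kernel="rbf", gamma=1.0, C=10.0)   # K(x, y) = exp(-gamma * |x - y|^2)

    def full_polynomial_kernel(X, Y, a=2.0, b=1.0, d=3):
        """Full polynomial kernel K(x, y) = ((x . y)/a + b)^d as a Gram-matrix callable."""
        return (X @ Y.T / a + b) ** d

    poly_svm = SVC(kernel=full_polynomial_kernel, C=10.0)  # C penalizes margin violations

    # Usage: clf.fit(X_train, y_train); clf.predict(X_test); class labels coded as -1/+1.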


3.3. Evaluation with the leave-one-out method

In the data sets outlined below, output data were predicted from input data on an individual basis by the leave-one-out principle (synonyms: jackknife, round-robin) [41,45]. This means that the total group of n cases is partitioned into a subgroup of n − 1 cases (the training cases) and another subgroup which consists only of the single remaining case (the test case). In the training phase, the network "learns" to estimate the output variable from the input variables within the training group. In the test phase thereafter, the output variable of the test case is estimated from the input variables of the test case, making use of the information learnt previously from the training group (here: 65 cases). This strategy simulates a confrontation of the network with a new case, and in this manner one tests its ability to generalize. Thus the leave-one-out approach simulates the clinical situation where the diagnosis of a new patient is made on the basis of the doctor's previous experience with similar cases. The prediction is repeated cyclically for every patient as test case, with the complementary set of cases serving as its training group. For each cyclic evaluation, the following statistics were computed: accuracy – overall percentage of correctly classified cases; sensitivity – ratio of the number of cases correctly classified as positive to the total number of positive cases; specificity – ratio of the number of cases correctly classified as negative to the total number of negative cases; positive predictive value – ratio of the number of cases correctly classified as positive to the total number of cases classified as positive; negative predictive value – ratio of the number of cases correctly classified as negative to the total number of cases classified as negative.

It is well known that leave-one-out error estimation can be performed in two fundamentally different ways. The first approach is to decide on feature selection (i.e. the choice of input variables for classification) on the complete data set beforehand, and thereafter perform the leave-one-out cross-validation. The second approach is to perform feature selection, as well as optimization of the classifier parameters, anew within each cycle of the leave-one-out. The latter method results in a lower bias, but a higher variance of the error estimate [35]. Here the former of the two aforementioned approaches was selected, using a stepwise logistic regression procedure for feature selection; see also our previous papers [17,23–25].
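The cyclic evaluation and the derived statistics can be written in a few lines. The following Python sketch implements the first of the two approaches above (features fixed beforehand) with scikit-learn's LeaveOneOut splitter; the helper name and the classifier factory passed in are illustrative assumptions, and the categories are coded 0 (T1a) and 1 (T1b) as in the paper.

    import numpy as np
    from sklearn.model_selection import LeaveOneOut
    from sklearn.svm import SVC

    def leave_one_out_statistics(X, y, make_classifier):
        """Train on n-1 cases, predict the left-out case, cycle over all cases,
        then derive accuracy, sensitivity, specificity, PPV and NPV."""
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        y_pred = np.empty_like(y)
        for train_idx, test_idx in LeaveOneOut().split(X):
            clf = make_classifier()
            clf.fit(X[train_idx], y[train_idx])
            y_pred[test_idx] = clf.predict(X[test_idx])
        tp = np.sum((y_pred == 1) & (y == 1))
        tn = np.sum((y_pred == 0) & (y == 0))
        fp = np.sum((y_pred == 1) & (y == 0))
        fn = np.sum((y_pred == 0) & (y == 1))
        return {
            "accuracy": (tp + tn) / len(y),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "PPV": tp / (tp + fp),
            "NPV": tn / (tn + fn),
        }

    # e.g. stats = leave_one_out_statistics(X, y, lambda: SVC(kernel="rbf", C=10.0))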

4. Results

4.1. Group comparisons

In Table 1 we find the mean values and standard deviations of the variables in the T1a and T1b carcinoma groups.

Table 1

Group comparisons

Variable                 T1a tumours           T1b tumours           Level of
                         mean       SD         mean       SD         significance
Age (years)              70.23      8.8        72.25      8.6        N.S.
Mib-1 (%)                2.40       2.31       3.47       2.24       N.S.
p53 (%)                  0.39       0.93       2.91       6.54       N.S.
PSA (ng/ml)              8.82       10.68      24.60      38.83      p < 0.05
LV(cap) [mm/mm³]         147.63     46.96      182.88     69.71      N.S.
SV(cap) [mm²/mm³]        7.88       2.32       9.73       3.04       N.S.
VV(cap)                  0.02       0.02       0.03       0.01       N.S.
SV(epi) [mm²/mm³]        33.93      10.08      42.47      6.99       p < 0.01
VV(epi)                  0.28       0.11       0.35       0.06       p < 0.05
Gleason score            4.10       1.44       6.18       1.07       p < 0.0001

Mean values with standard deviations are presented for the T1a and T1b groups. The right column shows the results of group comparisons using a Wilcoxon rank sum test. The difference between the Gleason scores of T1a and T1b tumours is highly significant. Also the preoperative PSA value, and the volume and surface area of epithelial cells per unit tissue volume, were significantly higher in the group of T1b tumours. Abbreviations: Mib-1, p53: semiquantitative scores of Mib-1 and p53 immunohistochemistry; GS: Gleason score; VV(cap): volume fraction of capillaries; LV(cap): length of capillaries per unit tissue volume; SV(cap): surface area of capillaries per unit tissue volume; VV(epi): volume fraction of epithelial cells; SV(epi): surface area of epithelial cells per unit tissue volume.


For group comparisons, the nonparametric Wilcoxon rank sum test (Mann–Whitney U test) was used [1]. There were 39 cases in category T1a and 27 cases in category T1b. The Gleason score and the preoperative PSA values were significantly higher in the T1b group. With respect to the quantitative stereological findings of tumour texture, we found significantly higher estimates of epithelial volume fraction and epithelial surface area per unit volume in the T1b group. There was a trend towards a denser tumour vascularization in terms of length, surface area and volume of capillaries per unit tissue volume in the T1b group, but this trend was not statistically significant at the 5% level. The same statement holds for nuclear Mib-1 expression and p53-overexpression.
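The group comparisons in Table 1 are plain two-sample rank tests. As a minimal Python illustration (using SciPy rather than the software of [1]), the arrays gleason_t1a and gleason_t1b are hypothetical stand-ins for the observed Gleason scores of the 39 T1a and 27 T1b cases:

    import numpy as np
    from scipy.stats import mannwhitneyu

    # Hypothetical example data; in the study these would be the observed scores per group.
    gleason_t1a = np.array([3, 4, 4, 5, 6, 4, 3, 5])
    gleason_t1b = np.array([6, 7, 5, 6, 7, 8])

    # Two-sided Wilcoxon rank sum (Mann-Whitney U) test of the group difference.
    u_statistic, p_value = mannwhitneyu(gleason_t1a, gleason_t1b, alternative="two-sided")
    print(f"U = {u_statistic:.1f}, p = {p_value:.4f}")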

4.2. Logistic regression

Applying the aforementioned stepwise algorithm for logistic regression to our data, we used age, Mib-1, p53, the PSA value, LV(cap), SV(cap), VV(cap), SV(epi), VV(epi) and the Gleason score as k = 10 independent variables xi, and the tumour categories T1a and T1b as dependent variable f(z) ∈ [0, 1]. In the stepwise logistic regression procedure, only the two variables Gleason score and epithelial volume fraction met the 5% significance level for entry into the model. The corresponding values were χ2 = 9.05 for the Gleason score (p < 0.01) and χ2 = 7.05 for the volume fraction of epithelial cells (p < 0.01). The other variables were eliminated from the model as non-contributory.

4.3. Classification of cases by LVQ and SVM

In this part of the study, four data sets of the same 66 cases with different sets of input variables were examined. The single output variable in all 4 settings was the tumour category, which was coded as 0 or 1 for LVQ and as −1 or 1 for SVM, respectively, corresponding to T1a and T1b. For the purpose of classification, missing values were estimated from the mean values of that variable over the cases with available data [42], and all input values were scaled (normalized) to the interval [0, 1]. The first data set consisted of all 10 input variables (age, Mib-1, p53, PSA, Gleason score, LV(cap), SV(cap), VV(cap), SV(epi) and VV(epi)) and the output variable T1a/T1b. For the second data set only the two variables Gleason score and epithelial volume fraction were considered. For the third and fourth data set, merely one significant input variable was included: only the Gleason score or only the epithelial volume fraction, respectively. Accuracy, sensitivity, specificity, positive and negative predictive values of the classification were tested by cross-validation according to the leave-one-out method [23–25,41,45].
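The two preprocessing steps mentioned here (mean imputation of missing values and rescaling of every input variable to [0, 1]) are easy to reproduce; a minimal NumPy sketch is given below, with the function name chosen for illustration only.

    import numpy as np

    def impute_and_scale(X):
        """Replace missing values (NaN) by the column mean of the cases with available
        data, then min-max scale every input variable to the interval [0, 1]."""
        X = np.asarray(X, dtype=float).copy()
        column_means = np.nanmean(X, axis=0)
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = column_means[cols]
        lo, hi = X.min(axis=0), X.max(axis=0)
        return (X - lo) / np.where(hi > lo, hi - lo, 1.0)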

Table 2 shows the results for these 4 data sets obtained by LVQ and SVM in comparison. In the table, only the best results obtained by the two algorithms are shown. These are based on an optimization of the algorithm parameters, which was performed partially on the basis of experience but also to a large extent by trial and error.

Table 2

Classifications

Input variables   Number   Algorithm   Accuracy   Sensitivity   Specificity   PPV      NPV      Algorithm properties
All               10       LVQ         0.8485     0.8519        0.8462        0.7931   0.8919   7 CV
                           SVM         0.8333     0.7407        0.8974        0.8333   0.8333   RF
GS, VV(epi)       2        LVQ         0.8485     0.8519        0.8462        0.7931   0.8919   4 CV
                           SVM         0.8636     0.8888        0.8462        0.8000   0.9167   RF
GS                1        LVQ         0.7424     0.7037        0.7692        0.6786   0.7895   2 CV
                           SVM         0.7424     0.7037        0.7692        0.6786   0.7895   VP, RBF
VV(epi)           1        LVQ         0.7273     0.5556        0.8462        0.7143   0.7333   6 CV
                           SVM         0.7273     0.5556        0.8462        0.7143   0.7333   RBF

Classification (prediction) of category T1a versus T1b on the basis of all 10 input variables, and on reduced data sets with 2 or only 1 input variables. For the complete data set, the 10 variables age, Mib-1, p53, Gleason score, VV(cap), LV(cap), SV(cap), VV(epi), SV(epi) and PSA were used. For the other data sets, the input variables are indicated in the left column. LVQ: learning vector quantization, SVM: support vector machine, NPV: negative predictive value, PPV: positive predictive value. In the right column more detailed specifications of the algorithms are given: number of codebook vectors (CV) for LVQ; type of kernel function for SVM (RF: regularized Fourier kernel, VP: Vapnik polynomial, RBF: radial basis function). For the data set based on GS only, identical results were obtained by SVMs with VP and RBF kernels. Note the identical predictions by LVQ and SVM on the basis of the one-dimensional data sets in the two lower rows.


It is fundamentally impossible to explore all theoretically conceivable parameter combinations, because various algorithm parameters are numerical quantities which possess an infinite range of variation. Depending on the algorithm and the set of variables, the accuracy ranged between 73% and 86% when new test cases were presented to the system after training. The number of codebook vectors necessary to produce the highest accuracy in LVQ varied from 2 to 7. Using SVM, the best results were obtained for the complete data set and for the reduced data set with two input variables with a regularized Fourier kernel. For the reduced data sets with only one input variable, the best results were obtained with polynomials and radial basis functions as kernels. Note that neither for LVQ nor for SVM could the accuracy of prediction be improved by raising the number of input variables from 2 to the full set of 10 variables. In general, the classification accuracies of SVM and LVQ were very similar; sometimes even identical results were obtained (Table 2). As an increase of the number of variables beyond 2 did not augment the precision, the input variables in addition to GS and VV(epi) must be considered as non-contributory also from the viewpoint of a purely neural approach to the data. On the other hand, a further reduction to a single input variable (only GS, only VV(epi)) leads to a loss of accuracy. This shows that GS and VV(epi) possess some independent information. Summarizing, a correct prediction of category T1a versus T1b was feasible in a high percentage of incidental prostatic cancers (>80%) on the basis of all 10 variables or using only the 2 significant variables. These predictions were obtained by LVQ as well as by SVM and were accompanied by similar results for sensitivity, specificity, positive and negative predictive value.

5. Discussion

The vascular supply and proliferation of prostatic carcinomas have been investigated in various studies on prostatic cancer in advanced tumour stages [28,46,50,51]. In two previous studies capillary vascularization was also studied in incidental prostatic carcinomas [47,48]. Also in other studies on T1 prostatic carcinoma, overexpression of p53 protein and the expression of proliferation-associated antigens such as Ki-67 (Mib-1) have been explored [6,44]. We found a trend towards an increased capillary vascularization in T1b tumours as compared to T1a tumours. In general, higher pathological stage, higher histological grade and metastasis have been shown to be correlated with increased tumour vascularization; our results are compatible with this view. While some previous data on these aspects of the quantitative pathology of incidental prostatic cancer are available, the stereological study of epithelial texture has only rarely been applied in general [17,19,23,24]. The present data for volume fraction and surface area of epithelial cells per unit tissue volume are generally in the same size range for prostatic carcinomas as in the previous communications [23,24]. The study of epithelial texture is laborious as long as the images are interactively segmented. However, in principle automatic image analysis after contrast enhancement of epithelial structures by appropriate immunohistochemical stains (e.g., with antibodies against cytokeratin or PSA) could strongly increase the efficiency of this method, which is presently still restricted to scientific studies.

It was attempted to characterize incidental prostatic carcinomas by quantitative and multivariate techniques. Stepwise logistic regression was used to find out which of these variables remain significant when their interactions are taken into account. Thus, an approach of classical statistics was applied for the purpose of feature selection for the neural paradigms. It could be shown that the dependent variable, T1a vs. T1b, could be explained largely by two independent significant variables, the Gleason score and the epithelial volume fraction of the tumour tissue. Clearly this observation is consistent with the view that the Gleason score is correlated with, but not fully determined by, the epithelial volume fraction. A look at the well-known schematic diagrams of the Gleason score [7] shows that in general the volume fraction of the epithelia rises with increasing Gleason score, but not monotonically, and in addition to volume fraction changes there are also complex pattern alterations from tubular to cribriform to solid architecture (Fig. 2). This aspect is not entirely represented by first-order parameters such as volume and surface area, but can be quantified only by special techniques of stereology and stochastic geometry, e.g., pattern analysis on the basis of second-order properties or by comparison of empirical structures to reference models of random set theory [17,19,22,24,30,37]. Both the Gleason score and epithelial texture stereology have the advantage that they can be obtained from every specimen, even if it should prove uninformative for immunohistochemical stains. They are also available independently of any clinical data. Presently the Gleason score remains a highly efficient and indispensable tool for grading epithelial texture.


Fig. 2. Central areas of a reference image for Gleason grading were digitized and segmented (quoted from [7]). Here the components are epithelial cells (white), glandular lumina (gray), and stroma (black). The numbers indicate the Gleason grade of individual tumour patterns; the Gleason score is the sum of the grades of the two major patterns (e.g., 3 + 4 = 7). The patterns related to the grades differ by volume fraction and surface area of tumour cells per unit volume, but in addition there are complex shape changes which transcend quantification by simple first-order parameters.

In the present investigation newer tools of data analysis were applied for the first time to cases of incidental prostatic cancer. There already exists a number of studies with artificial neural networks on prostatic carcinoma, in which different predictions were made, e.g., preoperative prediction of tumour stage from clinical data and biopsy findings, prediction of tumour relapse after prostatectomy, and other study designs (see [17,23,25,39,49] and references therein). In these investigations, the object of the study was clinically manifest carcinoma, and in the large majority of studies multilayer feedforward networks with backpropagation (multilayer perceptrons, MLP) were used. In the present study, the neuronal paradigms LVQ and SVM were applied for the first time to prostatic cancer data in comparison. Previous data of our group have shown that LVQ may be advantageous with respect to accuracy and other characteristics of the quality of prediction in comparison to MLPs [23–25]. Superior results by LVQ in classification of urological and other tumours have also been reported by other groups [16,31,32]. On the whole, the results obtained by LVQ and SVM were similar in this study. One cannot exclude that even better results could have been obtained by SVM if still more variations of the kernel function and other SVM parameters had been tried; here we restricted ourselves to 5 basic types of kernel functions in SVM [34]. We suggest that alternative techniques of data analysis such as LVQ and SVM, which are also often faster and no more difficult to use than MLPs, should be tried in the general context of prediction in clinical studies. Clearly SVMs can give additional information by identifying a subset of the cases as support vectors, but in the present context we have not exploited this option any further.

The reader might wonder why the availability of more information (here: 10 input variables as compared to only 2 input variables) did not improve the accuracy of the classification. To understand this phenomenon, one must take into account that the networks learn on preclassified training data sets and thereafter must generalize to new, unknown test cases. For the ability to generalize, it is necessary to find a good compromise between too little learning and too much learning; in the latter situation the system "learns the patterns by heart" but extrapolates less well to new patterns. In the present situation we see this effect (albeit at a very low level) for SVM when switching from 2 to 10 variables, which implies a considerable increase of model complexity. A similar behaviour is well known as "overtraining" (overfitting), e.g., for multilayer perceptrons [23,24], where greatly increased numbers of training epochs may rather lead to worse than improved results.

With respect to practical applications, we have restricted our study to techniques applicable to diagnostic routine material. It would also have been possible to include genetic data, e.g., from CGH studies [26,27,52], but this technique is more laborious and will possibly remain restricted to research laboratories. In any case, the conventional distinction between T1a and T1b rests on an estimate of whether the tumour cells occupy 5% or less, or more than 5%, of the resected prostatic tissue. Such a value can be highly biased by the localization from which the tissue has been removed. In contrast, properties such as histological grade, texture parameters or estimates of proliferation and angiogenesis are more homogeneous and should depend less on the site of the removed histological specimen. Clearly our cases were not reclassified after the study: the diagnosis of a T1a or T1b case was kept unchanged on the basis of the volume fraction of tumour per total tissue as defined according to the UICC. Nevertheless it is plausible to conceive hypothetical cases where a computer-based classification of a case would be desirable, e.g., because a reliable determination of the tumour fraction per tissue is not feasible due to paucity of resected material. It is also of general pathobiological interest which factors enable a clinically silent prostatic neoplasm to occupy more than 5% of the gland. It is planned to investigate the predictive value of quantitative histological texture variables in further prospective studies. If our best network (SVM with regularized Fourier kernel and 2 input variables) were used for this purpose, this would mean that ≈86% of the cases would be correctly classified into two classes, which could be defined as "group with low risk of T1b" and "group with high risk of T1b". In the first group, the risk of having T1b cancer would be only 8.3%, whereas in the second group this risk would be 80%. This observation confirms that artificial neural networks can be useful for the identification of individual patients in low and high risk categories [23,25,26].

Acknowledgements

This study was supported by the IZKF (Interdisciplinary Centre for Clinical Research) of the University of Ulm, grant number C8, and by the Stifterverband für die Deutsche Wissenschaft (H.A.K.). Thanks are due to Katrin Schierle for excellent technical help with the illustrations.

References

[1] J. Bortz, G. Lienert and K. Boehnke, Verteilungsfreie Methoden in der Biostatistik, Springer, Berlin, 2001.
[2] D.G. Bostwick, The pathology of incidental carcinoma, Cancer Surveys 23 (1995), 7–18.
[3] M.P.S. Brown, W.N. Grundy, D. Lin, N. Cristianini, C.W. Sugnet, T.S. Furey, M. Ares and D. Haussler, Knowledge-based analysis of microarray gene expression data by using support vector machines, Proc. Natl. Acad. Sci. USA 97 (2000), 262–267.
[4] J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining Knowl. Discov. 2 (1998), 121–167.
[5] L. Cheng, E.J. Bergstrath, B.G. Scherer, R.M. Neumann, M.L. Blute, H. Zincke and D.G. Bostwick, Predictors of cancer progression in T1a prostate adenocarcinoma, Cancer 85 (1999), 1300–1304.
[6] M.R. Feneley, M. Young, C. Chinyama, R.S. Kirby and M.C. Parkinson, Ki-67 expression in early prostate cancer and associated pathological lesions, J. Clin. Pathol. 49 (1996), 741–748.
[7] D.F. Gleason, Histologic grading of prostate cancer: a perspective, Hum. Pathol. 23 (1992), 273–279.
[8] C.V. Howard and M.G. Reed, Unbiased Stereology. Three-Dimensional Measurement in Microscopy, Bios Scientific Publishers, Oxford, 1998.
[9] A.K. Jain and R.C. Dubes, Algorithms for Clustering Data, Prentice Hall, Englewood Cliffs, 1988.
[10] D. Keyser, P. Kupelian, C.D. Zippe, H.S. Levin and E.A. Klein, Stage T1–2 prostate cancer with pretreatment prostate-specific antigen level < or = 10 ng/ml: radiation therapy or surgery?, Int. J. Radiat. Oncol. Biol. Phys. 38 (1997), 723–729.
[11] D.G. Kleinbaum, Logistic Regression: A Self-Learning Text, Springer, New York, 1994.
[12] T. Kohonen, J. Hynninen, J. Kangas, J. Laaksonen and K. Torkkola, LVQ_PAK: The learning vector quantization program package, Technical Report A30, Helsinki University of Technology, Laboratory of Computer and Information Science, Otaniemi, Finland, 1996.
[13] T. Kohonen, J. Hynninen, J. Kangas and J. Laaksonen, SOM_PAK: The self-organizing map program package, Technical Report A31, Helsinki University of Technology, Laboratory of Computer and Information Science, FIN-02150 Espoo, Finland, 1996.
[14] T. Kohonen, Self-Organizing Maps, 2nd edn, Springer, Heidelberg, 1997.
[15] P. Mangiameli, S.K. Chen and D. West, A comparison of SOM neural network and hierarchical clustering, Eur. J. Operat. Res. 93 (1996), 402–417.
[16] C. Markopoulos, P. Karakitsos, E. Botsoli-Stergiou, A. Pouliakis, A. Iokim-Liossi, K. Kyrkou and J. Gogas, Application of the learning vector quantizer to the classification of breast lesions, Analyt. Quant. Cytol. Histol. 19 (1997), 453–460.
[17] T. Mattfeldt, Classification of binary spatial textures using stochastic geometry, nonlinear deterministic analysis and artificial neural networks, Int. J. Pattern Recogn. Artif. Intell. 17 (2003), 275–300.
[18] T. Mattfeldt and G. Mall, Estimation of length and surface of anisotropic capillaries, J. Microsc. 135 (1984), 181–190.
[19] T. Mattfeldt and D. Stoyan, Improved estimation of the pair correlation function, J. Microsc. 200 (2000), 158–173.
[20] T. Mattfeldt, G. Mall, H. Gharehbaghi and P. Möller, Estimation of surface area and length with the orientator, J. Microsc. 159 (1990), 301–317.
[21] T. Mattfeldt, H. Frey and C. Rose, Second-order stereology of benign and malignant alterations of the human mammary gland, J. Microsc. 171 (1993), 143–151.
[22] T. Mattfeldt, V. Schmidt, D. Reepschläger, C. Rose and H. Frey, Centred contact density functions for the statistical analysis of random sets, J. Microsc. 183 (1996), 158–169.
[23] T. Mattfeldt, H.A. Kestler, R. Hautmann and W. Gottfried, Prediction of prostatic cancer progression after radical prostatectomy using artificial neural networks: a feasibility study, BJU Int. 84 (1999), 316–323.
[24] T. Mattfeldt, H.W. Gottfried, V. Schmidt and H.A. Kestler, Classification of spatial textures in benign and cancerous glandular tissues by stereology and stochastic geometry using artificial neural networks, J. Microsc. 198 (2000), 143–158.
[25] T. Mattfeldt, H.A. Kestler, R. Hautmann and H.W. Gottfried, Prediction of postoperative prostatic cancer stage on the basis of systematic biopsies using two types of artificial neural networks, Eur. Urol. 39 (2001), 530–537.
[26] T. Mattfeldt, H. Wolter, R. Kemmerling, H.W. Gottfried and H.A. Kestler, Cluster analysis of comparative genomic hybridization (CGH) data using self-organizing maps: application to prostate carcinomas, Anal. Cell. Pathol. 23 (2001), 29–37.
[27] T. Mattfeldt, H. Wolter, D. Trijic, H.W. Gottfried and H.A. Kestler, Chromosomal regions in prostatic carcinomas studied by comparative genomic hybridization, hierarchical cluster analysis and self-organizing feature maps, Anal. Cell. Pathol. 24 (2002), 167–179.
[28] R. Mazzucchelli, R. Montironi, A. Santinelli, G. Lucarini, A. Pugnaloni and G. Biagini, Vascular endothelial growth factor expression and capillary architecture in high-grade PIN and prostate cancer in untreated and androgen-ablated patients, Prostate 45 (2000), 72–79.
[29] G. Moutzouris, C. Barbatis, D. Plastiras, N. Mertziotis, C. Katsifotis, V. Presvelos and C. Theodorou, Incidence and histological findings of unsuspected prostatic adenocarcinoma in radical cystoprostatectomy for transitional cell carcinoma of the bladder, Scand. J. Urol. Nephrol. 33 (1999), 27–30.
[30] J. Ohser and F. Mücklich, Statistical Analysis of Microstructures in Materials Sciences, Wiley, Chichester, 2000.
[31] D. Pantazopoulos, P. Karakitsos, A. Iokim-Liossi, A. Pouliakis and K. Dimopoulos, Comparing neural networks in the discrimination of benign from malignant lower urinary tract lesions, Br. J. Urol. 81 (1998), 574–579.
[32] D. Pantazopoulos, P. Karakitsos, A. Pouliakis, A. Iokim-Liossi and M.A. Dimopoulos, Static cytometry and neural networks in the discrimination of lower urinary system lesions, Urology 51 (1998), 946–950.
[33] SAS Institute, SAS/STAT User's Guide, Release 6.03 Edition, SAS Institute Inc., Cary, NC, 1988.
[34] R. Saunders, M.O. Stitson, J. Weston, L. Bottou, B. Schölkopf and A. Smola, Support vector machine reference manual, Technical Report, Royal Holloway, University of London, 1998.
[35] S.N. Snapinn and J.D. Knoke, Estimation of error rates in discriminant analysis with selection of variables, Biometrics 45 (1989), 289–299.
[36] L.H. Sobin and C.H. Wittekind, eds, International Union against Cancer (UICC): TNM Classification of Malignant Tumours, Wiley, New York, 1997.
[37] D. Stoyan, W.S. Kendall and J. Mecke, Stochastic Geometry and Its Applications, 2nd edn, Wiley, Chichester, 1995.
[38] P. Tamayo, D. Slonim, J. Mesirov, Q. Zhu, S. Kitareewan, E. Dmitrovsky, E.S. Lander and T.R. Golub, Interpreting patterns of gene expression with self-organizing maps: Methods and application to hemopoietic differentiation, Proc. Natl. Acad. Sci. USA 96 (1999), 2907–2912.
[39] A. Tewari, Artificial intelligence and neural networks: concept, applications and future in oncology, Br. J. Urol. 80 (Suppl. 3) (1997), 53–58.
[40] K.L. Toh, P.H. Tan and C.W. Cheng, Six-year follow-up of untreated T1 carcinoma of prostate, Ann. Acad. Med. Singapore 29 (2000), 201–206.
[41] G.D. Tourassi and C.E. Floyd, The effect of data sampling on the performance evaluation of artificial neural networks in medical diagnosis, Med. Decis. Making 17 (1997), 186–192.
[42] O. Troyanskaya, M. Cantor, G. Sherlock, P. Brown, T. Hastie, R. Tibshirani, D. Botstein and R.B. Altman, Missing value estimation methods for DNA microarrays, Bioinformatics 17 (2001), 520–525.
[43] R.E. Unger, J.B. Oltrogge, H. von Briesen, B. Engelhardt, U. Woelki, W. Schlote, R. Lorenz, H. Bratzke and C.J. Kirkpatrick, Isolation and molecular characterization of brain microvascular endothelial cells from human brain tumors, In Vitro Cell. Dev. Biol. Anim. 38 (2002), 273–281.
[44] P.J. van Veldhuizen, R. Sadasivan, R. Cherian, T. Dwyer and R.L. Stephens, p53 expression in incidental prostatic cancer, Am. J. Med. Sci. 305 (1993), 275–279.
[45] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[46] R.K. Vartanian and N. Weidner, Endothelial cell proliferation in prostatic carcinoma and prostatic hyperplasia: correlation with Gleason's score, microvessel density, and epithelial cell proliferation, Lab. Invest. 73 (1995), 844–850.
[47] S. Vesalainen, P. Lipponen, M. Talja, E. Alhava and K. Syrjanen, Tumor vascularity and basement membrane structure as prognostic factors in T1–2M0 prostatic adenocarcinoma, Anticancer Res. 14 (1994), 709–714.
[48] M. Volavsek, A. Masera and Z. Ovcak, Incidental prostatic carcinoma. A predictive role of neoangiogenesis and comparison with other prognostic factors, Pathol. Oncol. Res. 6 (2000), 191–196.
[49] J.T. Wei, Z. Zhang, S.D. Barnhill, K.R. Madyastha, H. Zhang and J.E. Oesterling, Understanding artificial neural networks and exploring their potential applications for the practicing urologist, Urology 52 (1998), 161–172.
[50] N. Weidner, Tumor angiogenesis: review of current applications in tumor prognostication, Semin. Diagn. Pathol. 10 (1993), 302–313.
[51] N. Weidner, P.R. Carroll, J. Flax, W. Blumenfeld and J. Folkman, Tumor angiogenesis correlates with metastasis in invasive prostate carcinoma, Am. J. Pathol. 143 (1993), 401–409.
[52] H. Wolter, D. Trijic, H.-W. Gottfried and T. Mattfeldt, Chromosomal changes in incidental prostatic carcinomas detected by comparative genomic hybridization, Eur. Urol. 41 (2002), 328–334.
