An On-Line Medical Imaging Management for Shared Research in the Web using Pattern Features

Gustavo Molitor Porcides
Federal University of Paraná - UFPR
Centro Politécnico campus, Curitiba, Brazil
[email protected]

Leandro Henrique Stein
Federal University of Paraná - UFPR
Centro Politécnico campus, Curitiba, Brazil
[email protected]

Terumi Kamada
Federal University of Paraná - UFPR
Centro Politécnico campus, Curitiba, Brazil
[email protected]

Luiz Antonio Pereira Neves
Federal University of Paraná - UFPR
Centro Politécnico campus, Curitiba, Brazil
[email protected]

Gilson Antonio Giraldi
National Laboratory for Scientific Computing
Petrópolis - RJ, Brazil

Abstract

This work aims to create a medical and biological imaging management system that will be used to share images and information between researchers via a client/server architecture. It will be developed using the PostgreSQL DBMS, with PHP used to create the web interface. In this system the image attributes are automatically extracted and the pattern features are obtained by image retrieval techniques to improve searches. These are the main advantages of the proposed solution. Finally, we also present the development methodology of the proposed tool.

1. Introduction

The objective of this research is the development of a management system for medical and biological images that includes resources for editing, registration and classification of images. Moreover, the proposed system offers facilities to store the results of processing tasks. In addition, it allows researchers to share information through the Web. The proposed system uses web tools so that the database can be remotely accessed via a web browser. We therefore obtain high usability and interactivity, since web browsers are familiar and portable user interfaces.

Many researchers have presented similar proposals in the literature. For instance, Azevedo-Marques et al. [2] suggest the use of a PACS (Picture Archiving and Communication System) for the storage, communication, processing and editing of medical images and diagnoses in hospitals. The proposed PACS uses the DICOM standard for information exchange between hospitals. It is composed of a DICOM server and a web server. The use of web technologies, such as HTML and ASP, allows fast distribution and a friendly interface. The PACS uses a client/server architecture and a data model built with Oracle 8.1.7.0.0 and Delphi 5.

In Azevedo-Marques et al. [3], the authors propose another kind of PACS for the Hospital das Clínicas of Ribeirão Preto (FMRP-USP). This model uses a Linux server for image distribution through FTP over TCP/IP. The images are stored on disk arrays or CD-ROM. The authors comment that implementing this system has a high cost and, for this reason, must be well planned.

In another work, Pires et al. [17] present software developed with Delphi 6 and Interbase 6 that uses the BI-RADS standard ("Breast Imaging Reporting and Data System") for the registration of mammographic images. This system offers image visualization and training facilities for students of the Federal University of São Paulo.

For image retrieval, Caritá et al. [5] suggest the use of CBIR (Content-Based Image Retrieval). In this case, MySQL is used to store attributes such as color, shape and texture. These features are extracted by a system component written in C++ that checks the images on a PACS server. The system offers textual retrieval through HTML and PHP, and Java software to visualize the DICOM images.

On the other hand, d'Ornellas et al. [7] suggest the use of metadata from medical images. According to the W3C [6] definition, metadata are information located on the web that is intelligible to the computer. Metadata are thus data used to describe primary data. Information attached to an image, or to any other kind of document, is very useful for data retrieval and search in a database. However, it may be useless if it is not organized and structured. The use of metadata is very complex and requires many computational resources due to the construction of its meaning. A system of this nature is being developed by the PIGS group and will be implemented at the Santa Maria University Hospital (HUSM).

July 4-7 - FCT/UNESP - P. Prudente - VI Workshop de Visão Computacional

Santos and Furuie [21] propose a management system in Java, with image processing algorithms, for the storage and manipulation of medical images. The authors describe an interactive architecture similar to an Internet portal, useful as a tool for research, retrieval and data processing. The images are stored in a PostgreSQL 8.0 database. The system can interact with other applications through distributed access based on P2P and client/server models. It supports many digital image formats, such as DICOM, TIFF, GIF, JPEG and BMP.

Marchiori [12] suggests the use of virtual libraries to improve the process of information management supported by databases and web resources. Furthermore, they can access other libraries over the Internet, exchanging data through the use of protocols. Another advantage is the possibility of remote access, so a user can take part in discussions and exchange data with the library.

The use of communication protocols becomes necessary to create a connection between databases and virtual libraries. A communication protocol is a convention or standard that controls and enables the connection and transfer of data between two computer systems [23]. Put simply, a protocol can be defined as "the rules that control" the syntax, semantics and synchronization of communication. Protocols can be implemented in hardware, software or a combination of both. Rosetto [20] indicates the use of Z39.50, a communication protocol that allows access to multiple systems through a single interface, with client/server technology operating over the Internet. Its goal is to simplify the manipulation of information in distributed systems.

Brito [4] proposes the use of the system called MicroISIS. This system has fields that store digital images in its records. It simplifies the management of collections because it performs searches based on keywords. However, MicroISIS has problems when dealing with large databases.

This research meets many challenges, such as the lack of standards for medical image databases, the great variability of image formats and the definition of a secure architecture with the main server. Therefore, the great challenge faced by our work is the development of a management system for the storage, manipulation and sharing of medical and biological images using web technologies.

This work is organized as follows. The methodology is described in Section 2. Next, in Section 3, the proposed system is analyzed. The conclusions are given in Section 4.

2. Methodology

The methodology applied in this work is organized in four stages, as shown in Figure 1.

Figure 1. Methodology applied for the system development.

2.1. First Stage: Definition of the Database Management System

The first phase consists of choosing the Database Management System (DBMS) that will be used in this work. To this end, three freeware DBMSs are analyzed and compared according to their maximum table load and their features.

2.2. Second Stage: Data Modeling

The second stage is the definition of the data model, using UML to describe the inputs and outputs and relational modeling to identify all of the system's information.

2.3. Third Stage: Implementation

The third stage consists of the implementation of the data model, using the tools defined in stage 1, and the analysis of the features of three biotechnology web databases: NCBI (National Center for Biotechnology Information) [11], SoftBerry [22] and Addgene [1].

2.4. Fourth Stage: Validation Tests

In the fourth stage, tests are performed to validate the system. For this phase, the validation protocol is defined using the methods proposed by Pressman [19] and a checklist evaluation that identifies the user's perception of the system.

3. Analysis of Obtained Results

In this section the results of the methodology are presented.

3.1. Results of the Data Base Management System

In this section the features of MySQL [16], FirebirdSQL [8] and PostgreSQL [18] are shown and comparisons are made between them.

MySQL is a DBMS written in C and C++. It is multitasking, multithreaded [6], multiuser and optimized for web applications, especially when used with PHP. It is easy to use and portable to most computer platforms, with support for several programming languages. It offers excellent performance and stability, and it can be used in mission-critical systems [14].

PostgreSQL is a client/server DBMS that supports transactions, triggers, views, foreign key referential integrity and locking [6].

FirebirdSQL is a high-performance relational DBMS with support for triggers and procedures [9].

Table 1 reports the maximum table load sizes of the three analyzed DBMSs [13] [10] [15].

Maximum Table Load   PostgreSQL   MySQL   FirebirdSQL
Windows              32 TB        2 TB    32 TB
Linux                32 TB        4 TB    32 TB

Table 1. Maximum Table Load

According to Table 1, PostgreSQL and FirebirdSQL support greater table loads, offering up to 32 TB of physical space for each table.

As shown by Chen and Xie [6], PostgreSQL 8.2 supports more features than MySQL 5.0 and Firebird 2.0. It supports associated integrity, database transactions, Unicode, indexes, temporary tables, table partitioning and clusters. Although MySQL supports many of these features, it does not have GiST index support. Firebird supports only associated integrity, database transactions and Unicode.

Therefore, based on the information shown in Table 1 and on the analysis of the features that these database management systems support, PostgreSQL has been chosen as the DBMS for this project because it offers the broadest support for the features required to implement the proposed system.

3.2. Results of the Image Manager's Data Modeling

The database has been modeled so that users are divided into several levels by the administrator, who acts as moderator. Each user has an access level with its own privileges, such as removing, adding, modifying or only viewing the images. Non-registered users can only view the images. On registration, the user must provide a name, address, institution, password and Social Security Number.
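The access levels described above can be illustrated with a minimal sketch. The level names ("guest", "member", "editor", "administrator") and the privileges assigned to each are assumptions made for illustration; the paper only states that levels exist and that non-registered users can only view images.

```python
from enum import Flag, auto


class Privilege(Flag):
    """Actions named in the text: view, add, modify, remove."""
    VIEW = auto()
    ADD = auto()
    MODIFY = auto()
    REMOVE = auto()


# Hypothetical access levels; the paper does not name them.
ACCESS_LEVELS = {
    "guest": Privilege.VIEW,  # non-registered users can only view
    "member": Privilege.VIEW | Privilege.ADD,
    "editor": Privilege.VIEW | Privilege.ADD | Privilege.MODIFY,
    "administrator": (
        Privilege.VIEW | Privilege.ADD | Privilege.MODIFY | Privilege.REMOVE
    ),
}


def is_allowed(level: str, action: Privilege) -> bool:
    """Return True if the given access level includes the requested action."""
    # Unknown levels fall back to the guest privileges.
    granted = ACCESS_LEVELS.get(level, ACCESS_LEVELS["guest"])
    return action in granted
```

For example, `is_allowed("guest", Privilege.ADD)` is false, while `is_allowed("administrator", Privilege.REMOVE)` is true. Falling back to guest privileges for unknown levels is one conservative design choice among several.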

The image attributes are width, height, resolution, type of compression, format, and the number and type of channels. These attributes may be used to index the images, and they will be extracted automatically during upload.
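As an illustration of automatic attribute extraction, the sketch below reads some of these attributes from the header of a PNG file using only the Python standard library. It is not the system's implementation (which is written in PHP); the function name `extract_png_attributes` is hypothetical, only PNG is handled, and resolution is not read.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# PNG colour type code -> (channel type, number of channels)
PNG_CHANNELS = {
    0: ("grayscale", 1),
    2: ("RGB", 3),
    3: ("palette", 1),
    4: ("grayscale+alpha", 2),
    6: ("RGBA", 4),
}


def extract_png_attributes(data: bytes) -> dict:
    """Read width, height, bit depth and channel layout from a PNG header."""
    # Signature (8) + chunk length and type (8) + IHDR data (13) + CRC (4).
    if len(data) < 33 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    # IHDR starts with width, height, bit depth, colour type, compression.
    width, height, bit_depth, color_type, compression = struct.unpack(
        ">IIBBB", data[16:27]
    )
    if color_type not in PNG_CHANNELS:
        raise ValueError("unknown PNG colour type")
    channel_type, channels = PNG_CHANNELS[color_type]
    return {
        "format": "PNG",
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "channel_type": channel_type,
        "channels": channels,
        # Method 0 (deflate) is the only one the PNG specification defines.
        "compression": "deflate" if compression == 0 else "unknown",
    }
```

The returned dictionary could be stored alongside the uploaded file and used as index fields. Other formats such as DICOM or TIFF would need their own header parsers.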

We have studied CBIR (Content-Based Image Retrieval) algorithms as an alternative to help the search for images. The algorithm analyzes the actual contents of the image rather than metadata such as keywords, tags and descriptions. The contents analyzed are colors, shapes, textures, or any other information that can be derived from the image itself.
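One common content-based feature is a color histogram compared by histogram intersection. The sketch below is offered only as an example of the idea, since the paper does not say which CBIR algorithm is used. All function names are hypothetical, and images are represented as plain lists of (R, G, B) tuples.

```python
from collections import Counter
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def color_histogram(pixels: Iterable[Pixel], bins_per_channel: int = 4) -> dict:
    """Quantize each channel into bins and return normalized bin frequencies."""
    def quantize(value: int) -> int:
        # Maps 0..255 onto 0..bins_per_channel-1.
        return value * bins_per_channel // 256

    counts = Counter((quantize(r), quantize(g), quantize(b)) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image has no pixels")
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1: dict, h2: dict) -> float:
    """Similarity in [0, 1]: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def rank_by_similarity(query: dict, database: dict) -> list:
    """Return image ids sorted from most to least similar to the query."""
    return sorted(
        database,
        key=lambda image_id: histogram_intersection(query, database[image_id]),
        reverse=True,
    )
```

A query image is reduced to its histogram and compared with the histograms stored for each database image. Shape and texture features would follow the same pattern, with a different feature extractor and distance.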

The main tables are "image" and "historical". The first keeps all the data about the images. The second combines information about users and operations to keep a complete history of database usage. Figure 2 shows the relational model.
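The relationship between the two main tables can be sketched as follows. This is an assumption-laden illustration: the column names, the `app_user` table and the helper functions are hypothetical, since the actual relational model is only given in Figure 2, and an in-memory SQLite database stands in for PostgreSQL so that the example is self-contained.

```python
import sqlite3

# Hypothetical columns; the real model is the one shown in Figure 2.
SCHEMA = """
CREATE TABLE app_user (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    access_level TEXT NOT NULL
);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    file_name TEXT NOT NULL,
    width INTEGER,
    height INTEGER,
    format TEXT,
    channels INTEGER
);
CREATE TABLE historical (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES app_user(id),
    image_id INTEGER NOT NULL REFERENCES image(id),
    operation TEXT NOT NULL,
    performed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_database() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    return conn


def register_upload(conn, user_id, file_name, width, height, fmt, channels) -> int:
    """Insert an image row and the matching history row in one transaction."""
    with conn:  # commits on success, rolls back both inserts on error
        cur = conn.execute(
            "INSERT INTO image (file_name, width, height, format, channels) "
            "VALUES (?, ?, ?, ?, ?)",
            (file_name, width, height, fmt, channels),
        )
        image_id = cur.lastrowid
        conn.execute(
            "INSERT INTO historical (user_id, image_id, operation) "
            "VALUES (?, ?, 'add')",
            (user_id, image_id),
        )
    return image_id


def image_history(conn, image_id) -> list:
    """Return (user name, operation) pairs for one image, oldest first."""
    rows = conn.execute(
        "SELECT u.name, h.operation FROM historical h "
        "JOIN app_user u ON u.id = h.user_id "
        "WHERE h.image_id = ? ORDER BY h.id",
        (image_id,),
    )
    return rows.fetchall()
```

Writing the image row and its history row in the same transaction means that no image can exist without a record of who added it.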

3.3. Implementation of the Proposed System

During stage 3, a friendly user interface has been implemented, which resembles the ones used by the biotechnology information bases NCBI (National Center for Biotechnology Information) [11], SoftBerry [22] and Addgene [1]. NCBI and Addgene have a keyword search engine that simplifies access to the content that the user wants to view. They also have menus divided into categories such as proteins and genes. NCBI has a "How To" page that shows how to navigate through the web site and access the content in an easy and understandable way. Moreover, NCBI has a registration module that allows registered members to save their research, results and citations, and that offers several search filters and other benefits. SoftBerry's main page shows links to the latest case studies to make access easier. These websites are shown in Figures 3, 4, and 5.

These functionalities are implemented in the proposed system to facilitate the interaction between the researcher and the system, offering a pleasant, organized and objective environment.

Figure 2. Relational Model of the Proposed System.

Figure 3. NCBI Webpage.

3.4. Validation Tests

For the system's validation, three test methods proposed by Pressman [19] will be used: the black-box test, the white-box test and the real-time test. The black-box test verifies whether the input is properly accepted and the output is correctly produced; it also verifies whether the integrity of external information is maintained. Three test methodologies are used, two of which are described here. Equivalence partitioning divides the input domain into equivalence classes for testing. This minimizes the number of test cases, limiting each class to one case. Boundary value analysis checks the values at and around the limits, since many errors occur at the input limits of a module.
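The two black-box techniques can be sketched for a single numeric input. The module under test here, a validator that accepts image widths from 1 to 4096 pixels, is hypothetical and serves only to give the generated test values something to run against.

```python
def boundary_values(low: int, high: int) -> list:
    """Boundary value analysis: values at and next to each limit of [low, high]."""
    return sorted({low - 1, low, low + 1, high - 1, high, high + 1})


def equivalence_classes(low: int, high: int) -> dict:
    """Equivalence partitioning: one representative input per class."""
    return {
        "below_range": low - 1,         # invalid class
        "in_range": (low + high) // 2,  # valid class
        "above_range": high + 1,        # invalid class
    }


# Hypothetical module under test: accepts image widths of 1 to 4096 pixels.
MIN_WIDTH, MAX_WIDTH = 1, 4096


def accepts_width(width: int) -> bool:
    return MIN_WIDTH <= width <= MAX_WIDTH


def run_black_box_cases() -> dict:
    """Map each generated input value to the result observed from the module."""
    cases = set(boundary_values(MIN_WIDTH, MAX_WIDTH))
    cases.update(equivalence_classes(MIN_WIDTH, MAX_WIDTH).values())
    return {value: accepts_width(value) for value in sorted(cases)}
```

Three equivalence classes cover the whole input domain with one case each, and the boundary values add the cases where off-by-one errors are most likely. Together they produce seven distinct inputs here.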

The white-box test aims to define test cases that exercise specific blocks of code of the web interface. The control structure test verifies the logical conditions, while the data flow test uses the locations of variables to define the test paths and execution paths that are tested.

The real-time test takes into account the timing of actions and aims to determine how the system reacts in several states that vary with time and can make the results vary too. The real-time tests are divided into four stages. In the first, called the task test, each task is tested individually with white-box and black-box tests, revealing logical and functional problems but not behavioral or timing errors. During the second stage, the system's behavior is simulated using CASE (Computer-Aided Software Engineering) tools to test its behavior as a consequence of external events. These events are tested to detect errors and flaws. After that, they are tested in random sequences and frequencies. The intertask test, the third stage, is carried out after the detection of behavioral errors and errors in individual tasks. It aims to detect timing errors. Several tasks communicate among themselves with varying data and processing loads to detect synchronization errors. In the last stage, the software and hardware are integrated, and several tests are then performed to discover errors in the hardware/software interface.

Figure 4. SoftBerry Webpage.

Figure 5. Addgene Webpage.

Besides these methods, a checklist evaluation will be carried out to identify the acceptance rate of the system. This evaluation will be based on several questions about the system, as shown in Figure 6.

4. Conclusions

This research presents a system for image sharing among researchers on the Web, using feature patterns from images.

Figure 6. Proposed Checklist.

Currently, the web interface is being developed in the PHP language and integrated with the PostgreSQL database management system, as illustrated in Figure 7.

Figure 7. Proposed Web Application.

The proposed research has advantages over others because the attributes are extracted automatically, the pattern features are obtained by image retrieval techniques and the image database is public for any researcher in Digital Image Processing.

Therefore, this project innovates by allowing image search using attributes and metadata, which makes the search more efficient and effective. Furthermore, the images stored in the proposed system are registered with a history that simplifies the analysis of empirical results.

In future studies, several other functionalities will be implemented, such as algorithms for image manipulation and Content-Based Image Retrieval for improving the searches.

References

[1] Addgene. Available on http://www.addgene.org/pgvec1. Accessed on April 21st, 2010.

[2] P. M. Azevedo-Marques, E. C. Caritá, A. A. Benedicto, and P. R. Sanches. Implantação de um ris/pacs no hospital das clínicas de ribeirão preto: Uma solução baseada em web. Radiol Bras 2005, São Paulo, pages 37–43, 2005.

[3] P. M. Azevedo-Marques, C. S. Trad, J. E. Junior, and A. C. Santos. Implantação de um mini-pacs (sistema de arquivamento e distribuição de imagens) em hospital universitário. Radiol Bras 2001, São Paulo, pages 221–224, 2001.

[4] C. J. Brito. Gerência de bases de imagens usando microisis.

[5] E. C. Caritá, E. Seraphim, M. O. Honda, and P. M. Azevedo-Marques. Implementação e avaliação de um sistema de gerenciamento de imagens médicas com suporte à recuperação baseada em conteúdo. Radiol Bras 2008, São Paulo, pages 331–336, 2008.

[6] R. Chen and J. Xie. Open Source Approaches In Spatial Data Handling. Springer, New York.

[7] M. C. d'Ornellas, S. R. Mussoi, and A. P. Dias. Avaliação e gerenciamento de qualidade de metadados de imagens médicas. XVIII Congresso Brasileiro de Engenharia Biomédica, Santa Maria, 2004.

[8] FirebirdSQL. Available on http://www.firebirdsql.org/. Accessed on April 25th, 2010.

[9] FirebirdSQL. Available on http://www.firebirdsql.org/index.php?id=about-firebird&nosb=1. Accessed on April 26th, 2010.

[10] FirebirdSQL. Available on http://www.firebirdsql.org/index.php?op=guide&id=techspec. Accessed on May 4th, 2010.

[11] NCBI - National Center for Biotechnology Information. Available on http://www.ncbi.nlm.nih.gov/. Accessed on April 21st, 2010.

[12] P. Z. Marchiori. Ciberteca ou biblioteca virtual: Uma perspectiva de gerenciamento de recursos de informação. 1997.

[13] A. Milani. PostgreSQL: Guia do Programador. Novatec, São Paulo, 2008.

[14] MySQL. Available on http://dev.mysql.com/doc/refman/5.0/en/features.html. Accessed on April 26th, 2010.

[15] MySQL. Available on http://dev.mysql.com/doc/refman/5.0/en/full-table.html. Accessed on May 4th, 2010.

[16] MySQL. Available on http://www.mysql.com/. Accessed on April 25th, 2010.

[17] S. R. Pires, R. B. Medeiros, and H. Schiabel. Banco de imagens mamográficas para treinamento na interpretação de imagens. Radiol Bras 2004, São Paulo, pages 239–244, 2004.

[18] PostgreSQL. Available on http://www.postgresql.org/. Accessed on April 25th, 2010.

[19] R. S. Pressman. Engenharia de Software. Pearson Makron Books, São Paulo, 1995.

[20] M. Rosetto. Uso do protocolo z39.50 para recuperação de informações em redes eletrônicas. 1997.

[21] M. Santos and S. S. Furuie. Base de imagens para avaliação de algoritmos de processamento de imagens médicas. IV SBQS - V Workshop de Informática Médica, 2005.

[22] SoftBerry. Available on http://linux1.softberry.com/berry.phtml. Accessed on April 21st, 2010.

[23] L. B. Sousa. Redes de Computadores: Guia Total. Editora Érica, São Paulo, 2009.
