A Framework for Network RTK Data
Processing Based on Grid Computing
Deming Yin
A thesis submitted in partial fulfilment of the requirements for the degree of
Master by Research
Faculty of Science and Technology Queensland University of Technology
Brisbane, Queensland, 4001, AUSTRALIA
January, 2009
ABSTRACT
Real-Time Kinematic (RTK) positioning is a technique used to provide precise
positioning services at centimetre accuracy level in the context of Global Navigation
Satellite Systems (GNSS). While a Network-based RTK (NRTK) system involves
multiple continuously operating reference stations (CORS), the simplest form of an
NRTK system is single-base RTK. In Australia there are several NRTK services
operating in different states and over 1000 single-base RTK systems to support
precise positioning applications for surveying, mining, agriculture, and civil
construction in regional areas. Additionally, future-generation GNSS constellations
with multiple frequencies, including modernised GPS, Galileo, GLONASS, and
Compass, are either under development or will become fully operational in the next
decade.
A trend in the future development of RTK systems is to make use of the various
isolated network and single-base RTK systems, together with multiple GNSS
constellations, for extended service coverage and improved performance. Several
computational challenges have been identified for future NRTK services including:
challenges have been identified for future NRTK services including:
• Multiple GNSS constellations and multiple frequencies
• Large scale, wide area NRTK services with a network of networks
• Complex computation algorithms and processes
• A greater part of the positioning process shifting from the user end to the
network centre, with the ability to cope with hundreds of simultaneous users’
requests (reverse RTK)
The four challenges faced by future NRTK systems impose two major requirements
on NRTK data processing: expandable computing power and scalable data
sharing/transfer capability. This research explores new approaches
to address these future NRTK challenges and requirements using the Grid
Computing facility, in particular for large data processing burdens and complex
computation algorithms. A Grid Computing based NRTK framework is proposed in
this research: a layered framework consisting of 1) a client layer in the form of a
Grid portal; 2) a service layer; and 3) an execution layer. A user’s request is passed
through these layers and scheduled to different Grid nodes in the network
infrastructure.
A proof-of-concept demonstration for the proposed framework is performed in a
five-node Grid environment at QUT and on Grid Australia. The Networked
Transport of RTCM via Internet Protocol (Ntrip) open source software is adopted to
download real-time RTCM data from multiple reference stations through the
Internet, followed by job scheduling and simplified RTK computing. The system
performance has been analysed and the results have preliminarily demonstrated the
concepts and functionality of the new NRTK framework based on Grid Computing,
whilst some aspects of the performance of the system are yet to be improved in
future work.
ACKNOWLEDGEMENT
Studying abroad has been a long and at times arduous journey, and one that could not
have been completed without the support of many people.
Firstly, I would like to express my sincere gratitude to my supervisors Professor
Mark Looi and Associate Professor Yanming Feng. They brought me into the area of
GNSS positioning, and provided various kinds of resources to facilitate my research.
During the research process, they also made sure that I was on the right track and
supervised the progress of my milestones. And most importantly, they checked my
thesis with great patience and gave me invaluable suggestions.
I also wish to thank Dr Charles Wang for his support in a variety of ways. I really
appreciate the time he spent commenting on my various reports and thesis, often
giving me detailed and constructive feedback. I offer my thanks to PhD candidate
Bofeng Li for his answers to the NRTK problems which confused me a lot at the
early stage.
Computational resources and services used in this work were provided by the High
Performance Computing (HPC) and Research Support Group, Queensland University
of Technology (QUT). Special thanks are also offered to Mr Ashley Wright from
HPC for his untiring assistance with the Grid Australia node Auriga. I am fortunate
to have had access to such an excellent facility and staff. I also gratefully
acknowledge QUT and the Faculty of Information Technology for awarding me
a scholarship to pursue my studies abroad.
Last but not least, I want to extend my deepest appreciation from the bottom of my
heart to my family and my friends, for believing that I could achieve anything I set
my mind to, for encouraging me to take the more challenging path, and for their
unwavering support during my study and my life so far.
STATEMENT OF AUTHORSHIP
“The work contained in this thesis has not been previously submitted to meet
requirements for an award at this or any other higher education institution. To the
best of my knowledge and belief, the thesis contains no material previously
published or written by another person except where due reference is made.”
Signature __________________
Deming Yin
Date __________________
TABLE OF CONTENTS
1. Introduction
1.1 Background of Research
1.2 Statements of Research Problems
1.3 Research Aims and Objectives
1.4 Structure of Thesis
2. Review of NRTK and Grid Computing
2.1 Overview of GNSS
2.1.1 GNSS Systems
2.1.2 Real Time Kinematic (RTK) Positioning
2.2 Overview of Grid Computing
2.2.1 Grid Computing
2.2.2 Security
2.2.3 Web Service
2.2.4 Job Scheduling
2.2.5 Globus
2.2.6 Grid Computing Applications
2.3 Relationship between NRTK and Grid Computing
2.4 Summary
3. Design of Framework for NRTK Data Processing Based on Grid Computing
3.1 Requirements of NRTK Data Processing
3.2 Architectural Studies of Grid Computing Based NRTK
3.3 Framework Design
3.4 Design of Different Layers
3.4.1 Client Layer
3.4.2 Service Layer
3.4.3 Execution Layer
3.5 Summary
4. Demonstration of Framework – A Simplified Data Processing Experiment
4.1 Overview of a Proof-of-Concept Demonstration
4.1.1 Introduction to Simplified Data Processing Demonstration
4.1.2 Evaluation of Open Source Ntrip Client
4.1.3 Demonstration Environment
4.2 Requirements and Design of the Proof-of-Concept Demonstration
4.3 Implementation of Demonstration
4.3.1 Globus Configuration
4.3.2 Submitting a Job
4.3.3 Processing Procedure
4.4 Results Analysis and Discussions
4.4.1 Performance Evaluation
4.4.2 Multiple Mountpoints on Each Grid Node
4.4.3 Computation Task Submission
4.4.4 Different Computation Tasks on Each Grid Node
4.4.5 Additional Tests
4.4.6 Discussion
4.5 Result From Lightweight Grid Tool JPPF
4.6 Summary
5. Concluding Remarks and Future Work
5.1 Concluding Remarks
5.2 Future Work
Appendix A: Installation Process of Globus
Appendix B: Web Service for Submitting a Job
References
LIST OF TABLES
Table 1.1 GNSS Present and Future
Table 2.1 Background Processing and Time Requirements
Table 3.1 Background Processing and Time Requirements
Table 3.2 Data Amount of Backing-up Service
Table 3.3 User-System Table
Table 4.1 Open Source Ntrip Client Information
Table 4.2 RTCM 2.x Message
Table 4.3 Configuration of Grid Nodes in Laboratory
Table 4.4 Specifications of the Grid Node at QUT
Table 4.5 Basic Parameters of Grid Nodes
Table 4.6 Brief Installation Process of Globus
Table 4.7 Valid Proxy Generation
Table 4.8 Submitting a Job in Globus
Table 4.9 Job Description in XML Format
Table 4.10 Job Description with File Stage-In and Stage-Out
Table 4.11 Time Factor of Simple Job (Unit: Second)
Table 4.12 Time Factor of a Complex Job (Unit: Second)
Table 4.13 Time Factor of Data Sharing (Unit: Second)
Table 4.14 Time Factor of Data Sharing (Unit: Second)
LIST OF FIGURES
Figure 1.1 SunPOZ Current Network Coverage (Higgins 2008)
Figure 1.2 Planned Distribution of New AuScope GNSS Stations (Rizos 2007)
Figure 2.1 Constellations of GPS (Wikipedia 2007)
Figure 2.2 Fundamental Principle of Positioning
Figure 2.3 RTK Positioning Illustrations
Figure 2.4 Illustration of Single Base RTK and NRTK
Figure 2.5 System Architecture of the Virtual Reference Station (Wanninger 2006)
Figure 2.6 Simplified Grid Topology
Figure 2.7 Layered Grid Architecture vs. the Internet Protocol Architecture
Figure 2.8 Basic Classifications of Grid Computing Organizations
Figure 2.9 Globus GT4 Infrastructures
Figure 2.10 Topology of Grid
Figure 2.11 OGSA Platform Architecture
Figure 2.12 Simplified Scenario of Typical Secured Communication
Figure 2.13 Web Service Architecture
Figure 2.14 Globus Components (Sotomayor and Childers 2006)
Figure 2.15 Schematic View of GT4.0 Components (Foster 2005)
Figure 3.1 Data Sharing Service Among Different Sites
Figure 3.2 NRTK Architecture
Figure 3.3 Grid Computing based NRTK
Figure 3.4 Different Organisation Being Responsible for Different Component
Figure 3.5 Execution Layer Network Topology
Figure 3.6 Service Decomposition and Composition
Figure 3.7 Background Processing Service Decomposition
Figure 3.8 A Simplified Network Infrastructure of NRTK Data Processing
Figure 3.9 Simple CA Solution for Security
Figure 4.1 Ntrip Data Flow Process
Figure 4.2 Grid Environment in a Laboratory at QUT
Figure 4.3 Design of Demonstration
Figure 4.4 Ntrip Data Processing Procedure
Figure 4.5 Job Submission Procedure
Figure 4.6 Calculating Positions on Auriga
Figure 4.7 Calculating Ionospheric Bias on Auriga
Figure 4.8 JPPF Architecture (JPPF 2008)
LIST OF ACRONYMS
ARGN: Australian Regional GNSS Network
CORS: Continuously Operating Reference Stations
DoD: Department of Defence
EDGE: Enhanced Data rates for GSM Evolution
GNSS: Global Navigation Satellite Systems
GPRS: General Packet Radio Services
GPS: NAVSTAR Global Positioning System
GSM: Global System for Mobile Communication
IGS: International GNSS Service
NCRIS: National Collaborative Research Infrastructure Strategy
NPC: Network Processing Centre
Ntrip: Networked Transport of RTCM via Internet Protocol
NRTK: Network RTK
RINEX: Receiver Independent Exchange Format
RTCM: Radio Technical Commission for Maritime Services
RTK: Real Time Kinematic
SAPOS: Satellite Positioning Service of the German National Survey
SV: Space Vehicle
TCAR: Three Carrier Ambiguity Resolution
UMTS: Universal Mobile Telecommunication Service
WS GRAM: Web Service Grid Resource Allocation and Management
1. Introduction
1.1 Background of Research
The research problem of this thesis arose from a Cooperative Research Centre for
Spatial Information (CRCSI) project which faces several challenges for future GNSS
precise positioning services. The CRCSI project 1.04, titled “Delivering Precise
Positioning Services in Regional Areas”, includes several organizations and industry
partners including: Queensland University of Technology (QUT), The University of
New South Wales (UNSW), Queensland Department of Natural Resources and
Water (NRW), Geoscience Australia, Ergon Energy, Leica Geosystems, Trimble
Navigation, etc (CRCSI 2007).
Project 1.04’s main goal is to develop an extension of GNSS precise positioning
services into regional areas to facilitate the adoption of such services in agriculture,
mining, utilities, construction, tourism, defence and environmental protection. With
some similar service networks having been established in areas of high population
density, such as the SunPOZ network for South East Queensland as shown in Figure
1.1, the project aims to extend the current service to sparsely populated regional
areas where the available technology may be characterised as “thin infrastructure”. In
order to achieve this, several research areas have been identified and have to be
explored.
Figure 1.1 SunPOZ Current Network Coverage (Higgins 2008)
Project 1.04’s first research task is to investigate the technical framework for the
current and future GNSS network architecture and operations. This will examine
alternative approaches for precise positioning network architecture beyond the
solutions currently offered by the major commercial suppliers. It will also address
challenges and benefits of next generation GNSS such as GPS-III and Galileo in
terms of all level GNSS services at local, regional and global scales.
The second task of Project 1.04 is to investigate appropriate communications
infrastructure for both the provider’s reference stations and the client’s rover
receivers, when working in remote and sparsely populated regional areas.
The third part of Project 1.04 is to develop a software-based research platform to
investigate and develop the next generation precise positioning services.
This research is closely related to the third part of Project 1.04 and approaches the
software-based research platform by utilising Grid Computing.
1.2 Statements of Research Problems
Over the next decade, new GNSS systems (Galileo, GLONASS, and others) will be
deployed, offering worldwide positioning services with substantially enhanced
capabilities compared to the current GPS system. However, technical challenges
have been identified for future GNSS systems. Firstly, the number of available
satellites will double or triple compared to the current situation (GPS only), and
even quadruple if the proposed Chinese Compass is also considered. Additionally, a
major change in the broadcast frequencies is proposed: the positioning service will
be upgraded from the current dual-frequency basis to a triple- or multiple-frequency
basis. Table 1.1 provides a comparison between current and future GNSS systems.
As more satellites and L-band signals become available, the reference stations in a
Continuously Operating Reference Station (CORS) network need the ability to deal
with increasing amounts of data and greater complexity of data interoperability.
Table 1.1 GNSS Present and Future

System    Satellites          Frequency Bands
          Present  Future     Present  Future
GPS       31       31+        L1, L2   L1, L2, L5
GLONASS   10       24+        G1, G2   G1, G2, G3
Galileo   1        30+        L1       E5A, E5B, L1, E6
Compass   1        35+        E1, E2   E1, E2, E6, L5
As the number of base stations increases in each region, there is a tendency for
different RTK networks in different regions to connect to each other to provide wider
area positioning services. For example, the number of reference stations will grow
from 30 to over 100 in GPSnet (GPSnet 2008) of Victoria, from 11 to 30 in SydNET
(SydNET 2008) of New South Wales, and from 7 to over 20 in SunPOZ of
Queensland. AuScope (AuScope 2008), an organisation for
a National Earth Science Infrastructure Program 2007-2011 funded by National
Collaborative Research Infrastructure Strategy (NCRIS), will establish more than
100 new Australian Regional GNSS Network (ARGN) quality stations throughout
Australia.
Due to increasingly larger ground network scales and updates to GNSS systems, the
computations used to generate differential corrections in the Network Processing
Centre (NPC) will be much more complicated and heavier than those used currently.
Firstly,
more rapid tropospheric and ionospheric model updates, and the adoption of Three
Carrier Ambiguity Resolution (TCAR) will bring additional complexity to the
positioning processes (Feng and Rizos 2005). Secondly, as the purpose of the CRC
project is to deliver precise positioning services in regional areas, the baselines
between stations in NRTK will expand from 70 km to 150-200 km (Feng and Li
2008). Thirdly, background processing, referring to time-consuming processes such
as preparing GNSS orbital polynomial functions, estimating tropospheric grid
models and generating ionospheric grid maps, requires substantial extra computational
capability. As a result, a new solution is needed for these extremely time-consuming
computations (e.g., huge-dimensional matrix calculation) to provide high-quality
real-time positioning services.
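To illustrate why such computations become heavy, consider the least-squares estimation at the core of most network processing: solving the n × n normal equations costs on the order of n³ operations, so a network several times larger multiplies this cost many-fold. A minimal sketch follows; the matrix below is random, standing in for a real design matrix of GNSS observations, and the function name is illustrative only:

```python
import numpy as np

# Least-squares estimate x_hat = (A^T A)^(-1) A^T b. The normal-equation
# solve is O(n^3) in the number of unknowns n, which is why larger networks
# with more stations, satellites and frequencies demand far more computing power.
def normal_equations_solve(A, b):
    N = A.T @ A   # n x n normal matrix
    u = A.T @ b   # right-hand side
    return np.linalg.solve(N, u)

rng = np.random.default_rng(0)
m, n = 200, 50                       # m observations, n unknowns (illustrative sizes)
A = rng.standard_normal((m, n))      # stand-in design matrix
x_true = rng.standard_normal(n)
b = A @ x_true                       # noise-free synthetic observations
x_hat = normal_equations_solve(A, b)
print(np.allclose(x_hat, x_true))    # exact data, so the estimate matches
```

With real observations the right-hand side carries noise and the unknowns number in the thousands for a wide-area network, which motivates distributing this load across Grid nodes.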
Another problem is that a greater part of the positioning process may shift from user
terminals to the network centre, which needs the ability to cope with hundreds of
simultaneous users’ requests. Conventionally, most of the positioning process is done
at the user end. As more of the positioning process is handled at the network centre
or servers, also called reverse RTK, real-time positioning services face a significant
challenge (Lim and Rizos 2007).
To summarize, there are four main aspects that will challenge the computation of
NRTK data processing.
• Multiple GNSS constellations and multiple frequencies
• Large scale, wide area NRTK services with more reference stations
• Increasingly complex computation algorithms for network-based processing
• A large part of the positioning computation process shifting from the user end
to the network centre, with the ability to cope with hundreds of simultaneous
users’ requests (reverse RTK)
In addition to computational requirements, data sharing should also be a significant
post-processing requirement for NRTK. Data sharing among processing centres and
servers sitting in different CORS networks and IGS reference stations/servers will
become very frequent as CORS networks become more extensive. Such data sharing
will bring substantial benefits to nation-wide geodetic activities, for example the
backup of one state’s data in another state’s processing centre in case of an
earthquake.
This research proposes a Grid Computing based framework to address the problems
mentioned above. As the whole project evolves and approaches full Australian
coverage at a later stage, as shown in Figure 1.2, the NPCs of every state may be
integrated into a whole. This network infrastructure will act as the hardware base for
the Grid Computing environment. The corresponding software architecture is also
discussed in detail in this thesis.
Figure 1.2 Planned Distribution of New AuScope GNSS Stations (Rizos 2007)
It is important to note that although the research questions come from the CRCSI
project, the work does not reflect the official position and viewpoints of the project
team of QUT, NRW and UNSW. Instead, the research provides an independent
perspective to the problems.
1.3 Research Aims and Objectives
The aim of this research is to explore the use of Grid Computing as a solution that
provides a high-performance computing platform, thus meeting the requirement of
providing real-time or near real-time corrections to users. The objectives of this
research are outlined below:
• To analyse the time requirements of large scale NRTK.
At the very first stage, the timeliness requirements of large-scale NRTK will be
analysed based on the existing research problems. These requirements will serve as
the basis of the framework to be designed. They are also generic enough to be used
by other approaches to large-scale NRTK problems.
• To propose a Grid Computing-based framework to process data from
multiple reference stations.
This is the main objective of this research effort. A robust, reusable and compact
framework will be developed to cope with the proposed research problems. Many
factors, such as time, computational capability, and the heterogeneous environment,
should be considered during the design process.
• To implement the framework to demonstrate the idea of Grid.
A full implementation of the above framework for Network RTK services (data
collection, correction generation, positioning, etc) involves significant
development efforts and is beyond the scope of this research. A simplified data
processing experiment is instead used as a proof-of-concept demonstration for
the proposed framework based on Grid Computing.
1.4 Structure of Thesis
Apart from a review of existing methods and solutions, this research effort consists
of four tasks, which are conducted in four phases: 1) Examine timeliness
requirements of large scale NRTK; 2) Build a Grid Computing environment for
NRTK data processing using Globus; 3) Reconstruct reference station algorithms
adapted to Grid Computing; 4) Test simplified NRTK positioning cases in the Grid
Computing environment.
The whole thesis is divided into five chapters. The first chapter gives some
background of the research topic and research problems. The second chapter is
dedicated to an overview of NRTK and Grid Computing. The third chapter presents
the requirements of data processing and design of NRTK data processing based on
Grid Computing. The fourth chapter discusses how the Ntrip data processing
experiment is conducted and presents the test results. The final chapter offers
concluding remarks and a brief outline of future work.
2. Review of NRTK and Grid Computing
2.1 Overview of GNSS
This section provides a general overview of GNSS. Topics such as different GNSS
systems, GNSS applications, and various Real Time Kinematic (RTK) principles and
concepts are covered.
2.1.1 GNSS Systems
GNSS Navigation and Positioning
Firstly, we outline the concept of navigation. Navigation is simply about how we can
go from point A to point B. GNSS stands for Global Navigation Satellite
System. Generally speaking, GNSS tells us where we are and how to get to our
destination if we have a GNSS receiver. A GNSS allows receivers to determine their
states (longitude, latitude, altitude, velocity and time) with positioning accuracy of a
few to tens of metres using time signals transmitted along a line of sight by radio
from the satellites. Receivers on the ground with a fixed position can also be used as
reference stations for precise positioning services and scientific purposes (Misra and
Enge 2006).
There are several existing and planned GNSS navigation systems, such as
NAVSTAR Global Positioning System (GPS) of U.S., GLONASS of Russia, Galileo
of Europe and Compass of China. GPS is the first of the new generation of
navigation satellite systems to become operational and is likely to remain the only
fully operational system at least until 2010. GPS cost the U.S. government US$10
billion to develop and costs about US$500 million annually to operate and maintain
(Misra and Enge 2001; Misra and Enge 2006).
GPS consists of three segments: the space segment, the control segment and the user segment.
The Department of Defence (DoD) is responsible for both the Space and Control
Segments. The Space segment shown in Figure 2.1 is composed of satellites in the
sky, which are also called space vehicles (SVs). The baseline constellation comprises
24 satellites deployed in nearly circular orbits with a radius of 26,560 km, a period of
approximately twelve hours, and stationary ground tracks. There are several kinds of
satellites: Block I, Block II, Block IIA, Block IIR (the ‘R’ stands for replenishment)
and Block IIR-M. Block IIR-M and the future GPS III satellites (or Block IIF/Block
III) will transmit additional frequencies. Early sailors relied on stars in the sky to
judge their direction and had to rely on experience when it was rainy or cloudy,
whereas satellites, as man-made stars, function around the clock throughout the year.
Thanks to satellites, it is now much easier to determine where we are on the Earth.
Figure 2.1 Constellations of GPS (Wikipedia 2007)
The Control segment consists of Master Control Station (MCS), monitor stations and
ground antennas. At the heart of the Control Segment is the Master Control Station,
located at the Schriever (formerly named Falcon) Air Force Base near Colorado
Springs, Colorado. The Master Control Station operates the system and provides
command and control functions. The specific functions of the Control Segment are:
• to monitor satellite orbits,
• to monitor and maintain satellite health,
• to maintain GPS Time,
• to predict satellite ephemerides and clock parameters,
• to update satellite navigation messages,
• to command small manoeuvres of satellites to maintain orbit, and relocations
to compensate for failures, as needed
The User segment consists of user receivers. The success of GPS in large-scale civil
use is attributable almost entirely to the revolution in integrated circuits, which has
made receivers compact, light, and an order of magnitude less expensive than was
thought possible twenty years ago. Much higher capabilities and lower receiver
prices have been achieved since the first-generation receivers were designed for
precise positioning.
GPS satellites transmit two kinds of codes: the C/A code for civilian use and the
P(Y) code for military use. Signals are transmitted from the satellites to users on two
frequencies, L1 and L2. Receivers can obtain the C/A code, P(Y) code and carrier
phase measurements from the L1 frequency, and the P(Y) code and carrier phase
measurements from the L2 frequency. Two further civilian signals will become
available in the future: the L2C code on the L2 frequency and the future L5
frequency. As most GNSS systems, except Galileo, were originally developed for
military use, there are always some encrypted codes or frequencies that cannot be
accessed by civilian users.
The fundamental positioning principle is shown in Figure 2.2. Basically, at least four satellites are needed to determine the three-dimensional (x, y, z) position of the receiver together with the receiver clock offset. A variety of GNSS positioning errors affect this process, such as satellite clock and ephemeris errors, ionospheric delay, tropospheric delay, multipath and receiver noise. This is why the RTK concept was introduced: to reduce these positioning errors and provide a better positioning service.
Figure 2.2 Fundamental Principle of Positioning
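The four-unknown solution described above can be sketched numerically. This is a toy illustration under simplifying assumptions, not an operational solver: the satellite geometry, the error-free measurement model and the fixed Gauss-Newton iteration count are all illustrative choices.

```python
import numpy as np

def solve_position(sat_pos, pseudoranges, iters=10):
    """Gauss-Newton solution for receiver position and clock bias.

    sat_pos: (n, 3) satellite positions in metres (n >= 4)
    pseudoranges: (n,) measured pseudoranges in metres
    Returns [X, Y, Z, c*dt].
    """
    x = np.zeros(4)  # start at the Earth's centre with zero clock bias
    for _ in range(iters):
        rho = np.linalg.norm(sat_pos - x[:3], axis=1)      # geometric ranges
        predicted = rho + x[3]                              # add receiver clock term
        # Jacobian: unit vectors from satellites towards the receiver, plus a clock column
        H = np.hstack([(x[:3] - sat_pos) / rho[:, None],
                       np.ones((len(rho), 1))])
        dx, *_ = np.linalg.lstsq(H, pseudoranges - predicted, rcond=None)
        x = x + dx
    return x
```

With perfect, error-free pseudoranges this converges to the true position; with real measurements, the residual errors listed above remain, which is exactly what motivates RTK.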
GNSS Applications
Applications of GNSS exist in both military and civilian fields. GPS and GLONASS
were first developed and adopted for dedicated military use, but they are now
becoming more and more popular for civilian usage. The Galileo and Compass
systems under development are designed for dual use, although some services may
be accessed by authorised users only.
The civil applications of GNSS-based positioning and navigation may be divided
roughly into (Misra and Enge 2006)
• mass market, such as vehicle navigation
• specialized applications such as aviation, transport and space navigation
• professional level, such as high-precision (millimetre-to-centimetre-level)
positioning
The mass market has the largest number of user receivers and accounts for about 90% of the whole GNSS application market, but the accuracy requirements of these consumers are not very strict, usually one to ten metres. More specialised applications, such as safety-of-life transport, liability-critical applications and space navigation, require more accurate positioning results and a more reliable service, also referred to as the integrity and high availability of the GNSS positioning service. The third category is professional-level, high-precision (millimetre-to-centimetre-level) positioning, which is the main concern of this research.
2.1.2 Real Time Kinematic (RTK) Positioning
Basic GNSS Positioning Methods
Two types of tracking ranges are available from GNSS systems to support various positioning methods: code ranges and phase ranges. Code tracking provides estimates of the instantaneous ranges to the satellites, mainly used for single-point positioning or navigation; because the code measurements taken at an instant from different satellites contain measurement errors, they are referred to as pseudoranges. Carrier phase tracking provides measurements of the received carrier phase relative to the phase of a duplicate sinusoidal signal generated by the receiver clock. The carrier phase gives a precise measurement of the change in the satellite-user pseudorange over a time interval, and an estimate of its instantaneous rate of change, or Doppler frequency (Misra and Enge 2006).
RTK Positioning Concept and Principles
Based on the basic positioning methods, RTK was developed to reduce the
positioning errors, and thus provides high-precision (centimetre-level) positioning in
real time through the use of GNSS and communication links.
Positioning errors are similar for users located close to each other. In RTK, a stationary reference station transmits corrections via a communication link to user receivers located within a certain area, usually less than 20 km across, as Figure 2.3 illustrates.
Figure 2.3 RTK Positioning Illustrations
RTK, which is used for kinematic surveying, relies primarily on precise carrier phase
measurements. The most critical part in RTK is rapid ambiguity resolution (AR).
Hence the base stations must be deployed in a dense enough pattern to model
distance-dependent errors to such an accuracy that residual double-differenced
carrier phase observable errors can be ignored in the context of such rapid AR (Rizos
2003), which implies a much higher deployment cost. RTK positioning has therefore been extended from a single-base to a multi-base technique, NRTK.
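The double-differencing at the heart of RTK can be sketched as follows. The dictionary representation and the choice of the lowest-numbered satellite as reference are illustrative assumptions; in practice the highest-elevation satellite is usually chosen as the reference.

```python
def double_difference(phi_base, phi_rover):
    """Form double-differenced observables relative to a reference satellite.

    phi_base, phi_rover: dict sat_id -> carrier phase observation (metres)
    Returns dict sat_id -> DD observable for each non-reference satellite.
    """
    sats = sorted(phi_base)
    ref = sats[0]
    # Single differences between receivers: the receiver clock errors cancel.
    sd = {s: phi_rover[s] - phi_base[s] for s in sats}
    # Double differences between satellites: the satellite clock errors cancel too.
    return {s: sd[s] - sd[ref] for s in sats[1:]}
```

What remains in the double-differenced observable is the geometric term, the integer ambiguity and the residual distance-dependent biases, which is why rapid AR hinges on how well those residual biases are modelled.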
Network-based RTK
One significant drawback of the single-base RTK approach is that the maximum
distance between a reference and a rover receiver is limited, for instance, 10 to 20
kilometres (Rizos 2003), in order to be able to rapidly and reliably resolve the carrier
phase ambiguities. This limitation is caused by distance-dependent biases such as
orbit error, and ionospheric and tropospheric signal refraction. These errors,
however, can be accurately modelled using the measurements of an array of GNSS
reference stations surrounding the rover site. Thus, RTK positioning is extended
from a single base to a multi-base technique. NRTK allows the baselines between reference stations to increase from 10-20 km up to 70 km or even longer. In this way,
NRTK also reduces the number of reference stations needed, which saves significant
cost to the operators and service providers (Wanninger 2006).
NRTK positioning has become a success story in more and more states and countries in recent years. Commercial models have been developed in which users pay to access the reference data they need. At the same time, NRTK cuts costs dramatically by reducing the number of reference stations: according to data from some projects (Wanninger 2006), the required density can drop from about 30 reference stations per 10,000 square kilometres for single-base RTK to 5 to 10 per 10,000 square kilometres for NRTK. An example of the coverage of single-base RTK compared with NRTK is given in Figure 2.4, in which the circles represent the coverage of single-base RTK; coverage increases substantially once single-base RTK systems are linked together to form an NRTK network.
Figure 2.4 Illustration of Single Base RTK and NRTK
Three major steps are used in NRTK (Wanninger 2006):

• fixing the ambiguities among the baselines of the reference network,
• estimating correction model coefficients (modelling the distance-dependent biases), and
• computing an optimum set of reference observations from the observations of a selected master reference station.
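As a rough illustration of the second step, the sketch below models a distance-dependent bias as a plane fitted over the reference network. The planar model and the station coordinates are illustrative assumptions; operational networks use considerably more elaborate correction models.

```python
import numpy as np

def fit_correction_plane(station_xy, corrections):
    """Least-squares fit of c(x, y) = a0 + a1*x + a2*y through the
    corrections observed at three or more reference stations."""
    A = np.column_stack([np.ones(len(station_xy)), station_xy])
    coeffs, *_ = np.linalg.lstsq(A, corrections, rcond=None)
    return coeffs

def interpolate_correction(coeffs, xy):
    """Evaluate the fitted correction plane at an arbitrary (rover) position."""
    return coeffs[0] + coeffs[1] * xy[0] + coeffs[2] * xy[1]
```

A rover anywhere inside the network then receives an interpolated correction rather than the raw correction of its nearest station, which is what lets baselines grow beyond the single-base limit.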
VRS
One of the most popular NRTK concepts is the Virtual Reference Station (VRS), first introduced in the German reference station network SAPOS (Satellite Positioning Service of the German National Survey). The approach takes its name from the fact that observations for a "virtual", non-existent station are created from the real observations of a multiple-reference-station network. This eliminates or reduces systematic errors in the reference station data, allowing a greater distance separation from the reference station for RTK positioning while increasing the reliability of the system and reducing the initialisation time (Retscher 2002; Wanninger 2003). The system architecture of the virtual reference station concept is depicted in Figure 2.5 (Wanninger 2006): a virtual reference station is formed from several reference stations surrounding the user to facilitate the positioning service.
Figure 2.5 System Architecture of the Virtual Reference Station (Wanninger 2006)
Other important concepts relating to NRTK include Continuously Operating
Reference Stations (CORS), Radio Technical Commission for Maritime Services
(RTCM), and Networked Transport of RTCM via Internet Protocol (Ntrip). A brief
introduction to each of them is given below.
CORS
CORS are groups of permanently operating reference stations that form the infrastructure for NRTK. CORS networks can be divided into several categories: global/continental, national, regional and local. Representative examples are the continental CORS run by the International GNSS Service (IGS) and the American National CORS maintained by the National Geodetic Survey (NGS 2006). There are also several CORS networks in Australia: GPSnet in Victoria, SydNET in New South Wales and SunPOZ in Queensland. Most CORS data can be retrieved free of charge using client software, which will be introduced later.
RTCM
RTCM here refers to the data format defined in the recommended standards of the Radio Technical Commission for Maritime Services, usually RTCM 3.x for NRTK (RTCM 2007). An RTCM observation data stream contains quite a few message types. For example, message type 1 carries DGPS data (pseudorange corrections and range-rate corrections), type 3 carries reference station coordinates, and types 18 and 19 carry RTK data (uncorrected carrier phase and uncorrected pseudorange measurements). The RTCM specification can be purchased from the Radio Technical Commission for Maritime Services.
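For illustration, the sketch below pulls the message number out of a single RTCM 3.x transport frame. The frame begins with a 0xD3 preamble byte, followed by 6 reserved bits and a 10-bit payload length; the message number occupies the first 12 bits of the payload. The trailing 24-bit CRC (CRC-24Q) check is omitted here for brevity.

```python
def parse_rtcm3_frame(frame: bytes):
    """Extract the payload and 12-bit message number from one RTCM 3.x frame.

    Frame layout: 0xD3 preamble (1 byte), 6 reserved bits plus a 10-bit
    payload length (2 bytes), the payload, then a 24-bit CRC (not checked).
    """
    if frame[0] != 0xD3:
        raise ValueError("missing RTCM3 preamble 0xD3")
    length = ((frame[1] & 0x03) << 8) | frame[2]         # 10-bit payload length
    payload = frame[3:3 + length]
    msg_number = (payload[0] << 4) | (payload[1] >> 4)   # first 12 bits of payload
    return msg_number, payload
```

A real decoder would verify the CRC-24Q before trusting the payload and then dispatch on the message number to a type-specific field parser.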
Ntrip
Ntrip is an application-level protocol for streaming GNSS data over the Internet. It is a generic, stateless protocol based on the Hypertext Transfer Protocol HTTP/1.1, with HTTP objects extended to carry GNSS data streams (Lenz 2004).
Ntrip is an RTCM standard designed for disseminating differential correction data
(e.g. in the RTCM-104 format) or other kinds of GNSS streaming data to stationary
or mobile users over the Internet, allowing simultaneous PC, Laptop, PDA, or
receiver connections to a broadcasting host. Ntrip supports wireless Internet access
through Mobile IP Networks like Global System for Mobile Communication (GSM),
General Packet Radio Services (GPRS), Enhanced Data rates for GSM Evolution
(EDGE), or Universal Mobile Telecommunication Service (UMTS).
Ntrip is implemented in three system software components: NtripClients,
NtripServers and NtripCasters. The NtripCaster is the actual HTTP server program
whereas NtripClient and NtripServer are acting as HTTP clients (FACC 2007).
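Because Ntrip builds on HTTP, an NtripClient's initial request is ordinary text. Below is a hedged sketch of constructing a version 1.0 request; the mountpoint and credentials are placeholders, and a real client would send these bytes over a TCP socket to the caster and expect an "ICY 200 OK" reply before the correction stream begins.

```python
import base64

def ntrip_request(mountpoint, username, password):
    """Build an Ntrip 1.0 request for a caster mountpoint."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return (f"GET /{mountpoint} HTTP/1.0\r\n"
            f"User-Agent: NTRIP ExampleClient/1.0\r\n"
            f"Authorization: Basic {token}\r\n"
            "\r\n").encode()
```

Requesting the root path instead of a mountpoint returns the caster's source table, which lists the available streams.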
To summarise, CORS provides the base infrastructure for NRTK, while RTCM 3.x and Ntrip define, respectively, the data format and the transport protocol used for data transmission in NRTK.
Real Time Kinematic and near Real Time Kinematic
The difference between Real Time Kinematic and near Real Time Kinematic must be stated here to make the later parts of the thesis clearer: a Real Time Kinematic service must be delivered within 1 second, whereas near Real Time Kinematic may take several seconds.
2.2 Overview of Grid Computing
This section provides a brief review for various Grid computing concepts and
applications.
2.2.1 Grid Computing
A concept closely related to Grid Computing is Peer-to-Peer (P2P) Computing: the sharing of resources between computers, including processing power, knowledge, disk storage and information from distributed databases (Kamath 2001). In the last several years, about half a billion dollars have been invested in companies developing P2P systems (Loo 2007), interest driven by the success of several high-profile P2P applications such as the Napster and Oxford anti-cancer projects (Loo 2003). The possibility of using P2P Computing in our framework is explored in Chapter 3, which designs NRTK data processing based on Grid Computing.
The word Grid is used by analogy with the electric power grid, which provides pervasive electric power and has fundamentally changed our way of living. In 1998, the Grid was defined as follows: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities" (Foster and Kesselman 2004). This definition concentrates on computational capability: the aggregated resources of a Grid can exceed those of the largest single computer in the world. Later definitions broaden the focus to coordinated resource sharing and problem solving in multi-institutional Virtual Organizations (VO), where a Virtual Organization is defined as a dynamic set of individuals and/or institutions defined around a set of resource-sharing rules and conditions (Foster and Kesselman 2004). A simplified Grid topology consisting of three Grid nodes is shown in Figure 2.6. The grid nodes can be computer servers or clusters located in different places, connected by various kinds of communication links, such as optical fibre or twisted pair.
Figure 2.6 Simplified Grid Topology
In the context of this research project, a VO comprises grid nodes sitting in the processing centres of different states; the computing resources of these VOs are combined to form the Grid's strong computing capability.
Grid Computing has evolved as an important field in the computer industry by
differentiating itself from distributed computing with an increased focus on resource
sharing, coordination, and high-performance orientation. Grid Computing is trying to
solve the problems associated with resource sharing among a set of individuals or
groups. Figure 2.7 illustrates a layered grid architecture and its relationship to the
Internet Protocol architecture.
Figure 2.7 Layered Grid Architecture vs. the Internet Protocol Architecture
In addition to resource sharing and the formation of virtual organizations, open standards are a key underpinning. Open standards should be used throughout the grid implementation, accommodating other standards-based protocols and frameworks to provide interoperable environments (Joseph and Fellenstein 2004). Grid Computing environments must therefore be constructed upon the following foundations (Joseph and Fellenstein 2004):

• coordinated resources, and
• open standard protocols and frameworks.
There are many organizations in the world striving to achieve new and innovative
Grid Computing environments. Figure 2.8 depicts the Grid Computing organizations
(Joseph and Fellenstein 2004).
Figure 2.8 Basic Classifications of Grid Computing Organizations
Successful adoption of Grid Computing requires adequate infrastructure, security and other key components. The Globus project is a multi-institutional research effort to create a basic infrastructure and high-level services for a computational grid. The Globus GT4 middleware, core and high-level services provide a wide variety of capabilities; Figure 2.9 illustrates the Globus infrastructure based on GT4.
Figure 2.9 Globus GT4 Infrastructures
In the early stages of grid application development, numerous middleware solutions were developed to construct Grid Computing environments. Today, with the emergence of grid service-oriented technologies, including increasingly popular XML-based solutions, valuable solutions are simpler to achieve, and grid middleware is becoming more sophisticated at an aggressive rate. Figure 2.10 shows the topology of these middleware topics.
Figure 2.10 Topology of Grid

The emergence of the Open Grid Services Architecture (OGSA) in 2002 led to a true
community standard with multiple implementations, including, in particular, the
OGSA-based GT 3.0, released in 2003. Building on and significantly extending GT2
concepts and technologies, OGSA firmly aligns Grid Computing with broad industry
initiatives in Service-Oriented Architecture (SOA) and web services. OGSA
architecture is shown in Figure 2.11. A fundamental OGSA concept is that of the
Grid service: a Web service that implements standard interfaces, behaviours, and
conventions that collectively allow for services that can be transient (i.e., can be
created and destroyed) and stateful (i.e., we can distinguish one service instance from
another).
Figure 2.11 OGSA Platform Architecture
The foundational Open Grid Services Infrastructure (OGSI) specification defines the
interfaces, behaviours, and conventions that control how Grid services can be
created, destroyed, named, monitored, and so forth. OGSI defines a set of building
blocks that can then be used to implement a variety of resource layer and collective
layer interfaces and behaviours (Foster and Kesselman 2004).
The WS-Resource Framework (WSRF) is the latest web service standard. WSRF is also known as the stateful web service model, where state refers to the resources the web service needs. OGSI was the model used in Globus Toolkit 3.0, and it is now being replaced by WSRF, WS-Security and the broader set of Web services standards.
2.2.2 Security
Security is a critical issue for Grid Computing: grid nodes must first trust each other before their computing resources, such as CPU cycles and memory, can be shared in a Grid Computing environment. Privacy, integrity and authentication are regarded as the three pillars of secure communication. Privacy ensures the communication between sender and receiver remains confidential, and Public Key Infrastructure (PKI) is the most popular mechanism for implementing it. Integrity ensures the data has not been modified in transit. Authentication verifies that information was indeed sent by the declared sender; the digital signature is the mature technology for realising authentication. A typical secured communication scenario is shown in Figure 2.12, which indicates the role each security pillar plays.
Figure 2.12 Simplified Scenario of Typical Secured Communication
In the first step, the sender encrypts the data with the receiver's public key and sends it to the receiver, who decrypts it with his own private key. These two steps use the PKI mechanism to ensure the privacy of the transferred data. At the same time, the sender also sends a digest, a summary of the original data produced by a hashing algorithm. The final step is for the receiver to generate a digest of the received data using the same algorithm and compare it with the received digest. If the two are identical, the data was not altered by anyone during transfer; otherwise the integrity of the data is compromised and the received data must be discarded. In fact, the digest itself is encrypted with the sender's private key, and the receiver must use the sender's public key to decrypt it, a step omitted from Figure 2.12. Since only the sender holds his private key, a successfully decrypted digest means the sender cannot deny having sent the data. To some extent, this process also guarantees the authentication of the data transfer.
Another important concept in computer security, although not generally considered a 'pillar' of secure communication, is authorization (Sotomayor and Childers 2006), which is used to grant privileges to users. In a Grid Computing environment, a Certificate Authority (CA) is responsible for issuing certificates to users as part of this authorisation process.
2.2.3 Web Service
Figure 2.13 Web Service Architecture
Figure 2.13 depicts the classic web service architecture, which is made up of a service requester, a service broker and a service provider. The service provider registers its service with the service broker using a technology such as UDDI; when the service requester asks for a service, the broker returns the corresponding service description (WSDL) address, and the requester then binds to the provider through SOAP. The whole process is known as SOAP/WSDL/UDDI, after the three key technologies used.
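The publish/find part of this cycle can be caricatured with a toy in-memory broker. This is a deliberate simplification: real brokers implement the UDDI protocol and return full WSDL documents rather than bare URLs, and binding happens via SOAP messages.

```python
class ServiceBroker:
    """Toy registry illustrating the broker's publish and find roles."""

    def __init__(self):
        self._registry = {}

    def publish(self, service_name, wsdl_url):
        """Provider registers a service description address with the broker."""
        self._registry[service_name] = wsdl_url

    def find(self, service_name):
        """Requester looks up a service; returns None if it is unknown."""
        return self._registry.get(service_name)
```

In the NRTK framework, a processing-centre service would publish itself in this way so that clients can discover it before binding.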
2.2.4 Job Scheduling

Job scheduling is one of the most important topics in Grid Computing: a central task of a Grid is to distribute users' requests across different Grid nodes, which is the essence of job scheduling. Job scheduling has been studied extensively over the past 50 years, including intelligent scheduling algorithms and meta-scheduling. Classic scheduling models deal with job shop, flow shop, open shop, cycle shop and online scheduling, among others; some new scheduling models have emerged from applications in computer science, while others come from the operations research and management community (Leung 2004).

For data-intensive projects, data location must also be taken into consideration when scheduling jobs. Scheduling must sometimes account for other localised factors as well, such as the distribution of computing resources, so it is rarely easy to devise an optimal job scheduling policy.
2.2.5 Globus
Globus (Globus 2008) is the most popular Grid Computing toolkit, developed by the Globus team at the University of Chicago and Argonne National Laboratory. At this stage, Globus is the de facto standard for Grid Computing, and the latest version is Globus Toolkit 4 (GT4).

Globus's kernel components include GSI for security, GridFTP for data management, WS GRAM (Feller, Foster et al. 2007) for execution management, Index for information services and the Python/C/Java Core for the common runtime, as Figure 2.14 depicts. Most Globus components, such as WS GRAM and the Common Runtime, are built upon web services. Some, however, are not web service based, such as GridFTP, which uses a Globus-specific protocol that has since become an industry standard.
Figure 2.14 Globus Components (Sotomayor and Childers 2006)
The security component underpins the other components and comprises various authentication, authorisation and delegation solutions. Data management is designed for high-throughput data transmission. Execution management takes care of job scheduling and maximises the performance and efficiency of job execution; local adapters are normally needed to achieve this, although the Fork adapter provided with Globus can be used as a simple solution. The function of information services is to monitor and discover resources and services on Grids, with the Monitoring and Discovery System being one such solution. The common runtime consists of web-service-based client solutions, such as the C Runtime and Java Runtime, and serves as the Globus interface for both users and programmers.
A schematic view of the GT4.0 components is given in Figure 2.15. Requests from the client end, written in Java, C or Python, are forwarded to the server end and authenticated by WS-Security with GSI. At the server end, client requests are processed by the corresponding components, implemented either as Java services in Apache Axis plus GT libraries and handlers, or as C services using GT libraries and handlers.
Figure 2.15 Schematic View of GT4.0 Components (Foster 2005)
In the next chapter we demonstrate how to utilise the Globus Toolkit as the basis of our framework for NRTK data processing. The Globus Toolkit will provide the security infrastructure for the whole Grid environment and will be able to schedule jobs through web services.

2.2.6 Grid Computing Applications

Grid Computing has been successfully applied in many fields. SAP AG has modified its flagship product to be Grid-based, including applications that form part of SAP's core product suite, such as Workforce Management (WFM), Customer Relationship Management (CRM) and Supply Chain Management (SCM), which dynamically adjust the number of worker processes used to meet computational demands.
Grid is also extensively used in high-performance data capture, the network for earthquake engineering simulation, the Earth System Grid (ESG), the Open Science Grid, the Biomedical Informatics Research Network (Ellisman and Peltier 2004) and federated computing for high-energy physics. In the famous Large Hadron Collider (LHC) project (CERN 2008) operated by the European Organisation for Nuclear Research (CERN), the Grid facility helps transfer the large amount of experimental output data (roughly 15 Petabytes) to tens of countries around the world for round-the-clock analysis. In the biomedical and biochemical industry, large computations for experiments are carried out in Grid environments. A single-sign-on portal based on Grid computing provides Geoscience community researchers with access to a myriad of computational and data resources (R.Fraser, T.Rankine et al. 2007).

2.3 Relationship between NRTK and Grid Computing

The concepts of NRTK and Grid Computing are very similar from a high-level perspective. Just as there are many reference stations in NRTK, there are many Grid nodes in Grid Computing: each reference station, or each server in a Network Processing Centre, can be mapped to one Grid node. Grid Computing will provide the high computational capability that NRTK needs by utilising the computational resources of every Grid node. Globus will be installed in each Grid node to collect data or carry out computing tasks.

As far as concrete computational tasks are concerned, Grid Computing will mainly take charge of background processing to support the overall real-time positioning service. As Table 2.1 shows, most background processing need only be conducted every several minutes or hours. In this way, the computational capability of the Grid can be used to run these background processes and send the results back to a server, which is responsible for integrating all the results together to facilitate the real-time processing service.

Table 2.1 Background Processing and Time Requirements

Background processing | Time requirement of updating
Orbital interpolation | Every 2 hours for each satellite
Zenith Tropospheric Delay (ZTD) estimation | Every 5 minutes
Ionospheric grid map generation | Every 30 to 60 seconds
2.4 Summary

This chapter has given the background knowledge of NRTK and Grid Computing, which is the basis of this research topic. A review of GNSS was presented first, followed by Real-Time Kinematic (RTK) positioning. In the RTK positioning review, the concept of NRTK was introduced along with related topics such as CORS, RTCM and Ntrip. The second part of the chapter summarised the Grid Computing literature; critical aspects such as Virtual Organisations, security, web services, job scheduling and the de facto toolkit Globus were briefly presented.

This chapter has been dedicated to the background knowledge review. The design of the new framework for NRTK data processing is presented in the next chapter.
3. Design of Framework for NRTK Data Processing Based on Grid Computing
This chapter provides the design of the grid computing based framework for NRTK.
As stated in the statements of research problems (see Chapter 1), there are mainly
four aspects that will bring challenges to the computation of NRTK.
• Multiple GNSS constellations and multiple frequencies
• Large scale, wide area NRTK services with more reference stations
• Complex computation algorithms for network-based processing
• Greater part of positioning processes shifting from the user end to the network centre, with the ability to cope with hundreds of simultaneous users' requests (reverse RTK)
Section 3.1 of this chapter provides the grid computing framework design
requirements that were identified based on the four challenges faced by future NRTK
systems. Section 3.2 provides a detailed architecture design of the current NRTK
system as well as the general grid computing architecture. Section 3.3 outlines the
developed framework for this research. Each of the different layers in the framework
is presented and discussed.
3.1 Requirements of NRTK Data Processing
There are two main requirements for NRTK data processing, expandable computing
power and scalable data sharing/transferring capability.
Computational requirement
As far as concrete computational tasks are concerned, Grid Computing will mainly take charge of background processing to support the overall real-time positioning service. Background processing refers to time-consuming processes such as preparing GNSS orbital polynomial functions, updating reference station coordinates, estimating tropospheric grid models and generating ionospheric grid maps. These corrections vary slowly and can be updated with different delays, for instance up to 30 seconds for ionospheric grids and a few minutes for tropospheric corrections, as Table 3.1 shows. In this way, the computational capability of the Grid can be used to run these background processes and send the results back to a server, which is responsible for integrating all the results to facilitate the real-time processing service.
Table 3.1 Background Processing and Time Requirements

Background processing | Time requirement of updating
Orbital interpolation | Every 2 hours for each satellite
Zenith Tropospheric Delay (ZTD) estimation | Every 5 minutes
Ionospheric grid map generation | Every 30 to 60 seconds
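The update cadences in Table 3.1 can be expressed as a trivial due-task check that a coordinating server might run once per epoch. The task names and the fixed 30-second ionospheric interval are illustrative assumptions taken from the table's lower bound.

```python
# Update intervals in seconds, following Table 3.1.
INTERVALS = {
    "orbital_interpolation": 7200,   # every 2 hours per satellite
    "ztd_estimation": 300,           # every 5 minutes
    "ionospheric_grid": 30,          # every 30 to 60 seconds (lower bound used)
}

def tasks_due(t_seconds):
    """Return the background tasks due at epoch t (seconds since service start)."""
    return [name for name, period in INTERVALS.items() if t_seconds % period == 0]
```

Each due task would then be dispatched to a Grid node, with the server merging the returned products into the real-time correction stream.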
Data sharing/transferring capability
In addition to the computational requirements, NRTK also has data sharing and transfer requirements. Data sharing among the processing centres of different CORS networks and IGS reference stations will become very frequent as each CORS network's coverage grows. Such data sharing will bring many benefits to nation-wide geodetic activities, for example backing up one state's data in another state's processing centre in case of earthquakes.
Figure 3.1 Data Sharing Service Among Different Sites
As shown in Figure 3.1, sites in different states and places are connected to form a single Virtual Organisation in the Grid Computing environment by trusting each other. Different sites can share intermediate positioning results with one another: for example, SydNET can ask GPSnet and SunPOZ to send data to it, in which case SydNET acts as the Grid server while GPSnet and SunPOZ act as Grid nodes. In fact, every site can serve as either server or node.
One obvious data transfer application is the backup service. Since it is never wise to put all the eggs in one basket, data should be backed up in several places on a daily, weekly or monthly basis, depending on requirements, to prevent disastrous losses in emergencies. The amount of data generated by one reference station can be 1-2 KB per second, about 75 MB per day. The data volumes accrued during the backup process are shown in Table 3.2.
Table 3.2 Data Amount of Backing-up Service

Stations | Per day | Per week | Per month | Per year
10 stations | ~750MB | ~5.25GB | ~160GB | ~2TB*
20 stations | ~1.5GB | ~10.5GB | ~320GB | ~4TB
50 stations | ~3.75GB | ~26.25GB | ~800GB | ~10TB
100 stations | ~7.5GB | ~52.5GB | ~1.6TB* | ~20TB

* 1TB=1024GB, 1GB=1024MB, 1MB=1024KB
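The per-day figures can be reproduced with a one-line conversion. The 1 KB/s rate and binary megabytes are assumptions; at that rate one station logs about 84 MB per day, consistent in order of magnitude with the ~75 MB quoted above.

```python
def daily_volume_mb(stations, rate_kb_per_s=1.0):
    """Raw logging volume per day, in binary megabytes, for a station network."""
    return stations * rate_kb_per_s * 86400 / 1024
```

Scaling the rate towards 2 KB/s doubles every entry, which is why the backup framework must be sized for the upper end of the range.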
Hundreds of reference stations are being deployed around Australia by different
bodies, such as IGS, government departments and private owners. Even a single
CORS network, such as GPSnet, is planning to set up about 100 reference stations
within a few years. The data volume will also increase considerably once GLONASS
and Galileo become fully operational. These changes create a strong requirement for
a consolidated and collaborative data sharing framework among different sites.

Both the critical computational requirements and the large-scale data sharing services
need an integrated collaborative framework to support the whole process, which can be
provided by a well-structured Grid RTK data processing framework.
3.2 Architectural Studies of Grid Computing Based NRTK
The concept of NRTK coincides with Grid Computing to some extent: both are
network architectures, and both offer new, more efficient solutions to emerging
problems.
NRTK can be divided into three parts, as Figure 3.2 illustrates. Part One consists of
reference stations generating data streams at a rate of 1-2 kbps. Part Two is the
Network Processing Centre, which is critical for processing and computing the
received data (from Part One) and forwarding the results to users (Part Three). Part
Three consists of users receiving correction data from the Network Processing Centre
through mobile or wireless networks, such as GSM, GPRS, EDGE, or UMTS. To some
extent the last part of the NRTK architecture is transparent in our project.
(Figure: Part One, the reference stations; Part Two, the Network Processing Centre
running the network processing application on toolkits and middleware such as
Globus; Part Three, the rover users.)
Figure 3.2 NRTK Architecture
3.3 Framework Design
Taking all of the requirements and architectural studies into consideration, the
framework for NRTK data processing based on Grid Computing is designed with
three layers: the Client layer, the Service layer, and the Execution layer. The overall
framework is depicted in Figure 3.3.
(Figure: the Client layer, a Grid portal in Java/C/Python, sends requests through a
task pipe to the Service layer, which provides security, web services and job
scheduling and dispatches work to the Grid nodes of the Execution layer; results
flow back to the client.)
Figure 3.3 Grid Computing based NRTK
Client layer
The Client layer is responsible for interacting directly with the user. When a
request comes in, the Client layer receives the user's approximate position and returns
the accurate position once the result is ready. The user receives not only
the accurate position but also value-added services, such as commercial
services around this position, which can be provided by merchants in the area
(Lim and Rizos 2007). Some products built on the same commercial idea already
exist, such as Google Earth and Google Maps.
Service layer
The request is then forwarded to the Service layer, where the functions that fulfil
the Client's requests are implemented. A Network RTK project is usually so large
that no single organisation can complete all the work; normally, different components
of the whole process are undertaken by different universities and companies. For
example, one organisation may be responsible for the positioning component and
another for integrity monitoring, as depicted in Figure 3.4.
(Figure: Organisation A runs the positioning component and Organisation B the
integrity monitoring, each drawing data through Ntrip clients from Ntrip casters
and an IGS ephemeris server.)
Figure 3.4 Different Organisation Being Responsible for Different Component
Various methods can be used here to implement the requirements, such as the
conventional VRS-construction method or the newly proposed server-based NRTK
framework (Lim and Rizos 2008). As substantial computational capability is needed
in this layer to calculate the VRS for hundreds of users simultaneously, the
computational tasks are distributed to different Grid nodes, such as an NPC in
Queensland or an IGS centre in Germany, through the GT4 scheduler.
To make full use of the computational resources of all Grid nodes, the
whole algorithm should be modularised carefully. For example, if there are six Grid
nodes, the whole algorithm can be modularised as:
1. Data collection and conversion from each station
2. Network-based Ambiguity Resolution
3. Network-based ionospheric grid generations
4. Network-based tropospheric grid generations
5. GNSS orbital corrections
6. Integrity monitoring
If there are only three Grid nodes, however, the computation tasks may be shared
among the nodes, for example one node for the tropospheric bias calculation, one for
the ionospheric bias calculation, and one for the remaining bias calculations, so that
each module is mapped to a Grid node.
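The mapping of modules onto however many Grid nodes are available can be sketched as a simple round-robin assignment. This is only an illustrative scheme, not the GT4 scheduler's actual policy; the module names follow the six-item list above.

```python
# The six processing modules from the text.
MODULES = [
    "data collection and conversion",
    "network-based ambiguity resolution",
    "network-based ionospheric grid generation",
    "network-based tropospheric grid generation",
    "GNSS orbital corrections",
    "integrity monitoring",
]

def assign_modules(modules, node_count):
    """Map modules to Grid node indices round-robin: with six nodes the
    mapping is one-to-one, with three nodes each node runs two modules."""
    plan = {n: [] for n in range(node_count)}
    for i, module in enumerate(modules):
        plan[i % node_count].append(module)
    return plan
```

With three nodes, `assign_modules(MODULES, 3)` gives each node two modules, matching the sharing described above.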
Execution layer
The Execution layer consists of all the Grid nodes, the place where the jobs are
actually done. This layer has to deal with the network topology, security and some
other heterogeneity-related issues. As far as network topology is concerned, the fact
that most reference stations are already connected to servers sitting in different NPCs
(Network Processing Centres) provides a consolidated basis on which to construct a
Grid Computing environment. All we have to do is deal with the interconnection of
the servers and their communication. As illustrated in Figure 3.5, the network will
connect the CORS processing centres of different states, such as GPSnet of Victoria
and SydNET of NSW, together with IGS worldwide stations.
Figure 3.5 Execution Layer Network Topology
Finally, the results are collected from all Grid nodes involved in the processing and
returned to the Service layer, which responds to the Client's request.

The whole framework is loosely coupled, as each layer exposes only its interface to
the other layers. If one layer needs to change, only that layer has to be modified; the
other layers are not affected. Service composition and aggregation will also be used
to reuse some of the basic services, which can then form more meaningful and more
extensive services for the Client layer.
3.4 Design of Different Layers
3.4.1 Client Layer
The Client layer can be designed as a Grid portal, which can be implemented in C,
Java, or Python. The Client layer invokes the endpoints of appropriate Web Services
to pass requests with the necessary parameters to the next layer. After the Service
layer finishes the request, the Client layer collects the results and sends a response if
needed.
According to the client's requirements and budget, different levels of Single Sign On
(SSO) interface or portal can be designed for the Client layer. With a small budget,
the simplest form of SSO can be adopted, as depicted in Table 3.3. Every person,
shown as 'User' in the table, first registers in the portal system and receives a user-id
and password. Originally, every user has a separate username and password pair for
each system, such as the positioning service or the data sharing service; these pairs
are then mapped to the user-id in the portal. The final mapping relationships should
look like Table 3.3. In this way, once the user logs on to the portal, logging on to the
other subsystems is not required.
Table 3.3 User-System Table

User  System  Username  Password
001   A       System A  1234
001   B       System B  2345
001   C       System C  AAAA
001   D       System D  CCCC
001   ...     ...       ...
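The user-system mapping of Table 3.3 can be sketched as a lookup table. The sketch below is illustrative only; the user-ids, usernames and passwords are the hypothetical sample values from the table, and a real portal would keep the store encrypted rather than in a dictionary.

```python
# Hypothetical credential map from Table 3.3: (portal user-id, system)
# maps to that subsystem's (username, password) pair.
CREDENTIAL_MAP = {
    ("001", "A"): ("System A", "1234"),
    ("001", "B"): ("System B", "2345"),
    ("001", "C"): ("System C", "AAAA"),
    ("001", "D"): ("System D", "CCCC"),
}

def subsystem_login(user_id, system):
    """After one portal login, look up the stored credential pair for a
    subsystem so the user is not prompted again."""
    try:
        return CREDENTIAL_MAP[(user_id, system)]
    except KeyError:
        raise PermissionError(
            "no mapping for user %s on system %s" % (user_id, system))
```

Once user 001 has logged on to the portal, `subsystem_login("001", "A")` yields the pair needed to reach system A transparently.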
Of course, if the client has sufficient budget and time, a more complex but more
robust portal solution can be implemented. For example, Kerberos (Wikipedia 2008)
can be adopted as the authentication protocol. In that case, one or several central
servers must be maintained to store the user-system mapping, along with a third
party to verify the users' and servers' identities.
3.4.2 Service Layer
A web service is a set of functions implemented in traditional languages, such as C
or Java, or in scripting languages. eXtensible Markup Language (XML) is used
to expose a web service for outside invocation, so web services are platform
independent, which is an important feature for Grid Computing given its
heterogeneous environment.
WS-Resource Framework (WSRF) and WS-Resource Transfer (WS-RT) are the
latest web service standards. WSRF is also known as the stateful web service
standard, where state stands for the resources the web service needs. WS-RT
combines elements of the original WSRF standard with the WS-Management
standards to enable easier exchange of resource information and objects between
different components. All the web services developed in this project will run in the
Globus Toolkit, and as only WSRF is supported at this stage, WSRF will be the first
standard we comply with. At a later stage, we may use delegation or XSLT to
migrate WSRF web services to WS-RT compliance if WS-RT is supported by future
versions of the Globus Toolkit (Cafaro 2007).
The Service layer receives requests from the Client layer; the corresponding web
service is then invoked to fulfil the user's request. Several core services are described
below.
Real Time Positioning Service This is the most important service for the NRTK
mission. As it is complex, it has to be decomposed into several smaller services
according to the NRTK positioning algorithm mentioned in the Network RTK part of
Section 2.1.2:

• The first service determines and reconfirms ambiguities among the baselines of
the reference network.

• The second service generates high-rate ranges and updates the correction models
for the distance-dependent biases from all stations.

• The third service computes a user position with observations from the user and a
selected set of reference stations.

• The fourth service, designed as a parallel service to the third, computes a user
position with user observations, correction grids, and high-rate ranges from the
nearest reference station.
The Real Time Positioning service can be invoked by operators in any NPC or
server to provide services to their user group. Of course, this service can also be
invoked by a central server if necessary. Figure 3.6 illustrates the idea of service
decomposition and composition.
Figure 3.6 Service Decomposition and Composition
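The decomposition and composition idea can be sketched as a chain of functions, one per sub-service. The bodies below are stubs, not the real positioning algorithms; every name and return value is illustrative only.

```python
def resolve_network_ambiguities(network_obs):
    # Stub for the first service: confirm ambiguities over the
    # reference-network baselines (real logic omitted).
    return {"ambiguities": "fixed", "baselines": len(network_obs)}

def generate_corrections(network_obs, ambiguities):
    # Stub for the second service: high-rate ranges and correction models
    # for the distance-dependent biases.
    return {"corrections": "grid", "based_on": ambiguities["ambiguities"]}

def compute_user_position(user_obs, corrections):
    # Stub for the third/fourth services: a user position from user
    # observations plus the network corrections.
    return {"position": (0.0, 0.0, 0.0), "used": corrections["corrections"]}

def real_time_positioning(network_obs, user_obs):
    """Composition: the decomposed sub-services invoked in sequence to
    form the overall Real Time Positioning service."""
    amb = resolve_network_ambiguities(network_obs)
    cor = generate_corrections(network_obs, amb)
    return compute_user_position(user_obs, cor)
```

The composed `real_time_positioning` is what the Client layer would see as a single service, while each stage can be scheduled onto a different Grid node.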
Background processing service As defined in 3.1, background processing is
performed after the fact for time-consuming tasks. It processes past data sets
and generates slowly varying corrections. Three major smaller services include:

• Orbital Interpolation service: convert the IGS predicted orbits in SP3 format to
polynomial functions (Feng and Zheng 2005). The process is conducted every 2
hours for each satellite.

• Zenith Tropospheric Delay (ZTD) estimation, based on ambiguity-resolved
double-differenced phase measurements and zero-differenced phase
measurements for each reference station, which are updated every 5 minutes.

• Ionospheric grid map generation, a network adjustment from the smoothed TEC
solution from each station and DD ionospheric solutions, updated every 30 to 60
seconds.

Service decomposition and composition including the background processing service
is shown in Figure 3.7.
Figure 3.7 Background Processing Service Decomposition
Data sharing service The Data sharing service is provided for data sharing
among different processing centres or reference stations. The service is all about
collaboration, such as the sharing of intermediate results, background processing
outputs and solutions for nation-wide real-time, near real-time, and post-mission
services. It is also possible that, if appropriately configured, each facility could
theoretically provide cross-border redundancy to other CORS networks in the case
of disruption to a particular service (Hale 2007).

Data transferring service Basically the data transferring service is based on the File
Transfer Protocol (FTP). As the magnitude of data transferred among different
processing centres can reach terabytes (TB), the GridFTP protocol from the Globus
Toolkit will be adopted to keep the whole process efficient and reliable.

3.4.3 Execution Layer

The execution layer is the most fundamental and important layer in the NRTK data
processing framework. It has to deal with the network topology, security and some
other heterogeneity-related issues.
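As a concrete sketch of the GridFTP-based data transferring service described in Section 3.4.2, the snippet below assembles a globus-url-copy command line for a server-to-server transfer. The host names and paths are hypothetical; the -p flag requests parallel data streams, and the default GridFTP port 2811 is assumed.

```python
def gridftp_copy_command(src_host, src_path, dst_host, dst_path, parallel=4):
    """Assemble a globus-url-copy invocation for a third-party (server to
    server) GridFTP transfer with parallel data streams."""
    src = "gsiftp://%s:2811%s" % (src_host, src_path)
    dst = "gsiftp://%s:2811%s" % (dst_host, dst_path)
    return ["globus-url-copy", "-p", str(parallel), src, dst]
```

A nightly backup script could build and run one such command per archived data file, for example copying a day's observations from one state's processing centre to another's.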
The first issue with the network infrastructure is the network topology. As
illustrated in Figure 3.8, the network will connect the CORS processing centres of
different states, such as GPSnet of Victoria and SydNET of NSW, together with IGS
worldwide stations.
Figure 3.8 A Simplified Network Infrastructure of NRTK Data Processing
The second issue that the design of the network infrastructure has to solve is security.
In the early experimental stage, a simple Certificate Authority (CA) could be adopted
to implement the authentication and authorisation process, as shown in Figure 3.9. One
site will be chosen to act as the simple CA server, and a backup server should also be
considered, as the CA server is the most critical component in this simple CA
security solution. Each site or user has to apply for a user certificate from the simple
CA, through X.509 or other security protocols (Foster, Kesselman et al. 1998).
Figure 3.9 Simple CA Solution for Security
For a more robust solution, a third-party CA has to be introduced to assist the
authentication and authorisation process. MyProxy would be a good third-party CA
centre to choose. Users have to apply for their certificates from the MyProxy CA
centre, and they have to maintain a connection with the centre during a
communication session so that the server can authenticate the users' privileges.
3.5 Summary
This chapter focused on how the layered framework for NRTK data processing is
designed. The high-level framework was introduced first, followed by details of
each individual layer. The framework is designed on the basis of Grid Computing,
namely the Globus Toolkit: the user's request is forwarded through each layer to the
Grid environment and scheduled onto the Grid nodes through Globus.

Chapter 4 demonstrates the framework of Grid Computing for use in Network RTK
data processing and positioning services. The implementation procedures are
described in detail and the results are used to justify the power of Grid Computing.
4. Demonstration of Framework – A Simplified Data Processing Experiment
The implementation of the above framework for full Network RTK services (data
collection, correction generation, positioning, etc.) involves significant development
effort. Thus, a simplified data processing demonstration is used as a proof of
concept for the proposed framework based on the Grid Computing idea. Some open
source software packages are evaluated and used during the demonstration.
First, an overview of the simplified data processing demonstration is presented.
This includes the introduction of the open source Ntrip client (for data gathering and
processing) and an overview of the demonstration environments. The implementation
of the framework is then presented, including the Globus configuration, job
submission, and details of the Ntrip data processing procedures. The last section of
the chapter presents the result analysis and discussion, where different scenarios are
tested and evaluated against various performance measurement factors.
4.1 Overview of a Proof-of-Concept Demonstration
4.1.1 Introduction to Simplified Data Processing Demonstration
The first step of the simplified data processing demonstration is the collection of data
through the open source Ntrip Client software; the concept of Ntrip was introduced
in Chapter 2, and its data flow is briefly described below.

Ntrip is composed of three components: Ntrip Server, Ntrip Caster and Ntrip Client.
The Ntrip Server is responsible for collecting GPS data from various GPS receivers
and reference stations, which is then transferred to an Ntrip Caster, such as EUREF-IP
(www.euref-ip.net:80) or IGS-IP (www.igs-ip.net:80). Every Ntrip Caster address
contains an IP address and a port number. Finally, the Ntrip Client accesses the data
from the Ntrip Caster through mountpoints, which stand for different reference
stations. The overall Ntrip data flow is shown in Figure 4.1 (Weber G. 2005).
(Figure: reference stations feed Ntrip Servers, which stream data to Ntrip Casters
such as www.euref-ip.net:80 and www.igs-ip.net:80; Ntrip Clients access the
streams through mountpoints such as ACOR0, BOCH0 and ZIM20.)
Figure 4.1 Ntrip Data Flow Process
For this demonstration, Ntrip is used to collect data from reference stations.
First, commands are sent from the grid server to grid nodes, requesting data
download from one or multiple mountpoints through the Ntrip Client software. Once
the data has been downloaded, computing jobs are launched for data processing and
analysis. Finally, the results are returned to the server.
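The client end of this data flow can be sketched by assembling an Ntrip 1.0 request, which is essentially a plain HTTP GET naming the mountpoint, optionally with Basic authentication. The user-agent string and credentials below are illustrative; a real client would send this over a TCP socket to the caster's host and port and read the RTCM stream from the response.

```python
import base64

def ntrip_request(mountpoint, user=None, password=None):
    """Assemble an Ntrip 1.0 client request for one caster mountpoint
    (e.g. ACOR0 on www.euref-ip.net:80); credentials are optional."""
    lines = [
        "GET /%s HTTP/1.0" % mountpoint,
        "User-Agent: NTRIP SimpleClient/0.1",  # illustrative agent string
    ]
    if user is not None:
        token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
        lines.append("Authorization: Basic " + token)
    # Ntrip 1.0 requests end with a blank line, like plain HTTP/1.0.
    return "\r\n".join(lines) + "\r\n\r\n"
```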
4.1.2 Evaluation of Open Source Ntrip Client
Open source Ntrip clients are used to obtain the RTCM data in this research
project. There are quite a few open source Ntrip clients provided by the Federal
Agency for Cartography and Geodesy (BKG) of Germany (FACC 2007). Their
tabulated information is given below in Table 4.1.

As Linux was chosen as our operating system, the Ntrip clients that only run under
Windows are excluded. After considering version (maturity), complexity, and a
series of functionality and stability tests, we chose NtripLinuxClient and RTCM 2.x
Decoder as our combined option. NtripLinuxClient takes care of downloading
RTCM 2.x raw data from the Ntrip Caster, and RTCM 2.x Decoder decodes the raw
data to the format illustrated in Table 4.2.
Table 4.1 Open Source Ntrip Client Information

Name                    Operating System  Version  Type  Size    Data format
NtripLinuxClient        Linux             1.2.7    ZIP   ~7 K    RTCM 2.x raw data
NtripPerlClient         Linux             0.6      ZIP   ~15 K   RTCM 2.x raw data
RTCM 2.x Decoder        Linux             1.1      ZIP   17 K    Decoded RTCM 2.x data
Ntrip Client            Linux             1.24     ZIP   17 K    Converts RTCM 3.x to RINEX (fails to function)
GNSS Internet Radio     Windows           1.4.11   EXE   ~680 K  RINEX
BKG Ntrip Client (BNC)  Windows/Linux     1.x      ZIP   ~4 MB   RINEX
GNSS Surfer             Windows           1.06c    ZIP   17 K    Untested
Table 4.2 RTCM 2.x Message

Type01: 4 0 186 2640.0 9.660 0.124
Type01: 5 0 18 2640.0 17.740 0.122
Type01: 6 0 236 2640.0 6.280 0.126
Type01: 9 0 233 2640.0 17.000 0.128
Type01: 14 0 117 2640.0 16.680 0.120
Type01: 22 0 230 2640.0 -33.180 0.060
Type01: 24 0 211 2640.0 8.260 0.116
Type01: 30 0 63 2640.0 18.680 0.120
Type03: 4033461.020 23537.680 4924318.170
Z-Count: 2640.00 0.000000
Type18/19: 0 4 1387 146639 23819816.580 0.000 23819820.460
-306300.343 -306299.597 0 0
Type18/19: 0 5 1387 146639 20033021.480 0.000 20033024.440
-469646.405 -1613658.205 0 0
The explanations for the message types are given as follows:
(1) Records beginning with "Type01:" contain information derived from message
type 1. Parameters 1 to 6 are to be interpreted as:
1 SVPRN
2 User Differential Range Error (UDRE):
0 ... <= 1 m
1 ... > 1 m <= 4 m
2 ... > 4 m <= 8 m
3 ... > 8 m
3 Issue of Data (IOD)
4 Z-Count
5 Pseudo-Range Correction [m]
6 Pseudo-Range Rate Correction [m/s]
(2) Records beginning with "Type03:" contain information derived from message
type 3. Parameters 1 to 3 are to be interpreted as the X coordinate, Y coordinate,
and Z coordinate respectively.
(3) Records beginning with "Type18/19:" contain information derived from
message types 18 and 19. Parameters 1 to 11 are to be interpreted as:
1 Station ID
2 SVPRN
3 GPSWeek
4 GPS WeekSec
5 CA Code on L1
6 P Code on L1
7 P Code on L2
8 L1
9 L2
10 SNR1
11 SNR2
(4) Records beginning with "Z-Count:" contain epoch time information.
Columns 1 to 2 are to be interpreted as:
1 Epoch time
2 Clock error
Only ‘Type 18/19’ records are used in this demonstration; they are converted to
RINEX format as the input of the positioning algorithm.
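The Type 18/19 layout listed above can be sketched as a small parser. The field names are ours, taken from the 11-parameter list; the sample record is the one shown in Table 4.2 (joined onto one line).

```python
# Field order follows the 11-parameter Type 18/19 description above.
FIELDS = ["station_id", "svprn", "gps_week", "gps_week_sec",
          "ca_l1", "p_l1", "p_l2", "l1", "l2", "snr1", "snr2"]

def parse_type18_19(record):
    """Parse one decoded 'Type18/19:' record into a field dictionary."""
    if not record.startswith("Type18/19:"):
        raise ValueError("not a Type18/19 record")
    values = record.split()[1:]
    if len(values) != len(FIELDS):
        raise ValueError("expected 11 parameters, got %d" % len(values))
    parsed = dict(zip(FIELDS, (float(v) for v in values)))
    for key in ("station_id", "svprn", "gps_week"):
        parsed[key] = int(parsed[key])  # these fields are integral
    return parsed
```

Records parsed this way carry everything the subsequent RINEX conversion needs: the epoch (GPS week and seconds), the satellite, and the code and carrier observables.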
4.1.3 Demonstration Environment
The first stage of the demonstration is conducted in a laboratory in the Faculty of
Information Technology, Queensland University of Technology. The hardware
environment is a bus-type topology computer network made up of five grid nodes,
namely five PCs, and a 10/100 Mbps Techworks 5-Port Ethernet Switch. The
configuration of each PC is shown in Table 4.3.
Table 4.3 Configuration of Grid Nodes in Laboratory CPU 2A GHZ, Intel Pentium 4 (400MHZ/266MHZ)
L1 Cache 8K/L2 Cache 512K
RAM 512M (DDR266)/ Cache RAM 512 KB
Ethernet 10/100 Mbps Intel 82562 Ethernet Device
Hard Disk 40G (ST340016A) *2
Operating System Debian
Other Software Core set of software in Debian
As data from different base stations must be downloaded in real time from the
Internet with Ntrip in this demonstration, the ability to access the Internet becomes a
critical issue. There are two ways to access the Internet on the QUT campus, one
wired and one wireless, but both require logging in through a secure window in
order to reach public websites outside the QUT intranet. As no Graphical User
Interface (GUI) is installed on the Debian systems of the Grid nodes, the traditional
way of logging in is unavailable. One laptop was therefore added to the Grid
environment to solve this problem: its wireless network connection is shared in the
Grid environment, while the Internet Protocol (IP) address of its local network card
is set to ‘192.168.0.1’ so that it is in charge of transferring data to the other Grid
nodes. The environment is shown in Figure 4.2.
Figure 4.2 Grid Environment in a Laboratory at QUT
In the second stage, we make use of Grid Australia, which has one node in the High
Performance Computing Centre (HPC) at QUT. From this node we can submit jobs
to several other grid nodes on which we have quota. The grid node at QUT is an
IBM eCluster 1350 machine, a twenty-two (22) processor cluster. The specifications
are shown in Table 4.4.
Table 4.4 Specifications of the Grid Node at QUT CPU 2x 3.4GHz 64bit Intel Xeon processors.
20x 2.4GHz 64bit AMD Opteron processors
Peak Rating 75 Gflop (approx.)
RAM 44 Gigabytes of main memory (11 x 4GB)
Hard Disk 1.5 Terabytes of disk storage
Operating System RedHat Linux Operating System
Other Software a very large range of research software
4.2 Requirements and Design of the Proof-of-Concept Demonstration
The requirements of this demonstration are quite clear and are divided into three
parts:
1. downloading GNSS RTCM data via Ntrip from mountpoints in real-time
2. submitting various computing jobs through Globus
3. processing the data collected in real-time or near real-time (background
process)
The design of this demonstration is similar to the design of the framework for NRTK
based on Grid Computing, as it is a demonstration for proof-of-concept for the whole
framework. The layered high-level architecture is depicted in Figure 4.3.
(Figure: a Grid server dispatches tasks such as real-time positioning, ionospheric
bias calculation, orbital interpolation, ZTD estimation and ionospheric grid map
generation to Grid nodes 1 to n, each of which downloads data from mountpoints
such as ACOR0.)
Figure 4.3 Design of Demonstration
One Grid server (which can in fact be any Grid node in the Grid) takes charge of
sending commands to each Grid node and collecting the results. If the
demonstration were applied in industry, there should be at least two Grid servers for
fault tolerance. The number of Grid nodes should be dynamic and configurable, and
is tested as an indicator of scalability in the later part of this demonstration. Each
Grid node can download RTCM data from one or many mountpoints and perform
different kinds of tasks, such as calculating positions, ionospheric bias estimation,
data file comparison and validation, and so on.

This layered architecture ensures that the Grid server can interact with Grid nodes
flexibly. It also makes the demonstration easy to implement from a technical
standpoint, as the functional algorithms can be modularised and a change in one
layer has little effect on the other layers.
4.3 Implementation of Demonstration
4.3.1 Globus Configuration
Globus is installed on every grid node and on the server to act as the security basis.
There are three types of users of Globus, as shown in Table 4.5. The first type is the
user ‘root’, who can do anything. The second type is the Globus administrative user,
normally named ‘globus’, who performs tasks such as installing Globus and
assigning certificates. The third type is the ordinary Globus user, who can execute
various Globus commands after obtaining a user certificate from the server.
Table 4.5 Basic Parameters of Grid Nodes Server Grid node 1 Grid node 2
IP 192.168.0.1 192.168.0.2 192.168.0.3
Hostname Grid01 Grid02 Grid03
OS Debian 3.1 Debian 3.1 Debian 3.1
Users Root/globus/ade Root/globus/ade Root/globus/ade
The installation process of Globus is quite complex and it is briefly described in
Table 4.6.
Table 4.6 Brief Installation Process of Globus Pre-requisites Zlib/Jdk/Ant/gcc/g++/tar/sed/make/perl/sudo/postgres/libiodbc2/libiodbc2-dev
Building the Toolkit
1. Add a non-privileged user named ‘globus’, which will be used to perform
administrative tasks such as starting and stopping the container, deploying
services, etc.
2. Setup java environment.
3. Build and install the toolkit
Setting up security (different in Server and other grid nodes)
1. SimpleCA (Install SimpleCA on the server, for other grid nodes, just
trust this SimpleCA)
2. Make the machine trust the new CA
3. set up hostcert and sign the certificate using the SimpleCA
4. set up usercert and sign the certificate as user ‘globus’
5. create a grid-mapfile as ‘root’ for authorization
6. test and verify of CA
Setting up GridFTP
1. add the gridftp service to xinetd.d
2. add the gsiftp service to /etc/services
3. reload xinetd service
Starting the webservices container
1. setup an /etc/init.d entry for the webservices container
2. create an /etc/init.d script to call the globus user’s start-stop script
3. use one of the sample clients/services to interact with the container
Configuring RFT
1. Configure the system to allow TCP/IP connections to postgres, as well as
adding a trust entry for our current host
2. create the ‘rftDatabase’ as the user ‘globus’
3. try an RFT transfer
Setting up WS GRAM
1. Setup sudo so the user ‘globus’ can start jobs as a different user
2. Test WS GRAM command ‘globusrun-ws’
Details about the commands used in the installation process of Globus are described
in Appendix A.
4.3.2 Submitting a Job
Before submitting any job using Globus, a valid proxy has to be generated for user
from Globus CA server in order to make the user be trusted in the grid system. The
command to get the proxy is ‘grid-proxy-init -verify -debug’ and then user has to
provide his grid password. Table 4.7 shows the scenario of generating a valid proxy
for ‘Deming Yin’ in the grid node ‘Auriga’ at HPC of QUT.
Table 4.7 Valid Proxy Generation -bash-3.00$ grid-proxy-init -verify -debug
User Cert File: /home/n6390544/.globus/usercert.pem
User Key File: /home/n6390544/.globus/userkey.pem
Trusted CA Cert Dir: /opt/vdt/globus/TRUSTED_CA
Output File: /tmp/x509up_u1206
Your identity: /C=AU/O=APACGrid/OU=QUT/CN=Deming Yin
Enter GRID pass phrase for this identity:
Creating proxy ....++++++++++++
..++++++++++++
Done
Proxy Verify OK
Your proxy is valid until: Wed Jun 4 05:50:13 2008
There are several ways to submit a job in Globus. The simplest is the command-line
tool ‘globusrun-ws’. Several formats of job description can be executed from the
command line in the form ‘globusrun-ws -submit -c <job command>’. Table
4.8 shows the scenario of submitting a dummy job on the grid node ‘Auriga’ at the
HPC of QUT.
Table 4.8 Submitting a Job in Globus -bash-3.00$ globusrun-ws -submit -c /bin/true
Submitting job...Done.
Job ID: uuid:c0629636-3141-11dd-80d0-224466880045
Termination time: 06/04/2008 07:50 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Every job submitted to Globus is first assigned a job id. Each job can be in one of
quite a few states, such as unsubmitted, active, pending, suspended, failed, clean-up,
done, and stage-in and stage-out where applicable. The various resources and
delegations used are destroyed at the end of every job submission. Different
strategies can be defined to decide what should be done if a job is suspended, failed,
and so on; these can be implemented in a web service client written in C or Java with
the job submission function.
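One such strategy can be sketched as a simple decision over the job states listed above. The state names follow the text; the actions and the retry limit are illustrative assumptions, not Globus behaviour.

```python
# Hypothetical policy: which states warrant a resubmission attempt,
# and which indicate the job is finished.
RETRYABLE = {"failed", "suspended"}
TERMINAL = {"done"}

def next_action(state, attempts, max_attempts=3):
    """Decide what a submission client could do after a state notification."""
    if state in TERMINAL:
        return "collect-results"
    if state in RETRYABLE:
        return "resubmit" if attempts < max_attempts else "report-error"
    # unsubmitted / pending / active / stage-in / stage-out / clean-up
    return "wait"
```

A client subscribing to job state notifications would call `next_action` on each update and either resubmit the job description, keep waiting, or fetch the results.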
The second job description format is XML. One example is as depicted in Table 4.9.
Table 4.9 Job Description in XML Format <job>
<executable>gpstk1.5/bin/PRSolve</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>-o </argument>
<argument>bahr1620.08o</argument>
<argument>-n </argument>
<argument>bahr1620.08n</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
It can be executed using ‘globusrun-ws -submit -f a.xml’.
We can also specify file stage-in and stage-out in the job description, as shown in
Table 4.10.
Table 4.10 Job Description with File Stage-In and Stage-Out <job>
<executable>gpstk1.5/bin/PRSolve</executable>
<directory>${GLOBUS_USER_HOME}</directory>
<argument>-o </argument>
<argument>bahr1620.08o</argument>
<argument>-n </argument>
<argument>bahr1620.08n</argument>
<stdout>${GLOBUS_USER_HOME}/stdout</stdout>
<stderr>${GLOBUS_USER_HOME}/stderr</stderr>
<fileStageIn>
<transfer>
<sourceUrl>gsiftp://ng2.ivec.org:2811/stdout</sourceUrl>
<destinationUrl>file:///home/n6390544/</destinationUrl>
</transfer>
</fileStageIn>
<fileStageOut>
<transfer>
<sourceUrl>file:///home/n6390544/gpstk1.5/bin/bahr1620.08o</sourceUrl>
<destinationUrl>gsiftp://job.submitting.host:2811/tmp/
</destinationUrl>
</transfer>
</fileStageOut>
<fileCleanUp>
<deletion>
<file>file:///${GLOBUS_USER_HOME}/my_echo</file>
</deletion>
</fileCleanUp>
</job>
The other way is to submit jobs in C or Java using WS GRAM in Globus, which is
frequently used in this research project. The C or Java programs invoke interfaces
from C WS Core or Java WS Core, both of which belong to the Common Runtime
Components of Globus. The Common Runtime Components provide GT4 web and
pre-web services with a set of libraries and tools that allow these services to be
platform independent, to build on various abstraction layers (threading, IO) and to
leverage functionality lower in the web services stack (WSRF, WSN, etc.).
Basically, we have to write a web service client in C or Java with the job submission
function.
4.3.3 Processing Procedure
There are three main processing procedures in this simplified demonstration:
real-time RTCM data downloading, job submission, and the positioning process.
The overall procedure is described first, followed by details of each individual
procedure.
After commands are sent from the server to each grid node, the open source Ntrip
clients are used to download RTCM data from Ntrip Casters. In the first step, the
NtripLinuxClient software downloads RTCM data from the Internet in real time,
while the LinuxRtcmDecoder software decodes the downloaded RTCM data stream
into RINEX format. The data processing procedure is depicted in Figure 4.4, which
uses one grid node as an example. In the second step, computing programs from the
GPS Toolkit (UTA 2008) perform calculation tasks on each grid node. This is a
simplified computing task intended to represent the network-based RTK processing
services discussed in Section 3.3. Specifically, the position of each mountpoint,
namely a reference station, is calculated by the PRSolve or rinexpvt functions of the
GPS Toolkit.
(Figure: a mountpoint such as ACOR0 on www.euref-ip.net:80 feeds the Ntrip
client, which writes the RTCM data that the Ntrip decoder converts into RINEX
data.)
Figure 4.4 Ntrip Data Processing Procedure
Two types of data are required for the data processing and computation tasks:
observation data and navigation data. The observation data from multiple reference
stations, at a 1 Hz update rate, is obtained from the real-time Ntrip downloading
process. The navigation data is retrieved from an FTP site every two hours. The
whole data downloading process is controlled by Linux shell scripts, and all the data
are saved into files: every few seconds for observation data and every two hours for
navigation data.
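A minimal sketch of such a controlling script is given below; the FTP URL and file names are placeholders, not the actual addresses used in the demonstration:

```shell
#!/bin/sh
# Controlling-loop sketch: refresh navigation data from an FTP site every
# two hours while observation data is written continuously by the Ntrip
# client. The URL and file names are placeholders.
NAV_INTERVAL=$((2 * 60 * 60))   # navigation refresh period: 2 hours
OBS_INTERVAL=5                  # observation file rotation: every few seconds
last_nav=0

while true; do
    now=$(date +%s)
    # Refresh the broadcast navigation file when 2 hours have elapsed.
    if [ $((now - last_nav)) -ge "$NAV_INTERVAL" ]; then
        wget -q -O nav.rnx "ftp://example-ftp-site/nav/current.rnx"
        last_nav=$now
    fi
    # The Ntrip client writes observation data continuously; here the
    # current observation file would be closed and a new one opened.
    sleep "$OBS_INTERVAL"
done
```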
[Figure 4.5: three Debian grid nodes connected through a switch, each running
Globus and GPS Toolkit. Node 192.168.0.1 hosts the SimpleCA, while nodes
192.168.0.2 and 192.168.0.3 hold user certificates; jobs are exchanged between the
nodes.]
Figure 4.5 Job Submission Procedure
As depicted in Figure 4.5, Globus acts as the middleware lying between the
operating system and the application software. One grid node serves as the server,
which acts as the certificate authority. The other grid nodes have to apply for their
user certificates from the certificate authority the first time, and obtain a valid proxy
each time before submitting a job. In this way, the whole security infrastructure is
built, and every node can submit jobs to be executed on other grid nodes. The
common structure of the job submission algorithm is:
Step 1. Importing necessary classes and libraries
Step 2. Loading the job description
Step 3. Setting the security attributes
Step 4. Creating the factory client handle
Step 5. Querying for factory resource properties
Step 6. Creating the notification consumer
Step 7. Creating the job resource
Step 8. Subscribing for job state notifications
Step 9. Releasing any state holds (if necessary)
Step 10. Destroying resources
One example algorithm written in C is given in Appendix B.
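For comparison with the C client, the same submission can also be performed from the command line with the standard Globus tools (both commands appear in Appendix A); the host name and job description file name here are illustrative:

```shell
#!/bin/sh
# Command-line equivalent of the C submission client: obtain a valid
# proxy, then submit a job described in job.xml to the factory service on
# a remote node. The host address and job file name are illustrative.
grid-proxy-init        # a valid proxy is required before submitting

globusrun-ws -submit \
    -F https://192.168.0.1:8443/wsrf/services/ManagedJobFactoryService \
    -f job.xml         # XML job description
echo "exit status: $?"
```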
The last procedure is the positioning process, which is carried out by programs from
the GPS Toolkit. The observation and navigation files acquired from the real-time
Ntrip data downloading procedure are used as the input to the positioning algorithm
of the GPS Toolkit program. The positioning algorithm calculates the position
epoch-by-epoch (i.e. second-by-second), and saves the results and the final receiver-
autonomous integrity monitoring (RAIM) solutions for all the files.
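A representative invocation is sketched below; the exact option names vary between GPS Toolkit releases, so the flags and file names shown are assumptions:

```shell
#!/bin/sh
# Per-file positioning sketch: run the GPS Toolkit pseudorange solver on
# one observation/navigation file pair. Option names are assumptions and
# differ between GPSTk releases.
OBS=ACOR077C.08O      # RINEX observation file from the Ntrip procedure
NAV=brdc0770.08n      # broadcast navigation file from the FTP procedure

# PRSolve computes an epoch-by-epoch (second-by-second) position and a
# final RAIM solution, written here to a log file named after the input.
./PRSolve --obs "$OBS" --nav "$NAV" --log "${OBS%.*}.log"
```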
4.4 Results Analysis and Discussions
Tests of each procedure were conducted in the last stage of the demonstration. First,
real-time data was retrieved with Ntrip from multiple mountpoints through
configuration on each Grid node. Second, jobs were submitted to the Grid. Third, a
different computation task was performed on each Grid node. Some additional tests
were also performed: different numbers of Grid nodes were added to the Grid
Computing environment to test the scalability of the Grid, and a certain amount of
data was transferred to test the data sharing ability of the Grid. Finally, the overall
performance of the demonstration was evaluated using timing metrics, and the
results were analysed and discussed.
4.4.1 Performance Evaluation
In order to present the results of each test, some performance evaluation metrics are
defined first. Different kinds of jobs were submitted in the Grid environments of
both the laboratory and Grid Australia. Simple jobs, such as a dummy job with no
data input or output, need the least time and are the most stable in the whole
demonstration. Jobs written in a job description language, such as XML, take a little
longer for the Grid to process; one reason is that it takes time to parse the structure
of the job description file, although this kind of job can carry more information than
a simple job. The most time-consuming jobs are those with file stage-in and
stage-out, which involve large data input and output. This effect is more obvious in
the laboratory, as all communication must pass through a simple five-port switch.
The job types are described in detail below.
• Simple job: a job that does not involve computation, e.g. a dummy job such as
/bin/date.
• Simple job with job description: a simple job submitted in the form of the XML-
based Job Submission Description Language (JSDL).
• Complex job: a job with file stage-in or stage-out and computation.
In each run, 5 jobs are submitted, and the time from the first submission to the last
completion is measured at the client and then divided by 5 to obtain the average
per-job time. The Linux time command reports the real (elapsed) time and the
processor time used by the program; the processor time is divided into User CPU
time and Sys CPU time. The time types (given in seconds) are described below.
• Elapsed time: elapsed time from beginning to end of the program
• User CPU time: time used by the program and the library subroutines that it calls
• Sys CPU time: time used by system calls invoked by the program (directly or
indirectly)
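The per-job averaging described above can be reproduced in a few lines of shell; the total below is an illustrative figure chosen so that the average matches the first row of Table 4.11:

```shell
#!/bin/sh
# Average per-job time: total wall-clock time from first submission to
# last completion, divided by the number of jobs in the run.
JOBS=5
TOTAL=14.745   # illustrative total elapsed time for 5 simple jobs (s)

# awk performs the floating-point division that plain sh lacks.
avg=$(awk -v t="$TOTAL" -v n="$JOBS" 'BEGIN { printf "%.3f", t / n }')
echo "average per-job elapsed time: ${avg}s"   # prints 2.949s here
```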
4.4.2 Multiple Mountpoints on Each Grid Node
On each Grid node, RTCM data from multiple mountpoints is downloaded and
saved to separate files named according to the RINEX naming convention (Gurtner
and Estey 2007), such as ACOR077C.08O.
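As an illustration of that convention, the short file name encodes a 4-character station code, a 3-digit day of year, an hourly session letter, a 2-digit year, and the data type. A small sketch assuming this layout:

```shell
#!/bin/sh
# Build a RINEX v2 short file name: 4-char station, 3-digit day of year,
# hourly session letter (A = 00h ... X = 23h), 2-digit year, and 'O' for
# observation data, following Gurtner and Estey (2007).
rinex_name() {
    station=$1; doy=$2; hour=$3; yy=$4
    letters=ABCDEFGHIJKLMNOPQRSTUVWX
    session=$(printf '%s' "$letters" | cut -c $((hour + 1)))
    printf '%s%03d%s.%02dO' "$station" "$doy" "$session" "$yy"
}

# Example: station ACOR, day 77, hour 2, year 2008
rinex_name ACOR 77 2 8    # prints ACOR077C.08O
```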
The result indicates that every Grid node is capable of retrieving real-time data from
multiple mountpoints. Some observations on this result are summarised as follows:
• the hardware configuration of each Grid node is adequate
• the amount of data is relatively small
• the Internet speed within the campus network is relatively high
Since every Grid node can be utilised if needed, in this procedure the Grid will
outperform a single PC by roughly a factor equal to the number of Grid nodes.
4.4.3 Computation Task Submission
Some simple jobs were submitted to the Grid in the test of the second procedure. As
the execution time of a simple job is negligible, the elapsed time of a simple job can
be treated as the time needed to submit a job to the Grid. The time factors of simple
jobs are summarised in Table 4.11. Due to communication costs, submission
appears slightly slower in Grid Australia than in the laboratory.
Table 4.11 Time Factor of Simple Job (Unit: Second)

Environment        Job Type                         Elapsed  User CPU  Sys CPU  STD*
                                                    time     time      time
Laboratory         Simple job                       2.949    0.324     0.018    0.231
(PC network)       Simple job with job description  2.999    0.32      0.022    0.301
HPC@QUT            Simple job                       3.810    0.43      0.022    0.823
(Grid Australia,   Simple job with job description  3.464    0.434     0.016    0.567
Cluster)

*STD: the standard deviation of the elapsed time based on the samples of 5 tests.
4.4.4 Different Computation Tasks on Each Grid Node
The second test is to execute different computation tasks on each Grid node, such as
calculating positions, ionospheric bias estimation, and data file comparison and
validation. Various kinds of tasks were executed on each Grid node to test the
robustness of the Grid. The results prove that the Grid can perform every kind of
task that can be done on a PC, without any compromise of functionality. Some
examples are shown in Figure 4.6 and Figure 4.7 respectively.
Figure 4.6 Calculating Positions on Auriga
Figure 4.6 shows the scenario of using the ‘PRSolve’ function to calculate the
position of one reference station. The observation file and navigation file are the
input parameters, and pseudorange position measurement is adopted. Figure 4.7
shows an example of calculating the ionospheric bias using ‘IonoBias’. Two days of
observation data and navigation data are used as the input, and the output is
displayed directly on the screen.
Figure 4.7 Calculating Ionospheric Bias on Auriga
The time factors of a complex job with different kinds of stage-in and stage-out are
given in Table 4.12. As there was no quota to transfer data in Grid Australia, this test
and the data sharing test were not conducted there. In one of the test cases, used to
calculate a position, 140 KB of observation data is staged out from the Grid server to
a Grid node, and 22.2 KB of result data in a log file is staged back in to the Grid
server. It should be noted that a real-world NRTK module always takes longer to
finish (especially the post-processing), for example several minutes, than the user
CPU time in this experiment, which is less than 1 second.
Table 4.12 Time Factor of a Complex Job (Unit: Second)

Environment     Stage-in  Stage-out  Elapsed  User CPU  Sys CPU  STD*
                                     time     time      time
Laboratory      None      140KB      5.5628   0.508     0.034    0.439
(PC network)    None      26.4MB     10.538   0.57      0.00     0.422
                None      172MB      25.844   0.58      0.005    0.435
                22.2KB    140KB      7.9188   0.536     0.036    0.399

*STD: the standard deviation of the elapsed time based on the samples of 5 tests.
As can be seen from Table 4.12, a high time cost occurs even when only a small
amount of data is staged in or out. There is therefore a strong case for downloading
the needed data locally on each Grid node, rather than staging it out from the server
as in the centralised solution.
4.4.5 Additional Tests
Different number of Grid nodes
One of the additional tests is to add different numbers of Grid nodes to the Grid
Computing environment to test the scalability of the Grid. As there are only 5 Grid
nodes in total, the time factor of this test does not differ much as more Grid nodes
are connected to the Grid. Some observations from this test are given below:
• It always takes some effort to add an additional node
• The whole problem needs to be modularised sensibly to make full use of all the
Grid nodes
• Sometimes more Grid nodes do not necessarily reduce the time cost, but they
certainly improve the problem-solving capability
Data sharing
Data sharing
Data sharing is also tested in this demonstration, where 699 MB of data is
transferred in the Grid using the GridFTP protocol. The result is shown in Table 4.13.
Table 4.13 Time Factor of Data Sharing (Unit: Second)

Environment     Method    Elapsed   User CPU  Sys CPU
                          time      time      time
Laboratory      Local     46.2145   0.195     3.48
(PC network)    GridFTP   67.0565   0.105     0.005
From the time factors of the different kinds of jobs in Tables 4.11, 4.12 and 4.13,
several conclusions can be drawn:
• Near real-time results with 2-3 s processing time can be achieved without data
stage-in and stage-out
• The data needed should be downloaded locally on each Grid node
• Data transfer in the Grid is highly efficient, with performance similar to
operation on a local disk
4.4.6 Discussion
The results shown in this chapter indicate that the Grid Computing based
framework developed for this research offers high throughput and high efficiency.
However, it has difficulty achieving real-time processing within 1 s. Several reasons
are given below:
• The fundamental processes (security and communication) of the current Grid
network are very time consuming. In the Grid environment, because Grid nodes
can come from different virtual organisations, the nodes have to establish mutual
trust through the authentication process before a job can be submitted and
executed.
• Application execution time is, to some extent, fixed. Executing an application or
algorithm requires a certain amount of time, regardless of whether it runs on a
single PC or on the Grid.
From the time cost analysis shown in Section 4.4.4, improvements to the framework
and implemented architecture have to be made if real-time requirements are to be
met. For example, the security mode can be simplified, or the authentication time
eliminated, if all the Grid nodes sit within one organisation; this can reduce the
overall time cost considerably. Another option is to use Grid Computing only for the
background processing services described in Section 3.3, where delays of a few
seconds are not an issue. Additionally, if the task can be divided into a large number
of small tasks, a lightweight Grid tool such as the Java Parallel Processing
Framework (JPPF) (JPPF 2008), or a multi-level scheduling strategy such as
FALKON (Raicu, Zhao et al. 2007) or Condor glide-ins, can be utilised to achieve a
relatively smaller time cost instead of using fork or full-featured Local Resource
Managers (LRMs) such as Condor, Portable Batch System (PBS), Load Sharing
Facility (LSF), and Sun Grid Engine (SGE).
4.5 Result From Lightweight Grid Tool JPPF
This section demonstrates how to achieve better result utilising lightweight Grid tool-
Java Parallel Processing Framework (JPPF). JPPF is an open source Grid Computing
platform written in Java that makes it easy to run applications in parallel, and speed
up their execution by orders of magnitude (JPPF 2008). JPPF’s architecture is
divided into three layers, Client Application, JPPF Driver and Grid nodes, which is
depicted in Figure 4.8.
Figure 4.8 JPPF Architecture (JPPF 2008)
In JPPF terminology, the basic unit of execution is called a task: the smallest
self-contained piece of code that can be executed remotely. Clients submit their jobs
through Client Applications to the JPPF Driver. The JPPF Driver is the Grid server,
which manages the task queue and takes care of security and other fundamental
issues. Tasks are distributed to the different Grid nodes for execution, and the results
are finally collected back through the JPPF Driver to the Clients.
Data sharing is tested using JPPF, where 699 MB of data is transferred in the Grid
using the GridFTP protocol.
Table 4.14 Time Factor of Data Sharing (Unit: Second)

Environment     Method    Elapsed   User CPU  Sys CPU
                          time      time      time
Laboratory      Local     46.2145   0.195     3.48
(PC network)    GridFTP   57.0334   0.104     0.005
As can be seen in Table 4.14, the test result is improved to some extent by using the
lightweight Grid tool JPPF.
4.6 Summary
This chapter has provided a proof-of-concept framework demonstration with the
simplified Network RTK processes. Open source software packages, such as the
Ntrip client and GPS Toolkit, were utilised to facilitate the demonstration tasks,
such as downloading real-time RTCM data through Ntrip from the Internet and
performing GPS-related computing tasks. The design and implementation of the
demonstration were also described in detail, and different types of tests were
conducted in the later stage to verify the functionality and performance of the Grid.
The next chapter will give a summary of the whole thesis and research project, and
some recommendations for future research directions.
5. Concluding Remarks and Future Work
5.1 Concluding Remarks
The objective of this research was to design a framework for NRTK data processing
based on Grid Computing. One of the very first steps taken in this research was to
identify the current challenges of NRTK data processing. As described in Chapter 1,
the challenges include a much larger volume of data brought by an increasing
number of GNSS constellations, transmission frequencies, and ground reference
stations. Additionally, higher computational capabilities are required due to more
complex algorithms and a large volume of users whose positions may be computed at
the server ends. Review of these critical issues in Chapter 2 has provided a good
basis for the framework.
Next, a layered framework was designed to address these computing challenges in
NRTK data processing. The scalable, distributed platform can cope with the higher
computational capability requirements by scheduling tasks to various grid nodes,
which are RTK network centres or servers. At the same time, the processing
algorithms are modularised into, for instance, real-time positioning services and
background processing services, so different grid nodes or servers can handle
different parts of the whole processing task to meet different time requirements.
Data location is also taken into account in this framework, with a preference for
collecting and managing data locally on each grid node.
Although it is not possible to fully demonstrate the performance potential of the
designed framework for network-based RTK services within the research period, a
proof-of-concept system with a simplified demonstration has been performed in the
laboratory at QUT and Grid Australia. In the demonstration, real-time data was
downloaded from Ntrip casters using evaluated open source Ntrip clients. The
downloaded RTCM data was then converted to RINEX format and saved to a file on
local disk. The second part of the demonstration was sending commands from the
Grid server to other Grid nodes through Globus. In this procedure, jobs were
submitted both with the simple function provided by Globus and with a customised
function. The last part of the demonstration was the position calculation processing
using a few programs from the open source GPS Toolkit. Various computation tasks
were performed and different numbers of Grid nodes were added in the Grid
environment to test the scalability of the Grid.
Results from the demonstration revealed that near real-time results with 2-3 s
processing time can be achieved by utilising the Grid. This means that the proposed
framework may be used for background processing in network-based RTK
processing, which updates network-based differential corrections at intervals of tens
of seconds to minutes. Different kinds of time costs, such as security
(authentication), communication, and application execution, have been analysed,
and suggestions were given as recommendations to improve the time performance
under certain circumstances.
5.2 Future Work
Based on the work conducted and the results achieved during this master's research
period, the author has identified several areas of improvement that can be made to
the overall NRTK data processing framework, as follows:
• Improving the scheduling algorithm. Scheduling is the core component of Grid
Computing and determines the performance of the whole Grid. Due to the time
limitations of the master's project, only existing algorithms were implemented in
the demonstration. Improving the scheduling algorithm would speed up the
whole distributed computation considerably, making real-time positioning more
feasible.
• Refining the NRTK data processing framework. At this stage of the research,
each organisation still seems busy constructing its own CORS network, so the
opportunity to apply the framework to a real-world application has not been
available. More importantly, the RTK algorithms are still under development. In
one to two years' time, when the CRC-SI project has been completed and
real-time RTK processing software platforms are ready, it will be time to fully
implement and test the proposed Grid Computing framework for the Network
RTK positioning service. In this way, more accurate requirements can be
obtained and the framework can be refined during its deployment.
• Improving the time performance under certain circumstances. As mentioned in
the discussion of the demonstration results, lightweight multi-level schedulers
can be utilised to achieve a much lower time cost if the task can be divided into
a large number of small tasks. The security cost can also be reduced
significantly if most of the Grid nodes sit within one organisation.
Appendix A: Installation Process of Globus
This appendix shows a full installation of Globus Toolkit 4 on a Debian 3.1 machine.
Tips of Unix
1. Unix has different shells, such as Bourne Shell (bash) and C shell (csh). For
example, the command prompt of bash looks like ‘root@grid01:/usr/java#’,
while that of csh looks like ‘grid01 %’.
2. User privileges in Unix are strictly controlled, so it is necessary to execute the
commands as the same user as shown in this appendix when installing Globus.
3. Some useful commands:
cat: concatenate files and print on the standard output
vim: create or edit file
scp: remote secure file copy from one machine to another
Pre-requisites
Utility tools:
zlib/gcc/g++/tar/sed/make/perl/sudo/postgres/libiodbc2/libiodbc2-dev
use the following commands to check the tools:
grid01 % dpkg --list | grep zlib
grid01 % gcc --version
grid01 % g++ --version
grid01 % tar --version
grid01 % sed --version
grid01 % make --version
grid01 % perl --version
grid01 % sudo -V
grid01 % dpkg --list | grep postgres
grid01 % dpkg --list | grep psql
If any tool is not installed, use ‘apt-get install’ command to install it.
root@grid01:/usr/local# apt-get install postgresql
root@grid01:/root# apt-get install libiodbc2 libiodbc2-dev
Compulsory software: JDK/Ant
Install java:
root@grid01:/usr/java# ./jdk-1_5_0_14-linux-i586.bin
Install Ant:
root@grid01:/usr/local# tar xzf apache-ant-1.7.0-bin.tar.gz
root@grid01:/usr/local# ls apache-ant-1.7.0
Building the Toolkit
1. Add a non-privileged user named ‘globus’, which will be used to perform
administrative tasks such as starting and stopping the container, deploying
services, etc.
root@grid01:~# adduser globus
root@grid01:/etc/init.d# mkdir /usr/local/globus-4.0.1/
root@grid01:/etc/init.d# chown globus:globus /usr/local/globus-4.0.1/
2. Setup java environment.
globus@grid01:~/gt4.0.1-all-source-installer$ export ANT_HOME=/usr/local/apache-ant-1.7.0
globus@grid01:~/gt4.0.1-all-source-installer$ export JAVA_HOME=/usr/java/jdk1.5.0_14/
globus@grid01:~/gt4.0.1-all-source-installer$ export PATH=$ANT_HOME/bin:$JAVA_HOME/bin:$PATH
globus@grid01:~/gt4.0.1-all-source-installer$ ./configure --prefix=/usr/local/globus-4.0.1/ --with-iodbc=/usr/lib
Note:
The machine I am installing on doesn't have access to a scheduler. If it did, I
would have specified one of the wsgram scheduler options, like
--enable-wsgram-condor, --enable-wsgram-lsf, or --enable-wsgram-pbs.
3. Build and install the toolkit
globus@grid01:~/gt4.0.1-all-source-installer$ make | tee installer.log
globus@grid01:~/gt4.0.1-all-source-installer$ make install
Setting up security (different in Server and other grid nodes)
1. SimpleCA (Install SimpleCA on the server, for other grid nodes, just trust this
SimpleCA)
Install SimpleCA on the server
globus@grid01:~$ export GLOBUS_LOCATION=/usr/local/globus-4.0.1
globus@grid01:~$ source $GLOBUS_LOCATION/etc/globus-user-env.sh
globus@grid01:~$ $GLOBUS_LOCATION/setup/globus/setup-simple-ca
check installation result:
globus@grid01:~$ ls ~/.globus/
globus@grid01:~$ ls ~/.globus/simpleCA/
Other grid nodes:
globus@grid02:~$ scp grid01:.globus/simpleCA/globus_simple_ca_6240356a_setup-0.19.tar.gz .
globus@grid02:~$ export GLOBUS_LOCATION=/usr/local/globus-4.0.1
globus@grid02:~$ $GLOBUS_LOCATION/sbin/gpt-build globus_simple_ca_6240356a_setup-0.19.tar.gz
globus@grid02:~$ $GLOBUS_LOCATION/sbin/gpt-postinstall
2. Make the machine trust the new CA
root@grid01:~# export GLOBUS_LOCATION=/usr/local/globus-4.0.1
root@grid01:~# $GLOBUS_LOCATION/setup/globus_simple_ca_6240356a_setup/setup-gsi -default
check configuration results:
root@grid01:~# ls /etc/grid-security/
root@grid01:~# ls /etc/grid-security/certificates/
3. set up hostcert and sign the certificate using the SimpleCA
root@grid01:~# source $GLOBUS_LOCATION/etc/globus-user-env.sh
root@grid01:~# grid-cert-request -host `hostname`
globus@grid01:~$ grid-ca-sign -in /etc/grid-security/hostcert_request.pem -out hostsigned.pem
root@grid01:~# cp ~globus/hostsigned.pem /etc/grid-security/hostcert.pem
root@grid01:/etc/grid-security# cp hostcert.pem containercert.pem
root@grid01:/etc/grid-security# cp hostkey.pem containerkey.pem
root@grid01:/etc/grid-security# chown globus:globus container*.pem
root@grid01:/etc/grid-security# ls -l *.pem
4. set up usercert for the real Globus user ‘ade’ and sign the certificate as user
‘globus’
grid01 % setenv GLOBUS_LOCATION /usr/local/globus-4.0.1/
grid01 % source $GLOBUS_LOCATION/etc/globus-user-env.csh
grid01 % grid-cert-request
grid01 % cat /home/ade/.globus/usercert_request.pem | mail globus@grid01
globus@grid01:~$ grid-ca-sign -in request.pem -out signed.pem
globus@grid01:~$ cat signed.pem | mail ade@grid01
grid01 % cp signed.pem ~/.globus/usercert.pem
grid01 % ls -l ~/.globus/
5. create a grid-mapfile as ‘root’ for authorization
root@grid01:/etc/grid-security# $GLOBUS_LOCATION/sbin/grid-mapfile-add-entry -dn "/O=Grid/OU=GlobusTest/OU=simpleCA-grid01.debianGridDomain/OU=debianGridDomain/CN=ade" -ln ade
6. test and verify of CA
grid01 % grid-proxy-init -verify -debug
Setting up GridFTP
1. add the gridftp service to xinetd.d
root@grid01:/etc/grid-security# vim /etc/xinetd.d/gridftp
root@grid01:/etc/grid-security# cat /etc/xinetd.d/gridftp
service gsiftp
{
instances = 100
socket_type = stream
wait = no
user = root
env += GLOBUS_LOCATION=/usr/local/globus-4.0.1
env += LD_LIBRARY_PATH=/usr/local/globus-4.0.1/lib
server = /usr/local/globus-4.0.1/sbin/globus-gridftp-server
server_args = -i
log_on_success += DURATION
nice = 10
disable = no
}
2. add the gsiftp service to /etc/services
root@grid01:/etc/grid-security# vim /etc/services
root@grid01:/etc/grid-security# tail /etc/services
vboxd 20012/udp
binkp 24554/tcp # binkp fidonet protocol
asp 27374/tcp # Address Search Protocol
asp 27374/udp
dircproxy 57000/tcp # Detachable IRC Proxy
tfido 60177/tcp # fidonet EMSI over telnet
fido 60179/tcp # fidonet EMSI over TCP
# Local services
gsiftp 2811/tcp
3. reload xinetd service
root@grid01:/etc/grid-security# /etc/init.d/xinetd reload
Reloading internet superserver configuration: xinetd.
root@grid01:/etc/grid-security# netstat -an | grep 2811
tcp 0 0 0.0.0.0:2811 0.0.0.0:* LISTEN
test the configuration:
grid01 % grid-proxy-init -verify -debug
grid01 % globus-url-copy gsiftp://debiangrid001/etc/group
file:///tmp/ade.test.copy
grid01 % diff /tmp/ade.test.copy /etc/group
Starting the webservices container
1. setup an /etc/init.d entry for the webservices container
globus@grid01:~$ vim $GLOBUS_LOCATION/start-stop
globus@grid01:~$ cat $GLOBUS_LOCATION/start-stop
#! /bin/sh
set -e
export GLOBUS_LOCATION=/usr/local/globus-4.0.1
export JAVA_HOME=/usr/java/jdk1.5.0_14/
export ANT_HOME=/usr/local/apache-ant-1.7.0
export GLOBUS_OPTIONS="-Xms256M -Xmx512M"
. $GLOBUS_LOCATION/etc/globus-user-env.sh
cd $GLOBUS_LOCATION
case "$1" in
start)
$GLOBUS_LOCATION/sbin/globus-start-container-detached -p 8443
;;
stop)
$GLOBUS_LOCATION/sbin/globus-stop-container-detached
;;
*)
echo "Usage: globus {start|stop}" >&2
exit 1
;;
esac
exit 0
globus@grid01:~$ chmod +x $GLOBUS_LOCATION/start-stop
2. create an /etc/init.d script to call the globus user’s start-stop script
root@grid01:~# vim /etc/init.d/globus-4.0.1
root@grid01:~# cat /etc/init.d/globus-4.0.1
#!/bin/sh -e
case "$1" in
start)
su - globus /usr/local/globus-4.0.1/start-stop start
;;
stop)
su - globus /usr/local/globus-4.0.1/start-stop stop
;;
restart)
$0 stop
sleep 1
$0 start
;;
*)
printf "Usage: $0 {start|stop|restart}\n" >&2
exit 1
;;
esac
exit 0
root@grid01:~# chmod +x /etc/init.d/globus-4.0.1
root@grid01:~# /etc/init.d/globus-4.0.1 start
3. use one of the sample clients/services to interact with the container
grid01 % setenv JAVA_HOME /usr/java/jdk1.5.0_14/
grid01 % setenv ANT_HOME /usr/local/apache-ant-1.7.0/
grid01 % setenv PATH $ANT_HOME/bin:$JAVA_HOME/bin:$PATH
grid01 % counter-client -s
https://debiangrid001:8443/wsrf/services/CounterService
Got notification with value: 3
Counter has value: 3
Got notification with value: 13
Configuring RFT
1. Configure the system to allow TCP/IP connections to postgres, as well as adding
a trust entry for our current host
root@grid01:~# vim /var/lib/postgres/postmaster.conf
root@grid01:~# grep POSTMASTER /var/lib/postgres/postmaster.conf
POSTMASTER_OPTIONS="-i"
root@grid01:~# vim /var/lib/postgres/data/pg_hba.conf
root@grid01:~# grep rftDatabase /var/lib/postgres/data/pg_hba.conf
host rftDatabase "globus" "192.168.0.100" 255.255.255.255 md5
root@grid01:~# /etc/init.d/postgresql restart
Stopping PostgreSQL database server: postmaster.
Starting PostgreSQL database server: postmaster.
2. create the ‘rftDatabase’ as the user ‘globus’
root@grid01:~# su postgres -c "createuser -P globus"
Enter password for new user: *****
Enter it again: *****
Shall the new user be allowed to create databases? (y/n) y
Shall the new user be allowed to create more new users? (y/n) n
CREATE USER
globus@grid01:~$ createdb rftDatabase
CREATE DATABASE
globus@grid01:~$ psql -d rftDatabase -f
$GLOBUS_LOCATION/share/globus_wsrf_rft/rft_schema.sql
globus@grid01:~$ vim $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml
globus@grid01:~$ grep -C 3 password $GLOBUS_LOCATION/etc/globus_wsrf_rft/jndi-config.xml
</parameter>
<parameter>
<name>
password
</name>
<value>
*****
test configuration:
root@grid01:~# /etc/init.d/globus-4.0.1 restart
Stopping Globus container. PID: 29985
Starting Globus container. PID: 8620
root@grid01:~# head /usr/local/globus-4.0.1/var/container.log
3. try an RFT transfer
grid01 % cp /usr/local/globus-4.0.1/share/globus_wsrf_rft_test/transfer.xfr rft.xfr
grid01 % vim rft.xfr
grid01 % cat rft.xfr
true
16000
16000
false
1
true
1
null
null
false
10
gsiftp://debiangrid001.debiangriddomain:2811/etc/group
gsiftp://debiangrid001.debiangriddomain:2811/tmp/rftTest_Done.tmp
grid01 % rft -h debiangrid001 -f rft.xfr
grid01 % diff /etc/group /tmp/rftTest_Done.tmp
Setting up WS GRAM
1. Setup sudo so the user ‘globus’ can start jobs as a different user
root@grid01:~# visudo
root@grid01:~# cat /etc/sudoers
globus ALL=(ade) NOPASSWD: /usr/local/globus-4.0.1/libexec/globus-gridmap-and-execute -g /etc/grid-security/grid-mapfile /usr/local/globus-4.0.1/libexec/globus-job-manager-script.pl *
globus ALL=(ade) NOPASSWD: /usr/local/globus-4.0.1/libexec/globus-gridmap-and-execute -g /etc/grid-security/grid-mapfile /usr/local/globus-4.0.1/libexec/globus-gram-local-proxy-tool *
2. Test WS GRAM command ‘globusrun-ws’
grid01 % globusrun-ws -submit -c /bin/true
grid01 % echo $?
0
grid01 % globusrun-ws -submit -c /bin/false
grid01 % echo $?
1
Appendix B: Web Service for Submitting a Job
//0. Importing necessary classes and libraries
SubmittingJobInC()
{
//1. Loading the job description
const char * file = "job.xml";
globus_soap_message_handle_t message;
wsgram_CreateManagedJobInputType input;
xsd_QName element; /* declaration added: holds the deserialized element name */
globus_soap_message_handle_init_from_file(&message, file);
globus_soap_message_deserialize_element_unknown(message, &element);
if(strcmp(element.local, "job") == 0)
{
wsgram_JobDescriptionType * jd;
input.choice_value.type = wsgram_CreateManagedJobInputType_job;
jd = &input.choice_value.value.job;
wsgram_JobDescriptionType_deserialize(&element, jd, message, 0);
}
else if(strcmp(element.local, "multiJob") == 0)
{
wsgram_MultiJobDescriptionType * mjd;
input.choice_value.type = wsgram_CreateManagedJobInputType_multiJob;
mjd = &input.choice_value.value.multiJob;
wsgram_MultiJobDescriptionType_deserialize(&element, mjd, message, 0);
}
xsd_QName_destroy_contents(&element);
globus_soap_message_handle_destroy(message);
//2. Setting the security attributes
globus_soap_message_attr_t message_attr;
globus_soap_message_attr_init(&message_attr);
/*
* Set authentication mode to host authorization: other possibilities are
* GLOBUS_SOAP_MESSAGE_AUTHZ_HOST_IDENTITY or
* GLOBUS_SOAP_MESSAGE_AUTHZ_HOST_SELF.
*/
globus_soap_message_attr_set(
message_attr,
GLOBUS_SOAP_MESSAGE_AUTHZ_METHOD_KEY,
NULL,
NULL,
(void *) GLOBUS_SOAP_MESSAGE_AUTHZ_HOST);
/*
* Set message protection level. GLOBUS_SOAP_MESSAGE_AUTH_PROTECTION_PRIVACY
* for encryption.
*/
globus_soap_message_attr_set(
message_attr,
GLOBUS_SOAP_MESSAGE_AUTH_PROTECTION_KEY,
NULL,
NULL,
(void *) GLOBUS_SOAP_MESSAGE_AUTH_PROTECTION_PRIVACY);
//3. Creating the factory client handle
ManagedJobFactoryService_client_handle_t factory_handle;
result = ManagedJobFactoryService_client_init(
&factory_handle,
message_attr,
NULL);
//4. Querying for factory resource properties
/*
* localResourceManager, or other resource property names as defined in the
* WSDL
*/
xsd_QName property_name =
{
"http://www.globus.org/namespaces/2004/10/gram/job",
"localResourceManager"
};
wsrp_GetResourcePropertyResponseType * property_response;
int fault_type;
xsd_any * fault;
ManagedJobFactoryPortType_GetResourceProperty(
factory_handle,
endpoint,
&property_name,
&property_response,
(ManagedJobFactoryPortType_GetResourceProperty_fault_t *) &fault_type,
&fault);
//5. Creating the notification consumer
globus_service_engine_t engine;
wsa_EndpointReferenceType consumer_reference;
globus_service_engine_init(&engine, NULL, NULL, NULL, NULL, NULL);
globus_notification_create_consumer(
&consumer_reference,
engine,
notify_callback,
NULL);
//6. Creating the job resource
/*
* You can set input.InitialTerminationTime to be a timeout if interested.
* The xsd_dateTime type is a struct tm pointer.
*/
time_t term_time = time(NULL);
globus_uuid_t uuid;
wsa_AttributedURI * job_id;
wsa_EndpointReferenceType * factory_epr;
xsd_any * reference_property;
wsgram_CreateManagedJobOutputType * output = NULL;
xsd_QName factory_reference_id_qname =
{
"http://www.globus.org/namespaces/2004/10/gram/job",
"ResourceID"
};
term_time += 60 * 60; /* 1 hour later */
xsd_dateTime_copy(&input.InitialTerminationTime, gmtime(&term_time));
/*
* Set unique JobID. This is used to reliably create jobs and check for status.
*/
globus_uuid_create(&uuid);
wsa_AttributedURI_init(&job_id);
job_id->base_value = globus_common_create_string("uuid:%s", uuid.text);
/* Subscribe to notifications at create time */
wsnt_SubscribeType_init(&input.Subscribe);
wsa_EndpointReferenceType_copy_contents(
&input.Subscribe->ConsumerReference,
&consumer_reference);
xsd_any_init(&input.Subscribe->TopicExpression.any);
input.Subscribe->TopicExpression.any->any_info =
&xsd_QName_contents_info;
xsd_QName_copy(
(xsd_QName **) &input.Subscribe->TopicExpression.any->value,
&ManagedJobPortType_state_rp_qname);
xsd_anyURI_copy_cstr(
&input.Subscribe->TopicExpression._Dialect,
"http://docs.oasis-open.org/wsn/2004/06/TopicExpression/Simple");
xsd_boolean_init(&input.Subscribe->UseNotify);
*(input.Subscribe->UseNotify) = GLOBUS_TRUE;
/* Construct the EPR of the job factory */
wsa_EndpointReferenceType_init(&factory_epr);
wsa_AttributedURI_init_contents(&factory_epr->Address);
xsd_anyURI_init_contents_cstr(&factory_epr->Address.base_value,
globus_common_create_string(
"https://%s:%hu/wsrf/services/%s",
"192.168.0.1",
(unsigned short) 8443,
"ManagedJobFactoryService"));
wsa_ReferencePropertiesType_init_contents(&factory_epr->ReferenceProperties);
reference_property = xsd_any_array_push(
&factory_epr->ReferenceProperties.any);
reference_property->any_info = &xsd_string_info;
xsd_QName_copy(
&reference_property->element,
&factory_reference_id_qname);
xsd_string_copy_cstr(
(xsd_string **) &reference_property->value,
"Fork");
/* Submit the request to the service container */
ManagedJobFactoryPortType_createManagedJob_epr(
factory_handle,
factory_epr,
&input,
&output,
(ManagedJobFactoryPortType_createManagedJob_fault_t *) &fault_type,
&fault);
//7. Subscribing for job state notifications
ManagedJobService_client_handle_t job_handle;
wsnt_SubscribeType subscribe_input;
wsnt_SubscribeResponseType * subscribe_response;
wsnt_SubscribeType_init_contents(&subscribe_input);
ManagedJobService_client_init(
&job_handle,
message_attr,
NULL);
ManagedJobPortType_Subscribe_epr(
job_handle,
output->managedJobEndpoint,
&subscribe_input,
&subscribe_response,
(ManagedJobPortType_Subscribe_fault_t *) &fault_type,
&fault);
//8. Releasing any state holds (if necessary)
wsgram_ReleaseInputType release;
wsgram_ReleaseOutputType * release_response = NULL;
wsgram_ReleaseInputType_init_contents(&release);
ManagedJobPortType_release_epr(
job_handle,
output->managedJobEndpoint,
&release,
&release_response,
(ManagedJobPortType_release_fault_t *) &fault_type,
&fault);
//9. Destroying resources
/* destroy subscription resource */
SubscriptionManagerService_client_handle_t subscription_handle;
wsnt_DestroyType destroy;
wsnt_DestroyResponseType * destroy_response = NULL;
wsnt_DestroyType_init_contents(&destroy);
SubscriptionManagerService_client_init(
&subscription_handle,
message_attr,
NULL);
/* if subscription done at job creation time, use
* output->subscriptionEndpoint in place of
* subscribe_response->SubscriptionReference,
*/
SubscriptionManager_Destroy_epr(
subscription_handle,
subscribe_response->SubscriptionReference,
&destroy,
&destroy_response,
(SubscriptionManager_Destroy_fault_t *) &fault_type,
&fault);
/* destroy the job resource */
ManagedJobPortType_Destroy_epr(
job_handle,
output->managedJobEndpoint,
&destroy,
&destroy_response,
(ManagedJobPortType_Destroy_fault_t *) &fault_type,
&fault);
}
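Two of the values constructed above with Globus helpers, the factory endpoint address built via globus_common_create_string() and the InitialTerminationTime set one hour ahead, can be reproduced in plain standard C. The sketch below is illustrative only: make_factory_url and term_time_utc are hypothetical helper names (not part of the Globus API), and the host, port, and service name are the same placeholders used in the listing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: builds the factory EPR address in the same
 * "https://host:port/wsrf/services/Service" form as the listing above. */
void make_factory_url(char *buf, size_t len,
                      const char *host, unsigned short port,
                      const char *service)
{
    snprintf(buf, len, "https://%s:%hu/wsrf/services/%s", host, port, service);
}

/* Hypothetical helper: returns the UTC broken-down time one hour after
 * 'now', mirroring the InitialTerminationTime step (term_time += 60 * 60). */
struct tm *term_time_utc(time_t now)
{
    now += 60 * 60; /* 1 hour later, as in the listing */
    return gmtime(&now);
}
```

For example, make_factory_url(buf, sizeof buf, "192.168.0.1", 8443, "ManagedJobFactoryService") yields https://192.168.0.1:8443/wsrf/services/ManagedJobFactoryService, the address written into the factory EPR above.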
References

AuScope. (2008). "AuScope." from http://www.auscope.org.au/.
Cafaro, D. A. (2007). "Migrating from WSRF to WSRT." Retrieved 20 May 2008, from http://www.ibm.com/developerworks/grid/library/gr-wsrfwsrt/.
CERN. (2008). "LHC Computing Grid." Retrieved 3 May 2008, from http://public.web.cern.ch/public/en/LHC/Computing-en.html.
CRCSI. (2007). "Delivering Precise Positioning Services in Regional Areas." from http://www.crcsi.com.au/pages/project.aspx?projectid
Ellisman, M. and S. Peltier (2004). Medical Data Federation: The Biomedical Informatics Research Network. In The Grid: Blueprint for a New Computing Infrastructure.
FACC. (2007). "NTRIP." from http://igs.bkg.bund.de/.
Feller, M., I. Foster and S. Martin (2007). GT4 GRAM: A Functionality and Performance Study. TeraGrid 2007 Conference, Madison, WI, USA.
Feng, Y. and B. Li (2008). An Overview of Three Carrier Ambiguity Resolutions: Problems, Models, Methods and Performance Analysis Using Semi-Generated Triple Frequency GPS Data. Proceedings of ION GNSS 2008, Savannah, Georgia.
Feng, Y. and C. Rizos (2005). Three Carrier Approaches for Future Global, Regional and Local GNSS Positioning Services: Concepts and Performance Perspectives. Proceedings of ION GNSS 2005, Long Beach, CA.
Feng, Y. and Y. Zheng (2005). "Efficient Interpolations to GPS Orbits for Precise Wide Area Applications." GPS Solutions 9(4): 273-282.
Foster, I. (2005). "A Globus Primer." Retrieved 3 May 2008, from http://www.globus.org/.
Foster, I. and C. Kesselman (2004). The Grid: Blueprint for a New Computing Infrastructure. 2nd ed., Elsevier.
Foster, I., C. Kesselman, G. Tsudik and S. Tuecke (1998). "A Security Architecture for Computational Grids." 5th ACM Conference on Computer and Communications Security.
Globus. (2008). "The Globus Alliance." Retrieved 3 May 2008, from http://www.globus.org/.
GPSnet. (2008). "GPSnet." from http://www.land.vic.gov.au/GPSnet.
Gurtner, W. and L. Estey (2007). "RINEX: The Receiver Independent Exchange Format Version 2.11." from ftp://ftp.unibe.ch/aiub/rinex/.
Higgins, M. (2008). Legal Traceability of GNSS Measurements in Australia. Integrating Generations, FIG Working Week 2008. Stockholm, Sweden.
Joseph, J. and C. Fellenstein (2004). Grid Computing. Upper Saddle River, NJ, Prentice Hall Professional Technical Reference.
JPPF. (2008). "Java Parallel Processing Framework." Retrieved 25 Dec 2008, from http://www.jppf.org/.
Kamath, C. (2001). "The Role of Parallel and Distributed Processing in Data Mining." IEEE Computer Society, Spring (Newsletter of the Technical Committee on Distributed Processing).
Lenz, E. (2004). Networked Transport of RTCM via Internet Protocol (NTRIP) – Application and Benefit in Modern Surveying Systems. FIG Working Week. Athens, Greece.
Leung, J. Y.-T. (2004). Handbook of Scheduling: Algorithms, Models, and Performance. London, Chapman & Hall/CRC.
Lim, S. and C. Rizos (2007). A New Framework for Server-Based and Thin-Client GNSS Operations for High Accuracy Applications in Surveying and Navigation. ION GNSS 20th International Technical Meeting of the Satellite Division. Fort Worth, TX, USA.
Lim, S. and C. Rizos (2008). System Architecture for Server-Based Network-RTK Using Multiple GNSS.
Loo, A. W.-S. (2003). "The Future of Peer-to-Peer Computing." Communications of the ACM 46(9): 57-61.
Loo, A. W.-S. (2007). Peer-to-Peer Computing: Building Supercomputers with Web Technologies. First Edition, Springer.
Misra, P. and P. Enge (2001). Global Positioning System: Signals, Measurements, and Performance. First Edition. Lincoln, Massachusetts, USA, Ganga-Jamuna Press.
Misra, P. and P. Enge (2006). Global Positioning System: Signals, Measurements, and Performance. Second Edition. Lincoln, Massachusetts, USA, Ganga-Jamuna Press.
NGS. (2006). "What Is CORS?" from http://www.ngs.noaa.gov/CORS/cors-data.html.
Fraser, R., T. Rankine and R. Woodcock (2007). "Service Oriented Grid Architecture for Geosciences Community." Proceedings of the Fifth Australasian Symposium on ACSW Frontiers 68 (Fifth Australasian Symposium on Grid Computing and e-Research (AusGrid 2007)).
Raicu, I., Y. Zhao, C. Dumitrescu, I. Foster and M. Wilde (2007). Falkon: A Fast and Light-weight tasK executiON Framework. SC07. Reno, Nevada, USA.
Retscher, G. (2002). "Accuracy Performance of Virtual Reference Station (VRS) Networks." Journal of Global Positioning Systems 1.
Rizos, C. (2003). "Network RTK Research and Implementation – A Geodetic Perspective." Journal of Global Positioning Systems 1(2): 144-150.
Rizos, C. (2007). The International GNSS Service: In the Service of Geoscience and the Geospatial Industry. International Global Navigation Satellite Systems Society IGNSS Symposium 2007. The University of New South Wales, Sydney, Australia.
RTCM. (2007). "The Radio Technical Commission for Maritime Services." from http://www.rtcm.org/.
Sotomayor, B. and L. Childers (2006). Globus Toolkit 4: Programming Java Services. 1st ed., Elsevier.
SydNET. (2008). "SydNET." from http://sydnet.lands.nsw.gov.au/images/MetroNETCoverage.jpg.
UTA. (2008). "GPS Toolkit." from http://www.gpstk.org/bin/view/Documentation/WebHo
Wanninger, L. (2003). "Virtual Reference Stations (VRS)." GPS on the Web.
Wanninger, L. (2006). "Introduction to Network RTK." from http://www.network-rtk.info/intro/introduction.html.
Weber, G., D. D. and H. Gebhard (2005). Networked Transport of RTCM via Internet Protocol (NTRIP) – IP-Streaming for Real-Time GNSS Applications. ION GNSS 18th International Technical Meeting of the Satellite Division. Long Beach, CA.
Wikipedia. (2007). "Global Positioning System." from http://en.wikipedia.org/wiki/Global_Positioning_System.
Wikipedia. (2008). "Kerberos (protocol)." from http://en.wikipedia.org/wiki/Kerberos_%28protocol%