World Wide Web (WWW) A Distributed Document-Based System


World Wide Web (WWW): A Distributed Document-Based System. Group E: Ricky Tong (D-A0-1611), Eddy Leong (D-A0-1623), Dick Lei (D-A0-1658). Schedule of presentation: Overview of World Wide Web, Document Model, HTML, DOM, XML, Document Type, MIME, Architectural Overview, Discussion Time.

Transcript of World Wide Web (WWW) A Distributed Document-Based System

Page 1: World Wide Web (WWW) A Distributed Document-Based System

World Wide Web (WWW): A Distributed Document-Based System

Group E

Ricky Tong (D-A0-1611)

Eddy Leong (D-A0-1623)

Dick Lei (D-A0-1658)

Page 2: World Wide Web (WWW) A Distributed Document-Based System

Schedule of Presentation

Overview of World Wide Web

Document Model

HTML

DOM

XML

Document Type

MIME

Architectural Overview

Discussion Time

Page 3: World Wide Web (WWW) A Distributed Document-Based System

The World Wide Web

The WWW is a document-based system.

It can be viewed as a huge distributed system consisting of millions of clients and servers for accessing linked documents.

Servers maintain collections of documents, while clients provide users with an easy-to-use interface for presenting and accessing those documents.

Page 4: World Wide Web (WWW) A Distributed Document-Based System

Overview of World Wide Web

Documents are stored as files on the servers.

Servers receive requests and send the requested files to the clients.

The client usually interacts with the web server through a browser.

Page 5: World Wide Web (WWW) A Distributed Document-Based System

The overall organization of the Web

Page 6: World Wide Web (WWW) A Distributed Document-Based System

Document Model

Some documents are represented as ASCII text files.

Some are expressed as a collection of scripts that run in the browser automatically.

Some contain references to other documents, such as hyperlinks.

When a hyperlink is followed, the new document may replace the current one or open in a new browser window.

Page 7: World Wide Web (WWW) A Distributed Document-Based System

HTML

Most web documents are expressed in HTML.

An HTML file contains small markup tags that tell the Web browser how to display the page.

An HTML file conventionally has an .htm or .html extension.

An HTML file can be created with a simple text editor.

Page 8: World Wide Web (WWW) A Distributed Document-Based System

Example of HTML

<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>
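To illustrate how a browser-like program reads the markup tags in the example page, here is a minimal sketch using Python's standard `html.parser` module. The class name `TagCollector` is an assumption made for this illustration; a real browser does far more than collect tags.

```python
from html.parser import HTMLParser

# The example page from the slide, as one string.
PAGE = (
    "<html><head><title>Title of page</title></head>"
    "<body>This is my first homepage. <b>This text is bold</b></body></html>"
)


class TagCollector(HTMLParser):
    """Records every start tag and the text found inside <b> elements."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.bold_text = []
        self._in_bold = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        # Only keep text that appears between <b> and </b>.
        if self._in_bold:
            self.bold_text.append(data)


collector = TagCollector()
collector.feed(PAGE)
collector.close()
print(collector.tags)
print(collector.bold_text)
```

The parser sees the tags in document order, which is the information a browser uses to decide how each piece of text should be displayed.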

Page 9: World Wide Web (WWW) A Distributed Document-Based System

Document Object Model

DOM provides a standard programming interface to parsed web documents.

The interface is specified in CORBA IDL.

The interface is used by the scripts embedded in a document.

Scripts can be used to inspect and modify the document that they are part of.
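As a rough illustration of inspecting and modifying a parsed document through a DOM interface, the sketch below uses Python's `xml.dom.minidom`, which implements a subset of the W3C DOM. The sample markup and the attribute values are assumptions chosen for the example; scripts in a browser would normally use the same kind of calls from JavaScript.

```python
from xml.dom.minidom import parseString

# A tiny, well-formed document (hypothetical content for illustration).
doc = parseString("<html><body><p id='msg'>Hello</p></body></html>")

# Inspect: locate the paragraph element and read its text node.
paragraph = doc.getElementsByTagName("p")[0]
original_text = paragraph.firstChild.data

# Modify: change the text node and add an attribute.
paragraph.firstChild.data = "Hello, DOM"
paragraph.setAttribute("class", "greeting")

serialized = doc.documentElement.toxml()
print(original_text)
print(serialized)
```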

Page 10: World Wide Web (WWW) A Distributed Document-Based System

XML (Extensible Markup Language)

XML is a meta-markup language providing a format for describing structured data.

This facilitates more precise declarations of content and more meaningful search results across multiple platforms.

Page 11: World Wide Web (WWW) A Distributed Document-Based System

XML Example

<?xml version="1.0" ?>
<?xml-stylesheet href="greeting.xsl" type="text/xsl"?>
<message>
  <greeting>Hi</greeting>
  <target>you all</target>
</message>
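Because XML describes structured data, a program can read the fields of the message directly. The following sketch parses the example with Python's standard `xml.etree.ElementTree`; the variable names are assumptions for illustration, and the stylesheet instruction is simply ignored by this parser.

```python
import xml.etree.ElementTree as ET

XML_DOC = """<?xml version="1.0" ?>
<?xml-stylesheet href="greeting.xsl" type="text/xsl"?>
<message>
  <greeting>Hi</greeting>
  <target>you all</target>
</message>"""

root = ET.fromstring(XML_DOC)

# Map each child element's tag to its text content.
fields = {child.tag: child.text for child in root}
print(root.tag, fields)
```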

Page 12: World Wide Web (WWW) A Distributed Document-Based System

Other Document Types

There are many types of documents besides HTML and XML:

Audio: .mp3

Image: .gif and .jpeg

Others: .pdf, etc.
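Web servers commonly map file extensions such as these to a content type so that the client knows how to handle the document. A minimal sketch with Python's standard `mimetypes` module follows; the file names and the fallback type are assumptions for illustration, and the exact mapping can vary slightly between systems.

```python
import mimetypes

# Hypothetical file names covering the extensions listed on the slide.
FILES = ["song.mp3", "photo.gif", "photo.jpeg", "paper.pdf", "index.html"]


def content_type(filename):
    """Guess a content type from the extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


for name in FILES:
    print(name, "->", content_type(name))
```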

Page 13: World Wide Web (WWW) A Distributed Document-Based System

MIME (Multipurpose Internet Mail Extensions)

It was originally developed to provide information on the content of a message body that was sent as part of E-mail.

It is a specification for enhancing the capabilities of standard Internet E-mail.

It offers a simple standardized way to represent and encode a wide variety of media types for transmission via Internet mail.

Page 14: World Wide Web (WWW) A Distributed Document-Based System

The 7 Content-Types Defined in MIME

Text - represents textual information

Image - transmits still images

Audio - transmits audio or voice data

Video - transmits video data or moving image data

Message - encapsulates an entire RFC 822 format message

Multipart - combines several body parts of possibly different types and subtypes

Application - transmits application or binary data
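The Multipart type from the list above can be sketched with Python's standard `email` package, which builds MIME messages. The subject, file names, and byte contents below are placeholders invented for the example, not real image or application data.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "MIME demo"

# Text part: becomes text/plain.
msg.set_content("Plain text body")

# Image part (placeholder bytes, not a valid picture).
msg.add_attachment(
    b"placeholder image bytes",
    maintype="image",
    subtype="png",
    filename="picture.png",
)

# Application part: arbitrary binary data.
msg.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type(), part_types)
```

Adding the first attachment converts the message into a multipart container whose body parts each carry their own content type.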

Page 15: World Wide Web (WWW) A Distributed Document-Based System

CGI (Common Gateway Interface)

It is a standard for interfacing external applications with information servers, such as HTTP or Web servers.

A CGI program is executed in real time, so it can output dynamic information.
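A CGI program receives request data through environment variables such as `QUERY_STRING` and writes a header block, a blank line, and a body to standard output. The sketch below shows that shape under simple assumptions: the function name `cgi_response` and the `name` query parameter are hypothetical, and a real deployment would be launched by the Web server rather than run by hand.

```python
import os
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build a complete CGI response: headers, blank line, then the body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # 'name' is a hypothetical form field; default to "world" if absent.
    name = query.get("name", ["world"])[0]
    body = f"<html><body>Hello, {escape(name)}!</body></html>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A Web server would set QUERY_STRING before starting this program.
    print(cgi_response(os.environ), end="")
```

Because the page is computed on every request, the output can differ each time, which is what makes the information dynamic.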

Page 16: World Wide Web (WWW) A Distributed Document-Based System

The principle of using server-side CGI programs

Page 17: World Wide Web (WWW) A Distributed Document-Based System

Server-side script

A server-side script is executed by the server when the document has been fetched locally.

Client-side using JavaScript

<script language="JavaScript">
<!--
// script code here
// -->
</script>

Server-side using ASP

<%
'
' script code here
'
%>

Page 18: World Wide Web (WWW) A Distributed Document-Based System

Client-side script

A client-side script is simply software designed to be run by the browser.

Page 19: World Wide Web (WWW) A Distributed Document-Based System

Applet

An applet is another method of passing precompiled programs to a client.

An applet is a small Java program embedded in an HTML page.

For security reasons, applets by default cannot read or write data on the client computer.

An applet can only be executed if the browser supports Java.

Page 20: World Wide Web (WWW) A Distributed Document-Based System

Servlet

A servlet is a precompiled program that is executed in the address space of the server.

Servlets are Java technology's answer to CGI programming.

They suit pages that must be generated on request, for example when:

The Web page is based on data submitted by the user.

The data change frequently.

Page 21: World Wide Web (WWW) A Distributed Document-Based System

Architectural details of a client and server in the Web

Page 22: World Wide Web (WWW) A Distributed Document-Based System

HTTP Connections

HTTP is a client-server protocol by which two machines can communicate over a TCP/IP connection.

HTTP is the protocol used for document exchange in the World Wide Web.

Nearly every interaction on the Web takes place as an HTTP transaction: a request from the client followed by a response from the server.

Page 23: World Wide Web (WWW) A Distributed Document-Based System

HTTP Headers

General Header Fields (used in both request and response messages)

Request Header Fields (used in request messages only)

Response Header Fields (used in response messages only)

Entity Header Fields (used in both request and response messages; they contain information about the entity-body of the message)
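The four groups above can be illustrated with a small lookup table. The sketch lists only a handful of field names per group, following the grouping used in the HTTP/1.1 specification; the table is deliberately incomplete, and the names `HEADER_CATEGORIES` and `classify_header` are assumptions for this example.

```python
# A partial table: a few well-known HTTP/1.1 fields per category.
HEADER_CATEGORIES = {
    "general": {"cache-control", "connection", "date", "pragma",
                "transfer-encoding", "via"},
    "request": {"accept", "accept-encoding", "accept-language", "host",
                "referer", "user-agent"},
    "response": {"server", "location", "etag", "retry-after",
                 "www-authenticate"},
    "entity": {"content-length", "content-type", "content-encoding",
               "expires", "last-modified"},
}


def classify_header(name):
    """Return the category of a header field; names are case-insensitive."""
    lowered = name.lower()
    for category, names in HEADER_CATEGORIES.items():
        if lowered in names:
            return category
    # Fields outside the table (for example Set-Cookie) are reported as such.
    return "extension"


for field in ["Host", "Content-Type", "Server", "Connection", "Set-Cookie"]:
    print(field, "->", classify_header(field))
```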

Page 24: World Wide Web (WWW) A Distributed Document-Based System

Request Header Example

GET /articles/news/today.asp HTTP/1.1
Accept: */*
Accept-Language: en-us
Connection: Keep-Alive
Host: localhost
Referer: http://localhost/links.asp
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Accept-Encoding: gzip, deflate
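The request above consists of a request line followed by one header field per line. A minimal parsing sketch, assuming CRLF line endings and no message body, is shown here; `parse_request` is a hypothetical helper, not part of any library, and it ignores details such as repeated or folded header fields.

```python
RAW_REQUEST = (
    "GET /articles/news/today.asp HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Accept-Language: en-us\r\n"
    "Connection: Keep-Alive\r\n"
    "Host: localhost\r\n"
    "Referer: http://localhost/links.asp\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into method, target, version and a header dict."""
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


method, target, version, headers = parse_request(RAW_REQUEST)
print(method, target, version)
print(headers["host"], headers["referer"])
```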

Page 25: World Wide Web (WWW) A Distributed Document-Based System

Response Header Example

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Thu, 13 Jul 2000 05:46:53 GMT
Content-Length: 2291
Content-Type: text/html
Set-Cookie: ASPSESSIONIDQQGGGNCG=LKLDFFKCINFLDMFHCBCBMFLJ; path=/
Cache-control: private
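A client reads the status line to learn the outcome and stores any cookie so that it can return it on later requests. The sketch below pulls those two pieces out of the example response using Python's standard `http.cookies` module; the variable names are assumptions for illustration.

```python
from http.cookies import SimpleCookie

STATUS_LINE = "HTTP/1.1 200 OK"
SET_COOKIE = "ASPSESSIONIDQQGGGNCG=LKLDFFKCINFLDMFHCBCBMFLJ; path=/"

# Status line: protocol version, numeric status code, reason phrase.
version, code, reason = STATUS_LINE.split(" ", 2)
status_code = int(code)

# Parse the Set-Cookie value into a name, a value and its attributes.
cookie = SimpleCookie()
cookie.load(SET_COOKIE)
session = cookie["ASPSESSIONIDQQGGGNCG"]

# The header a client would send back on its next request to this server.
cookie_header = f"Cookie: {session.key}={session.value}"
print(status_code, reason)
print(cookie_header, "(path:", session["path"] + ")")
```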

Page 26: World Wide Web (WWW) A Distributed Document-Based System

Web Server

A Web server uses the client/server model and the Web's Hypertext Transfer Protocol (HTTP) to deliver documents to clients.

Every computer on the Internet that hosts a Web site must run a Web server program.

Two leading Web servers are Apache, the most widely installed Web server, and Microsoft's Internet Information Server (IIS).

Page 27: World Wide Web (WWW) A Distributed Document-Based System

Apache Server

Page 28: World Wide Web (WWW) A Distributed Document-Based System

Processing HTTP Requests in Apache Server

1. Resolving the document reference to a local file name.

2. Client authentication.

3. Client access control.

4. Request access control.

5. MIME type determination of the response.

6. General phase for handling leftovers.

7. Transmission of the response.

8. Logging data on the processing of the request.
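The eight phases above can be pictured as an ordered chain of handlers that each receive the request in turn. The sketch below is only a simplified model written for this transcript, not Apache's actual module API: the document root, the handler names, and the dictionary used as a request record are all assumptions, and six of the phases are left as no-ops.

```python
import mimetypes
import posixpath

DOCUMENT_ROOT = "/var/www"  # hypothetical document root


def resolve_file_name(req):
    # Phase 1: map the document reference to a local file name.
    req["file"] = posixpath.join(DOCUMENT_ROOT, req["path"].lstrip("/"))


def determine_mime_type(req):
    # Phase 5: choose the MIME type of the response from the file name.
    guessed = mimetypes.guess_type(req["file"])[0]
    req["content_type"] = guessed or "application/octet-stream"


def no_op(req):
    # Placeholder for phases this sketch does not model.
    pass


PHASES = [
    ("resolve file name", resolve_file_name),
    ("client authentication", no_op),
    ("client access control", no_op),
    ("request access control", no_op),
    ("MIME type determination", determine_mime_type),
    ("handle leftovers", no_op),
    ("transmit response", no_op),
    ("log request", no_op),
]


def process(req):
    """Pass the request through every phase in order and record the trace."""
    for name, handler in PHASES:
        handler(req)
        req.setdefault("trace", []).append(name)
    return req


result = process({"path": "/index.html"})
print(result["file"], result["content_type"])
```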

Page 29: World Wide Web (WWW) A Distributed Document-Based System

Server Cluster

Page 30: World Wide Web (WWW) A Distributed Document-Based System

The principle of TCP handoff

Page 31: World Wide Web (WWW) A Distributed Document-Based System

Scalable content-aware cluster of web servers

Page 32: World Wide Web (WWW) A Distributed Document-Based System

Uniform Resource Identifiers (URI)

A URI (Uniform Resource Identifier) is a way to identify a point of content (a resource) on the Web.

The most common form of URI is the Web page address.

A URI typically describes:

The mechanism used to access the resource

The specific computer that the resource is housed on

The specific name of the resource (a file name) on the computer
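The three parts listed above can be seen by splitting a Web address with Python's standard `urllib.parse`. The address used here is the one from the request example earlier in the slides; the variable names are chosen for illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://localhost/articles/news/today.asp")

mechanism = parts.scheme    # how the resource is accessed
computer = parts.hostname   # the computer that houses the resource
resource = parts.path       # the name of the resource on that computer

print(mechanism, computer, resource)
```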

Page 33: World Wide Web (WWW) A Distributed Document-Based System

Uniform Resource Locator (URL)

A URL contains information on how and where to access a document.

Page 34: World Wide Web (WWW) A Distributed Document-Based System

Uniform Resource Name (URN)

A URN is a name for an Internet resource that has persistent significance.

A URN looks something like a Web page address or URL.

Example: urn:def://blue_laser

Both URN and URL are types of a concept called the URI.

The URN is still being developed by members of the Internet Engineering Task Force (IETF).

Page 35: World Wide Web (WWW) A Distributed Document-Based System

Web Distributed Authoring and Versioning (WebDAV)

WebDAV is an extension to HTTP.

WebDAV provides a simple means to lock a shared document, and to create, delete, copy, and move documents on remote Web servers.

WebDAV supports a simple locking mechanism.

There are two types of write locks: the exclusive write lock and the shared write lock.
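The difference between the two lock types can be sketched with a small in-memory lock table: an exclusive lock is granted only when nobody else holds a lock on the document, while several owners may hold shared locks at once. The class and method names are assumptions for this illustration and do not correspond to WebDAV's actual LOCK and UNLOCK request format.

```python
class WriteLocks:
    """Toy lock table: document name -> (scope, set of owners)."""

    def __init__(self):
        self._locks = {}

    def lock(self, document, owner, scope):
        """Try to take a write lock; return True if it was granted."""
        if scope not in ("exclusive", "shared"):
            raise ValueError("scope must be 'exclusive' or 'shared'")
        held = self._locks.get(document)
        if held is None:
            self._locks[document] = (scope, {owner})
            return True
        held_scope, owners = held
        # Shared locks are compatible only with other shared locks.
        if scope == "shared" and held_scope == "shared":
            owners.add(owner)
            return True
        return False

    def unlock(self, document, owner):
        """Release the owner's lock; drop the entry when no owner remains."""
        held = self._locks.get(document)
        if held is None:
            return
        held[1].discard(owner)
        if not held[1]:
            del self._locks[document]


locks = WriteLocks()
print(locks.lock("report.html", "ricky", "exclusive"))
print(locks.lock("report.html", "eddy", "shared"))
```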

Page 36: World Wide Web (WWW) A Distributed Document-Based System

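Web Proxy Caching

The simplest form of caching is the caching facility of the browser.

Web-proxy caching: a proxy caches documents on behalf of many clients.

Caches can also cover a region or even a country, which leads to hierarchical caching.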
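Hierarchical caching can be sketched as a chain of caches in which a miss is forwarded to the next level up and the answer is stored on the way back. The class below is a simplified model under stated assumptions: the level names, the `origin` fetch function, and the absence of expiry or validation are all choices made for this illustration.

```python
class Cache:
    """One level in a cache hierarchy (browser, site proxy, regional cache)."""

    def __init__(self, name, parent=None, origin=None):
        if parent is None and origin is None:
            raise ValueError("the top-level cache needs an origin fetch function")
        self.name = name
        self.parent = parent
        self.origin = origin
        self.store = {}

    def get(self, url):
        """Return (document, name of the level that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            document, source = self.parent.get(url)
        else:
            document, source = self.origin(url), "origin server"
        # Keep a copy so later requests are answered at this level.
        self.store[url] = document
        return document, source


def fetch_from_origin(url):
    # Stand-in for a real request to the Web server.
    return "contents of " + url


national = Cache("national cache", origin=fetch_from_origin)
site_proxy = Cache("site proxy", parent=national)
browser = Cache("browser", parent=site_proxy)

print(browser.get("/index.html"))
print(browser.get("/index.html"))
```

A second browser behind the same site proxy would be served by the proxy's copy, so the origin server is contacted only once.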

Page 37: World Wide Web (WWW) A Distributed Document-Based System

Neighbor Proxy Caching

Page 38: World Wide Web (WWW) A Distributed Document-Based System

Server Replication

Fault tolerance in the Web is mainly achieved through client-side caching and server replication.

High availability in the Web is achieved through redundancy that makes use of generally available techniques in crucial services such as DNS.

Page 39: World Wide Web (WWW) A Distributed Document-Based System

Security

Most of the security issues in the Web deal with setting up a secure channel between a client and server.

The predominant approach for setting up a secure channel in the Web is to use the Secure Sockets Layer (SSL).

Transport Layer Security (TLS) is an update of SSL.
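As a rough sketch of what setting up a secure channel involves on the client side, Python's standard `ssl` module can create a TLS context whose defaults require the server to present a valid certificate. No connection is opened here; the comments about wrapping a socket and loading a client certificate describe assumed next steps, and any host name or file paths used there would be placeholders.

```python
import ssl

# Client-side context with the library's recommended default settings.
context = ssl.create_default_context()

# By default the server's certificate must be valid and match its host name.
server_must_authenticate = context.verify_mode == ssl.CERT_REQUIRED
host_name_checked = context.check_hostname

# Next steps, not executed in this sketch:
#   context.wrap_socket(sock, server_hostname=host) would secure a TCP socket.
#   context.load_cert_chain(certfile, keyfile) would add a client certificate,
#   which is what mutual authentication needs.
print(server_must_authenticate, host_name_checked)
```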

Page 40: World Wide Web (WWW) A Distributed Document-Based System

The position of TLS in the Internet protocol stack

Page 41: World Wide Web (WWW) A Distributed Document-Based System

TLS with mutual authentication

Page 42: World Wide Web (WWW) A Distributed Document-Based System

The End