Development Process - KU
16.10.2012
IT Infrastructure for EC: The Internet, World Wide Web, WWW Technologies
The Development Process
• The development process occurs in phases.
• The output is PHP code and a MySQL database.
What should the site do? What should it look like?
Create the templates. Good design principles and performance
PHP acts as the glue between the user/browser
You cannot test your site too much!
Revisit all the legal and security issues
Frequent backups of your site’s data
Customer feedback, or changes in technologies
The Development Process in reality
The real design process is non-linear rather than a strict sequence of phases
IT Infrastructure in summary
The Internet
• Wikipedia: http://en.wikipedia.org/wiki/Internet
• a connection of computer networks using the Internet Protocol (IP)
• What's the difference between the Internet and the World Wide Web (WWW)?
– the Web is the collection of web sites and pages around the world
– the Internet is larger and also includes other services such as email, chat, online games, etc.
Brief history
• began as a US Department of Defense network called ARPANET (1960s‐70s)
• initial services: electronic mail, file transfer
• opened to commercial interests in the late 80s
• WWW created in 1989-91 by Tim Berners-Lee
• popular web browsers released: Netscape 1994, IE 1995
• Amazon.com opens in 1995; Google begins as a research project in January 1996 (company founded in 1998)
• Hamster Dance web page created in 1998
http://en.wikipedia.org/wiki/Hampster_Dance
Key aspects of the internet
• subnetworks can stand on their own
• computers can dynamically join and leave the network
• built on open standards; anyone can create a new internet device
• lack of centralized control (mostly)
• everyone can use it with simple, commonly available software
People and organizations
• Internet Engineering Task Force (IETF): internet protocol standards
• Internet Corporation for Assigned Names and Numbers (ICANN):
– decides top-level domain names
• World Wide Web Consortium (W3C): web standards
Layered architecture
The internet uses a layered hardware/software architecture (often described in terms of the "OSI model"; the five layers below follow the simplified TCP/IP view):
• physical layer : devices such as ethernet, coaxial cables, fiber-optic lines, modems
• data link layer : basic hardware protocols (ethernet, wifi, DSL, PPP)
• network / internet layer : basic software protocol (IP)
• transport layer : carries data between programs on top of the network layer (TCP adds reliability; UDP does not)
• application layer : implements specific communication for each kind of program (HTTP, POP3/IMAP, SSH, FTP)
Internet Protocol (IP)
• a simple protocol for attempting to send data between two computers
• each device has a 32-bit IP address (IPv4) written as four 8-bit numbers (0-255)
• find out your internet IP address: whatismyip.com
• find out your local IP address:
– in a terminal, type: ipconfig (Windows) or ifconfig (Mac/Linux)
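To make the "four 8-bit numbers" point concrete, here is a minimal Python sketch that packs a dotted address into a single 32-bit integer and checks the result against the standard `ipaddress` module. The helper name `to_int` is made up for this illustration; the sample address is the one from the DNS example in these notes.

```python
import ipaddress

def to_int(dotted: str) -> int:
    """Pack four 8-bit numbers (0-255) into one 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

addr = "128.208.3.88"
packed = to_int(addr)

# The standard library performs the same conversion.
assert packed == int(ipaddress.IPv4Address(addr))

# Show the integer and its 32-bit binary form.
print(packed, format(packed, "032b"))
```

Each of the four numbers occupies one byte of the integer, which is why no part may exceed 255.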
Transmission Control Protocol (TCP)
• adds multiplexing and reliable, ordered message delivery on top of IP
• multiplexing: multiple programs using the same IP address
– port: a number that uniquely identifies an application on a single computer, enabling several applications to share a single physical connection to the Internet
– port 80: web server (HTTP); port 443 for secure browsing (HTTPS)
– port 25: email (SMTP)
– port 22: ssh
– port 5190: AOL Instant Messenger
– more common ports: http://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers
• some programs (games, streaming media programs) use the simpler UDP protocol instead of TCP
http://en.wikipedia.org/wiki/User_Datagram_Protocol
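Port multiplexing can be tried out locally. The sketch below is only an illustration and assumes the loopback interface is available: it opens two TCP listeners on the same IP address (127.0.0.1) and lets the operating system pick two free ports, instead of well-known ones such as 80 or 25. The helper `round_trip` is a made-up name.

```python
import socket

# Two independent "services" listening on the same loopback IP address.
# Port 0 asks the operating system for any free port (an assumption made
# so the sketch needs no privileged ports).
server_a = socket.create_server(("127.0.0.1", 0))
server_b = socket.create_server(("127.0.0.1", 0))
port_a = server_a.getsockname()[1]
port_b = server_b.getsockname()[1]

def round_trip(server: socket.socket, port: int, message: bytes) -> bytes:
    """Connect to one port, send a message, return what that server received."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        conn, _peer = server.accept()
        with conn:
            client.sendall(message)
            return conn.recv(1024)

got_a = round_trip(server_a, port_a, b"hello A")
got_b = round_trip(server_b, port_b, b"hello B")
server_a.close()
server_b.close()

print(port_a, got_a, port_b, got_b)
```

Both listeners share one IP address; only the port number decides which of them receives a given message.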
Web servers and browsers
• web server: software that listens for web page requests
– Apache
– Microsoft Internet Information Server (IIS) (part of Windows)
• web browser: fetches/displays documents from web servers
– Mozilla Firefox
– Microsoft Internet Explorer (IE)
– Apple Safari
– Google Chrome
– Opera
Domain Name System (DNS)
• a set of servers that map written names to IP addresses
– Example: www.cs.washington.edu → 128.208.3.88
• many systems also keep a local table of name-to-address mappings called a hosts file, which is checked before asking a DNS server
– Windows: C:\Windows\system32\drivers\etc\hosts
– Mac: /private/etc/hosts
– Linux: /etc/hosts
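A hosts file is plain text: one IP address per line, followed by one or more names, with `#` starting a comment. The Python sketch below parses such text into a lookup table. The sample content and the function name `parse_hosts` are hypothetical, and keeping the first mapping seen for a name is a choice made for this sketch.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name found in hosts-file text to its IP address."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue  # skip blank or comment-only lines
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)  # keep the first mapping seen
    return table

# Hypothetical hosts file content; the second address is the one
# used in the DNS example above.
SAMPLE = """\
# sample hosts file
127.0.0.1     localhost
128.208.3.88  www.cs.washington.edu cs   # two names, one address
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.cs.washington.edu"])
```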
Uniform Resource Locator (URL)
• an identifier for the location of a document on a web site
• a basic URL: http://www.aw-bc.com/info/regesstepp/index.html
• upon entering this URL into the browser, it would:
– ask the DNS server for the IP address of www.aw-bc.com
– connect to that IP address at port 80
– ask the server to GET /info/regesstepp/index.html
– display the resulting page on the screen
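The steps above can be sketched in Python with the standard `urllib.parse` module. The URL is assembled from the host and path named in the steps. The function name `describe_fetch` and the returned dictionary layout are assumptions made for this illustration, and no network access takes place.

```python
from urllib.parse import urlsplit

def describe_fetch(url: str) -> dict:
    """Split a URL into the pieces a browser needs before contacting a server."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}[parts.scheme]
    return {
        "host": parts.hostname,                # name to look up via DNS
        "port": parts.port or default_port,    # explicit port, else the default
        "request_line": f"GET {parts.path or '/'} HTTP/1.1",
    }

plan = describe_fetch("http://www.aw-bc.com/info/regesstepp/index.html")
print(plan)
```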
More advanced URLs
Hypertext Transfer Protocol (HTTP)
• the set of commands understood by a web server and sent from a browser
• some HTTP commands (your browser sends these internally):
– GET filename : download
– POST filename : send a web form response
– PUT filename : upload
• simulating a browser with a terminal window:
http://en.wikipedia.org/wiki/Telnet
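What one would type into such a terminal session is plain text. The Python sketch below builds the bytes of a simple GET request and reads the status line of a reply. The host name, the canned reply, and the helper names `build_get` and `parse_status` are assumptions for illustration; nothing is sent over the network.

```python
def build_get(host: str, path: str) -> bytes:
    """Return the bytes a browser-like client would send for a simple GET."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_status(response: bytes) -> tuple[int, str]:
    """Extract the numeric status and reason phrase from a raw response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason

request = build_get("www.example.com", "/index.html")

# Canned reply standing in for a real server (hypothetical content).
reply = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"

print(request.decode("ascii"))
print(parse_status(reply))
```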
HTTP error codes
• when something goes wrong, the web server returns a special "error code" number to the browser, possibly followed by an HTML document
• common error codes:
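Python's standard `http.HTTPStatus` enumeration can print the official phrase for a status number. The selection of codes below is an assumption chosen for illustration, not a list taken from these slides, and `describe` and `is_error` are made-up helper names.

```python
from http import HTTPStatus

def describe(code: int) -> str:
    """Return the status number together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"

def is_error(code: int) -> bool:
    """Codes 400-499 are client errors, 500-599 are server errors."""
    return code >= 400

for code in (200, 301, 403, 404, 500):
    print(describe(code), "(error)" if is_error(code) else "")
```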
Internet media ("MIME") types
• sometimes when including resources in a page (style sheet, icon, multimedia object), we specify their type of data
• Lists of common MIME types:
http://en.wikipedia.org/wiki/Internet_media_type#List_of_common_media_types
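As a rough illustration, Python's standard `mimetypes` module guesses a media type from a file extension. The file names are hypothetical, and the fallback type `application/octet-stream` for unknown extensions is a choice made for this sketch.

```python
import mimetypes

# A MimeTypes object holding Python's default extension table.
table = mimetypes.MimeTypes()

def media_type(filename: str) -> str:
    """Guess the media type from the file extension, with a generic fallback."""
    guessed, _encoding = table.guess_type(filename)
    return guessed or "application/octet-stream"

for name in ("fluorescent.css", "index.html", "photo.png", "unknown.zzz"):
    print(name, "->", media_type(name))
```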
Web languages / technologies
• Hypertext Markup Language (HTML): used for writing web pages
• Cascading Style Sheets (CSS): stylistic info for web pages
• PHP: Hypertext Preprocessor (PHP): dynamically create pages on a web server
• Structured Query Language (SQL): interaction with databases
• JavaScript: interactive and programmable web pages
• Asynchronous JavaScript and XML (Ajax): accessing data for web applications
• eXtensible Markup Language (XML): metalanguage for organizing data
Client‐side: HTML, CSS, XML
HTML
• HyperText Markup Language
• Primary document type for the web
– Transmitted using the HyperText Transfer Protocol
• Client sends request string (with parameters)
• Server returns a document
– Stateless protocol
• Describes document content and structure
– Precise formatting directives added later
– Content and structure in same document
• Browser or formatter responsible for rendering
– Can partially render malformed documents
– Different browsers render differently
HTML evolution
• HTML 1 [early ‘90s]
– Invented by Tim Berners‐Lee of CERN
• Intended as a standard format to facilitate collaboration between physicists
– Based on the SGML framework
• Old ISO standard for structuring documents
– Tags for paragraphs, headings, lists, etc.
• HTML added the hyperlinks, thus creating the web
– Rendered on prototype formatters
HTML evolution
• HTML+ [mid ‘94]
– Defined by small group of researchers
– Several new tags
• Most notably, IMG for embedding images
– Many browsers
• First text-based browser (Lynx) released in 03/93
• First graphical browser (Mosaic) released in 04/93
• First W3 conference [5/94]
– HTML+ presented
HTML evolution
• HTML 2 [7/94‐11/95]
– Prompted by variety of diverging language variants and additions of different browsers
– Adds many widely used tags
• e.g., forms
– No custom style support
• e.g., no colors
• W3 consortium formed [Late 94]
– Mission: Open standards for the web
HTML evolution
• Netscape formed [11/94]
– Becomes immediate market leader
• Support for home users
– Forms a de‐facto standard
• Use of “Netscape proprietary tags”
– Difficult for other browsers to replicate
– Documents start rendering differently
• Addition of stylistic tags
– e.g., font color and size, backgrounds, image alignment
– Frowned upon by structure‐only advocates
HTML evolution
• HTML 3.0 draft proposed
– Huge language overhaul
• Tables, math, footnotes
• Support for style sheets (discussed later)
– Too difficult for browsers to adapt
• Every browser implemented a different subset
– But claimed to support the standard
» And added new tags…
• Standard abandoned
– Incremental changes from here on
HTML evolution
• Microsoft introduces Internet Explorer [8/95]
– First serious competition to Netscape
– Starts introducing its own tags
• e.g., MARQUEE
• Effectively splitting web sites into Microsoft and Netscape pages
– Many sites have two versions
• Microsoft starts supporting interactive application embedding with ActiveX
– Netscape responds with the emerging Java technology
– Starts supporting JavaScript
• Microsoft introduces VBScript
HTML evolution
• HTML 3.2 [1/97]
– Implements some of the HTML 3.0 proposals
• Essentially catches up with some widespread features.
– Supports applets
– Placeholders for scripting and stylesheet support
HTML evolution
• HTML 4 [12/97]
– Major overhaul
• Stylesheet support
• Tag identifier attribute
• Internationalization and bidirectional text
• Accessibility
• Frames and inline frames
• <object> tag for multimedia and embedded objects
– Adopted by IE (market leader)
• Slow adoption by Netscape
• XML 1.0 standard [2/98]
• XHTML 1.0 [1/00, 8/02]
Limitations of HTML
• No support for accessibility until HTML 4
• No support for internationalization until HTML 4
• No dynamic content in original definition
• No inherent support for different display configurations (e.g., grayscale screen)
– Except for the alt attribute for images
– Added in CSS2
• No separation of data, structure and formatting
– Until version 4
Wireless Markup Language (WML)
• Markup language for WAP browsers
– WAP = Wireless Application Protocol
– Based on limited HTML, uses XML notation
– Uses WMLScript scripting language, based on JavaScript
• A page is called a “deck”, displayed in individual sections called “cards”
– Tasks are used to perform events
– Variables used to maintain state between cards
Why CSS?
• HTML was not meant to support styling information
– But browsers started supporting inline style changes to make the web look better
• Inline styling information is problematic
– Difficult to change
– Lack of consistency
– No support for different display formats
– Bloats pages
– No support for some styling features
CSS
• HTML document typically refers to an external style sheet:
<HEAD>
  <LINK rel="stylesheet" type="text/css" href="fluorescent.css">
</HEAD>
• Style sheets can be embedded:
<HEAD>
  <STYLE type="text/css">
    <!-- …CSS DEFINITIONS.. -->
  </STYLE>
</HEAD>
XML
• Extensible Markup Language
– Based on SGML format
– Intended to facilitate data exchange
• Documents consist of tags and data
– Data is usually untyped characters
– Tags have names and attributes
• Document has tree structure
– Tags are nested
– Data areas are considered leaves
– One root
<?xml version="1.0"?>
<person>
  <name type="full">John Doe</name>
  <tel type="home">412-555-4444</tel>
  <tel type="work">412-268-5555</tel>
  <email>[email protected]</email>
</person>
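The tree structure can be explored with Python's standard `xml.etree.ElementTree` module. The sketch below parses a document of the same shape as the <person> example. The e-mail value is a hypothetical placeholder, not the one from the slide.

```python
import xml.etree.ElementTree as ET

# Same shape as the <person> document above; the e-mail address is a
# made-up placeholder.
DOC = """<?xml version="1.0"?>
<person>
  <name type="full">John Doe</name>
  <tel type="home">412-555-4444</tel>
  <tel type="work">412-268-5555</tel>
  <email>john.doe@example.com</email>
</person>"""

root = ET.fromstring(DOC)  # the single root element

# Tags are nested under the root; attributes and text hang off each tag.
phones = {tel.get("type"): tel.text for tel in root.findall("tel")}
child_tags = [child.tag for child in root]

print(root.tag, root.find("name").text, phones)
```

Here `root` is the one root element, its four children are the nested tags, and the text inside each child is a leaf.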
Client Side: Scripting Languages
JavaScript, VBScript, DHTML
JavaScript
• The most common scripting language
– Originally supported by Netscape, eventually by IE
• Typically embedded in HTML page
– Executable computer code within the HTML content
– Interpreted at runtime on the client side
• Can be used to dynamically manipulate an HTML document
– Has access to the document’s object model
– Can react to events
– Can be used to dynamically place data in the first place
– Often used to validate form data
• Weak typing
VBScript
• Microsoft’s answer to JavaScript
– Never been supported by Netscape
– Less in use now
• Use <script type="text/vbscript">
• Similar to JavaScript
– Follows Visual Basic look and feel
– Possible to declare variables
• Use “option explicit” to force declaration
– Separates procedures and functions
DHTML
• DHTML is a marketing buzzword
– It is not a W3C standard
– Every browser supports a different flavour
– It is HTML 4 + CSS stylesheets + scripting language with access to document model
Client Side: Embedding Interactive Content
Java Applets, ActiveX, .NET controls, Flash
Java Applets
• Precompiled Java programs which can run independently within a browser
– Main applet class inherits from java.applet.Applet
• Sandboxed by a variety of security measures and functional limitations
– Cannot load libraries or native methods
– Cannot read/write most files on host
– Most network connections are blocked
– Cannot start external programs
– Limited access to system properties
– Different window style
Server side: Scripting and low‐level languages
CGI, PHP
Common Gateway Interface (CGI)
• Standard interface allowing web server to delegate page creation to external programs
• Any programming language can be used
– Compile into executable binary
– Run scripts (e.g., perl) with an executable interpreter
• Arguments passed via environment variables
– QUERY_STRING
• Everything after the first ? symbol in the URL
– PATH_INFO, PATH_TRANSLATED
• Additional path information that follows the script name in the URL
• Document returned via standard output
– Should return content-type header
– Can refer to other document with Location
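A CGI program can be very small. The Python sketch below reads QUERY_STRING from the environment and writes a content-type header, a blank line, and an HTML document to standard output. The parameter `name` and the function `handle` are hypothetical, and the environment is passed in as a dictionary so the logic can be tried without a web server.

```python
import os
from html import escape
from urllib.parse import parse_qs

def handle(environ: dict) -> str:
    """Build a full CGI response (header, blank line, body) as text."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]  # "name" is a hypothetical parameter
    # Escape user input before placing it in HTML.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # A web server would set QUERY_STRING before launching this program.
    print(handle(dict(os.environ)), end="")
```

Requesting a URL ending in `?name=Ada` would make the server set QUERY_STRING to `name=Ada` before it starts the program.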
CGI Limitations
• Not appropriate for busy servers
– Each program instance is a separate process
• Security risks
– Only web‐master has install privileges
– Bad code can cause serious trouble
PHP
• Originally “Personal Home Page” tools
– Open-source language for server-side scripting
• Commercial 3rd party optimizers available
– Adopted in popular large-scale web applications
• phpBB bulletin board system
• Software behind wikis and Wikipedia
– Some standalone rich‐client applications
• Built‐in facilities for popular protocols and services
• Shifts towards OOP
• Requires special server support
– Web master must allow PHP scripts
Server side: High‐level languages
Java servlets and JSPs
Java Servlets
• Java analogue of a CGI script
– Servlet‐enabled server activates servlet
• A servlet can service multiple requests in its lifetime
– More efficient than creating separate processes
• User servlet implements Servlet interface
– init(ServletConfig config)
– service(ServletRequest req, ServletResponse res)
– destroy()
• Preferable to inherit from HttpServlet
• Filter infrastructure allows transformation of response
Architectures for Web Services
Overall Architecture
• UDDI
– Information on available web services
• WSDL
– A description of how to communicate using the web service
• SOAP
– Protocol for exchanging messages
Universal Description, Discovery, and Integration (UDDI)
• Platform‐independent XML registry
• Allows businesses to list services they provide
• Registration consists of:
• White pages info – real address and contact information
• Yellow pages info – industrial categorization
• Green pages info – technical information on exposed services
Web Services Description Language (WSDL)
• XML format for describing the public interface of web services
– Services are collections of abstract endpoints called “ports”
– Each port has a protocol (“binding”) and address
– Each port has a type that defines valid “operations”
– An operation consists of messages and data formats
• WSDL document describes:
– Data formats
– Valid messages
– Port types with their supported operations
– Binding of ports to types and addresses
– Services in terms of ports they provide and documentation
Simple Object Access Protocol (SOAP)
• Lightweight protocol for message exchange
– Enables “access” to objects, including RPCs
– Defines formats for requests, responses, errors
• XML based, runs on top of HTTP
• Optional header with information on
– Security requirements
– Routing
– Transactions
• Body contains actual data
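As a sketch of the header/body layout, the Python code below builds a SOAP 1.1 envelope with the standard `xml.etree.ElementTree` module. The application namespace, the `GetPrice` operation and the `Symbol` element are hypothetical; only the envelope namespace comes from SOAP 1.1 itself.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"                    # hypothetical application namespace

def build_envelope(symbol: str) -> bytes:
    """Return a SOAP message: Envelope containing an empty Header and a Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")         # optional; left empty here
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")    # carries the actual data
    request = ET.SubElement(body, f"{{{APP_NS}}}GetPrice")  # hypothetical operation
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)

message = build_envelope("IBM")
print(message.decode("utf-8"))
```

In practice such a message would be sent as the body of an HTTP POST request to the address given in the service's WSDL description.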