Web Technologies - Alexandru Ioan Cuza Universitybusaco/teach/courses/web/presentations/...HTTP/1.1...
Dr. Sabin Buraga
profs.info.uaic.ro/~busaco/
Web Technologies
Web programming (I)
HTTP protocol
cookies & sessions
“There are 2 ways to write error-free programs; only the third one works.”
Alan Perlis
What does the Web mean?
World Wide Web
an information space containing elements (things) of interest, called resources,
denoted by global identifiers – URI/IRI
details at www.w3.org/TR/webarch/ – W3C Recommendation (2004)
Web resources
Aspects of interest
identification: URI/IRI
interaction: protocol (HTTP)
representation by using data formats: markup language(s)
How about the interaction between client(s) and Web server(s)?
HTTP
HyperText Transfer Protocol
based on TCP/IP
HTTP
situated on the application layer
access control to the data transmission medium (MAC – Medium Access Control)
network interconnection + data routing (IP – Internet Protocol)
reliable transport via sockets (TCP – Transmission Control Protocol)
hypertext/hypermedia transfer (HTTP – HyperText Transfer Protocol)
HTTP
HyperText Transfer Protocol
a reliable request/response protocol
standard access port: 80
HTTP
HTTP/1.1
Internet standard: RFC 2616 (1999)
from 2014, defined by RFC 7230 to RFC 7235
www.w3.org/Protocols/
http://devdocs.io/http/
HTTP
HTTP/2
RFC 7540 (2015)
focused on performance
http://royal.pingdom.com/2015/06/11/http2-new-protocol/
http://http2.github.io/
HTTP: architecture
Web Server
daemon – “attendant spirit”
Web Client
browser, Web bot (crawler), multimedia player,…
HTTP: architecture
Web Server: Apache, Internet Information Services, Lighttpd, Nginx,…
Web Client: Mosaic, Netscape, Mozilla, Firefox, Internet Explorer, Chromium, wget, iTunes, Echofon, etc.
details in “Web browser architecture” presentation: http://profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week2
HTTP
Request and response: accessing – possibly, changing – a resource representation by using its URI
[diagram: the Web Client sends a request to the Web Server, which returns a response]
HTTP: concepts
Message
base unit of the HTTP communication (request or response)
HTTP: concepts
Intermediary
proxy, gateway, tunnel
HTTP: concepts
Proxy: located in the client/server proximity, having the role of both server and client
[diagram: Web Client, proxy, Web Server]
HTTP: concepts
Proxy
forward proxy: intermediary for a group of clients; acts on behalf of clients
reverse proxy: intermediary for a group of servers
advanced
HTTP: concepts
Gateway: intermediary hiding the target (origin) server
the client has no knowledge about it
[diagram: Web Client, Web Gateway, two Web Servers]
HTTP: concepts
Gateway
can provide:
traffic distribution across servers – load balancing
short-term data storage – caching
message or request translation (e.g., between HTTPS and HTTP)
other negotiation operations – role of mediator/broker
open source solutions: HAProxy, Squid, Varnish
cloud-based: Amazon ELB (Elastic Load Balancing)
advanced
HTTP: concepts
Tunnel
retransmits – usually, encrypted – HTTP messages
context: HTTPS protocol – to assure a “secure” HTTP communication via TLS (Transport Layer Security)
authentication based on digital certificates + bidirectional data encryption
HTTP: concepts
[figure: details about an HTTPS connection]
advanced
HTTP: concepts
Cache
local storage area – in memory or on disk – for the messages (data)
server- and/or client-side
future requests for that data can be served faster
context: Web application performance
HTTP: messages
HTTP message = header + body
HTTP: messages
Header
includes a set of fields
field-name ":" [ field-value ] CRLF
CR = Carriage Return \r – code 13
LF = Line Feed \n – code 10
HTTP: messages
HTTP request
Method Request-URI ProtocolVersion CRLF
[ Message-header ] [ CRLF MIME-data ]
GET /~busaco/teach/courses/web/ HTTP/1.1 CRLF
Host: profs.info.uaic.ro
HTTP: messages
HTTP response
HTTP-version Digit Digit Digit Reason
CRLF Content
HTTP/1.1 200 OK CRLF …
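The two message layouts above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: no message body is sent, header names are compared case-insensitively, and the function names build_request and parse_response_head are made up for this example.

```python
CRLF = "\r\n"


def build_request(method, request_uri, host, extra_headers=None):
    """Return the text of an HTTP/1.1 request without a body."""
    lines = [f"{method} {request_uri} HTTP/1.1", f"Host: {host}"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line (a bare CRLF) ends the header section.
    return CRLF.join(lines) + CRLF + CRLF


def parse_response_head(text):
    """Split a response into (version, status code, reason, header fields)."""
    head, _, _body = text.partition(CRLF + CRLF)
    status_line, *field_lines = head.split(CRLF)
    version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in field_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers


request = build_request("GET", "/~busaco/teach/courses/web/", "profs.info.uaic.ro")
print(repr(request))
```

Printing the request with repr() makes the CRLF terminators visible, including the empty line that closes the header section.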
HTTP: methods
GET
request – performed by a client – to access a resource representation
HTML document, CSS stylesheet, image in PNG format, vector illustration as SVG,
JavaScript program, Atom or RSS (XML) news feed, PDF presentation, JSON data,…
HTTP: methods
HEAD
similar to GET, but the response carries only meta-data (header fields, without a body)
e.g., MIME type of a resource, last update,…
HTTP: methods
PUT
updates a resource representation or, possibly, creates a resource on the Web server
HTTP: methods
POST
creates a resource, usually sending entities (data, actions) to the server
e.g., data entered into the fields of a Web form
HTTP: methods
DELETE
erases a resource – its representation – from the server
HTTP: methods
Remark
traditionally, the Web browser only permits the use of GET and POST methods
HTTP: methods
A method is considered safe if it does not modify the server state
i.e. no side-effect actions are performed on the server
GET and HEAD are safe
POST, PUT and DELETE are not safe
advanced
HTTP: methods
A method is considered idempotent when repeating the same request many times leaves the server in the same state as issuing it once
GET, HEAD, PUT and DELETE are idempotent
POST is not idempotent
advanced
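The difference between safe, idempotent and non-idempotent methods can be illustrated with a toy in-memory store. This is a sketch only: ResourceStore is a hypothetical class, it does not speak HTTP, and it merely mimics the effect each method has on server state.

```python
class ResourceStore:
    """Toy in-memory 'server' keyed by URI (hypothetical, for illustration)."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def get(self, uri):
        # Safe: reads the state and never changes it.
        return self.resources.get(uri)

    def put(self, uri, representation):
        # Idempotent: repeating the call leaves the same state behind.
        self.resources[uri] = representation

    def delete(self, uri):
        # Idempotent: after the first call the resource simply stays absent.
        self.resources.pop(uri, None)

    def post(self, collection_uri, representation):
        # Not idempotent: every call creates one more resource.
        uri = f"{collection_uri}/{self.next_id}"
        self.next_id += 1
        self.resources[uri] = representation
        return uri


store = ResourceStore()
store.put("/notes/a", "hello")
store.put("/notes/a", "hello")   # same state as after the first PUT
store.post("/notes", "hello")
store.post("/notes", "hello")    # a second, distinct resource appears
print(sorted(store.resources))
```

Two identical PUT calls leave one resource, while two identical POST calls leave two, which is why a client can retry PUT or DELETE after a network failure but should be careful when retrying POST.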
HTTP: resource representations
Character set encodings
ISO-8859-1
ISO-8859-2
KOI8-R
ISO-2022-JP
UTF-8
UTF-16 Little Endian
…
HTTP: resource representations
Message (content) encodings
compression, identity and/or integrity
in most cases: gzip
HTTP: resource representations
Representation formats
text: HTML, CSS, plain text, JavaScript code, XML document
or
binary: image (JPEG, PNG), PDF document, multimedia resource
HTTP: resource representations
Types of the resource content
media types
HTTP: header fields (attributes)
Content-Type
permits the transfer of any kind of data
Content-Type: type/subtype
specified by Media Types – MIME (Multipurpose Internet Mail Extensions)
denotes a set of primary content types + additional sub-types
initially, used in the e-mail context
HTTP: header fields (attributes)
Primary types
text indicates textual formats
text/plain – unformatted text
text/html – HTML document
text/css – CSS (Cascading Style Sheets) resource
image specifies graphical formats
image/gif – GIF (Graphics Interchange Format) images
image/jpeg – JPEG (Joint Photographic Experts Group) photos
image/png – PNG (Portable Network Graphics) pictures
audio denotes audio content
audio/mpeg – resource encoded in MP3 format; specification for audio data according to the MPEG (Moving Picture Experts Group) standard
audio/ac3 – compressed audio resource conforming to the AC-3 standard
video defines video content: animations, films
video/h264 – resource in H.264 format
video/ogg – content encoded in OGG open format
application signifies formats that can be processed by applications on the client-side
application/javascript – JavaScript program
application/json – JSON (JavaScript Object Notation) data
application/octet-stream – stream of arbitrary bytes
multipart used to transfer composed data
multipart/mixed – mixed content
multipart/alternative – alternative contents
e.g., different qualities of multimedia streams
N. Freed et al., Media Types (February 2017)
http://www.iana.org/assignments/media-types/media-types.xhtml
Name            Media type                  Description
calendar+json   application/calendar+json   Calendar in JSON format
csv             text/csv                    CSV data
opus            audio/opus                  Opus audio resource
msword          application/msword          Word (MS Office) document
tiff            image/tiff                  Image in TIFF format
vnd.rar         application/vnd.rar         RAR archive
zip             application/zip             ZIP archive
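A Content-Type value has the shape type/subtype, optionally followed by parameters such as charset. The sketch below splits such a value into its parts. It is a simplification: parse_content_type is a made-up helper and it does not handle quoted parameter values that themselves contain a semicolon.

```python
def parse_content_type(value):
    """Split 'type/subtype; name=value' into (type, subtype, parameters)."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    primary, _, subtype = media_type.partition("/")
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing ';'
        name, _, val = raw.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    # Type and subtype names are case-insensitive, so normalise them.
    return primary.lower(), subtype.lower(), params


print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("application/calendar+json"))
```

A server or client would typically branch on the primary type first (text, image, audio, video, application, multipart) and only then look at the subtype.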
HTTP: header fields (attributes)
Location
Location ":" "http(s)://" authority [ ":" port ] [ abs_path ]
redirects the client to another resource representation (HTTP redirect)
Location: http://somewhere.info:8080/moved.html
HTTP: header fields (attributes)
Referer
denotes the URI of a Web resource that refers to the current resource
used to know the URI source of the requests to a given document (i.e. back-links)
for analytics, logging, caching,…
HTTP: header fields (attributes)
Host
specifies the target address – IP or symbolic domain – of the machine supposed to provide
a requested resource
HTTP: header fields (attributes)
Other existing fields concern the following:
accepted content (content negotiation) – e.g., Accept
authentication & authorization – WWW-Authenticate, Authorization
conditional access to resources – If-Match, If-Modified-Since,…
caching policies – Cache-Control, Expires, ETag, etc.
proxy – Proxy-Authenticate, Proxy-Authorization, Via
…and others
www.iana.org/assignments/message-headers/message-headers.xhtml
advanced
HTTP: status
Informational (1xx)
100 Continue, 101 Switching Protocols
switching protocols: from HTTP to WebSocket (RFC 6455)
HTTP: status
Success (2xx)
200 OK, 201 Created, 202 Accepted, 204 No Content, 206 Partial Content
OPTIONS – method to determine server capabilities or requirements for a resource
HTTP: status
Redirection (3xx)
300 Multiple Choices, 301 Moved Permanently, 302 Found, 303 See Other, 304 Not Modified, 305 Use Proxy
HTTP: status
Client Error (4xx)
400 Bad Request, 401 Unauthorized, 403 Forbidden,
405 Method Not Allowed, 408 Request Timeout,
414 Request-URI Too Long
HTTP: status
Server Error (5xx)
500 Internal Server Error, 502 Bad Gateway,
503 Service Unavailable, 504 Gateway Timeout
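The first digit of a status code gives its class, as in the five groups listed above. A small sketch using Python's standard http.HTTPStatus enumeration follows; describe_status and STATUS_CLASSES are names invented for this example.

```python
from http import HTTPStatus

# Class names as used on the slides, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return (class name, reason phrase) for a numeric status code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        reason = HTTPStatus(code).phrase
    except ValueError:
        reason = ""  # code not registered in the standard library
    return status_class, reason


for code in (101, 200, 304, 404, 503):
    print(code, *describe_status(code))
```

A client that meets an unregistered code can still act on its class alone, which is why the lookup of the class does not depend on the reason phrase being known.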
HTTP: logging
Requests sent to a Web server are logged
Common Log Format
standardized text file format
for Apache HTTP Server: mod_log_config module
c12.uaic.ro - msi2013 [13/Feb/2014:14:53:14 +0200] "GET /~vidrascu/MasterSI2/note/Restanta.pdf HTTP/1.1" 206 25227 "http://profs.info.uaic.ro/~vidrascu/MasterSI2/index.html" "...Firefox/27.0"
82-137-8-231.rdsnet.ro - - [13/Feb/2014:15:38:23 +0200] "POST /~computernetworks/login.php HTTP/1.1" 302 1115 "http://profs.info.uaic.ro/~computernetworks/login.php" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0"
ec2-23-21-0-202.compute-1.amazonaws.com - - [13/Feb/2014:15:48:29 +0200] "GET /~busaco/teach/courses/web/presentations/web01ArhitecturaWeb.pdf HTTP/1.1" 200 2081804 "-" "HTTP_Request2/2.2.0 (http://pear.php.net/package/http_request2)..."
199.16.156.126 - - [13/Feb/2014:15:58:58 +0200] "GET /robots.txt HTTP/1.1" 404 182 "-" "Twitterbot/1.0"
psihologie-c-113.psih.uaic.ro - - [13/Feb/2014:16:03:04 +0200] "GET /~busaco/ HTTP/1.1" 200 1942 "-" "Mozilla/5.0 (X11; Linux x86_64; ...) Firefox/27.0"
psihologie-c-113.psih.uaic.ro - - [13/Feb/2014:16:03:04 +0200] "GET /~busaco/csb.css HTTP/1.1" 200 852 "http://profs.info.uaic.ro/~busaco/" "Mozilla/5.0 (X11; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0"
proxy-220-255-2-224.singnet.com.sg - - [13/Feb/2014:16:23:23 +0200] "GET /favicon.ico HTTP/1.1" 200 1406 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; ...)"
c2.uaic.ro - - [13/Feb/2014:16:33:43 +0200] "GET /~busaco/teach/courses/web/ HTTP/1.1" 304 - "-" "... Chrome/32.0.1700.107..."
220.181.51.219 - - [13/Feb/2014:19:20:20 +0200] "HEAD /%7Ebusaco/music/09.Sabin%20Buraga%20-...mp3 HTTP/1.0" 200 - "-" "NSPlayer/10.0.0.4072 WMFSDK/10.0"
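Log entries like the ones above can be taken apart with a regular expression. The sketch below assumes the layout visible in the samples: the Common Log Format fields followed by two optional quoted fields (referrer and user agent), which Apache documents as its combined variant. LOG_LINE and parse_log_line are names chosen for this example.

```python
import re

# host ident user [time] "method path version" status size "referer" "agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # "-" means that no body was sent (for example with 304 Not Modified).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


SAMPLE = ('199.16.156.126 - - [13/Feb/2014:15:58:58 +0200] '
          '"GET /robots.txt HTTP/1.1" 404 182 "-" "Twitterbot/1.0"')
print(parse_log_line(SAMPLE))
```

With the entries parsed, questions such as "which paths returned 404" or "which user agents are bots" become simple filters over a list of dictionaries.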
HTTP: request – example
GET /~busaco/teach/courses/web/web-film.html HTTP/1.1
Host: profs.info.uaic.ro
User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 10_1_1 like Mac OS X) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0 Mobile/14B100 Safari/602.1
Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8
Accept-Language: en-us, en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Referer: http://profs.info.uaic.ro/~busaco/teach/courses/web/
HTTP: response – example
[header fields (meta-data)]
HTTP/1.1 200 OK
Date: Mon, 27 Feb 2017 15:18:01 GMT
Server: Apache
Last-Modified: Mon, 27 Feb 2017 07:46:02 GMT
Content-Encoding: gzip
Content-Length: 11064
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
[content]
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="ro" xml:lang="ro">
…
</html>
advanced
online inspection of HTTP messages with www.hurl.it
advanced
X- fields are not standardized
expires in the past (not stored in cache)
actual content (Atom feed)
processed by client
HTTP: APIs (libraries)
cURL + libcurl (C, Java, Haskell, .NET, PHP, Ruby,…) – http://curl.haxx.se/
Apache HttpComponents (Java) – http://hc.apache.org/
httplib (Python 2) + http.client (Python 3)
neon (C library) – http://www.webdav.org/neon/
WinHTTP (Windows specific: C/C++) – http://tinyurl.com/6eemqqc
advanced
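As an illustration of one of the libraries listed, the sketch below uses Python 3's http.client against a throwaway server from http.server running on the local machine, so that no external site is contacted. HelloHandler and fetch_from_local_server are names invented for this example; a real client would connect to an existing server instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Minimal handler (hypothetical) answering every GET with the same page."""

    def do_GET(self):
        body = b"<p>Hello, world!</p>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet; a real server would log the request


def fetch_from_local_server(path="/"):
    """Start the local server, send one GET, return (status, type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch_from_local_server())
```

The client side is the three calls HTTPConnection, request and getresponse; everything else in the block exists only to have something to talk to.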
HTTP: client-side tools
Google Chrome Developer Tools – https://developers.google.com/web/tools/chrome-devtools/
Firefox Developer Tools – https://developer.mozilla.org/docs/Tools
Fiddler – a free Web debugging proxy – www.telerik.com/fiddler
advanced
(instead of) break
How about the Web server architecture?
HTTP: Web server
Fulfills multiple requests from the clients respecting the HTTP protocol
each request is considered independent of the others, even when issued by the same Web client
the connection state is not kept – stateless
HTTP: Web server
Traditionally, the Web server implementation
is either pre-forked or pre-threaded
on initialization, a number of child processes or threads are created, each process/thread interacting with a distinct client
advanced
HTTP: Web server
advanced
http://strongloop.com/strongblog/node-js-is-faster-than-java/
HTTP: Web server
Server behavior can be controlled by various configuration parameters (directives)
advanced
HTTP: Web server
Case study: Apache HTTP Server configuration (from April 1996, the most popular Web server)
http://httpd.apache.org/
global configuration: httpd.conf file
6 httpd instances are created by default
a user specific configuration (per directory/URI) is defined via .htaccess – see also https://github.com/phanan/htaccess
advanced
HTTP: Web server
Case study: Apache HTTP Server configuration
possibility to define virtual hosts – virtual hosting: the same server can host (run) multiple Web sites, with different symbolic domain names
advanced
Apache server: request processing
advanced
[diagram: request processing loop – HTTP request, post-read-request, IRI translation, header parsing, access control, authentication, authorization, media type checker, response (data to the client), log, cleanup]
HTTP: Web server
Usually, the Web server architecture is modular
kernel (core) +
modules implementing specific functionalities
provides a C language-based API (Application Programming Interface) to create modules
examples (Apache): mod_auth_basic, mod_cache, mod_deflate, mod_include, mod_proxy, mod_session, mod_ssl
advanced
HTTP: Web server
Another approach: asynchronous (non-blocking) single-threaded strategies
reference examples: nginx, Node.js
advanced
How can we develop the back-end of Web applications?
necessity
Dynamic generation – on the server – of representations of resources
requested by clients
solutions
CGI – Common Gateway Interface
Web application servers
Web frameworks
solution: cgi
Language-independent programming interface facilitating the interaction between clients and programs invoked on the Web server
de facto standard
RFC 3875 – http://www.w3.org/CGI/
cgi
A CGI program (script) is invoked on server
directly
e.g., retrieving data from a Web form after the submit button is pressed
indirectly
example: at each visit a new ad (e.g., banner) is generated
cgi
CGI scripts can be written in any language available on the server
interpreted languages: bash, Perl (e.g., the CGI.pm module), Python, Ruby,...
compiled languages: C, C++, etc.
cgi: programming
Each CGI program will write data – the representation of a Web resource –
to standard output (stdout)
cgi: programming
To denote the type of generated representation, HTTP headers are used – MIME (Media Types)
example: Content-type: text/html
cgi: programming
Interaction between the client and Web server
[diagram: the Web Client sends a request to the Web Server; the server invokes the script and returns the response]
cgi: variables
A CGI script has access to environment variables
associated to the request sent to the CGI program:
REQUEST_METHOD – HTTP method (GET, POST,…)
QUERY_STRING – data transmitted by the client (the part of the URL after "?")
REMOTE_HOST, REMOTE_ADDR – client address
CONTENT_TYPE – content type as MIME (Media Type)
CONTENT_LENGTH – content length in bytes
cgi: variables
Additional variables, usually generated by the Web server:
HTTP_ACCEPT – MIME types accepted by client (browser)
HTTP_COOKIE – data about cookies
HTTP_HOST – host name given in the Host field of the request (the server being addressed)
HTTP_USER_AGENT – information about the client
…and others
a result received by the Web client after the invocation via GET on the Web server of the variabile.cgi script (having read & execution rights)
#!/bin/bash
# Setting the content type
echo "Content-type: text/plain"; echo
# Executing 'set' command in Linux
# to show environment variables
set
CGI programs written in C, Python, bash generating the same HTML content
advanced

/* hello.c
   (compile with gcc hello.c -o hello.cgi) */
#include <stdio.h>
int main(void) {
    int msgs; /* number of messages */
    /* the empty line ends the header section */
    printf("Content-type: text/html\n\n");
    for (msgs = 0; msgs < 10; msgs++) {
        printf("<p>Hello, world!</p>");
    }
    return 0;
}

#!/usr/bin/python3
# hello.py.cgi
# the trailing \n plus print's own newline produce the empty line after the header
print("Content-type: text/html\n")
for messages in range(0, 10):
    print("<p>Hello, world!</p>")

#!/bin/bash
# hello.sh.cgi
echo "Content-type: text/html"
echo
MESSAGES=0
while [ $MESSAGES -lt 10 ]
do
    echo "<p>Hello, world!</p>"
    let MESSAGES=MESSAGES+1
done
cgi: invocation
experimenting with other MIME types, the browser displays the following:
[screenshots: the same output rendered for Content-type: text/plain and for Content-type: text/xml]
advanced
cgi: invocation
invocation from an interactive Web form; in this case, using the GET method
<form action="http://profs.info.uaic.ro/~.../get-max.cgi" method="GET">
<p>Enter two numbers: <input type="text" name="no1" /> <input type="text" name="no2" /></p>
<input type="submit" value="Compute maximum" />
</form>
cgi: invocation
special URL in GET case
cgi: invocation
For each form field, a field_name=value pair – delimited by & – is generated and added to the URL
of the CGI script to be invoked on server
http://profs.info.uaic.ro/~busaco/cgi/get-max.cgi?no1=7&no2=4
cgi: invocation
Real-life examples:
http://usabilitygeek.com/?s=web+design
https://www.youtube.com/watch?v=hEzmy93zr0Y#t=540
https://twitter.com/search?q=web%20development&src=typd
https://developer.mozilla.org/search?q=ajax&topic=apps
these URLs are encoded – URL encoding (see first lecture)
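The field_name=value pairs and their URL encoding can be reproduced with Python's urllib.parse module. The sketch below rebuilds the get-max.cgi URL shown earlier; build_get_url is a helper name invented for this example.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_get_url(script_url, fields):
    """Append URL-encoded field=value pairs, delimited by &, to a script URL."""
    return script_url + "?" + urlencode(fields)


url = build_get_url("http://profs.info.uaic.ro/~busaco/cgi/get-max.cgi",
                    {"no1": 7, "no2": 4})
print(url)

# Spaces become "+" in form encoding and "%20" in generic percent-encoding.
print(urlencode({"s": "web design"}))
print(quote("web development"))

# The receiving side can recover the pairs from the query component.
print(parse_qs(urlsplit(url).query))
```

Both spellings of the space character appear in the real-life URLs listed above, which is why a decoder has to accept either one inside a query string.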
cgi: invocation
The server will invoke a CGI script passing the data at standard input (stdin)
or via environment variables
cgi: invocation
Data processing when GET method is used
data available in QUERY_STRING variable
cgi: invocation
Data processing when POST method is used
data read from stdin, the length in bytes being specified by CONTENT_LENGTH variable
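The two cases (GET via QUERY_STRING, POST via stdin and CONTENT_LENGTH) can be sketched as one Python function. This is an assumption-laden illustration of what a script such as get-max.cgi might do, not its actual source; read_form_data and get_max are made-up names, the body is assumed to be URL-encoded ASCII text, and the environment and input stream are passed in as parameters so the logic can be tried without a Web server.

```python
import io
import os
import sys
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return form fields as {name: [values]} for a GET or POST invocation."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # Read exactly CONTENT_LENGTH characters; the server does not send EOF.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


def get_max(environ, stdin):
    """Compute the maximum of the form fields no1 and no2."""
    fields = read_form_data(environ, stdin)
    return max(int(fields["no1"][0]), int(fields["no2"][0]))


# Simulated invocations (no Web server involved).
print(get_max({"REQUEST_METHOD": "GET", "QUERY_STRING": "no1=7&no2=4"}, io.StringIO("")))
print(get_max({"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "11"}, io.StringIO("no1=3&no2=9")))

# When really run as a CGI program, the server provides the environment.
if "REQUEST_METHOD" in os.environ:
    print("Content-Type: text/plain")
    print()
    print(get_max(os.environ, sys.stdin))
```

Keeping the parsing separate from the real os.environ and sys.stdin is a design choice made here for testability; a production script would also validate the fields instead of assuming they are present and numeric.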
cgi: invocation
Data processing – GET and/or POST
in case of application servers or frameworks, data is encapsulated into specific structures/types
ASP.NET (C#) – HttpRequest class
PHP – associative arrays: $_GET[] $_POST[] $_REQUEST[]
Play (Java, Scala) – play.api.mvc.Request
Node.js (JavaScript) – http.IncomingMessage
advanced
GET vs. POST
GET method is used to generate the representations of the requested resources
e.g., HTML documents, JPEG images, Atom/RSS news feeds, ZIP archives, etc.
the server state should not be modified
obtaining data with GET, the user can set a bookmark for further accesses to the Web resource
(by using the URL of the generated representation)
e.g., https://duckduckgo.com/?q=web+programming&ia=videos
GET vs. POST
POST method is used when the data transmitted to the server is large (e.g., upload of file content)
or sensitive – typically, passwords (the data travels in the message body instead of the URL; confidentiality still requires HTTPS)
plus, when the script invocation can produce a state change on the server:
adding a record, altering a file,...
cgi: support
Web server should support CGI script invocation
example: Apache HTTP Server provides the mod_cgi module
advanced
cgi: ssi
CGI scripts could be directly invoked from an HTML document via SSI (Server Side Includes)
http://www.ssi-developer.net/ssi/
Apache: http://httpd.apache.org/docs/trunk/howto/ssi.html
Nginx: http://nginx.org/en/docs/http/ngx_http_ssi_module.html
advanced
cgi: fastcgi
FastCGI: an alternative to CGI focused on performance
implementations:
Apache – https://httpd.apache.org/mod_fcgid/
Nginx – nginx.org/en/docs/http/ngx_http_fastcgi_module.html
advanced
How can the data transmitted by the back-end of a Web application be stored – temporarily – on the front-end (browser)?
cookies
A script running on a Web server can put data on the client-computer via the user’s Web browser
subsequently, the browser will return that data to the same script available on the same server
cookies
A (quasi-)persistent way to store data on the machine of a Web client in order to be
further accessed by a program running on a server
cookies: usages
Storing user preferences
typical examples: options regarding interaction – visual theme (e.g., colors), language preferences, etc.
geographical location, shopping interests
…
cookies: usages
Automatic form completion
using already entered values for certain fields
cookies: usages
Monitoring the access to a Web resource
aspect of interest: Web analytics
collecting information about clients (hardware platform, browser, screen resolution, etc.)
aspect of interest: user tracking
monitoring the user behavior
Do Not Track initiative – http://donottrack.us/
cookies: usages
Storing the authentication info
e.g., keeping data about the user account in the e-commerce context
cookies: usages
Transaction status
e.g., current state of the virtual shopping cart provided by an e-shop application
cookies: usages
Web session management
cookies: types
Persistent cookies
not destroyed when Web browser closes
kept in a file – client-side
time-to-live set by the cookie creator
cookies: types
Non-persistent (volatile) cookies
disappear when the browser is closed
cookies
A cookie can be considered as a variable
its value is transferred via HTTP between the Web server (back-end application)
and the client (browser)
name=value
the value is a URL-encoded string
cookies
Data about cookies is received and kept by the browser
as a list of cookies for each server (domain)
cookies
A cookie is sent to a client by using the Set-Cookie
header field of an HTTP response message
cookies
Set-Cookie: name=value; expires=date; path=path;
domain=Internet-domain; secure
expires – indicates date and time when cookie will expire (Web client should destroy expired cookies)
domain – signifies the symbolic name of the Web server (host or domain) for which the cookie is valid; by default, the server that generated the cookie
path – specifies a subset of URLs from the cookie’s domain
distinguishes multiple applications existing on the same server
secure – indicates that cookie will be sent back to the server only if the communication channel is “secure”
(via HTTPS)
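The Set-Cookie syntax above can be generated with Python's standard http.cookies module. The following is a sketch, not a recommendation of particular attribute values: make_set_cookie is a made-up helper and example.org is a placeholder domain.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, domain=None, path="/", expires=None, secure=False):
    """Return a 'Set-Cookie: ...' header line for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = path
    if domain:
        morsel["domain"] = domain
    if expires:
        # Either an HTTP date string or a number of seconds from now.
        morsel["expires"] = expires
    if secure:
        morsel["secure"] = True
    return cookie.output()


# Hypothetical cookie for a placeholder domain.
print(make_set_cookie("color", "green", domain="example.org", secure=True))
print(make_set_cookie("color", "green", expires=3600))
```

Leaving out expires yields a non-persistent cookie; the library decides the order and capitalisation of the attributes in the emitted header, which does not matter to browsers.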
cookies
also, consult Cookiepedia – https://cookiepedia.co.uk/
cookies
A cookie is transmitted back from the client to the Web server only if it satisfies
all validity conditions
domain, path, expire date & time, and communication channel security are matching
cookies
Server will receive, in the header of an HTTP request message, the following:
Cookie: name1=value1; name2=value2...
the list of cookies which satisfy the validity conditions
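The validity conditions (domain, path, expiration, channel security) and the resulting Cookie header can be sketched as follows. This is a simplified model of what a browser does, with looser matching rules than RFC 6265 prescribes; StoredCookie, should_send and cookie_header are names invented for this example, and example.org is a placeholder domain.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class StoredCookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    expires: Optional[datetime] = None  # None: non-persistent cookie
    secure: bool = False


def should_send(cookie, url, now):
    """Check the four validity conditions for one cookie and one request URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"
    domain_ok = host == cookie.domain or host.endswith("." + cookie.domain)
    path_ok = path.startswith(cookie.path)  # simplified prefix match
    fresh = cookie.expires is None or now < cookie.expires
    channel_ok = parts.scheme == "https" or not cookie.secure
    return domain_ok and path_ok and fresh and channel_ok


def cookie_header(cookies, url, now):
    """Build the Cookie request header, or None when no cookie qualifies."""
    pairs = [f"{c.name}={c.value}" for c in cookies if should_send(c, url, now)]
    return ("Cookie: " + "; ".join(pairs)) if pairs else None


now = datetime(2017, 3, 1, tzinfo=timezone.utc)
jar = [
    StoredCookie("color", "green", "example.org"),
    StoredCookie("sid", "abc", "example.org", secure=True),
    StoredCookie("old", "1", "example.org",
                 expires=datetime(2017, 1, 1, tzinfo=timezone.utc)),
]
print(cookie_header(jar, "http://www.example.org/shop/", now))
print(cookie_header(jar, "https://example.org/", now))
```

Over plain HTTP only the cookie without the secure attribute is returned, while over HTTPS both unexpired cookies are; the expired one is never sent.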
cookies
A script invocation consists of returning a representation + placing various cookies
[diagram: the Web Client sends an HTTP request (script invocation) to the Web Server; the HTTP response from the Script carries Set-Cookie: color=green]
cookies
Cookies – persistent or not –are processed and stored by the browser
[diagram: the Web Client keeps color=green received from the Script on the Web Server]
persistent cookies are stored in files or databases (SQLite)
cookies
On the next access to the script, the browser transmits the stored cookies to the server, according to the validity conditions
Web client (stored cookie: color=green) → Web server: HTTP request with Cookie: color=green
Web server (script) → Web client: HTTP response
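The exchange above (the server sets a cookie, the client stores it and sends it back on the next request) can be simulated without a network. The sketch below is a toy model in Python: the class TinyCookieJar is hypothetical, stands in for the browser's cookie store, and deliberately skips the validity checks.

```python
from http.cookies import SimpleCookie

class TinyCookieJar:
    """Toy stand-in for a browser's cookie store (validity checks omitted)."""

    def __init__(self):
        self._cookies = {}

    def store_from_response(self, set_cookie_value):
        """Remember the cookie carried by one Set-Cookie response header value."""
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        for name, morsel in parsed.items():
            self._cookies[name] = morsel.value

    def request_header(self):
        """Build the Cookie header line for the next request, or None if empty."""
        pairs = "; ".join(f"{name}={value}" for name, value in self._cookies.items())
        return f"Cookie: {pairs}" if pairs else None

jar = TinyCookieJar()
jar.store_from_response("color=green; Path=/")  # first response: Set-Cookie
next_request_header = jar.request_header()      # next request: Cookie
print(next_request_header)
```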
cookies: creating
An example for PHP – function setcookie ()
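<?php
setcookie ("other_color", "blue"); // non-persistent – why?
// $_COOKIE holds only the cookies received with the current request,
// so the new value becomes visible starting with the next request
if (isset ($_COOKIE["other_color"])) {
  echo "A cookie of color " . $_COOKIE["other_color"];
}
?>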
cookies: expiring
Nullifying the value and expiration date; optionally, the other cookie attributes
example – PHP:
<?php
setcookie ($cookie_name, "", 0, "/", "", 0);
?>
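Outside PHP, a common way to obtain the same effect is to resend the cookie with an expiration date in the past, which tells the client to discard it. A minimal Python sketch, assuming the helper name expire_cookie and the placeholder value "deleted" (both illustrative):

```python
from http.cookies import SimpleCookie

def expire_cookie(name, path="/"):
    """Return a Set-Cookie header line asking the client to drop the cookie."""
    cookie = SimpleCookie()
    cookie[name] = "deleted"  # placeholder value; the client discards it anyway
    cookie[name]["path"] = path  # should match the path used when the cookie was set
    cookie[name]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"  # a date in the past
    return cookie.output()

expire_line = expire_cookie("other_color")
print(expire_line)
```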
cookies: consulting
Cookies reside in a header field of an HTTP message
in CGI scripts, they are available through the HTTP_COOKIE environment variable
cookies: consulting
PHP – a cookie is accessed like a variable, as an element of the $_COOKIE associative array
$_COOKIE ['cookie_name']
cookies
Other information of interest is available in RFC 6265
HTTP State Management Mechanism
http://tools.ietf.org/html/rfc6265
How can we identify successive requests made by the same client instance?
HTTP is a stateless protocol
it cannot tell whether specific successive requests come from the same client (from the same instance of a Web browser)
necessity
Preserving certain data across a sequence of related HTTP messages (requests/responses)
examples: shopping cart status, multi-step Web forms, content pagination, user authentication state, etc.
sessions
Each visitor to a Web site is assigned a unique identifier – the session ID (SID)
stored in a cookie (e.g., ASP.NET_SessionId, PHPSESSID, session-id, _wp_session)
or propagated via a URL
in this way, consecutive visits (requests) made by the same user can be identified
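The two ways of carrying the SID (cookie or URL) can be sketched as a lookup performed on every request. The Python sketch below is illustrative only: the function names find_sid and new_sid are assumptions, and PHPSESSID is used simply because it is one of the cookie names listed above.

```python
import secrets
from http.cookies import SimpleCookie
from urllib.parse import urlsplit, parse_qs

SID_NAME = "PHPSESSID"  # one of the names listed above; any name would do

def find_sid(cookie_header, url):
    """Look for the session ID in the Cookie header first, then in the URL query."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if SID_NAME in jar:
        return jar[SID_NAME].value
    values = parse_qs(urlsplit(url).query).get(SID_NAME)
    return values[0] if values else None

def new_sid():
    """Generate a hard-to-guess identifier for a first-time visitor."""
    return secrets.token_hex(16)

print(find_sid("PHPSESSID=abc123", "/cart"))   # SID taken from the cookie
print(find_sid(None, "/cart?PHPSESSID=xyz"))   # SID taken from the URL
```

Propagating the SID in a URL exposes it in logs, bookmarks and shared links, which is one reason cookies are usually preferred.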
sessions
Various variables can be attached to a session
their values are kept (stored) between consecutive – i.e., related – requests from the same instance of a Web client (browser)
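Attaching variables to a session amounts to keeping, on the server, a mapping from each SID to its own set of variables. A minimal in-memory sketch in Python follows; the class SessionStore and the variable name accesses are assumptions for illustration, and a real application server would also expire old sessions.

```python
import secrets

class SessionStore:
    """Keeps one dictionary of variables per session ID (in memory only)."""

    def __init__(self):
        self._sessions = {}

    def open(self, sid=None):
        """Return (sid, variables); create a new session if the SID is unknown."""
        if sid is None or sid not in self._sessions:
            sid = secrets.token_hex(16)
            self._sessions[sid] = {}
        return sid, self._sessions[sid]

store = SessionStore()

# First request: no SID yet, so a session is created.
sid, session = store.open()
session["accesses"] = session.get("accesses", 0) + 1

# Second request from the same client: the SID comes back, the variables persist.
same_sid, same_session = store.open(sid)
same_session["accesses"] = same_session.get("accesses", 0) + 1
print(same_session["accesses"])
```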
sessions
A session can be registered implicitly (automatically) or explicitly (manually, by the programmer), depending on the Web application server or its default configuration
Web session data is stored persistently on the server by using non-relational database systems – e.g., DynamoDB, Memcached, Redis,… – or, in most cases, files
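File-based storage, the most common case mentioned above, can be sketched as one file per session, named after the SID. The Python sketch below is an assumption-laden illustration: the helper names, the sess_ file prefix and the JSON format are chosen for the example and do not describe how any particular server stores its sessions.

```python
import json
import tempfile
from pathlib import Path

def save_session(directory, sid, data):
    """Write the session variables to a file named after the SID.

    Assumes sid is a server-generated token (never raw user input),
    so it is safe to use as part of a file name.
    """
    path = Path(directory) / f"sess_{sid}.json"
    path.write_text(json.dumps(data), encoding="utf-8")

def load_session(directory, sid):
    """Read the session variables back; an unknown SID yields an empty session."""
    path = Path(directory) / f"sess_{sid}.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))

# A temporary directory stands in for the server's session directory.
with tempfile.TemporaryDirectory() as session_dir:
    save_session(session_dir, "abc123", {"accesses": 3})
    restored = load_session(session_dir, "abc123")
    missing = load_session(session_dir, "no-such-session")
print(restored)
```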
POST / HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en,en-GB;q=0.5
Connection: keep-alive
Cookie: language=en_US
Host: mail.info.uaic.ro
Referer: http://mail.info.uaic.ro/?_task=login
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 … Gecko/20100101 Firefox/51.0
user authentication by using the POST method (already existing cookies are transmitted)
advanced
sessions: example
HTTP/1.1 302 Found
Cache-Control: private, no-cache, no-store, must-revalidate…
Connection: Keep-Alive
Content-Length: 0
Content-Type: text/html; charset=UTF-8
Date: Thu, 23 Feb 2017 10:25:44 GMT
Keep-Alive: timeout=5, max=100
Last-Modified: Thu, 23 Feb 2017 10:25:44 GMT
Location: ./?_task=mail&_token=cb1924…c9c97819
Server: Apache/2.4.6 (CentOS) mod_fcgid/2.3.9 PHP/5.4.16
Set-Cookie: roundcube_sessid=vnqrt4…2uv2; path=/; HttpOnly
Set-Cookie: roundcube_sessauth=S92ee64…2c71; path=/; HttpOnly
<!DOCTYPE html>
…
HTTP response: a Web session-related cookie is set (redirection after authentication)
advanced
sessions: programming
In the case of CGI, session management must be entirely implemented by the programmer
there is no standard way for Web session processing
advanced
sessions: programming
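PHP – functions: session_start(), session_id(), session_unset(), session_destroy() (session_register() was removed in PHP 5.4; use the $_SESSION array instead)
<?php
session_start (); // creating a session (or resuming the existing one)
if (!isset ($_SESSION['accesses'])) {
  $_SESSION['accesses'] = 0; // first request of this session
} else {
  $_SESSION['accesses']++; // one more access within the same session
}
?>
the accesses variable is attached to the session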
details at http://php.net/manual/en/book.session.php
sessions: programming
By using an application server or framework, cookie and session management becomes simpler
various examples: HttpSessionState class (ASP.NET), HttpSession interface (Java servlets), HTTP::Session (Perl), session (Flask – Python framework), web.session (web.py), HttpFoundation (component of Symfony – PHP framework), SessionComponent class (CakePHP), session array (Ruby on Rails), play.mvc.Http.Cookie (Play! for Java/Scala), sessions (Gorilla – Go), cookie-parser and express-session (Node.js modules for Express)
advanced
alternatives
HTML5 provides Web Storage
W3C recommendation (2015)
browser-level storage for lists of key-value pairs via the sessionStorage and localStorage attributes
for details, study profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week11
advanced
“conclusion”
⥁from HTTP to cookies and Web sessions
many thanks to Ciprian Amariei, MSc.
next episode: Web programming – Web application servers, Web application architecture
[diagram: browser (frontend; <Web/> pages: HTML, CSS,…) and server (backend: presentation, processing, data access); fat server, dumb client]