Common Gateway Interface

fraterna 28,352 views 31 slides Mar 20, 2013

Common Gateway Interface
Web Technologies
Piero Fraternali

Outline
•Architectures for dynamic content
publishing
–CGI
–Java Servlet
–Server-side scripting
–JSP tag libraries

Motivations
•Creating pages on the fly based on the user’s
request and from structured data (e.g.,
database content)
•Client-side scripting & components do not
suffice
–They manipulate an existing document/page; they do
not create a new one from structured content
•Solution:
–Server-side architectures for dynamic content
production

Common Gateway Interface
•An interface that allows the Web Server to launch
external applications that create pages dynamically
•A kind of «double client-server loop»

What CGI is/is not
•It is not
–A programming language
–A telecommunication protocol
•It is
–An interface between the web server and the applications that
defines some standard communication variables
•The interface is implemented through system (environment) variables, a
universal mechanism present in all operating systems
•A CGI program can be written in any programming language

Invocation
•The client specifies in the URI the name of
the program to invoke
•The program must be deployed in a
specified location at the web server (e.g.,
the cgi-bin directory)
–http://my.server.web/cgi-bin/xyz.exe

Execution
•The server recognizes from the URI that
the requested resource is an executable
–Permissions must be set in the web server for
allowing program execution
–E.g., the extensions of executable files must
be explicitly specified
•http://my.server.web/cgi-bin/xyz.exe

Execution
•The web server decodes the parameters
sent by the client and initializes the CGI
variables
–REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH,
CONTENT_TYPE
•http://my.server.web/cgi-bin/xyz.exe?par=val

Execution
•The server launches the program in a new
process

Execution
•The program executes and «prints» the
response on the standard output

Execution
•The server builds the response from the
content emitted to the standard output and
sends it to the client

Handling request parameters
•Client parameters can be sent in two ways
–With the HTTP GET method
•Parameters are appended to the URL (1)
•http://www.myserver.it/cgi-bin/xyz?par=val
–With the HTTP POST method
•Parameters are inserted as an HTTP entity in the
body of the request (useful when their size is substantial)
•Requires the use of HTML forms to let users
enter data into the body of the request
–(1) The HTTP specification does not set any maximum URI length;
practical limits are imposed by web browser and server software

HTML Form
<HTML>
<BODY>
<FORM
action="http://www.mysrvr.it/cgi-bin/xyz.exe"
method="post">
<P>Tell me your name:</P>
<P><INPUT type="text"
NAME="whoareyou"></P>
<INPUT type="submit"
VALUE="Send">
</FORM>
</BODY>
</HTML>

Structure of a CGI program
1. Read environment variables
2. Execute business logic
3. Print MIME heading ("Content-type: text/html")
4. Print HTML markup

Parameter decoding
1. Read variable REQUEST_METHOD
2. If the method is GET: read variable QUERY_STRING
3. If the method is POST: read variable CONTENT_LENGTH, then read
CONTENT_LENGTH bytes from the standard input

CGI development
•A CGI program can be written in any programming language:
–C/C++
–Fortran
–PERL
–TCL
–Unix shell
–Visual Basic
•In case a compiled programming language is used, the
source code must be compiled
–Normally source files are in cgi-src
–Executable binaries are in cgi-bin
•If instead an interpreted scripting language is used the source
files are deployed
–Normally in the cgi-bin folder

Overview of CGI variables
•Clustered per type:
–server
–request
–headers

Server variables
•These variables are always available, i.e.,
they do not depend on the request
–SERVER_SOFTWARE: name and version of
the server software
•Format: name/version
–SERVER_NAME: hostname or IP of the
server
–GATEWAY_INTERFACE: supported CGI
version
•Format: CGI/version

Request variables
•These variables depend on the request
–SERVER_PROTOCOL: name and version of the protocol
used by the request (e.g., HTTP)
•Format: protocol/version
–SERVER_PORT: port to which the request is
sent
–REQUEST_METHOD : HTTP request method
–PATH_INFO: extra path information
–PATH_TRANSLATED: translation of PATH_INFO
from virtual to physical
–SCRIPT_NAME: invoked script URL
–QUERY_STRING: the query string

Other request variables
•REMOTE_HOST: client hostname
•REMOTE_ADDR: client IP address
•AUTH_TYPE: authentication type used by
the protocol
•REMOTE_USER: username used during the
authentication
•CONTENT_TYPE : content type in case of
POST and PUT request methods
•CONTENT_LENGTH : content length

Environment variables: headers
•The HTTP headers contained in the request
are stored in the environment with the prefix
HTTP_
–HTTP_USER_AGENT: browser used for the
request
–HTTP_ACCEPT_ENCODING: encoding type
accepted by the client
–HTTP_ACCEPT_CHARSET: charset accepted
by the client
–HTTP_ACCEPT_LANGUAGE: language
accepted by the client

CGI script for inspecting variables
#include <stdlib.h>
#include <stdio.h>
/* Print one CGI variable; getenv() returns NULL when it is not set */
static void print_var(const char *name){
    const char *value = getenv(name);
    printf("%s: %s<br>\n", name, value ? value : "(not set)");
}
int main (void){
    /* MIME heading followed by an empty line */
    printf("Content-type: text/html\n\n");
    printf("<html><head><title>Request variables</title></head>");
    printf("<body><h1>Some request header variables:</h1>");
    fflush(stdout);
    print_var("SERVER_SOFTWARE");
    print_var("GATEWAY_INTERFACE");
    print_var("REQUEST_METHOD");
    print_var("QUERY_STRING");
    print_var("HTTP_USER_AGENT");
    print_var("HTTP_ACCEPT_ENCODING");
    print_var("HTTP_ACCEPT_CHARSET");
    print_var("HTTP_ACCEPT_LANGUAGE");
    print_var("HTTP_REFERER");
    print_var("REMOTE_ADDR");
    printf("</body></html>");
    return 0;
}

Example output

Problems with CGI
•Performance and security issues in web server to
application communication
•When the server receives a request, it creates a new
process in order to run the CGI program
•This requires time and significant server resources
•A CGI program cannot interact back with the web server
•The process of the CGI program is terminated when
the program finishes
•No sharing of resources between subsequent calls (e.g., reuse of
database connections)
•No main memory preservation of the user’s session (database
storage is necessary if session data are to be preserved)
•Exposing to the web the physical path to an
executable program can breach security

References
•CGI reference:
–http://www.w3.org/CGI/
•Security and CGI:
–http://www.w3.org/Security/Faq/index.html

Complete example
•Mult.c is previously compiled into Mult.cgi
1. First request (Form.html)
2. Resource retrieval
3. Response (Form.html)
4. Second request (Mult.cgi)
5. Setting of the environment variables and call
6. Computation of the response
7. Sending of the response

The form (form.html)
<HTML>
<HEAD><TITLE>Multiplication form</TITLE></HEAD>
<BODY>
<FORM ACTION="http://www.polimi.it/cgi-bin/run/mult.cgi">
<P>Enter the factors</P>
<INPUT NAME="m" SIZE="5"><BR/>
<INPUT NAME="n" SIZE="5"><BR/>
<INPUT TYPE="SUBMIT" VALUE="Multiply">
</FORM>
</BODY>
</HTML>
•The ACTION attribute holds the called URL
•[Figure: the form as viewed in a browser]

The script
#include <stdio.h>
#include <stdlib.h>
int main(void){
    char *data;
    long m,n;
    /* Statements that print the response to the standard output */
    printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1",13,10);
    printf("<HTML>\n<HEAD>\n<TITLE>Multiplication result</TITLE>\n</HEAD>\n");
    printf("<BODY>\n<H3>Multiplication result</H3>\n");
    /* Retrieval of values from the environment variables */
    data = getenv("QUERY_STRING");
    if(data == NULL)
        printf("<P>Error! Error in receiving data from the form.</P>\n");
    else if(sscanf(data,"m=%ld&n=%ld",&m,&n)!=2)
        printf("<P>Error! Invalid data. They must be numeric.</P>\n");
    else
        printf("<P>Result: %ld * %ld = %ld</P>\n",m,n,m*n);
    printf("</BODY>\n</HTML>\n");
    return 0;
}

Compilation and local test
•Compilation:
$ gcc -o mult.cgi mult.c
•Local test (manual setting of the environment variable
containing the query string):
$ export QUERY_STRING="m=2&n=3"
$ ./mult.cgi
•Result:
Content-Type:text/html;charset=iso-8859-1

<HTML>
<HEAD>
<TITLE>Multiplication result</TITLE>
</HEAD>
<BODY>
<H3>Multiplication result</H3>
<P>Result: 2 * 3 = 6</P>
</BODY>
</HTML>

Considerations on CGI
•Possible security problems
•Performance (overhead)
–creating and terminating processes takes time
–context switches take time
•CGI processes:
–are created at each invocation
–do not inherit process state from previous invocations
(e.g., database connections)

References
•CGI reference:
http://hoohoo.ncsa.uiuc.edu/cgi/overview.html
•Security and CGI:
http://www.w3.org/Security/Faq/wwwsf4.html