Reformatting only.

Guido van Rossum 1996-03-06 19:11:33 +00:00
parent 6d627754c1
commit 391b4e60e6


@ -9,57 +9,58 @@ Python.
Introduction Introduction
------------ ------------
A CGI script is invoked by an HTTP server, usually to process user input A CGI script is invoked by an HTTP server, usually to process user
submitted through an HTML <FORM> or <ISINDEX> element. input submitted through an HTML <FORM> or <ISINDEX> element.
Most often, CGI scripts live in the server's special cgi-bin directory. Most often, CGI scripts live in the server's special cgi-bin
The HTTP server places all sorts of information about the request (such as directory. The HTTP server places all sorts of information about the
the client's hostname, the requested URL, the query string, and lots of request (such as the client's hostname, the requested URL, the query
other goodies) in the script's shell environment, executes the script, and string, and lots of other goodies) in the script's shell environment,
sends the script's output back to the client. executes the script, and sends the script's output back to the client.
The script's input is connected to the client too, and sometimes the form The script's input is connected to the client too, and sometimes the
data is read this way; at other times the form data is passed via the form data is read this way; at other times the form data is passed via
"query string" part of the URL. This module (cgi.py) is intended to take the "query string" part of the URL. This module (cgi.py) is intended
care of the different cases and provide a simpler interface to the Python to take care of the different cases and provide a simpler interface to
script. It also provides a number of utilities that help in debugging the Python script. It also provides a number of utilities that help
scripts, and the latest addition is support for file uploads from a form in debugging scripts, and the latest addition is support for file
(if your browser supports it -- Grail 0.3 and Netscape 2.0 do). uploads from a form (if your browser supports it -- Grail 0.3 and
Netscape 2.0 do).
The output of a CGI script should consist of two sections, separated by a The output of a CGI script should consist of two sections, separated
blank line. The first section contains a number of headers, telling the by a blank line. The first section contains a number of headers,
client what kind of data is following. Python code to generate a minimal telling the client what kind of data is following. Python code to
header section looks like this: generate a minimal header section looks like this:
print "Content-type: text/html" # HTML is following print "Content-type: text/html" # HTML is following
print # blank line, end of headers print # blank line, end of headers
The second section is usually HTML, which allows the client software to The second section is usually HTML, which allows the client software
display nicely formatted text with header, in-line images, etc. Here's to display nicely formatted text with header, in-line images, etc.
Python code that prints a simple piece of HTML: Here's Python code that prints a simple piece of HTML:
print "<TITLE>CGI script output</TITLE>" print "<TITLE>CGI script output</TITLE>"
print "<H1>This is my first CGI script</H1>" print "<H1>This is my first CGI script</H1>"
print "Hello, world!" print "Hello, world!"
(It may not be fully legal HTML according to the letter of the standard, (It may not be fully legal HTML according to the letter of the
but any browser will understand it.) standard, but any browser will understand it.)
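
The two-section layout described above can be sketched in present-day Python 3 (the diff itself uses Python 1.x print statements). The function name build_response is hypothetical and is not part of cgi.py; this is only an illustration of where the blank line goes.

```python
def build_response(title, body_html):
    """Return a complete CGI response: header section, blank line, HTML body."""
    headers = ["Content-type: text/html"]   # HTML is following
    body = [
        "<TITLE>%s</TITLE>" % title,
        "<H1>%s</H1>" % title,
        body_html,
    ]
    # The single empty string between the two lists becomes the blank
    # line that ends the header section.
    return "\n".join(headers + [""] + body) + "\n"


if __name__ == "__main__":
    print(build_response("This is my first CGI script", "Hello, world!"), end="")
```

Building the whole response as one string makes it easy to check that the headers really do come first, before anything else is written.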
Using the cgi module Using the cgi module
-------------------- --------------------
Begin by writing "import cgi". Don't use "from cgi import *" -- the module Begin by writing "import cgi". Don't use "from cgi import *" -- the
defines all sorts of names for its own use that you don't want in your module defines all sorts of names for its own use that you don't want
namespace. in your namespace.
If you have a standard form, it's best to use the SvFormContentDict class. If you have a standard form, it's best to use the SvFormContentDict
Instantiate the SvFormContentDict class exactly once: it consumes any input class. Instantiate the SvFormContentDict class exactly once: it
on standard input, which can't be wound back (it's a network connection, consumes any input on standard input, which can't be wound back (it's
not a disk file). a network connection, not a disk file).
The SvFormContentDict instance can be accessed as if it were a Python The SvFormContentDict instance can be accessed as if it were a Python
dictionary. For instance, the following code checks that the fields dictionary. For instance, the following code checks that the fields
"name" and "addr" are both set to a non-empty string: "name" and "addr" are both set to a non-empty string:
form = SvFormContentDict() form = SvFormContentDict()
@ -73,40 +74,41 @@ dictionary. For instance, the following code checks that the fields
return return
...actual form processing here... ...actual form processing here...
If you have an input item of type "file" in your form and the client If you have an input item of type "file" in your form and the client
supports file uploads, the value for that field, if present in the form, supports file uploads, the value for that field, if present in the
is not a string but a tuple of (filename, content-type, data). form, is not a string but a tuple of (filename, content-type, data).
Overview of classes Overview of classes
------------------- -------------------
SvFormContentDict: single value form content as dictionary; described SvFormContentDict: single value form content as dictionary; described
above. above.
FormContentDict: multiple value form content as dictionary (the form items FormContentDict: multiple value form content as dictionary (the form
are lists of values). Useful if your form contains multiple fields with items are lists of values). Useful if your form contains multiple
the same name. fields with the same name.
Other classes (FormContent, InterpFormContentDict) are present for Other classes (FormContent, InterpFormContentDict) are present for
backwards compatibility only. backwards compatibility only.
Overview of functions Overview of functions
--------------------- ---------------------
These are useful if you want more control, or if you want to employ some These are useful if you want more control, or if you want to employ
of the algorithms implemented in this module in other circumstances. some of the algorithms implemented in this module in other
circumstances.
parse(): parse a form into a Python dictionary. parse(): parse a form into a Python dictionary.
parse_qs(qs): parse a query string. parse_qs(qs): parse a query string.
parse_multipart(...): parse input of type multipart/form-data (for file parse_multipart(...): parse input of type multipart/form-data (for
uploads). file uploads).
parse_header(string): parse a header like Content-type into a main value parse_header(string): parse a header like Content-type into a main
and a dictionary of parameters. value and a dictionary of parameters.
test(): complete test program. test(): complete test program.
@ -114,58 +116,62 @@ print_environ(): format the shell environment in HTML.
print_form(form): format a form in HTML. print_form(form): format a form in HTML.
print_environ_usage(): print a list of useful environment variables in HTML. print_environ_usage(): print a list of useful environment variables in
HTML.
escape(): convert the characters "&", "<" and ">" to HTML-safe sequences. escape(): convert the characters "&", "<" and ">" to HTML-safe
sequences. Use this if you need to display text that might contain
such characters in HTML. To translate URLs for inclusion in the HREF
attribute of an <A> tag, use urllib.quote().
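
For readers on current Python 3, the two jobs named in the escape() entry are roughly covered by html.escape() and urllib.parse.quote(). These are stand-ins chosen for illustration, not the cgi.escape() and urllib.quote() of this commit, and the sample strings are made up.

```python
import html
from urllib.parse import quote

# Text that is to be displayed inside an HTML page.
text = "if a < b & b > c"
safe_text = html.escape(text, quote=False)   # only &, < and > are converted

# A path that is to be placed in the HREF attribute of an <A> tag.
path = "/cgi-bin/my script.py"
safe_href = quote(path)                      # the space becomes %20, "/" is kept
link = '<A HREF="%s">%s</A>' % (safe_href, html.escape(path))

print(safe_text)
print(link)
```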
Caring about security Caring about security
--------------------- ---------------------
There's one important rule: if you invoke an external program (e.g. via There's one important rule: if you invoke an external program (e.g.
the os.system() or os.popen() functions), make very sure you don't pass via the os.system() or os.popen() functions), make very sure you don't
arbitrary strings received from the client to the shell. This is a pass arbitrary strings received from the client to the shell. This is
well-known security hole whereby clever hackers anywhere on the web can a well-known security hole whereby clever hackers anywhere on the web
exploit a gullible CGI script to invoke arbitrary shell commands. Even can exploit a gullible CGI script to invoke arbitrary shell commands.
parts of the URL or field names cannot be trusted, since the request Even parts of the URL or field names cannot be trusted, since the
doesn't have to come from your form! request doesn't have to come from your form!
To be on the safe side, if you must pass a string gotten from a form to a To be on the safe side, if you must pass a string gotten from a form
shell command, you should make sure the string contains only alphanumeric to a shell command, you should make sure the string contains only
characters, dashes, underscores, and periods. alphanumeric characters, dashes, underscores, and periods.
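
The whitelist rule above might be sketched as follows in Python 3. The names is_safe_shell_arg and run_with_form_value are hypothetical, and using subprocess with an argument list instead of os.system() is a substitution made here, not something this module does.

```python
import re
import subprocess

# Only alphanumeric characters, dashes, underscores, and periods.
_SAFE_ARG = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_shell_arg(value):
    """Return True if value consists solely of whitelisted characters."""
    return _SAFE_ARG.fullmatch(value) is not None


def run_with_form_value(program, value):
    """Run program with one form-supplied argument, refusing unsafe input."""
    if not is_safe_shell_arg(value):
        raise ValueError("unsafe characters in form value: %r" % value)
    # An argument list (no shell=True) means no shell ever parses the value.
    return subprocess.run([program, value], capture_output=True, text=True)
```

Note that a value such as "-rf" or ".." still passes this check, so the program being invoked may need further checks of its own.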
Installing your CGI script on a Unix system Installing your CGI script on a Unix system
------------------------------------------- -------------------------------------------
Read the documentation for your HTTP server and check with your local Read the documentation for your HTTP server and check with your local
system administrator to find the directory where CGI scripts should be system administrator to find the directory where CGI scripts should be
installed; usually this is in a directory cgi-bin in the server tree. installed; usually this is in a directory cgi-bin in the server tree.
Make sure that your script is readable and executable by "others"; the Unix Make sure that your script is readable and executable by "others"; the
file mode should be 755 (use "chmod 755 filename"). Make sure that the Unix file mode should be 755 (use "chmod 755 filename"). Make sure
first line of the script contains "#!" starting in column 1 followed by the that the first line of the script contains "#!" starting in column 1
pathname of the Python interpreter, for instance: followed by the pathname of the Python interpreter, for instance:
#!/usr/local/bin/python #!/usr/local/bin/python
Make sure the Python interpreter exists and is executable by "others". Make sure the Python interpreter exists and is executable by "others".
Make sure that any files your script needs to read or write are readable or Make sure that any files your script needs to read or write are
writable, respectively, by "others" -- their mode should be 644 for readable or writable, respectively, by "others" -- their mode should
readable and 666 for writable. This is because, for security reasons, the be 644 for readable and 666 for writable. This is because, for
HTTP server executes your script as user "nobody", without any special security reasons, the HTTP server executes your script as user
privileges. It can only read (write, execute) files that everybody can "nobody", without any special privileges. It can only read (write,
read (write, execute). The current directory at execution time is also execute) files that everybody can read (write, execute). The current
different (it is usually the server's cgi-bin directory) and the set of directory at execution time is also different (it is usually the
environment variables is also different from what you get at login. In server's cgi-bin directory) and the set of environment variables is
particular, don't count on the shell's search path for executables ($PATH) also different from what you get at login. In particular, don't count
or the Python module search path ($PYTHONPATH) to be set to anything on the shell's search path for executables ($PATH) or the Python
interesting. module search path ($PYTHONPATH) to be set to anything interesting.
If you need to load modules from a directory which is not on Python's If you need to load modules from a directory which is not on Python's
default module search path, you can change the path in your script, before default module search path, you can change the path in your script,
importing other modules, e.g.: before importing other modules, e.g.:
import sys import sys
sys.path.insert(0, "/usr/home/joe/lib/python") sys.path.insert(0, "/usr/home/joe/lib/python")
@ -173,71 +179,75 @@ importing other modules, e.g.:
(This way, the directory inserted last will be searched first!) (This way, the directory inserted last will be searched first!)
Instructions for non-Unix systems will vary; check your HTTP server's Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts). documentation (it will usually have a section on CGI scripts).
Testing your CGI script Testing your CGI script
----------------------- -----------------------
Unfortunately, a CGI script will generally not run when you try it from the Unfortunately, a CGI script will generally not run when you try it
command line, and a script that works perfectly from the command line may from the command line, and a script that works perfectly from the
fail mysteriously when run from the server. There's one reason why you command line may fail mysteriously when run from the server. There's
should still test your script from the command line: if it contains a one reason why you should still test your script from the command
syntax error, the python interpreter won't execute it at all, and the HTTP line: if it contains a syntax error, the python interpreter won't
server will most likely send a cryptic error to the client. execute it at all, and the HTTP server will most likely send a cryptic
error to the client.
Assuming your script has no syntax errors, yet it does not work, you have Assuming your script has no syntax errors, yet it does not work, you
no choice but to read the next section: have no choice but to read the next section:
Debugging CGI scripts Debugging CGI scripts
--------------------- ---------------------
First of all, check for trivial installation errors -- reading the section First of all, check for trivial installation errors -- reading the
above on installing your CGI script carefully can save you a lot of time. section above on installing your CGI script carefully can save you a
If you wonder whether you have understood the installation procedure lot of time. If you wonder whether you have understood the
correctly, try installing a copy of this module file (cgi.py) as a CGI installation procedure correctly, try installing a copy of this module
script. When invoked as a script, the file will dump its environment and file (cgi.py) as a CGI script. When invoked as a script, the file
the contents of the form in HTML form. Give it the right mode etc, and will dump its environment and the contents of the form in HTML form.
send it a request. If it's installed in the standard cgi-bin directory, it Give it the right mode etc, and send it a request. If it's installed
should be possible to send it a request by entering a URL into your browser in the standard cgi-bin directory, it should be possible to send it a
of the form: request by entering a URL into your browser of the form:
http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
If this gives an error of type 404, the server cannot find the script -- If this gives an error of type 404, the server cannot find the script
perhaps you need to install it in a different directory. If it gives -- perhaps you need to install it in a different directory. If it
another error (e.g. 500), there's an installation problem that you should gives another error (e.g. 500), there's an installation problem that
fix before trying to go any further. If you get a nicely formatted listing you should fix before trying to go any further. If you get a nicely
of the environment and form content (in this example, the fields should be formatted listing of the environment and form content (in this
listed as "addr" with value "At Home" and "name" with value "Joe Blow"), example, the fields should be listed as "addr" with value "At Home"
the cgi.py script has been installed correctly. If you follow the same and "name" with value "Joe Blow"), the cgi.py script has been
procedure for your own script, you should now be able to debug it. installed correctly. If you follow the same procedure for your own
script, you should now be able to debug it.
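
To see why the example URL should yield "addr" with value "At Home" and "name" with value "Joe Blow", here is a small Python 3 sketch that decodes the query string with urllib.parse. It is a modern stand-in for this module's parse_qs(), not the code from this commit.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home"

# Everything after the "?" is the query string; "+" decodes to a space.
fields = parse_qs(urlsplit(url).query)

for name in sorted(fields):
    # Each value is a list, because a field name may occur more than once.
    print(name, fields[name])
```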
The next step could be to call the cgi module's test() function from your The next step could be to call the cgi module's test() function from
script: replace its main code with the single statement your script: replace its main code with the single statement
cgi.test() cgi.test()
This should produce the same results as those gotten from installing the This should produce the same results as those gotten from installing
cgi.py file itself. the cgi.py file itself.
When an ordinary Python script raises an unhandled exception (e.g. because When an ordinary Python script raises an unhandled exception
of a typo in a module name, a file that can't be opened, etc.), the Python (e.g. because of a typo in a module name, a file that can't be opened,
interpreter prints a nice traceback and exits. While the Python etc.), the Python interpreter prints a nice traceback and exits.
interpreter will still do this when your CGI script raises an exception, While the Python interpreter will still do this when your CGI script
most likely the traceback will end up in one of the HTTP server's log raises an exception, most likely the traceback will end up in one of
files, or be discarded altogether. the HTTP server's log files, or be discarded altogether.
Fortunately, once you have managed to get your script to execute *some* Fortunately, once you have managed to get your script to execute
code, it is easy to catch exceptions and cause a traceback to be printed. *some* code, it is easy to catch exceptions and cause a traceback to
The test() function below in this module is an example. Here are the be printed. The test() function below in this module is an example.
rules: Here are the rules:
1. Import the traceback module (before entering the try-except!) 1. Import the traceback module (before entering the
try-except!)
2. Make sure you finish printing the headers and the blank line early 2. Make sure you finish printing the headers and the blank
line early
3. Assign sys.stderr to sys.stdout 3. Assign sys.stderr to sys.stdout
@ -258,13 +268,13 @@ For example:
print "\n\n<PRE>" print "\n\n<PRE>"
traceback.print_exc() traceback.print_exc()
Notes: The assignment to sys.stderr is needed because the traceback prints Notes: The assignment to sys.stderr is needed because the traceback
to sys.stderr. The print "\n\n<PRE>" statement is necessary to disable the prints to sys.stderr. The print "\n\n<PRE>" statement is necessary to
word wrapping in HTML. disable the word wrapping in HTML.
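
A Python 3 rendering of the same idea, offered as a sketch only: the wrapper name run_cgi and its arguments are hypothetical. Instead of assigning sys.stderr to sys.stdout, it passes the output stream to traceback.print_exc() directly, which has the same effect for the traceback itself.

```python
import sys
import traceback          # imported before entering the try-except


def run_cgi(main, out=sys.stdout):
    """Call main(out); if it raises, show the traceback to the client."""
    # Finish the headers and the blank line early.
    out.write("Content-type: text/html\n\n")
    try:
        main(out)
    except Exception:
        # <PRE> disables word wrapping so the traceback stays readable.
        out.write("\n\n<PRE>\n")
        traceback.print_exc(file=out)
```

Taking the stream as a parameter also makes the wrapper easy to try from the command line with an in-memory buffer before installing the script.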
If you suspect that there may be a problem in importing the traceback If you suspect that there may be a problem in importing the traceback
module, you can use an even more robust approach (which only uses built-in module, you can use an even more robust approach (which only uses
modules): built-in modules):
import sys import sys
sys.stderr = sys.stdout sys.stderr = sys.stdout
@ -272,12 +282,13 @@ modules):
print print
...your code here... ...your code here...
This relies on the Python interpreter to print the traceback. The content This relies on the Python interpreter to print the traceback. The
type of the output is set to plain text, which disables all HTML content type of the output is set to plain text, which disables all
processing. If your script works, the raw HTML will be displayed by your HTML processing. If your script works, the raw HTML will be displayed
client. If it raises an exception, most likely after the first two lines by your client. If it raises an exception, most likely after the
have been printed, a traceback will be displayed. Because no HTML first two lines have been printed, a traceback will be displayed.
interpretation is going on, the traceback will be readable. Because no HTML interpretation is going on, the traceback will
be readable.
Good luck! Good luck!
@ -285,40 +296,40 @@ Good luck!
Common problems and solutions Common problems and solutions
----------------------------- -----------------------------
- Most HTTP servers buffer the output from CGI scripts until the script is - Most HTTP servers buffer the output from CGI scripts until the
completed. This means that it is not possible to display a progress report script is completed. This means that it is not possible to display a
on the client's display while the script is running. progress report on the client's display while the script is running.
- Check the installation instructions above. - Check the installation instructions above.
- Check the HTTP server's log files. ("tail -f logfile" in a separate - Check the HTTP server's log files. ("tail -f logfile" in a separate
window may be useful!) window may be useful!)
- Always check a script for syntax errors first, by doing something like - Always check a script for syntax errors first, by doing something
"python script.py". like "python script.py".
- When using any of the debugging techniques, don't forget to add - When using any of the debugging techniques, don't forget to add
"import sys" to the top of the script. "import sys" to the top of the script.
- When invoking external programs, make sure they can be found. Usually, - When invoking external programs, make sure they can be found.
this means using absolute path names -- $PATH is usually not set to a Usually, this means using absolute path names -- $PATH is usually not
very useful value in a CGI script. set to a very useful value in a CGI script.
- When reading or writing external files, make sure they can be read or - When reading or writing external files, make sure they can be read
written by every user on the system. or written by every user on the system.
- Don't try to give a CGI script a set-uid mode. This doesn't work on most - Don't try to give a CGI script a set-uid mode. This doesn't work on
systems, and is a security liability as well. most systems, and is a security liability as well.
History History
------- -------
Michael McLay started this module. Steve Majewski changed the interface to Michael McLay started this module. Steve Majewski changed the
SvFormContentDict and FormContentDict. The multipart parsing was inspired interface to SvFormContentDict and FormContentDict. The multipart
by code submitted by Andreas Paepcke. Guido van Rossum rewrote, parsing was inspired by code submitted by Andreas Paepcke. Guido van
reformatted and documented the module and is currently responsible for its Rossum rewrote, reformatted and documented the module and is currently
maintenance. responsible for its maintenance.
""" """
@ -376,7 +387,7 @@ def parse_qs(qs):
if len(nv) != 2: if len(nv) != 2:
continue continue
name = nv[0] name = nv[0]
value = urllib.unquote(regsub.gsub('+',' ',nv[1])) value = urllib.unquote(regsub.gsub('+', ' ', nv[1]))
if len(value): if len(value):
if dict.has_key (name): if dict.has_key (name):
dict[name].append(value) dict[name].append(value)
@ -528,13 +539,13 @@ class FormContentDict:
class SvFormContentDict(FormContentDict): class SvFormContentDict(FormContentDict):
"""Strict single-value expecting form content as dictionary. """Strict single-value expecting form content as dictionary.
IF you only expect a single value for each field, then form[key] IF you only expect a single value for each field, then
will return that single value. form[key] will return that single value. It will raise an
It will raise an IndexError if that expectation is not true. IndexError if that expectation is not true. IF you expect a
IF you expect a field to have possible multiple values, then you field to have possible multiple values, then you can use
can use form.getlist(key) to get all of the values. form.getlist(key) to get all of the values. values() and
values() and items() are a compromise: they return single strings items() are a compromise: they return single strings where
where there is a single value, and lists of strings otherwise. there is a single value, and lists of strings otherwise.
""" """
def __getitem__(self, key): def __getitem__(self, key):
@ -627,7 +638,7 @@ def test():
print_environ() print_environ()
print_form(FormContentDict()) print_form(FormContentDict())
print print
print "<H3>Current Working Directory</H3>" print "<H3>Current Working Directory:</H3>"
try: try:
pwd = os.getcwd() pwd = os.getcwd()
except os.error, msg: except os.error, msg: