mirror of https://github.com/python/cpython.git (synced 2025-11-03 19:34:08 +00:00)

Done with this for 1.4.

parent 3a26dd88af
commit f73f79b5fd
4 changed files with 192 additions and 132 deletions
@@ -1,34 +1,35 @@
 \chapter{Restricted Execution}
 
-In general, executing Python programs have complete access to the
-underlying operating system through the various functions and classes
-contained in Python's modules.  For example, a Python program can open
-any file\footnote{Provided the underlying OS gives you permission!}
-for reading and writing by using the
-\code{open()} built-in function.  This is exactly what you want for
-most applications.
+In general, Python programs have complete access to the underlying
+operating system through the various functions and classes.  For
+example, a Python program can open any file for reading and writing by
+using the \code{open()} built-in function (provided the underlying OS
+gives you permission!).  This is exactly what you want for most
+applications.
 
-There is a class of applications for which this ``openness'' is
-inappropriate.  Imagine a web browser that accepts ``applets'', snippets of
-Python code, from anywhere on the Internet for execution on the local
-system.  Since the originator of the code is unknown, it is obvious that it
-cannot be trusted with the full resources of the local machine.
+There exists a class of applications for which this ``openness'' is
+inappropriate.  Take Grail: a web browser that accepts ``applets'',
+snippets of Python code, from anywhere on the Internet for execution
+on the local system.  This can be used to improve the user interface
+of forms, for instance.  Since the originator of the code is unknown,
+it is obvious that it cannot be trusted with the full resources of the
+local machine.
 
-\emph{Restricted execution} is the basic Python framework that allows
+\emph{Restricted execution} is the basic framework in Python that allows
 for the segregation of trusted and untrusted code.  It is based on the
 notion that trusted Python code (a \emph{supervisor}) can create a
-``padded cell' (or environment) of limited permissions, and run the
+``padded cell'' (or environment) with limited permissions, and run the
 untrusted code within this cell.  The untrusted code cannot break out
 of its cell, and can only interact with sensitive system resources
-through interfaces defined, and managed by the trusted code.  The term
-``restricted execution'' is favored over the term ``safe-Python''
+through interfaces defined and managed by the trusted code.  The term
+``restricted execution'' is favored over ``safe-Python''
 since true safety is hard to define, and is determined by the way the
 restricted environment is created.  Note that the restricted
 environments can be nested, with inner cells creating subcells of
-lesser, but never greater, privledge.
+lesser, but never greater, privilege.
 
 An interesting aspect of Python's restricted execution model is that
-the attributes presented to untrusted code usually have the same names
+the interfaces presented to untrusted code usually have the same names
 as those presented to trusted code.  Therefore no special interfaces
 need to be learned to write code designed to run in a restricted
 environment.  And because the exact nature of the padded cell is
@@ -42,11 +43,22 @@ may redefine the built-in
 \code{chroot()}-like operation on the \var{filename} parameter, such
 that root is always relative to some safe ``sandbox'' area of the
 filesystem.  In this case, the untrusted code would still see an
-\code{open()} function in its \code{__builtin__} module, with the same
+built-in \code{open()} function in its environment, with the same
 calling interface.  The semantics would be identical too, with
 \code{IOError}s being raised when the supervisor determined that an
 unallowable parameter is being used.
 
+The Python run-time determines whether a particular code block is
+executing in restricted execution mode based on the identity of the
+\code{__builtins__} object in its global variables: if this is (the
+dictionary of) the standard \code{__builtin__} module, the code is
+deemed to be unrestricted, else it is deemed to be restricted.
+
+Python code executing in restricted mode faces a number of limitations
+that are designed to prevent it from escaping from the padded cell.
+For instance, the function object attribute \code{func_globals} and the
+class and instance object attribute \code{__dict__} are unavailable.
+
 Two modules provide the framework for setting up restricted execution
 environments:
 
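The hunk above describes untrusted code still calling `open()` while the supervisor decides what that name is bound to. A minimal sketch of that idea follows, written for present-day Python 3 rather than the Python 1.4 this commit documents. The names `SANDBOX`, `guarded_open` and `run_untrusted` are hypothetical, `guarded_open` only validates and maps the path instead of opening a file, and swapping `__builtins__` this way illustrates name substitution only; it is an assumption-laden illustration, not a working security boundary.

```python
import posixpath

SANDBOX = "/tmp/sandbox"  # hypothetical sandbox root, assumed for illustration


def guarded_open(filename, mode="r", bufsize=-1):
    """Stand-in for open() with the same calling interface.

    It only checks the request and returns the mapped path; a real
    supervisor would go on to open the file.
    """
    if mode not in ("r", "rb"):
        raise IOError("writing is not permitted")
    # chroot()-like step: resolve the name relative to the sandbox root.
    path = posixpath.normpath(posixpath.join(SANDBOX, filename.lstrip("/")))
    if not path.startswith(SANDBOX + "/"):
        raise IOError("path escapes the sandbox")
    return path


def run_untrusted(source):
    """Execute source with a replaced __builtins__ mapping; return its globals."""
    env = {"__builtins__": {"open": guarded_open}}
    exec(source, env)
    return env
```

The executed source uses the ordinary name `open`, for example `run_untrusted("result = open('/etc/passwd')")`, and never learns that the supervisor substituted a different function.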
@@ -6,7 +6,8 @@ This module contains the \code{RExec} class, which supports
 \code{r_exec()}, \code{r_eval()}, \code{r_execfile()}, and
 \code{r_import()} methods, which are restricted versions of the standard
 Python functions \code{exec()}, \code{eval()}, \code{execfile()}, and
-\code{import()}.  Code executed in this restricted environment will
+the \code{import} statement.
+Code executed in this restricted environment will
 only have access to modules and functions that are deemed safe; you
 can subclass \code{RExec} to add or remove capabilities as desired.
 
@@ -14,14 +15,13 @@ can subclass \code{RExec} to add or remove capabilities as desired.
 unsafe operations like reading or writing disk files, or using TCP/IP
 sockets.  However, it does not protect against code using extremely
 large amounts of memory or CPU time.
-% XXX is there any protection against this?
 
-\begin{funcdesc}{RExec}{\optional{hooks\, verbose} }
+\begin{funcdesc}{RExec}{\optional{hooks\optional{\, verbose}}}
 Returns an instance of the \code{RExec} class.
 
-% XXX is ihooks.py documented?  If yes, there should be a ref here
-
 \var{hooks} is an instance of the \code{RHooks} class or a subclass of it.
+If it is omitted or \code{None}, the default \code{RHooks} class is
+instantiated.
 Whenever the RExec module searches for a module (even a built-in one)
 or reads a module's code, it doesn't actually go out to the file
 system itself.  Rather, it calls methods of an RHooks instance that
@@ -30,7 +30,7 @@ object doesn't make these calls---they are made by a module loader
 object that's part of the RExec object.  This allows another level of
 flexibility, e.g. using packages.)
 
-By providing an alternate RHooks object, we can control the actual
+By providing an alternate RHooks object, we can control the
 file system accesses made to import a module, without changing the
 actual algorithm that controls the order in which those accesses are
 made.  For instance, we could substitute an RHooks object that passes
@@ -38,12 +38,11 @@ all filesystem requests to a file server elsewhere, via some RPC
 mechanism such as ILU.  Grail's applet loader uses this to support
 importing applets from a URL for a directory.
 
-% XXX does verbose actually do anything at the moment?
-If \var{verbose} is true, additional debugging output will be sent to
+If \var{verbose} is true, additional debugging output may be sent to
 standard output.
 \end{funcdesc}
 
-RExec instances have the following attributes, which are used by the
+The RExec class has the following class attributes, which are used by the
 \code{__init__} method.  Changing them on an existing instance won't
 have any effect; instead, create a subclass of \code{RExec} and assign
 them new values in the class definition.  Instances of the new class
@@ -54,22 +53,31 @@ strings.
 \begin{datadesc}{nok_builtin_names}
 Contains the names of built-in functions which will \emph{not} be
 available to programs running in the restricted environment.  The
- value for \code{RExec} is \code{('open',} \code{reload',}
- \code{__import__')}.
+value for \code{RExec} is \code{('open',} \code{'reload',}
+\code{'__import__')}.  (This gives the exceptions, because by far the
+majority of built-in functions are harmless.  A subclass that wants to
+override this variable should probably start with the value from the
+base class and concatenate additional forbidden functions --- when new
+dangerous built-in functions are added to Python, they will also be
+added to this module.)
 \end{datadesc}
 
 \begin{datadesc}{ok_builtin_modules}
 Contains the names of built-in modules which can be safely imported.
-The value for \code{RExec} is \code{('array',} \code{'binascii',} \code{'audioop',}
-\code{'imageop',} \code{'marshal',} \code{'math',} \code{'md5',} \code{'parser',} \code{'regex',} \code{'rotor',}
-\code{'select',} \code{'strop',} \code{'struct',} \code{'time')}.
+The value for \code{RExec} is \code{('audioop',} \code{'array',}
+\code{'binascii',} \code{'cmath',} \code{'errno',} \code{'imageop',}
+\code{'marshal',} \code{'math',} \code{'md5',} \code{'operator',}
+\code{'parser',} \code{'regex',} \code{'rotor',} \code{'select',}
+\code{'strop',} \code{'struct',} \code{'time')}.  A similar remark
+about overriding this variable applies --- use the value from the base
+class as a starting point.
 \end{datadesc}
 
 \begin{datadesc}{ok_path}
 Contains the directories which will be searched when an \code{import}
 is performed in the restricted environment.
-The value for \code{RExec} is the same as \code{sys.path} for
-unrestricted code.
+The value for \code{RExec} is the same as \code{sys.path} (at the time
+the module is loaded) for unrestricted code.
 \end{datadesc}
 
 \begin{datadesc}{ok_posix_names}
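The advice in this hunk (start from the base-class value and concatenate, because the attributes are read by `__init__`) can be sketched as follows. This is an illustration under stated assumptions: the `rexec` module is not available in current Python, so `RExecStandIn` is a hypothetical stand-in that only mimics reading the class attribute at construction time, and treating `'eval'` as forbidden is an arbitrary choice for the example.

```python
class RExecStandIn:
    """Hypothetical stand-in for rexec.RExec, carrying the default quoted above."""

    nok_builtin_names = ('open', 'reload', '__import__')

    def __init__(self):
        # The class attribute is consulted once, when the instance is created.
        self.blocked = frozenset(self.nok_builtin_names)


class StricterRExec(RExecStandIn):
    # Start from the base-class value and concatenate, so any names the base
    # class forbids now or later stay forbidden in the subclass.
    nok_builtin_names = RExecStandIn.nok_builtin_names + ('eval',)
```

Assigning to `nok_builtin_names` on an already created instance leaves `blocked` unchanged in this sketch, which mirrors the remark that changing the attributes on an existing instance has no effect.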
@@ -84,35 +92,38 @@ value for \code{RExec} is \code{('error',} \code{'fstat',}
 \end{datadesc}
 
 \begin{datadesc}{ok_sys_names}
-Contains the names of the functions and variables in the \code{sys} module which will be
-available to programs running in the restricted environment.  The
-value for \code{RExec} is \code{('ps1',} \code{'ps2',}
-\code{'copyright',} \code{'version',} \code{'platform',} \code{'exit',}
-\code{'maxint')}.
+Contains the names of the functions and variables in the \code{sys}
+module which will be available to programs running in the restricted
+environment.  The value for \code{RExec} is \code{('ps1',}
+\code{'ps2',} \code{'copyright',} \code{'version',} \code{'platform',}
+\code{'exit',} \code{'maxint')}.
 \end{datadesc}
 
 RExec instances support the following methods:
 \renewcommand{\indexsubitem}{(RExec object method)}
 
 \begin{funcdesc}{r_eval}{code}
-\var{code} must either be a string containing a Python expression, or a compiled code object, which will
-be evaluated in the restricted environment.  The value of the expression or code object will be returned.
+\var{code} must either be a string containing a Python expression, or
+a compiled code object, which will be evaluated in the restricted
+environment's \code{__main__} module.  The value of the expression or
+code object will be returned.
 \end{funcdesc}
 
 \begin{funcdesc}{r_exec}{code}
-\var{code} must either be a string containing one or more lines of Python code,  or a compiled code object,
-which will be executed in the restricted environment.
+\var{code} must either be a string containing one or more lines of
+Python code, or a compiled code object, which will be executed in the
+restricted environment's \code{__main__} module.
 \end{funcdesc}
 
 \begin{funcdesc}{r_execfile}{filename}
 Execute the Python code contained in the file \var{filename} in the
-restricted environment.
+restricted environment's \code{__main__} module.
 \end{funcdesc}
 
 Methods whose names begin with \code{s_} are similar to the functions
 beginning with \code{r_}, but the code will be granted access to
-restricted versions of \code{sys.stdin}, \code{sys.stderr}, and
-\code{sys.stdout}.
+restricted versions of the standard I/O streams \code{sys.stdin},
+\code{sys.stderr}, and \code{sys.stdout}.
 
 \begin{funcdesc}{s_eval}{code}
 \var{code} must be a string containing a Python expression, which will
@@ -129,13 +140,14 @@ Execute the Python code contained in the file \var{filename} in the
 restricted environment.
 \end{funcdesc}
 
-\code{RExec} objects must also support various methods which will be implicitly called
-by code executing in the restricted environment.  Overriding these
-methods in a subclass is used to change the policies enforced by a restricted environment.
+\code{RExec} objects must also support various methods which will be
+implicitly called by code executing in the restricted environment.
+Overriding these methods in a subclass is used to change the policies
+enforced by a restricted environment.
 
-\begin{funcdesc}{r_import}{modulename\optional{\, globals, locals, fromlist}}
-Import the module \var{modulename}, raising an \code{ImportError} exception
-if the module is considered unsafe.
+\begin{funcdesc}{r_import}{modulename\optional{\, globals\, locals\, fromlist}}
+Import the module \var{modulename}, raising an \code{ImportError}
+exception if the module is considered unsafe.
 \end{funcdesc}
 
 \begin{funcdesc}{r_open}{filename\optional{\, mode\optional{\, bufsize}}}
@@ -144,7 +156,8 @@ environment.  The arguments are identical to those of \code{open()},
 and a file object (or a class instance compatible with file objects)
 should be returned.  \code{RExec}'s default behaviour is allow opening
 any file for reading, but forbidding any attempt to write a file.  See
-the example below for an implementation of a less restrictive \code{r_open()}.
+the example below for an implementation of a less restrictive
+\code{r_open()}.
 \end{funcdesc}
 
 \begin{funcdesc}{r_reload}{module}
@@ -152,13 +165,15 @@ Reload the module object \var{module}, re-parsing and re-initializing it.
 \end{funcdesc}
 
 \begin{funcdesc}{r_unload}{module}
-Unload the module object \var{module}.
-% XXX what are the semantics of this?
+Unload the module object \var{module} (i.e., remove it from the
+restricted environment's \code{sys.modules} dictionary).
 \end{funcdesc}
 
+And their equivalents with access to restricted standard I/O streams:
+
 \begin{funcdesc}{s_import}{modulename\optional{\, globals, locals, fromlist}}
-Import the module \var{modulename}, raising an \code{ImportError} exception
-if the module is considered unsafe.
+Import the module \var{modulename}, raising an \code{ImportError}
+exception if the module is considered unsafe.
 \end{funcdesc}
 
 \begin{funcdesc}{s_reload}{module}
@@ -179,13 +194,16 @@ standard RExec class.  For example, if we're willing to allow files in
 \bcode\begin{verbatim}
 class TmpWriterRExec(rexec.RExec):
     def r_open(self, file, mode='r', buf=-1):
-        if mode in ('r', 'rb'): pass
-	elif mode in ('w', 'wb'):
+        if mode in ('r', 'rb'):
+            pass
+        elif mode in ('w', 'wb', 'a', 'ab'):
             # check filename : must begin with /tmp/
-	    if file[0:5]!='/tmp/':
-		raise IOError, "can't open files for writing outside of /tmp"
-	    elif string.find(file, '/../')!=-1:
-		raise IOError, "'..' in filename; open for writing forbidden"
+            if file[:5]!='/tmp/':
+                raise IOError, "can't write outside /tmp"
+            elif (string.find(file, '/../') >= 0 or
+                 file[:3] == '../' or file[-3:] == '/..'):
+                raise IOError, "'..' in filename forbidden"
+        else: raise IOError, "Illegal open() mode"
         return open(file, mode, buf)
 \end{verbatim}\ecode
 
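The checks in the revised `r_open()` example can be restated as a standalone function in current Python syntax, since the `raise IOError, "..."` form used in the hunk is only valid in Python 1.x and 2.x. This is a minimal sketch: `check_tmp_open` is a hypothetical name, `str.find` stands in for `string.find`, and the function only validates the request instead of opening the file.

```python
def check_tmp_open(file, mode='r'):
    """Raise IOError unless (file, mode) passes the checks from the example."""
    if mode in ('r', 'rb'):
        pass  # reading is always allowed
    elif mode in ('w', 'wb', 'a', 'ab'):
        # writing and appending: the filename must begin with /tmp/
        if file[:5] != '/tmp/':
            raise IOError("can't write outside /tmp")
        elif (file.find('/../') >= 0 or
              file[:3] == '../' or file[-3:] == '/..'):
            raise IOError("'..' in filename forbidden")
    else:
        raise IOError("Illegal open() mode")
    return True
```

The check is deliberately conservative: a name such as `/tmp/a/../b` is rejected even though it refers to a file under `/tmp`.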
108 Doc/librexec.tex
 | 
			
		||||
restricted environment's \code{__main__} module.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{r_execfile}{filename}
 | 
			
		||||
Execute the Python code contained in the file \var{filename} in the
 | 
			
		||||
restricted environment.
 | 
			
		||||
restricted environment's \code{__main__} module.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
Methods whose names begin with \code{s_} are similar to the functions
 | 
			
		||||
beginning with \code{r_}, but the code will be granted access to
 | 
			
		||||
restricted versions of \code{sys.stdin}, \code{sys.stderr}, and
 | 
			
		||||
\code{sys.stdout}.  
 | 
			
		||||
restricted versions of the standard I/O streams \code{sys.stdin},
 | 
			
		||||
\code{sys.stderr}, and \code{sys.stdout}.  
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{s_eval}{code}
 | 
			
		||||
\var{code} must be a string containing a Python expression, which will
 | 
			
		||||
| 
						 | 
				
			
			@ -129,13 +140,14 @@ Execute the Python code contained in the file \var{filename} in the
 | 
			
		|||
restricted environment.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\code{RExec} objects must also support various methods which will be implicitly called 
 | 
			
		||||
by code executing in the restricted environment.  Overriding these
 | 
			
		||||
methods in a subclass is used to change the policies enforced by a restricted environment.
 | 
			
		||||
\code{RExec} objects must also support various methods which will be
 | 
			
		||||
implicitly called by code executing in the restricted environment.
 | 
			
		||||
Overriding these methods in a subclass is used to change the policies
 | 
			
		||||
enforced by a restricted environment.
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{r_import}{modulename\optional{\, globals, locals, fromlist}}
 | 
			
		||||
Import the module \var{modulename}, raising an \code{ImportError} exception
 | 
			
		||||
if the module is considered unsafe.  
 | 
			
		||||
\begin{funcdesc}{r_import}{modulename\optional{\, globals\, locals\, fromlist}}
 | 
			
		||||
Import the module \var{modulename}, raising an \code{ImportError}
 | 
			
		||||
exception if the module is considered unsafe.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{r_open}{filename\optional{\, mode\optional{\, bufsize}}}
 | 
			
		||||
| 
						 | 
				
			
			@ -144,7 +156,8 @@ environment.  The arguments are identical to those of \code{open()},
 | 
			
		|||
and a file object (or a class instance compatible with file objects)
 | 
			
		||||
should be returned.  \code{RExec}'s default behaviour is to allow opening
any file for reading, but to forbid any attempt to write a file.  See
 | 
			
		||||
the example below for an implementation of a less restrictive \code{r_open()}.
 | 
			
		||||
the example below for an implementation of a less restrictive
 | 
			
		||||
\code{r_open()}.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{r_reload}{module}
 | 
			
		||||
| 
						 | 
				
			
			@ -152,13 +165,15 @@ Reload the module object \var{module}, re-parsing and re-initializing it.
 | 
			
		|||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{r_unload}{module}
 | 
			
		||||
Unload the module object \var{module}.   
 | 
			
		||||
% XXX what are the semantics of this?  
 | 
			
		||||
Unload the module object \var{module} (i.e., remove it from the
 | 
			
		||||
restricted environment's \code{sys.modules} dictionary).
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
And their equivalents with access to restricted standard I/O streams:
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{s_import}{modulename\optional{\, globals, locals, fromlist}}
 | 
			
		||||
Import the module \var{modulename}, raising an \code{ImportError} exception
 | 
			
		||||
if the module is considered unsafe.  
 | 
			
		||||
Import the module \var{modulename}, raising an \code{ImportError}
 | 
			
		||||
exception if the module is considered unsafe.
 | 
			
		||||
\end{funcdesc}
 | 
			
		||||
 | 
			
		||||
\begin{funcdesc}{s_reload}{module}
 | 
			
		||||
| 
						 | 
				
			
			@ -179,13 +194,16 @@ standard RExec class.  For example, if we're willing to allow files in
 | 
			
		|||
\bcode\begin{verbatim}
 | 
			
		||||
class TmpWriterRExec(rexec.RExec):
 | 
			
		||||
    def r_open(self, file, mode='r', buf=-1):
 | 
			
		||||
        if mode in ('r', 'rb'): pass 
 | 
			
		||||
	elif mode in ('w', 'wb'):
 | 
			
		||||
        if mode in ('r', 'rb'):
 | 
			
		||||
            pass
 | 
			
		||||
        elif mode in ('w', 'wb', 'a', 'ab'):
 | 
			
		||||
            # check filename : must begin with /tmp/
 | 
			
		||||
	    if file[0:5]!='/tmp/': 
 | 
			
		||||
		raise IOError, "can't open files for writing outside of /tmp"
 | 
			
		||||
	    elif string.find(file, '/../')!=-1:
 | 
			
		||||
		raise IOError, "'..' in filename; open for writing forbidden"
 | 
			
		||||
            if file[:5]!='/tmp/': 
 | 
			
		||||
                raise IOError, "can't write outside /tmp"
 | 
			
		||||
            elif (string.find(file, '/../') >= 0 or
 | 
			
		||||
                 file[:3] == '../' or file[-3:] == '/..'):
 | 
			
		||||
                raise IOError, "'..' in filename forbidden"
 | 
			
		||||
        else: raise IOError, "Illegal open() mode"
 | 
			
		||||
        return open(file, mode, buf)
 | 
			
		||||
\end{verbatim}\ecode
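
The filename check above looks for \code{'..'} components by substring matching. An alternative, sketched here under stated assumptions and not taken from the example itself, is to normalize the path first and then test its prefix. The helper name \code{is_safe_tmp_path} and its \code{root} parameter are hypothetical, and the check is purely lexical: it does not resolve symbolic links.

```python
import posixpath


def is_safe_tmp_path(filename, root='/tmp'):
    """Return True if filename, once normalized, lies strictly inside root.

    Hypothetical helper: lexical check only, symbolic links are not resolved.
    """
    # normpath collapses 'a/../b' and trailing slashes without touching the disk.
    normalized = posixpath.normpath(filename)
    return normalized.startswith(root + '/')


print(is_safe_tmp_path('/tmp/scratch.txt'))
print(is_safe_tmp_path('/tmp/../etc/passwd'))
```

An overridden \code{r_open()} could call such a helper before delegating to \code{open()} and raise \code{IOError} when it returns false.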
 | 
			
		||||
 | 
			
		||||