Fixed #22812 -- Refactored lookup API documentation.

Thanks Anssi and Tim for reviews.
Jorge C. Leitão 2014-06-11 17:09:10 +02:00 committed by Tim Graham
parent 503e59c9b0
commit 8780849da0
8 changed files with 223 additions and 153 deletions

@ -1,407 +0,0 @@
==============
Custom lookups
==============
.. versionadded:: 1.7
.. module:: django.db.models.lookups
:synopsis: Custom lookups
.. currentmodule:: django.db.models
By default Django offers a wide variety of :ref:`built-in lookups
<field-lookups>` for filtering (for example, ``exact`` and ``icontains``). This
documentation explains how to write custom lookups and how to alter the working
of existing lookups.
A simple Lookup example
~~~~~~~~~~~~~~~~~~~~~~~
Let's start with a simple custom lookup. We will write a custom lookup ``ne``
which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
will translate to the SQL::
"author"."name" <> 'Jack'
This SQL is backend independent, so we don't need to worry about different
databases.
There are two steps to making this work. Firstly we need to implement the
lookup, then we need to tell Django about it. The implementation is quite
straightforward::
    from django.db.models import Lookup

    class NotEqual(Lookup):
        lookup_name = 'ne'

        def as_sql(self, qn, connection):
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params
            return '%s <> %s' % (lhs, rhs), params
To register the ``NotEqual`` lookup we just need to call ``register_lookup``
on the field class for which we want the lookup to be available. In this case,
the lookup makes sense on all ``Field`` subclasses, so we register it with
``Field`` directly::
    from django.db.models.fields import Field
    Field.register_lookup(NotEqual)
We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
this registration happens before you try to create any querysets using it. You
could place the implementation in a ``models.py`` file, or register the lookup
in the ``ready()`` method of an ``AppConfig``.
Taking a closer look at the implementation, the first required attribute is
``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
and use ``NotEqual`` to generate the SQL. By convention, these names are always
lowercase strings containing only letters, but the only hard requirement is
that it must not contain the string ``__``.
We then need to define the ``as_sql`` method. This takes a ``SQLCompiler``
object, called ``qn``, and the active database connection. ``SQLCompiler``
objects are not documented, but the only thing we need to know about them is
that they have a ``compile()`` method which returns a tuple containing a SQL
string, and the parameters to be interpolated into that string. In most cases,
you don't need to use it directly and can pass it on to ``process_lhs()`` and
``process_rhs()``.
A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
left-hand side and right-hand side. The left-hand side is usually a field
reference, but it can be anything implementing the :ref:`query expression API
<query-expression>`. The right-hand side is the value given by the user. In the
example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
right-hand side.
We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
need for SQL using the ``qn`` object described before. These methods return
tuples containing some SQL and the parameters to be interpolated into that SQL,
just as we need to return from our ``as_sql`` method. In the above example,
``process_lhs`` returns ``('"author"."name"', [])`` and ``process_rhs`` returns
``('"%s"', ['Jack'])``. In this example there were no parameters for the
left-hand side, but this depends on the object we have, so we still need to
include them in the parameters we return.
Finally we combine the parts into a SQL expression with ``<>``, and supply all
the parameters for the query. We then return a tuple containing the generated
SQL string and the parameters.
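To make the tuple handling concrete, here is a minimal sketch that runs without Django. The functions ``fake_process_lhs`` and ``fake_process_rhs`` are hypothetical stand-ins for the real methods, and the right-hand side is assumed to be a bare ``%s`` placeholder; only the way the two ``(sql, params)`` pairs are merged mirrors ``NotEqual.as_sql``.

```python
def fake_process_lhs():
    # Hypothetical stand-in: SQL for the column reference, no parameters.
    return '"author"."name"', []


def fake_process_rhs():
    # Hypothetical stand-in: a placeholder plus the value bound to it.
    return '%s', ['Jack']


def not_equal_sql():
    lhs, lhs_params = fake_process_lhs()
    rhs, rhs_params = fake_process_rhs()
    # Parameters are concatenated in the order their placeholders appear.
    params = lhs_params + rhs_params
    return '%s <> %s' % (lhs, rhs), params


sql, params = not_equal_sql()
print(sql, params)
```

The value ``'Jack'`` never appears in the SQL string itself; it travels separately in ``params`` so the database driver can bind it.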
A simple transformer example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The custom lookup above is great, but in some cases you may want to be able to
chain lookups together. For example, let's suppose we are building an
application where we want to make use of the ``abs()`` operator.
We have an ``Experiment`` model which records a start value, end value and the
change (start - end). We would like to find all experiments where the change
was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
or where it did not exceed a certain amount
(``Experiment.objects.filter(change__abs__lt=27)``).
.. note::
This example is somewhat contrived, but it demonstrates nicely the range of
functionality which is possible in a database backend independent manner,
and without duplicating functionality already in Django.
We will start by writing an ``AbsoluteValue`` transformer. This will use the SQL
function ``ABS()`` to transform the value before comparison::
    from django.db.models import Transform

    class AbsoluteValue(Transform):
        lookup_name = 'abs'

        def as_sql(self, qn, connection):
            lhs, params = qn.compile(self.lhs)
            return "ABS(%s)" % lhs, params
Next, let's register it for ``IntegerField``::
    from django.db.models import IntegerField
    IntegerField.register_lookup(AbsoluteValue)
We can now run the queries we had before.
``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::
SELECT ... WHERE ABS("experiments"."change") = 27
Using ``Transform`` instead of ``Lookup`` means we are able to chain
further lookups afterwards. So
``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
SQL::
SELECT ... WHERE ABS("experiments"."change") < 27
Subclasses of ``Transform`` usually only operate on the left-hand side of the
expression. Further lookups will work on the transformed value. Note that in
this case where there is no other lookup specified, Django interprets
``change__abs=27`` as ``change__abs__exact=27``.
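The nesting of a transform inside a lookup can be illustrated with a small Django-free sketch. All three classes below are hypothetical stand-ins (they take no compiler or connection argument); they only show how each layer wraps the SQL of its ``lhs`` and passes the parameters along.

```python
class ColumnSketch:
    """Hypothetical stand-in for a field reference."""

    def __init__(self, sql):
        self.sql = sql

    def as_sql(self):
        return self.sql, []


class AbsSketch:
    """Hypothetical transform: wraps the SQL of its lhs in ABS()."""

    def __init__(self, lhs):
        self.lhs = lhs

    def as_sql(self):
        lhs, params = self.lhs.as_sql()
        return 'ABS(%s)' % lhs, params


class LessThanSketch:
    """Hypothetical lookup: compares the compiled lhs with a bound value."""

    def __init__(self, lhs, rhs):
        self.lhs = lhs
        self.rhs = rhs

    def as_sql(self):
        lhs, params = self.lhs.as_sql()
        # '%%s' leaves a literal '%s' placeholder for the database driver.
        return '%s < %%s' % lhs, params + [self.rhs]


change = ColumnSketch('"experiments"."change"')
sql, params = LessThanSketch(AbsSketch(change), 27).as_sql()
print(sql, params)
```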
When looking for which lookups are allowable after the ``Transform`` has been
applied, Django uses the ``output_field`` attribute. We didn't need to specify
this here as it didn't change, but supposing we were applying ``AbsoluteValue``
to some field which represents a more complex type (for example a point
relative to an origin, or a complex number) then we may have wanted to specify
``output_field = FloatField``, which will ensure that further lookups like
``abs__lte`` behave as they would for a ``FloatField``.
Writing an efficient abs__lt lookup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When using the above written ``abs`` lookup, the SQL produced will not use
indexes efficiently in some cases. In particular, when we use
``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
the following SQL::
SELECT ... WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
The implementation is::
    from django.db.models import Lookup

    class AbsoluteValueLessThan(Lookup):
        lookup_name = 'lt'

        def as_sql(self, qn, connection):
            lhs, lhs_params = qn.compile(self.lhs.lhs)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params + lhs_params + rhs_params
            return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params

    AbsoluteValue.register_lookup(AbsoluteValueLessThan)
There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
isn't calling ``process_lhs()``. Instead it skips the transformation of the
``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
want to get ``"experiments"."change"`` not ``ABS("experiments"."change")``.
Referring directly to ``self.lhs.lhs`` is safe as ``AbsoluteValueLessThan`` can
be accessed only from the ``AbsoluteValue`` lookup, that is, the ``lhs`` is
always an instance of ``AbsoluteValue``.
Notice also that as both sides are used multiple times in the query the params
need to contain ``lhs_params`` and ``rhs_params`` multiple times.
The final query does the inversion (``27`` to ``-27``) directly in the
database. The reason for doing this is that if ``self.rhs`` is something other
than a plain integer value (for example an ``F()`` reference) we can't do the
transformation in Python.
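The parameter ordering is easy to get wrong, so here is a small Django-free sketch of the same string building. The function ``abs_less_than_sql`` is a hypothetical helper, and the ``COALESCE`` left-hand side in the second call is an invented example of an ``lhs`` that carries its own parameter.

```python
def abs_less_than_sql(lhs, lhs_params, rhs, rhs_params):
    # Each side appears twice in the SQL, so its parameters must be
    # repeated in the same left-to-right order as the placeholders.
    sql = '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs)
    params = lhs_params + rhs_params + lhs_params + rhs_params
    return sql, params


# Plain column on the left, one bound value on the right.
simple_sql, simple_params = abs_less_than_sql(
    '"experiments"."change"', [], '%s', [27])

# Hypothetical lhs that has a parameter of its own.
nested_sql, nested_params = abs_less_than_sql(
    'COALESCE("experiments"."change", %s)', [0], '%s', [27])

# The number of placeholders always matches the number of parameters.
assert simple_sql.count('%s') == len(simple_params)
assert nested_sql.count('%s') == len(nested_params)
print(simple_sql, simple_params)
print(nested_sql, nested_params)
```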
.. note::
In fact, most lookups with ``__abs`` could be implemented as range queries
like this, and on most database backends it is likely to be more sensible to
do so as you can make use of the indexes. However with PostgreSQL you may
want to add an index on ``abs(change)`` which would allow these queries to
be very efficient.
Writing alternative implementations for existing lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sometimes different database vendors require different SQL for the same
operation. For this example we will write a custom implementation of the
``NotEqual`` operator for MySQL. Instead of ``<>`` we will use the ``!=``
operator. (Note that in reality almost all databases support both, including
all the official databases supported by Django).
We can change the behavior on a specific backend by creating a subclass of
``NotEqual`` with an ``as_mysql`` method::
    class MySQLNotEqual(NotEqual):
        def as_mysql(self, qn, connection):
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params
            return '%s != %s' % (lhs, rhs), params

    Field.register_lookup(MySQLNotEqual)
We can then register it with ``Field``. It takes the place of the original
``NotEqual`` class as it has the same ``lookup_name``.
When compiling a query, Django first looks for a method named
``'as_%s' % connection.vendor``, and then falls back to ``as_sql``. The vendor
names for the built-in backends are ``sqlite``, ``postgresql``, ``oracle`` and
``mysql``.
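The vendor fallback can be sketched in a few lines of plain Python. This is not Django's compiler code: ``FakeConnection``, ``NotEqualSketch`` and ``compile_sketch`` are hypothetical names, and the methods take only a connection to keep the example short.

```python
class FakeConnection:
    """Hypothetical stand-in exposing only the ``vendor`` attribute."""

    def __init__(self, vendor):
        self.vendor = vendor


class NotEqualSketch:
    def as_sql(self, connection):
        return 'lhs <> rhs'

    def as_mysql(self, connection):
        return 'lhs != rhs'


def compile_sketch(node, connection):
    # Prefer a vendor-specific method such as ``as_mysql`` when it exists.
    vendor_impl = getattr(node, 'as_' + connection.vendor, None)
    if vendor_impl is not None:
        return vendor_impl(connection)
    # Otherwise fall back to the generic implementation.
    return node.as_sql(connection)


mysql_sql = compile_sketch(NotEqualSketch(), FakeConnection('mysql'))
sqlite_sql = compile_sketch(NotEqualSketch(), FakeConnection('sqlite'))
print(mysql_sql, sqlite_sql)
```

This is also why calling ``as_sql()`` directly on an expression would skip any vendor-specific override.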
How Django determines the lookups and transforms which are used
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In some cases you may wish to dynamically change which ``Transform`` or
``Lookup`` is returned based on the name passed in, rather than fixing it. As
an example, you could have a field which stores coordinates of an arbitrary
dimension, and wish to allow a syntax like ``.filter(coords__x7=4)`` to return
the objects where the 7th coordinate has value 4. In order to do this, you
would override ``get_lookup`` with something like::
    class CoordinatesField(Field):
        def get_lookup(self, lookup_name):
            if lookup_name.startswith('x'):
                try:
                    dimension = int(lookup_name[1:])
                except ValueError:
                    # Not a coordinate name; fall through to the default.
                    pass
                else:
                    return get_coordinate_lookup(dimension)
            return super(CoordinatesField, self).get_lookup(lookup_name)
You would then define ``get_coordinate_lookup`` appropriately to return a
``Lookup`` subclass which handles the relevant value of ``dimension``.
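The name-parsing step can be tried on its own, without Django. The helper ``parse_dimension`` below is hypothetical; it only extracts the dimension from names such as ``x7`` and signals "not a coordinate lookup" by returning ``None``, which is the case where the field would defer to the default ``get_lookup``.

```python
def parse_dimension(lookup_name):
    """Return the coordinate index encoded in ``lookup_name``, or None."""
    if lookup_name.startswith('x'):
        try:
            return int(lookup_name[1:])
        except ValueError:
            # Starts with 'x' but is not followed by an integer.
            return None
    return None


print(parse_dimension('x7'))
print(parse_dimension('exact'))
```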
There is a similarly named method called ``get_transform()``. ``get_lookup()``
should always return a ``Lookup`` subclass, and ``get_transform()`` a
``Transform`` subclass. It is important to remember that ``Transform``
objects can be further filtered on, and ``Lookup`` objects cannot.
When filtering, if there is only one lookup name remaining to be resolved, we
will look for a ``Lookup``. If there are multiple names, we will look for a
``Transform``. In the situation where there is only one name and a ``Lookup``
is not found, we look for a ``Transform`` and then the ``exact`` lookup on that
``Transform``. All call sequences always end with a ``Lookup``. To clarify:
- ``.filter(myfield__mylookup)`` will call ``myfield.get_lookup('mylookup')``.
- ``.filter(myfield__mytransform__mylookup)`` will call
``myfield.get_transform('mytransform')``, and then
``mytransform.get_lookup('mylookup')``.
- ``.filter(myfield__mytransform)`` will first call
``myfield.get_lookup('mytransform')``, which will fail, so it will fall back
to calling ``myfield.get_transform('mytransform')`` and then
``mytransform.get_lookup('exact')``.
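The three call sequences above can be reproduced with a small Django-free sketch. ``RegistrySketch`` and ``resolve`` are hypothetical: the registry is a toy object that knows some lookup and transform names, and ``resolve`` records which methods would be called, in order, under the rules just described.

```python
class RegistrySketch:
    """Hypothetical object that knows some lookup and transform names."""

    def __init__(self, lookups=(), transforms=None):
        self.lookups = set(lookups)
        self.transforms = transforms or {}

    def get_lookup(self, name):
        return name if name in self.lookups else None

    def get_transform(self, name):
        return self.transforms.get(name)


def resolve(field, names):
    """Return the sequence of calls made while resolving ``names``."""
    trace = []
    current = field
    # Every name except the last must be a transform.
    for name in names[:-1]:
        trace.append('get_transform(%r)' % name)
        current = current.get_transform(name)
        if current is None:
            raise LookupError('unknown transform %r' % name)
    last = names[-1]
    trace.append('get_lookup(%r)' % last)
    if current.get_lookup(last) is None:
        # Fall back: treat the last name as a transform followed by 'exact'.
        trace.append('get_transform(%r)' % last)
        current = current.get_transform(last)
        if current is None:
            raise LookupError('unknown lookup or transform %r' % last)
        trace.append("get_lookup('exact')")
        if current.get_lookup('exact') is None:
            raise LookupError("no 'exact' lookup after %r" % last)
    return trace


mytransform = RegistrySketch(lookups={'exact', 'mylookup'})
myfield = RegistrySketch(lookups={'mylookup'},
                         transforms={'mytransform': mytransform})

print(resolve(myfield, ['mylookup']))
print(resolve(myfield, ['mytransform', 'mylookup']))
print(resolve(myfield, ['mytransform']))
```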
Lookups and transforms are registered using the same API - ``register_lookup``.
.. _query-expression:
The Query Expression API
~~~~~~~~~~~~~~~~~~~~~~~~
A lookup can assume that the lhs responds to the query expression API.
Currently direct field references, aggregates and ``Transform`` instances respond
to this API.
.. method:: as_sql(qn, connection)
Responsible for producing the query string and parameters for the
expression. The ``qn`` is a ``SQLCompiler`` object, which has a
``compile()`` method that can be used to compile other expressions. The
``connection`` is the connection used to execute the query.
Calling ``expression.as_sql()`` directly is usually incorrect - instead
``qn.compile(expression)`` should be used. The ``qn.compile()`` method will
take care of calling vendor-specific methods of the expression.
.. method:: as_vendorname(qn, connection)
Works like the ``as_sql()`` method. When an expression is compiled by
``qn.compile()``, Django will first try to call ``as_vendorname()``, where
vendorname is the vendor name of the backend used for executing the query.
The vendorname is one of ``postgresql``, ``oracle``, ``sqlite`` or
``mysql`` for Django's built-in backends.
.. method:: get_lookup(lookup_name)
The ``get_lookup()`` method is used to fetch lookups. By default the
lookup is fetched from the expression's output type in the same way as
described in the "Registering and fetching lookups" section below.
It is possible to override this method to alter that behavior.
.. method:: get_transform(lookup_name)
The ``get_transform()`` method is used when a transform is needed rather
than a lookup, or if a lookup is not found. This is a more complex
situation which is useful when there are arbitrary possible lookups for a
field. Generally speaking, you will not need to override ``get_lookup()``
or ``get_transform()``, and can use ``register_lookup()`` instead.
.. attribute:: output_field
The ``output_field`` attribute is used by the ``get_lookup()`` method to
check for lookups. The ``output_field`` should be a field.
Note that this documentation lists only the public methods of the API.
Lookup reference
~~~~~~~~~~~~~~~~
.. class:: Lookup
In addition to the attributes and methods below, lookups also support
``as_sql`` and ``as_vendorname`` from the query expression API.
.. attribute:: lhs
The ``lhs`` (left-hand side) of a lookup tells us what we are comparing the
rhs to. It is an object which implements the query expression API. This is
likely to be a field, an aggregate or a subclass of ``Transform``.
.. attribute:: rhs
The ``rhs`` (right-hand side) of a lookup is the value we are comparing the
left-hand side to. It may be a plain value, or something which compiles
into SQL, for example an ``F()`` object or a ``QuerySet``.
.. attribute:: lookup_name
This class level attribute is used when registering lookups. It determines
the name used in queries to trigger this lookup. For example, ``contains``
or ``exact``. This should not contain the string ``__``.
.. method:: process_lhs(qn, connection)
This returns a tuple of ``(lhs_string, lhs_params)``. In some cases you may
wish to compile ``lhs`` directly in your ``as_sql`` methods using
``qn.compile(self.lhs)``.
.. method:: process_rhs(qn, connection)
Behaves the same as ``process_lhs`` but acts on the right-hand side.
Transform reference
~~~~~~~~~~~~~~~~~~~
.. class:: Transform
In addition to implementing the query expression API, transforms have the
following methods and attributes.
.. attribute:: lhs
The ``lhs`` (left-hand-side) of a transform contains the value to be
transformed. The ``lhs`` implements the query expression API.
.. attribute:: lookup_name
This class level attribute is used when registering lookups. It determines
the name used in queries to trigger this lookup. For example, ``year``
or ``dayofweek``. This should not contain the string ``__``.
.. _lookup-registration-api:
Registering and fetching lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The lookup registration API is explained below.
.. classmethod:: register_lookup(lookup)
Registers the Lookup or Transform for the class. For example
``DateField.register_lookup(YearExact)`` will register ``YearExact`` for
all ``DateFields`` in the project, but also for fields that are instances
of a subclass of ``DateField`` (for example ``DateTimeField``). You can
register a Lookup or a Transform using the same class method.
.. method:: get_lookup(lookup_name)
Django uses ``get_lookup(lookup_name)`` to fetch lookups. The
implementation of ``get_lookup()`` looks for a subclass which is registered
for the current class with the correct ``lookup_name``.
.. method:: get_transform(lookup_name)
Django uses ``get_transform(lookup_name)`` to fetch transforms. The
implementation of ``get_transform()`` looks for a subclass which is registered
for the current class with the correct ``lookup_name``.
The lookup registration API is available for ``Transform`` and ``Field`` classes.

@ -14,4 +14,4 @@ Model API reference. For introductory material, see :doc:`/topics/db/models`.
instances
querysets
queries
custom-lookups
lookups

docs/ref/models/lookups.txt Normal file

@ -0,0 +1,207 @@
====================
Lookup API reference
====================
.. module:: django.db.models.lookups
:synopsis: Lookups API
.. currentmodule:: django.db.models
.. versionadded:: 1.7
This document is the API reference for lookups, the Django API for building
the ``WHERE`` clause of a database query. To learn how to *use* lookups, see
:doc:`/topics/db/queries`; to learn how to *create* new lookups, see
:doc:`/howto/custom-lookups`.
The lookup API has two components: a :class:`~lookups.RegisterLookupMixin` class
that registers lookups, and the :ref:`Query Expression API <query-expression>`,
a set of methods that a class has to implement to be registrable as a lookup.
Django has two base classes that follow the query expression API and from which
all of Django's built-in lookups are derived:
* :class:`Lookup`: to look up a field (e.g. the ``exact`` of ``field_name__exact``)
* :class:`Transform`: to transform a field
A lookup expression consists of three parts:
* Fields part (e.g. ``Book.objects.filter(author__best_friends__first_name...``);
* Transforms part (may be omitted) (e.g. ``__lower__first3chars__reversed``);
* A lookup (e.g. ``__icontains``) that, if omitted, defaults to ``__exact``.
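As a rough illustration of the three parts, the sketch below splits a lookup expression on ``__``. It is not how Django parses expressions: ``split_lookup`` is a hypothetical helper, and the sets of known field and transform names are passed in explicitly as an assumption, whereas Django discovers them from the model and from registered lookups.

```python
def split_lookup(path, field_names, transform_names):
    """Split ``path`` into (fields, transforms, lookup)."""
    parts = path.split('__')
    fields = []
    while parts and parts[0] in field_names:
        fields.append(parts.pop(0))
    transforms = []
    while parts and parts[0] in transform_names:
        transforms.append(parts.pop(0))
    if not parts:
        # No explicit lookup: default to 'exact'.
        lookup = 'exact'
    elif len(parts) == 1:
        lookup = parts[0]
    else:
        raise ValueError('cannot resolve %r' % '__'.join(parts))
    return fields, transforms, lookup


known_fields = {'author', 'best_friends', 'first_name'}
known_transforms = {'lower'}

full = split_lookup('author__best_friends__first_name__lower__icontains',
                    known_fields, known_transforms)
short = split_lookup('author__first_name', known_fields, known_transforms)
print(full)
print(short)
```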
.. _lookup-registration-api:
Registration API
~~~~~~~~~~~~~~~~
Django uses :class:`~lookups.RegisterLookupMixin` to give a class the interface to
register lookups on itself. Two prominent examples are
:class:`~django.db.models.Field`, the base class of all model fields, and
``Aggregate``, the base class of all Django aggregates.
.. class:: lookups.RegisterLookupMixin
A mixin that implements the lookup API on a class.
.. classmethod:: register_lookup(lookup)
Registers a new lookup in the class. For example
``DateField.register_lookup(YearExact)`` will register the ``YearExact``
lookup on ``DateField``. It overrides a lookup that already exists with
the same name.
.. method:: get_lookup(lookup_name)
Returns the :class:`Lookup` named ``lookup_name`` registered in the class.
The default implementation looks recursively on all parent classes
and checks if any has a registered lookup named ``lookup_name``, returning
the first match.
.. method:: get_transform(transform_name)
Returns a :class:`Transform` named ``transform_name``. The default
implementation looks recursively on all parent classes to check if any
has the registered transform named ``transform_name``, returning the first
match.
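The "look on parent classes, first match wins" behavior can be sketched without Django. ``RegisterLookupSketch`` below is a hypothetical re-implementation of the idea, not the real mixin; the attribute name ``class_lookups`` and the use of the method resolution order are assumptions made for this sketch.

```python
class RegisterLookupSketch:
    """Hypothetical mixin: per-class registry searched along the MRO."""

    @classmethod
    def register_lookup(cls, lookup):
        # Give each class its own dict so a registration on a subclass
        # does not leak into its parents or siblings.
        if 'class_lookups' not in cls.__dict__:
            cls.class_lookups = {}
        cls.class_lookups[lookup.lookup_name] = lookup

    def get_lookup(self, lookup_name):
        # Walk the class and its parents; the first match wins.
        for klass in type(self).__mro__:
            found = klass.__dict__.get('class_lookups', {}).get(lookup_name)
            if found is not None:
                return found
        return None


class FieldSketch(RegisterLookupSketch):
    pass


class DateFieldSketch(FieldSketch):
    pass


class Exact:
    lookup_name = 'exact'


class DateExact:
    lookup_name = 'exact'


FieldSketch.register_lookup(Exact)
inherited = DateFieldSketch().get_lookup('exact')   # found on the parent

DateFieldSketch.register_lookup(DateExact)
overridden = DateFieldSketch().get_lookup('exact')  # subclass entry wins
print(inherited.__name__, overridden.__name__)
```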
For a class to be a lookup, it must follow the :ref:`Query Expression API
<query-expression>`. :class:`~Lookup` and :class:`~Transform` naturally
follow this API.
.. _query-expression:
The Query Expression API
~~~~~~~~~~~~~~~~~~~~~~~~
The query expression API is a common set of methods that classes define to be
usable in query expressions to translate themselves into SQL expressions. Direct
field references, aggregates, and ``Transform`` are examples that follow this
API. A class is said to follow the query expression API when it implements the
following methods:
.. method:: as_sql(self, qn, connection)
Responsible for producing the query string and parameters for the expression.
The ``qn`` is an ``SQLCompiler`` object, which has a ``compile()`` method
that can be used to compile other expressions. The ``connection`` is the
connection used to execute the query.
Calling ``expression.as_sql()`` is usually incorrect - instead
``qn.compile(expression)`` should be used. The ``qn.compile()`` method will
take care of calling vendor-specific methods of the expression.
.. method:: as_vendorname(self, qn, connection)
Works like the ``as_sql()`` method. When an expression is compiled by
``qn.compile()``, Django will first try to call ``as_vendorname()``, where
``vendorname`` is the vendor name of the backend used for executing the
query. The ``vendorname`` is one of ``postgresql``, ``oracle``, ``sqlite``,
or ``mysql`` for Django's built-in backends.
.. method:: get_lookup(lookup_name)
Must return the lookup named ``lookup_name``. For instance, by returning
``self.output_field.get_lookup(lookup_name)``.
.. method:: get_transform(transform_name)
Must return the transform named ``transform_name``. For instance, by returning
``self.output_field.get_transform(transform_name)``.
.. attribute:: output_field
Defines the type of class returned by the ``get_lookup()`` method. It must
be a :class:`~django.db.models.Field` instance.
Transform reference
~~~~~~~~~~~~~~~~~~~
.. class:: Transform
A ``Transform`` is a generic class to implement field transformations. A
prominent example is ``__year`` that transforms a ``DateField`` into an
``IntegerField``.
The notation to use a ``Transform`` in a lookup expression is
``<expression>__<transformation>`` (e.g. ``date__year``).
This class follows the :ref:`Query Expression API <query-expression>`, which
implies that you can use ``<expression>__<transform1>__<transform2>``.
.. attribute:: lhs
The left-hand side - what is being transformed. It must follow the
:ref:`Query Expression API <query-expression>`.
.. attribute:: lookup_name
The name of the lookup, used for identifying it on parsing query
expressions. It cannot contain the string ``"__"``.
.. attribute:: output_field
Defines the class this transformation outputs. It must be a
:class:`~django.db.models.Field` instance. By default it is the same as
its ``lhs.output_field``.
.. method:: as_sql
To be overridden; raises :exc:`NotImplementedError`.
.. method:: get_lookup(lookup_name)
Same as :meth:`~lookups.RegisterLookupMixin.get_lookup()`.
.. method:: get_transform(transform_name)
Same as :meth:`~lookups.RegisterLookupMixin.get_transform()`.
Lookup reference
~~~~~~~~~~~~~~~~
.. class:: Lookup
A ``Lookup`` is a generic class to implement lookups. A lookup is a query
expression with a left-hand side, :attr:`lhs`; a right-hand side,
:attr:`rhs`; and a ``lookup_name`` that is used to produce a boolean
comparison between ``lhs`` and ``rhs`` such as ``lhs in rhs`` or
``lhs > rhs``.
The notation to use a lookup in an expression is
``<lhs>__<lookup_name>=<rhs>``.
This class doesn't follow the :ref:`Query Expression API <query-expression>`
since it has ``=<rhs>`` on its construction: lookups are always the end of
a lookup expression.
.. attribute:: lhs
The left-hand side - what is being looked up. The object must follow
the :ref:`Query Expression API <query-expression>`.
.. attribute:: rhs
The right-hand side - what ``lhs`` is being compared against. It can be
a plain value, or something that compiles into SQL, typically an
``F()`` object or a ``QuerySet``.
.. attribute:: lookup_name
The name of this lookup, used to identify it on parsing query
expressions. It cannot contain the string ``"__"``.
.. method:: process_lhs(qn, connection[, lhs=None])
Returns a tuple ``(lhs_string, lhs_params)``, as returned by
``qn.compile(lhs)``. This method can be overridden to tune how the
``lhs`` is processed.
``qn`` is an ``SQLCompiler`` object, to be used like ``qn.compile(lhs)``
for compiling ``lhs``. The ``connection`` can be used for compiling
vendor specific SQL. If ``lhs`` is not ``None``, use it as the
processed ``lhs`` instead of ``self.lhs``.
.. method:: process_rhs(qn, connection)
Behaves the same way as :meth:`process_lhs`, for the right-hand side.
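The optional ``lhs`` argument of ``process_lhs()`` can be illustrated with a Django-free sketch. ``CompilerSketch``, ``ColumnSketch`` and ``LookupSketch`` are hypothetical stand-ins; the only behavior taken from the description above is that a non-``None`` ``lhs`` is compiled in place of ``self.lhs``.

```python
class CompilerSketch:
    """Hypothetical compiler: delegates to the node's ``as_sql``."""

    def compile(self, node):
        return node.as_sql(self, None)


class ColumnSketch:
    """Hypothetical stand-in for a field reference."""

    def __init__(self, sql):
        self.sql = sql

    def as_sql(self, qn, connection):
        return self.sql, []


class LookupSketch:
    def __init__(self, lhs, rhs):
        self.lhs = lhs
        self.rhs = rhs

    def process_lhs(self, qn, connection, lhs=None):
        # Use the caller-supplied lhs when given, otherwise our own.
        if lhs is None:
            lhs = self.lhs
        return qn.compile(lhs)


qn = CompilerSketch()
lookup = LookupSketch(ColumnSketch('"author"."name"'), 'Jack')

default_result = lookup.process_lhs(qn, None)
override_result = lookup.process_lhs(
    qn, None, lhs=ColumnSketch('"author"."nickname"'))
print(default_result, override_result)
```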

@ -2061,7 +2061,7 @@ For an introduction, see :ref:`models and database queries documentation
<field-lookups-intro>`.
Django's inbuilt lookups are listed below. It is also possible to write
:doc:`custom lookups </ref/models/custom-lookups>` for model fields.
:doc:`custom lookups </howto/custom-lookups>` for model fields.
As a convenience when no lookup type is provided (like in
``Entry.objects.get(id=14)``) the lookup type is assumed to be :lookup:`exact`.