Fixed #16187 -- refactored ORM lookup system

Allowed users to specify which lookups or transforms ("nested lookus")
are available for fields. The implementation is now class based.

Squashed commit of the following:

commit fa7a7195f1
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 18 10:53:24 2014 +0200

    Added lookup registration API docs

commit eb1c8ce164
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Tue Jan 14 18:59:36 2014 +0200

    Release notes and other minor docs changes

commit 11501c29c9
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Jan 12 20:53:03 2014 +0200

    Forgot to add custom_lookups tests in prev commit

commit 83173b960e
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Jan 12 19:59:12 2014 +0200

    Renamed Extract -> Transform

commit 3b18d9f3a1
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Jan 12 19:51:53 2014 +0200

    Removed suggestion of temporary lookup registration from docs

commit 21d0c7631c
Merge: 2509006 f2dc442
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Jan 12 09:38:23 2014 -0800

    Merge pull request #2 from mjtamlyn/lookups_3

    Reworked custom lookups docs.

commit f2dc4429a1
Author: Marc Tamlyn <marc.tamlyn@gmail.com>
Date:   Sun Jan 12 13:15:05 2014 +0000

    Reworked custom lookups docs.

    Mostly just formatting and rewording, but also replaced the example
    using ``YearExtract`` to  use an example which is unlikely to ever be
    possible directly in the ORM.

commit 2509006506
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Jan 12 13:19:13 2014 +0200

    Removed unused import

commit 4fba5dfaa0
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 22:34:41 2014 +0200

    Added docs to index

commit 6d53963f37
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 22:10:24 2014 +0200

    Dead code removal

commit f9cc039007
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 19:00:43 2014 +0200

    A new try for docs

commit 33aa18a6e3
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 14:57:12 2014 +0200

    Renamed get_cols to get_group_by_cols

commit c7d5f8661b
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 14:45:53 2014 +0200

    Altered query string customization for backends vendors

    The new way is trying to call first method 'as_' + connection.vendor.
    If that doesn't exist, then call as_sql().

    Also altered how lookup registration is done. There is now
    RegisterLookupMixin class that is used by Field, Extract and
    sql.Aggregate. This allows one to register lookups for extracts and
    aggregates in the same way lookup registration is done for fields.

commit 90e7004ec1
Merge: 66649ff f7c2c0a
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 13:21:01 2014 +0200

    Merge branch 'master' into lookups_3

commit 66649ff891
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Jan 11 13:16:01 2014 +0200

    Some rewording in docs

commit 31b8faa627
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 29 15:52:29 2013 +0200

    Cleanup based on review comments

commit 1016159f34
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 28 18:37:04 2013 +0200

    Proof-of-concept fix for #16731

    Implemented only for SQLite and PostgreSQL, and only for startswith
    and istartswith lookups.

commit 193cd097ca
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 28 17:57:58 2013 +0200

    Fixed #11722 -- iexact=F() produced invalid SQL

commit 08ed3c3b49
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 21 23:59:52 2013 +0200

    Made Lookup and Extract available from django.db.models

commit b99c8d83c9
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 21 23:06:29 2013 +0200

    Fixed review notes by Loic

commit 049eebc070
Merge: ed8fab7 b80a835
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 21 22:53:10 2013 +0200

    Merge branch 'master' into lookups_3

    Conflicts:
    	django/db/models/fields/__init__.py
    	django/db/models/sql/compiler.py
    	django/db/models/sql/query.py
    	tests/null_queries/tests.py

commit ed8fab7fe8
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Dec 21 22:47:23 2013 +0200

    Made Extracts aware of full lookup path

commit 27a57b7aed
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 1 21:10:11 2013 +0200

    Removed debugger import

commit 074e0f5aca
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 1 21:02:16 2013 +0200

    GIS lookup support added

commit 760e28e72b
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 1 20:04:31 2013 +0200

    Removed usage of Constraint, used Lookup instead

commit eac4776684
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 1 02:22:30 2013 +0200

    Minor cleanup of Lookup API

commit 2adf50428d
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sun Dec 1 02:14:19 2013 +0200

    Added documentation, polished implementation

commit 32c04357a8
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Nov 30 23:10:15 2013 +0200

    Avoid OrderedDict creation on lookup aggregate check

commit 7c8b3a32cc
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Sat Nov 30 23:04:34 2013 +0200

    Implemented nested lookups

    But there is no support of using lookups outside filtering yet.

commit 4d219d4cde
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Wed Nov 27 22:07:30 2013 +0200

    Initial implementation of custom lookups
This commit is contained in:
Anssi Kääriäinen 2014-01-18 11:09:43 +02:00
parent b87c59b04b
commit 20bab2cf9d
38 changed files with 1324 additions and 185 deletions

View file

@ -662,6 +662,12 @@ Django filter lookups: ``exact``, ``iexact``, ``contains``, ``icontains``,
``endswith``, ``iendswith``, ``range``, ``year``, ``month``, ``day``,
``isnull``, ``search``, ``regex``, and ``iregex``.
.. versionadded:: 1.7
If you are using :doc:`Custom lookups </ref/models/custom-lookups>` the
``lookup_type`` can be any ``lookup_name`` used by the project's custom
lookups.
Your method must be prepared to handle all of these ``lookup_type`` values and
should raise either a ``ValueError`` if the ``value`` is of the wrong sort (a
list when you were expecting an object, for example) or a ``TypeError`` if

View file

@ -81,7 +81,8 @@ manipulating the data of your Web application. Learn more about it below:
:doc:`Transactions <topics/db/transactions>` |
:doc:`Aggregation <topics/db/aggregation>` |
:doc:`Custom fields <howto/custom-model-fields>` |
:doc:`Multiple databases <topics/db/multi-db>`
:doc:`Multiple databases <topics/db/multi-db>` |
:doc:`Custom lookups <ref/models/custom-lookups>`
* **Other:**
:doc:`Supported databases <ref/databases>` |

View file

@ -0,0 +1,336 @@
==============
Custom lookups
==============
.. versionadded:: 1.7
.. module:: django.db.models.lookups
:synopsis: Custom lookups
.. currentmodule:: django.db.models
By default Django offers a wide variety of :ref:`built-in lookups
<field-lookups>` for filtering (for example, ``exact`` and ``icontains``). This
documentation explains how to write custom lookups and how to alter the working
of existing lookups.
A simple Lookup example
~~~~~~~~~~~~~~~~~~~~~~~
Let's start with a simple custom lookup. We will write a custom lookup ``ne``
which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
will translate to the SQL::
"author"."name" <> 'Jack'
This SQL is backend independent, so we don't need to worry about different
databases.
There are two steps to making this work. Firstly we need to implement the
lookup, then we need to tell Django about it. The implementation is quite
straightforward::
from django.db.models import Lookup
class NotEqual(Lookup):
lookup_name = 'ne'
def as_sql(self, qn, connection):
lhs, lhs_params = self.process_lhs(qn, connection)
rhs, rhs_params = self.process_rhs(qn, connection)
params = lhs_params + rhs_params
return '%s <> %s' % (lhs, rhs), params
To register the ``NotEqual`` lookup we will just need to call
``register_lookup`` on the field class we want the lookup to be available. In
this case, the lookup makes sense on all ``Field`` subclasses, so we register
it with ``Field`` directly::
from django.db.models.fields import Field
Field.register_lookup(NotEqual)
We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
this registration happens before you try to create any querysets using it. You
could place the implementation in a ``models.py`` file, or register the lookup
in the ``ready()`` method of an ``AppConfig``.
Taking a closer look at the implementation, the first required attribute is
``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
and use ``NotEqual`` to generate the SQL. By convention, these names are always
lowercase strings containing only letters, but the only hard requirement is
that it must not contain the string ``__``.
A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
left-hand side and right-hand side. The left-hand side is usually a field
reference, but it can be anything implementing the :ref:`query expression API
<query-expression>`. The right-hand is the value given by the user. In the
example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
right-hand side.
We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
need for SQL. In the above example, ``process_lhs`` returns
``('"author"."name"', [])`` and ``process_rhs`` returns ``('"%s"', ['Jack'])``.
In this example there were no parameters for the left hand side, but this would
depend on the object we have, so we still need to include them in the
parameters we return.
Finally we combine the parts into a SQL expression with ``<>``, and supply all
the parameters for the query. We then return a tuple containing the generated
SQL string and the parameters.
A simple transformer example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The custom lookup above is great, but in some cases you may want to be able to
chain lookups together. For example, let's suppose we are building an
application where we want to make use of the ``abs()`` operator.
We have an ``Experiment`` model which records a start value, end value and the
change (start - end). We would like to find all experiments where the change
was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
or where it did not exceede a certain amount
(``Experiment.objects.filter(change__abs__lt=27)``).
.. note::
This example is somewhat contrived, but it demonstrates nicely the range of
functionality which is possible in a database backend independent manner,
and without duplicating functionality already in Django.
We will start by writing a ``AbsoluteValue`` transformer. This will use the SQL
function ``ABS()`` to transform the value before comparison::
from django.db.models import Transform
class AbsoluteValue(Transform):
lookup_name = 'abs'
def as_sql(self, qn, connection):
lhs, params = qn.compile(self.lhs)
return "ABS(%s)" % lhs, params
Next, lets register it for ``IntegerField``::
from django.db.models import IntegerField
IntegerField.register_lookup(AbsoluteValue)
We can now run the queris we had before.
``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::
SELECT ... WHERE ABS("experiments"."change") = 27
By using ``Transform`` instead of ``Lookup`` it means we are able to chain
further lookups afterwards. So
``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
SQL::
SELECT ... WHERE ABS("experiments"."change") < 27
Subclasses of ``Transform`` usually only operate on the left-hand side of the
expression. Further lookups will work on the transformed value. Note that in
this case where there is no other lookup specified, Django interprets
``change__abs=27`` as ``change__abs__exact=27``.
When looking for which lookups are allowable after the ``Transform`` has been
applied, Django uses the ``output_type`` attribute. We didn't need to specify
this here as it didn't change, but supposing we were applying ``AbsoluteValue``
to some field which represents a more complex type (for example a point
relative to an origin, or a complex number) then we may have wanted to specify
``output_type = FloatField``, which will ensure that further lookups like
``abs__lte`` behave as they would for a ``FloatField``.
Writing an efficient abs__lt lookup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When using the above written ``abs`` lookup, the SQL produced will not use
indexes efficiently in some cases. In particular, when we use
``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
the following SQL::
SELECT .. WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
The implementation is::
from django.db.models import Lookup
class AbsoluteValueLessThan(Lookup):
lookup_name = 'lt'
def as_sql(self, qn, connection):
lhs, lhs_params = qn.compile(self.lhs.lhs)
rhs, rhs_params = self.process_rhs(qn, connection)
params = lhs_params + rhs_params + lhs_params + rhs_params
return '%s > %s AND %s < -%s % (lhs, rhs, lhs, rhs), params
AbsoluteValue.register_lookup(AbsoluteValueLessThan)
There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
isn't calling ``process_lhs()``. Instead it skips the transformation of the
``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
want to get ``27`` not ``ABS(27)``. Referring directly to ``self.lhs.lhs`` is
safe as ``AbsoluteValueLessThan`` can be accessed only from the
``AbsoluteValue`` lookup, that is the ``lhs`` is always an instance of
``AbsoluteValue``.
Notice also that as both sides are used multiple times in the query the params
need to contain ``lhs_params`` and ``rhs_params`` multiple times.
The final query does the inversion (``27`` to ``-27``) directly in the
database. The reason for doing this is that if the self.rhs is something else
than a plain integer value (for example an ``F()`` reference) we can't do the
transformations in Python.
.. note::
In fact, most lookups with ``__abs`` could be implemented as range queries
like this, and on most database backend it is likely to be more sensible to
do so as you can make use of the indexes. However with PostgreSQL you may
want to add an index on ``abs(change)`` which would allow these queries to
be very efficient.
Writing alternative implemenatations for existing lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sometimes different database vendors require different SQL for the same
operation. For this example we will rewrite a custom implementation for
MySQL for the NotEqual operator. Instead of ``<>`` we will be using ``!=``
operator. (Note that in reality almost all databases support both, including
all the official databases supported by Django).
We can change the behaviour on a specific backend by creating a subclass of
``NotEqual`` with a ``as_mysql`` method::
class MySQLNotEqual(NotEqual):
def as_mysql(self, qn, connection):
lhs, lhs_params = self.process_lhs(qn, connection)
rhs, rhs_params = self.process_rhs(qn, connection)
params = lhs_params + rhs_params
return '%s != %s' % (lhs, rhs), params
Field.register_lookup(MySQLNotExact)
We can then register it with ``Field``. It takes the place of the original
``NotEqual`` class as it has
When compiling a query, Django first looks for ``as_%s % connection.vendor``
methods, and then falls back to ``as_sql``. The vendor names for the in-built
backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.
.. _query-expression:
The Query Expression API
~~~~~~~~~~~~~~~~~~~~~~~~
A lookup can assume that the lhs responds to the query expression API.
Currently direct field references, aggregates and ``Transform`` instances respond
to this API.
.. method:: as_sql(qn, connection)
Responsible for producing the query string and parameters for the
expression. The ``qn`` has a ``compile()`` method that can be used to
compile other expressions. The ``connection`` is the connection used to
execute the query.
Calling expression.as_sql() directly is usually incorrect - instead
qn.compile(expression) should be used. The qn.compile() method will take
care of calling vendor-specific methods of the expression.
.. method:: get_lookup(lookup_name)
The ``get_lookup()`` method is used to fetch lookups. By default the
lookup is fetched from the expression's output type in the same way
described in registering and fetching lookup documentation below.
It is possible to override this method to alter that behaviour.
.. method:: as_vendorname(qn, connection)
Works like ``as_sql()`` method. When an expression is compiled by
``qn.compile()``, Django will first try to call ``as_vendorname()``, where
vendorname is the vendor name of the backend used for executing the query.
The vendorname is one of ``postgresql``, ``oracle``, ``sqlite`` or
``mysql`` for Django's built-in backends.
.. attribute:: output_type
The ``output_type`` attribute is used by the ``get_lookup()`` method to check for
lookups. The output_type should be a field.
Note that this documentation lists only the public methods of the API.
Lookup reference
~~~~~~~~~~~~~~~~
.. class:: Lookup
In addition to the attributes and methods below, lookups also support
``as_sql`` and ``as_vendorname`` from the query expression API.
.. attribute:: lhs
The ``lhs`` (left-hand side) of a lookup tells us what we are comparing the
rhs to. It is an object which implements the query expression API. This is
likely to be a field, an aggregate or a subclass of ``Transform``.
.. attribute:: rhs
The ``rhs`` (right-hand side) of a lookup is the value we are comparing the
left hand side to. It may be a plain value, or something which compiles
into SQL, for example an ``F()`` object or a ``Queryset``.
.. attribute:: lookup_name
This class level attribute is used when registering lookups. It determines
the name used in queries to trigger this lookup. For example, ``contains``
or ``exact``. This should not contain the string ``__``.
.. method:: process_lhs(qn, connection)
This returns a tuple of ``(lhs_string, lhs_params)``. In some cases you may
wish to compile ``lhs`` directly in your ``as_sql`` methods using
``qn.compile(self.lhs)``.
.. method:: process_rhs(qn, connection)
Behaves the same as ``process_lhs`` but acts on the right-hand side.
Transform reference
~~~~~~~~~~~~~~~~~~~
.. class:: Transform
In addition to implementing the query expression API Transforms have the
following methods and attributes.
.. attribute:: lhs
The ``lhs`` (left-hand-side) of a transform contains the value to be
transformed. The ``lhs`` implements the query expression API.
.. attribute:: lookup_name
This class level attribute is used when registering lookups. It determines
the name used in queries to trigger this lookup. For example, ``year``
or ``dayofweek``. This should not contain the string ``__``.
.. _lookup-registration-api:
Registering and fetching lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The lookup registration API is explained below.
.. classmethod:: register_lookup(lookup)
Registers the Lookup or Transform for the class. For example
``DateField.register_lookup(YearExact)`` will register ``YearExact`` for
all ``DateFields`` in the project, but also for fields that are instances
of a subclass of ``DateField`` (for example ``DateTimeField``).
.. method:: get_lookup(lookup_name)
Django uses ``get_lookup(lookup_name)`` to fetch lookups or transforms.
The implementation of ``get_lookup()`` fetches lookups or transforms
registered for the current class based on their lookup_name attribute.
The lookup registration API is available for ``Transform`` and ``Field`` classes.

View file

@ -343,6 +343,13 @@ underscores to spaces. See :ref:`Verbose field names <verbose-field-names>`.
A list of validators to run for this field. See the :doc:`validators
documentation </ref/validators>` for more information.
Registering and fetching lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``Field`` implements the :ref:`lookup registration API <lookup-registration-api>`.
The API can be used to customize which lookups are available for a field class, and
how lookups are fetched from a field.
.. _model-field-types:
Field types

View file

@ -13,3 +13,4 @@ Model API reference. For introductory material, see :doc:`/topics/db/models`.
instances
querysets
queries
custom-lookups

View file

@ -1995,6 +1995,9 @@ specified as keyword arguments to the ``QuerySet`` methods :meth:`filter()`,
For an introduction, see :ref:`models and database queries documentation
<field-lookups-intro>`.
Django's inbuilt lookups are listed below. It is also possible to write
:doc:`custom lookups </ref/models/custom-lookups>` for model fields.
.. fieldlookup:: exact
exact

View file

@ -180,6 +180,27 @@ for the following, instead of backend specific behavior.
finally:
c.close()
Custom lookups
~~~~~~~~~~~~~~
It is now possible to write custom lookups and transforms for the ORM.
Custom lookups work just like Django's inbuilt lookups (e.g. ``lte``,
``icontains``) while transforms are a new concept.
The :class:`django.db.models.Lookup` class provides a way to add lookup
operators for model fields. As an example it is possible to add ``day_lte``
opertor for ``DateFields``.
The :class:`django.db.models.Transform` class allows transformations of
database values prior to the final lookup. For example it is possible to
write a ``year`` transform that extracts year from the field's value.
Transforms allow for chaining. After the ``year`` transform has been added
to ``DateField`` it is possible to filter on the transformed value, for
example ``qs.filter(author__birthdate__year__lte=1981)``.
For more information about both custom lookups and transforms refer to
:doc:`custom lookups </ref/models/custom-lookups>` documentation.
Minor features
~~~~~~~~~~~~~~