mirror of
https://github.com/django/django.git
synced 2025-08-03 10:34:04 +00:00
Fixed #7052 -- Added support for natural keys in serialization.
git-svn-id: http://code.djangoproject.com/svn/django/trunk@11863 bcc190cf-cafb-0310-a4f2-bffc1f526a37
parent 44b9076bbe
commit 35cc439228
20 changed files with 927 additions and 37 deletions
@@ -234,6 +234,17 @@ name to ``dumpdata``, the dumped output will be restricted to that model,
rather than the entire application. You can also mix application names and
model names.

.. django-admin-option:: --natural

.. versionadded:: 1.2

Use :ref:`natural keys <topics-serialization-natural-keys>` to represent
any foreign key and many-to-many relationship with a model that provides
a natural key definition. If you are dumping ``contrib.auth`` ``Permission``
objects or ``contrib.contenttypes`` ``ContentType`` objects, you should
probably be using this flag.


flush
-----

@@ -701,7 +712,7 @@ information.

.. versionadded:: 1.2

Use the ``--failfast`` option to stop running tests and report the failure
immediately after a test fails.

testserver <fixture fixture ...>

@@ -267,3 +267,13 @@ include %}`` tags).
As a side effect, it is now much easier to support non-Django template
languages. For more details, see the :ref:`notes on supporting
non-Django template languages<topic-template-alternate-language>`.

Natural keys in fixtures
------------------------

Fixtures can refer to remote objects using
:ref:`topics-serialization-natural-keys`. This lookup scheme is an
alternative to the normal primary-key based object references in a
fixture. It improves readability and resolves problems when referring to
objects whose primary key value may not be predictable or known.

@@ -154,10 +154,10 @@ to install third-party Python modules:
.. _PyYAML: http://www.pyyaml.org/

Notes for specific serialization formats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

json
^^^^

If you're using UTF-8 (or any other non-ASCII encoding) data with the JSON
serializer, you must pass ``ensure_ascii=False`` as a parameter to the

@@ -191,3 +191,191 @@ them. Something like this will work::

.. _special encoder: http://svn.red-bean.com/bob/simplejson/tags/simplejson-1.7/docs/index.html

.. _topics-serialization-natural-keys:

Natural keys
------------

The default serialization strategy for foreign keys and many-to-many
relations is to serialize the value of the primary key(s) of the
objects in the relation. This strategy works well for most types of
object, but it can cause difficulty in some circumstances.

Consider the case of a list of objects that have a foreign key referencing
:class:`ContentType`. If you're going to serialize an object that
refers to a content type, you need to have a way to refer to that
content type. Content types are automatically created by Django as
part of the database synchronization process, so you don't need to
include content types in a fixture or other serialized data. As a
result, the primary key of any given content type isn't easy to
predict; it will depend on how and when :djadmin:`syncdb` was
executed to create the content types.

There is also the matter of convenience. An integer id isn't always
the most convenient way to refer to an object; sometimes, a
more natural reference would be helpful.
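The unpredictability described above can be illustrated without a database. The following is a minimal plain-Python sketch, not Django code: ``create_content_types`` is a hypothetical helper that hands out ids in creation order, the way an auto-incrementing primary key would on an empty table, and the model labels are made up for the example.

```python
import itertools


def create_content_types(model_labels):
    """Assign 1-based ids in creation order (hypothetical helper, not a Django API)."""
    counter = itertools.count(1)
    return {label: next(counter) for label in model_labels}


# Two installations that created the same content types in a different order.
site_a = create_content_types(["auth.permission", "store.book", "store.person"])
site_b = create_content_types(["store.person", "auth.permission", "store.book"])

# The same content type ends up with a different primary key on each site,
# so a fixture that stores the integer id is only valid for one of them.
book_ids = (site_a["store.book"], site_b["store.book"])
```

A reference such as the label ``"store.book"`` stays valid on both sites, which is the idea that natural keys build on.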

Deserialization of natural keys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is for these reasons that Django provides `natural keys`. A natural
key is a tuple of values that can be used to uniquely identify an
object instance without using the primary key value.

Consider the following two models::

    from django.db import models

    class Person(models.Model):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

    class Book(models.Model):
        name = models.CharField(max_length=100)
        author = models.ForeignKey(Person)

Ordinarily, serialized data for ``Book`` would use an integer to refer to
the author. For example, in JSON, a Book might be serialized as::

    ...
    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": 42
        }
    }
    ...

This isn't a particularly natural way to refer to an author. It
requires that you know the primary key value for the author; it also
requires that this primary key value is stable and predictable.

However, if we add natural key handling to Person, the fixture becomes
much more humane. To add natural key handling, you define a default
Manager for Person with a ``get_by_natural_key()`` method. In the case
of a Person, a good natural key might be the pair of first and last
name::

    from django.db import models

    class PersonManager(models.Manager):
        def get_by_natural_key(self, first_name, last_name):
            return self.get(first_name=first_name, last_name=last_name)

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

Now books can use that natural key to refer to ``Person`` objects::

    ...
    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": ["Douglas", "Adams"]
        }
    }
    ...

When you try to load this serialized data, Django will use the
``get_by_natural_key()`` method to resolve ``["Douglas", "Adams"]``
into the primary key of an actual ``Person`` object.
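The lookup step can be sketched in plain Python. This is an illustration under stated assumptions, not Django's deserializer: ``Person`` and ``PersonManager`` here are in-memory stand-ins, and ``resolve_reference`` is a hypothetical helper that treats a list or tuple as a natural key and anything else as a primary key.

```python
class Person:
    """Plain-Python stand-in for the Person model (not a Django model)."""

    def __init__(self, pk, first_name, last_name):
        self.pk = pk
        self.first_name = first_name
        self.last_name = last_name


class PersonManager:
    """Hypothetical in-memory manager; a real one would query the database."""

    def __init__(self, rows):
        self._rows = list(rows)

    def get_by_natural_key(self, first_name, last_name):
        matches = [
            person for person in self._rows
            if (person.first_name, person.last_name) == (first_name, last_name)
        ]
        if len(matches) != 1:
            # A natural key must identify exactly one object.
            raise LookupError("expected exactly one match, found %d" % len(matches))
        return matches[0]


def resolve_reference(manager, value):
    """Turn a serialized foreign key value into a primary key."""
    if isinstance(value, (list, tuple)):
        # A sequence is treated as a natural key and looked up.
        return manager.get_by_natural_key(*value).pk
    # Anything else is assumed to already be a primary key.
    return value


people = PersonManager([
    Person(42, "Douglas", "Adams"),
    Person(7, "Terry", "Pratchett"),
])
author_pk = resolve_reference(people, ["Douglas", "Adams"])
```

Note that the lookup returns a single object or fails. A natural key that matches several rows, or none, cannot be resolved to a primary key.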

Serialization of natural keys
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So how do you get Django to emit a natural key when serializing an object?
First, you need to add another method -- this time to the model itself::

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        def natural_key(self):
            return (self.first_name, self.last_name)

Then, when you call ``serializers.serialize()``, you provide a
``use_natural_keys=True`` argument::

    >>> serializers.serialize('json', [book1, book2], indent=2, use_natural_keys=True)

When ``use_natural_keys=True`` is specified, Django will use the
``natural_key()`` method to serialize any reference to objects of the
type that defines the method.

If you are using :djadmin:`dumpdata` to generate serialized data, you
use the ``--natural`` command line flag to generate natural keys.
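The choice the serializer makes for each reference can be sketched as follows. This is a simplified plain-Python illustration, not Django's serializer: the classes are stand-ins, and ``serialize_book`` is a hypothetical function that emits the natural key only when it is requested and the related object defines ``natural_key()``.

```python
import json


class Person:
    """Plain-Python stand-in for the Person model (not a Django model)."""

    def __init__(self, pk, first_name, last_name):
        self.pk = pk
        self.first_name = first_name
        self.last_name = last_name

    def natural_key(self):
        return (self.first_name, self.last_name)


class Book:
    """Plain-Python stand-in for the Book model."""

    def __init__(self, pk, name, author):
        self.pk = pk
        self.name = name
        self.author = author


def serialize_book(book, use_natural_keys=False):
    """Build the fixture entry for one book (hypothetical helper)."""
    author = book.author
    if use_natural_keys and hasattr(author, "natural_key"):
        reference = author.natural_key()
    else:
        # Fall back to the primary key, as in the default strategy.
        reference = author.pk
    return {
        "pk": book.pk,
        "model": "store.book",
        "fields": {"name": book.name, "author": reference},
    }


book = Book(1, "Mostly Harmless", Person(42, "Douglas", "Adams"))
with_pk = serialize_book(book)
with_natural_key = json.loads(json.dumps(serialize_book(book, use_natural_keys=True)))
```

The round trip through ``json`` shows why the fixture contains a list: JSON has no tuple type, so the tuple returned by ``natural_key()`` is written as an array.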

.. note::

    You don't need to define both ``natural_key()`` and
    ``get_by_natural_key()``. If you don't want Django to output
    natural keys during serialization, but you want to retain the
    ability to load natural keys, then you can opt to not implement
    the ``natural_key()`` method.

    Conversely, if (for some strange reason) you want Django to output
    natural keys during serialization, but *not* be able to load those
    key values, just don't define the ``get_by_natural_key()`` method.

Dependencies during serialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since natural keys rely on database lookups to resolve references, it
is important that data exists before it is referenced. You can't make
a `forward reference` with natural keys: the data you are referencing
must exist before you include a natural key reference to that data.

To accommodate this limitation, calls to :djadmin:`dumpdata` that use
the :djadminopt:`--natural` option will serialize any model with a
``natural_key()`` method before serializing standard primary key objects.

However, this may not always be enough. If your natural key refers to
another object (by using a foreign key or natural key to another object
as part of a natural key), then you need to be able to ensure that
the objects on which a natural key depends occur in the serialized data
before the natural key requires them.

To control this ordering, you can define dependencies on your
``natural_key()`` methods. You do this by setting a ``dependencies``
attribute on the ``natural_key()`` method itself.

For example, consider the ``Permission`` model in ``contrib.auth``.
The following is a simplified version of the ``Permission`` model::

    class Permission(models.Model):
        name = models.CharField(max_length=50)
        content_type = models.ForeignKey(ContentType)
        codename = models.CharField(max_length=100)
        # ...
        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()

The natural key for a ``Permission`` is a combination of the codename for the
``Permission``, and the ``ContentType`` to which the ``Permission`` applies. This means
that ``ContentType`` must be serialized before ``Permission``. To define this
dependency, we add one extra line::

    class Permission(models.Model):
        # ...
        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()
        natural_key.dependencies = ['contenttypes.contenttype']

This definition ensures that ``ContentType`` models are serialized before
``Permission`` models. In turn, any object referencing ``Permission`` will
be serialized after both ``ContentType`` and ``Permission``.
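One way to picture the resulting order is a depth-first topological sort over the declared dependencies. The sketch below is plain Python and is not the code ``dumpdata`` uses: ``sort_by_dependencies`` is a hypothetical helper, and ``myapp.grant`` is a made-up model that refers to ``Permission``.

```python
def sort_by_dependencies(dependencies):
    """Order model labels so each one comes after everything it depends on.

    ``dependencies`` maps a model label to the labels that must be
    serialized first (hypothetical input format for this sketch).
    """
    ordered = []
    in_progress = set()

    def visit(label):
        if label in ordered:
            return
        if label in in_progress:
            raise ValueError("circular dependency involving %s" % label)
        in_progress.add(label)
        for required in dependencies.get(label, []):
            visit(required)
        in_progress.discard(label)
        ordered.append(label)

    for label in dependencies:
        visit(label)
    return ordered


declared = {
    "myapp.grant": ["auth.permission"],  # made-up model referencing Permission
    "auth.permission": ["contenttypes.contenttype"],
    "contenttypes.contenttype": [],
}
order = sort_by_dependencies(declared)
```

With these declarations, ``ContentType`` comes first, ``Permission`` second, and the model that refers to ``Permission`` last, which matches the ordering described above. A cycle in the declarations cannot be ordered, so the sketch raises an error in that case.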