Fixed #33788 -- Added TrigramStrictWordSimilarity() and TrigramStrictWordDistance() on PostgreSQL.

Matt Brewer 2022-06-17 08:44:03 +01:00 committed by Mariusz Felisiak
parent 3ef37a5245
commit 8d160f154f
8 changed files with 130 additions and 5 deletions

@ -7,6 +7,9 @@ Trigram similarity
.. fieldlookup:: trigram_similar
``trigram_similar``
-------------------
The ``trigram_similar`` lookup allows you to perform trigram lookups,
measuring the number of trigrams (three consecutive characters) shared, using a
dedicated PostgreSQL extension. A trigram lookup is given an expression and
@ -27,6 +30,9 @@ The ``trigram_similar`` lookup can be used on
.. fieldlookup:: trigram_word_similar
``trigram_word_similar``
------------------------
The ``trigram_word_similar`` lookup allows you to perform trigram word
similarity lookups using a dedicated PostgreSQL extension. It can be
approximately understood as measuring the greatest number of trigrams shared
@ -46,6 +52,25 @@ The ``trigram_word_similar`` lookup can be used on
>>> Sentence.objects.filter(name__trigram_word_similar='Middlesborough')
['<Sentence: Gumby rides on the path of Middlesbrough>']
.. fieldlookup:: trigram_strict_word_similar
``trigram_strict_word_similar``
-------------------------------
.. versionadded:: 4.2
Similar to :lookup:`trigram_word_similar`, except that it forces extent
boundaries to match word boundaries.
To use it, add ``'django.contrib.postgres'`` to your :setting:`INSTALLED_APPS`
and activate the `pg_trgm extension`_ on PostgreSQL. You can install the
extension using the
:class:`~django.contrib.postgres.operations.TrigramExtension` migration
operation.
The ``trigram_strict_word_similar`` lookup can be used on
:class:`~django.db.models.CharField` and :class:`~django.db.models.TextField`.
.. _`pg_trgm extension`: https://www.postgresql.org/docs/current/pgtrgm.html
``Unaccent``

@ -286,9 +286,9 @@ Trigram similarity
==================
Another approach to searching is trigram similarity. A trigram is a group of
three consecutive characters. In addition to the :lookup:`trigram_similar` and
:lookup:`trigram_word_similar` lookups, you can use a couple of other
expressions.
three consecutive characters. In addition to the :lookup:`trigram_similar`,
:lookup:`trigram_word_similar`, and :lookup:`trigram_strict_word_similar`
lookups, you can use a couple of other expressions.
To use them, you need to activate the `pg_trgm extension
<https://www.postgresql.org/docs/current/pgtrgm.html>`_ on PostgreSQL. You can
@ -334,6 +334,18 @@ Usage example::
... ).filter(similarity__gt=0.3).order_by('-similarity')
[<Author: Katy Stevens>]
``TrigramStrictWordSimilarity``
-------------------------------
.. class:: TrigramStrictWordSimilarity(string, expression, **extra)
.. versionadded:: 4.2
Accepts a string or expression, and a field name or expression. Returns the
trigram strict word similarity between the two arguments. Similar to
:class:`TrigramWordSimilarity() <TrigramWordSimilarity>`, except that it forces
extent boundaries to match word boundaries.
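To make "forces extent boundaries to match word boundaries" concrete, here is a rough pure-Python sketch of the behavior the pg_trgm documentation describes: the greatest similarity between the first string and any continuous extent of whole words in the second. The helper names (``trigrams``, ``similarity``, ``strict_word_similarity``) are hypothetical and are not part of Django or PostgreSQL. The real computation happens inside the extension and may differ in edge cases such as locale handling.

```python
import re

# Assumption: words are runs of alphanumeric characters, compared in lowercase.
WORD_RE = re.compile(r"[^\W_]+")


def trigrams(text):
    """Return the set of trigrams in text.

    Each word is treated as having two spaces prefixed and one space
    suffixed, as described in the pg_trgm documentation.
    """
    grams = set()
    for word in WORD_RE.findall(text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by the total number of distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def strict_word_similarity(string, text):
    """Best similarity between string and any run of whole words in text."""
    words = WORD_RE.findall(text.lower())
    best = 0.0
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            extent = " ".join(words[start:end])
            best = max(best, similarity(string, extent))
    return best


# Matches the example in the pg_trgm documentation.
print(round(strict_word_similarity("word", "two words"), 6))  # 0.571429
```

Here the best extent is the single word ``words``, which shares 4 of 7 distinct trigrams with ``word``. Extents are never allowed to start or end in the middle of a word.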
``TrigramDistance``
-------------------
@ -371,3 +383,13 @@ Usage example::
... distance=TrigramWordDistance(test, 'name'),
... ).filter(distance__lte=0.7).order_by('distance')
[<Author: Katy Stevens>]
``TrigramStrictWordDistance``
-----------------------------
.. class:: TrigramStrictWordDistance(string, expression, **extra)
.. versionadded:: 4.2
Accepts a string or expression, and a field name or expression. Returns the
trigram strict word distance between the two arguments.
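According to the pg_trgm documentation, the strict word distance is one minus the strict word similarity. The following is a minimal sketch of that relationship in plain Python, not the extension's actual implementation. The names ``_trigrams`` and ``strict_word_distance`` are hypothetical, and the word-splitting rule (lowercased alphanumeric runs) is an assumption.

```python
import re

# Assumption: words are lowercased runs of alphanumeric characters.
WORD_RE = re.compile(r"[^\W_]+")


def _trigrams(word):
    """Trigrams of one word, padded with two leading spaces and one trailing."""
    padded = f"  {word} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def strict_word_distance(string, text):
    """One minus the best similarity between string and any run of whole words."""
    target = set().union(*(_trigrams(w) for w in WORD_RE.findall(string.lower())))
    words = WORD_RE.findall(text.lower())
    best = 0.0
    for start in range(len(words)):
        extent = set()
        # Grow the extent one whole word at a time, so it always ends on a
        # word boundary.
        for word in words[start:]:
            extent |= _trigrams(word)
            best = max(best, len(target & extent) / len(target | extent))
    return 1.0 - best


# Smaller distances mean closer matches, so sorting ascending puts the best
# candidate first, much like order_by('distance') in a queryset.
candidates = ["two words", "no match here", "word"]
print(sorted(candidates, key=lambda c: strict_word_distance("word", c)))
# ['word', 'two words', 'no match here']
```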