Commit graph

119 commits

Author SHA1 Message Date
Daniël Heres
c24b0e01db
Implement ASSERT statement (#226)
As supported by PostgreSQL and BigQuery (with some differences between them)
2020-07-16 17:28:03 +02:00
Max Countryman
8cc7702a8c
update branch references to main (#215)
* update branch references to `main`

* ensure we point to ballista-compute

* update a couple of links to point to ballista-compute
2020-07-02 21:31:54 +02:00
mz
0c83e5d9e8
Support SQLite's WITHOUT ROWID in CREATE TABLE (#208)
Per https://sqlite.org/lang_createtable.html

Co-authored-by: mashuai <mashuai@bytedance.com>
2020-06-26 15:11:46 +03:00
Daniël Heres
15d5f71646
Add CREATE TABLE AS support (#206)
We parse it as a regular `CREATE TABLE` statement
followed by an `AS <query>`, which is how BigQuery works:
https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_table_statement


ANSI SQL and PostgreSQL only support a plain list of columns
after the table name in a CTAS
    `CREATE TABLE t (a) AS SELECT a FROM foo`

We currently only allow specifying a full schema with data
types, or omitting it altogether.

https://www.postgresql.org/docs/12/sql-createtableas.html
https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#as-subquery-clause


Finally, when no schema is specified, we print empty parens after a
plain `CREATE TABLE t ();` as required by PostgreSQL, but skip them
in a CTAS: `CREATE TABLE t AS ...`. This affects serialization only,
the parser allows omitting the schema in a regular `CREATE TABLE` too
since the first release of the parser:
7d27abdfb4/src/sqlparser.rs (L325-L332)

Co-authored-by: Nickolay Ponomarev <asqueella@gmail.com>
2020-06-23 16:30:22 +03:00
Jovansonlee Cesar
26361fd854
Implement ALTER TABLE DROP COLUMN (#148)
This implements `DROP [ COLUMN ] [ IF EXISTS ] column_name [ CASCADE ]`
sub-command of `ALTER TABLE`, which is what PostgreSQL supports https://www.postgresql.org/docs/12/sql-altertable.html
(except for the RESTRICT option)

Co-authored-by: Nickolay Ponomarev <asqueella@gmail.com>
2020-06-16 23:39:52 +03:00
mz
faeb7d440a
Implement ALTER TABLE ADD COLUMN and RENAME (#203)
Based on sqlite grammar
https://www.sqlite.org/lang_altertable.html
2020-06-16 22:52:37 +03:00
Daniël Heres
fab6e28271
Output DataType capitalized (#202)
This makes it consistent with other output which also prints keywords capitalized.
2020-06-13 16:18:44 +03:00
Daniël Heres
68afa2a764
Make FileFormat case insensitive (#200) 2020-06-12 18:10:44 +03:00
Max Countryman
6cdd4a146d
Support general "typed string" literals (#187)
Fixes #168 by enabling `DATE` and other keywords to be used as
identifiers when not followed by a string literal.

A "typed string" is our term for generalized version of `DATE '...'`/`TIME '...'`/
`TIMESTAMP '...'` literals, represented as `TypedString { data_type, value }`
in the AST.

Unlike DATE/TIME/TIMESTAMP literals, this is a non-standard extension
supported by PostgreSQL at least.

This is a port of MaterializeInc/materialize#3146

Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
Co-authored-by: Nickolay Ponomarev <asqueella@gmail.com>
2020-06-12 00:04:43 +03:00
Daniël Heres
34548e890b
Change Word::keyword to a enum (#193)
This improves performance and paves the way to future API enhancements as discussed in the PR https://github.com/andygrove/sqlparser-rs/pull/193
2020-06-11 22:00:35 +03:00
Max Countryman
846c52f450
Allow omitting units after INTERVAL (#184)
Alter INTERVAL to support postgres syntax

This patch updates our INTERVAL implementation such that the Postgres
and Redshfit variation of the syntax is supported: namely that 'leading
field' is optional.

Fixes #177.
2020-06-10 09:32:13 +03:00
Daniël Heres
a42121de52
Use binary search to speed up matching keywords (#191) 2020-06-07 20:25:10 +03:00
Daniël Heres
b4699bd4a7
Support bitwise and, or, xor (#181)
Operator precedence is coming from:

https://cloud.google.com/bigquery/docs/reference/standard-sql/operators
2020-06-03 19:02:05 +03:00
Daniël Heres
00dc490f72
Support the string concat operator (#178)
The selected precedence is based on BigQuery documentation, where it is equal to `*` and `/`:

https://cloud.google.com/bigquery/docs/reference/standard-sql/operators
2020-06-02 21:24:30 +03:00
Max Countryman
5f3c1bda01
Provide LISTAGG implementation (#174)
This patch provides an initial implemenation of LISTAGG[1]. Notably this
implemenation deviates from ANSI SQL by allowing both WITHIN GROUP and
the delimiter to be optional. We do so because Redshift SQL works this
way and this approach is ultimately more flexible.

Fixes #169.

[1] https://modern-sql.com/feature/listagg
2020-05-30 18:50:17 +03:00
QP Hou
418b9631ce
add nulls first/last support to order by expression (#176)
Following `<sort specification list>` from the standard https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#_10_10_sort_specification_list
2020-05-30 17:05:15 +03:00
Alex Dukhno
91f769e460 added create and drop schema 2020-05-28 19:50:16 +03:00
Christoph Müller
98f97d09db
Add support for "on delete cascade" column option (#170)
Specifically, `FOREIGN KEY REFERENCES <foreign_table> (<referred_columns>)`
can now be followed by `ON DELETE <referential_action>` and/or by
`ON UPDATE <referential_action>`.
2020-05-27 18:24:23 +03:00
mashuai
5aacc5ebcd add create index and drop index support 2020-05-27 09:27:57 +08:00
Nickolay Ponomarev
327e6cd9f1 Report an error for unterminated string literals
...updated the TODOs regarding single-quoted literals parsing while at it.
2020-05-10 21:21:01 +03:00
Alex Dukhno
5ad578e3e5
Implement CREATE TABLE IF NOT EXISTS (#163)
A non-standard feature supported at least by Postgres

https://www.postgresql.org/docs/12/sql-createtable.html
2020-04-21 16:28:02 +03:00
Matt Jibson
c0b0b5924d Add support for OFFSET with the ROWS keyword
MySQL doesn't support the ROWS part of OFFSET. Teach the parser to
remember which variant it saw, including just ROW.
2020-04-19 20:06:08 -06:00
Eyal Leshem
3255fd3ea8 Add support to to table_name inside parenthesis 2020-04-12 20:31:09 +03:00
Robert Grimm
b1cbc55128
Turn type Ident into struct Ident
The Ident type was previously an alias for a String. Turn it into a full
fledged struct, so that the parser can preserve the distinction between
identifier value and quote style already made by the tokenizer's Word
structure.
2019-10-20 00:16:41 -04:00
Andy Grove
a2613f9dd1 format 2019-10-17 20:41:49 -06:00
gaffneyk
2bb38c9b27
Parse START TRANSACTION when followed by a semicolon
Co-authored-by: Nikhil Benesch <nikhil.benesch@gmail.com>
2019-09-13 22:54:21 -04:00
Nikhil Benesch
a0aca824e8
Optionally parse numbers into BigDecimals
With `--features bigdecimal`, parse numbers into BigDecimals instead of
leaving them as strings.
2019-09-01 13:21:49 -04:00
Nikhil Benesch
b5621c0fe8
Don't lose precision when parsing decimal fractions
The SQL standard requires that numeric literals with a decimal point,
like 1.23, are represented exactly, up to some precision. That means
that parsing these literals into f64s is invalid, as it is impossible
to represent many decimal numbers exactly in binary floating point (for
example, 0.3).

This commit parses all numeric literals into a new `Value` variant
`Number(String)`, removing the old `Long(u64)` and `Double(f64)`
variants. This is slightly less convenient for downstream consumers, but
far more flexible, as numbers that do not fit into a u64 and f64 are now
representable.
2019-09-01 13:21:30 -04:00
Nikhil Benesch
106c9f8efb
Remove "SQL" prefix from "SQLDateTimeField" struct
I realized a moment too late that I'd missed a type name in
when removing the "SQL" prefix from types in ac555d7e8. As far as I can
tell, this was the only oversight.
2019-06-25 13:24:31 -04:00
Nikhil Benesch
ac555d7e86
Remove "SQL" prefix from types
The rationale here is the same as the last commit: since this crate
exclusively parses SQL, there's no need to restate that in every type
name. (The prefix seems to be an artifact of this crate's history as a
submodule of Datafusion, where it was useful to explicitly call out
which types were related to SQL parsing.)

This commit has the additional benefit of making all type names
consistent; over type we'd added some types which were not prefixed with
"SQL".
2019-06-25 13:11:11 -04:00
Nikhil Benesch
cf655ad1a6
Remove "sql" prefix from module names
Since this crate only deals with SQL parsing, the modules are understood
to refer to SQL and don't need to restate that explicitly.
2019-06-24 12:56:26 -04:00
Nikhil Benesch
5b23ad1d4c
Merge pull request #119 from andygrove/astnode-expr
Rename ASTNode to Expr
2019-06-19 21:45:49 -04:00
Nikhil Benesch
646d1e13ca
Rename ASTNode to Expr
The ASTNode enum was confusingly named. In the past, the name made
sense, as the enum contained nearly all of the nodes in the AST, but
over time, pieces have been split into different structs, like
SQLStatement and SQLQuery. The ASTNode enum now contains only contains
expression nodes, so Expr is a better name.

Also rename the UnnamedExpression and ExpressionWithAlias variants
of SQLSelectItem to UnnamedExpr and ExprWithAlias, respectively, to
match the new shorthand for the word "expression".
2019-06-19 00:00:59 -04:00
Nickolay Ponomarev
4294581ded [mssql] Parse CROSS/OUTER APPLY
T-SQL (and Oracle) support non-standard syntax, which is similar in
functionality to LATERAL joins in ANSI and PostgreSQL
<https://blog.jooq.org/tag/lateral-derived-table/>: it allows to use
the columns from the tables defined to the left of `APPLY` in the
"derived tables" (subqueries) to the right of `APPLY`. Unlike ANSI
LATERAL (but like Postgres' implementation), APPLY is also used with
table-valued function calls.

Despite them being similar, we represent "APPLY" joins with
`JoinOperator`s of its own (`CrossApply` and `OuterApply`). Doing
otherwise seemed like it would cause unnecessary confusion, as those
interested in dialect-specific parsing would probably not expect APPLY
being parsed as LATERAL, and those wanting to forbid non-standard SQL
would not be helped by this either.

This also renames existing JoinOperator::Cross -> CrossJoin to avoid
confusion with CrossApply.
2019-06-19 02:45:47 +03:00
Nikhil Benesch
2c99635709
Don't silently accept naked OUTER JOINS
`SELECT * FROM a OUTER JOIN b` was previously being parsed as an inner
join where table `a` was aliased to `OUTER`. This is extremely
surprising, as the user likely intended to say FULL OUTER JOIN. Since
the SQL specification lists OUTER as a keyword, we are well within our
rights to return an error here.
2019-06-18 12:03:15 -04:00
Nickolay Ponomarev
eb3450dd51 Support HAVING without GROUP BY
...which is weird but allowed:
https://jakewheat.github.io/sql-overview/sql-2011-foundation-grammar.html#table-expression
https://dba.stackexchange.com/a/57453/15599

Also add a test for GROUP BY .. HAVING
2019-06-17 01:06:32 +03:00
Nickolay Ponomarev
d60bdc0b92 Allow LIMIT/OFFSET/FETCH without FROM
Postgres allows it, as does ANSI SQL per the <query expression> definition:
https://jakewheat.github.io/sql-overview/sql-2011-foundation-grammar.html#_7_13_query_expression
2019-06-17 00:54:37 +03:00
Nickolay Ponomarev
c1509b36ec Use FETCH_FIRST_TWO_ROWS_ONLY in tests to reduce duplication 2019-06-17 00:49:25 +03:00
Nickolay Ponomarev
f87e8d5158 Don't duplicate all the parse_simple_select assertions in the LIMIT test 2019-06-17 00:39:00 +03:00
Nickolay Ponomarev
3c073a4c34 Use TableAlias in Cte 2019-06-16 21:18:57 +03:00
Nickolay Ponomarev
dc26c4abd5
Merge pull request #115 from nickolay/pr/followups
Doc improvements and follow-ups to the recent PRs
2019-06-15 01:57:52 +03:00
Nickolay Ponomarev
535505bb96
Update the error message in parse_query_body 2019-06-14 16:28:53 -04:00
Nikhil Benesch
4ee461bae4
Require that nested joins always have one join
The SQL specification prohibits constructions like

    SELECT * FROM a NATURAL JOIN (b)

where b sits alone inside parentheses. Parentheses in a FROM entry
always introduce either a derived table or a join.
2019-06-14 16:28:52 -04:00
Nikhil Benesch
8bee74277a
Handle derived tables with set operations
This commit adds support for derived tables (i.e., subqueries) that
incorporate set operations, like:

    SELECT * FROM (((SELECT 1) UNION (SELECT 2)) t1 AS NATURAL JOIN t2)

This introduces a bit of complexity around determining whether a left
paren starts a subquery, starts a nested join, or belongs to an
already-started subquery. The details are explained in a comment within
the patch.
2019-06-14 16:28:52 -04:00
Nickolay Ponomarev
5c7ff79e78 Add a test for parsing the NULL literal
(Coveralls notices we didn't have one.)
2019-06-13 11:17:36 +03:00
Nickolay Ponomarev
45c9aa1cc2 Use self.expected() more 2019-06-13 11:15:10 +03:00
Nickolay Ponomarev
32cf36e64f Add a testcase, which passes thanks to PR #109 2019-06-12 21:04:31 +03:00
Nikhil Benesch
ae25dce246
Split operators by arity
It is useful downstream to have two separate enums, one for unary
operators and one for binary operators, so that the compiler can check
exhaustiveness. Otherwise downstream consumers need to manually encode
which operators are unary and which operators are binary when matching
on an Operator enum.
2019-06-10 23:03:11 -04:00
Nikhil Benesch
9e33cea9b8
Standardize BinaryOp and UnaryOp nodes
These were previously called "BinaryExpr" and "Unary"; besides being
inconsistent, it's also not correct to say "binary expression" or "unary
expression", as it's the operators that have arities, not the
expression. Adjust the naming of the variants accordingly.
2019-06-10 23:02:17 -04:00
Andy Grove
b379480b7a
Merge pull request #79 from benesch/license
Standardize license headers
2019-06-10 19:39:12 -06:00