Commit graph

290 commits

Author SHA1 Message Date
Nickolay Ponomarev
67cc880fd1 Add comments to the test files 2019-05-04 02:43:00 +03:00
Nickolay Ponomarev
304710d59a Add MSSQL dialect and fix up the postgres' identifier rules
The `@@version` test is MS' dialect of SQL, it seems, so test it with
its own dialect.

Update the rules for identifiers in the PostgreSQL dialect per the
documentation, while we're at it. The current identifier rules in the
PostgreSQL dialect were introduced in this commit - as a copy of the
generic rules, it seems:
810cd8e6cf (diff-2808df0fba0aed85f9d35c167bd6a5f1L138)
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
5047f2c02e Remove the ansi-specific test file and update PG tests
- The ANSI dialect is now tested in `sqlparser_common.rs`
- Some PG testcases are also parsed by the generic dialect successfully,
  so test that.
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
1347ca0825 Move the rest of tests not specific to PG from the sqlparser_postgres.rs 2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
478dbe940d Factor test helpers into a common module
Also run "generic" tests with all dialects (`parse_select_version`
doesn't work with ANSI dialect, so I moved it to the postgres file
temporarily)
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
de177f107c Remove dead datetime-related code
1) Removed unused date/time parsing methods from `Parser`

I don't see how the token-based parsing code would ever be used: the
date/time literals are usually quoted, like `DATE 'yyyy-mm-dd'` or
simply `'YYYYMMDD'`, so the date will be a single token.

2) Removed unused date/time related variants from `Value` and the
dependency on `chrono`.

We don't support parsing date/time literals at the moment and when we
do I think we should store the exact String to let the consumer parse
it as they see fit.

3) Removed `parse_timestamps_example` and
`parse_timestamps_with_millis_example` tests. They parsed as
`number(2016) minus number(02) minus number(15) <END OF EXPRESSION>`
(leaving the time part unparsed) as it makes no sense to try parsing
a yyyy-mm-dd value as an SQL expression.
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
9297ffbe18 Move tests using standard SQL from the postgresql-specific file 2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
d1b088bd43 Switch remaining tests to the standard format 2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
0233604f9b Remove extraneous tests
`parse_example_value` parses as compound identifier, which makes no
sense ("SARAH"."LEWISE@sakilacustomer"."org")

`parse_function_now` is unnecessary since we already test the parsing
of functions in `parse_scalar_function_in_projection`
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
fe635350f0 Improve INSERT tests
De-duplicate and check for specific error in `parse_insert_invalid`.
2019-05-04 01:00:13 +03:00
Nickolay Ponomarev
098d1c4a17 Enable clippy lints by default in RLS 2019-04-21 04:46:19 +03:00
Nickolay Ponomarev
dee30aabe0 Fix the clippy assert!(false) lint
https://rust-lang.github.io/rust-clippy/master/index.html#assertions_on_constants

While I don't feel it's valid, fixing it lets us act on the other, more
useful, lints.
2019-04-21 04:46:19 +03:00
Nickolay Ponomarev
c223eaf0aa Fix a bunch of trivial clippy lints 2019-04-21 04:46:19 +03:00
Nickolay Ponomarev
b12a19e197 Switch to the Rust 2018 edition
This requires Rust 1.31 (from last year) to build, but is otherwise
compatible with the 2015-edition code.
2019-04-21 04:41:11 +03:00
Zhiyuan Zheng
d8f824c400 merge CreateExternalTable & CreateTable. 2019-04-14 01:05:26 +08:00
Nickolay Ponomarev
c5bbfc33fd Don't Box<ASTNode> in SQLStatement
This used to be needed when it was a variant in the ASTNode enum itself.
2019-02-07 05:33:46 +03:00
Nickolay Ponomarev
39e98cb11a Rename parse_tablename -> parse_object_name (4.2/4.4)
...to match the name of the recently introduced `SQLObjectName` struct
and to avoid any reservations about using it with multi-part names of
objects other than tables (as in the `type_name` case).
2019-02-07 05:31:44 +03:00
Nickolay Ponomarev
523f086be7 Introduce SQLObjectName struct (4.1/4.4)
(To store "A name of a table, view, custom type, etc., possibly
multi-part, i.e. db.schema.obj".)

Before this change

  - some places used `String` for this (these are updated in this commit)

  - while others (notably SQLStatement::SQLDelete::relation, which is
    the reason for this series of commits) relied on
    ASTNode::SQLCompoundIdentifier (which is also backed by a 
    Vec<SQLIdent>, but, as a variant of ASTNode enum, is not convenient
    to use when you know you need that specific variant).
2019-02-07 05:31:40 +03:00
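
A minimal sketch of the idea behind a dedicated multi-part name type, assuming `SQLIdent` is a plain `String` alias. The `parse_object_name` function here is a toy that splits on dots; it is not the parser method from the commit.

```rust
use std::fmt;

// Assumption: identifiers are stored as plain strings.
type SQLIdent = String;

/// A possibly multi-part name such as `db.schema.obj`.
#[derive(Debug, Clone, PartialEq)]
struct SQLObjectName(Vec<SQLIdent>);

impl fmt::Display for SQLObjectName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Serialize the parts back into dotted form.
        write!(f, "{}", self.0.join("."))
    }
}

/// Toy stand-in for a parser method: split on dots and reject empty parts.
fn parse_object_name(input: &str) -> Result<SQLObjectName, String> {
    let parts: Vec<SQLIdent> = input.split('.').map(str::to_string).collect();
    if parts.iter().any(|part| part.is_empty()) {
        return Err(format!("Expected identifier in: {}", input));
    }
    Ok(SQLObjectName(parts))
}
```

Callers that need a table or type name get a `SQLObjectName` directly, so they do not have to match on an expression enum to find the one variant they expect.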
Nickolay Ponomarev
b57c60a78c Only use parse_expr() when we expect an expression (0/4)
Before this commit there was a single `parse_expr(u8)` method, which
was called both

1) from within the expression parser (to parse a subexpression consisting
   of operators with higher priority than the current one), and

2) from the top-down parser both

   a) to parse true expressions (such as an item of the SELECT list or
      the condition after WHERE or after ON), and 
   b) to parse sequences which are not exactly "expressions".


This starts cleaning this up by renaming the `parse_expr(u8)` method to
`parse_subexpr()` and using it only for (1) - i.e. usually providing a
non-zero precedence parameter.

The non-intuitively called `parse()` method is renamed to `parse_expr()`,
which became available and is used for (2a).


While reviewing the existing callers of `parse_expr`, four points to
follow up on were identified (marked "TBD (#)" in the commit):

1) Do not lose parens (e.g. `(1+2)*3`) when roundtripping
   String->AST->String by using SQLNested.
2) Incorrect precedence of the NOT unary
3) `parse_table_factor` accepts any expression where a SELECT subquery
   is expected.
4) parse_delete uses parse_expr() to retrieve a table name

These are dealt with in the commits to follow.
2019-02-07 05:24:54 +03:00
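
The split between `parse_expr()` and `parse_subexpr()` described in the commit above can be sketched with a toy precedence parser. The `Tok` and `Expr` types, the precedence values and the `peek`/`next_token` helpers are hypothetical and only cover integers, `+` and `*`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Num(i64),
    Plus,
    Star,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary(Box<Expr>, Tok, Box<Expr>),
}

struct Parser {
    tokens: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn new(tokens: Vec<Tok>) -> Self {
        Parser { tokens, pos: 0 }
    }

    /// Entry point for the top-down parser: parse one complete expression.
    fn parse_expr(&mut self) -> Result<Expr, String> {
        self.parse_subexpr(0)
    }

    /// Used inside the expression parser: consume only operators that bind
    /// more tightly than `precedence`.
    fn parse_subexpr(&mut self, precedence: u8) -> Result<Expr, String> {
        let mut left = match self.next_token() {
            Some(Tok::Num(n)) => Expr::Num(n),
            other => return Err(format!("Expected a number, found {:?}", other)),
        };
        loop {
            // Assumed precedences: `*` binds tighter than `+`; 0 means "no operator".
            let next_precedence = match self.peek() {
                Some(Tok::Plus) => 10,
                Some(Tok::Star) => 20,
                _ => 0,
            };
            if next_precedence <= precedence {
                break;
            }
            let op = self.next_token().expect("peeked an operator");
            let right = self.parse_subexpr(next_precedence)?;
            left = Expr::Binary(Box::new(left), op, Box::new(right));
        }
        Ok(left)
    }

    fn peek(&self) -> Option<&Tok> {
        self.tokens.get(self.pos)
    }

    fn next_token(&mut self) -> Option<Tok> {
        let token = self.tokens.get(self.pos).cloned();
        if token.is_some() {
            self.pos += 1;
        }
        token
    }
}
```

With this shape, code outside the expression parser never has to pick a precedence: it calls `parse_expr()`, and only the recursive calls pass a non-zero value.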
Nickolay Ponomarev
707c58ad57 Support parsing of multiple statements (5/5)
Parser::parse_sql() can now parse a semicolon-separated list of
statements, returning them in a Vec<SQLStatement>.

To support this we:

  - Move handling of inter-statement tokens from the end of individual
    statement parsers (`parse_select` and `parse_delete`; this was not
    implemented for other top-level statements) to the common
    statement-list parsing code (`parse_sql`);

  - Change the "Unexpected token at end of ..." error, which didn't have
    tests and prevented us from parsing successive statements  ->
    "Expected end of statement" (i.e. a delimiter - currently only ";" -
    or the EOF);

  - Add PartialEq on ParserError to be able to assert_eq!() that parsing
    statements that do not terminate properly returns an expected error.
2019-02-07 05:24:54 +03:00
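
A rough sketch of the statement-list loop described in the commit above, assuming a toy grammar in which every statement is a verb followed by one argument. The `Statement` type, the whitespace tokenizer and the exact error text are illustrative assumptions, not the crate's actual definitions.

```rust
/// Hypothetical error type; deriving PartialEq lets tests compare errors with assert_eq!.
#[derive(Debug, PartialEq)]
struct ParserError(String);

/// Toy statement: a verb and a single argument, e.g. `SELECT a`.
#[derive(Debug, PartialEq)]
struct Statement {
    verb: String,
    arg: String,
}

fn parse_sql(input: &str) -> Result<Vec<Statement>, ParserError> {
    // Toy tokenizer: make ";" its own token, then split on whitespace.
    let padded = input.replace(';', " ; ");
    let tokens: Vec<&str> = padded.split_whitespace().collect();

    let mut pos = 0;
    let mut statements = Vec::new();
    let mut expecting_delimiter = false;
    while pos < tokens.len() {
        // Inter-statement tokens are handled here, not in each statement parser.
        if tokens[pos] == ";" {
            pos += 1;
            expecting_delimiter = false;
            continue;
        }
        if expecting_delimiter {
            return Err(ParserError(format!(
                "Expected end of statement, found: {}",
                tokens[pos]
            )));
        }
        statements.push(parse_statement(&tokens, &mut pos)?);
        expecting_delimiter = true;
    }
    Ok(statements)
}

fn parse_statement(tokens: &[&str], pos: &mut usize) -> Result<Statement, ParserError> {
    let verb = tokens[*pos];
    let arg = tokens
        .get(*pos + 1)
        .filter(|token| **token != ";")
        .ok_or_else(|| ParserError(format!("Expected an argument after {}", verb)))?;
    *pos += 2;
    Ok(Statement {
        verb: verb.to_string(),
        arg: arg.to_string(),
    })
}
```

Because the individual statement parser stops as soon as its statement is complete, the loop is the only place that decides whether what follows is a delimiter, the end of input, or an error.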
Nickolay Ponomarev
2dec65fdb4 Separate statement from expr parsing (4/5)
Continuing from https://github.com/andygrove/sqlparser-rs/pull/33#issuecomment-453060427

This stops the parser from accepting (and the AST from being able to
represent) SQL look-alike code that makes no sense, e.g.

    SELECT ... FROM (CREATE TABLE ...) foo
    SELECT ... FROM (1+CAST(...)) foo

Generally this makes the AST less "partially typed": meaning certain
parts are strongly typed (e.g. SELECT can only contain projections,
relations, etc.), while everything that didn't get its own type is
dumped into ASTNode, effectively untyped. After a few more fixes (yet
to be implemented), `ASTNode` could become an `SQLExpression`. The
Pratt-style expression parser (returning an SQLExpression) would be
invoked from the top-down parser in places where a generic expression
is expected (e.g. after SELECT <...>, WHERE <...>, etc.), while things
like select's `projection` and `relation` could be more appropriately
(narrowly) typed.


Since the diff is quite large due to the necessarily large number of
mechanical changes, here's an overview:

1) Interface changes:

   - A new AST enum - `SQLStatement` - is split out of ASTNode:

     - The variants of the ASTNode enum, which _only_ make sense as a top
       level statement (INSERT, UPDATE, DELETE, CREATE, ALTER, COPY) are
       _moved_ to the new enum, with no other changes.
     - SQLSelect is _duplicated_: now available both as a variant in
       SQLStatement::SQLSelect (top-level SELECT) and ASTNode:: (subquery).

   - The main entry point (Parser::parse_sql) now expects an SQL statement
     as input, and returns an `SQLStatement`.

2) Parser changes: instead of detecting the top-level constructs deep
down in the precedence parser (`parse_prefix`) we are able to do it
just right after setting up the parser in the `parse_sql` entry point

(SELECT, again, is kept in the expression parser to demonstrate how
subqueries could be implemented).

The rest of parser changes are mechanical ASTNode -> SQLStatement
replacements resulting from the AST change.

3) Testing changes: for every test - depending on whether the input was
a complete statement or an expression - I used an appropriate helper
function:

   - `verified` (parses SQL, checks that it round-trips, and returns
     the AST) - was replaced by `verified_stmt` or `verified_expr`.

   - `parse_sql` (which returned AST without checking it round-tripped)
     was replaced by:

     - `parse_sql_expr` (same function, for expressions)

     - `one_statement_parses_to` (formerly `parses_to`), extended to
       deal with statements that are not expected to round-trip.
       The weird name is to reduce further churn when implementing
       multi-statement parsing.

     - `verified_stmt` (in 4 testcases that actually round-tripped)
2019-01-31 15:54:57 +03:00
Nickolay Ponomarev
7bbf69f513 Further simplify parse_compound_identifier (5/8)
This part changes behavior:
- Fail when no identifier is found.
- Avoid rewinding if EOF was hit right after the identifier.
2019-01-31 03:57:17 +03:00
Nickolay Ponomarev
9a8b6a8e64 Rework keyword/identifier parsing (1/8)
Fold Token::{Keyword, Identifier, DoubleQuotedString} into one
Token::SQLWord, which has the necessary information (was it a
known keyword and/or was it quoted).

This lets the parser easily accept DoubleQuotedString (a quoted
identifier) everywhere it expects an Identifier in the same match
arm. (To complete support of quoted identifiers, or "delimited
identifiers" as the spec calls them, a TODO in parse_tablename()
ought to be addressed.)

    As an aside, per <https://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers>
    sqlite seems to be the only one supporting 'identifier'
    (which is rather hairy, since it can also be a string
    literal), and `identifier` seems only to be supported by
    MySQL. I didn't implement either one.

This also allows the use of `parse`/`expect_keyword` machinery
for non-reserved keywords: previously they relied on the keyword
being a Token::Keyword, which wasn't a Token::Identifier, and so
wasn't accepted as one.

Now whether a keyword can be used as an identifier can be decided
by the parser. (I didn't add a blacklist of "reserved" keywords,
so that any keyword which doesn't have a special meaning in the
parser could be used as an identifier. The list of keywords in
the dialect could be re-used for that purpose at a later stage.)
2019-01-31 03:57:16 +03:00
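
One way to picture the folded token described in the commit above is the sketch below. The field names, the keyword list and the `as_identifier` helper are assumptions made for illustration; the actual `Token::SQLWord` may be shaped differently.

```rust
/// Hypothetical, simplified token type with a single "word" variant.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    /// A keyword, a plain identifier, or a double-quoted identifier.
    Word {
        value: String,
        quoted: bool,
        is_keyword: bool,
    },
    Other(char),
}

// Assumed keyword list; a dialect would supply the real one.
const KEYWORDS: &[&str] = &["SELECT", "FROM", "WHERE"];

fn make_word(text: &str, quoted: bool) -> Token {
    // A quoted word is never treated as a keyword.
    let is_keyword = !quoted && KEYWORDS.iter().any(|k| k.eq_ignore_ascii_case(text));
    Token::Word {
        value: text.to_string(),
        quoted,
        is_keyword,
    }
}

/// The parser, not the tokenizer, decides whether a word may serve as an identifier.
/// `reserved` is a hypothetical list of words that cannot be used unquoted.
fn as_identifier(token: &Token, reserved: &[&str]) -> Option<String> {
    match token {
        Token::Word { value, quoted, .. } => {
            let is_reserved = reserved.iter().any(|r| r.eq_ignore_ascii_case(value));
            if *quoted || !is_reserved {
                Some(value.clone())
            } else {
                None
            }
        }
        Token::Other(_) => None,
    }
}
```

Passing an empty `reserved` list corresponds to the behaviour the commit describes, where any keyword without a special meaning at that point can be used as an identifier.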
Nickolay Ponomarev
70c799e21d Use verified() in the remaining PG-specific tests 2019-01-20 19:30:13 +03:00
Nickolay Ponomarev
9441f9c5d8 Move tests for "LIKE '%'" to sqlparser_generic.rs
...as this syntax is not specific to the PostgreSQL dialect.

Also use verified() to assert that parsing + serializing results in the
original SQL string.
2019-01-20 19:30:12 +03:00
Nickolay Ponomarev
d5109a2880 Remove duplicate tests from sqlparser_postgres.rs
These have identical copies in sqlparser_generic.rs
2019-01-20 19:30:12 +03:00
Nickolay Ponomarev
a1da7b4005 Reduce differences between "generic" and "postgresql" tests
Mainly by replacing `assert_eq!(sql, ast.to_string())` with a call to
the recently introduced `verified()` helper or using `parses_to()` where
the expected serialization differs from the original SQL string.

There was one case (parse_implicit_join), where the inputs were different:
let sql = "SELECT * FROM t1,t2";
//vs
let sql = "SELECT * FROM t1, t2";

and since we don't test the whitespace handling in other tests, I just
used the canonical representation as input.
2019-01-20 19:14:53 +03:00
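
The round-trip helpers mentioned in the commit above could look roughly like this. The `Select` type and the `parse` function are toy stand-ins that only understand `SELECT * FROM <tables>`; only the names `verified` and `parses_to` come from the commit message.

```rust
use std::fmt;

/// Toy AST: just the list of tables in the FROM clause.
#[derive(Debug, PartialEq)]
struct Select {
    tables: Vec<String>,
}

impl fmt::Display for Select {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Canonical serialization: tables separated by ", ".
        write!(f, "SELECT * FROM {}", self.tables.join(", "))
    }
}

/// Toy parser standing in for the real one.
fn parse(sql: &str) -> Result<Select, String> {
    let rest = sql
        .strip_prefix("SELECT * FROM ")
        .ok_or_else(|| format!("Unsupported SQL: {}", sql))?;
    let tables = rest.split(',').map(|t| t.trim().to_string()).collect();
    Ok(Select { tables })
}

/// Parse `sql` and assert that it serializes to `canonical`.
fn parses_to(sql: &str, canonical: &str) -> Select {
    let ast = parse(sql).expect("SQL should parse");
    assert_eq!(canonical, ast.to_string());
    ast
}

/// Parse `sql` and assert that it round-trips to the same string.
fn verified(sql: &str) -> Select {
    parses_to(sql, sql)
}
```

In this sketch `verified("SELECT * FROM t1, t2")` passes, while the `t1,t2` spelling needs `parses_to` with the canonical string as the second argument.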
Nickolay Ponomarev
de4ccd3cb7 Fail when expected keyword is not found
Add #[must_use] to warn against unchecked results of parse_keyword/s in
the future.
2019-01-13 01:07:58 +03:00
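
A small sketch of how `#[must_use]` guards against ignoring the result of a keyword check. The `Parser` struct and the `expect_keyword` helper below are simplified assumptions, not the crate's actual definitions.

```rust
/// Toy parser over pre-split tokens.
struct Parser {
    tokens: Vec<String>,
    pos: usize,
}

impl Parser {
    /// Consume the next token if it matches `expected` (case-insensitively).
    /// With #[must_use], the compiler warns when a caller drops the result.
    #[must_use]
    fn parse_keyword(&mut self, expected: &str) -> bool {
        match self.tokens.get(self.pos) {
            Some(token) if token.eq_ignore_ascii_case(expected) => {
                self.pos += 1;
                true
            }
            _ => false,
        }
    }

    /// Fail with an error instead of silently continuing when the keyword is missing.
    fn expect_keyword(&mut self, expected: &str) -> Result<(), String> {
        if self.parse_keyword(expected) {
            Ok(())
        } else {
            Err(format!(
                "Expected {}, found: {:?}",
                expected,
                self.tokens.get(self.pos)
            ))
        }
    }
}
```

A bare `parser.parse_keyword("TABLE");` statement now produces an `unused_must_use` warning, which nudges callers toward either branching on the result or using the failing variant.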
Andy Grove
777fd4c2ee Merge branch 'master' into not 2019-01-12 11:14:07 -07:00
Andy Grove
8c351fe10a Merge branch 'join-support' of https://github.com/fredrikroos/sqlparser-rs into fredrikroos-join-support 2019-01-12 11:09:41 -07:00
Andy Grove
ab423bc9dc Merge branch 'master' into join-support 2019-01-12 08:33:12 -07:00
Nickolay Ponomarev
3b13e153a8 Fix parse_time() handling of fractional seconds
There's no Token::Period in such a situation, so the fractional part (from sec) was silently truncated.

Can't uncomment the test yet, because parse_timestamp() is effectively
unused: the code added to parse_value() in 5abd9e7dec
was wrong as it attempted to handle unquoted date/time literals. One
part of it was commented out earlier, the other can't work as far as I
can see, as it tries to parse a Number token - `([0-9]|\.)+` - as a
timestamp, so I removed it as well.
2019-01-11 02:37:36 +03:00
Nickolay Ponomarev
eff92a2dc1 Remove special handling of ::type1::type2 from parse_pg_cast
...it gets handled just as well by the infix parser.
(Add a test while we're at it.)
2019-01-11 02:37:36 +03:00
Nickolay Ponomarev
f21cd697c3 Simplify custom datatypes handling and add a test
1) Simplified the bit in parse_datatype()
2) Made sure it was covered by the test (the "public.year" bit)
2a) ...the rest of changes in the test are to fix incorrect variable
 names: c_name/c_lat/c_lng were copy-pasted from a previous test.
3) Removed the branch from parse_pg_cast, which duplicated what
parse_data_type already handled (added in the same commit even
2007995938 )
2019-01-11 02:37:36 +03:00
Andy Grove
ee1944b9d9 Implemented NOT LIKE 2018-12-16 16:30:32 -07:00
Andy Grove
e863bc041c cargo fmt, fix compiler warnings 2018-12-16 13:57:01 -07:00
Clemens Winter
91aa985ed0 Add LIKE operator 2018-12-16 11:26:09 -08:00
Fredrik Roos
72024661a9 More tests and some small bugfixes 2018-11-18 00:53:39 +01:00
Andy Grove
7e152cd0a9 revert one timestamp parsing case 2018-10-14 12:26:47 -06:00
Andy Grove
035ef52696 re-instate tests for generic parser 2018-10-06 10:15:10 -06:00
Renamed from tests/sqlparser.rs