datafusion-sqlparse

mirror of https://github.com/apache/datafusion-sqlparser-rs.git synced 2025-07-08 17:35:00 +00:00

Author	SHA1	Message	Date
Nickolay Ponomarev	9371652446	Fix "unused stmt" warning in tests, with default features	2020-07-29 02:08:17 +03:00
Steven	8020b2e5f0	Add Postgres-specific PREPARE, EXECUTE and DEALLOCATE (#243) Adds the top-level statements PREPARE, EXECUTE and DEALLOCATE for the Postgres-specific prepared-statement feature.	2020-07-28 12:01:52 +03:00
Max Countryman	8cc7702a8c	update branch references to `main` (#215) * update branch references to `main` * ensure we point to ballista-compute * update a couple of links to point to ballista-compute	2020-07-02 21:31:54 +02:00
mz	0c83e5d9e8	Support SQLite's WITHOUT ROWID in CREATE TABLE (#208) Per https://sqlite.org/lang_createtable.html Co-authored-by: mashuai <mashuai@bytedance.com>	2020-06-26 15:11:46 +03:00
Daniël Heres	15d5f71646	Add CREATE TABLE AS support (#206) We parse it as a regular `CREATE TABLE` statement followed by an `AS <query>`, which is how BigQuery works: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_table_statement ANSI SQL and PostgreSQL only support a plain list of columns after the table name in a CTAS `CREATE TABLE t (a) AS SELECT a FROM foo` We currently only allow specifying a full schema with data types, or omitting it altogether. https://www.postgresql.org/docs/12/sql-createtableas.html https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#as-subquery-clause Finally, when no schema is specified, we print empty parens after a plain `CREATE TABLE t ();` as required by PostgreSQL, but skip them in a CTAS: `CREATE TABLE t AS ...`. This affects serialization only, the parser allows omitting the schema in a regular `CREATE TABLE` too since the first release of the parser: `7d27abdfb4/src/sqlparser.rs (L325-L332)` Co-authored-by: Nickolay Ponomarev <asqueella@gmail.com>	2020-06-23 16:30:22 +03:00
Daniël Heres	fab6e28271	Output DataType capitalized (#202) This makes it consistent with other output which also prints keywords capitalized.	2020-06-13 16:18:44 +03:00
Alex Dukhno	5ad578e3e5	Implement CREATE TABLE IF NOT EXISTS (#163) A non-standard feature supported at least by Postgres https://www.postgresql.org/docs/12/sql-createtable.html	2020-04-21 16:28:02 +03:00
Robert Grimm	b1cbc55128	Turn type Ident into struct Ident The Ident type was previously an alias for a String. Turn it into a full fledged struct, so that the parser can preserve the distinction between identifier value and quote style already made by the tokenizer's Word structure.	2019-10-20 00:16:41 -04:00
Nikhil Benesch	b8fe800da5	Fix merge skew with number literals	2019-09-02 09:37:38 -04:00
Nikhil Benesch	e9c5567b04	Merge pull request #135 from andygrove/show-columns Support MySQL `SHOW COLUMNS` statement	2019-09-02 07:40:57 -04:00
Nikhil Benesch	a0aca824e8	Optionally parse numbers into BigDecimals With `--features bigdecimal`, parse numbers into BigDecimals instead of leaving them as strings.	2019-09-01 13:21:49 -04:00
Nikhil Benesch	b5621c0fe8	Don't lose precision when parsing decimal fractions The SQL standard requires that numeric literals with a decimal point, like 1.23, are represented exactly, up to some precision. That means that parsing these literals into f64s is invalid, as it is impossible to represent many decimal numbers exactly in binary floating point (for example, 0.3). This commit parses all numeric literals into a new `Value` variant `Number(String)`, removing the old `Long(u64)` and `Double(f64)` variants. This is slightly less convenient for downstream consumers, but far more flexible, as numbers that do not fit into a u64 and f64 are now representable.	2019-09-01 13:21:30 -04:00
Nikhil Benesch	e1ded184f8	Support `SHOW <var>` and `SET <var>`	2019-09-01 13:20:37 -04:00
Nikhil Benesch	ac555d7e86	Remove "SQL" prefix from types The rationale here is the same as the last commit: since this crate exclusively parses SQL, there's no need to restate that in every type name. (The prefix seems to be an artifact of this crate's history as a submodule of Datafusion, where it was useful to explicitly call out which types were related to SQL parsing.) This commit has the additional benefit of making all type names consistent; over time we'd added some types which were not prefixed with "SQL".	2019-06-25 13:11:11 -04:00
Nikhil Benesch	cf655ad1a6	Remove "sql" prefix from module names Since this crate only deals with SQL parsing, the modules are understood to refer to SQL and don't need to restate that explicitly.	2019-06-24 12:56:26 -04:00
Nikhil Benesch	646d1e13ca	Rename ASTNode to Expr The ASTNode enum was confusingly named. In the past, the name made sense, as the enum contained nearly all of the nodes in the AST, but over time, pieces have been split into different structs, like SQLStatement and SQLQuery. The ASTNode enum now contains only expression nodes, so Expr is a better name. Also rename the UnnamedExpression and ExpressionWithAlias variants of SQLSelectItem to UnnamedExpr and ExprWithAlias, respectively, to match the new shorthand for the word "expression".	2019-06-19 00:00:59 -04:00
Andy Grove	b379480b7a	Merge pull request #79 from benesch/license Standardize license headers	2019-06-10 19:39:12 -06:00
Nikhil Benesch	ffa1c8f853	Parse column constraints in any order CREATE TABLE t (a INT NOT NULL DEFAULT 1 PRIMARY KEY) is as valid as CREATE TABLE t (a INT DEFAULT 1 PRIMARY KEY NOT NULL).	2019-06-08 12:13:55 -04:00
Nikhil Benesch	e78cf0483e	Standardize license headers Standardize the license header, removing the Grove Enterprise copyright notice where it exists per #58. Also add a CI check to ensure that files without license headers don't get merged. Fix #58.	2019-06-04 10:54:12 -04:00
Nikhil Benesch	69f0082db6	Support arbitrary WITH options for CREATE [TABLE|VIEW] Both Postgres and MSSQL accept this syntax, though the particular options they accept differ.	2019-06-03 13:32:13 -04:00
Nickolay Ponomarev	aab0c36443	Support parsing constraints in CREATE TABLE <table element> ::= ... | <table constraint definition> | ... https://jakewheat.github.io/sql-overview/sql-2011-foundation-grammar.html#table-element-list	2019-06-02 13:54:16 +03:00
Nickolay Ponomarev	67cc880fd1	Add comments to the test files	2019-05-04 02:43:00 +03:00
Nickolay Ponomarev	304710d59a	Add MSSQL dialect and fix up the postgres' identifier rules The `@@version` test is MS' dialect of SQL, it seems, so test it with its own dialect. Update the rules for identifiers in Postgresql dialect per documentation, while we're at it. The current identifier rules in Postgresql dialect were introduced in this commit - as a copy of generic rules, it seems: `810cd8e6cf (diff-2808df0fba0aed85f9d35c167bd6a5f1L138)`	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	5047f2c02e	Remove the ansi-specific test file and update PG tests - The ANSI dialect is now tested in `sqlparser_common.rs` - Some PG testcases are also parsed by the generic dialect successfully, so test that.	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	1347ca0825	Move the rest of tests not specific to PG from the sqlparser_postgres.rs	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	478dbe940d	Factor test helpers into a common module Also run "generic" tests with all dialects (`parse_select_version` doesn't work with ANSI dialect, so I moved it to the postgres file temporarily)	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	de177f107c	Remove dead datetime-related code 1) Removed unused date/time parsing methods from `Parser` I don't see how the token-based parsing code would ever be used: the date/time literals are usually quoted, like `DATE 'yyyy-mm-dd'` or simply `'YYYYMMDD'`, so the date will be a single token. 2) Removed unused date/time related variants from `Value` and the dependency on `chrono`. We don't support parsing date/time literals at the moment and when we do I think we should store the exact String to let the consumer parse it as they see fit. 3) Removed `parse_timestamps_example` and `parse_timestamps_with_millis_example` tests. They parsed as `number(2016) minus number(02) minus number(15) <END OF EXPRESSION>` (leaving the time part unparsed) as it makes no sense to try parsing a yyyy-mm-dd value as an SQL expression.	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	9297ffbe18	Move tests using standard SQL from the postgresql-specific file	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	d1b088bd43	Switch remaining tests to the standard format	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	0233604f9b	Remove extraneous tests `parse_example_value` parses as compound identifier, which makes no sense ("SARAH"."LEWISE@sakilacustomer"."org") `parse_function_now` is unnecessary since we already test the parsing of functions in `parse_scalar_function_in_projection`	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	fe635350f0	Improve INSERT tests De-duplicate and check for specific error in `parse_insert_invalid`.	2019-05-04 01:00:13 +03:00
Nickolay Ponomarev	098d1c4a17	Enable clippy lints by default in RLS	2019-04-21 04:46:19 +03:00
Nickolay Ponomarev	dee30aabe0	Fix the clippy `assert!(false)` lint https://rust-lang.github.io/rust-clippy/master/index.html#assertions_on_constants While I don't feel it's valid, fixing it lets us act on the other, more useful, lints.	2019-04-21 04:46:19 +03:00
Nickolay Ponomarev	c223eaf0aa	Fix a bunch of trivial clippy lints	2019-04-21 04:46:19 +03:00
Nickolay Ponomarev	b12a19e197	Switch to the Rust 2018 edition This requires Rust 1.31 (from last year) to build, but is otherwise compatible with the 2015-edition code.	2019-04-21 04:41:11 +03:00
Zhiyuan Zheng	d8f824c400	merge CreateExternalTable & CreateTable.	2019-04-14 01:05:26 +08:00
Nickolay Ponomarev	c5bbfc33fd	Don't Box<ASTNode> in SQLStatement This used to be needed when it was a variant in the ASTNode enum itself.	2019-02-07 05:33:46 +03:00
Nickolay Ponomarev	39e98cb11a	Rename parse_tablename -> parse_object_name (4.2/4.4) ...to match the name of the recently introduced `SQLObjectName` struct and to avoid any reservations about using it with multi-part names of objects other than tables (as in the `type_name` case).	2019-02-07 05:31:44 +03:00
Nickolay Ponomarev	523f086be7	Introduce SQLObjectName struct (4.1/4.4) (To store "A name of a table, view, custom type, etc., possibly multi-part, i.e. db.schema.obj".) Before this change - some places used `String` for this (these are updated in this commit) - while others (notably SQLStatement::SQLDelete::relation, which is the reason for this series of commits) relied on ASTNode::SQLCompoundIdentifier (which is also backed by a Vec<SQLIdent>, but, as a variant of ASTNode enum, is not convenient to use when you know you need that specific variant).	2019-02-07 05:31:40 +03:00
Nickolay Ponomarev	b57c60a78c	Only use parse_expr() when we expect an expression (0/4) Before this commit there was a single `parse_expr(u8)` method, which was called both 1) from within the expression parser (to parse subexpression consisting of operators with higher priority than the current one), and 2) from the top-down parser both a) to parse true expressions (such as an item of the SELECT list or the condition after WHERE or after ON), and b) to parse sequences which are not exactly "expressions". This starts cleaning this up by renaming the `parse_expr(u8)` method to `parse_subexpr()` and using it only for (1) - i.e. usually providing a non-zero precedence parameter. The non-intuitively called `parse()` method is renamed to `parse_expr()`, which became available and is used for (2a). While reviewing the existing callers of `parse_expr`, four points to follow up on were identified (marked "TBD (#)" in the commit): 1) Do not lose parens (e.g. `(1+2)*3`) when roundtripping String->AST->String by using SQLNested. 2) Incorrect precedence of the NOT unary 3) `parse_table_factor` accepts any expression where a SELECT subquery is expected. 4) parse_delete uses parse_expr() to retrieve a table name These are dealt with in the commits to follow.	2019-02-07 05:24:54 +03:00
Nickolay Ponomarev	707c58ad57	Support parsing of multiple statements (5/5) Parser::parse_sql() can now parse a semicolon-separated list of statements, returning them in a Vec<SQLStatement>. To support this we: - Move handling of inter-statement tokens from the end of individual statement parsers (`parse_select` and `parse_delete`; this was not implemented for other top-level statements) to the common statement-list parsing code (`parse_sql`); - Change the "Unexpected token at end of ..." error, which didn't have tests and prevented us from parsing successive statements -> "Expected end of statement" (i.e. a delimiter - currently only ";" - or the EOF); - Add PartialEq on ParserError to be able to assert_eq!() that parsing statements that do not terminate properly returns an expected error.	2019-02-07 05:24:54 +03:00
Nickolay Ponomarev	2dec65fdb4	Separate statement from expr parsing (4/5) Continuing from https://github.com/andygrove/sqlparser-rs/pull/33#issuecomment-453060427 This stops the parser from accepting (and the AST from being able to represent) SQL look-alike code that makes no sense, e.g. SELECT ... FROM (CREATE TABLE ...) foo SELECT ... FROM (1+CAST(...)) foo Generally this makes the AST less "partially typed": meaning certain parts are strongly typed (e.g. SELECT can only contain projections, relations, etc.), while everything that didn't get its own type is dumped into ASTNode, effectively untyped. After a few more fixes (yet to be implemented), `ASTNode` could become an `SQLExpression`. The Pratt-style expression parser (returning an SQLExpression) would be invoked from the top-down parser in places where a generic expression is expected (e.g. after SELECT <...>, WHERE <...>, etc.), while things like select's `projection` and `relation` could be more appropriately (narrowly) typed. Since the diff is quite large due to necessarily large number of mechanical changes, here's an overview: 1) Interface changes: - A new AST enum - `SQLStatement` - is split out of ASTNode: - The variants of the ASTNode enum, which _only_ make sense as a top level statement (INSERT, UPDATE, DELETE, CREATE, ALTER, COPY) are _moved_ to the new enum, with no other changes. - SQLSelect is _duplicated_: now available both as a variant in SQLStatement::SQLSelect (top-level SELECT) and ASTNode:: (subquery). - The main entry point (Parser::parse_sql) now expects an SQL statement as input, and returns an `SQLStatement`. 2) Parser changes: instead of detecting the top-level constructs deep down in the precedence parser (`parse_prefix`) we are able to do it just right after setting up the parser in the `parse_sql` entry point (SELECT, again, is kept in the expression parser to demonstrate how subqueries could be implemented). 
The rest of parser changes are mechanical ASTNode -> SQLStatement replacements resulting from the AST change. 3) Testing changes: for every test - depending on whether the input was a complete statement or an expression - I used an appropriate helper function: - `verified` (parses SQL, checks that it round-trips, and returns the AST) - was replaced by `verified_stmt` or `verified_expr`. - `parse_sql` (which returned AST without checking it round-tripped) was replaced by: - `parse_sql_expr` (same function, for expressions) - `one_statement_parses_to` (formerly `parses_to`), extended to deal with statements that are not expected to round-trip. The weird name is to reduce further churn when implementing multi-statement parsing. - `verified_stmt` (in 4 testcases that actually round-tripped)	2019-01-31 15:54:57 +03:00
Nickolay Ponomarev	7bbf69f513	Further simplify parse_compound_identifier (5/8) This part changes behavior: - Fail when no identifier is found. - Avoid rewinding if EOF was hit right after the identifier.	2019-01-31 03:57:17 +03:00
Nickolay Ponomarev	9a8b6a8e64	Rework keyword/identifier parsing (1/8) Fold Token::{Keyword, Identifier, DoubleQuotedString} into one Token::SQLWord, which has the necessary information (was it a known keyword and/or was it quoted). This lets the parser easily accept DoubleQuotedString (a quoted identifier) everywhere it expects an Identifier in the same match arm. (To complete support of quoted identifiers, or "delimited identifiers" as the spec calls them, a TODO in parse_tablename() ought to be addressed.) As an aside, per <https://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers> sqlite seems to be the only one supporting 'identifier' (which is rather hairy, since it can also be a string literal), and `identifier` seems only to be supported by MySQL. I didn't implement either one. This also allows the use of `parse`/`expect_keyword` machinery for non-reserved keywords: previously they relied on the keyword being a Token::Keyword, which wasn't a Token::Identifier, and so wasn't accepted as one. Now whether a keyword can be used as an identifier can be decided by the parser. (I didn't add a blacklist of "reserved" keywords, so that any keyword which doesn't have a special meaning in the parser could be used as an identifier. The list of keywords in the dialect could be re-used for that purpose at a later stage.)	2019-01-31 03:57:16 +03:00
Nickolay Ponomarev	70c799e21d	Use verified() in the remaining PG-specific tests	2019-01-20 19:30:13 +03:00
Nickolay Ponomarev	9441f9c5d8	Move tests for "LIKE '%'" to sqlparser_generic.rs ...as this syntax is not specific to the PostgreSQL dialect. Also use verified() to assert that parsing + serializing results in the original SQL string.	2019-01-20 19:30:12 +03:00
Nickolay Ponomarev	d5109a2880	Remove duplicate tests from sqlparser_postgres.rs These have identical copies in sqlparser_generic.rs	2019-01-20 19:30:12 +03:00
Nickolay Ponomarev	a1da7b4005	Reduce differences between "generic" and "postgresql" tests Mainly by replacing `assert_eq!(sql, ast.to_string())` with a call to the recently introduced `verified()` helper or using `parses_to()` where the expected serialization differs from the original SQL string. There was one case (parse_implicit_join), where the inputs were different: let sql = "SELECT * FROM t1,t2"; //vs let sql = "SELECT * FROM t1, t2"; and since we don't test the whitespace handling in other tests, I just used the canonical representation as input.	2019-01-20 19:14:53 +03:00
Nickolay Ponomarev	de4ccd3cb7	Fail when expected keyword is not found Add #[must_use] to warn against unchecked results of parse_keyword/s in the future.	2019-01-13 01:07:58 +03:00
Andy Grove	777fd4c2ee	Merge branch 'master' into not	2019-01-12 11:14:07 -07:00

111 commits