Commit graph

20 commits

Author SHA1 Message Date
Nickolay Ponomarev
b9f4b503b6 Support different quoting styles for delimited identifiers
The dialect information is from https://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers
2019-02-07 05:34:29 +03:00
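To illustrate what "different quoting styles" can mean for a tokenizer, here is a minimal sketch, not the code from this commit. It assumes three opening delimiters: the standard double quote, the backtick that the commit history attributes to MySQL, and square brackets as used by MS SQL Server. The names `closing_quote` and `read_delimited_identifier` are hypothetical, and escaped (doubled) quote characters are deliberately not handled.

```rust
/// Hypothetical helper: maps an opening delimiter to its closing delimiter.
/// Returns None if `open` does not start a delimited identifier.
pub fn closing_quote(open: char) -> Option<char> {
    match open {
        '"' => Some('"'), // standard SQL delimited identifier
        '`' => Some('`'), // backtick style
        '[' => Some(']'), // bracket style
        _ => None,
    }
}

/// Hypothetical helper: reads a delimited identifier from the start of `input`.
/// Returns the identifier text, the opening quote character, and the rest of
/// the input, or None if the input does not start with a known delimiter or
/// the identifier is unterminated. Escaped quotes are not handled.
pub fn read_delimited_identifier(input: &str) -> Option<(String, char, &str)> {
    let mut chars = input.char_indices();
    let (_, open) = chars.next()?;
    let close = closing_quote(open)?;
    for (i, ch) in chars {
        if ch == close {
            let ident = input[open.len_utf8()..i].to_string();
            let rest = &input[i + ch.len_utf8()..];
            return Some((ident, open, rest));
        }
    }
    None
}
```

Under these assumptions, `read_delimited_identifier("[a b]")` yields the identifier `a b` together with the opening character `[`, so a later stage can still tell which quoting style was used.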
Nickolay Ponomarev
b3693bfa63 Simplify quoted identifier tokenization 2019-02-07 05:34:26 +03:00
Nickolay Ponomarev
f87230553e Remove dialect-specific keyword lists (2/8)
Now populating SQLWord.keyword based on the list of globally supported
keywords.
2019-01-31 03:57:16 +03:00
Nickolay Ponomarev
9a8b6a8e64 Rework keyword/identifier parsing (1/8)
Fold Token::{Keyword, Identifier, DoubleQuotedString} into one
Token::SQLWord, which has the necessary information (was it a
known keyword and/or was it quoted).

This lets the parser easily accept DoubleQuotedString (a quoted
identifier) everywhere it expects an Identifier in the same match
arm. (To complete support of quoted identifiers, or "delimited
identifiers" as the spec calls them, a TODO in parse_tablename()
ought to be addressed.)

    As an aside, per <https://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers>
    sqlite seems to be the only one supporting 'identifier'
    (which is rather hairy, since it can also be a string
    literal), and `identifier` seems only to be supported by
    MySQL. I didn't implement either one.

This also allows the use of `parse`/`expect_keyword` machinery
for non-reserved keywords: previously they relied on the keyword
being a Token::Keyword, which wasn't a Token::Identifier, and so
wasn't accepted as one.

Now whether a keyword can be used as an identifier can be decided
by the parser. (I didn't add a blacklist of "reserved" keywords,
so that any keyword which doesn't have a special meaning in the
parser could be used as an identifier. The list of keywords in
the dialect could be re-used for that purpose at a later stage.)
2019-01-31 03:57:16 +03:00
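The idea in the commit above, one word token that records whether it is a known keyword and whether it was quoted, can be sketched roughly as follows. This is a simplified illustration and not the actual definitions from the repository: the field names, the tiny `KEYWORDS` list, and the helpers `make_word`, `as_identifier`, and `is_keyword` are all assumptions made for the example.

```rust
/// Hypothetical, simplified token type with a single variant for words.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    SQLWord(SQLWord),
    SingleQuotedString(String),
    Char(char),
}

/// A keyword or identifier, quoted or not (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct SQLWord {
    /// The word as written, without any surrounding quotes.
    pub value: String,
    /// The opening quote character, if the word was quoted.
    pub quote_style: Option<char>,
    /// Upper-cased keyword if the word is a known keyword, else empty.
    pub keyword: String,
}

/// Illustrative keyword list; a real list would be much longer.
const KEYWORDS: &[&str] = &["SELECT", "FROM", "WHERE", "TABLE"];

/// Builds a word token; a quoted word is never treated as a keyword.
pub fn make_word(word: &str, quote_style: Option<char>) -> Token {
    let upper = word.to_uppercase();
    let is_kw = quote_style.is_none() && KEYWORDS.contains(&upper.as_str());
    Token::SQLWord(SQLWord {
        value: word.to_string(),
        quote_style,
        keyword: if is_kw { upper } else { String::new() },
    })
}

/// Accepts any word as an identifier in one match arm, leaving it to the
/// parser to decide whether a keyword is allowed in that position.
pub fn as_identifier(token: &Token) -> Option<&str> {
    match token {
        Token::SQLWord(w) => Some(w.value.as_str()),
        _ => None,
    }
}

/// Checks whether the token is the expected (unquoted) keyword.
pub fn is_keyword(token: &Token, expected: &str) -> bool {
    matches!(token, Token::SQLWord(w)
        if !w.keyword.is_empty() && w.keyword.eq_ignore_ascii_case(expected))
}
```

With this shape, `make_word("select", None)` is both usable as an identifier and recognized by `is_keyword(.., "SELECT")`, while `make_word("select", Some('"'))` is only an identifier. That mirrors the commit's point that the parser, not the tokenizer, decides whether a keyword may serve as an identifier.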
Nickolay Ponomarev
d0a65ffd05 Remove Token::String, as it's never emitted
Indeed, given that there is Token::SingleQuotedString and
Token::Identifier, there's no other "string" that would make sense...
2019-01-20 19:26:57 +03:00
Andy Grove
722ea7a91b ran cargo fmt 2018-10-06 09:39:26 -06:00
Jovansonlee Cesar
da153bf848 Make a PostgreSQLDialect
Add is_primary and is_unique in the column definition

Initial code for testing alter table
2018-09-28 03:32:10 +08:00
Jovansonlee Cesar
74b34faaf1 Also tokenize non alphanumeric characters into some Char, since they can be tab separated values in COPY payload 2018-09-26 23:59:52 +08:00
Jovansonlee Cesar
34412f7e3d Whitespace tokens are not skipped
Differentiate single quoted string and double quoted string
2018-09-26 22:46:16 +08:00
Jovansonlee Cesar
199ec67da7 Add implementation for parsing SQL COPY 2018-09-25 01:31:54 +08:00
Jovansonlee Cesar
2007995938 Improve the create statement parser that uses create statements from pg database dump
Added PostgreSQL style casting
2018-09-24 03:34:40 +08:00
Andy Grove
3fe6fa4041 Reduce duplicate code in tokenizer 2018-09-08 14:55:03 -06:00
Andy Grove
810cd8e6cf tokenizer delegates to dialect now 2018-09-08 14:49:25 -06:00
Andy Grove
06a8870bd7 Introduce concept of dialects 2018-09-08 08:39:32 -06:00
Andy Grove
cc725791de cargo fmt 2018-09-08 08:10:05 -06:00
crw5996
900c56ff29 Fixed column values to reflect length of tokens 2018-09-07 20:23:23 -04:00
crw5996
82d1f36366 Added line number errors in the tokenizer 2018-09-07 18:34:22 -04:00
Andy Grove
7bea9a8648 cargo fmt 2018-09-03 11:45:03 -06:00
Andy Grove
5bac9fd131 Remove some non ANSI SQL support 2018-09-03 10:25:05 -06:00
Andy Grove
0c23392adb replace with code from datafusion 2018-09-03 09:56:39 -06:00