- reduce duplication in the handling of implicit/cross joins and make
the flow of data slightly clearer by returning the `join` instead of
pushing it and exiting early.
(I wanted the block that currently returns `join` to return one of
the `JoinOperator::*` tags, so that `parse_table_factor` and the construction
of the `Join` struct could happen after we've parsed the JOIN keywords,
but that seems impossible.)
- move the check for the NATURAL keyword into the block that deals with
INNER/OUTER joins that support constraints (and thus can be preceded
by "NATURAL")
- add a check, with a test, for NATURAL not followed by a known join type
- add more tests for NATURAL joins (we didn't have any), and fix a
whitespace bug in `to_string()` that was uncovered (we emitted an
extra space: `foo NATURAL JOIN bar `)
This block parses one of:
- `[ INNER ] JOIN <table_factor> <join_constraint>`
- `{ LEFT | RIGHT | FULL } [ OUTER ] JOIN <table_factor> <join_constraint>`
...but it was hard to see because of the duplication.
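
A minimal sketch of the keyword handling described above, assuming the input is already split into upper-case keyword strings. The names `JoinKind` and `parse_join_keywords` are hypothetical and are not the crate's actual API; the sketch only shows how NATURAL can be checked in the same place as the INNER/OUTER join types, including the error for NATURAL followed by an unknown join type.

```rust
#[derive(Debug, PartialEq)]
enum JoinKind {
    Inner,
    LeftOuter,
    RightOuter,
    FullOuter,
}

/// Recognize `[NATURAL] { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN`.
/// Returns `Ok(None)` when the input does not start with a join of this family,
/// and an error when NATURAL is not followed by a known join type.
fn parse_join_keywords(words: &[&str]) -> Result<Option<(bool, JoinKind)>, String> {
    // NATURAL is optional and only valid in front of the joins handled here.
    let (natural, rest) = match words {
        ["NATURAL", rest @ ..] => (true, rest),
        _ => (false, words),
    };
    let kind = match rest {
        ["JOIN", ..] | ["INNER", "JOIN", ..] => Some(JoinKind::Inner),
        ["LEFT", "JOIN", ..] | ["LEFT", "OUTER", "JOIN", ..] => Some(JoinKind::LeftOuter),
        ["RIGHT", "JOIN", ..] | ["RIGHT", "OUTER", "JOIN", ..] => Some(JoinKind::RightOuter),
        ["FULL", "JOIN", ..] | ["FULL", "OUTER", "JOIN", ..] => Some(JoinKind::FullOuter),
        _ => None,
    };
    match (natural, kind) {
        (_, Some(kind)) => Ok(Some((natural, kind))),
        (true, None) => Err("Expected a join type after NATURAL".to_string()),
        (false, None) => Ok(None),
    }
}
```

In this sketch the table factor and the join constraint would be parsed after the keywords; a real parser works on tokens rather than string slices.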
- use `if !self.consume_token(&Token::Comma) { break; }` to consume the
comma and exit the loop if no comma found.
- coalesce two `{ false }` blocks in `consume_token` by using a match guard
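
As an illustration of the two items above, here is a small self-contained sketch. The `Token` and `Parser` types are simplified stand-ins (assumptions, not the crate's real definitions), and `parse_word_list` is a hypothetical helper that exists only to show the loop shape.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Comma,
}

struct Parser {
    tokens: Vec<Token>,
    index: usize,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, index: 0 }
    }

    /// Consume the next token if it equals `expected`. The match guard lets
    /// "wrong token" and "end of input" share a single `false` arm.
    fn consume_token(&mut self, expected: &Token) -> bool {
        match self.tokens.get(self.index) {
            Some(t) if t == expected => {
                self.index += 1;
                true
            }
            _ => false,
        }
    }

    /// Parse `word [, word]*`, stopping as soon as no comma follows an item.
    fn parse_word_list(&mut self) -> Vec<String> {
        let mut items = Vec::new();
        loop {
            match self.tokens.get(self.index) {
                Some(Token::Word(w)) => {
                    items.push(w.clone());
                    self.index += 1;
                }
                _ => break,
            }
            // Consume the comma and exit the loop if none was found.
            if !self.consume_token(&Token::Comma) {
                break;
            }
        }
        items
    }
}
```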
Internal improvements to Parser::next_token/prev_token
This reduces the number of helper functions used by next_token()/prev_token() while slightly improving performance and reducing the chances of coding errors when using prev_token() after hitting end-of-file.
Before this, `next_token()` would only increment the index when returning
`Some(token)`. This means that a caller wishing to rewind must be
careful not to call `prev_token()` on EOF (`None`), while not forgetting
to call it for `Some`. Not doing this resulted in bugs in the
undertested code that does error handling.
After making this mistake several times, I'm changing `next_token()` /
`prev_token()` so that calling `next_token(); prev_token(); next_token()`
returns the same token in the first and the last invocation.
- Avoid cloning whitespace tokens in `peek_nth_token()` by using a
`&Token` from `tokens.get()` instead of a cloned `Token` from `token_at()`
- Similarly avoid cloning in `next_token_no_skip`, and clone the
non-whitespace tokens in `next_token` instead.
- Remove `token_at`, which was only used in `peek_token` and
`peek_nth_token`
- Fold `prev_token_no_skip()` into `prev_token()` and make `prev_token`
return nothing, as the return value isn't used anyway.
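
The new behavior can be sketched as follows, using simplified stand-in `Token` and `Parser` types (assumptions for illustration, not the crate's exact code). The key point is that the index advances even when `next_token()` hits end-of-file, so `prev_token()` always undoes exactly one `next_token()`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    Whitespace,
}

struct Parser {
    tokens: Vec<Token>,
    index: usize,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, index: 0 }
    }

    /// Return the next non-whitespace token, or `None` at end-of-file.
    /// The index is advanced in both cases. Whitespace is skipped by
    /// reference; only the returned token is cloned.
    fn next_token(&mut self) -> Option<Token> {
        loop {
            self.index += 1;
            match self.tokens.get(self.index - 1) {
                Some(Token::Whitespace) => continue,
                token => return token.cloned(),
            }
        }
    }

    /// Step back to just before the previous non-whitespace token (or EOF).
    /// Returns nothing, since callers only need the rewind.
    fn prev_token(&mut self) {
        loop {
            assert!(self.index > 0);
            self.index -= 1;
            if let Some(Token::Whitespace) = self.tokens.get(self.index) {
                continue;
            }
            return;
        }
    }
}
```

With this shape, error-handling code can call `prev_token()` unconditionally after `next_token()`, whether it got `Some` or `None`.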
* Rewrite parsing of `ALTER TABLE ADD CONSTRAINT`
* Support constraints in CREATE TABLE
* Change `Value::Long()` to be unsigned, use u64 consistently
* Allow trailing comma in CREATE TABLE
The tokenizer emits a separate Token for +/- signs, so the value of
`Value::Long()` (as well as of `parse_literal_int()`) can never be negative.
Also, we have been using both u64 and usize to represent a parsed
unsigned number. Change to using u64 for consistency.
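
To illustrate why an unsigned type is sufficient, here is a minimal sketch with simplified, hypothetical `Token` and `Expr` types and helper names (none of these are the crate's real definitions). Because the sign arrives as its own token, the literal itself only ever needs to hold a non-negative value.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Minus,
    Number(String),
}

#[derive(Debug, PartialEq)]
enum Expr {
    Long(u64),
    UnaryMinus(Box<Expr>),
}

/// Parse a single number token as an unsigned 64-bit integer.
fn parse_literal_uint(token: &Token) -> Result<u64, String> {
    match token {
        Token::Number(s) => s
            .parse::<u64>()
            .map_err(|e| format!("Could not parse '{}' as u64: {}", s, e)),
        other => Err(format!("Expected literal int, found {:?}", other)),
    }
}

/// Parse an optionally negated number; the sign becomes a wrapping node
/// rather than part of the literal.
fn parse_signed(tokens: &[Token]) -> Result<Expr, String> {
    match tokens {
        [Token::Minus, rest @ ..] => Ok(Expr::UnaryMinus(Box::new(parse_signed(rest)?))),
        [token] => Ok(Expr::Long(parse_literal_uint(token)?)),
        _ => Err("Expected a number".to_string()),
    }
}
```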
NOT LIKE has the same precedence as the LIKE operator. The parser was
previously assigning it the precedence of the unary NOT operator. NOT
BETWEEN and NOT IN are treated similarly, as they are equivalent, from a
precedence perspective, to NOT LIKE.
The fix for this requires associating precedences with sequences of
tokens, rather than single tokens, so that "NOT LIKE" and "NOT <expr>"
can have different precedences. Perhaps surprisingly, this change is not
very invasive.
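
One way to picture the fix: the function that reports the precedence of the upcoming infix operator peeks one token further when it sees NOT. The sketch below works on keyword strings instead of real tokens, and the function name and numeric values are illustrative assumptions.

```rust
const OR_PREC: u8 = 5;
const AND_PREC: u8 = 10;
// Shared by LIKE, IN, BETWEEN and their NOT-prefixed forms.
const BETWEEN_PREC: u8 = 20;

/// Precedence of the infix operator that starts at the front of `tokens`.
/// Returns 0 when the upcoming tokens do not form an infix operator.
fn next_precedence(tokens: &[&str]) -> u8 {
    match tokens {
        ["OR", ..] => OR_PREC,
        ["AND", ..] => AND_PREC,
        // "NOT LIKE", "NOT IN" and "NOT BETWEEN" are two-token operators.
        ["NOT", "LIKE", ..] | ["NOT", "IN", ..] | ["NOT", "BETWEEN", ..] => BETWEEN_PREC,
        ["LIKE", ..] | ["IN", ..] | ["BETWEEN", ..] => BETWEEN_PREC,
        // A NOT in infix position that is not followed by one of the
        // keywords above is not an infix operator at all. Unary NOT is
        // handled separately, in prefix position.
        _ => 0,
    }
}
```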
An alternative I considered involved adjusting the tokenizer to lex
NOT, NOT LIKE, NOT BETWEEN, and NOT IN as separate tokens. This broke
symmetry in strange ways, though, as NotLike, NotBetween, and NotIn
gained dedicated tokens, while LIKE, BETWEEN, and IN remained as
stringly typed identifiers.
Fixes #81.
`BETWEEN <thing> AND <thing>` allows <thing> to be any expr that doesn't
contain boolean operators. (Allowing boolean operators would wreak
havoc, because of the repurposing of AND as both a boolean operation
and part of the syntax of BETWEEN.)
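
A minimal precedence-climbing sketch of this rule. It assumes whitespace-separated input, treats every non-operator token as an operand, and renders the result as a parenthesized string; the `Parser` type, the helper names, and the precedence values are all hypothetical. The bounds of BETWEEN are parsed at BETWEEN's own precedence, so a lower-precedence boolean AND cannot be absorbed into them.

```rust
const AND_PREC: u8 = 10;
const BETWEEN_PREC: u8 = 20;
const PLUS_PREC: u8 = 30;

struct Parser<'a> {
    tokens: Vec<&'a str>,
    index: usize,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.index).copied()
    }

    fn advance(&mut self) -> Option<&'a str> {
        let token = self.peek();
        self.index += 1;
        token
    }

    fn next_precedence(&self) -> u8 {
        match self.peek() {
            Some("AND") => AND_PREC,
            Some("BETWEEN") => BETWEEN_PREC,
            Some("+") => PLUS_PREC,
            _ => 0,
        }
    }

    /// Parse an expression made only of operators binding tighter than `precedence`.
    fn parse_subexpr(&mut self, precedence: u8) -> Result<String, String> {
        let mut expr = self.advance().ok_or("unexpected end of input")?.to_string();
        loop {
            let next_precedence = self.next_precedence();
            if precedence >= next_precedence {
                return Ok(expr);
            }
            expr = self.parse_infix(expr, next_precedence)?;
        }
    }

    fn parse_infix(&mut self, left: String, precedence: u8) -> Result<String, String> {
        match self.advance() {
            Some("BETWEEN") => {
                // Bounds stop before any boolean AND, which has lower precedence.
                let low = self.parse_subexpr(BETWEEN_PREC)?;
                if self.advance() != Some("AND") {
                    return Err("expected AND after the lower bound of BETWEEN".to_string());
                }
                let high = self.parse_subexpr(BETWEEN_PREC)?;
                Ok(format!("({} BETWEEN {} AND {})", left, low, high))
            }
            Some(op) => {
                let right = self.parse_subexpr(precedence)?;
                Ok(format!("({} {} {})", left, op, right))
            }
            None => Err("unexpected end of input".to_string()),
        }
    }
}

fn parse(sql: &str) -> Result<String, String> {
    let mut parser = Parser { tokens: sql.split_whitespace().collect(), index: 0 };
    parser.parse_subexpr(0)
}
```

For example, in `a BETWEEN 1 + 2 AND 3 AND b` the first AND closes the lower bound and the second AND is the boolean operator applied to the whole BETWEEN expression.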