Extensible SQL Lexer and Parser for Rust
Find a file
Nickolay Ponomarev 4294581ded [mssql] Parse CROSS/OUTER APPLY
T-SQL (and Oracle) support non-standard syntax, which is similar in
functionality to LATERAL joins in ANSI and PostgreSQL
<https://blog.jooq.org/tag/lateral-derived-table/>: it allows to use
the columns from the tables defined to the left of `APPLY` in the
"derived tables" (subqueries) to the right of `APPLY`. Unlike ANSI
LATERAL (but like Postgres' implementation), APPLY is also used with
table-valued function calls.

Despite them being similar, we represent "APPLY" joins with
`JoinOperator`s of its own (`CrossApply` and `OuterApply`). Doing
otherwise seemed like it would cause unnecessary confusion, as those
interested in dialect-specific parsing would probably not expect APPLY
being parsed as LATERAL, and those wanting to forbid non-standard SQL
would not be helped by this either.

This also renames existing JoinOperator::Cross -> CrossJoin to avoid
confusion with CrossApply.
2019-06-19 02:45:47 +03:00
docs Update docs on writing custom parsers 2018-09-08 07:29:34 -06:00
examples Standardize license headers 2019-06-04 10:54:12 -04:00
src [mssql] Parse CROSS/OUTER APPLY 2019-06-19 02:45:47 +03:00
tests [mssql] Parse CROSS/OUTER APPLY 2019-06-19 02:45:47 +03:00
.gitignore roughing out classic pratt parser 2018-02-08 07:49:24 -07:00
.travis.yml Standardize license headers 2019-06-04 10:54:12 -04:00
Cargo.toml Implement Hash on all AST nodes 2019-06-03 11:11:24 -04:00
HEADER Standardize license headers 2019-06-04 10:54:12 -04:00
LICENSE.TXT replace with code from datafusion 2018-09-03 09:56:39 -06:00
README.md Mention that we use rustfmt for code formatting 2019-05-06 22:20:29 +03:00
rustfmt.toml Standardize license headers 2019-06-04 10:54:12 -04:00

Extensible SQL Lexer and Parser for Rust

License Version Build Status Coverage Status Gitter Chat

The goal of this project is to build a SQL lexer and parser capable of parsing SQL that conforms with the ANSI SQL:2011 standard but also making it easy to support custom dialects so that this crate can be used as a foundation for vendor-specific parsers.

This parser is currently being used by the DataFusion query engine and LocustDB.

Example

The current code is capable of parsing some trivial SELECT and CREATE TABLE statements.

let sql = "SELECT a, b, 123, myfunc(b) \
           FROM table_1 \
           WHERE a > b AND b < 100 \
           ORDER BY a DESC, b";

let dialect = GenericSqlDialect{}; // or AnsiSqlDialect, or your own dialect ...

let ast = Parser::parse_sql(&dialect,sql.to_string()).unwrap();

println!("AST: {:?}", ast);

This outputs

AST: [SQLSelect(SQLQuery { ctes: [], body: Select(SQLSelect { distinct: false, projection: [UnnamedExpression(SQLIdentifier("a")), UnnamedExpression(SQLIdentifier("b")), UnnamedExpression(SQLValue(Long(123))), UnnamedExpression(SQLFunction { name: SQLObjectName(["myfunc"]), args: [SQLIdentifier("b")], over: None })], relation: Some(Table { name: SQLObjectName(["table_1"]), alias: None }), joins: [], selection: Some(SQLBinaryExpr { left: SQLBinaryExpr { left: SQLIdentifier("a"), op: Gt, right: SQLIdentifier("b") }, op: And, right: SQLBinaryExpr { left: SQLIdentifier("b"), op: Lt, right: SQLValue(Long(100)) } }), group_by: None, having: None }), order_by: Some([SQLOrderByExpr { expr: SQLIdentifier("a"), asc: Some(false) }, SQLOrderByExpr { expr: SQLIdentifier("b"), asc: None }]), limit: None })]

Design

This parser is implemented using the Pratt Parser design, which is a top-down operator-precedence parser.

I am a fan of this design pattern over parser generators for the following reasons:

  • Code is simple to write and can be concise and elegant (this is far from true for this current implementation unfortunately, but I hope to fix that using some macros)
  • Performance is generally better than code generated by parser generators
  • Debugging is much easier with hand-written code
  • It is far easier to extend and make dialect-specific extensions compared to using a parser generator

Supporting custom SQL dialects

This is a work in progress but I started some notes on writing a custom SQL parser.

Contributing

Contributors are welcome! Please see the current issues and feel free to file more!

Please run cargo fmt to ensure the code is properly formatted.