mirror of https://github.com/apache/datafusion-sqlparser-rs.git synced 2025-07-07 17:04:59 +00:00

Extensible SQL Lexer and Parser for Rust

Andy Grove 70f3b78f33 (cargo-release) version 0.1.5		2018-09-08 08:52:01 -06:00
docs	Update docs on writing custom parsers	2018-09-08 07:29:34 -06:00
examples	fix example	2018-09-08 08:51:52 -06:00
src	Introduce concept of dialects	2018-09-08 08:39:32 -06:00
.gitignore	roughing out classic pratt parser	2018-02-08 07:49:24 -07:00
.travis.yml	add travis build script	2018-09-03 11:03:04 -06:00
Cargo.toml	(cargo-release) version 0.1.5	2018-09-08 08:52:01 -06:00
LICENSE.TXT	replace with code from datafusion	2018-09-03 09:56:39 -06:00
README.md	Update code sample in README	2018-09-08 08:43:15 -06:00

README.md

Extensible SQL Lexer and Parser for Rust

The goal of this project is to build a SQL lexer and parser that can parse SQL conforming to the ANSI SQL:2011 standard while also making it easy to support custom dialects, so that this crate can serve as a foundation for vendor-specific parsers.

This parser is currently being used by the DataFusion query engine.

Example

The current code is capable of parsing some trivial SELECT and CREATE TABLE statements.

let sql = "SELECT a, b, 123, myfunc(b) \
           FROM table_1 \
           WHERE a > b AND b < 100 \
           ORDER BY a DESC, b";

let dialect = GenericSqlDialect {}; // or AnsiSqlDialect, or your own dialect ...

let ast = Parser::parse_sql(&dialect, sql.to_string()).unwrap();

println!("AST: {:?}", ast);

This outputs

AST: SQLSelect { projection: [SQLIdentifier("a"), SQLIdentifier("b"), SQLLiteralLong(123), SQLFunction { id: "myfunc", args: [SQLIdentifier("b")] }], relation: Some(SQLIdentifier("table_1")), selection: Some(SQLBinaryExpr { left: SQLBinaryExpr { left: SQLIdentifier("a"), op: Gt, right: SQLIdentifier("b") }, op: And, right: SQLBinaryExpr { left: SQLIdentifier("b"), op: Lt, right: SQLLiteralLong(100) } }), order_by: Some([SQLOrderBy { expr: SQLIdentifier("a"), asc: false }, SQLOrderBy { expr: SQLIdentifier("b"), asc: true }]), group_by: None, having: None, limit: None }

Design

This parser is implemented using the Pratt Parser design, which is a top-down operator-precedence parser.

I am a fan of this design pattern over parser generators for the following reasons:

Code is simple to write and can be concise and elegant (unfortunately this is far from true of the current implementation, but I hope to fix that using some macros)
Performance is generally better than that of code produced by parser generators
Debugging is much easier with hand-written code
It is far easier to extend the parser and add dialect-specific extensions than it is with a parser generator
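
To make the Pratt idea concrete, here is a minimal, self-contained sketch of a top-down operator-precedence expression parser. It is not this crate's code: the `Token`, `Expr`, and `Parser` types, the precedence numbers, and the whitespace-only tokenizer are all assumptions chosen for illustration.

```rust
// Illustrative sketch only: hypothetical types, not this crate's API.
#[derive(Debug, Clone, PartialEq)]
enum Token { Ident(String), Number(i64), Op(String), LParen, RParen }

#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    Number(i64),
    Binary { left: Box<Expr>, op: String, right: Box<Expr> },
}

/// Naive tokenizer: tokens must be separated by whitespace.
fn tokenize(input: &str) -> Vec<Token> {
    input.split_whitespace().map(|word| match word {
        "(" => Token::LParen,
        ")" => Token::RParen,
        "+" | "-" | "*" | "/" | "<" | ">" | "=" => Token::Op(word.to_string()),
        _ if word.eq_ignore_ascii_case("AND") || word.eq_ignore_ascii_case("OR") => {
            Token::Op(word.to_uppercase())
        }
        _ => match word.parse::<i64>() {
            Ok(n) => Token::Number(n),
            Err(_) => Token::Ident(word.to_string()),
        },
    }).collect()
}

struct Parser { tokens: Vec<Token>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Token> { self.tokens.get(self.pos) }

    fn advance(&mut self) -> Option<Token> {
        let token = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        token
    }

    /// Binding power of the upcoming infix operator; 0 means "not an operator".
    fn next_precedence(&self) -> u8 {
        match self.peek() {
            Some(Token::Op(op)) => match op.as_str() {
                "OR" => 5,
                "AND" => 10,
                "<" | ">" | "=" => 20,
                "+" | "-" => 30,
                "*" | "/" => 40,
                _ => 0,
            },
            _ => 0,
        }
    }

    /// Core Pratt loop: parse a prefix expression, then keep absorbing infix
    /// operators while they bind more tightly than `min_precedence`.
    fn parse_expr(&mut self, min_precedence: u8) -> Result<Expr, String> {
        let mut left = self.parse_prefix()?;
        loop {
            let precedence = self.next_precedence();
            if precedence <= min_precedence { break; }
            let op = match self.advance() {
                Some(Token::Op(op)) => op,
                other => return Err(format!("expected operator, found {:?}", other)),
            };
            // Recursing with the same precedence makes operators left-associative.
            let right = self.parse_expr(precedence)?;
            left = Expr::Binary { left: Box::new(left), op, right: Box::new(right) };
        }
        Ok(left)
    }

    fn parse_prefix(&mut self) -> Result<Expr, String> {
        match self.advance() {
            Some(Token::Ident(name)) => Ok(Expr::Ident(name)),
            Some(Token::Number(n)) => Ok(Expr::Number(n)),
            Some(Token::LParen) => {
                let inner = self.parse_expr(0)?;
                match self.advance() {
                    Some(Token::RParen) => Ok(inner),
                    other => Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => Err(format!("unexpected token {:?}", other)),
        }
    }
}

fn parse(input: &str) -> Result<Expr, String> {
    let mut parser = Parser { tokens: tokenize(input), pos: 0 };
    let expr = parser.parse_expr(0)?;
    if parser.peek().is_some() { return Err("trailing tokens".to_string()); }
    Ok(expr)
}
```

With this sketch, `parse("a > b AND b < 100")` groups the two comparisons under a single `AND` node, the same shape as the `selection` field in the output shown in the Example section. Adding an operator only requires a new entry in `next_precedence`, which is the kind of extensibility the list above refers to.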

Supporting custom SQL dialects

This is a work in progress, but I have started some notes on writing a custom SQL parser.
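
As a rough illustration of what a dialect hook might look like, the sketch below lets each dialect decide which characters may appear in identifiers and which words are keywords. The `SketchDialect` trait, both dialect structs, and `read_word` are hypothetical names invented for this example; they are not the crate's actual dialect API, which is described in the notes mentioned above.

```rust
// Hypothetical sketch: this trait is NOT the crate's actual dialect API.
trait SketchDialect {
    /// Words the lexer should treat as keywords (upper case).
    fn keywords(&self) -> Vec<&'static str>;
    /// Whether `ch` may start an identifier in this dialect.
    fn is_identifier_start(&self, ch: char) -> bool;
    /// Whether `ch` may continue an identifier; digits are allowed by default.
    fn is_identifier_part(&self, ch: char) -> bool {
        self.is_identifier_start(ch) || ch.is_ascii_digit()
    }
}

/// A plain dialect with a tiny keyword list (assumed for illustration).
struct PlainDialect;

impl SketchDialect for PlainDialect {
    fn keywords(&self) -> Vec<&'static str> { vec!["SELECT", "FROM", "WHERE"] }
    fn is_identifier_start(&self, ch: char) -> bool {
        ch.is_ascii_alphabetic() || ch == '_'
    }
}

/// A made-up vendor dialect: adds a TOP keyword and `@`-prefixed identifiers.
struct VendorDialect;

impl SketchDialect for VendorDialect {
    fn keywords(&self) -> Vec<&'static str> { vec!["SELECT", "FROM", "WHERE", "TOP"] }
    fn is_identifier_start(&self, ch: char) -> bool {
        ch.is_ascii_alphabetic() || ch == '_' || ch == '@'
    }
}

#[derive(Debug, PartialEq)]
enum Word { Keyword(String), Identifier(String) }

/// Read one word from the start of `input`, classifying it per the dialect.
fn read_word(dialect: &dyn SketchDialect, input: &str) -> Option<Word> {
    let mut chars = input.chars();
    let first = chars.next()?;
    if !dialect.is_identifier_start(first) { return None; }
    let mut word = String::new();
    word.push(first);
    for ch in chars {
        if !dialect.is_identifier_part(ch) { break; }
        word.push(ch);
    }
    let upper = word.to_uppercase();
    if dialect.keywords().iter().any(|k| *k == upper) {
        Some(Word::Keyword(upper))
    } else {
        Some(Word::Identifier(word))
    }
}
```

Under these assumptions the same lexing routine behaves differently per dialect: `read_word(&VendorDialect, "top 10")` yields a keyword, while `read_word(&PlainDialect, "top 10")` yields an ordinary identifier.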

Contributing

Contributors are welcome! Please see the current issues and feel free to file more!