mirror of
https://github.com/apache/datafusion-sqlparser-rs.git
synced 2025-08-30 18:57:21 +00:00

# Extensible SQL Lexer and Parser for Rust

[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[crates.io](https://crates.io/crates/sqlparser)
[Build status](https://travis-ci.org/andygrove/sqlparser-rs)
[Coverage](https://coveralls.io/github/andygrove/sqlparser-rs?branch=master)
[Gitter chat](https://gitter.im/sqlparser-rs/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

The goal of this project is to build a SQL lexer and parser capable of parsing SQL that conforms to the [ANSI SQL:2011](https://jakewheat.github.io/sql-overview/sql-2011-foundation-grammar.html#_5_1_sql_terminal_character) standard, while also making it easy to support custom dialects so that this crate can be used as a foundation for vendor-specific parsers.

This parser is currently used by the [DataFusion](https://github.com/andygrove/datafusion) query engine and [LocustDB](https://github.com/cswinter/LocustDB).

## Example

The current code can parse some simple SELECT and CREATE TABLE statements.

```rust
let sql = "SELECT a, b, 123, myfunc(b) \
           FROM table_1 \
           WHERE a > b AND b < 100 \
           ORDER BY a DESC, b";

let dialect = GenericSqlDialect {}; // or AnsiSqlDialect, or your own dialect ...

let ast = Parser::parse_sql(&dialect, sql.to_string()).unwrap();

println!("AST: {:?}", ast);
```

This outputs:

```rust
AST: [SQLSelect(SQLQuery { ctes: [], body: Select(SQLSelect { distinct: false, projection: [UnnamedExpression(SQLIdentifier("a")), UnnamedExpression(SQLIdentifier("b")), UnnamedExpression(SQLValue(Long(123))), UnnamedExpression(SQLFunction { name: SQLObjectName(["myfunc"]), args: [SQLIdentifier("b")], over: None })], relation: Some(Table { name: SQLObjectName(["table_1"]), alias: None }), joins: [], selection: Some(SQLBinaryExpr { left: SQLBinaryExpr { left: SQLIdentifier("a"), op: Gt, right: SQLIdentifier("b") }, op: And, right: SQLBinaryExpr { left: SQLIdentifier("b"), op: Lt, right: SQLValue(Long(100)) } }), group_by: None, having: None }), order_by: Some([SQLOrderByExpr { expr: SQLIdentifier("a"), asc: Some(false) }, SQLOrderByExpr { expr: SQLIdentifier("b"), asc: None }]), limit: None })]
```
## Design

This parser is implemented using the [Pratt Parser](https://tdop.github.io/) design, which is a top-down operator-precedence parser.

I prefer this design pattern to parser generators for the following reasons:

- Code is simple to write and can be concise and elegant (unfortunately, this is far from true of the current implementation, but I hope to fix that using some macros)
- Performance is generally better than that of code generated by parser generators
- Debugging is much easier with hand-written code
- It is far easier to extend and to add dialect-specific extensions than with a parser generator

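
To make the idea of top-down operator precedence concrete, here is a minimal, self-contained sketch of a Pratt-style expression parser. It is an illustration only and is not taken from this crate: every name in it (`Token`, `Expr`, `PrattParser`, `precedence`, `parse`, `render`) is hypothetical, the tokenizer simply splits on whitespace, and the precedence table is an assumption chosen for the demo.

```rust
// Illustrative sketch only: none of these names come from the sqlparser crate.

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Number(i64),
    Op(String),
}

#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    Number(i64),
    Binary { left: Box<Expr>, op: String, right: Box<Expr> },
}

/// Split on whitespace; good enough for a demo, not a real SQL lexer.
fn tokenize(input: &str) -> Vec<Token> {
    input
        .split_whitespace()
        .map(|w| {
            if let Ok(n) = w.parse::<i64>() {
                Token::Number(n)
            } else if matches!(w, "AND" | "OR" | ">" | "<" | "=" | "+" | "-" | "*" | "/") {
                Token::Op(w.to_string())
            } else {
                Token::Ident(w.to_string())
            }
        })
        .collect()
}

/// Binding power of each infix operator (assumed values); higher binds tighter.
fn precedence(op: &str) -> u8 {
    match op {
        "OR" => 1,
        "AND" => 2,
        ">" | "<" | "=" => 3,
        "+" | "-" => 4,
        "*" | "/" => 5,
        _ => 0,
    }
}

struct PrattParser {
    tokens: Vec<Token>,
    pos: usize,
}

impl PrattParser {
    fn new(input: &str) -> Self {
        PrattParser { tokens: tokenize(input), pos: 0 }
    }

    /// Parse an expression, consuming only operators that bind tighter than `min_prec`.
    fn parse_expr(&mut self, min_prec: u8) -> Result<Expr, String> {
        let mut left = self.parse_prefix()?;
        loop {
            let op = match self.tokens.get(self.pos) {
                Some(Token::Op(op)) if precedence(op) > min_prec => op.clone(),
                _ => break,
            };
            self.pos += 1;
            // Recursing with the operator's own precedence makes it left-associative.
            let right = self.parse_expr(precedence(&op))?;
            left = Expr::Binary { left: Box::new(left), op, right: Box::new(right) };
        }
        Ok(left)
    }

    /// Parse a single operand: an identifier or an integer literal.
    fn parse_prefix(&mut self) -> Result<Expr, String> {
        let tok = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        match tok {
            Some(Token::Ident(name)) => Ok(Expr::Ident(name)),
            Some(Token::Number(n)) => Ok(Expr::Number(n)),
            other => Err(format!("expected identifier or number, found {:?}", other)),
        }
    }
}

/// Parse a whole input string, rejecting any trailing tokens.
fn parse(input: &str) -> Result<Expr, String> {
    let mut parser = PrattParser::new(input);
    let expr = parser.parse_expr(0)?;
    if parser.pos < parser.tokens.len() {
        return Err(format!("unexpected token {:?}", parser.tokens[parser.pos]));
    }
    Ok(expr)
}

/// Render the tree fully parenthesized so the grouping is visible.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Ident(name) => name.clone(),
        Expr::Number(n) => n.to_string(),
        Expr::Binary { left, op, right } => {
            format!("({} {} {})", render(left), op, render(right))
        }
    }
}
```

With this sketch, `render(&parse("a > b AND b < 100").unwrap())` yields `((a > b) AND (b < 100))`, the same grouping as the `selection` field in the example AST above. Supporting another operator in this toy version means adding one entry to `precedence` (and to the tokenizer's operator list), which hints at why hand-written parsers of this style are easy to extend.
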
## Supporting custom SQL dialects

This is a work in progress, but I have started some notes on [writing a custom SQL parser](docs/custom_sql_parser.md).

## Contributing

Contributors are welcome! Please see the [current issues](https://github.com/andygrove/sqlparser-rs/issues) and feel free to file more!

Please run [cargo fmt](https://github.com/rust-lang/rustfmt#on-the-stable-toolchain) to ensure the code is properly formatted.