internal: replace TreeSink with a data structure

The general theme of this is to make the parser a better independent
library.

The specific thing we do here is replace the callback-based TreeSink with
a data structure. That is, rather than calling user-provided tree
construction methods, the parser now spits out a very bare-bones tree,
effectively a log of a DFS traversal.

This makes the parser usable without any *specific* tree sink, and allows
us to, e.g., move tests into this crate.

Now, it's also true that this is a distinction without a difference, as
the old and the new interface are equivalent in expressiveness. Still,
this new thing seems somewhat simpler. But yeah, I admit I don't have a
super strong motivation here, just a hunch that this is better.
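
To make "a log of a DFS traversal" concrete, here is a minimal sketch of how a consumer might replay such a log into a nested tree. Everything in it is hypothetical and for illustration only: `Step`, `Tree` and `build` are not names from this commit, kinds are plain strings rather than `SyntaxKind`, and error steps are left out.

```rust
// Hypothetical stand-in for the parser's output steps; `kind` is a plain
// string here instead of the real `SyntaxKind`.
#[derive(Debug, Clone, PartialEq)]
enum Step {
    Token { kind: &'static str },
    EnterNode { kind: &'static str },
    LeaveNode,
}

// Hypothetical tree type that a consumer might want to build.
#[derive(Debug, PartialEq)]
enum Tree {
    Token(&'static str),
    Node { kind: &'static str, children: Vec<Tree> },
}

/// Rebuilds a tree from a DFS log by keeping a stack of open nodes.
/// Returns `None` if the log is unbalanced or a token appears outside a node.
fn build(steps: &[Step]) -> Option<Tree> {
    // Each stack entry is an open node: its kind and the children seen so far.
    let mut stack: Vec<(&'static str, Vec<Tree>)> = Vec::new();
    let mut root = None;
    for step in steps {
        match step {
            Step::EnterNode { kind } => stack.push((*kind, Vec::new())),
            Step::Token { kind } => stack.last_mut()?.1.push(Tree::Token(*kind)),
            Step::LeaveNode => {
                // Close the innermost open node and attach it to its parent.
                let (kind, children) = stack.pop()?;
                let node = Tree::Node { kind, children };
                match stack.last_mut() {
                    Some(parent) => parent.1.push(node),
                    None => root = Some(node),
                }
            }
        }
    }
    if stack.is_empty() { root } else { None }
}
```

Under these assumptions the consumer owns the tree representation entirely; the parser only has to emit the flat sequence of steps.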
Aleksey Kladov 2021-12-19 17:36:23 +03:00
parent 2f63558dc5
commit d0d05075ed
10 changed files with 172 additions and 110 deletions

@@ -0,0 +1,67 @@
//! TODO
use crate::SyntaxKind;

/// Output of the parser.
#[derive(Default)]
pub struct TreeTraversal {
    /// 32-bit encoding of events. If LSB is zero, the remaining bits are an
    /// index into the error vector. Otherwise, it's one of the three other
    /// variants, with data encoded as
    ///
    ///     |16 bit kind|8 bit n_raw_tokens|4 bit tag|4 bit leftover|
    ///
    event: Vec<u32>,
    error: Vec<String>,
}

pub enum TraversalStep<'a> {
    Token { kind: SyntaxKind, n_raw_tokens: u8 },
    EnterNode { kind: SyntaxKind },
    LeaveNode,
    Error { msg: &'a str },
}

impl TreeTraversal {
    pub fn iter(&self) -> impl Iterator<Item = TraversalStep<'_>> {
        self.event.iter().map(|&event| {
            // LSB == 0 marks an error; the remaining bits index `self.error`.
            if event & 0b1 == 0 {
                return TraversalStep::Error { msg: self.error[(event as usize) >> 1].as_str() };
            }
            let tag = ((event & 0x0000_00F0) >> 4) as u8;
            match tag {
                0 => {
                    let kind: SyntaxKind = (((event & 0xFFFF_0000) >> 16) as u16).into();
                    let n_raw_tokens = ((event & 0x0000_FF00) >> 8) as u8;
                    TraversalStep::Token { kind, n_raw_tokens }
                }
                1 => {
                    let kind: SyntaxKind = (((event & 0xFFFF_0000) >> 16) as u16).into();
                    TraversalStep::EnterNode { kind }
                }
                2 => TraversalStep::LeaveNode,
                _ => unreachable!(),
            }
        })
    }

    pub(crate) fn token(&mut self, kind: SyntaxKind, n_tokens: u8) {
        // Tag 0 in bits 4..8, LSB set to mark a non-error event.
        let e = ((kind as u16 as u32) << 16) | ((n_tokens as u32) << 8) | (0 << 4) | 1;
        self.event.push(e)
    }

    pub(crate) fn enter_node(&mut self, kind: SyntaxKind) {
        let e = ((kind as u16 as u32) << 16) | (1 << 4) | 1;
        self.event.push(e)
    }

    pub(crate) fn leave_node(&mut self) {
        let e = 2 << 4 | 1;
        self.event.push(e)
    }

    pub(crate) fn error(&mut self, error: String) {
        let idx = self.error.len();
        self.error.push(error);
        // Shift left so the LSB stays zero, which marks the event as an error.
        let e = (idx as u32) << 1;
        self.event.push(e);
    }
}
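
The bit layout described in the doc comment on `event` can be checked in isolation. The following is a standalone sketch, not part of the commit: `Decoded`, `encode_*` and `decode` are hypothetical names, and `kind` is a raw `u16` rather than `SyntaxKind`, so the snippet compiles without the rest of the crate.

```rust
/// Hypothetical decoded form of one 32-bit event word; `kind` is a raw `u16`
/// here rather than the real `SyntaxKind`.
#[derive(Debug, PartialEq)]
enum Decoded {
    Token { kind: u16, n_raw_tokens: u8 },
    EnterNode { kind: u16 },
    LeaveNode,
    Error { idx: usize },
}

// Layout assumed from the doc comment on `event`:
// |16 bit kind|8 bit n_raw_tokens|4 bit tag|4 bit leftover|, LSB = 1 for non-errors.

/// Tag 0 (token): kind in the top 16 bits, token count in bits 8..16.
fn encode_token(kind: u16, n_raw_tokens: u8) -> u32 {
    ((kind as u32) << 16) | ((n_raw_tokens as u32) << 8) | 1
}

/// Tag 1 (enter node): kind in the top 16 bits.
fn encode_enter(kind: u16) -> u32 {
    ((kind as u32) << 16) | (1 << 4) | 1
}

/// Tag 2 (leave node): carries no payload.
fn encode_leave() -> u32 {
    (2 << 4) | 1
}

/// Errors keep the LSB at zero and store the index in the remaining 31 bits.
fn encode_error(idx: usize) -> u32 {
    (idx as u32) << 1
}

fn decode(event: u32) -> Decoded {
    if event & 1 == 0 {
        return Decoded::Error { idx: (event >> 1) as usize };
    }
    let kind = (event >> 16) as u16;
    match (event >> 4) & 0xF {
        // `as u8` keeps only the low 8 bits, i.e. bits 8..16 of the event.
        0 => Decoded::Token { kind, n_raw_tokens: (event >> 8) as u8 },
        1 => Decoded::EnterNode { kind },
        2 => Decoded::LeaveNode,
        tag => unreachable!("unknown tag {}", tag),
    }
}
```

For example, `encode_token(0x1234, 3)` yields `0x1234_0301`, and `decode` maps it back to `Token { kind: 0x1234, n_raw_tokens: 3 }`. One consequence of this layout is that an error index has to fit in 31 bits, since the shift drops the top bit of a `u32`.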