-
Notifications
You must be signed in to change notification settings - Fork 0
Parse whitespaces #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a156383 to
59e6ad5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this really works, it's a big step!
| // This allows escaping newlines everywhere, although this is only valid in | ||
| // preprocessor statements | ||
| /\s|\\\r?\n/, | ||
| $.whitespace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know how this interferes with the \s above, which also includes whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitespace does not consume all the tokens \s consumes, so the rule is still required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, could it be some whitespaces in \s that should be handled before with$.whitespace? I suppose not to but, just to know if that can happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would try replacing\s with [\f\t\n\v\r] and see if something changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitespacedoes not consume all the tokens\sconsumes, so the rule is still required
I'm not sure the kind of ambiguity this suggests... Why not? Why are both required?
59e6ad5 to
7c934b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we considering upstreaming some of this? Maybe starting with a hidden white-space node could be a good middle point.
Otherwise, this looks like a big departure from upstream.
|
|
||
| preproc_function_def: $ => seq( | ||
| preprocessor('define'), | ||
| field('name', $.identifier), | ||
| field('parameters', $.preproc_params), | ||
| field('value', optional($.preproc_arg)), | ||
| token.immediate(/\r?\n/), | ||
| ), | ||
|
|
||
| preproc_params: $ => seq( | ||
| token.immediate('('), commaSep(choice($.identifier, '...')), ')', | ||
| ), | ||
|
|
||
| preproc_call: $ => seq( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than merge, I would say simply remove, right? I don't see how the merge is happening.
| // This allows escaping newlines everywhere, although this is only valid in | ||
| // preprocessor statements | ||
| /\s|\\\r?\n/, | ||
| $.whitespace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitespacedoes not consume all the tokens\sconsumes, so the rule is still required
I'm not sure the kind of ambiguity this suggests... Why not? Why are both required?
src/scanner.c
Outdated
| static bool scan_whitespace(Scanner *scanner, TSLexer *lexer, | ||
| bool lex_whitespace) { | ||
| if (!iswblank(lexer->lookahead)) { | ||
| return false; | ||
| } | ||
|
|
||
| while (iswblank(lexer->lookahead)) { | ||
| if (lex_whitespace) { | ||
| advance(lexer); | ||
| continue; | ||
| } | ||
| skip(lexer); | ||
| } | ||
|
|
||
| if (!lex_whitespace) { | ||
| return false; | ||
| } | ||
|
|
||
| lexer->mark_end(lexer); | ||
| lexer->result_symbol = WHITESPACE; | ||
|
|
||
| return true; | ||
| } | ||
|
|
||
| static bool scan(Scanner *scanner, TSLexer *lexer, const bool *valid_symbols) { | ||
| // Consume any leading whitespace except newlines | ||
| if (scan_whitespace(scanner, lexer, valid_symbols[WHITESPACE])) { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed, instead of some simple grammar rule? Seems like something you are not telling here.
7c934b0 to
e946912
Compare
e946912 to
202d895
Compare
|
I will close this PR since the approach has changed significantly. We can continue the discussion in a new PR, focusing the review on the updated implementation. |
This PR adds whitespace parsing support to
tree-sitter-fortran. To make this work, I mergedpreproc_defandpreproc_function_def, since keeping them separate caused parsing conflicts. Given that macro support is already limited, losing the distinction between these two nodes should be acceptable for now.