Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cst parser #38

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Add cst parser #38

wants to merge 5 commits into from

Conversation

forivall
Copy link

@forivall forivall commented Oct 8, 2015

This is a concrete syntax tree parser that (i think) will be used by jscs.

@hzoo
Copy link
Collaborator

hzoo commented Oct 9, 2015

Yep that's our next priority! And awesome I was goina make PR sooner or later but you already did! 🎉

Did you have any overflow issues or anything related to circular nodes - I was thinking you might need to filter out some properties? I did for babel-eslint https://github.com/fkling/esprima_ast_explorer/pull/27/files#diff-00c7bc55af0584b76c632e0d51e06195R44

@forivall
Copy link
Author

forivall commented Oct 9, 2015

There's currently an infinite recursion issue with retrieving the proper node for highlighting. I discovered that CST discards the position data, so no highlighting for now. I'll update the PR soon, and see what else I can do.

I didn't know the format myself, and thats why I started this :0

@fkling
Copy link
Owner

fkling commented Oct 9, 2015

This is great, thank you so much! I will have a look later today.

@hzoo
Copy link
Collaborator

hzoo commented Oct 9, 2015

Example screenshot

screen shot 2015-10-09 at 12 44 31 pm

if (typeof node.start !== 'undefined') {
return [node.start, node.end];
}
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do return node.range; I guess we need to do something different for whitespace highlighting since this will be the only one that will do that

@fkling
Copy link
Owner

fkling commented Oct 9, 2015

Mmh, that gets pretty crowded... how do you guys feel about removing the list of properties and maybe only show the number of properties (like for arrays).

This would be a separate diff, just trying to get an opinion. Is this list actually useful for anybody?

@hzoo
Copy link
Collaborator

hzoo commented Oct 9, 2015

Yeah if we could make some of those things options or filter them out like I mentioned above (the prevChild/lastChild, etc) it would be nice.

I think the most useful thing is just showing whitespace nodes since otherwise it's just the same as the babylon parser

@@ -3,6 +3,7 @@ function isInRange(range, pos) {
}

export default function getFocusPath(node, pos, parser, path) {
if (parser.id === 'cst') { return []; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove this if there's no infinite recursion issue now.

@mdevils
Copy link

mdevils commented Oct 9, 2015

I think in AST Explorer we would only need AST properties + childElements property. So we probably don’t need parentElement, previousElement, firstChild…, isExpression,.... Not sure how to filter out all of them though. We basically need to display properties which are implemented in the final class (excluding properties from Element and so on).

@forivall
Copy link
Author

forivall commented Oct 9, 2015

yeah, filtering out the extra AST properties will take a bit more refactoring since PropertyList doesn't yet know about which parser is being shown. You can see my failed attempt in this branch (i squashed it for sanity's sake)

@hzoo
Copy link
Collaborator

hzoo commented Oct 9, 2015

I used the traverse method from babel to remove all _paths properties https://github.com/fkling/esprima_ast_explorer/pull/27/files#diff-00c7bc55af0584b76c632e0d51e06195R40

@fkling
Copy link
Owner

fkling commented Oct 9, 2015

yeah, filtering out the extra AST properties will take a bit more refactoring since PropertyList doesn't yet know about which parser is being shown.

yeah, I wanted to do that, but didn't want to pass the parser down the tree. I'll look into this over the weekend.

Parsers should be able to specify which properties / nodes to ignore though.

@hzoo
Copy link
Collaborator

hzoo commented Oct 9, 2015

Yeah I was saying we could just add an option to the parser to not return those properties if that is possible/better.

@fkling
Copy link
Owner

fkling commented Oct 9, 2015

Yeah, I guess we could also introduce a helper function for removing unwanted properties from the AST. That would require to traverse the AST twice though (once for cleaning it and once for rendering it), even if part of the AST is not visible.

Having the renderer talk to the parser to check which property to render is probably more performant.

@forivall
Copy link
Author

forivall commented Oct 9, 2015

I tried removing the properties, however, in cst, node.range is a property that traverses up the tree to calculate the location, so removing any of firstchild, lastchild, etc. will break highlighting.

@mdevils
Copy link

mdevils commented Oct 9, 2015

@forivall I think correct way would be to build a copy of tree filtering properties (and getting calculated values) instead of mutating the original tree.

@forivall
Copy link
Author

forivall commented Oct 9, 2015

@mdevils meh. range is all that was really needed, and cst doesn't have a way to nicely clone the tree into plain objects yet. Hopefully you can add that 🎉

@forivall
Copy link
Author

forivall commented Oct 9, 2015

2015-09-21-meld3-example

@fkling
Copy link
Owner

fkling commented Oct 11, 2015

After looking at this more closely I also think we should first implement something that converts the tree to plain objects. @mdevils adding a toJSON method to each class should suffice, right? That way the tree would also be JSON serialize via JSON.stringify (well, I guess we'd have to leave out the cyclic properties).
If you are OK with this, I might be able to work on this next week.

@forivall
Copy link
Author

That's fine with me. I worked on this mainly so that I could understand cst's format (beyond the lovely comic sans diagram), so I'm in no hurry. Once cst nodes can be serialized, it'll be nice and simple to finish this up!

@mdevils
Copy link

mdevils commented Oct 11, 2015

@fkling @forivall @hzoo The problem with serialization is that CST by its purpose hard to serialize. Check out this discussion: estree/estree#41. Those guys cannot decide on JSON format for several months already because you cannot build CST JSON Tree without duplicates. Because you will have duplicates within Node children and AST properties.

Let me show you an example, BinaryExpression, A + B. I will specify pointers to the same object using numbers in comments:

/*1*/ BinaryExpression {
    left: /*2*/ Identifier {
        childElements: [ Token("Identifier", "A") ]
    },
    operator: "+",
    right: /*3*/ Identifier {
        childElements: [ Token("Identifier", "B") ]
    },
    childElements: [
        /*2*/ Identifier { childElements: [ Token("Identifier", "A") ] },
        Token("Whitespace", " "),
        Token("Punctuator", "+"),
        Token("Whitespace", " "),
        /*3*/ Identifier { childElements: [ Token("Identifier", "B") ] }
    ]
}

As you can see, the structure of CST is duplicative because CST inherits AST plus introduces parent-to-children structure to put AST nodes together with CST tokens.

That's why serializing to JSON is quite unclear and controversial point.

@markelog
Copy link

@mdevils estree ticket is pretty young, but that issue is flying around for years :-). I express the idea for json serialization earlier in that ticket - since cst is created with tokens and ast, it might be useful to do the opposite - convert cst to tokens and ast, like -

{"tokens": [..], "ast": {...}};

It would also solve the interoperability concern

@markelog
Copy link

Proposed format by @gibson042 - estree/estree#41 (comment) is also a possibility, but that would take some time to implement

@mdevils
Copy link

mdevils commented Oct 13, 2015

@markelog yes, but it breaks the profit of CST for AST Explorer. I mean it's not a problem to come up with some JSON format. The problem is to output readable and useful structure for AST Explorer.

@markelog
Copy link

Yeah, i mentioned possible format - #38 (comment).

But there is a timing issue

@mdevils
Copy link

mdevils commented Oct 13, 2015

@markelog That format is not really readable in terms of AST Explorer.

I think in this case we just need to build a bit normalised copy of the tree. Not even necessary JSON-compatible (we will have duplicative threads).

@forivall
Copy link
Author

At the very least, it would be great to have better introspection on CST nodes since the decorators are non enumerable inherited props, so it's cumbersome to enumerate the children

@gibson042
Copy link

@gibson042 feels ears burning…

…you cannot build CST JSON Tree without duplicates. Because you will have duplicates within Node children and AST properties.

Not to be rude, but that's precisely what estree/estree#41 (comment) (mentioned by @markelog) offers.

Let me show you an example, BinaryExpression, A + B

Let me show you how duplication can be minimized and often completely eliminated:

BinaryExpression {
  operator: "+",
  left: Identifier {
    name: "A",
    /* omitted without effect, but shown for completeness
    sourceElements: [
      { reference: "name" } // or more generally, { element: "Identifier", value: "A" }
    ]*/
  },
  right: Identifier {
    name: "B",
    /* omitted without effect, but shown for completeness
    sourceElements: [
      { reference: "name" } // or more generally, { element: "Identifier", value: "B" }
    ]*/
  },
  sourceElements: [
    { reference: "left" },
    { element: "WhiteSpace", value: " " },
    { reference: "operator" },
    { element: "WhiteSpace", value: " " },
    { reference: "right" }
  ]
}

That's why serializing to JSON is quite unclear and controversial point.

Portability has great value, and JSON is the obvious choice. It's just a question of how.

That format is not really readable in terms of AST Explorer.

Maybe I'm misunderstanding, but how could purely additive backwards-compatible changes break readability?

@mdevils
Copy link

mdevils commented Oct 15, 2015

@gibson042 are you sure you are talking about integration of CST (npm package) into AST Explorer? The proposed JSON format does not help to show CST in AST Explorer, because the structure is different from what CST produces.

@gibson042
Copy link

Sorry for the delay in responding, @mdevils.

are you sure you are talking about integration of CST (npm package) into AST Explorer? The proposed JSON format does not help to show CST in AST Explorer, because the structure is different from what CST produces.

My awareness of the differences is acute. But this is a data interchange problem, and estree/estree/issues/41 is a proposal to accommodate it within the existing standard data interchange format—estree/estree#41 (comment) in particular gets at the heart of the matter. There's even a comment in node-jscs restating it: "Without a CST being a json structure, you can't pipe it."

You can have whatever internal structure you want, but import/export calls for a (hopefully standardized) JSON-presentable format, free of cycles and multiple references. https://github.com/cst/cst actually has an opportunity to pave the way here, and I'm more than happy to pitch in on either side for advancing ESTree to perfect-fidelity source representation.

@mdevils
Copy link

mdevils commented Oct 27, 2015

@gibson042 Interesting idea.

I think I'm going to introduce configurable toJSON method where you can specify jsonType. One of the types would be what ASTExplorer needs (user-friendly format) and another one (export-type) which you suggest at least to see how it works. I think I should be able to test export-import format proposed by you soon.

@qfox
Copy link

qfox commented May 11, 2016

Hey there, guys. Is there any chance to see this published?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants