Add cst parser #38

forivall · 2015-10-08T23:57:18Z

This is a concrete syntax tree parser that (i think) will be used by jscs.

hzoo · 2015-10-09T02:13:13Z

Yep that's our next priority! And awesome I was goina make PR sooner or later but you already did! 🎉

Did you have any overflow issues or anything related to circular nodes - I was thinking you might need to filter out some properties? I did for babel-eslint https://github.com/fkling/esprima_ast_explorer/pull/27/files#diff-00c7bc55af0584b76c632e0d51e06195R44

forivall · 2015-10-09T03:09:36Z

There's currently an infinite recursion issue with retrieving the proper node for highlighting. I discovered that CST discards the position data, so no highlighting for now. I'll update the PR soon, and see what else I can do.

I didn't know the format myself, and thats why I started this :0

fkling · 2015-10-09T16:32:56Z

This is great, thank you so much! I will have a look later today.

hzoo · 2015-10-09T16:46:13Z

Example screenshot

hzoo · 2015-10-09T16:55:36Z

src/parsers/cst.js

+    if (typeof node.start !== 'undefined') {
+      return [node.start, node.end];
+    }
+  },


You can do return node.range; I guess we need to do something different for whitespace highlighting since this will be the only one that will do that

fkling · 2015-10-09T16:57:43Z

Mmh, that gets pretty crowded... how do you guys feel about removing the list of properties and maybe only show the number of properties (like for arrays).

This would be a separate diff, just trying to get an opinion. Is this list actually useful for anybody?

hzoo · 2015-10-09T17:24:50Z

Yeah if we could make some of those things options or filter them out like I mentioned above (the prevChild/lastChild, etc) it would be nice.

I think the most useful thing is just showing whitespace nodes since otherwise it's just the same as the babylon parser

hzoo · 2015-10-09T17:26:04Z

src/getFocusPath.js

@@ -3,6 +3,7 @@ function isInRange(range, pos) {
 }

 export default function getFocusPath(node, pos, parser, path) {
+  if (parser.id === 'cst') { return []; }


can remove this if there's no infinite recursion issue now.

mdevils · 2015-10-09T18:00:18Z

I think in AST Explorer we would only need AST properties + childElements property. So we probably don’t need parentElement, previousElement, firstChild…, isExpression,.... Not sure how to filter out all of them though. We basically need to display properties which are implemented in the final class (excluding properties from Element and so on).

forivall · 2015-10-09T18:16:53Z

yeah, filtering out the extra AST properties will take a bit more refactoring since PropertyList doesn't yet know about which parser is being shown. You can see my failed attempt in this branch (i squashed it for sanity's sake)

hzoo · 2015-10-09T18:36:41Z

I used the traverse method from babel to remove all _paths properties https://github.com/fkling/esprima_ast_explorer/pull/27/files#diff-00c7bc55af0584b76c632e0d51e06195R40

fkling · 2015-10-09T18:50:12Z

yeah, filtering out the extra AST properties will take a bit more refactoring since PropertyList doesn't yet know about which parser is being shown.

yeah, I wanted to do that, but didn't want to pass the parser down the tree. I'll look into this over the weekend.

Parsers should be able to specify which properties / nodes to ignore though.

hzoo · 2015-10-09T18:51:15Z

Yeah I was saying we could just add an option to the parser to not return those properties if that is possible/better.

fkling · 2015-10-09T18:57:09Z

Yeah, I guess we could also introduce a helper function for removing unwanted properties from the AST. That would require to traverse the AST twice though (once for cleaning it and once for rendering it), even if part of the AST is not visible.

Having the renderer talk to the parser to check which property to render is probably more performant.

forivall · 2015-10-09T19:09:41Z

I tried removing the properties, however, in cst, node.range is a property that traverses up the tree to calculate the location, so removing any of firstchild, lastchild, etc. will break highlighting.

mdevils · 2015-10-09T19:16:33Z

@forivall I think correct way would be to build a copy of tree filtering properties (and getting calculated values) instead of mutating the original tree.

forivall · 2015-10-09T19:26:42Z

@mdevils meh. range is all that was really needed, and cst doesn't have a way to nicely clone the tree into plain objects yet. Hopefully you can add that 🎉

forivall · 2015-10-09T19:48:25Z

fkling · 2015-10-11T18:52:53Z

After looking at this more closely I also think we should first implement something that converts the tree to plain objects. @mdevils adding a toJSON method to each class should suffice, right? That way the tree would also be JSON serialize via JSON.stringify (well, I guess we'd have to leave out the cyclic properties).
If you are OK with this, I might be able to work on this next week.

forivall · 2015-10-11T21:41:38Z

That's fine with me. I worked on this mainly so that I could understand cst's format (beyond the lovely comic sans diagram), so I'm in no hurry. Once cst nodes can be serialized, it'll be nice and simple to finish this up!

mdevils · 2015-10-11T21:54:04Z

@fkling @forivall @hzoo The problem with serialization is that CST by its purpose hard to serialize. Check out this discussion: estree/estree#41. Those guys cannot decide on JSON format for several months already because you cannot build CST JSON Tree without duplicates. Because you will have duplicates within Node children and AST properties.

Let me show you an example, BinaryExpression, A + B. I will specify pointers to the same object using numbers in comments:

/*1*/ BinaryExpression {
    left: /*2*/ Identifier {
        childElements: [ Token("Identifier", "A") ]
    },
    operator: "+",
    right: /*3*/ Identifier {
        childElements: [ Token("Identifier", "B") ]
    },
    childElements: [
        /*2*/ Identifier { childElements: [ Token("Identifier", "A") ] },
        Token("Whitespace", " "),
        Token("Punctuator", "+"),
        Token("Whitespace", " "),
        /*3*/ Identifier { childElements: [ Token("Identifier", "B") ] }
    ]
}

As you can see, the structure of CST is duplicative because CST inherits AST plus introduces parent-to-children structure to put AST nodes together with CST tokens.

That's why serializing to JSON is quite unclear and controversial point.

markelog · 2015-10-13T15:19:10Z

@mdevils estree ticket is pretty young, but that issue is flying around for years :-). I express the idea for json serialization earlier in that ticket - since cst is created with tokens and ast, it might be useful to do the opposite - convert cst to tokens and ast, like -

{"tokens": [..], "ast": {...}};

It would also solve the interoperability concern

markelog · 2015-10-13T15:20:51Z

Proposed format by @gibson042 - estree/estree#41 (comment) is also a possibility, but that would take some time to implement

mdevils · 2015-10-13T15:21:16Z

@markelog yes, but it breaks the profit of CST for AST Explorer. I mean it's not a problem to come up with some JSON format. The problem is to output readable and useful structure for AST Explorer.

markelog · 2015-10-13T15:24:42Z

Yeah, i mentioned possible format - #38 (comment).

But there is a timing issue

mdevils · 2015-10-13T15:26:25Z

@markelog That format is not really readable in terms of AST Explorer.

I think in this case we just need to build a bit normalised copy of the tree. Not even necessary JSON-compatible (we will have duplicative threads).

forivall · 2015-10-13T16:29:24Z

At the very least, it would be great to have better introspection on CST nodes since the decorators are non enumerable inherited props, so it's cumbersome to enumerate the children

gibson042 · 2015-10-15T04:15:47Z

@gibson042 feels ears burning…

…you cannot build CST JSON Tree without duplicates. Because you will have duplicates within Node children and AST properties.

Not to be rude, but that's precisely what estree/estree#41 (comment) (mentioned by @markelog) offers.

Let me show you an example, BinaryExpression, A + B

Let me show you how duplication can be minimized and often completely eliminated:

BinaryExpression {
  operator: "+",
  left: Identifier {
    name: "A",
    /* omitted without effect, but shown for completeness
    sourceElements: [
      { reference: "name" } // or more generally, { element: "Identifier", value: "A" }
    ]*/
  },
  right: Identifier {
    name: "B",
    /* omitted without effect, but shown for completeness
    sourceElements: [
      { reference: "name" } // or more generally, { element: "Identifier", value: "B" }
    ]*/
  },
  sourceElements: [
    { reference: "left" },
    { element: "WhiteSpace", value: " " },
    { reference: "operator" },
    { element: "WhiteSpace", value: " " },
    { reference: "right" }
  ]
}

That's why serializing to JSON is quite unclear and controversial point.

Portability has great value, and JSON is the obvious choice. It's just a question of how.

That format is not really readable in terms of AST Explorer.

Maybe I'm misunderstanding, but how could purely additive backwards-compatible changes break readability?

mdevils · 2015-10-15T08:26:57Z

@gibson042 are you sure you are talking about integration of CST (npm package) into AST Explorer? The proposed JSON format does not help to show CST in AST Explorer, because the structure is different from what CST produces.

gibson042 · 2015-10-22T03:33:08Z

Sorry for the delay in responding, @mdevils.

are you sure you are talking about integration of CST (npm package) into AST Explorer? The proposed JSON format does not help to show CST in AST Explorer, because the structure is different from what CST produces.

My awareness of the differences is acute. But this is a data interchange problem, and estree/estree/issues/41 is a proposal to accommodate it within the existing standard data interchange format—estree/estree#41 (comment) in particular gets at the heart of the matter. There's even a comment in node-jscs restating it: "Without a CST being a json structure, you can't pipe it."

You can have whatever internal structure you want, but import/export calls for a (hopefully standardized) JSON-presentable format, free of cycles and multiple references. https://github.com/cst/cst actually has an opportunity to pave the way here, and I'm more than happy to pitch in on either side for advancing ESTree to perfect-fidelity source representation.

mdevils · 2015-10-27T10:15:34Z

@gibson042 Interesting idea.

I think I'm going to introduce configurable toJSON method where you can specify jsonType. One of the types would be what ASTExplorer needs (user-friendly format) and another one (export-type) which you suggest at least to see how it works. I think I should be able to test export-import format proposed by you soon.

qfox · 2016-05-11T14:54:06Z

Hey there, guys. Is there any chance to see this published?

Add cst parser

3ad4657

Temporarily fix infinite recursion error

e3f813b

hzoo reviewed Oct 9, 2015
View reviewed changes

Properly return range

90ad8c0

hzoo reviewed Oct 9, 2015
View reviewed changes

Fix getFocusPath for cst

41a2f9e

Store calcuated range and remove extra props

1209f13

hzoo mentioned this pull request Oct 16, 2015

Refresh homepage jquery/esprima#1311

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add cst parser #38

Add cst parser #38

forivall commented Oct 8, 2015

hzoo commented Oct 9, 2015

forivall commented Oct 9, 2015

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

hzoo Oct 9, 2015

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

hzoo Oct 9, 2015

mdevils commented Oct 9, 2015

forivall commented Oct 9, 2015

hzoo commented Oct 9, 2015

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

fkling commented Oct 9, 2015

forivall commented Oct 9, 2015

mdevils commented Oct 9, 2015

forivall commented Oct 9, 2015

forivall commented Oct 9, 2015

fkling commented Oct 11, 2015

forivall commented Oct 11, 2015

mdevils commented Oct 11, 2015

markelog commented Oct 13, 2015

markelog commented Oct 13, 2015

mdevils commented Oct 13, 2015

markelog commented Oct 13, 2015

mdevils commented Oct 13, 2015

forivall commented Oct 13, 2015

gibson042 commented Oct 15, 2015

mdevils commented Oct 15, 2015

gibson042 commented Oct 22, 2015

mdevils commented Oct 27, 2015

qfox commented May 11, 2016

Add cst parser #38

Are you sure you want to change the base?

Add cst parser #38

Conversation

forivall commented Oct 8, 2015

hzoo commented Oct 9, 2015

forivall commented Oct 9, 2015

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

hzoo Oct 9, 2015

Choose a reason for hiding this comment

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

hzoo Oct 9, 2015

Choose a reason for hiding this comment

mdevils commented Oct 9, 2015

forivall commented Oct 9, 2015

hzoo commented Oct 9, 2015

fkling commented Oct 9, 2015

hzoo commented Oct 9, 2015

fkling commented Oct 9, 2015

forivall commented Oct 9, 2015

mdevils commented Oct 9, 2015

forivall commented Oct 9, 2015

forivall commented Oct 9, 2015

fkling commented Oct 11, 2015

forivall commented Oct 11, 2015

mdevils commented Oct 11, 2015

markelog commented Oct 13, 2015

markelog commented Oct 13, 2015

mdevils commented Oct 13, 2015

markelog commented Oct 13, 2015

mdevils commented Oct 13, 2015

forivall commented Oct 13, 2015

gibson042 commented Oct 15, 2015

mdevils commented Oct 15, 2015

gibson042 commented Oct 22, 2015

mdevils commented Oct 27, 2015

qfox commented May 11, 2016