Traversing an AST With Babel Traverse
Programmatically traverse an AST and visit arbitrary nodes
Get the project source code below, and follow along with the lesson material.
Download Project Source CodeTo set up the project on your local machine, please follow the directions provided in the README.md
file. If you run into any issues with running the project source code, then feel free to reach out to the author in the course's Discord channel.
This lesson preview is part of the Practical Abstract Syntax Trees course and can be unlocked immediately with a \newline Pro subscription or a single-time purchase. Already have access to this course? Log in here.
Get unlimited access to Practical Abstract Syntax Trees, plus 70+ \newline books, guides and courses with the \newline Pro subscription.
data:image/s3,"s3://crabby-images/52191/5219100bef23c51912ad7af1cacac71d94fed112" alt="Thumbnail for the \newline course Practical Abstract Syntax Trees"
[00:00 - 00:10] In the previous lesson, we wanted to log all of the numeric values in the source code. This lesson will continue to build on that, but explore how to do it in a more generic way.
[00:11 - 00:18] The resulting script may have looked to something like this. We had three print statements, one for each of the numeric values in the tree.
[00:19 - 00:30] However, what would happen if the order of operations was swapped in the source code to do the addition first, then multiplication? Changing the order of operations like this will result in a different AST.
[00:31 - 00:39] Previously, four and ten were multiplied first, then added to two. Now, two and four are added first, then multiplied with ten.
[00:40 - 00:49] We can see that this new equation produces a different tree from the previous lesson. Now, the new tree has a binary expression on the left and a numeric literal on the right.
[00:50 - 00:58] This time, the first binary expression node has another binary expression node on the left instead of the right. And a numeric literal node on the right instead of the left.
[00:59 - 01:05] Additionally, its operator is now a multiplication sign. Each type of node only defines properties relevant to what it represents.
[01:06 - 01:22] For example, a binary expression node has an operator property to indicate either addition or multiplication. However, a numeric literal node does not have an operator property since it represents only a number, but it does have a value property, which a binary expression does not.
[01:23 - 01:34] Each property can have several different types of values. For example, the numeric literal value property is always a number, whereas the binary expression left and right properties are always another node.
[01:35 - 01:48] As seen in this example, the left and right properties can be more than one type of node, in this case either numeric literal or another binary expression. The parser script made assumptions about the structure of the tree and the types of nodes.
[01:49 - 02:01] It assumed the first binary expression had another binary expression in the right property, and that that binary expression had a numeric literal on the left property. Looking at the new tree, we can see this is no longer the case.
[02:02 - 02:14] Now, the binary expression's right property will return a numeric literal node instead of another binary expression node. Then, trying to access the left property will return undefined since the left doesn't exist on a numeric literal node.
[02:15 - 02:24] And finally, trying to access the value of that undefined value will throw a runtime error. Now, let's switch to our terminal and run the script and see if that's the case .
[02:25 - 02:32] We can see that this line indeed did throw a runtime error as we expected. In order to fix this issue, we need to make our script more flexible.
[02:33 - 02:46] We need a way to generically visit a specific type of node in the tree wherever in the hierarchy it may be. We want our script to be more flexible in order to handle any AST, or in other words, any source code input.
[02:47 - 02:56] The Babel Traverse package offers the functionality to programmatically visit nodes anywhere in the tree. In our case, we want to visit the nodes that represent a numeric literal.
[02:57 - 03:16] We'll continue with the existing project from the previous lesson, and we'll get started by first installing the Babel Traverse package. Then, we'll switch back to our editor and create a new Traverse.js file that's a copy of parser.js with one new import for the Traverse function from the Babel Traverse package.
[03:17 - 03:31] The parse method is a named export from the Babel parser package, but the Tra verse function is the default export of the Babel Traverse package. And we can also remove the print statements.
[03:32 - 03:42] The Traverse function expects an AST as the first argument, which we've already generated with the Babel parser package. The second argument is an object of visitors.
[03:43 - 03:55] Each key in the object is the type of node to traverse, or visit. Each key's value is a function invoked for each node of that type in the AST, also known as a visitor.
[03:56 - 04:09] The Traverse function will visit each node in the AST. For each node, if there is a visitor defined for that node's type, it will be invoked with the node wrapped in a path object.
[04:10 - 04:21] The underlying AST node can be accessed via the node property on path. AST's consist only of nodes, but most tooling will pass a path object to visitor functions.
[04:22 - 04:31] It's a wrapper around the node it represents, with some additional helpful metadata, such as a reference to the parent node. Terminology like this is why AST's can seem complex.
[04:32 - 04:42] It's helpful to be familiar with these terms since most tooling and documentation will use them, but it's more important to understand the concepts . The actual code for this case is fairly simple.
[04:43 - 04:52] We want to find all numeric literal nodes in the AST, so we've defined a visitor for nodes of that type. Now let's run the script to see what this prints.
[04:53 - 05:04] The script printed all of the numeric literal value nodes, but we only want to print the values. We can switch back to our editor and update the visitor function for numeric literal nodes and only print the nodes value.
[05:05 - 05:14] Now running the script will produce the same output as the parser that manually printed these values. Unlike the previous script, it will handle any source code and won't throw a runtime error.
[05:15 - 05:24] The output order is maintained because traverse will visit nodes depth first. Additionally, traverse will invoke the visitor when traversing down the tree or on enter.
[05:25 - 05:37] It's also possible to define a visitor that will be invoked when visiting the node when going back up the tree or on exit. Let's quickly define a visitor for numeric literal nodes that will print on both enter and exit.
[05:38 - 06:08] [silence] Running the script again will now print twice for each node. In this case, all of the numeric literal nodes are leaf nodes or don't have any children.
[06:09 - 06:20] So the exit will be invoked immediately following the enter. Changing when we visit a node won't be necessary for any of the examples in this course, but it can be a helpful technique depending on the use case.
[06:21 - 06:34] We can revert to the original visitor function for now. Take a few moments to try updating the source code input with several different equations and notice how this approach can handle any potential input now.
[06:35 - 06:50] I'll quickly paste in a new equation and show how it prints all of the numeric literal nodes in the equation. After you've tried a few different equations, try updating the script to print only the operators instead of the numeric literals.
[06:51 - 07:01] The expected output for the original source code example would be the addition sign and then the multiplication sign. Make sure that a two can handle any equation.
[07:02 - 07:14] If at any point you get stuck, don't forget about AST Explorer. Once you've updated the script, take a moment to consider how you would have found all the numeric values or operators with a traditional search tool or find a replace tool.
[07:15 - 07:26] This example is simplified, but you might imagine doing something similar if you wanted to know all the usages of X in a codebase. The next module will begin to explore how these types of questions can be answered with ASTs.
[07:27 - 07:32] Before that, the next lesson will quickly cover one more tip for creating custom traversal scripts.