# Acorn

[Build status](https://travis-ci.org/ternjs/acorn)
[NPM version](https://www.npmjs.com/package/acorn)
[CDNJS](https://cdnjs.com/libraries/acorn)
[Author funding status](https://marijnhaverbeke.nl/fund/)

A tiny, fast JavaScript parser, written completely in JavaScript.

## Community

Acorn is open source software released under an
[MIT license](https://github.com/ternjs/acorn/blob/master/LICENSE).

You are welcome to
[report bugs](https://github.com/ternjs/acorn/issues) or create pull
requests on [GitHub](https://github.com/ternjs/acorn). For questions
and discussion, please use the
[Tern discussion forum](https://discuss.ternjs.net).

## Installation

The easiest way to install acorn is with [`npm`][npm].

[npm]: https://www.npmjs.com/

```sh
npm install acorn
```

Alternatively, download the source.

```sh
git clone https://github.com/ternjs/acorn.git
```

## Components

When run in a CommonJS (node.js) or AMD environment, exported values
appear in the interfaces exposed by the individual files, as usual.
When loaded in the browser (Acorn works in any JS-enabled browser more
recent than IE5) without any kind of module management, a single
global object `acorn` will be defined, and all the exported properties
will be added to that.

### Main parser

This is implemented in `dist/acorn.js`, and is what you get when you
`require("acorn")` in node.js.

**parse**`(input, options)` is used to parse a JavaScript program.
The `input` parameter is a string, `options` can be undefined or an
object setting some of the options listed below. The return value will
be an abstract syntax tree object as specified by the
[ESTree spec][estree].

When encountering a syntax error, the parser will raise a
`SyntaxError` object with a meaningful message. The error object will
have a `pos` property that indicates the character offset at which the
error occurred, and a `loc` object that contains a `{line, column}`
object referring to that same position.

[estree]: https://github.com/estree/estree

- **ecmaVersion**: Indicates the ECMAScript version to parse. Must be
  either 3, 5, 6 (2015), 7 (2016), or 8 (2017). This influences support for strict
  mode, the set of reserved words, and support for new syntax features.
  Default is 7.

  **NOTE**: Only 'stage 4' (finalized) ECMAScript features are being
  implemented by Acorn.

- **sourceType**: Indicates the mode the code should be parsed in. Can be
  either `"script"` or `"module"`. This influences global strict mode
  and parsing of `import` and `export` declarations.

- **onInsertedSemicolon**: If given a callback, that callback will be
  called whenever a missing semicolon is inserted by the parser. The
  callback will be given the character offset of the point where the
  semicolon is inserted as its argument, and if `locations` is on, also a
  `{line, column}` object representing this position.

- **onTrailingComma**: Like `onInsertedSemicolon`, but for trailing
  commas.

- **allowReserved**: If `false`, using a reserved word will generate
  an error. Defaults to `true` for `ecmaVersion` 3, `false` for higher
  versions. When given the value `"never"`, reserved words and
  keywords also cannot be used as property names (as in Internet
  Explorer's old parser).

- **allowReturnOutsideFunction**: By default, a return statement at
  the top level raises an error. Set this to `true` to accept such
  code.

- **allowImportExportEverywhere**: By default, `import` and `export`
  declarations can only appear at a program's top level. Setting this
  option to `true` allows them anywhere a statement is allowed.

- **allowHashBang**: When this is enabled (off by default), if the
  code starts with the characters `#!` (as in a shellscript), the
  first line will be treated as a comment.

- **locations**: When `true`, each node has a `loc` object attached
  with `start` and `end` subobjects, each of which contains the
  one-based line and zero-based column numbers in `{line, column}`
  form. Default is `false`.

- **onToken**: If a function is passed for this option, each found
  token will be passed to it in the same format as tokens returned from
  `tokenizer().getToken()`.

  If an array is passed, each found token is pushed to it.

  Note that you are not allowed to call the parser from the
  callback; doing so will corrupt its internal state.

- **onComment**: If a function is passed for this option, whenever a
  comment is encountered the function will be called with the
  following parameters:

  - `block`: `true` if the comment is a block comment, `false` if it
    is a line comment.
  - `text`: The content of the comment.
  - `start`: Character offset of the start of the comment.
  - `end`: Character offset of the end of the comment.

  When the `locations` option is on, the `{line, column}` locations
  of the comment’s start and end are passed as two additional
  parameters.

  If an array is passed for this option, each found comment is pushed
  to it as an object in Esprima format:

  ```javascript
  {
    "type": "Line" | "Block",
    "value": "comment text",
    "start": Number,
    "end": Number,
    // If `locations` option is on:
    "loc": {
      "start": {line: Number, column: Number},
      "end": {line: Number, column: Number}
    },
    // If `ranges` option is on:
    "range": [Number, Number]
  }
  ```

  Note that you are not allowed to call the parser from the
  callback; doing so will corrupt its internal state.

- **ranges**: Nodes have their start and end character offsets
  recorded in `start` and `end` properties (directly on the node,
  rather than the `loc` object, which holds line/column data). To also
  add a [semi-standardized][range] `range` property holding a
  `[start, end]` array with the same numbers, set the `ranges` option
  to `true`.

- **program**: It is possible to parse multiple files into a single
  AST by passing the tree produced by parsing the first file as the
  `program` option in subsequent parses. This will add the toplevel
  forms of the parsed file to the "Program" (top) node of an existing
  parse tree.

- **sourceFile**: When the `locations` option is `true`, you can pass
  this option to add a `source` attribute in every node’s `loc`
  object. Note that the contents of this option are not examined or
  processed in any way; you are free to use whatever format you
  choose.

- **directSourceFile**: Like `sourceFile`, but a `sourceFile` property
  will be added (regardless of the `locations` option) directly to the
  nodes, rather than the `loc` object.

- **preserveParens**: If this option is `true`, parenthesized expressions
  are represented by (non-standard) `ParenthesizedExpression` nodes
  that have a single `expression` property containing the expression
  inside parentheses.

[range]: https://bugzilla.mozilla.org/show_bug.cgi?id=745678

**parseExpressionAt**`(input, offset, options)` will parse a single
expression in a string, and return its AST. It will not complain if
there is more of the string left after the expression.

**getLineInfo**`(input, offset)` can be used to get a
`{line, column}` object for a given program string and character offset.

**tokenizer**`(input, options)` returns an object with a `getToken`
method that can be called repeatedly to get the next token, a
`{start, end, type, value}` object (with added `loc` property when the
`locations` option is enabled and `range` property when the `ranges`
option is enabled). When the token's type is `tokTypes.eof`, you
should stop calling the method, since it will keep returning that same
token forever.

In an ES6 environment, the returned object can be used like any other
protocol-compliant iterable:

```javascript
for (let token of acorn.tokenizer(str)) {
  // iterate over the tokens
}

// transform code to array of tokens:
var tokens = [...acorn.tokenizer(str)];
```

**tokTypes** holds an object mapping names to the token type objects
that end up in the `type` properties of tokens.

#### Note on using with [Escodegen][escodegen]

Escodegen supports generating comments from an AST when they are
attached in an Esprima-specific format. To produce the same format
with Acorn, consider the following example:

```javascript
var acorn = require("acorn"), escodegen = require("escodegen");

var comments = [], tokens = [];

var ast = acorn.parse('var x = 42; // answer', {
  // collect ranges for each node
  ranges: true,
  // collect comments in Esprima's format
  onComment: comments,
  // collect token ranges
  onToken: tokens
});

// attach comments using collected information
escodegen.attachComments(ast, comments, tokens);

// generate code
console.log(escodegen.generate(ast, {comment: true}));
// > 'var x = 42; // answer'
```

[escodegen]: https://github.com/estools/escodegen

### dist/acorn_loose.js ###

This file implements an error-tolerant parser. It exposes a single
function. The loose parser is accessible in node.js via `require("acorn/dist/acorn_loose")`.

**parse_dammit**`(input, options)` takes the same arguments and
returns the same syntax tree as the `parse` function in `acorn.js`,
but never raises an error, and will do its best to parse syntactically
invalid code in as meaningful a way as it can. It'll insert identifier
nodes with name `"✖"` as placeholders in places where it can't make
sense of the input. It depends on `acorn.js`, because it uses the same
tokenizer.

### dist/walk.js ###

Implements an abstract syntax tree walker. Will store its interface in
`acorn.walk` when loaded without a module system.

**simple**`(node, visitors, base, state)` does a 'simple' walk over
a tree. `node` should be the AST node to walk, and `visitors` an
object with properties whose names correspond to node types in the
[ESTree spec][estree]. The properties should contain functions
that will be called with the node object and, if applicable, the state
at that point. The last two arguments are optional. `base` is a walker
algorithm, and `state` is a start state. The default walker will
simply visit all statements and expressions and not produce a
meaningful state. (An example of a use of state is to track scope at
each point in the tree.)

**ancestor**`(node, visitors, base, state)` does a 'simple' walk over
a tree, building up an array of ancestor nodes (including the current node)
and passing the array to the callbacks as a third parameter.

**recursive**`(node, state, functions, base)` does a 'recursive'
walk, where the walker functions are responsible for continuing the
walk on the child nodes of their target node. `state` is the start
state, and `functions` should contain an object that maps node types
to walker functions. Such functions are called with `(node, state, c)`
arguments, and can cause the walk to continue on a sub-node by calling
the `c` argument on it with `(node, state)` arguments. The optional
`base` argument provides the fallback walker functions for node types
that aren't handled in the `functions` object. If not given, the
default walkers will be used.

**make**`(functions, base)` builds a new walker object by using the
walker functions in `functions` and filling in the missing ones by
taking defaults from `base`.

**findNodeAt**`(node, start, end, test, base, state)` tries to
locate a node in a tree at the given start and/or end offsets, which
satisfies the predicate `test`. `start` and `end` can be either `null`
(as wildcard) or a number. `test` may be a string (indicating a node
type) or a function that takes `(nodeType, node)` arguments and
returns a boolean indicating whether this node is interesting. `base`
and `state` are optional, and can be used to specify a custom walker.
Nodes are tested from inner to outer, so if two nodes match the
boundaries, the inner one will be preferred.

**findNodeAround**`(node, pos, test, base, state)` is a lot like
`findNodeAt`, but will match any node that exists 'around' (spanning)
the given position.

**findNodeAfter**`(node, pos, test, base, state)` is similar to
`findNodeAround`, but will match all nodes *after* the given position
(testing outer nodes before inner nodes).

## Command line interface

The `bin/acorn` utility can be used to parse a file from the command
line. It accepts as arguments its input file and the following
options:

- `--ecma3|--ecma5|--ecma6|--ecma7`: Sets the ECMAScript version to parse. Default is
  version 5.

- `--module`: Sets the parsing mode to `"module"`. The mode is `"script"` otherwise.

- `--locations`: Attaches a "loc" object to each node with "start" and
  "end" subobjects, each of which contains the one-based line and
  zero-based column numbers in `{line, column}` form.

- `--allow-hash-bang`: If the code starts with the characters `#!` (as in a shellscript), the first line will be treated as a comment.

- `--compact`: No whitespace is used in the AST output.

- `--silent`: Do not output the AST, just return the exit status.

- `--help`: Print the usage information and quit.

The utility spits out the syntax tree as JSON data.

## Build system

Acorn is written in ECMAScript 6, as a set of small modules, in the
project's `src` directory, and compiled down to bigger ECMAScript 3
files in `dist` using [Browserify](http://browserify.org) and
[Babel](http://babeljs.io/). If you are already using Babel, you can
consider including the modules directly.

The command-line test runner (`npm test`) uses the ES6 modules. The
browser-based test page (`test/index.html`) uses the compiled modules.
The `bin/build-acorn.js` script builds the latter from the former.

If you are working on Acorn, you'll probably want to try the code out
directly, without an intermediate build step. In your scripts, you can
register the Babel require shim like this:

    require("babel-core/register")

That will allow you to directly `require` the ES6 modules.

## Plugins

Acorn is designed to support plugins which, within reasonable
bounds, redefine the way the parser works. Plugins can add new token
types and new tokenizer contexts (if necessary), and extend methods in
the parser object. This is not a clean, elegant API: using it requires
an understanding of Acorn's internals, and plugins are likely to break
whenever those internals are significantly changed. But still, it is
_possible_, in this way, to create parsers for JavaScript dialects
without forking all of Acorn. And in principle it is even possible to
combine such plugins, so that if you have, for example, a plugin for
parsing types and a plugin for parsing JSX-style XML literals, you
could load them both and parse code with both JSX tags and types.

A plugin should register itself by adding a property to
`acorn.plugins` that holds an initialization function. When calling
`acorn.parse`, a `plugins` option can be passed, holding an object
mapping plugin names to configuration values (or just `true` for
plugins that don't take options). After the parser object has been
created, the initialization functions for the chosen plugins are
called with `(parser, configValue)` arguments. They are expected to
use the `parser.extend` method to extend parser methods. For example,
the `readToken` method could be extended like this:

```javascript
parser.extend("readToken", function(nextMethod) {
  return function(code) {
    console.log("Reading a token!")
    return nextMethod.call(this, code)
  }
})
```

The `nextMethod` argument passed to `extend`'s second argument is the
previous value of this method, and should usually be called through to
whenever the extended method does not handle the call itself.

Similarly, the loose parser allows plugins to register themselves via
`acorn.pluginsLoose`. The extension mechanism is the same as for the
normal parser:

```javascript
looseParser.extend("readToken", function(nextMethod) {
  return function() {
    console.log("Reading a token in the loose parser!")
    return nextMethod.call(this)
  }
})
```

### Existing plugins

 - [`acorn-jsx`](https://github.com/RReverser/acorn-jsx): Parse [Facebook JSX syntax extensions](https://github.com/facebook/jsx)
 - [`acorn-es7-plugin`](https://github.com/MatAtBread/acorn-es7-plugin/): Parse [async/await syntax proposal](https://github.com/tc39/ecmascript-asyncawait)
 - [`acorn-object-spread`](https://github.com/UXtemple/acorn-object-spread): Parse [object spread syntax proposal](https://github.com/sebmarkbage/ecmascript-rest-spread)
 - [`acorn-es7`](https://www.npmjs.com/package/acorn-es7): Parse [decorator syntax proposal](https://github.com/wycats/javascript-decorators)
 - [`acorn-objj`](https://www.npmjs.com/package/acorn-objj): [Objective-J](http://www.cappuccino-project.org/learn/objective-j.html) language parser built as Acorn plugin