lexer.coffee


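The CoffeeScript Lexer. It uses a series of token-matching regular expressions to attempt matches against the beginning of the source code. When a match is found, a token is produced, the match is consumed, and we start again. Tokens are in the form: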
```
[tag, value, locationData]
```
where locationData is {first_line, first_column, last_line, last_column}, a format that can be fed directly into Jison. These tokens are read by Jison in the parser.lexer function defined in coffee-script.coffee.
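
As a minimal illustration of this format, the sketch below builds one token by hand and destructures it. The identifier x and the zero-based positions are made-up sample values, not output captured from the lexer.

```coffeescript
# A hand-written token for a one-character identifier at the start of a file.
# Sample values only; last_column is inclusive, so a one-character token
# has last_column equal to first_column.
locationData = {first_line: 0, first_column: 0, last_line: 0, last_column: 0}
token = ['IDENTIFIER', 'x', locationData]

# Destructure it in the same [tag, value, locationData] order.
[tag, value, {first_line, last_column}] = token
console.log tag, value, first_line, last_column
```

Keeping tokens as plain arrays is what lets later stages rewrite a tag in place with an assignment such as token[0] = 'PROPERTY'.
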
```
{Rewriter, INVERSES} = require './rewriter'
```

Import the helpers we need.

{count, starts, compact, repeat, invertLiterate,
locationDataToString,  throwSyntaxError} = require './helpers'

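The Lexer Class

The Lexer class reads a stream of CoffeeScript and divides it into tagged tokens. Some potential ambiguity in the grammar is avoided by pushing extra smarts into the Lexer.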
```
exports.Lexer = class Lexer
```

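tokenize is the Lexer's main method. It scans by attempting to match tokens one at a time, using a regular expression anchored at the start of the remaining code, or a custom recursive token-matching method (for interpolations). Once the next token has been recorded, we move forward in the code past that token and begin again.

Each tokenizing method is responsible for returning the number of characters it has consumed.

Before returning the token stream, run it through the Rewriter.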

  tokenize: (code, opts = {}) ->
    @literate   = opts.literate  # Are we lexing literate CoffeeScript?
    @indent     = 0              # The current indentation level.
    @baseIndent = 0              # The overall minimum indentation level
    @indebt     = 0              # The over-indentation at the current level.
    @outdebt    = 0              # The under-outdentation at the current level.
    @indents    = []             # The stack of all current indentation levels.
    @ends       = []             # The stack for pairing up tokens.
    @tokens     = []             # Stream of parsed tokens in the form `['TYPE', value, location data]`.
    @seenFor    = no             # Used to recognize FORIN, FOROF and FORFROM tokens.
    @seenImport = no             # Used to recognize IMPORT FROM? AS? tokens.
    @seenExport = no             # Used to recognize EXPORT FROM? AS? tokens.
    @importSpecifierList = no    # Used to identify when in an IMPORT {...} FROM? ...
    @exportSpecifierList = no    # Used to identify when in an EXPORT {...} FROM? ...

    @chunkLine =
      opts.line or 0             # The start line for the current @chunk.
    @chunkColumn =
      opts.column or 0           # The start column of the current @chunk.
    code = @clean code           # The stripped, cleaned original source code.

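At every position, run through this list of attempted matches, short-circuiting if any of them succeeds. Their order determines precedence: @literalToken is the fallback catch-all.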

    i = 0
    while @chunk = code[i..]
      consumed = \
           @identifierToken() or
           @commentToken()    or
           @whitespaceToken() or
           @lineToken()       or
           @stringToken()     or
           @numberToken()     or
           @regexToken()      or
           @jsToken()         or
           @literalToken()

Update the position.

      [@chunkLine, @chunkColumn] = @getLineAndColumnFromChunk consumed

      i += consumed

      return {@tokens, index: i} if opts.untilBalanced and @ends.length is 0

    @closeIndentation()
    @error "missing #{end.tag}", end.origin[2] if end = @ends.pop()
    return @tokens if opts.rewrite is off
    (new Rewriter).rewrite @tokens
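
The scanning strategy used by tokenize can be sketched in isolation. The following is a simplified, hypothetical mini-scanner, not the lexer's own code: RULES and scan are invented names, the three regexes are stand-ins, and tokens carry no location data. It only shows the ordered, first-match-wins loop with a single-character fallback.

```coffeescript
# Hypothetical rule table: earlier entries take precedence.
RULES = [
  ['IDENTIFIER', /^[A-Za-z_$][\w$]*/]
  ['NUMBER',     /^\d+/]
  ['WHITESPACE', /^[^\n\S]+/]
]

scan = (code) ->
  tokens = []
  i = 0
  # An empty remaining chunk is falsy, which ends the loop.
  while chunk = code[i..]
    found = null
    for [tag, regex] in RULES
      if match = regex.exec chunk
        found = [tag, match[0]]
        break
    # Fallback catch-all: a single character is its own tag.
    found ?= [chunk.charAt(0), chunk.charAt(0)]
    tokens.push found unless found[0] is 'WHITESPACE'
    i += found[1].length      # move past what was consumed
  tokens

console.log scan 'x1 = 42'
```

Because the fallback always consumes one character, the loop is guaranteed to make progress on any input.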

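Preprocess the code to remove leading and trailing whitespace, carriage returns, and so on. If we are lexing literate CoffeeScript, strip the surrounding Markdown by removing every line that is not indented by at least four spaces or a tab.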

  clean: (code) ->
    code = code.slice(1) if code.charCodeAt(0) is BOM
    code = code.replace(/\r/g, '').replace TRAILING_SPACES, ''
    if WHITESPACE.test code
      code = "\n#{code}"
      @chunkLine--
    code = invertLiterate code if @literate
    code

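Tokenizers

Matches identifying literals: variables, keywords, method names, etc. Checks that JavaScript reserved words are not being used as identifiers. Because CoffeeScript reserves a handful of keywords that are allowed in JavaScript, we are careful not to tag them as keywords when they are referenced as property names, so you can still write jQuery.is() even though is otherwise means ===.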
```
  identifierToken: ->
    return 0 unless match = IDENTIFIER.exec @chunk
    [input, id, colon] = match
```

Preserve the length of id for location data.

    idLength = id.length
    poppedToken = undefined

    if id is 'own' and @tag() is 'FOR'
      @token 'OWN', id
      return id.length
    if id is 'from' and @tag() is 'YIELD'
      @token 'FROM', id
      return id.length
    if id is 'as' and @seenImport
      if @value() is '*'
        @tokens[@tokens.length - 1][0] = 'IMPORT_ALL'
      else if @value() in COFFEE_KEYWORDS
        @tokens[@tokens.length - 1][0] = 'IDENTIFIER'
      if @tag() in ['DEFAULT', 'IMPORT_ALL', 'IDENTIFIER']
        @token 'AS', id
        return id.length
    if id is 'as' and @seenExport and @tag() in ['IDENTIFIER', 'DEFAULT']
      @token 'AS', id
      return id.length
    if id is 'default' and @seenExport and @tag() in ['EXPORT', 'AS']
      @token 'DEFAULT', id
      return id.length

    [..., prev] = @tokens

    tag =
      if colon or prev? and
         (prev[0] in ['.', '?.', '::', '?::'] or
         not prev.spaced and prev[0] is '@')
        'PROPERTY'
      else
        'IDENTIFIER'

    if tag is 'IDENTIFIER' and (id in JS_KEYWORDS or id in COFFEE_KEYWORDS) and
       not (@exportSpecifierList and id in COFFEE_KEYWORDS)
      tag = id.toUpperCase()
      if tag is 'WHEN' and @tag() in LINE_BREAK
        tag = 'LEADING_WHEN'
      else if tag is 'FOR'
        @seenFor = yes
      else if tag is 'UNLESS'
        tag = 'IF'
      else if tag is 'IMPORT'
        @seenImport = yes
      else if tag is 'EXPORT'
        @seenExport = yes
      else if tag in UNARY
        tag = 'UNARY'
      else if tag in RELATION
        if tag isnt 'INSTANCEOF' and @seenFor
          tag = 'FOR' + tag
          @seenFor = no
        else
          tag = 'RELATION'
          if @value() is '!'
            poppedToken = @tokens.pop()
            id = '!' + id
    else if tag is 'IDENTIFIER' and @seenFor and id is 'from' and
       isForFrom(prev)
      tag = 'FORFROM'
      @seenFor = no

    if tag is 'IDENTIFIER' and id in RESERVED
      @error "reserved word '#{id}'", length: id.length

    unless tag is 'PROPERTY'
      if id in COFFEE_ALIASES
        alias = id
        id = COFFEE_ALIAS_MAP[id]
      tag = switch id
        when '!'                 then 'UNARY'
        when '==', '!='          then 'COMPARE'
        when 'true', 'false'     then 'BOOL'
        when 'break', 'continue', \
             'debugger'          then 'STATEMENT'
        when '&&', '||'          then id
        else  tag

    tagToken = @token tag, id, 0, idLength
    tagToken.origin = [tag, alias, tagToken[2]] if alias
    if poppedToken
      [tagToken[2].first_line, tagToken[2].first_column] =
        [poppedToken[2].first_line, poppedToken[2].first_column]
    if colon
      colonOffset = input.lastIndexOf ':'
      @token ':', ':', colonOffset, colon.length

    input.length

Matches numbers, including decimals, hex, and exponential notation. Be careful not to interfere with ranges in progress.

  numberToken: ->
    return 0 unless match = NUMBER.exec @chunk

    number = match[0]
    lexedLength = number.length

    switch
      when /^0[BOX]/.test number
        @error "radix prefix in '#{number}' must be lowercase", offset: 1
      when /^(?!0x).*E/.test number
        @error "exponential notation in '#{number}' must be indicated with a lowercase 'e'",
          offset: number.indexOf('E')
      when /^0\d*[89]/.test number
        @error "decimal literal '#{number}' must not be prefixed with '0'", length: lexedLength
      when /^0\d+/.test number
        @error "octal literal '#{number}' must be prefixed with '0o'", length: lexedLength

    base = switch number.charAt 1
      when 'b' then 2
      when 'o' then 8
      when 'x' then 16
      else null
    numberValue = if base? then parseInt(number[2..], base) else parseFloat(number)
    if number.charAt(1) in ['b', 'o']
      number = "0x#{numberValue.toString 16}"

    tag = if numberValue is Infinity then 'INFINITY' else 'NUMBER'
    @token tag, number, 0, lexedLength
    lexedLength
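
The base handling in numberToken can be tried on its own. The sketch below is a hypothetical helper (lexNumber is an invented name) that copies the NUMBER regex from this file and mirrors only the conversion step: binary and octal literals are rewritten as hex, everything else is left alone. It assumes lowercase prefixes and omits the error checks and token creation.

```coffeescript
# Same pattern as the NUMBER constant, written on one line.
NUMBER = /^0b[01]+|^0o[0-7]+|^0x[\da-f]+|^\d*\.?\d+(?:e[+-]?\d+)?/i

# Returns [emittedText, consumedLength], or null when nothing matches.
lexNumber = (chunk) ->
  return null unless match = NUMBER.exec chunk
  number = match[0]
  base = switch number.charAt 1
    when 'b' then 2
    when 'o' then 8
    when 'x' then 16
    else null
  value = if base? then parseInt(number[2..], base) else parseFloat(number)
  # Binary and octal literals are emitted as hex.
  number = "0x#{value.toString 16}" if number.charAt(1) in ['b', 'o']
  [number, match[0].length]

console.log lexNumber '0b101 + 1'   # [ '0x5', 5 ]
console.log lexNumber '0o17'        # [ '0xf', 4 ]
console.log lexNumber '1e3'         # [ '1e3', 3 ]
```

Note that the consumed length always comes from the original match, not from the rewritten text, which keeps location data aligned with the source.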

Matches strings, including multi-line strings, as well as heredocs, with or without interpolation.

  stringToken: ->
    [quote] = STRING_START.exec(@chunk) || []
    return 0 unless quote

If the preceding token is from and this is an import or export statement, tag that from properly.

    if @tokens.length and @value() is 'from' and (@seenImport or @seenExport)
      @tokens[@tokens.length - 1][0] = 'FROM'

    regex = switch quote
      when "'"   then STRING_SINGLE
      when '"'   then STRING_DOUBLE
      when "'''" then HEREDOC_SINGLE
      when '"""' then HEREDOC_DOUBLE
    heredoc = quote.length is 3

    {tokens, index: end} = @matchWithInterpolations regex, quote
    $ = tokens.length - 1

    delimiter = quote.charAt(0)
    if heredoc

Find the smallest indentation. It will be removed from all lines later.

      indent = null
      doc = (token[1] for token, i in tokens when token[0] is 'NEOSTRING').join '#{}'
      while match = HEREDOC_INDENT.exec doc
        attempt = match[1]
        indent = attempt if indent is null or 0 < attempt.length < indent.length
      indentRegex = /// \n#{indent} ///g if indent
      @mergeInterpolationTokens tokens, {delimiter}, (value, i) =>
        value = @formatString value, delimiter: quote
        value = value.replace indentRegex, '\n' if indentRegex
        value = value.replace LEADING_BLANK_LINE,  '' if i is 0
        value = value.replace TRAILING_BLANK_LINE, '' if i is $
        value
    else
      @mergeInterpolationTokens tokens, {delimiter}, (value, i) =>
        value = @formatString value, delimiter: quote
        value = value.replace SIMPLE_STRING_OMIT, (match, offset) ->
          if (i is 0 and offset is 0) or
             (i is $ and offset + match.length is value.length)
            ''
          else
            ' '
        value

    end

Matches and consumes comments.

  commentToken: ->
    return 0 unless match = @chunk.match COMMENT
    [comment, here] = match
    if here
      if match = HERECOMMENT_ILLEGAL.exec comment
        @error "block comments cannot contain #{match[0]}",
          offset: match.index, length: match[0].length
      if here.indexOf('\n') >= 0
        here = here.replace /// \n #{repeat ' ', @indent} ///g, '\n'
      @token 'HERECOMMENT', here, 0, comment.length
    comment.length

Matches JavaScript interpolated directly into the source via backticks.

  jsToken: ->
    return 0 unless @chunk.charAt(0) is '`' and
      (match = HERE_JSTOKEN.exec(@chunk) or JSTOKEN.exec(@chunk))

Convert escaped backticks to backticks, and escaped backslashes immediately before escaped backticks to backslashes.
```
    script = match[1].replace /\\+(`|$)/g, (string) ->
```
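The matched string is always a run of backslashes followed by a backtick, such as \`, \\\` or \\\\\`. By reducing it to its latter half, we turn \` into `, \\\` into \`, and so on.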
```
      string[-Math.ceil(string.length / 2)..]
    @token 'JS', script, 0, match[0].length
    match[0].length
```
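
The halving trick is easier to see outside the lexer. This sketch wraps the same replace call in a standalone function; unescapeBackticks is a hypothetical name, and the sample inputs are invented.

```coffeescript
# Keep only the latter half of each backslash run that precedes a backtick
# (or the end of the string).
unescapeBackticks = (body) ->
  body.replace /\\+(`|$)/g, (string) ->
    string[-Math.ceil(string.length / 2)..]

# One backslash + backtick becomes a bare backtick.
console.log unescapeBackticks 'a \\` b'
# Three backslashes + backtick become one backslash + backtick.
console.log unescapeBackticks 'x \\\\\\` y'
```

A run of length n keeps its last ceil(n / 2) characters, so every escaping backslash is dropped while the escaped character survives.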

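Matches regular expression literals, as well as multiline extended ones. Regular expressions are hard to distinguish from division when lexing, so we borrow some basic heuristics from JavaScript and Ruby.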

  regexToken: ->
    switch
      when match = REGEX_ILLEGAL.exec @chunk
        @error "regular expressions cannot begin with #{match[2]}",
          offset: match.index + match[1].length
      when match = @matchWithInterpolations HEREGEX, '///'
        {tokens, index} = match
      when match = REGEX.exec @chunk
        [regex, body, closed] = match
        @validateEscapes body, isRegex: yes, offsetInChunk: 1
        body = @formatRegex body, delimiter: '/'
        index = regex.length
        [..., prev] = @tokens
        if prev
          if prev.spaced and prev[0] in CALLABLE
            return 0 if not closed or POSSIBLY_DIVISION.test regex
          else if prev[0] in NOT_REGEX
            return 0
        @error 'missing / (unclosed regex)' unless closed
      else
        return 0

    [flags] = REGEX_FLAGS.exec @chunk[index..]
    end = index + flags.length
    origin = @makeToken 'REGEX', null, 0, end
    switch
      when not VALID_FLAGS.test flags
        @error "invalid regular expression flags #{flags}", offset: index, length: flags.length
      when regex or tokens.length is 1
        body ?= @formatHeregex tokens[0][1]
        @token 'REGEX', "#{@makeDelimitedLiteral body, delimiter: '/'}#{flags}", 0, end, origin
      else
        @token 'REGEX_START', '(', 0, 0, origin
        @token 'IDENTIFIER', 'RegExp', 0, 0
        @token 'CALL_START', '(', 0, 0
        @mergeInterpolationTokens tokens, {delimiter: '"', double: yes}, @formatHeregex
        if flags
          @token ',', ',', index - 1, 0
          @token 'STRING', '"' + flags + '"', index - 1, flags.length
        @token ')', ')', end - 1, 0
        @token 'REGEX_END', ')', end - 1, 0

    end
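
The flag check relies on a negative lookahead with a backreference to reject repeated flags. The snippet below copies VALID_FLAGS from this file and runs it on a few invented inputs.

```coffeescript
VALID_FLAGS = /^(?!.*(.).*\1)[imguy]*$/

console.log VALID_FLAGS.test 'gi'   # true
console.log VALID_FLAGS.test 'gg'   # false: g is repeated
console.log VALID_FLAGS.test 'gz'   # false: z is not an accepted flag
console.log VALID_FLAGS.test ''     # true: no flags at all
```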

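Matches newlines, indents, and outdents, and determines which is which. If we can detect that the current line is continued onto the next line, the newline is suppressed:

    elements
      .each( ... )
      .map( ... )

Keeps track of the level of indentation, because a single outdent token can close multiple indents, so we need to know how far in we happen to be.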

  lineToken: ->
    return 0 unless match = MULTI_DENT.exec @chunk
    indent = match[0]

    @seenFor = no
    @seenImport = no unless @importSpecifierList
    @seenExport = no unless @exportSpecifierList

    size = indent.length - 1 - indent.lastIndexOf '\n'
    noNewlines = @unfinished()

    if size - @indebt is @indent
      if noNewlines then @suppressNewlines() else @newlineToken 0
      return indent.length

    if size > @indent
      if noNewlines
        @indebt = size - @indent
        @suppressNewlines()
        return indent.length
      unless @tokens.length
        @baseIndent = @indent = size
        return indent.length
      diff = size - @indent + @outdebt
      @token 'INDENT', diff, indent.length - size, size
      @indents.push diff
      @ends.push {tag: 'OUTDENT'}
      @outdebt = @indebt = 0
      @indent = size
    else if size < @baseIndent
      @error 'missing indentation', offset: indent.length
    else
      @indebt = 0
      @outdentToken @indent - size, noNewlines, indent.length
    indent.length
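
The size computation at the top of lineToken measures only the indentation of the last line in a run of blank lines. The sketch below isolates that step; indentSize is a hypothetical helper, and MULTI_DENT is copied from this file.

```coffeescript
MULTI_DENT = /^(?:\n[^\n\S]*)+/

# Width of the indentation that follows the final newline of the match,
# or null when the chunk does not start with a newline.
indentSize = (chunk) ->
  return null unless match = MULTI_DENT.exec chunk
  indent = match[0]
  indent.length - 1 - indent.lastIndexOf '\n'

console.log indentSize "\n\n    foo"   # 4
console.log indentSize "\nfoo"         # 0
console.log indentSize "foo"           # null
```

Blank lines in between contribute nothing to the result, which is why consecutive newlines collapse into one indentation decision.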

Record an outdent token, or several tokens if we happen to be moving back inwards past several recorded indents. Sets the new @indent value.

  outdentToken: (moveOut, noNewlines, outdentLength) ->
    decreasedIndent = @indent - moveOut
    while moveOut > 0
      lastIndent = @indents[@indents.length - 1]
      if not lastIndent
        moveOut = 0
      else if lastIndent is @outdebt
        moveOut -= @outdebt
        @outdebt = 0
      else if lastIndent < @outdebt
        @outdebt -= lastIndent
        moveOut  -= lastIndent
      else
        dent = @indents.pop() + @outdebt
        if outdentLength and @chunk[outdentLength] in INDENTABLE_CLOSERS
          decreasedIndent -= dent - moveOut
          moveOut = dent
        @outdebt = 0

pair might call outdentToken, so preserve decreasedIndent.

        @pair 'OUTDENT'
        @token 'OUTDENT', moveOut, 0, outdentLength
        moveOut -= dent
    @outdebt -= moveOut if dent
    @tokens.pop() while @value() is ';'

    @token 'TERMINATOR', '\n', outdentLength, 0 unless @tag() is 'TERMINATOR' or noNewlines
    @indent = decreasedIndent
    this

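Matches and consumes non-meaningful whitespace. Tag the previous token as being "spaced", because there are some cases where it makes a difference.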

  whitespaceToken: ->
    return 0 unless (match = WHITESPACE.exec @chunk) or
                    (nline = @chunk.charAt(0) is '\n')
    [..., prev] = @tokens
    prev[if match then 'spaced' else 'newLine'] = true if prev
    if match then match[0].length else 0

Generate a newline token. Consecutive newlines are merged together.

  newlineToken: (offset) ->
    @tokens.pop() while @value() is ';'
    @token 'TERMINATOR', '\n', offset, 0 unless @tag() is 'TERMINATOR'
    this

Use a \ at the end of a line to suppress the newline. The backslash is removed here once its job is done.

  suppressNewlines: ->
    @tokens.pop() if @value() is '\\'
    this

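We treat all other single characters as a token, for example: ( ) , . ! Multi-character operators are also literal tokens, so that Jison can assign the proper order of operations. Some symbols are tagged specially here: ; and newlines are both treated as a TERMINATOR, parentheses that indicate a method call are distinguished from regular parentheses, and so on.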

  literalToken: ->
    if match = OPERATOR.exec @chunk
      [value] = match
      @tagParameters() if CODE.test value
    else
      value = @chunk.charAt 0
    tag  = value
    [..., prev] = @tokens

    if prev and value in ['=', COMPOUND_ASSIGN...]
      skipToken = false
      if value is '=' and prev[1] in ['||', '&&'] and not prev.spaced
        prev[0] = 'COMPOUND_ASSIGN'
        prev[1] += '='
        prev = @tokens[@tokens.length - 2]
        skipToken = true
      if prev and prev[0] isnt 'PROPERTY'
        origin = prev.origin ? prev
        message = isUnassignable prev[1], origin[1]
        @error message, origin[2] if message
      return value.length if skipToken

    if value is '{' and @seenImport
      @importSpecifierList = yes
    else if @importSpecifierList and value is '}'
      @importSpecifierList = no
    else if value is '{' and prev?[0] is 'EXPORT'
      @exportSpecifierList = yes
    else if @exportSpecifierList and value is '}'
      @exportSpecifierList = no

    if value is ';'
      @seenFor = @seenImport = @seenExport = no
      tag = 'TERMINATOR'
    else if value is '*' and prev[0] is 'EXPORT'
      tag = 'EXPORT_ALL'
    else if value in MATH            then tag = 'MATH'
    else if value in COMPARE         then tag = 'COMPARE'
    else if value in COMPOUND_ASSIGN then tag = 'COMPOUND_ASSIGN'
    else if value in UNARY           then tag = 'UNARY'
    else if value in UNARY_MATH      then tag = 'UNARY_MATH'
    else if value in SHIFT           then tag = 'SHIFT'
    else if value is '?' and prev?.spaced then tag = 'BIN?'
    else if prev and not prev.spaced
      if value is '(' and prev[0] in CALLABLE
        prev[0] = 'FUNC_EXIST' if prev[0] is '?'
        tag = 'CALL_START'
      else if value is '[' and prev[0] in INDEXABLE
        tag = 'INDEX_START'
        switch prev[0]
          when '?'  then prev[0] = 'INDEX_SOAK'
    token = @makeToken tag, value
    switch value
      when '(', '{', '[' then @ends.push {tag: INVERSES[value], origin: token}
      when ')', '}', ']' then @pair value
    @tokens.push token
    value.length

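Token Manipulators

A source of ambiguity in our grammar used to be parameter lists in function definitions versus argument lists in function calls. Walk backwards, tagging parameters specially to make things easier for the parser.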

  tagParameters: ->
    return this if @tag() isnt ')'
    stack = []
    {tokens} = this
    i = tokens.length
    tokens[--i][0] = 'PARAM_END'
    while tok = tokens[--i]
      switch tok[0]
        when ')'
          stack.push tok
        when '(', 'CALL_START'
          if stack.length then stack.pop()
          else if tok[0] is '('
            tok[0] = 'PARAM_START'
            return this
          else return this
    this
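
The backwards walk can be reproduced on a plain array of tokens. The sketch below is a simplified stand-in, not the method above: it is a free function rather than a Lexer method, it uses a depth counter where the original keeps a stack, and the sample token list is written by hand.

```coffeescript
# Retag the parentheses that enclose a parameter list. Call this when the
# token about to be added is -> or =>.
tagParameters = (tokens) ->
  i = tokens.length
  return tokens unless tokens[i - 1]?[0] is ')'
  depth = 0                        # nested ')' still waiting for their '('
  tokens[--i][0] = 'PARAM_END'
  while tok = tokens[--i]
    switch tok[0]
      when ')'
        depth++
      when '(', 'CALL_START'
        if depth > 0
          depth--
        else
          tok[0] = 'PARAM_START' if tok[0] is '('
          return tokens
  tokens

# Hand-written tokens for the text: (a, (b))
sample = [['(', '('], ['IDENTIFIER', 'a'], [',', ','],
          ['(', '('], ['IDENTIFIER', 'b'], [')', ')'], [')', ')']]
console.log (tok[0] for tok in tagParameters sample).join ' '
```

Only the outermost pair is retagged; the nested parentheses keep their original tags.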

Close up all remaining open blocks at the end of the file.

  closeIndentation: ->
    @outdentToken @indent

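Match the contents of a delimited token and expand the variables and expressions inside it, using Ruby-like notation for substitution of arbitrary expressions.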
```
"Hello #{name.capitalize()}."
```
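If it encounters an interpolation, this method recursively creates a new Lexer and tokenizes until the { of #{ is balanced by a }.
- regex matches the contents of a token (but not delimiter, and not #{ if interpolations are desired).
- delimiter is the delimiter of the token. Examples are ', ", ''', """ and ///.
This method lets us nest strings within interpolations within strings, ad infinitum.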
```
  matchWithInterpolations: (regex, delimiter) ->
    tokens = []
    offsetInChunk = delimiter.length
    return null unless @chunk[...offsetInChunk] is delimiter
    str = @chunk[offsetInChunk..]
    loop
      [strPart] = regex.exec str

      @validateEscapes strPart, {isRegex: delimiter.charAt(0) is '/', offsetInChunk}
```

Push a fake 'NEOSTRING' token, which will be converted into a real string later.

      tokens.push @makeToken 'NEOSTRING', strPart, offsetInChunk

      str = str[strPart.length..]
      offsetInChunk += strPart.length

      break unless str[...2] is '#{'

The 1s are there to skip the # in #{.

      [line, column] = @getLineAndColumnFromChunk offsetInChunk + 1
      {tokens: nested, index} =
        new Lexer().tokenize str[1..], line: line, column: column, untilBalanced: on

Skip the trailing }.
```
      index += 1
```

Turn the leading { and trailing } into parentheses. Unnecessary parentheses will be removed later.

      [open, ..., close] = nested
      open[0]  = open[1]  = '('
      close[0] = close[1] = ')'
      close.origin = ['', 'end of interpolation', close[2]]

Remove the leading 'TERMINATOR' (if any).

      nested.splice 1, 1 if nested[1]?[0] is 'TERMINATOR'

Push a fake 'TOKENS' token, which will be converted into real tokens later.

      tokens.push ['TOKENS', nested]

      str = str[index..]
      offsetInChunk += index

    unless str[...delimiter.length] is delimiter
      @error "missing #{delimiter}", length: delimiter.length

    [firstToken, ..., lastToken] = tokens
    firstToken[2].first_column -= delimiter.length
    if lastToken[1].substr(-1) is '\n'
      lastToken[2].last_line += 1
      lastToken[2].last_column = delimiter.length - 1
    else
      lastToken[2].last_column += delimiter.length
    lastToken[2].last_column -= 1 if lastToken[1].length is 0

    {tokens, index: offsetInChunk + delimiter.length}

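Merge the array tokens of the fake token types 'TOKENS' and 'NEOSTRING' (as returned by matchWithInterpolations) into the token stream. The value of each 'NEOSTRING' is first converted using fn and then turned into a string using options.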

  mergeInterpolationTokens: (tokens, options, fn) ->
    if tokens.length > 1
      lparen = @token 'STRING_START', '(', 0, 0

    firstIndex = @tokens.length
    for token, i in tokens
      [tag, value] = token
      switch tag
        when 'TOKENS'

Optimize out empty interpolations (an empty pair of parentheses).
```
          continue if value.length is 2
```

Push all the tokens held in the fake 'TOKENS' token. These already have sane location data.

          locationToken = value[0]
          tokensToPush = value
        when 'NEOSTRING'

Convert 'NEOSTRING' into 'STRING'.

          converted = fn.call this, token[1], i

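Optimize out empty strings. We make sure the token stream always starts with a string token, though, so that the result really is a string.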

          if converted.length is 0
            if i is 0
              firstEmptyStringIndex = @tokens.length
            else
              continue

However, there is one case where we can optimize away a starting empty string.

          if i is 2 and firstEmptyStringIndex?
            @tokens.splice firstEmptyStringIndex, 2 # Remove empty string and the plus.
          token[0] = 'STRING'
          token[1] = @makeDelimitedLiteral converted, options
          locationToken = token
          tokensToPush = [token]
      if @tokens.length > firstIndex

Create a 0-length "+" token.

        plusToken = @token '+', '+'
        plusToken[2] =
          first_line:   locationToken[2].first_line
          first_column: locationToken[2].first_column
          last_line:    locationToken[2].first_line
          last_column:  locationToken[2].first_column
      @tokens.push tokensToPush...

    if lparen
      [..., lastToken] = tokens
      lparen.origin = ['STRING', null,
        first_line:   lparen[2].first_line
        first_column: lparen[2].first_column
        last_line:    lastToken[2].last_line
        last_column:  lastToken[2].last_column
      ]
      rparen = @token 'STRING_END', ')'
      rparen[2] =
        first_line:   lastToken[2].last_line
        first_column: lastToken[2].last_column
        last_line:    lastToken[2].last_line
        last_column:  lastToken[2].last_column

Pairs up a closing token, ensuring that all listed pairs of tokens are correctly balanced throughout the course of the token stream.

  pair: (tag) ->
    [..., prev] = @ends
    unless tag is wanted = prev?.tag
      @error "unmatched #{tag}" unless 'OUTDENT' is wanted

Auto-close INDENT to support syntax like this:

    el.click((event) ->
      el.hide())

      [..., lastIndent] = @indents
      @outdentToken lastIndent, true
      return @pair tag
    @ends.pop()

Helpers

Returns the line and column number for an offset into the current chunk.

offset is a number of characters into @chunk.

  getLineAndColumnFromChunk: (offset) ->
    if offset is 0
      return [@chunkLine, @chunkColumn]

    if offset >= @chunk.length
      string = @chunk
    else
      string = @chunk[..offset-1]

    lineCount = count string, '\n'

    column = @chunkColumn
    if lineCount > 0
      [..., lastLine] = string.split '\n'
      column = lastLine.length
    else
      column += string.length

    [@chunkLine + lineCount, column]
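
To make the arithmetic concrete, here is a standalone version of the same calculation. It is a sketch under assumptions: lineAndColumn is a hypothetical free function, the chunk and its starting position are passed as arguments instead of being read from the Lexer instance, and lines are counted with split rather than the count helper.

```coffeescript
# Line and column reached after moving `offset` characters into `chunk`,
# given the position at which the chunk itself starts.
lineAndColumn = (chunk, offset, chunkLine = 0, chunkColumn = 0) ->
  return [chunkLine, chunkColumn] if offset is 0
  string = if offset >= chunk.length then chunk else chunk[..offset-1]
  lines = string.split '\n'
  lineCount = lines.length - 1
  if lineCount > 0
    # After a newline the column restarts from zero.
    [chunkLine + lineCount, lines[lineCount].length]
  else
    [chunkLine, chunkColumn + string.length]

console.log lineAndColumn "ab\ncde", 4      # [ 1, 1 ]
console.log lineAndColumn "abc", 2, 5, 10   # [ 5, 12 ]
```

The starting column only matters while the offset stays on the first line of the chunk.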

Same as token, except that this just returns the token without adding it to the results.

  makeToken: (tag, value, offsetInChunk = 0, length = value.length) ->
    locationData = {}
    [locationData.first_line, locationData.first_column] =
      @getLineAndColumnFromChunk offsetInChunk

Use length - 1 for the final offset: we are supplying last_line and last_column, so if last_column == first_column, we are looking at a token that is one character long.

    lastCharacter = if length > 0 then (length - 1) else 0
    [locationData.last_line, locationData.last_column] =
      @getLineAndColumnFromChunk offsetInChunk + lastCharacter

    token = [tag, value, locationData]

    token

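Add a token to the results. offset is the offset into the current @chunk where the token starts. length is the length of the token in the @chunk, after the offset. If not specified, the length of value is used.

Returns the new token.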
```
  token: (tag, value, offsetInChunk, length, origin) ->
    token = @makeToken tag, value, offsetInChunk, length
    token.origin = origin if origin
    @tokens.push token
    token
```

Peek at the last tag in the token stream.

  tag: ->
    [..., token] = @tokens
    token?[0]

Peek at the last value in the token stream.

  value: ->
    [..., token] = @tokens
    token?[1]

Are we in the midst of an unfinished expression?

  unfinished: ->
    LINE_CONTINUER.test(@chunk) or
    @tag() in UNFINISHED

  formatString: (str, options) ->
    @replaceUnicodeCodePointEscapes str.replace(STRING_OMIT, '$1'), options

  formatHeregex: (str) ->
    @formatRegex str.replace(HEREGEX_OMIT, '$1$2'), delimiter: '///'

  formatRegex: (str, options) ->
    @replaceUnicodeCodePointEscapes str, options

  unicodeCodePointToUnicodeEscapes: (codePoint) ->
    toUnicodeEscape = (val) ->
      str = val.toString 16
      "\\u#{repeat '0', 4 - str.length}#{str}"
    return toUnicodeEscape(codePoint) if codePoint < 0x10000

Surrogate pair.

    high = Math.floor((codePoint - 0x10000) / 0x400) + 0xD800
    low = (codePoint - 0x10000) % 0x400 + 0xDC00
    "#{toUnicodeEscape(high)}#{toUnicodeEscape(low)}"
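
The surrogate-pair arithmetic can be checked with a standalone copy of the method. In this sketch codePointToEscapes is a hypothetical free-function name, and the built-in String repeat method (ES2015) stands in for the repeat helper imported at the top of the file.

```coffeescript
toUnicodeEscape = (val) ->
  str = val.toString 16
  "\\u#{'0'.repeat(4 - str.length)}#{str}"   # left-pad to four hex digits

codePointToEscapes = (codePoint) ->
  return toUnicodeEscape(codePoint) if codePoint < 0x10000
  # Code points above the Basic Multilingual Plane need two UTF-16 units.
  high = Math.floor((codePoint - 0x10000) / 0x400) + 0xD800
  low = (codePoint - 0x10000) % 0x400 + 0xDC00
  "#{toUnicodeEscape(high)}#{toUnicodeEscape(low)}"

console.log codePointToEscapes 0x41      # \u0041
console.log codePointToEscapes 0x1F600   # \ud83d\ude00
```

For U+1F600 the offset from 0x10000 is 0xF600, which splits into a high part of 0x3D and a low part of 0x200, giving the pair D83D and DE00.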

Replace \u{...} with \uxxxx[\uxxxx] in strings and regexes.

  replaceUnicodeCodePointEscapes: (str, options) ->
    str.replace UNICODE_CODE_POINT_ESCAPE, (match, escapedBackslash, codePointHex, offset) =>
      return escapedBackslash if escapedBackslash

      codePointDecimal = parseInt codePointHex, 16
      if codePointDecimal > 0x10ffff
        @error "unicode code point escapes greater than \\u{10ffff} are not allowed",
          offset: offset + options.delimiter.length
          length: codePointHex.length + 4

      @unicodeCodePointToUnicodeEscapes codePointDecimal

Validates escapes in strings and regexes.

  validateEscapes: (str, options = {}) ->
    invalidEscapeRegex =
      if options.isRegex
        REGEX_INVALID_ESCAPE
      else
        STRING_INVALID_ESCAPE
    match = invalidEscapeRegex.exec str
    return unless match
    [[], before, octal, hex, unicodeCodePoint, unicode] = match
    message =
      if octal
        "octal escape sequences are not allowed"
      else
        "invalid escape sequence"
    invalidEscape = "\\#{octal or hex or unicodeCodePoint or unicode}"
    @error "#{message} #{invalidEscape}",
      offset: (options.offsetInChunk ? 0) + match.index + before.length
      length: invalidEscape.length

Constructs a string or regex by escaping certain characters.

  makeDelimitedLiteral: (body, options = {}) ->
    body = '(?:)' if body is '' and options.delimiter is '/'
    regex = ///
        (\\\\)                               # escaped backslash
      | (\\0(?=[1-7]))                       # nul character mistaken as octal escape
      | \\?(#{options.delimiter})            # (possibly escaped) delimiter
      | \\?(?: (\n)|(\r)|(\u2028)|(\u2029) ) # (possibly escaped) newlines
      | (\\.)                                # other escapes
    ///g
    body = body.replace regex, (match, backslash, nul, delimiter, lf, cr, ls, ps, other) -> switch

Ignore escaped backslashes.

      when backslash then (if options.double then backslash + backslash else backslash)
      when nul       then '\\x00'
      when delimiter then "\\#{delimiter}"
      when lf        then '\\n'
      when cr        then '\\r'
      when ls        then '\\u2028'
      when ps        then '\\u2029'
      when other     then (if options.double then "\\#{other}" else other)
    "#{options.delimiter}#{body}#{options.delimiter}"

Throws an error either at a given offset from the current chunk or at the location of a token (token[2]).

  error: (message, options = {}) ->
    location =
      if 'first_line' of options
        options
      else
        [first_line, first_column] = @getLineAndColumnFromChunk options.offset ? 0
        {first_line, first_column, last_column: first_column + (options.length ? 1) - 1}
    throwSyntaxError message, location

Helper functions


isUnassignable = (name, displayName = name) -> switch
  when name in [JS_KEYWORDS..., COFFEE_KEYWORDS...]
    "keyword '#{displayName}' can't be assigned"
  when name in STRICT_PROSCRIBED
    "'#{displayName}' can't be assigned"
  when name in RESERVED
    "reserved word '#{displayName}' can't be assigned"
  else
    false

exports.isUnassignable = isUnassignable

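The word from is not a CoffeeScript keyword, but it behaves like one in import and export statements (handled above) and in the declaration line of a for loop. Try to detect when from is a variable identifier and when it is this "sometimes" keyword.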
```
isForFrom = (prev) ->
  if prev[0] is 'IDENTIFIER'
```

for i from from, for from from iterable

    if prev[1] is 'from'
      prev[1][0] = 'IDENTIFIER'
      yes

for i from iterable
```
    yes
```
for from…
```
  else if prev[0] is 'FOR'
    no
```

for {from}…, for [from]…, for {a, from}…, for {a: from}…

  else if prev[1] in ['{', '[', ',', ':']
    no
  else
    yes

Constants

Keywords that CoffeeScript shares with JavaScript.

JS_KEYWORDS = [
  'true', 'false', 'null', 'this'
  'new', 'delete', 'typeof', 'in', 'instanceof'
  'return', 'throw', 'break', 'continue', 'debugger', 'yield'
  'if', 'else', 'switch', 'for', 'while', 'do', 'try', 'catch', 'finally'
  'class', 'extends', 'super'
  'import', 'export', 'default'
]

CoffeeScript-only keywords.

COFFEE_KEYWORDS = [
  'undefined', 'Infinity', 'NaN'
  'then', 'unless', 'until', 'loop', 'of', 'by', 'when'
]

COFFEE_ALIAS_MAP =
  and  : '&&'
  or   : '||'
  is   : '=='
  isnt : '!='
  not  : '!'
  yes  : 'true'
  no   : 'false'
  on   : 'true'
  off  : 'false'

COFFEE_ALIASES  = (key for key of COFFEE_ALIAS_MAP)
COFFEE_KEYWORDS = COFFEE_KEYWORDS.concat COFFEE_ALIASES
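
The way an alias turns into a tag can be sketched in a few lines. The map below is copied from this file; resolveAlias is a hypothetical helper that mirrors only the alias lookup and the switch near the end of identifierToken, ignoring keywords, properties and location data.

```coffeescript
COFFEE_ALIAS_MAP =
  and  : '&&'
  or   : '||'
  is   : '=='
  isnt : '!='
  not  : '!'
  yes  : 'true'
  no   : 'false'
  on   : 'true'
  off  : 'false'

COFFEE_ALIASES = (key for key of COFFEE_ALIAS_MAP)

# Returns [tag, value] for a bare word, assuming it is not a keyword.
resolveAlias = (id) ->
  value = if id in COFFEE_ALIASES then COFFEE_ALIAS_MAP[id] else id
  tag = switch value
    when '!'             then 'UNARY'
    when '==', '!='      then 'COMPARE'
    when 'true', 'false' then 'BOOL'
    when '&&', '||'      then value
    else 'IDENTIFIER'
  [tag, value]

console.log resolveAlias 'isnt'   # [ 'COMPARE', '!=' ]
console.log resolveAlias 'yes'    # [ 'BOOL', 'true' ]
console.log resolveAlias 'foo'    # [ 'IDENTIFIER', 'foo' ]
```

Checking membership in COFFEE_ALIASES before indexing the map avoids picking up inherited object properties such as constructor.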

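The list of keywords that are reserved by JavaScript but are either unused or used internally by CoffeeScript. We throw an error when these are encountered, to avoid a JavaScript error at runtime.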

RESERVED = [
  'case', 'function', 'var', 'void', 'with', 'const', 'let', 'enum'
  'native', 'implements', 'interface', 'package', 'private'
  'protected', 'public', 'static'
]

STRICT_PROSCRIBED = ['arguments', 'eval']

The superset of both JavaScript keywords and reserved words, none of which may be used as identifiers or properties.
```
exports.JS_FORBIDDEN = JS_KEYWORDS.concat(RESERVED).concat(STRICT_PROSCRIBED)
```
The character code of the nasty Microsoft madness otherwise known as the BOM.
```
BOM = 65279
```

Token-matching regexes.

IDENTIFIER = /// ^
  (?!\d)
  ( (?: (?!\s)[$\w\x7f-\uffff] )+ )
  ( [^\n\S]* : (?!:) )?  # Is this a property name?
///

NUMBER     = ///
  ^ 0b[01]+    |              # binary
  ^ 0o[0-7]+   |              # octal
  ^ 0x[\da-f]+ |              # hex
  ^ \d*\.?\d+ (?:e[+-]?\d+)?  # decimal
///i

OPERATOR   = /// ^ (
  ?: [-=]>             # function
   | [-+*/%<>&|^!?=]=  # compound assign / compare
   | >>>=?             # zero-fill right shift
   | ([-+:])\1         # doubles
   | ([&|<>*/%])\2=?   # logic / shift / power / floor division / modulo
   | \?(\.|::)         # soak access
   | \.{2,3}           # range or splat
) ///

WHITESPACE = /^[^\n\S]+/

COMMENT    = /^###([^#][\s\S]*?)(?:###[^\n\S]*|###$)|^(?:\s*#(?!##[^#]).*)+/

CODE       = /^[-=]>/

MULTI_DENT = /^(?:\n[^\n\S]*)+/

JSTOKEN      = ///^ `(?!``) ((?: [^`\\] | \\[\s\S]           )*) `   ///
HERE_JSTOKEN = ///^ ```     ((?: [^`\\] | \\[\s\S] | `(?!``) )*) ``` ///

String-matching regexes.

STRING_START   = /^(?:'''|"""|'|")/

STRING_SINGLE  = /// ^(?: [^\\']  | \\[\s\S]                      )* ///
STRING_DOUBLE  = /// ^(?: [^\\"#] | \\[\s\S] |           \#(?!\{) )* ///
HEREDOC_SINGLE = /// ^(?: [^\\']  | \\[\s\S] | '(?!'')            )* ///
HEREDOC_DOUBLE = /// ^(?: [^\\"#] | \\[\s\S] | "(?!"") | \#(?!\{) )* ///

STRING_OMIT    = ///
    ((?:\\\\)+)      # consume (and preserve) an even number of backslashes
  | \\[^\S\n]*\n\s*  # remove escaped newlines
///g
SIMPLE_STRING_OMIT = /\s*\n\s*/g
HEREDOC_INDENT     = /\n+([^\n\S]*)(?=\S)/g

Regex-matching regexes.

REGEX = /// ^
  / (?!/) ((
  ?: [^ [ / \n \\ ]  # every other thing
   | \\[^\n]         # anything but newlines escaped
   | \[              # character class
       (?: \\[^\n] | [^ \] \n \\ ] )*
     \]
  )*) (/)?
///

REGEX_FLAGS  = /^\w*/
VALID_FLAGS  = /^(?!.*(.).*\1)[imguy]*$/

HEREGEX      = /// ^(?: [^\\/#] | \\[\s\S] | /(?!//) | \#(?!\{) )* ///

HEREGEX_OMIT = ///
    ((?:\\\\)+)     # consume (and preserve) an even number of backslashes
  | \\(\s)          # preserve escaped whitespace
  | \s+(?:#.*)?     # remove whitespace and comments
///g

REGEX_ILLEGAL = /// ^ ( / | /{3}\s*) (\*) ///

POSSIBLY_DIVISION   = /// ^ /=?\s ///

Other regexes.

HERECOMMENT_ILLEGAL = /\*\//

LINE_CONTINUER      = /// ^ \s* (?: , | \??\.(?![.\d]) | :: ) ///

STRING_INVALID_ESCAPE = ///
  ( (?:^|[^\\]) (?:\\\\)* )        # make sure the escape isn’t escaped
  \\ (
     ?: (0[0-7]|[1-7])             # octal escape
      | (x(?![\da-fA-F]{2}).{0,2}) # hex escape
      | (u\{(?![\da-fA-F]{1,}\})[^}]*\}?) # unicode code point escape
      | (u(?!\{|[\da-fA-F]{4}).{0,4}) # unicode escape
  )
///
REGEX_INVALID_ESCAPE = ///
  ( (?:^|[^\\]) (?:\\\\)* )        # make sure the escape isn’t escaped
  \\ (
     ?: (0[0-7])                   # octal escape
      | (x(?![\da-fA-F]{2}).{0,2}) # hex escape
      | (u\{(?![\da-fA-F]{1,}\})[^}]*\}?) # unicode code point escape
      | (u(?!\{|[\da-fA-F]{4}).{0,4}) # unicode escape
  )
///

UNICODE_CODE_POINT_ESCAPE = ///
  ( \\\\ )        # make sure the escape isn’t escaped
  |
  \\u\{ ( [\da-fA-F]+ ) \}
///g

LEADING_BLANK_LINE  = /^[^\n\S]*\n/
TRAILING_BLANK_LINE = /\n[^\n\S]*$/

TRAILING_SPACES     = /\s+$/

Compound assignment tokens.

COMPOUND_ASSIGN = [
  '-=', '+=', '/=', '*=', '%=', '||=', '&&=', '?=', '<<=', '>>=', '>>>='
  '&=', '^=', '|=', '**=', '//=', '%%='
]

Unary tokens.

UNARY = ['NEW', 'TYPEOF', 'DELETE', 'DO']

UNARY_MATH = ['!', '~']

Bit-shifting tokens.
```
SHIFT = ['<<', '>>', '>>>']
```

Comparison tokens.

COMPARE = ['==', '!=', '<', '>', '<=', '>=']

Mathematical tokens.
```
MATH = ['*', '/', '%', '//', '%%']
```
Relational tokens that can be negated with a not prefix.
```
RELATION = ['IN', 'OF', 'INSTANCEOF']
```
Boolean tokens.
```
BOOL = ['TRUE', 'FALSE']
```

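Tokens that could legitimately be invoked or indexed. An opening parenthesis or bracket following one of these tokens is recorded as the start of a function invocation or indexing operation.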

CALLABLE  = ['IDENTIFIER', 'PROPERTY', ')', ']', '?', '@', 'THIS', 'SUPER']
INDEXABLE = CALLABLE.concat [
  'NUMBER', 'INFINITY', 'NAN', 'STRING', 'STRING_END', 'REGEX', 'REGEX_END'
  'BOOL', 'NULL', 'UNDEFINED', '}', '::'
]

Tokens that a regular expression will never immediately follow (except spaced CALLABLEs in some cases), but that a division operator can.

See: http://www-archive.mozilla.org/js/language/js20-2002-04/rationale/syntax.html#regular-expressions
```
NOT_REGEX = INDEXABLE.concat ['++', '--']
```
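Tokens that, when immediately preceding a WHEN, indicate that the WHEN occurs at the start of a line. We distinguish these from trailing WHENs to avoid an ambiguity in the grammar.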
```
LINE_BREAK = ['INDENT', 'OUTDENT', 'TERMINATOR']
```
Additional indentation in front of these is ignored.
```
INDENTABLE_CLOSERS = [')', '}', ']']
```

Tokens that, when appearing at the end of a line, suppress a following TERMINATOR/INDENT token.

UNFINISHED = ['\\', '.', '?.', '?::', 'UNARY', 'MATH', 'UNARY_MATH', '+', '-',
           '**', 'SHIFT', 'RELATION', 'COMPARE', '&', '^', '|', '&&', '||',
           'BIN?', 'THROW', 'EXTENDS', 'DEFAULT']
