rewriter.coffee


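The CoffeeScript language has a good deal of optional syntax, implicit syntax, and shorthand syntax. This can greatly complicate a grammar and bloat the resulting parse table. Instead of making the parser handle it all, we take a series of passes over the token stream, using this Rewriter to convert shorthand into the unambiguous long form, add implicit indentation and parentheses, and generally clean things up.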

Create a generated token: one that exists due to a use of implicit syntax.

generate = (tag, value, origin) ->
  tok = [tag, value]
  tok.generated = yes
  tok.origin = origin if origin
  tok
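
As a quick illustration of what a generated token looks like, the sketch below repeats the helper so that it runs on its own and applies it to a hypothetical origin token (the location values are made up for the example). The generated token carries no location data of its own at this point; a later pass of the Rewriter fills that in.

```coffeescript
# Standalone copy of the helper above, so the sketch runs on its own.
generate = (tag, value, origin) ->
  tok = [tag, value]
  tok.generated = yes
  tok.origin = origin if origin
  tok

# Hypothetical origin token in the form [tag, value, locationData].
origin = ['IDENTIFIER', 'f', {first_line: 0, first_column: 0, last_line: 0, last_column: 0}]
openParen = generate 'CALL_START', '(', origin

console.log openParen[0], openParen[1], openParen.generated   # CALL_START ( true
console.log openParen.origin is origin                        # true
console.log openParen[2]?                                     # false: no location data yet
```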

The Rewriter class is used by the Lexer, directly against its internal array of tokens.
```
exports.Rewriter = class Rewriter
```
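Rewrite the token stream in multiple passes, one logical filter at a time. This could certainly be changed into a single pass through the stream, with one big efficient switch, but it is much nicer to work with like this. The order of these passes matters: indentation must be corrected before implicit parentheses can be wrapped around blocks of code.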
```
  rewrite: (@tokens) ->
```

Helpful snippet for debugging: console.log (t[0] + '/' + t[1] for t in @tokens).join ' '

    @removeLeadingNewlines()
    @closeOpenCalls()
    @closeOpenIndexes()
    @normalizeLines()
    @tagPostfixConditionals()
    @addImplicitBracesAndParens()
    @addLocationDataToGeneratedTokens()
    @fixOutdentLocationData()
    @tokens

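Rewrite the token stream, looking one token ahead and behind. Allow the return value of the block to tell us how many tokens to move forwards (or backwards) in the stream, to make sure we don't miss anything as tokens are inserted and removed, and the stream changes length under our feet.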

  scanTokens: (block) ->
    {tokens} = this
    i = 0
    i += block.call this, token, i, tokens while token = tokens[i]
    true
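
To see why the block's return value matters, here is a minimal standalone sketch of the same loop. It takes the token array as an argument rather than reading it from `this`, and the duplicate-terminator pass is hypothetical, invented only for this illustration.

```coffeescript
# Standalone version of the scanning loop; `tokens` is passed in explicitly.
scanTokens = (tokens, block) ->
  i = 0
  i += block token, i, tokens while token = tokens[i]
  tokens

# Hypothetical pass: drop a TERMINATOR that directly follows another one.
tokens = [['IDENTIFIER', 'a'], ['TERMINATOR', '\n'], ['TERMINATOR', '\n'], ['IDENTIFIER', 'b']]
scanTokens tokens, (token, i, tokens) ->
  if token[0] is 'TERMINATOR' and tokens[i - 1]?[0] is 'TERMINATOR'
    tokens.splice i, 1
    return 0   # stay put: the next token has slid into position i
  1            # otherwise advance by one token

console.log (t[0] for t in tokens).join ' '
# IDENTIFIER TERMINATOR IDENTIFIER
```

Returning 0 after the splice is what keeps the scan from skipping the token that moved into the current slot.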

  detectEnd: (i, condition, action) ->
    {tokens} = this
    levels = 0
    while token = tokens[i]
      return action.call this, token, i     if levels is 0 and condition.call this, token, i
      return action.call this, token, i - 1 if not token or levels < 0
      if token[0] in EXPRESSION_START
        levels += 1
      else if token[0] in EXPRESSION_END
        levels -= 1
      i += 1
    i - 1

Leading newlines would introduce an ambiguity in the grammar, so we dispatch them here.

  removeLeadingNewlines: ->
    break for [tag], i in @tokens when tag isnt 'TERMINATOR'
    @tokens.splice 0, i if i
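
A small standalone sketch of this pass follows. It takes the token array as an argument instead of using `@tokens`, and the token stream is a hypothetical one for a source that begins with two blank lines.

```coffeescript
# Standalone variant of removeLeadingNewlines.
removeLeadingNewlines = (tokens) ->
  # After the loop, i is the index of the first non-TERMINATOR token
  # (or the array length if every token is a TERMINATOR).
  break for [tag], i in tokens when tag isnt 'TERMINATOR'
  tokens.splice 0, i if i
  tokens

tokens = [['TERMINATOR', '\n'], ['TERMINATOR', '\n'], ['IDENTIFIER', 'a'], ['TERMINATOR', '\n']]
removeLeadingNewlines tokens
console.log (t[0] for t in tokens).join ' '
# IDENTIFIER TERMINATOR
```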

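The lexer has tagged the opening parenthesis of a method call. Match it with its paired close. The mis-nested outdent case is included here for calls that close on the same line, just before their outdent.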

  closeOpenCalls: ->
    condition = (token, i) ->
      token[0] in [')', 'CALL_END'] or
      token[0] is 'OUTDENT' and @tag(i - 1) is ')'

    action = (token, i) ->
      @tokens[if token[0] is 'OUTDENT' then i - 1 else i][0] = 'CALL_END'

    @scanTokens (token, i) ->
      @detectEnd i + 1, condition, action if token[0] is 'CALL_START'
      1

The lexer has tagged the opening bracket of an indexing operation. Match it with its paired close.

  closeOpenIndexes: ->
    condition = (token, i) ->
      token[0] in [']', 'INDEX_END']

    action = (token, i) ->
      token[0] = 'INDEX_END'

    @scanTokens (token, i) ->
      @detectEnd i + 1, condition, action if token[0] is 'INDEX_START'
      1

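Match tags in the token stream starting at i with pattern, skipping 'HERECOMMENT's. pattern may consist of strings (equality), an array of strings (one of) or null (wildcard). Returns the index of the match, or -1 if there is no match.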

  indexOfTag: (i, pattern...) ->
    fuzz = 0
    for j in [0 ... pattern.length]
      fuzz += 2 while @tag(i + j + fuzz) is 'HERECOMMENT'
      continue if not pattern[j]?
      pattern[j] = [pattern[j]] if typeof pattern[j] is 'string'
      return -1 if @tag(i + j + fuzz) not in pattern[j]
    i + j + fuzz - 1
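
The matching rules can be tried out with the standalone sketch below. It is an adaptation, not the method itself: `tags` is a plain array of tag strings standing in for the token stream, and the pattern is passed as an array instead of as splatted arguments.

```coffeescript
indexOfTag = (tags, i, pattern) ->
  fuzz = 0
  for want, j in pattern
    # Each HERECOMMENT is skipped two slots at a time, as in the method above.
    fuzz += 2 while tags[i + j + fuzz] is 'HERECOMMENT'
    continue unless want?                        # null is a wildcard
    want = [want] if typeof want is 'string'     # a string means equality
    return -1 unless tags[i + j + fuzz] in want
  i + pattern.length + fuzz - 1                  # index of the last matched tag

tags = ['@', 'IDENTIFIER', ':', 'NUMBER']
console.log indexOfTag tags, 0, ['@', null, ':']   # 2
console.log indexOfTag tags, 1, [null, ':']        # 2
console.log indexOfTag tags, 0, [null, ':']        # -1

commented = ['HERECOMMENT', 'TERMINATOR', 'IDENTIFIER', ':']
console.log indexOfTag commented, 0, [null, ':']   # 3
```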

Returns yes if standing in front of something that looks like @<x>:, <x>: or <EXPRESSION_START><x>...<EXPRESSION_END>:, skipping over 'HERECOMMENT's.

  looksObjectish: (j) ->
    return yes if @indexOfTag(j, '@', null, ':') > -1 or @indexOfTag(j, null, ':') > -1
    index = @indexOfTag(j, EXPRESSION_START)
    if index > -1
      end = null
      @detectEnd index + 1, ((token) -> token[0] in EXPRESSION_END), ((token, i) -> end = i)
      return yes if @tag(end + 1) is ':'
    no

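Returns yes if the current line of tokens contains an element of tags on the same expression level. Stops searching at LINEBREAKS or at the explicit start of the containing balanced expression.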

  findTagsBackwards: (i, tags) ->
    backStack = []
    while i >= 0 and (backStack.length or
          @tag(i) not in tags and
          (@tag(i) not in EXPRESSION_START or @tokens[i].generated) and
          @tag(i) not in LINEBREAKS)
      backStack.push @tag(i) if @tag(i) in EXPRESSION_END
      backStack.pop() if @tag(i) in EXPRESSION_START and backStack.length
      i -= 1
    @tag(i) in tags

Look for signs of implicit calls and objects in the token stream and add them.
```
  addImplicitBracesAndParens: ->
```

Track the current balancing depth (both implicit and explicit) on a stack.

    stack = []
    start = null

    @scanTokens (token, i, tokens) ->
      [tag]     = token
      [prevTag] = prevToken = if i > 0 then tokens[i - 1] else []
      [nextTag] = if i < tokens.length - 1 then tokens[i + 1] else []
      stackTop  = -> stack[stack.length - 1]
      startIdx  = i

Helper function, used for keeping track of the number of tokens consumed and spliced when returning to get a new token.
```
      forward   = (n) -> i - startIdx + n
```

Helper functions

      isImplicit        = (stackItem) -> stackItem?[2]?.ours
      isImplicitObject  = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '{'
      isImplicitCall    = (stackItem) -> isImplicit(stackItem) and stackItem?[0] is '('
      inImplicit        = -> isImplicit stackTop()
      inImplicitCall    = -> isImplicitCall stackTop()
      inImplicitObject  = -> isImplicitObject stackTop()

Unclosed control statement inside implicit parentheses (like a class declaration or an if conditional)

      inImplicitControl = -> inImplicit() and stackTop()?[0] is 'CONTROL'

      startImplicitCall = (j) ->
        idx = j ? i
        stack.push ['(', idx, ours: yes]
        tokens.splice idx, 0, generate 'CALL_START', '(', ['', 'implicit function call', token[2]]
        i += 1 if not j?

      endImplicitCall = ->
        stack.pop()
        tokens.splice i, 0, generate 'CALL_END', ')', ['', 'end of input', token[2]]
        i += 1

      startImplicitObject = (j, startsLine = yes) ->
        idx = j ? i
        stack.push ['{', idx, sameLine: yes, startsLine: startsLine, ours: yes]
        val = new String '{'
        val.generated = yes
        tokens.splice idx, 0, generate '{', val, token
        i += 1 if not j?

      endImplicitObject = (j) ->
        j = j ? i
        stack.pop()
        tokens.splice j, 0, generate '}', '}', token
        i += 1

Don't end an implicit call on the next indent if any of these are in an argument

      if inImplicitCall() and tag in ['IF', 'TRY', 'FINALLY', 'CATCH',
        'CLASS', 'SWITCH']
        stack.push ['CONTROL', i, ours: yes]
        return forward(1)

      if tag is 'INDENT' and inImplicit()

An INDENT closes an implicit call unless

1. We have seen a CONTROL argument on the line.
2. The last token before the indent is part of the list below.

        if prevTag not in ['=>', '->', '[', '(', ',', '{', 'TRY', 'ELSE', '=']
          endImplicitCall() while inImplicitCall()
        stack.pop() if inImplicitControl()
        stack.push [tag, i]
        return forward(1)

Straightforward start of an explicit expression

      if tag in EXPRESSION_START
        stack.push [tag, i]
        return forward(1)

Close all implicit expressions inside of explicitly closed expressions.

      if tag in EXPRESSION_END
        while inImplicit()
          if inImplicitCall()
            endImplicitCall()
          else if inImplicitObject()
            endImplicitObject()
          else
            stack.pop()
        start = stack.pop()

Recognize standard implicit calls like f a, f() b, f? c, h[0] d etc.

      if (tag in IMPLICIT_FUNC and token.spaced or
          tag is '?' and i > 0 and not tokens[i - 1].spaced) and
         (nextTag in IMPLICIT_CALL or
          nextTag in IMPLICIT_UNSPACED_CALL and
          not tokens[i + 1]?.spaced and not tokens[i + 1]?.newLine)
        tag = token[0] = 'FUNC_EXIST' if tag is '?'
        startImplicitCall i + 1
        return forward(2)

Implicit call taking an implicit indented object as its first argument.

f
  a: b
  c: d

and

f
  1
  a: b
  b: c

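Don't accept implicit calls of this type when on the same line as the control structures below, as that may misinterpret constructs like:

if f
   a: 1

as

if f(a: 1)

which is probably always unintended. Furthermore, don't allow this in literal arrays, as that creates grammatical ambiguities.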

      if tag in IMPLICIT_FUNC and
         @indexOfTag(i + 1, 'INDENT') > -1 and @looksObjectish(i + 2) and
         not @findTagsBackwards(i, ['CLASS', 'EXTENDS', 'IF', 'CATCH',
          'SWITCH', 'LEADING_WHEN', 'FOR', 'WHILE', 'UNTIL'])
        startImplicitCall i + 1
        stack.push ['INDENT', i + 2]
        return forward(3)

Implicit objects start here
```
      if tag is ':'
```

Go back to the (implicit) start of the object

        s = switch
          when @tag(i - 1) in EXPRESSION_END then start[1]
          when @tag(i - 2) is '@' then i - 2
          else i - 1
        s -= 2 while @tag(s - 2) is 'HERECOMMENT'

Mark if the value is a for loop

        @insideForDeclaration = nextTag is 'FOR'

        startsLine = s is 0 or @tag(s - 1) in LINEBREAKS or tokens[s - 1].newLine

Are we just continuing an already declared object?

        if stackTop()
          [stackTag, stackIdx] = stackTop()
          if (stackTag is '{' or stackTag is 'INDENT' and @tag(stackIdx - 1) is '{') and
             (startsLine or @tag(s - 1) is ',' or @tag(s - 1) is '{')
            return forward(1)

        startImplicitObject(s, !!startsLine)
        return forward(2)

End implicit calls when chaining method calls, for example:
```
f ->
  a
.g b, ->
  c
.h a
```
and also
```
f a
.g b
.h a
```

Mark all enclosing objects as not sameLine

      if tag in LINEBREAKS
        for stackItem in stack by -1
          break unless isImplicit stackItem
          stackItem[2].sameLine = no if isImplicitObject stackItem

      newLine = prevTag is 'OUTDENT' or prevToken.newLine
      if tag in IMPLICIT_END or tag in CALL_CLOSERS and newLine
        while inImplicit()
          [stackTag, stackIdx, {sameLine, startsLine}] = stackTop()

Close implicit calls when the end of the argument list is reached

          if inImplicitCall() and prevTag isnt ','
            endImplicitCall()

Close implicit objects such as: return a: 1, b: 2 unless true

          else if inImplicitObject() and not @insideForDeclaration and sameLine and
                  tag isnt 'TERMINATOR' and prevTag isnt ':'
            endImplicitObject()

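Close implicit objects at the end of a line when the line didn't end with a comma and either the implicit object didn't start the line or the next line doesn't look like the continuation of an object.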

          else if inImplicitObject() and tag is 'TERMINATOR' and prevTag isnt ',' and
                  not (startsLine and @looksObjectish(i + 1))
            return forward 1 if nextTag is 'HERECOMMENT'
            endImplicitObject()
          else
            break

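Close the implicit object if a comma is the last character and what comes after doesn't look like it belongs. This is used for trailing commas and calls, like: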

x =
    a: b,
    c: d,
e = 2

and

f a, b: c, d: e, f, g: h: i, j

      if tag is ',' and not @looksObjectish(i + 1) and inImplicitObject() and
         not @insideForDeclaration and
         (nextTag isnt 'TERMINATOR' or not @looksObjectish(i + 2))

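When nextTag is OUTDENT the comma is insignificant and should just be ignored, so embed it in the implicit object.

When it isn't, the comma goes on to play a role in a call or array further up the stack, so give it a chance.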
```
        offset = if nextTag is 'OUTDENT' then 1 else 0
        while inImplicitObject()
          endImplicitObject i + offset
      return forward(1)
```
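
The simplest case handled by addImplicitBracesAndParens, a spaced function name followed by an argument on the same line, can be sketched in isolation. This is a deliberately reduced toy, not the method above: it keeps a plain counter instead of a stack, ignores implicit objects, indentation and control statements, and uses shortened tag lists. The helpers `tok` and `generated` and the function `addImplicitCalls` are hypothetical names introduced only for this example.

```coffeescript
# Reduced tag lists for the sketch; the real constants are longer.
IMPLICIT_FUNC = ['IDENTIFIER']
IMPLICIT_CALL = ['IDENTIFIER', 'NUMBER', 'STRING']

# Hypothetical helper: build a token and record whether a space follows it.
tok = (tag, value, spaced = no) ->
  made = [tag, value]
  made.spaced = spaced
  made

# Hypothetical helper: build a token flagged as generated.
generated = (tag, value) ->
  made = [tag, value]
  made.generated = yes
  made

addImplicitCalls = (tokens) ->
  out = []
  depth = 0
  for token, i in tokens
    [tag] = token
    # A line break closes every implicit call opened on this line.
    if tag is 'TERMINATOR'
      while depth > 0
        out.push generated 'CALL_END', ')'
        depth -= 1
    out.push token
    following = tokens[i + 1]?[0]
    # `f a`: a spaced function name followed by the start of an argument.
    if tag in IMPLICIT_FUNC and token.spaced and following in IMPLICIT_CALL
      out.push generated 'CALL_START', '('
      depth += 1
  out

# Tokens for the line `f a, 1`.
tokens = [
  tok 'IDENTIFIER', 'f', yes
  tok 'IDENTIFIER', 'a'
  tok ',', ','
  tok 'NUMBER', '1'
  tok 'TERMINATOR', '\n'
]
result = addImplicitCalls tokens
console.log (t[0] for t in result).join ' '
# IDENTIFIER CALL_START IDENTIFIER , NUMBER CALL_END TERMINATOR
```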

Add location data to all tokens generated by the rewriter.

  addLocationDataToGeneratedTokens: ->
    @scanTokens (token, i, tokens) ->
      return 1 if     token[2]
      return 1 unless token.generated or token.explicit
      if token[0] is '{' and nextLocation=tokens[i + 1]?[2]
        {first_line: line, first_column: column} = nextLocation
      else if prevLocation = tokens[i - 1]?[2]
        {last_line: line, last_column: column} = prevLocation
      else
        line = column = 0
      token[2] =
        first_line:   line
        first_column: column
        last_line:    line
        last_column:  column
      return 1
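
The effect of this pass can be seen in the standalone sketch below, which applies the same rules to a token array passed in as an argument. The `loc` helper and the sample tokens (for a hypothetical line `fn a`) are assumptions made for the example.

```coffeescript
addLocationData = (tokens) ->
  for token, i in tokens
    continue if token[2]
    continue unless token.generated or token.explicit
    if token[0] is '{' and nextLocation = tokens[i + 1]?[2]
      # A generated `{` borrows the start of the token that follows it.
      {first_line: line, first_column: column} = nextLocation
    else if prevLocation = tokens[i - 1]?[2]
      # Anything else borrows the end of the token before it.
      {last_line: line, last_column: column} = prevLocation
    else
      line = column = 0
    token[2] =
      first_line:   line
      first_column: column
      last_line:    line
      last_column:  column
  tokens

# Hypothetical helper building single-line location data.
loc = (line, first, last) ->
  {first_line: line, first_column: first, last_line: line, last_column: last}

openParen = ['CALL_START', '(']
openParen.generated = yes
tokens = [['IDENTIFIER', 'fn', loc(0, 0, 1)], openParen, ['IDENTIFIER', 'a', loc(0, 3, 3)]]
addLocationData tokens
console.log openParen[2].first_line, openParen[2].first_column   # 0 1
```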

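OUTDENT tokens should always be positioned at the last character of the previous token, so that AST nodes ending in an OUTDENT token end up with a location corresponding to the last "real" token under the node.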

  fixOutdentLocationData: ->
    @scanTokens (token, i, tokens) ->
      return 1 unless token[0] is 'OUTDENT' or
        (token.generated and token[0] is 'CALL_END') or
        (token.generated and token[0] is '}')
      prevLocationData = tokens[i - 1][2]
      token[2] =
        first_line:   prevLocationData.last_line
        first_column: prevLocationData.last_column
        last_line:    prevLocationData.last_line
        last_column:  prevLocationData.last_column
      return 1

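Because our grammar is LALR(1), it can't handle some single-line expressions that lack ending delimiters. The Rewriter adds the implicit blocks, so the grammar doesn't need to. To keep the grammar clean and tidy, trailing newlines within expressions are removed and the indentation tokens of empty blocks are added.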

  normalizeLines: ->
    starter = indent = outdent = null

    condition = (token, i) ->
      token[1] isnt ';' and token[0] in SINGLE_CLOSERS and
      not (token[0] is 'TERMINATOR' and @tag(i + 1) in EXPRESSION_CLOSE) and
      not (token[0] is 'ELSE' and starter isnt 'THEN') and
      not (token[0] in ['CATCH', 'FINALLY'] and starter in ['->', '=>']) or
      token[0] in CALL_CLOSERS and
      (@tokens[i - 1].newLine or @tokens[i - 1][0] is 'OUTDENT')

    action = (token, i) ->
      @tokens.splice (if @tag(i - 1) is ',' then i - 1 else i), 0, outdent

    @scanTokens (token, i, tokens) ->
      [tag] = token
      if tag is 'TERMINATOR'
        if @tag(i + 1) is 'ELSE' and @tag(i - 1) isnt 'OUTDENT'
          tokens.splice i, 1, @indentation()...
          return 1
        if @tag(i + 1) in EXPRESSION_CLOSE
          tokens.splice i, 1
          return 0
      if tag is 'CATCH'
        for j in [1..2] when @tag(i + j) in ['OUTDENT', 'TERMINATOR', 'FINALLY']
          tokens.splice i + j, 0, @indentation()...
          return 2 + j
      if tag in SINGLE_LINERS and @tag(i + 1) isnt 'INDENT' and
         not (tag is 'ELSE' and @tag(i + 1) is 'IF')
        starter = tag
        [indent, outdent] = @indentation tokens[i]
        indent.fromThen   = true if starter is 'THEN'
        tokens.splice i + 1, 0, indent
        @detectEnd i + 2, condition, action
        tokens.splice i, 1 if tag is 'THEN'
        return 1
      return 1

Tag postfix conditionals as such, so that we can parse them with a different precedence.

  tagPostfixConditionals: ->

    original = null

    condition = (token, i) ->
      [tag] = token
      [prevTag] = @tokens[i - 1]
      tag is 'TERMINATOR' or (tag is 'INDENT' and prevTag not in SINGLE_LINERS)

    action = (token, i) ->
      if token[0] isnt 'INDENT' or (token.generated and not token.fromThen)
        original[0] = 'POST_' + original[0]

    @scanTokens (token, i) ->
      return 1 unless token[0] is 'IF'
      original = token
      @detectEnd i + 1, condition, action
      return 1
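
The idea behind this pass can be sketched on a flat array of tag strings. This is a simplified stand-in, not the method above: it does not track nesting the way detectEnd does, and it ignores the generated and fromThen flags. An IF whose line reaches a TERMINATOR before any INDENT is treated as postfix; the function name `tagPostfixIfs` is hypothetical.

```coffeescript
tagPostfixIfs = (tags) ->
  out = tags.slice()
  for tag, i in out when tag is 'IF'
    # Scan forward to the first TERMINATOR or INDENT after the IF.
    j = i + 1
    j += 1 while out[j]? and out[j] not in ['TERMINATOR', 'INDENT']
    out[i] = 'POST_IF' if out[j] is 'TERMINATOR'
  out

# Tags for `x = 1 if y`: the IF is postfix.
postfix = tagPostfixIfs ['IDENTIFIER', '=', 'NUMBER', 'IF', 'IDENTIFIER', 'TERMINATOR']
console.log postfix[3]   # POST_IF

# Tags for an `if y` followed by an indented block: the IF is left alone.
block = tagPostfixIfs ['IF', 'IDENTIFIER', 'INDENT', 'IDENTIFIER', 'OUTDENT', 'TERMINATOR']
console.log block[0]     # IF
```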

Generate the indentation tokens, based on another token on the same line.

  indentation: (origin) ->
    indent  = ['INDENT', 2]
    outdent = ['OUTDENT', 2]
    if origin
      indent.generated = outdent.generated = yes
      indent.origin = outdent.origin = origin
    else
      indent.explicit = outdent.explicit = yes
    [indent, outdent]
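
The difference between the two branches is easiest to see by calling the function both ways. The sketch below repeats the method as a standalone function; the origin token passed in is a hypothetical THEN token.

```coffeescript
indentation = (origin) ->
  indent  = ['INDENT', 2]
  outdent = ['OUTDENT', 2]
  if origin
    indent.generated = outdent.generated = yes
    indent.origin = outdent.origin = origin
  else
    indent.explicit = outdent.explicit = yes
  [indent, outdent]

# With an origin token, the pair is flagged as generated.
[withOrigin, closing] = indentation ['THEN', 'then']
console.log withOrigin.generated, withOrigin.explicit   # true undefined

# Without one, the pair is flagged as explicit instead.
[bare] = indentation()
console.log bare.generated, bare.explicit               # undefined true
```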

  generate: generate

Look up a tag by token index.
```
  tag: (i) -> @tokens[i]?[0]
```
Constants

List of the token pairs that must be balanced.

BALANCED_PAIRS = [
  ['(', ')']
  ['[', ']']
  ['{', '}']
  ['INDENT', 'OUTDENT'],
  ['CALL_START', 'CALL_END']
  ['PARAM_START', 'PARAM_END']
  ['INDEX_START', 'INDEX_END']
  ['STRING_START', 'STRING_END']
  ['REGEX_START', 'REGEX_END']
]

The inverse mappings of BALANCED_PAIRS we're trying to fix up, so we can look things up from either end.
```
exports.INVERSES = INVERSES = {}
```

The tokens that signal the start/end of a balanced pair.

EXPRESSION_START = []
EXPRESSION_END   = []

for [left, rite] in BALANCED_PAIRS
  EXPRESSION_START.push INVERSES[rite] = left
  EXPRESSION_END  .push INVERSES[left] = rite
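
To make the result of this loop concrete, the sketch below runs the same construction over a shortened pair list (three pairs chosen for the example, not the full BALANCED_PAIRS) and looks entries up from both ends.

```coffeescript
# Shortened list for the sketch; the real BALANCED_PAIRS has more entries.
BALANCED_PAIRS = [
  ['(', ')']
  ['INDENT', 'OUTDENT']
  ['CALL_START', 'CALL_END']
]

INVERSES = {}
EXPRESSION_START = []
EXPRESSION_END   = []

for [left, rite] in BALANCED_PAIRS
  EXPRESSION_START.push INVERSES[rite] = left
  EXPRESSION_END.push INVERSES[left] = rite

console.log INVERSES['OUTDENT']          # INDENT
console.log INVERSES['(']                # )
console.log EXPRESSION_START.join ' '    # ( INDENT CALL_START
console.log EXPRESSION_END.join ' '      # ) OUTDENT CALL_END
```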

Tokens that indicate the close of a clause of an expression.

EXPRESSION_CLOSE = ['CATCH', 'THEN', 'ELSE', 'FINALLY'].concat EXPRESSION_END

Tokens that, if followed by an IMPLICIT_CALL, indicate a function invocation.

IMPLICIT_FUNC    = ['IDENTIFIER', 'PROPERTY', 'SUPER', ')', 'CALL_END', ']', 'INDEX_END', '@', 'THIS']

Tokens that, if preceded by an IMPLICIT_FUNC, indicate a function invocation.

IMPLICIT_CALL    = [
  'IDENTIFIER', 'PROPERTY', 'NUMBER', 'INFINITY', 'NAN'
  'STRING', 'STRING_START', 'REGEX', 'REGEX_START', 'JS'
  'NEW', 'PARAM_START', 'CLASS', 'IF', 'TRY', 'SWITCH', 'THIS'
  'UNDEFINED', 'NULL', 'BOOL'
  'UNARY', 'YIELD', 'UNARY_MATH', 'SUPER', 'THROW'
  '@', '->', '=>', '[', '(', '{', '--', '++'
]

IMPLICIT_UNSPACED_CALL = ['+', '-']

Tokens that always mark the end of an implicit call for single-liners.

IMPLICIT_END     = ['POST_IF', 'FOR', 'WHILE', 'UNTIL', 'WHEN', 'BY',
  'LOOP', 'TERMINATOR']

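Single-line flavors of block expressions that have unclosed endings. The grammar can't disambiguate them, so we insert the implicit indentation.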

SINGLE_LINERS    = ['ELSE', '->', '=>', 'TRY', 'FINALLY', 'THEN']
SINGLE_CLOSERS   = ['TERMINATOR', 'CATCH', 'FINALLY', 'ELSE', 'OUTDENT', 'LEADING_WHEN']

Tokens that end a line.

LINEBREAKS       = ['TERMINATOR', 'INDENT', 'OUTDENT']

Tokens that close open calls when they follow a newline.

CALL_CLOSERS     = ['.', '?.', '::', '?::']
