How to extract a JavaScript function from a JavaScript file


Problem description

I need to extract an entire JavaScript function from a script file. I know the name of the function, but I don't know what the contents of the function may be. This function may be embedded within any number of closures.

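I need two output values:


  1. The entire body of the named function that I'm looking for in the input script.

  2. The full input script with the found named function removed.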

So, assume I'm looking for the findMe function in this input script:

(function() {
  function something(x,y) {
    if (x == true) {
      console.log ("Something says X is true");
      // The regex should not find this:
      console.log ("function findMe(z) { var a; }");
    }
  }
  function findMe(z) {
    if (z == true) {
      console.log ("Something says Z is true");
    }
  }
  findMe(true);
  something(false,"hello");
})();

From this, I need the following two result values:


  1. The extracted findMe script

function findMe(z) {
  if (z == true) {
    console.log ("Something says Z is true");
  }
}


  2. The input script with the findMe function removed

(function() {
  function something(x,y) {
    if (x == true) {
      console.log ("Something says X is true");
      // The regex should not find this:
      console.log ("function findMe(z) { var a; }");
    }
  }
  findMe(true);
  something(false,"hello");
})();


The problems I'm dealing with:


  1. The body of the function to find could have any valid JavaScript code within it. The code or regex that finds it must be able to ignore values in strings, multiple nested block levels, and so forth.

If the function definition to find is specified inside of a string, it should be ignored.

Any advice on how to accomplish something like this?

Update:

It looks like regex is not the right way to do this. I'm open to pointers to parsers that could help me accomplish this. I'm looking at Jison, but would love to hear about anything else.

Recommended answer

If the script is included in your page (something you weren't clear about) and the function is publicly accessible, then you can just get the source of the function with:

    functionXX.toString();
    

https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Function/toString

Other ideas:

1) Look at the open source code that does either JS minification or JS pretty indenting. In both cases, those pieces of code have to "understand" the JS language in order to do their work in a fault-tolerant way. I doubt it's going to be pure regex, as the language is just a bit more complicated than that.

2) If you control the source at the server and want to modify a particular function in it, then just insert some new JS that replaces that function at runtime with your own function. That way, you let the JS compiler identify the function for you and you just replace it with your own version.

3) For regex, here's what I've done. It is not foolproof, but it has worked for me in some build tools I use:

I run multiple passes (using regex in Python):


1. Remove all comments delimited with /* and */.
2. Remove all quoted strings.
3. Now, all that's left is non-string, non-comment JavaScript, so you should be able to regex directly on your function declaration.
4. If you need the function source with strings and comments back in, you'll have to reconstitute that from the original, now that you know where the function begins and ends.
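
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the answerer's build tool: it swaps the regex passes for a small character scanner, and the names `mask_strings_and_comments` and `extract_function` are hypothetical. It blanks out strings and comments while keeping every character offset unchanged, finds the declaration in the blanked copy, counts braces to find the end, and then slices the original text. It assumes no regex literals, no template literals, and no braces inside the parameter list.

```python
import re


def mask_strings_and_comments(src: str) -> str:
    """Return a same-length copy of src with strings and comments blanked out."""
    out = list(src)
    i, n = 0, len(src)
    while i < n:
        ch, two = src[i], src[i:i + 2]
        if two == "//":                      # line comment: runs to end of line
            j = src.find("\n", i)
            j = n if j == -1 else j
        elif two == "/*":                    # block comment: runs to closing */
            j = src.find("*/", i + 2)
            j = n if j == -1 else j + 2
        elif ch in "\"'":                    # quoted string: honour backslash escapes
            j = i + 1
            while j < n and src[j] != ch:
                j += 2 if src[j] == "\\" else 1
            j = min(j + 1, n)
        else:
            i += 1
            continue
        for k in range(i, j):                # blank the span but keep newlines
            if src[k] != "\n":
                out[k] = " "
        i = j
    return "".join(out)


def extract_function(src: str, name: str):
    """Return (function_source, script_without_function); (None, src) if not found."""
    masked = mask_strings_and_comments(src)
    m = re.search(r"\bfunction\s+" + re.escape(name) + r"\s*\(", masked)
    if m is None:
        return None, src
    depth = 0
    # Offsets in the masked copy are valid in the original, so slice src directly.
    for pos in range(masked.index("{", m.end()), len(masked)):
        if masked[pos] == "{":
            depth += 1
        elif masked[pos] == "}":
            depth -= 1
            if depth == 0:
                end = pos + 1
                return src[m.start():end], src[:m.start()] + src[end:]
    raise ValueError("unbalanced braces after declaration of " + name)
```

Calling `extract_function(script_text, "findMe")` on the script from the question should return the real declaration and leave the look-alike inside the `console.log` string untouched. For anything beyond simple inputs, a real JavaScript parser is still the safer route.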

Here are the regexes I use (expressed in Python's multi-line format):

    import re

    reStr = r"""
        (                               # capture the non-comment portion
            "(?:\\.|[^"\\])*"           # capture double quoted strings
            |
            '(?:\\.|[^'\\])*'           # capture single quoted strings
            |
            (?:[^/\n"']|/[^/*\n"'])+    # any code besides newlines or string literals
            |
            \n                          # newline
        )
        |
        (/\*  (?:[^*]|\*[^/])*   \*/)       # /* comment */
        |
        (?://(.*)$)                     # // single line comment
        $"""    
    
    reMultiStart = r"""         # start of a multiline comment that doesn't terminate on this line
        (
            /\*                 # /* 
            (
                [^\*]           # any character that is not a *
                |               # or
                \*[^/]          # * followed by something that is not a /
            )*                  # any number of these
        )
        $"""
    
    reMultiEnd = r"""           # end of a multiline comment that didn't start on this line
        (
            ^                   # start of the line
            (
                [^\*]           # any character that is not a *
                |               # or
                \*+[^/]         # one or more * followed by something that is not a /
            )*                  # any number of these
            \*/                 # followed by a */
        )
    """
    
    regExSingleKeep = re.compile("// /")                    # lines that have single lines comments that start with "// /" are single line comments we should keep
    regExMain = re.compile(reStr, re.VERBOSE)
    regExMultiStart = re.compile(reMultiStart, re.VERBOSE)
    regExMultiEnd = re.compile(reMultiEnd, re.VERBOSE)
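
The answer does not show how these patterns are applied. One plausible reading, which is an assumption and not the author's actual build script, is to run the main pattern over each line and keep only the first capture group (code and string literals), so that comment matches drop out. The helper name `strip_comments_from_line` is hypothetical, and the pattern is repeated here only so the snippet runs on its own.

```python
import re

# Same pattern as reStr above, copied so this sketch is self-contained.
RE_MAIN = re.compile(r"""
    (                               # group 1: the non-comment portion
        "(?:\\.|[^"\\])*"           # double quoted strings
        |
        '(?:\\.|[^'\\])*'           # single quoted strings
        |
        (?:[^/\n"']|/[^/*\n"'])+    # any code besides newlines or string literals
        |
        \n                          # newline
    )
    |
    (/\*  (?:[^*]|\*[^/])*   \*/)   # group 2: /* comment */ on one line
    |
    (?://(.*)$)                     # group 3: text of a // single line comment
    $""", re.VERBOSE)


def strip_comments_from_line(line: str) -> str:
    """Drop comments from one line, leaving code and string literals as they are."""
    # Comment matches leave group 1 unset, so they are replaced with nothing.
    return RE_MAIN.sub(lambda m: m.group(1) or "", line)
```

Because string literals are matched before comments, a `//` inside quotes stays in place. Block comments that span several lines are not covered by this sketch; the two multi-line patterns above exist for that case.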
    

This all sounds messy to me. You might be better off explaining what problem you're really trying to solve so folks can help find a more elegant solution to the real problem.
