Retrieve javascript comments in javascript, or, how do I parse js in js?
Problem description
I am looking for a way to access JavaScript comments from some (other) JavaScript code. I plan on using this to display low-level help information for elements on the page that call various JS functions, without duplicating that information in multiple places.
mypage.html:
    ...
    <script src="foo.js"></script>
    ...
    <span onclick="foo(bar);">clickme</span>
    <span onclick="showhelpfor('foo');">?</span>
    ...
foo.js:
    /**
     * This function does foo.
     * Call it with bar.  Yadda yadda "groo".
     */
    function foo(x)
    {
        ...
    }
I figure I can use getElementsByTagName to grab the script tag, then load the file with an AJAX request to get its plain-text content. However, I'd then need a way to parse the JavaScript reliably (i.e. not with a bunch of hacked-together regexps) that preserves the characters that simply eval'ing it would throw away.
I was thinking of simply putting the documentation after the function, in a JS string, but that's awkward, and I have a feeling that getting doxygen to pick it up will be difficult.
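The loading step described above could be sketched roughly as follows. The helper names `scriptUrls` and `loadScriptSources` are made up for illustration, and `fetch` stands in for the AJAX request; it is assumed that the scripts are readable by the page (same origin, or served with CORS headers).

```javascript
// Collect the src URLs of all external scripts.
// `doc` is any object with getElementsByTagName, normally `document`.
function scriptUrls(doc) {
  const urls = [];
  const scripts = doc.getElementsByTagName('script');
  for (let i = 0; i < scripts.length; i++) {
    const src = scripts[i].getAttribute('src');
    if (src) urls.push(src); // inline scripts have no src attribute
  }
  return urls;
}

// Fetch every external script as plain text.
// Resolves to an object mapping each URL to its source text.
// `fetchText` is an optional (hypothetical) hook so the loader can be
// swapped out, e.g. for XMLHttpRequest or for testing.
async function loadScriptSources(doc, fetchText) {
  fetchText = fetchText || (url => fetch(url).then(res => res.text()));
  const sources = {};
  for (const url of scriptUrls(doc)) {
    sources[url] = await fetchText(url);
  }
  return sources;
}
```

In a page this would be called as `loadScriptSources(document)`, and the resulting text is what still has to be parsed for comments.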
    function foo(x) { ... }
    foo.comment = "\
    This function does foo.\
    Call it with bar.  Yadda yadda \"groo\".\
    ";
Solution
You could create a little parser that does not parse the complete JS language, but only matches string literals, single- and multi-line comments, and of course functions.
There's a JS parser generator called PEG.js that could do this fairly easily. The grammar could look like this:
    {
      var functions = {};
      var buffer = '';
    }

    start
      =  unit* {return functions;}

    unit
      =  func
      /  string
      /  multi_line_comment
      /  single_line_comment
      /  any_char

    func
      =  m:multi_line_comment spaces? "function" spaces id:identifier {functions[id] = m;}
      /  "function" spaces id:identifier                              {functions[id] = null;}

    multi_line_comment
      =  "/*"
         ( !{return buffer.match(/\*\//)} c:. {buffer += c;} )*
         {
           var temp = buffer;
           buffer = '';
           return "/*" + temp.replace(/\s+/g, ' ');
         }

    single_line_comment
      =  "//" [^\r\n]*

    identifier
      =  a:([a-z] / [A-Z] / "_") b:([a-z] / [A-Z] / [0-9] /"_")* {return a + b.join("");}

    spaces
      =  [ \t\r\n]+ {return "";}

    string
      =  "\"" ("\\" . / [^"])* "\""
      /  "'" ("\\" . / [^'])* "'"

    any_char
      =  .
When you parse the following source with the generated parser:
    /**
     * This function does foo.
     * Call it with bar.  Yadda yadda "groo".
     */
    function foo(x)
    {
        ...
    }

    var s = " /* ... */ function notAFunction() {} ... ";

    // function alsoNotAFunction()
    // { ... }

    function withoutMultiLineComment() {
    }

    var t = ' /* ... */ function notAFunction() {} ... ';

    /**
     * BAR!
     * Call it?
     */
                function doc_way_above(x, y, z) {
        ...
    }

    // function done(){};
the start() function of the parser returns the following map:
    {
       "foo": "/** * This function does foo. * Call it with bar. Yadda yadda \"groo\". */",
       "withoutMultiLineComment": null,
       "doc_way_above": "/** * BAR! * Call it? */"
    }
I realize there are some gaps to be filled (like this.id = function() { ... }), but after reading the docs from PEG.js a bit, that shouldn't be a big problem (assuming you know a little about parser generators). If it is a problem, post back and I'll add it to the grammar and explain a bit about what's happening in the grammar.
You can even test the grammar posted above online!
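For readers who want to try the idea without a parser generator, here is a minimal sketch of the same approach in plain JavaScript. It is not the PEG.js grammar and not generated by PEG.js: it swaps the grammar for a single tokenizing regular expression whose alternatives are tried in the same order as the `unit` rule (function, string, multi-line comment, single-line comment), with every other character skipped. The name `extractDocComments` is hypothetical, and the sketch shares the grammar's gaps (no `this.id = function() { ... }`, no regex or template literals).

```javascript
// Hypothetical helper, not part of PEG.js: maps each function name to the
// block comment directly in front of it, or null when there is none.
function extractDocComments(source) {
  // A block comment that cannot run past its first closing "*/".
  const block = String.raw`/\*(?:(?!\*/)[\s\S])*\*/`;
  const ident = '[A-Za-z_][A-Za-z0-9_]*';

  // Ordered alternatives: the first one that matches at a position wins,
  // so text inside strings and comments is consumed before "function"
  // can be seen there.
  const token = new RegExp(
    [
      String.raw`(?<doc>${block})\s*function\s+(?<documented>${ident})`,
      String.raw`\bfunction\s+(?<bare>${ident})`,
      String.raw`"(?:\\[\s\S]|[^"\\])*"`, // double-quoted string
      String.raw`'(?:\\[\s\S]|[^'\\])*'`, // single-quoted string
      block,                              // comment not followed by a function
      String.raw`//[^\r\n]*`              // single-line comment
    ].join('|'),
    'g'
  );

  const functions = {};
  for (const match of source.matchAll(token)) {
    const { doc, documented, bare } = match.groups;
    if (documented) {
      // Collapse whitespace runs, as the grammar's action does.
      functions[documented] = doc.replace(/\s+/g, ' ');
    } else if (bare) {
      functions[bare] = null;
    }
  }
  return functions;
}
```

Called on the sample source above, this should produce the same three-entry map; a `showhelpfor(name)` handler could then look up `name` in that map.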