PageDown through ScriptEngine incorrectly parsing Markdown


Question

I am trying to use PageDown on the client side as an editor, and on the server side to then parse that Markdown to HTML.

It seems to work fine on the client side, but on the server side, backticks only "codify" the single character that follows them, not the word they wrap. So if I do this:

test `test` test

I expect this, and this is indeed what I get on the client side:

test <code>test</code> test

But on the server side, I end up getting this instead:

test <code>t</code>est<code></code> test

I've created a file called pageDown.js, which is simply Markdown.Converter.js and Markdown.Sanitizer.js combined into a single file, with this function added:

function getSanitizedHtml(pagedown){
    var converter =  new Markdown.getSanitizingConverter();
    return converter.makeHtml(pagedown);
}

On the client side, I can use this file like so:

<!DOCTYPE html>
<html>
<head>
<script src="pageDown.js"></script>
<script>
function convert(){

    var html = getSanitizedHtml("test `test` test");

    console.log(html);

    document.getElementById("content").innerHTML = html;
}

</script>
</head>

<body onload="convert()">
<p id="content"></p>
</body>
</html>

This correctly displays: <p>test <code>test</code> test</p>

On the (Java) server side, I use this same exact file, through Java's ScriptEngineManager and Invocable:

import java.io.InputStreamReader;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class PageDownTest{

    public static void main(String... args){

        try{
            ScriptEngineManager manager = new ScriptEngineManager();
            ScriptEngine engine = manager.getEngineByName("JavaScript");
            engine.eval(new InputStreamReader(PageDownTest.class.getResourceAsStream("pageDown.js")));
            Invocable inv = (Invocable) engine;
            String s = String.valueOf(inv.invokeFunction("getSanitizedHtml", "test `test` test"));
            System.out.println(s);
        }
        catch(Exception e){
            e.printStackTrace();
        }
    }
}

That program prints out: <p>test <code>t</code>est<code></code> test</p>

I see similar problems with other markdown: test **test** test simply ignores the ** part. However, ##test correctly returns as <h2>test</h2>.

This all works fine if I go to the JavaScript directly through HTML, but not when I go through Java. What's going on here? Should I be handling Markdown on the server differently?

Recommended answer

I managed to reduce the problem to the following code:

function getSanitizedHtml(text)
{
    return text.replace(/(a)(?!b)\1/gm, 'c');
}

When called in a browser as

getSanitizedHtml('aa');

it returns:

c

When called from the Nashorn engine as

String s = String.valueOf(inv.invokeFunction("getSanitizedHtml", "aa"));

it returns:

cc

To me, this looks like the backreference \1, which should point to (a), instead points to (?!b), whose captured content is zero-length and thus matches anything.

The equivalent code in Java:

System.out.println(("aa").replaceAll("(a)(?!b)\\1", "c"));

returns the correct result:

c
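The reduced test case above can be turned into a small self-check that runs in any JavaScript environment before PageDown is loaded. This is only a sketch: the function names hasLookaheadBackrefBug and describeMatch are made up for illustration and are not part of PageDown or Nashorn. It relies solely on the pattern from the reduced example and on the fact that a lookahead is not a capturing group, so \1 should refer to (a).

```javascript
// Hypothetical helper (not part of PageDown): returns true when the engine
// mishandles a backreference that follows a negative lookahead, which is
// the behaviour described above for Nashorn.
function hasLookaheadBackrefBug() {
    // A conforming engine matches "aa" as a whole and replaces it with "c".
    return 'aa'.replace(/(a)(?!b)\1/gm, 'c') !== 'c';
}

// Hypothetical helper: returns the match array as a plain array.
// Lookaheads do not capture, so a conforming engine reports the full
// match plus exactly one group.
function describeMatch() {
    var m = /(a)(?!b)\1/.exec('aa');
    return m === null ? null : Array.prototype.slice.call(m);
}

// Log only where a console exists (Nashorn does not define one by default).
if (typeof console !== 'undefined') {
    console.log('lookahead/backreference bug present:', hasLookaheadBackrefBug());
    console.log('match groups:', describeMatch());
}
```

Calling hasLookaheadBackrefBug() through Invocable.invokeFunction, the same way getSanitizedHtml is called in the question, would be one way to decide at startup whether the embedded engine can be trusted with PageDown's regular expressions.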



Conclusion

I'm pretty sure this is a bug in the Nashorn engine.
I filed a bug report and will post its ID here, if it goes public.

As for your problem, I think your only option is to switch to a different JavaScript environment, at least temporarily.

JS in a browser:

function x(s){return s.replace(/(a)(?!b)\1/gm, 'c');}
document.write(x('aa'));

JS in Nashorn engine:

[ Ideone ]

Pure Java:

[ Ideone ]

As already pointed out, your only option (at this point) is to switch to another JavaScript environment.
There are many of those available, and Wikipedia has a comparison page. For this example, I've chosen io.js (I trust you'll manage to install it on your own).

If you want to use your pageDown.js file, you'll first need to comment out the exports checks and use the plain old variables, like this:

/*if (typeof exports === "object" && typeof require === "function") // we're in a CommonJS (e.g. Node.js) module
    Markdown = exports;
else*/
    Markdown = {};

/*if (typeof exports === "object" && typeof require === "function") { // we're in a CommonJS (e.g. Node.js) module
    output = exports;
    Converter = require("./Markdown.Converter").Converter;
} else {*/
    output = Markdown;
    Converter = output.Converter;
//}

(Note that I also changed output = window.Markdown; to output = Markdown; - you must have done the same (Nashorn would have given you an error otherwise), but just forgot to mention that in your question.)

Alternatively, you could of course use the exports system and separate files, but I have no experience with that, so I'll do it this way.

Now, io.js accepts JavaScript code from stdin, and you can write to stdout via process.stdout.write(), so we can do the following (on the command line):

{ cat pageDown.js; echo 'process.stdout.write(getSanitizedHtml("test `test` test"));'; } | iojs;

We get the following back:

<p>test <code>test</code> test</p>

If you need to do that from Java, you can do it like this:

import java.io.*;

class Test
{
    public static void main(String[] args) throws Exception
    {
        Process p = Runtime.getRuntime().exec("/path/to/iojs");
        OutputStream stdin = p.getOutputStream();
        InputStream stdout = p.getInputStream();
        File file = new File("/path/to/pageDown.js");
        byte[] b = new byte[(int)file.length()];
        FileInputStream in = new FileInputStream(file);
        for(int read = 0; read < b.length; read += in.read(b, read, b.length - read)); // <-- note the semicolon
        stdin.write(b);
        stdin.write("process.stdout.write(getSanitizedHtml('test `test` test'));".getBytes());
        stdin.close(); // <-- important to close
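        p.waitFor();
        // available() reports only the bytes already buffered; that is enough for
        // this short output, but larger output should be read in a loop instead.
        b = new byte[stdout.available()];
        stdout.read(b);
        System.out.println(new String(b));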
    }
}

Note the semicolon directly after the for: the loop body is empty, so each iteration does nothing but read += in.read(b, read, b.length - read). Also note that stdin.close() has to be called here. Java does not close a stream just because the variable goes out of scope, and until the child's stdin is closed, iojs will keep waiting for input and p.waitFor() will never return.
