How to parse a small subset of Markdown into React components?

Question

I have a very small subset of Markdown along with some custom html that I would like to parse into React components. For example, I would like to turn this following string:

hello *asdf* *how* _are_ you !doing! today

into the following array:

[ "hello ", <strong>asdf</strong>, " ", <strong>how</strong>, " ", <em>are</em>, " you ", <MyComponent onClick={this.action}>doing</MyComponent>, " today" ]

and then return it from a React render function (React will render the array properly as formatted HTML)

Basically, I want to give users the option to use a very limited set of Markdown to turn their text into styled components (and in some cases my own components!)

It is unwise to dangerouslySetInnerHTML, and I do not want to bring in an external dependency, because they are all very heavy, and I only need very basic functionality.

I'm currently doing something like this, but it is very brittle, and doesn't work for all cases. I was wondering if there were a better way:

function matchStrong(result, i) {
  let match = result[i].match(/(^|[^\\])\*(.*)\*/);
  if (match) { result[i] = <strong key={"ms" + i}>{match[2]}</strong>; }
  return match;
}

function matchItalics(result, i) {
  let match = result[i].match(/(^|[^\\])_(.*)_/); // Ignores \_asdf_ but not _asdf_
  if (match) { result[i] = <em key={"mi" + i}>{match[2]}</em>; }
  return match;
}

function matchCode(result, i) {
  let match = result[i].match(/(^|[^\\])```\n?([\s\S]+)\n?```/);
  if (match) { result[i] = <code key={"mc" + i}>{match[2]}</code>; }
  return match;
}

// Very brittle and inefficient
export function convertMarkdownToComponents(message) {
  let result = message.match(/(\\?([!*_`+-]{1,3})([\s\S]+?)\2)|\s|([^\\!*_`+-]+)/g);

  if (result == null) { return message; }

  for (let i = 0; i < result.length; i++) {
    if (matchCode(result, i)) { continue; }
    if (matchStrong(result, i)) { continue; }
    if (matchItalics(result, i)) { continue; }
  }

  return result;
}

This question grew out of a previous question of mine.

Recommended answer

How does it work?

It works by reading a string chunk by chunk, which might not be the best solution for really long strings.

Whenever the parser detects a critical chunk is being read, i.e. '*' or any other markdown tag, it starts parsing chunks of this element until the parser finds its closing tag.
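
The chunk-by-chunk idea can be sketched without React at all. The following is a minimal illustration, not the answer's actual implementation: the `TAGS` table and the `tokenize` function are hypothetical names, escape characters are ignored, and a foreign symbol inside an open tag is kept as content (the full code further down drops it instead).

```javascript
// Hypothetical tag table for this sketch: symbol -> element name.
const TAGS = new Map([
  ["*", "strong"],
  ["_", "em"],
]);

// Walks the source one character at a time and returns a flat list of
// { tag, text } tokens, where tag is null for plain text.
function tokenize(source) {
  const tokens = [];
  let openTag = null; // element name of the tag being parsed, if any
  let text = "";

  for (const char of source) {
    const tag = TAGS.get(char);

    if (tag === undefined || (openTag !== null && tag !== openTag)) {
      // Ordinary character, or a different symbol inside an open tag:
      // keep it as content.
      text += char;
      continue;
    }

    // Opening or closing symbol: flush what has been collected so far.
    if (text !== "") tokens.push({ tag: openTag, text });
    text = "";
    openTag = openTag === null ? tag : null;
  }

  // Content of an unterminated tag falls back to plain text
  // (the opening symbol itself is dropped).
  if (text !== "") tokens.push({ tag: null, text });
  return tokens;
}

console.log(tokenize("hello *asdf* _are_ you"));
```

Each token could then be mapped to a string or to `React.createElement(token.tag, null, token.text)`, which is roughly what the full code does while parsing.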

It works on multi-line strings, see the code for example.

You haven't specified, or I may have misunderstood your needs, whether tags that are both bold and italic need to be parsed; my current solution might not work in that case.

If you need, however, to work with the above conditions just comment here and I'll tweak the code.

Tags are no longer hardcoded; instead they live in a map that you can easily extend to fit your needs.
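
As a rough illustration of what extending the map looks like, the sketch below adds one more symbol. The `"~"` to `"del"` mapping is a made-up example, not part of the answer's code; the Map shape mirrors the `tags` and `tagSymbols` Maps used further down.

```javascript
// Same shape as the answer's `tags` Map: markdown symbol -> element name.
const tags = new Map(Object.entries({
  "*": "strong",
  "_": "em",
  "!": "button",
}));

// Hypothetical extension: "~" for strikethrough. No parser logic changes,
// since the parser only asks tags.has(symbol) and tags.get(symbol).
tags.set("~", "del");

// Reverse lookup (element name -> symbol), rebuilt after extending.
const tagSymbols = new Map([...tags].map(([symbol, name]) => [name, symbol]));

console.log(tags.get("~"));         // "del"
console.log(tagSymbols.get("del")); // "~"
```

Because `React.createElement` accepts a component as well as a tag-name string, a value in the map could in principle be a component such as the question's `MyComponent`; that variation is not tested here.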

Fixed the bugs you mentioned in the comments, thanks for pointing out these issues =p

Though the method parseMarkdown does not yet support multi-length tags, we can easily replace those multi-length tags with a simple string.replace when sending our rawMarkdown prop.

To see an example of this in practice, look at the ReactDOM.render, located at the end of the code.
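
If more than one multi-length tag is needed, the same replacement trick could be driven by a small table. This is only a sketch: `MULTI_LENGTH_TAGS` and `collapseMultiLengthTags` are hypothetical names, and the `"**"` entry with its `"\uFFFE"` placeholder is an assumption added for illustration (only the ``` to `"\uFFFF"` pair comes from the answer).

```javascript
// Hypothetical table: multi-character markdown tag -> single placeholder
// character. Longer tags should come before shorter ones they contain.
const MULTI_LENGTH_TAGS = [
  ["```", "\uFFFF"], // from the answer
  ["**", "\uFFFE"],  // assumption, for illustration only
];

// Replaces every multi-length tag with its placeholder so that a
// character-by-character parser only ever sees single-character tags.
// split/join is used so the tags need no regex escaping.
function collapseMultiLengthTags(source) {
  return MULTI_LENGTH_TAGS.reduce(
    (text, [tag, placeholder]) => text.split(tag).join(placeholder),
    source,
  );
}

const collapsed = collapseMultiLengthTags("```code``` and **bold**");
console.log(collapsed === "\uFFFFcode\uFFFF and \uFFFEbold\uFFFE"); // true
```

Each placeholder would also need a matching entry in the `tags` Map, in the same way the answer maps `"\uFFFF"` to `"pre"`.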

Even if your application supports multiple languages, there are code points that are not valid characters yet JavaScript still handles them. For example, "\uFFFF" is a Unicode noncharacter, but JS can still compare it ("\uFFFF" === "\uFFFF" evaluates to true).
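
One cautious way to use such a placeholder is to reject input that already contains it, since that input could no longer be told apart from a real code fence. The sketch below assumes this guard is wanted; `PLACEHOLDER` and `replaceCodeFences` are hypothetical names, not part of the answer's code.

```javascript
const PLACEHOLDER = "\uFFFF"; // U+FFFF, a Unicode noncharacter

// Swaps ``` for the placeholder, refusing input that already contains it.
function replaceCodeFences(source) {
  if (source.includes(PLACEHOLDER)) {
    throw new Error("Input already contains the placeholder character");
  }
  return source.replace(/```/g, PLACEHOLDER);
}

console.log(PLACEHOLDER === "\uFFFF"); // true
console.log(PLACEHOLDER.length);       // 1
console.log(replaceCodeFences("```x```") === "\uFFFFx\uFFFF"); // true
```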

It might seem hack-y at first but, depending on your use case, I don't see any major issues with this route.

Well, we could easily track the last N (where N corresponds to the length of the longest multi-length tag) chunks.

The loop inside parseMarkdown would need some tweaks: it would check whether the current chunk is part of a multi-length tag and, if it is, use it as a tag; otherwise, in cases like ``k, we'd need to mark it as notMultiLength or something similar and push that chunk as content.
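
A possible shape for such a loop is sketched below. Note the substitution: instead of tracking the previous N chunks, it looks ahead from the current position with `startsWith`, trying the longest symbols first, which reaches the same decision for input like ``k. The `TAGS` table, `symbolAt`, and `tokenize` are hypothetical names, and escape handling is left out.

```javascript
// Hypothetical tag table mixing single- and multi-length symbols.
const TAGS = new Map([
  ["```", "pre"],
  ["*", "strong"],
  ["_", "em"],
]);

// Symbols sorted longest first, so "```" is tried before shorter symbols.
const SYMBOLS = [...TAGS.keys()].sort((a, b) => b.length - a.length);

// Returns the tag symbol that starts at position i, or null if none does.
function symbolAt(source, i) {
  return SYMBOLS.find((symbol) => source.startsWith(symbol, i)) ?? null;
}

// Splits the source into { tag, text } tokens; tag is null for plain text.
function tokenize(source) {
  const tokens = [];
  let openSymbol = null; // symbol of the tag currently being parsed
  let text = "";
  let i = 0;

  while (i < source.length) {
    const symbol = symbolAt(source, i);

    if (symbol === null || (openSymbol !== null && symbol !== openSymbol)) {
      // Not a tag (e.g. the two backticks in "``k"), or a foreign symbol
      // inside an open tag: treat one character as content.
      text += source[i];
      i += 1;
      continue;
    }

    // Opening or closing symbol: flush the collected content.
    if (text !== "") {
      tokens.push({ tag: openSymbol === null ? null : TAGS.get(openSymbol), text });
    }
    text = "";
    openSymbol = openSymbol === null ? symbol : null;
    i += symbol.length; // skip the whole symbol, however long it is
  }

  // Content of an unterminated tag falls back to plain text.
  if (text !== "") tokens.push({ tag: null, text });
  return tokens;
}

console.log(tokenize("``k ```code```"));
```

With this approach no placeholder character is needed, at the cost of a slightly more involved loop.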

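// Outside an environment that provides React and ReactDOM as globals
// (such as the CodePen linked below), these imports are needed.
import React from "react";
import ReactDOM from "react-dom";

// Instead of creating hardcoded variables, we can make the code more extendable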
// by storing all the possible tags we'll work with in a Map. Thus, creating
// more tags will not require additional logic in our code.
const tags = new Map(Object.entries({
  "*": "strong", // bold
  "!": "button", // action
  "_": "em", // emphasis
  "\uFFFF": "pre", // Just use a very unlikely to happen unicode character,
                   // We'll replace our multi-length symbols with that one.
}));
// Might be useful if we need to discover the symbol of a tag
const tagSymbols = new Map();
tags.forEach((v, k) => { tagSymbols.set(v, k); });

const rawMarkdown = `
  This must be *bold*,

  This also must be *bo_ld*,

  this _entire block must be
  emphasized even if it's comprised of multiple lines_,

  This is an !action! it should be a button,

  \`\`\`
beep, boop, this is code
  \`\`\`

  This is an asterisk\\*
`;

class App extends React.Component {
  parseMarkdown(source) {
    let currentTag = "";
    let currentContent = "";

    const parsedMarkdown = [];

    // We create this variable to track possible escape characters, eg. "\"
    let before = "";

    const pushContent = (
      content,
      tagValue,
      props,
    ) => {
      let children = undefined;

      // There's the need to parse for empty lines
      if (content.indexOf("\n\n") >= 0) {
        let before = "";
        const contentJSX = [];

        let chunk = "";
        for (let i = 0; i < content.length; i++) {
          if (i !== 0) before = content[i - 1];

          chunk += content[i];

          if (before === "\n" && content[i] === "\n") {
            contentJSX.push(chunk);
            contentJSX.push(<br />);
            chunk = "";
          }

          if (chunk !== "" && i === content.length - 1) {
            contentJSX.push(chunk);
          }
        }

        children = contentJSX;
      } else {
        children = [content];
      }
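      // Spread the children so React treats them as static children (no
      // "missing key" warning), and key each top-level element by its
      // position in the output array.
      parsedMarkdown.push(
        React.createElement(
          tagValue,
          { key: parsedMarkdown.length, ...props },
          ...children,
        ),
      );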
    };

    for (let i = 0; i < source.length; i++) {
      const chunk = source[i];
      if (i !== 0) {
        before = source[i - 1];
      }

      // Does our current chunk need to be treated as an escaped char?
      const escaped = before === "\\";

      // Detect if we need to start/finish parsing our tags

      // We are not parsing anything, however, that could change at current
      // chunk
      if (currentTag === "" && escaped === false) {
        // If our tags Map has the chunk, this means a markdown tag has
        // just been found. We'll change our current state to reflect this.
        if (tags.has(chunk)) {
          currentTag = tags.get(chunk);

          // We have simple content to push
          if (currentContent !== "") {
            pushContent(currentContent, "span");
          }

          currentContent = "";
        }
      } else if (currentTag !== "" && escaped === false) {
        // We'll look if we can finish parsing our tag
        if (tags.has(chunk)) {
          const symbolValue = tags.get(chunk);

          // Just because the current chunk is a symbol it doesn't mean we
          // can already finish our currentTag.
          //
          // We'll need to see if the symbol's value corresponds to the
          // value of our currentTag. In case it does, we'll finish parsing it.
          if (symbolValue === currentTag) {
            pushContent(
              currentContent,
              currentTag,
              undefined, // you could pass props here
            );

            currentTag = "";
            currentContent = "";
          }
        }
      }

      // Increment our currentContent
      //
      // Ideally, we don't want our rendered markdown to contain any '\'
      // or undesired '*' or '_' or '!'.
      //
      // Users can still escape '*', '_', '!' by prefixing them with '\'
      if (tags.has(chunk) === false || escaped) {
        if (chunk !== "\\" || escaped) {
          currentContent += chunk;
        }
      }

      // In case an erroneous tag (i.e. an unfinished one) is present and
      // we've reached the end of our source (rawMarkdown), we want to make
      // sure all our currentContent is pushed as plain content
      if (currentContent !== "" && i === source.length - 1) {
        pushContent(
          currentContent,
          "span",
          undefined,
        );
      }
    }

    return parsedMarkdown;
  }

  render() {
    return (
      <div className="App">
        <div>{this.parseMarkdown(this.props.rawMarkdown)}</div>
      </div>
    );
  }
}

ReactDOM.render(<App rawMarkdown={rawMarkdown.replace(/```/g, "\uFFFF")} />, document.getElementById('app'));

Link to the code (TypeScript) https://codepen.io/ludanin/pen/GRgNWPv

Link to the code (vanilla/babel) https://codepen.io/ludanin/pen/eYmBvXw
