XSLT tokenize - capturing the separators

Views: 32 | Published: 2021/9/8 20:23:08 | Tags: xslt, tokenize, separator

Problem description

Here is a piece of XSL code that tokenizes a text into fragments separated by punctuation and similar characters. Is there a way to capture the strings by which the text was split, for example the comma or the period?

<xsl:stylesheet version="2.0" exclude-result-prefixes="xs xdt err fn" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fn="http://www.w3.org/2005/xpath-functions" xmlns:err="http://www.w3.org/2005/xqt-errors" xmlns:xdt="http://www.w3.org/2005/xpath-datatypes">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="GENERUJ">
    <TEXT>
        <xsl:variable name="text">
            <xsl:value-of select="normalize-space(unparsed-text(@filename, 'UTF-8'))" disable-output-escaping="yes"/>
        </xsl:variable>
        <xsl:for-each select="tokenize($text, '(\s+(&quot;|\(|\[|\{))|((&quot;|,|;|:|\s\-|\)|\]|\})\s+)|((\.|\?|!|;)&quot;?\s*)' )">
            <xsl:choose>
                <xsl:when test="string-length(.)&gt;0">
                    <FRAGMENT>
                        <CONTENT>
                            <xsl:value-of select="."/>
                        </CONTENT>
                        <LENGTH>
                            <xsl:value-of select="string-length(.)"/>
                        </LENGTH>
                    </FRAGMENT>
                </xsl:when>
                <xsl:otherwise>
                    <FRAGMENT_COUNT>
                        <xsl:value-of select="last()-1"/>
                    </FRAGMENT_COUNT>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each>
    </TEXT>
</xsl:template>
</xsl:stylesheet>

As you can see, each fragment is output with the constructed tags CONTENT and LENGTH. I'd like to add one more, called SEPARATOR, holding the separator that was matched. I couldn't find an answer to this on the internet, and I'm just a beginner with XSL transformations, so I'm looking for a quick solution. Thank you in advance.
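One possible direction, offered as a sketch rather than a confirmed answer: `tokenize()` discards the text it splits on, whereas the XSLT 2.0 instruction `xsl:analyze-string` visits both the matching and the non-matching substrings. The stylesheet below reuses the regular expression, the `GENERUJ` match and the `@filename` attribute from the question. Emitting SEPARATOR as a sibling of FRAGMENT is an assumption about the desired output shape, the `$sep` variable name is invented for illustration, and the FRAGMENT_COUNT element is left out.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Same pattern as in the tokenize() call. It is kept in a variable
       (hypothetical name) because the regex attribute below is an attribute
       value template, where literal { and } would need to be doubled. -->
  <xsl:variable name="sep"
      select="'(\s+(&quot;|\(|\[|\{))|((&quot;|,|;|:|\s\-|\)|\]|\})\s+)|((\.|\?|!|;)&quot;?\s*)'"/>

  <xsl:template match="GENERUJ">
    <TEXT>
      <xsl:analyze-string
          select="normalize-space(unparsed-text(@filename, 'UTF-8'))"
          regex="{$sep}">
        <!-- Runs once for every piece of text that matched the pattern. -->
        <xsl:matching-substring>
          <SEPARATOR>
            <xsl:value-of select="."/>
          </SEPARATOR>
        </xsl:matching-substring>
        <!-- Runs once for every piece of text between two matches. -->
        <xsl:non-matching-substring>
          <FRAGMENT>
            <CONTENT>
              <xsl:value-of select="."/>
            </CONTENT>
            <LENGTH>
              <xsl:value-of select="string-length(.)"/>
            </LENGTH>
          </FRAGMENT>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </TEXT>
  </xsl:template>
</xsl:stylesheet>
```

With this layout the separator includes any whitespace the pattern consumed, so a comma followed by a space is reported as one SEPARATOR value. Unlike `tokenize()`, `xsl:analyze-string` does not hand empty non-matching substrings to the template, so the `string-length(.) > 0` test from the original loop is not needed here.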
