How to write an XQuery FLWOR expression to calculate the probability between words?

Views: 28  Posted: 2021/10/2 19:57:33  Tags: xml xslt xquery

Problem description

I'm writing an XQuery FLWOR expression to return all occurrences of the target word 'we' in an XML file, together with the word that comes next in the sentence in each case. I want to calculate the probability as a ratio: the number of times the successor word appears after the target word 'we', divided by the number of times the successor word appears overall.

Here is the XML file I am working on:

<u who="PS6H7">
<s n="3">
    <w c5="AV0" hw="well" pos="ADV">Well</w>
    <c c5="PUN">, </c>
    <w c5="AJ0" hw="good" pos="ADJ">good </w>
    <w c5="NN1" hw="afternoon" pos="SUBST">afternoon</w>
    <c c5="PUN">, </c>
    <w c5="PNI" hw="everybody" pos="PRON">everybody</w>
    <c c5="PUN">, </c>
    <w c5="PNP" hw="i" pos="PRON">I </w>
    <w c5="VVB" hw="think" pos="VERB">think </w>
    <w c5="PNP" hw="we" pos="PRON">we</w>
    <w c5="VHD" hw="have" pos="VERB">'d </w>
    <w c5="AV0" hw="well" pos="ADV">better </w>
    <w c5="VVI" hw="get" pos="VERB">get </w>
    <w c5="VVN" hw="start" pos="VERB">started</w>
    <c c5="PUN">.</c>
</s>

<s n="4">
    <w c5="PNP" hw="we" pos="PRON">We </w>
    <w c5="VVD" hw="look" pos="VERB">looked </w>
    <w c5="AV0" hw="so" pos="ADV">so </w>
    <w c5="AJ0" hw="thin" pos="ADJ">thin </w>
    <w c5="PRP" hw="on" pos="PREP">on </w>
    <w c5="AT0" hw="the" pos="ART">the </w>
    <w c5="NN1" hw="ground" pos="SUBST">ground</w>
    <c c5="PUN">, </c>
    <w c5="PNP" hw="i" pos="PRON">I </w>
    <w c5="VVD" hw="think" pos="VERB">thought </w>
    <w c5="PNP" hw="we" pos="PRON">we</w>
    <w c5="VM0" hw="would" pos="VERB">'d </w>
    <w c5="VVI" hw="sit" pos="VERB">sit </w>
    <w c5="CJC" hw="and" pos="CONJ">and </w>
    <w c5="VVI" hw="wait" pos="VERB">wait </w>
    <w c5="CJC" hw="and" pos="CONJ">and </w>
    <w c5="VVI" hw="see" pos="VERB">see </w>
    <w c5="CJS" hw="if" pos="CONJ">if </w>
    <w c5="PNI" hw="everyone" pos="PRON">everyone</w>
    <w c5="VBZ" hw="be" pos="VERB">'s </w>
    <w c5="VVG-AJ0" hw="come" pos="VERB">coming</w>
    <c c5="PUN">, </c>
    <w c5="CJC" hw="but" pos="CONJ">but </w>
    <w c5="UNC" hw="erm" pos="UNC">erm </w>
    <w c5="PNP" hw="we" pos="PRON">we</w>
    <w c5="VM0" hw="will" pos="VERB">'ll </w>
    <w c5="VHI" hw="have" pos="VERB">have </w>
    <w c5="TO0" hw="to" pos="PREP">to </w>
    <w c5="VVI" hw="get" pos="VERB">get </w>
    <w c5="VVN" hw="start" pos="VERB">started </w>
    <w c5="AV0" hw="anyway" pos="ADV">anyway</w>
    <c c5="PUN">.</c>
</s>

<s n="5">
    <w c5="PNP" hw="we" pos="PRON">We</w>
    <w c5="VM0" hw="will" pos="VERB">'ll </w>
    <w c5="VVI" hw="welcome" pos="VERB">welcome</w>
    <c c5="PUN">, </c>
    <w c5="PNP" hw="we" pos="PRON">we </w>
    <w c5="VHB" hw="have" pos="VERB">have </w>
    <w c5="CRD" hw="two" pos="ADJ">two </w>
    <w c5="NN2" hw="speaker" pos="SUBST">speakers</w>
    <c c5="PUN">, </c>
    <w c5="NP0" hw="mr" pos="SUBST">Mr </w>
    <w c5="NP0" hw="bob" pos="SUBST">Bob </w>
    <w c5="NP0" hw="plumtree" pos="SUBST">Plumtree</w>
    <c c5="PUN">, </c>
    <w c5="CJC" hw="and" pos="CONJ">and </w>
    <w c5="NP0" hw="ms" pos="SUBST">Ms </w>
    <w c5="NP0" hw="erica" pos="SUBST">Erica </w>
    <w c5="NP0" hw="ison" pos="SUBST">Ison</w>
    <c c5="PUN">.</c>
</s>

<s n="6">
    <w c5="PNP" hw="we" pos="PRON">We </w>
    <w c5="VVD" hw="ask" pos="VERB">asked </w>
    <w c5="PNP" hw="they" pos="PRON">them </w>
    <w c5="PRP" hw="to" pos="PREP">to </w>
    <w c5="AT0" hw="the" pos="ART">the </w>
    <w c5="NN1" hw="meeting" pos="SUBST">meeting </w>
    <w c5="CJC" hw="and" pos="CONJ">and </w>
    <w c5="PNP" hw="we" pos="PRON">we </w>
    <w c5="VVB" hw="look" pos="VERB">look </w>
    <w c5="AV0" hw="forward" pos="ADV">forward </w>
    <w c5="PRP" hw="to" pos="PREP">to </w>
    <w c5="VVG-NN1" hw="listen" pos="VERB">listening </w>
    <w c5="PRP" hw="to" pos="PREP">to </w>
    <w c5="PNP" hw="you" pos="PRON">you </w>
    <w c5="AV0" hw="later" pos="ADV">later </w>
    <w c5="AVP" hw="on" pos="ADV">on </w>
    <w c5="PRP" hw="in" pos="PREP">in </w>
    <w c5="AT0" hw="the" pos="ART">the </w>
    <w c5="NN1" hw="agenda" pos="SUBST">agenda</w>
    <c c5="PUN">.</c>
</s>

<s n="7">
    <w c5="AT0" hw="the" pos="ART">The </w>
    <w c5="NN2" hw="minute" pos="SUBST">minutes </w>
    <w c5="PRF" hw="of" pos="PREP">of </w>
    <w c5="AT0" hw="the" pos="ART">the </w>
    <w c5="NN1" hw="meeting" pos="SUBST">meeting </w>
    <w c5="VVD-VVN" hw="hold" pos="VERB">held </w>
    <w c5="PRP" hw="in" pos="PREP">in </w>
    <w c5="NP0" hw="january" pos="SUBST">January</w>
    <c c5="PUN">.</c>
</s>

<s n="8">
    <w c5="DT0" hw="any" pos="ADJ">Any </w>
    <w c5="NN2" hw="correction" pos="SUBST">corrections </w>
    <w c5="PRP" hw="to" pos="PREP">to </w>
    <w c5="AT0" hw="the" pos="ART">the </w>
    <w c5="NN2" hw="minute" pos="SUBST">minutes </w>
    <w c5="ORD" hw="first" pos="ADJ">first</w>
    <c c5="PUN">?</c>
</s>

</u>

This is my XQuery expression. It returns every occurrence of the target word 'we' together with the word that comes after it. I can also find the frequency (the number of times the successor word occurs after the target word), but I cannot calculate the probability ratio. The formula for the probability is: (number of times the successor word appears after the target word 'we') divided by (number of times the successor word appears overall).
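Written out, the ratio described above looks like this. The symbols f, n, and r are my own notation, not part of the original question, and s stands for one successor word:

```latex
r(s) = \frac{f(\text{we}, s)}{n(s)}
\qquad
\begin{aligned}
f(\text{we}, s) &= \text{number of times } s \text{ is the first word after ``we''} \\
n(s) &= \text{number of times } s \text{ occurs anywhere in the file}
\end{aligned}
```

As a small check against the excerpt above only (not the full KS0.xml file), and comparing the raw element text: the word "have " follows "we" once (sentence 5) and occurs twice in total (sentences 4 and 5), so r("have ") = 1/2 = 0.5.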

As a result, I want an HTML table that shows the target word 'we' in the 1st column, the word that occurs after 'we' in the 2nd column, the frequency (the number of times that combination occurs) in the 3rd column, and the probability in the 4th column.

<html>
<body>
<table border='1'>
<tr><td>Target</td><td>Successor</td><td>Frequency</td><td>Probability</td></tr>

{

let $target := "we"

let $x := doc("KS0.xml")//u//s//w[lower-case(normalize-space()) = $target]

for $successor in distinct-values($x/following-sibling::w[1])

let $probability := count(doc("KS0.xml")//u//s//w)

let $frequency := $x/following-sibling::w[1][. = $successor]

order by count($frequency) descending

return <tr>
           <td>{$target}</td>
           <td>{$successor}</td>
           <td>{count($frequency)}</td>
           <td>{$probability}</td>
       </tr>
}

</table>
</body>
</html>

This is the output I get. The value in the 4th column is not the correct probability.

<?xml version="1.0" encoding="UTF-8"?>
<html>
   <body>
      <table border="1">
         <tr>
            <td>Target</td>
            <td>Successor</td>
            <td>Frequency</td>
            <td>Probability</td>
         </tr>
         <tr>
            <td>we</td>
            <td>'re </td>
            <td>44</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>'ve </td>
            <td>38</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>'ll </td>
            <td>11</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>have </td>
            <td>8</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>could </td>
            <td>7</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>have</td>
            <td>6</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>do </td>
            <td>6</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>are </td>
            <td>6</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>'d </td>
            <td>5</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>do</td>
            <td>5</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>were </td>
            <td>4</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>should </td>
            <td>4</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>see </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>will </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>going </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>had </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>shall </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>can </td>
            <td>3</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>look </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>did</td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>know </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>need </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>make </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>would </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>want </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>hope </td>
            <td>2</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>looked </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>asked </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>erm </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>talking </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>Chris</td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>aiming </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>on</td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>come </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>occasionally </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>should</td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>ought </td>
            <td>1</td>
            <td>11674</td>
         </tr>
         <tr>
            <td>we</td>
            <td>said</td>
            <td>1</td>
            <td>11674</td>
         </tr>

      </table>
   </body>
</html>
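The constant 11674 in the last column comes from `$probability` in the query above. It is bound to `count(doc("KS0.xml")//u//s//w)`, which counts every `<w>` element in the document and never looks at `$successor`. One possible fix, offered as a sketch and not a tested answer, is to divide the successor's frequency after 'we' by the number of `<w>` elements whose text equals that successor. The variable names `$words`, `$after`, and `$total` are introduced here for illustration. The sketch also normalizes successor text with `lower-case(normalize-space(.))`, which is an assumption on my part: it merges rows such as "have " and "have" that the original output lists separately.

```xquery
(: Sketch only: assumes KS0.xml has the u/s/w structure shown in the question. :)
let $target := "we"
let $words := doc("KS0.xml")//u//s//w
(: the first <w> sibling after each occurrence of the target word :)
let $after := $words[lower-case(normalize-space(.)) = $target]/following-sibling::w[1]
return
<html>
<body>
<table border="1">
<tr><td>Target</td><td>Successor</td><td>Frequency</td><td>Probability</td></tr>
{
  for $successor in distinct-values(
        for $w in $after return lower-case(normalize-space($w)))
  (: how often this successor directly follows the target :)
  let $frequency := count($after[lower-case(normalize-space(.)) = $successor])
  (: how often this successor occurs anywhere in the document :)
  let $total := count($words[lower-case(normalize-space(.)) = $successor])
  let $probability := $frequency div $total
  order by $frequency descending
  return
    <tr>
      <td>{$target}</td>
      <td>{$successor}</td>
      <td>{$frequency}</td>
      <td>{round($probability * 1000) div 1000}</td>
    </tr>
}
</table>
</body>
</html>
```

`$total` can never be zero here, because every successor is itself one of the `<w>` elements in `$words`. If the exact element text should be kept distinct, as in the original query, the `lower-case(normalize-space(...))` calls on the successor side can be replaced by plain string comparison.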
