How do Markov Chain Chatbots work?


Question

I was thinking of creating a chatbot using something like Markov chains, but I'm not entirely sure how to get it to work. From what I understand, you create a table from data that maps a given word to the words which follow it. Is it possible to attach any sort of probability or counter while training the bot? Is that even a good idea?

The second part of the problem is with keywords. Assuming I can already identify keywords from user input, how do I generate a sentence which uses that keyword? I don't always want to start the sentence with the keyword, so how do I seed the Markov chain?

Recommended answer

I made a Markov chain chatbot for IRC in Python a few years back and can shed some light on how I did it. The generated text does not necessarily make any sense, but it can be really fun to read. Let's break it down in steps, assuming you have a fixed input such as a text file (you can use input from chat text or lyrics, or just use your imagination).

Loop through the text and build a dictionary, that is, a key-value container. Use every pair of adjacent words as a key and the word that follows the pair as a value. For example, given the text "a b c a b k", you start with "a b" as the key and "c" as the value, then "b c" as the key and "a" as the value, and so on. The value should be a list or any collection holding 0..many items, because a given pair of words can have more than one follower. In the example above, "a b" occurs twice, followed first by "c" and at the end by "k". So in the end you will have a dictionary/hash looking like this: {'a b': ['c','k'], 'b c': ['a'], 'c a': ['b']}
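The table-building step can be sketched in a few lines of Python. This is a minimal sketch rather than the original bot's code; the function name build_table is a placeholder chosen for illustration, and splitting on whitespace is an assumption about how words are tokenized.

```python
from collections import defaultdict


def build_table(text):
    """Map each pair of adjacent words to the list of words seen right after it.

    build_table is a hypothetical helper name; whitespace splitting is an
    assumed tokenization, not something the answer prescribes.
    """
    words = text.split()
    table = defaultdict(list)
    # Slide a window of three words over the text: two for the key, one for the value.
    for first, second, following in zip(words, words[1:], words[2:]):
        table[f"{first} {second}"].append(following)
    return dict(table)


table = build_table("a b c a b k")
print(table)
```

Because followers are stored in a list, a word that follows the same pair several times appears several times in that list. Picking uniformly from the list later is then already weighted by frequency, which is one simple way to get the "counter" the question asks about without storing explicit probabilities.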

Now you have the structure needed for building your funky text. You can choose to start with a random key or a fixed place. Given the structure we have, we can start by saving "a b" and then randomly taking a following word from its values, "c" or "k". The first save in the loop is then "a b k" (if "k" was the random value chosen). You continue by moving one step to the right, which in our case gives the key "b k", and save a random value for that pair if it has one. In our case it has none, so you break out of the loop (or you can decide to do something else, like starting over). When the loop is done, you print your saved text string.
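The generation loop might look like the sketch below. The function name generate and the max_words cap are assumptions added for illustration; the cap plays the role of the "other stop condition", since the example table contains a cycle (a b c a b c ...) that could otherwise run for a long time.

```python
import random

# Table for the "a b c a b k" example, written out by hand.
table = {"a b": ["c", "k"], "b c": ["a"], "c a": ["b"]}


def generate(table, start_key, max_words=20, rng=random):
    """Walk the chain from start_key until a key has no followers.

    generate and max_words are hypothetical names; max_words is an assumed
    safety limit, not part of the original description.
    """
    words = start_key.split()
    while len(words) < max_words:
        key = " ".join(words[-2:])      # move one step to the right
        followers = table.get(key)
        if not followers:               # no value for this pair: stop
            break
        words.append(rng.choice(followers))
    return " ".join(words)


print(generate(table, "a b"))
```

Each run can print a different string, for example "a b k" when "k" is drawn first, or a longer one when "c" is drawn and the chain loops back to "a b".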

The bigger the input, the more values you will have for your keys (pairs of words), and the "smarter" your bot will seem, so you can "train" your bot by adding more text (perhaps chat input?). If you have a book as input, you can construct some nice random sentences. Note that you don't have to take only the one word that follows a pair as a value; you can take 2 or 10. The difference is that your text will appear more accurate if you use "longer" building blocks. Start with a pair as a key and the following word as a value.
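To experiment with longer building blocks, the block sizes can be made parameters. The sketch below is one possible way to do that; build_table, key_size and value_size are hypothetical names, and using tuples instead of joined strings is a design choice made here, not something the answer specifies.

```python
from collections import defaultdict


def build_table(words, key_size=2, value_size=1):
    """Map each run of key_size words to the runs of value_size words that follow it.

    key_size and value_size are assumed parameter names for illustration.
    """
    table = defaultdict(list)
    last_start = len(words) - key_size - value_size
    for i in range(last_start + 1):
        key = tuple(words[i:i + key_size])
        value = tuple(words[i + key_size:i + key_size + value_size])
        table[key].append(value)
    return dict(table)


words = "a b c a b k".split()
print(build_table(words))                # one following word per value
print(build_table(words, value_size=2))  # two following words per value
```

With value_size=2 each step copies two consecutive words from the source, so the output stays closer to the original text but has fewer places where it can branch.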

So you basically have two steps. First, build the structure. Then randomly choose a key to start with, take a random value of that key, append it, and continue until there is no value or some other stop condition is met. If you want, you can "seed" the chain by picking a pair of words from your key-value structure based on the chat input. It's up to your imagination how to start your chain.
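One possible way to seed the chain from a keyword is to look for keys that contain it and fall back to a random key when none do. This is only a sketch of that idea; pick_seed is a hypothetical helper, and the small table is a hand-written excerpt used purely for illustration.

```python
import random

# Hand-written excerpt of a pair -> followers table, for illustration only.
table = {
    "hi my": ["name"],
    "my name": ["is"],
    "name is": ["Al"],
    "and i": ["live", "can"],
}


def pick_seed(table, keyword, rng=random):
    """Return a key containing keyword, or any key if the keyword is unknown.

    pick_seed is an assumed name; this is one simple seeding strategy among many.
    """
    matches = [key for key in table if keyword in key.split()]
    return rng.choice(matches or list(table))


print(pick_seed(table, "name"))
```

Keys where the keyword is the second word (here "my name") give output that does not begin with the keyword, which partly addresses the wish not to always start the sentence with it.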

Example with real words:

"hi my name is Al and i live in a box that i like very much and i can live in there as long as i want"

"hi my" -> ["name"]

"my name" -> ["is"]

"name is" -> ["Al"]

"is Al" -> ["and"]

........

"and i" -> ["live", "can"]

........

"i can" -> ["live"]

......
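The listing above is abbreviated. As a rough check, the sketch below builds the full table for the example sentence and prints only the keys with more than one follower, which are the points where the generated text can diverge from the input. The names build_table and branching are placeholders for illustration.

```python
from collections import defaultdict

SENTENCE = (
    "hi my name is Al and i live in a box that i like very much "
    "and i can live in there as long as i want"
)


def build_table(text):
    """Map each pair of adjacent words to the words that follow it (hypothetical helper)."""
    words = text.split()
    table = defaultdict(list)
    for first, second, following in zip(words, words[1:], words[2:]):
        table[f"{first} {second}"].append(following)
    return dict(table)


table = build_table(SENTENCE)
# Keys with several followers are the only places where a random choice matters.
branching = {key: values for key, values in table.items() if len(values) > 1}
print(branching)
```

For this sentence only "and i" and "live in" have two followers each, so a single short input gives the chain very little room to vary.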

Now construct a loop:

Pick a random key, say "hi my", and randomly choose a value. There is only one here, so it's "name" (SAVING "hi my name").
Now move one step to the right, taking "my name" as the next key, and pick a random value... "is" (SAVING "hi my name is").
Now move and take "name is" ... "Al" (SAVING "hi my name is Al").
Now take "is Al" ... "and" (SAVING "hi my name is Al and").

...

When you come to "and i" you randomly choose a value, let's say "can", so the next key "i can" is formed, and so on. When you reach your stop condition, or a key that has no values, print the constructed string, in our case:

"hi my name is Al and i can live in there as long as i want"

If your keys have more values, the chain can jump to more keys. The more values you have, the more combinations there are, and the more random and fun the text will be.
