Wordnet query to return example sentences


Problem description

I have a use case where I have a word and I need to know the following things:

  1. Synonyms for the word (just the synonyms are sufficient)
  2. All senses of the word, where each sense contains the synonyms that match the word in that sense, the example sentences for that sense (if any), and the part of speech for that sense.

Example - this query link. Screenshot for the word carry:

For each 'sense', we have the part of speech (like V), the synonyms matching that sense (like transport in the first sense; pack and take in the second sense), and example sentences containing the word in that sense ("This train is carrying nuclear waste" and "carry the suitcase to the car" in the first sense, "I always carry money" in the second sense, and so on).

How do I do this from a Wordnet MySQL database? I ran this query, which returns the list of meanings for the word:

SELECT a.lemma, c.definition FROM words a INNER JOIN senses b ON a.wordid = b.wordid INNER JOIN synsets c ON b.synsetid = c.synsetid WHERE a.lemma = 'carry';

How do I get, for each sense, the part of speech, the example sentences, and the synonyms specific to that sense? I queried the vframesentences and vframesentencemaps tables and saw example sentences with placeholders like %s. Based on the wordid column I tried to match them with the words table, but got badly wrong results.

Edit:

For the word carry, if I run these queries, I get synonyms and sense meanings correctly:

1. select * from words where lemma='carry'; -- yields wordid 21354
2. select * from senses where wordid=21354; -- yields 41 synsetids, such as 201062889
3. select * from synsets where synsetid=201062889; -- yields the definition "serve as a means for expressing something"
4. select * from senses where synsetid=201062889; -- yields the wordids of all words in that sense, including "carry" itself: 21354, 29630, 45011
5. select * from words where wordid=29630; -- yields 'convey'

So all I now need is a way of finding the example sentence for the word carry in each of the 41 senses. How do I do it?

Solution

You can get the sentences from the samples table. For example:

SELECT sample FROM samples WHERE synsetid = 201062889;

yields:

The painting of Mary carries motherly love

His voice carried a lot of anger

So you could extend your query as follows:

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    (SELECT 
            GROUP_CONCAT(a1.lemma)
        FROM
            words a1
                INNER JOIN
            senses b1 ON a1.wordid = b1.wordid
        WHERE
            b1.synsetid = b.synsetid
                AND a1.lemma <> a.lemma
        GROUP BY b.synsetid) AS `synonyms`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;

Note: The subselect with a GROUP_CONCAT returns the synonyms of each sense as a comma-separated list in a single row in order to cut down on the number of rows. You could consider returning these in a separate query (or as part of this query but with everything else duplicated) if preferred.
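To try the shape of this query without a full WordNet install, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The table layout is an assumption reduced to the columns the query touches, SQLite stands in for MySQL, and the only data loaded is the carry/convey synset quoted above (the 'v' part-of-speech value is assumed). The inner GROUP BY is left out here because an aggregate without GROUP BY already returns a single row.

```python
import sqlite3

# Toy stand-in for the WordNet schema: only the columns the query uses (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sampleid INTEGER, sample TEXT);
""")

# Data taken from the question and answer; 'v' as the pos value is an assumption.
conn.executemany("INSERT INTO words VALUES (?, ?)",
                 [(21354, "carry"), (29630, "convey")])
conn.execute("INSERT INTO synsets VALUES (?, ?, ?)",
             (201062889, "v", "serve as a means for expressing something"))
conn.executemany("INSERT INTO senses VALUES (?, ?)",
                 [(21354, 201062889), (29630, 201062889)])
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
    (201062889, 1, "The painting of Mary carries motherly love"),
    (201062889, 2, "His voice carried a lot of anger"),
])

QUERY = """
SELECT a.lemma, c.definition, c.pos, d.sample,
       (SELECT GROUP_CONCAT(a1.lemma)
          FROM words a1
               INNER JOIN senses b1 ON a1.wordid = b1.wordid
         WHERE b1.synsetid = b.synsetid
           AND a1.lemma <> a.lemma) AS synonyms
  FROM words a
       INNER JOIN senses  b ON a.wordid = b.wordid
       INNER JOIN synsets c ON b.synsetid = c.synsetid
       INNER JOIN samples d ON b.synsetid = d.synsetid
 WHERE a.lemma = ?
 ORDER BY a.lemma, c.definition, d.sample
"""

# One row per example sentence; the synonyms column repeats on each row.
rows = conn.execute(QUERY, ("carry",)).fetchall()
for row in rows:
    print(row)
```

Because samples is joined with INNER JOIN, a sense that has no example sentence produces no row at all; switching that one join to LEFT JOIN would keep such senses with a NULL sample.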

UPDATE: If you really need synonyms as rows in the results, the following will do it, but I wouldn't recommend it. Synonyms and example sentences both pertain to a particular definition, so the set of synonyms will be duplicated for each example sentence. For example, if there are 4 example sentences and 5 synonyms for a particular definition, the results would have 4 x 5 = 20 rows just for that definition.

SELECT 
    a.lemma AS `word`,
    c.definition,
    c.pos AS `part of speech`,
    d.sample AS `example sentence`,
    subq.lemma AS `synonym`
FROM
    words a
        INNER JOIN
    senses b ON a.wordid = b.wordid
        INNER JOIN
    synsets c ON b.synsetid = c.synsetid
        INNER JOIN
    samples d ON b.synsetid = d.synsetid
        LEFT JOIN
    (SELECT 
        a1.lemma, b1.synsetid
    FROM
        senses b1
    INNER JOIN words a1 ON a1.wordid = b1.wordid) subq ON subq.synsetid = b.synsetid
        AND subq.lemma <> a.lemma
WHERE
    a.lemma = 'carry'
ORDER BY a.lemma , c.definition , d.sample;
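The row multiplication described in the update can be checked with a small sketch. It again uses Python's sqlite3 with an in-memory SQLite database as a stand-in for MySQL, an assumed reduced schema, and entirely made-up placeholder data (the lemma "target", the synonyms syn1 to syn5, synset 100 and the sample texts are hypothetical): one definition with 4 example sentences and 5 synonyms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words   (wordid INTEGER PRIMARY KEY, lemma TEXT);
CREATE TABLE synsets (synsetid INTEGER PRIMARY KEY, pos TEXT, definition TEXT);
CREATE TABLE senses  (wordid INTEGER, synsetid INTEGER);
CREATE TABLE samples (synsetid INTEGER, sampleid INTEGER, sample TEXT);
""")

# Placeholder data (not from WordNet): the queried word plus 5 synonyms in one synset.
lemmas = ["target"] + [f"syn{i}" for i in range(1, 6)]
conn.executemany("INSERT INTO words VALUES (?, ?)", list(enumerate(lemmas, start=1)))
conn.execute("INSERT INTO synsets VALUES (100, 'v', 'placeholder definition')")
conn.executemany("INSERT INTO senses VALUES (?, 100)", [(i,) for i in range(1, 7)])
# 4 placeholder example sentences for that synset.
conn.executemany("INSERT INTO samples VALUES (100, ?, ?)",
                 [(i, f"sample {i}") for i in range(1, 5)])

QUERY = """
SELECT a.lemma, c.definition, c.pos, d.sample, subq.lemma AS synonym
  FROM words a
       INNER JOIN senses  b ON a.wordid = b.wordid
       INNER JOIN synsets c ON b.synsetid = c.synsetid
       INNER JOIN samples d ON b.synsetid = d.synsetid
       LEFT JOIN (SELECT a1.lemma, b1.synsetid
                    FROM senses b1
                         INNER JOIN words a1 ON a1.wordid = b1.wordid) subq
              ON subq.synsetid = b.synsetid
             AND subq.lemma <> a.lemma
 WHERE a.lemma = ?
 ORDER BY a.lemma, c.definition, d.sample
"""

rows = conn.execute(QUERY, ("target",)).fetchall()
# Every example sentence is paired with every synonym.
print(len(rows))                     # number of result rows
print(len({r[3] for r in rows}))     # distinct example sentences
print(len({r[4] for r in rows}))     # distinct synonyms
```

The result has one row per (example sentence, synonym) pair, which is why the GROUP_CONCAT form is usually easier to consume.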
