Obtain the matrix in Protege


Problem description

My work is on a recommendation system for library books. As input, I need a book-classification ontology, and my ontology classifies library books. The classification has 14 categories, alongside the sibling classes Author, Book, and Isbn. The individuals of the Book class are book subjects (about 600 subjects), and the individuals of the Author and Isbn classes are author names and ISBNs, respectively. I designed this ontology with Protege 4.1.

I have also manually collected, for part of the collection, which categories each book belongs to. An object property named "hasSubject" relates individuals of the Book class to categories. For example, book "A" hasSubject categories "S" and "F", and so on. As a result, I want to get a matrix of books against categories, in which an entry is 1 if the book belongs to the category and 0 otherwise. Like this:

     cat1   cat2   cat3   
book1   1      0      0   
book2   1      0      1   
book3   1      1      0  

In this example, book1 belongs to category 1 and does not belong to categories 2 and 3. How can I do this with SPARQL in Protege?

Recommended answer

Handling a fixed number of categories

Given data like

@prefix : <http://example.org/books/> .

:book1 a :Book, :Cat1 .
:book2 a :Book, :Cat1, :Cat3 .
:book3 a :Book, :Cat1, :Cat2 .

you can use a query like

prefix : <http://example.org/books/>

select ?individual
       (if(bound(?cat1),1,0) as ?Cat1)
       (if(bound(?cat2),1,0) as ?Cat2)
       (if(bound(?cat3),1,0) as ?Cat3)
where {
  ?individual a :Book .
  OPTIONAL { ?individual a :Cat1 . bind( ?individual as ?cat1 ) } 
  OPTIONAL { ?individual a :Cat2 . bind( ?individual as ?cat2 ) }
  OPTIONAL { ?individual a :Cat3 . bind( ?individual as ?cat3 ) }
}
order by ?individual

in which certain variables are bound based on whether certain triples are present (the particular value they are bound to doesn't really matter), to get results like these:

$ arq --data data.n3 --query matrix.sparql
-----------------------------------
| individual | Cat1 | Cat2 | Cat3 |
===================================
| :book1     | 1    | 0    | 0    |
| :book2     | 1    | 0    | 1    |
| :book3     | 1    | 1    | 0    |
-----------------------------------
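The same membership test can be written more compactly. The following is a minimal sketch, assuming a SPARQL 1.1 engine (it relies on EXISTS being usable as an expression inside IF); it has not been checked against the SPARQL tab of Protege 4.1. It uses the same example prefix and class names as the data above and should give the same table without the helper variables.

```sparql
prefix : <http://example.org/books/>

# One column per category: 1 if the book has that type, 0 otherwise.
select ?individual
       (if(exists { ?individual a :Cat1 }, 1, 0) as ?Cat1)
       (if(exists { ?individual a :Cat2 }, 1, 0) as ?Cat2)
       (if(exists { ?individual a :Cat3 }, 1, 0) as ?Cat3)
where {
  ?individual a :Book .
}
order by ?individual
```

Either form still needs one select expression per category, so the list of categories has to be known when the query is written.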

Handling an arbitrary number of categories

Here's a solution that seems to work in Jena, though I'm not sure that the specific results are guaranteed. (Update: Based on this answers.semanticweb.com question and answer, it seems that this behavior is not guaranteed by the SPARQL specification.) If we have a little more data, e.g., about which things are categories and which are books:

@prefix : <http://example.org/books/> .

:book1 a :Book, :Cat1 .
:book2 a :Book, :Cat1, :Cat3 .
:book3 a :Book, :Cat1, :Cat2 .

:Cat1 a :Category .
:Cat2 a :Category .
:Cat3 a :Category .

then we can run a subquery that selects all the categories in order, and then, for each book, compute a string indicating whether or not the book is in each category.

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?matrix) where { 
  { 
    select ?category where { 
      ?category a :Category 
    }
    order by ?category 
  }
  ?book a :Book .
  OPTIONAL { bind( 1 as ?isCat )              ?book a ?category . }
  OPTIONAL { bind( 0 as ?isCat ) NOT EXISTS { ?book a ?category } }
}
group by ?book
order by ?book

This produces the output:

$ arq --data data.n3 --query matrix2.query
--------------------
| book   | matrix  |
====================
| :book1 | "1 0 0" |
| :book2 | "1 0 1" |
| :book3 | "1 1 0" |
--------------------

which is much closer to the output in the question, and handles an arbitrary number of categories. However, it depends on the values of ?category being processed in the same order for each ?book, and I'm not sure whether that's guaranteed.

We can even use this approach to generate a header row for the table. Again, this depends on the ?category values being processed in the same order for each ?book, which might not be guaranteed, but seems to work in Jena. To get a category header, all we need to do is create a row where ?book is unbound and the value of ?isCat is the category itself:

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?matrix) where { 
  { 
    select ?category where { 
      ?category a :Category 
    }
    order by ?category 
  }

  # This generates the header row where ?isCat is just
  # the category, so the group_concat gives headers.
  { 
    bind(?category as ?isCat) 
  }
  UNION 
  # This is the table as before
  {
    ?book a :Book .
    OPTIONAL { bind( 1 as ?isCat )              ?book a ?category . }
    OPTIONAL { bind( 0 as ?isCat ) NOT EXISTS { ?book a ?category } }
  }
}
group by ?book
order by ?book

We get the following output:

--------------------------------------------------------------------------------------------------------
| book   | matrix                                                                                      |
========================================================================================================
|        | "http://example.org/books/Cat1 http://example.org/books/Cat2 http://example.org/books/Cat3" |
| :book1 | "1 0 0"                                                                                     |
| :book2 | "1 0 1"                                                                                     |
| :book3 | "1 1 0"                                                                                     |
--------------------------------------------------------------------------------------------------------

Using some string manipulation, you could shorten the URIs used for the categories, or widen the array entries to get correct alignment. One possibility is this:

prefix : <http://example.org/books/>

select ?book (group_concat(?isCat) as ?categories) where { 
  { 
    select ?category
           (strafter(str(?category),"http://example.org/books/") as ?name)
     where { 
      ?category a :Category 
    }
    order by ?category 
  }

  { 
    bind(?name as ?isCat)
  }
  UNION 
  {
    ?book a :Book .
    # The string manipulation here takes the name of the category (which should
    # be at least two characters), trims off the first character (string indexing
    # in XPath functions starts at 1), and replaces each remaining character with
    # " ". The resulting spaces are concatenated with "1" or "0" depending on
    # whether the book is a member of the category. The resulting string has the
    # same width as the category name, and makes for a nice table.
    OPTIONAL { bind( concat(replace(substr(?name,2),"."," "),"1") as ?isCat )              ?book a ?category . }
    OPTIONAL { bind( concat(replace(substr(?name,2),"."," "),"0") as ?isCat ) NOT EXISTS { ?book a ?category } }
  }
}
group by ?book
order by ?book

This produces the following output:

$ arq --data data.n3 --query matrix3.query
-----------------------------
| book   | categories       |
=============================
|        | "Cat1 Cat2 Cat3" |
| :book1 | "   1    0    0" |
| :book2 | "   1    0    1" |
| :book3 | "   1    1    0" |
-----------------------------

which is almost exactly what you had in the question.
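If the dependence on processing order is a concern, one alternative is to return the matrix in long form, with one row per book/category pair, and leave the pivot into columns to whatever consumes the results. This is a minimal sketch under the same assumptions as the sample data (the :Book and :Category classes and the example prefix), again assuming a SPARQL 1.1 engine; ?value is a name introduced here for illustration.

```sparql
prefix : <http://example.org/books/>

# Every book is paired with every category; ?value is 1 when the
# book has that category as a type and 0 otherwise.
select ?book ?category
       (if(exists { ?book a ?category }, 1, 0) as ?value)
where {
  ?book a :Book .
  ?category a :Category .
}
order by ?book ?category
```

For the three books and three categories above this gives nine rows. The row order comes from an explicit ORDER BY on the result set rather than from the order in which GROUP_CONCAT happens to see its inputs, so it does not rely on the unguaranteed behavior discussed earlier. The trade-off is that the result is no longer laid out as a table.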
