Extract all parents of a given node

Question

I'm trying to extract all parents of each given GO ID (a node) using the EBI-RDF SPARQL endpoint. I based my query on two similar questions. Here are two examples illustrating the problem:

Example 1 (link to structure):

biological_process (GO:0008150)
           |__ metabolic process (GO:0008152)
                           |__ methylation (GO:0032259)

In this example, using the following query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

PREFIX obo: <http://purl.obolibrary.org/obo/>

SELECT (count(?mid) as ?depth)
       (group_concat(distinct ?midId ; separator = " / ") AS ?treePath) 
FROM <http://rdf.ebi.ac.uk/dataset/go> 
WHERE {
    obo:GO_0032259 rdfs:subClassOf* ?mid .
    ?mid rdfs:subClassOf* ?class .
    ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId.
}
GROUP BY ?treePath
ORDER BY ?depth

I got the desired result without problems:

depth |              treePath
------|-------------------------------------
6     | GO:0008150 / GO:0008152 / GO:0032259

But when the term exists in multiple branches (e.g. GO:0007267), as in the case below, the previous approach doesn't work:

Example 2 (link to structure):

biological_process (GO:0008150)
           |__ cellular_process (GO:0009987)
           |           |__ cell communication (GO:0007154)
           |                       |__ cell-cell signaling (GO:0007267)
           |
           |__ signaling (GO:0023052)
                      |__ cell-cell signaling (GO:0007267)

Result:

depth |                            treePath
------|---------------------------------------------------------------
15    | GO:0007154 / GO:0007267 / GO:0008150 / GO:0009987 / GO:0023052

What I wanted to get is the following:

GO:0008150 / GO:0009987 / GO:0007154 / GO:0007267
GO:0008150 / GO:0023052 / GO:0007267
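
For reference, the desired output amounts to listing every root-to-term path in the subclass graph. Here is a minimal sketch of that enumeration outside SPARQL, assuming the direct rdfs:subClassOf edges have already been fetched into a dictionary. The `PARENTS` map is hard-coded from the two example trees above and `root_paths` is a hypothetical helper, not part of any library. It also assumes the graph has no cycles.

```python
from typing import Dict, List

# Direct rdfs:subClassOf edges (child -> parents), hard-coded from the
# two example trees in the question. A real script would fetch these.
PARENTS: Dict[str, List[str]] = {
    "GO:0032259": ["GO:0008152"],
    "GO:0008152": ["GO:0008150"],
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0008150": [],
}


def root_paths(term: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-term path, one list of IDs per path."""
    direct = parents.get(term, [])
    if not direct:
        # No parents: the term is a root, so the path is just the term.
        return [[term]]
    paths = []
    for parent in direct:
        # Extend each path that reaches the parent with the current term.
        for path in root_paths(parent, parents):
            paths.append(path + [term])
    return paths


for path in root_paths("GO:0007267", PARENTS):
    print(" / ".join(path))
```

Each branch is followed separately, so a term with two parents yields two paths instead of one merged list.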

My understanding is that, under the hood, I'm calculating the depth of each level and using it to construct the path. This works fine when an element belongs to only one branch.

SELECT (count(?mid) as ?depth) ?midId
FROM <http://rdf.ebi.ac.uk/dataset/go> 
WHERE {
    obo:GO_0032259 rdfs:subClassOf* ?mid .
    ?mid rdfs:subClassOf* ?class .
    ?mid <http://www.geneontology.org/formats/oboInOwl#id> ?midId.
}
GROUP BY ?midId
ORDER BY ?depth

Result:

depth |   midId
------|------------
1     | GO:0008150
2     | GO:0008152
3     | GO:0032259

In the second example, things are messed up and I don't understand why. In any case, I'm sure part of the problem is terms that have the same depth/level, but I don't know how to solve this.

depth |   midId
------|------------
2     | GO:0008150
2     | GO:0009987
2     | GO:0023052
3     | GO:0007154
6     | GO:0007267
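
One way to see why the counting trick breaks is that the count for each ?mid is really the number of its ancestors (including itself), which equals its depth only on a single chain. The sketch below recomputes that count in Python for Example 2, assuming set semantics (each ancestor counted once, as SPARQL 1.1 specifies for `*` paths). The edge map and both helper names are hypothetical, and the numbers differ from the table above, possibly because the endpoint counts some nodes once per path (a guess, not verified).

```python
from typing import Dict, List, Set

# Direct rdfs:subClassOf edges for Example 2 (child -> parents).
PARENTS: Dict[str, List[str]] = {
    "GO:0007267": ["GO:0007154", "GO:0023052"],
    "GO:0007154": ["GO:0009987"],
    "GO:0009987": ["GO:0008150"],
    "GO:0023052": ["GO:0008150"],
    "GO:0008150": [],
}


def ancestors_or_self(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Nodes reachable via zero or more subClassOf steps (like rdfs:subClassOf*)."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def depth_by_count(term: str, parents: Dict[str, List[str]]) -> Dict[str, int]:
    """For each ancestor ?mid of term, count the ancestors of ?mid."""
    return {
        mid: len(ancestors_or_self(mid, parents))
        for mid in ancestors_or_self(term, parents)
    }


counts = depth_by_count("GO:0007267", PARENTS)
for mid, depth in sorted(counts.items(), key=lambda item: (item[1], item[0])):
    print(depth, mid)
```

GO:0009987 and GO:0023052 tie because they sit at the same level in different branches, and GO:0007267 collects ancestors from both branches. Sorting by this number therefore cannot separate the two paths.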

Recommended answer

Thanks to @AKSW, I found a decent solution using HyperGraphQL (a GraphQL interface for querying and serving linked data on the Web).

I'll leave the detailed answer here in case it helps someone.

  1. I downloaded and set up HyperGraphQL from its download page.
  2. I followed this tutorial.

The config.json file I used:

{
    "name": "ebi-hgql",
    "schema": "ebischema.graphql",
    "server": {
        "port": 8081,
        "graphql": "/graphql",
        "graphiql": "/graphiql"
    },
    "services": [
        {
            "id": "ebi-sparql",
            "type": "SPARQLEndpointService",
            "url": "http://www.ebi.ac.uk/rdf/services/sparql",
            "graph": "http://rdf.ebi.ac.uk/dataset/go",
            "user": "",
            "password": ""
        }
    ]
}

Here's what my ebischema.graphql file looks like (since I only needed Class, id, label and subClassOf):

type __Context {
    Class:          _@href(iri: "http://www.w3.org/2002/07/owl#Class")
    id:             _@href(iri: "http://www.geneontology.org/formats/oboInOwl#id")
    label:          _@href(iri: "http://www.w3.org/2000/01/rdf-schema#label")
    subClassOf:     _@href(iri: "http://www.w3.org/2000/01/rdf-schema#subClassOf")
}

type Class @service(id:"ebi-sparql") {
    id: [String] @service(id:"ebi-sparql")
    label: [String] @service(id:"ebi-sparql")
    subClassOf: [Class] @service(id:"ebi-sparql")
}
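
Once a server is running with the config.json and schema above, queries can also be sent from a script rather than through the GraphiQL page. This is a minimal sketch, assuming the server runs on localhost and accepts the usual GraphQL-over-HTTP convention of a JSON body with a `query` field (not verified against HyperGraphQL here). The port and path come from config.json; `build_request`, `run_query` and `SAMPLE_QUERY` are hypothetical names.

```python
import json
import urllib.request

# Assumed from config.json: port 8081 and the "/graphql" path, on localhost.
ENDPOINT = "http://localhost:8081/graphql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Wrap a GraphQL query string in a JSON POST request (nothing is sent yet)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> dict:
    """Send the query and decode the JSON response; needs a running server."""
    with urllib.request.urlopen(build_request(query, endpoint)) as resp:
        return json.load(resp)


# A one-level query against the Class type defined in the schema.
SAMPLE_QUERY = """
{
  Class_GET_BY_ID(uris: ["http://purl.obolibrary.org/obo/GO_0032259"]) {
    id
    label
  }
}
"""
```

With the server up, `run_query(SAMPLE_QUERY)` would return the decoded response as a dictionary.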

  • I started testing some simple queries but kept getting empty responses; the answer to this question solved my problem.

    Finally, I constructed the query to get the tree.

    Using this query:

    {
      Class_GET_BY_ID(uris:[
        "http://purl.obolibrary.org/obo/GO_0032259",
        "http://purl.obolibrary.org/obo/GO_0007267"]) {
        id
        label
        subClassOf {
          id
          label
          subClassOf {
            id
            label
          }
        }
      }
    }
    

    I got some interesting results:

    {
      "extensions": {},
      "data": {
        "@context": {
          "_type": "@type",
          "_id": "@id",
          "id": "http://www.geneontology.org/formats/oboInOwl#id",
          "label": "http://www.w3.org/2000/01/rdf-schema#label",
          "Class_GET_BY_ID": "http://hypergraphql.org/query/Class_GET_BY_ID",
          "subClassOf": "http://www.w3.org/2000/01/rdf-schema#subClassOf"
        },
        "Class_GET_BY_ID": [
          {
            "id": [
              "GO:0032259"
            ],
            "label": [
              "methylation"
            ],
            "subClassOf": [
              {
                "id": [
                  "GO:0008152"
                ],
                "label": [
                  "metabolic process"
                ],
                "subClassOf": [
                  {
                    "id": [
                      "GO:0008150"
                    ],
                    "label": [
                      "biological_process"
                    ]
                  }
                ]
              }
            ]
          },
          {
            "id": [
              "GO:0007267"
            ],
            "label": [
              "cell-cell signaling"
            ],
            "subClassOf": [
              {
                "id": [
                  "GO:0007154"
                ],
                "label": [
                  "cell communication"
                ],
                "subClassOf": [
                  {
                    "id": [
                      "GO:0009987"
                    ],
                    "label": [
                      "cellular process"
                    ]
                  }
                ]
              },
              {
                "id": [
                  "GO:0023052"
                ],
                "label": [
                  "signaling"
                ],
                "subClassOf": [
                  {
                    "id": [
                      "GO:0008150"
                    ],
                    "label": [
                      "biological_process"
                    ]
                  }
                ]
              }
            ]
          }
        ]
      },
      "errors": []
    }
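
The nested response above can be flattened into the "A / B / C" path strings asked for in the question. This is a minimal sketch, assuming each Class object has the shape shown (an `id` list and an optional `subClassOf` list); `response_paths` is a hypothetical helper, and `sample` is a trimmed copy of the GO:0007267 entry with labels omitted.

```python
from typing import Any, Dict, List


def response_paths(node: Dict[str, Any]) -> List[List[str]]:
    """Flatten one nested Class object into chains, most distant ancestor first."""
    term_id = node["id"][0]
    parents = node.get("subClassOf", [])
    if not parents:
        # Deepest level returned by the query: the chain starts here.
        return [[term_id]]
    return [
        path + [term_id]
        for parent in parents
        for path in response_paths(parent)
    ]


# Trimmed copy of the GO:0007267 entry from the response (labels omitted).
sample = {
    "id": ["GO:0007267"],
    "subClassOf": [
        {"id": ["GO:0007154"], "subClassOf": [{"id": ["GO:0009987"]}]},
        {"id": ["GO:0023052"], "subClassOf": [{"id": ["GO:0008150"]}]},
    ],
}

for path in response_paths(sample):
    print(" / ".join(path))
```

The first chain stops at GO:0009987 rather than GO:0008150 because the query only asked for two subClassOf levels.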
    

  • Edit

    This was exactly what I wanted, but I noticed that I can't add another sublevel like this:

    {
      Class_GET_BY_ID(uris:[
        "http://purl.obolibrary.org/obo/GO_0032259",
        "http://purl.obolibrary.org/obo/GO_0007267"]) {
        id
        label
        subClassOf {
          id
          label
          subClassOf {
            id
            label
            subClassOf {  # <--- 4th sublevel
              id
              label
            }
          }
        }
      }
    }
    

    I created a new question: Endpoint returned Content-Type: text/html which is not recognized for SELECT queries
