Why does findOne(<id>, <depth>) become unacceptably slow when more nodes of the same label are added?


Problem description

Context

I've been developing a Spring Boot website backed by a Neo4j database. It is designed to work as a university course search system. (The relevant structure is that courses have module sets, which have modules, which are related to subjects, and so on.)

@JsonIdentityInfo(generator=JSOGGenerator.class)
public class Course extends DomainObject {
  @NotNull private String name;
  @NotNull private String courseCode;
  private String description;
  private School school;

  @Convert(AttendanceTypeConverter.class)
  private E_AttendanceType attendanceType;

  @Convert(CourseTypeConverter.class)
  private E_CourseType courseType;

  @Convert(SandwichYearTypeConverter.class)
  private E_SandwichYearType sandwichYearType;

  @Relationship(type = "COURSE_DESCRIPTION_FOR", direction = Relationship.OUTGOING)
  private Set<CourseYearDescription> courseYearDescription;

  @Relationship(type = "COURSE_REQUISITES_SET_FOR", direction = Relationship.OUTGOING)
  private Set<EntryRequirementsSet> entryRequirementsSets;

  @Relationship(type = "RUNS_COURSE", direction = Relationship.OUTGOING)
  Set<MemberOfFaculty> courseRunners;

  // Remaining members are not shown in the original post.
}

For course pages I need to populate all of the complex fields of a course so they can be displayed on a page. I had been using T findOne(Long var1, int var2) with a depth of 4 through a GraphRepository to get a comprehensive Course object. I was concerned about this because, to my knowledge, 4 is a very uncommon depth. However, when I ran the method it returned without any noticeable delay.

Problem: while doing some stress testing I increased the number of courses in the database to 4000 and found that the delay grew dramatically. Working backwards, depth 2 took up to 20 seconds, depth 3 took about 60 seconds, and depth 4 had not returned after more than 5 minutes. This is despite the fact that all three depths previously returned in milliseconds.

I found this odd, because I was building from a single course node (identified by its long node ID), so the increased number of courses should not have changed the speed of the findOne method in this way. It would still be building an object of the same size.

Testing: to try out alternatives I ran MATCH (course:Course{courseCode:'HG65'})-[*1..4]->(x) RETURN * to see how long that would take (here the course code, rather than the node ID, limits the query to one course node). It returned instantly with exactly what I wanted.

This made me think the problem might be related to how the GraphRepository maps results to a POJO. To test this I wrote some mapping functions that take a Neo4jOperations Result object and instantiate and populate my Course object by parsing and iterating through the result map. In effect I was emulating findOne at depth 4. This ran with no delay. My only thought is that findOne ignores relationship directions, so that paths such as "course1 -> school -> course2" are followed, resulting in a massive increase in what is fetched. However, I do not know how to confirm that this is the case, nor how to get around it.

Question

Why does findOne(ID, 4) run so slowly when I add more Course objects? How can I overcome this without writing bespoke queries and result mappers every time I want a complex POJO fetched?

Is there an alternative approach I should take?

Recommended answer

After inspecting the calls from my Spring project to the Neo4j database I have confirmed the problem: findOne() uses an undirected (n)-[]-(m) relationship pattern. The query it issues is as follows:

MATCH (n) WHERE id(n) = {id} WITH n MATCH p=(n)-[*0..4]-(m) RETURN p

This is what I expected. If I have 10000 courses that are all related to a single node one hop away, they will all match each other at depth 2: course -[]- school -[]- course. Every additional level of depth then fans out again from each of those courses, so the size of any course-related query grows explosively.
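The fan-out described above can be illustrated with a small in-memory simulation. This is only a sketch under assumed data: the class TraversalFanOut, its methods, and the toy graph (one shared school node, with each course pointing at the school and at one private description node) are hypothetical and not part of the original project. It counts distinct nodes within a given number of hops, which understates the number of paths a Cypher path query would return, but it shows why ignoring direction pulls in every other course.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TraversalFanOut {
    // Adjacency lists: edges in their stored direction, and in both directions.
    private final Map<Integer, List<Integer>> outgoing = new HashMap<>();
    private final Map<Integer, List<Integer>> undirected = new HashMap<>();

    public void addEdge(int from, int to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        undirected.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        undirected.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Counts distinct nodes within maxDepth hops of start, including start itself.
    public int reachable(int start, int maxDepth, boolean followDirection) {
        Map<Integer, List<Integer>> adjacency = followDirection ? outgoing : undirected;
        Set<Integer> seen = new HashSet<>();
        seen.add(start);
        List<Integer> frontier = List.of(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<Integer> next = new ArrayList<>();
            for (int node : frontier) {
                for (int neighbour : adjacency.getOrDefault(node, List.of())) {
                    if (seen.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        return seen.size();
    }

    // Hypothetical toy data: node 0 is the school, nodes 1..courses are courses,
    // and each course also points at one private description node.
    public static TraversalFanOut build(int courses) {
        TraversalFanOut graph = new TraversalFanOut();
        for (int course = 1; course <= courses; course++) {
            graph.addEdge(course, 0);                // course -> school
            graph.addEdge(course, courses + course); // course -> its description
        }
        return graph;
    }

    public static void main(String[] args) {
        TraversalFanOut graph = build(4000);
        System.out.println(graph.reachable(1, 2, true));  // prints 3
        System.out.println(graph.reachable(1, 2, false)); // prints 4002
    }
}
```

With direction respected, course 1 reaches only itself, the school, and its own description. With direction ignored, two hops are already enough to reach every other course through the shared school node.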

My solution was to alter the default query and declare it as a custom GraphRepository query, as follows:

MATCH (n:Course{courseCode:{courseCode}}) WITH n MATCH p=(n)-[*0..4]->(m) RETURN p

Note that the relationship pattern has changed from undirected to an outgoing -[]-> direction. This solution works perfectly with the Spring OGM mapping, and all of the nested objects in my complex POJO are populated as expected.
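If the same directed pattern is needed for several entities or depths, the query text can be kept in one place. The following is a minimal sketch: the class CourseQueries, the constant COURSE_DEPTH_4, and the helper outgoingPaths are hypothetical names introduced for illustration. The commented repository method assumes Spring Data Neo4j's @Query and @Param annotations are used as in the answer above. Labels and property keys are interpolated into the query text, so they must come from trusted code, never from user input.

```java
public final class CourseQueries {

    // Compile-time constant, so it can serve as an annotation value, for example
    // (assumed usage, not shown in the original post):
    //   @Query(CourseQueries.COURSE_DEPTH_4)
    //   Course findByCourseCode(@Param("courseCode") String courseCode);
    public static final String COURSE_DEPTH_4 =
            "MATCH (n:Course{courseCode:{courseCode}}) WITH n MATCH p=(n)-[*0..4]->(m) RETURN p";

    private CourseQueries() {
        // Utility class, not meant to be instantiated.
    }

    // Builds the same outgoing-only path query for another label, key, or depth.
    // The parameter placeholder reuses the property key as its name.
    public static String outgoingPaths(String label, String key, int maxDepth) {
        if (maxDepth < 0) {
            throw new IllegalArgumentException("maxDepth must not be negative");
        }
        return String.format(
                "MATCH (n:%s{%s:{%s}}) WITH n MATCH p=(n)-[*0..%d]->(m) RETURN p",
                label, key, key, maxDepth);
    }
}
```

Keeping the arrow in a single template makes it harder to fall back to the undirected pattern by accident when a new finder is added.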
