How to speed up updating relationships among tables after one or both tables are already saved?

Problem description

Problem: quickly updating and saving relationships between tables that hold a lot of data, after one or both tables have already been saved.

I have five tables, TvGenres, TvSubgenre, TvProgram, Channels and TvSchedules, with the relationships between them as shown in the image below.

The problem is that all the data is downloaded in sequence, each step depending on the previous one, and, unlike SQLite, I need to set the relationships between the objects explicitly. To do that I have to search the tables again and again and set the relationship each time, which is time-consuming. How can I do this faster?

I tried two different approaches, but neither works as expected.

First, let me explain how the downloading works.

First I fetch all the channel details based on the user's languages. From the channels, I fetch all the schedules for the next week (that's a lot of data, around 30k+ records). From the schedule data, I fetch all the program data (again a lot of data).

Approach 1

Download all the data, build lists of objects from it, and store everything at once after all downloading is done. Setting the relationships among the objects still takes time, and worse, the loop now runs twice: once to build the object lists and once more to store them. This does not solve the time-consuming relationship problem.

Approach 2

Download and store one type at a time: download the channels and store them, then download the schedules and store them, then download the programs and store them in Core Data. That part is fine, but channels have a relationship with schedules, and schedules have a relationship with programs. To set the relationship while storing a schedule, I fetch the channel related to that schedule and then set the relationship; the same goes for programs and schedules. That is what takes the time. The code is below. How can I fix this, or how should I download and store the data so that it is as fast as possible?

Code for storing schedules only

import CoreData

func saveScheduleDataToCoreData(withScheduleList scheduleList: [[String : Any]], completionBlock: @escaping (_ programIds: [String]?) -> Void) {
    let start = DispatchTime.now()
    let context = coreDataStack.managedObjectContext

    var progIds = [String]()
    context.performAndWait {
        var scheduleTable: TvSchedule!

        for (index, response) in scheduleList.enumerated() {
            let schedule: TvScheduleInformation = TvScheduleInformation(json: response)
            scheduleTable = TvSchedule(context: context)
            scheduleTable.channelId = schedule.channelId
            scheduleTable.programId = schedule.programId
            scheduleTable.startTime = schedule.startTime
            scheduleTable.endTime = schedule.endTime
            scheduleTable.day = schedule.day
            scheduleTable.languageId = schedule.languageId
            scheduleTable.isReminderSet = false

            // If I comment out the fetch below, the time drops significantly, from 5 min to 34.74 s.
            // It runs one fetch request per schedule to find the related channel.
            let tvChannelRequest: NSFetchRequest<Channels> = Channels.fetchRequest()
            tvChannelRequest.predicate = NSPredicate(format: "channelId == %d", schedule.channelId)
            tvChannelRequest.fetchLimit = 1
            do {
                let channelResult = try context.fetch(tvChannelRequest)
                if channelResult.count == 1 {
                    let channelTable = channelResult[0]
                    scheduleTable.channel = channelTable
                }
            }
            catch {
                print("Error: \(error)")
            }
            progIds.append(String(schedule.programId))
            // Save after every 1000 schedules (index is zero-based, hence the + 1).
            if (index + 1) % 1000 == 0 {
                print(index)
                do {
                    try context.save()
                } catch let error as NSError {
                    print("Error saving schedules object context! \(error)")
                }

            }
        }
    }
    let end = DispatchTime.now()
    let nanoTime = end.uptimeNanoseconds - start.uptimeNanoseconds
    print("Saving \(scheduleList.count) schedules took \(nanoTime) ns")
    // Saves whatever is left over from the last partial batch.
    coreDataStack.saveContext()
    completionBlock(progIds)
}

Also, how do I do a proper batch save using an autorelease pool?

PS: All the material I found on Core Data is expensive, costing more than 3k, and the free material covers only the basics. Even the Apple docs don't have much code on performance tuning, batch updates and handling relationships. Thanks in advance for any kind of help.

Recommended answer

I've had projects like this before. There isn't a single solution that solves everything, but these are some things that help a lot:

It seems like you attempted to insert everything at once, and then tried doing it one by one. In my apps I found around 300 to be the best batch size, but you have to tweak it to see what works in your application; it could be as much as 5000 or as little as 100. Start with 300 and adjust to see what gives better results.
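As a rough illustration of batching, the sketch below splits an array into slices of a chosen size. The `chunked(into:)` helper is a hypothetical name, not part of the standard library, and the integers stand in for downloaded schedule records.

```swift
extension Array {
    /// Splits the array into consecutive batches of at most `size` elements.
    /// (Hypothetical helper; the name is an assumption.)
    func chunked(into size: Int) -> [[Element]] {
        precondition(size > 0, "batch size must be positive")
        return stride(from: 0, to: count, by: size).map { start in
            // Swift.min is spelled out so it is not confused with Array.min().
            Array(self[start..<Swift.min(start + size, count)])
        }
    }
}

// Stand-in data: 1000 fake schedule records.
let schedules = Array(1...1000)
let batches = schedules.chunked(into: 300)
print(batches.map { $0.count })   // [300, 300, 300, 100]
```

Keeping the batch size a parameter makes it easy to try 100, 300 or 5000 and compare timings.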

You have a few processes going on. You mentioned downloading and saving to the database, but I wouldn't be surprised if there are more that you haven't mentioned. Queues (NSOperationQueue, spelled OperationQueue in Swift) are an amazing tool for this. You might think that adding a queue will slow things down, but that is not true. Things get slow when you try to do too much at once.

So you have one queue that downloads the information (I suggest limiting it to 4 concurrent requests), and one that saves the data to Core Data (limit its concurrency to 1 to avoid write conflicts). As each download task finishes, it chunks the results into batches of a more manageable size and queues them to be written to the database. Don't worry if the last batch is a little smaller than the rest.
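A minimal sketch of this two-queue arrangement, using only Foundation. `ImportPipeline`, `download(channelId:)` and `save(_:)` are hypothetical names; the download is simulated with generated integers and the save just counts rows where a real app would write to Core Data.

```swift
import Foundation

final class ImportPipeline {
    private let downloadQueue = OperationQueue()
    private let saveQueue = OperationQueue()
    /// Mutated only from saveQueue, which is serial, so no lock is used.
    private(set) var savedCount = 0

    init() {
        downloadQueue.maxConcurrentOperationCount = 4   // at most 4 requests in flight
        saveQueue.maxConcurrentOperationCount = 1       // one writer at a time
    }

    /// Placeholder for a real network request (assumption: 700 rows per channel).
    private func download(channelId: Int) -> [Int] {
        return (0..<700).map { channelId * 1_000 + $0 }
    }

    /// Placeholder for a real Core Data write.
    private func save(_ batch: [Int]) {
        savedCount += batch.count
    }

    /// Blocks until everything is saved; this keeps the demo simple and
    /// is not something to do on the main thread of a real app.
    func run(channelIds: [Int], batchSize: Int = 300) {
        for channelId in channelIds {
            downloadQueue.addOperation {
                let rows = self.download(channelId: channelId)
                // Split the download into batches and queue each one for saving.
                var start = 0
                while start < rows.count {
                    let end = min(start + batchSize, rows.count)
                    let batch = Array(rows[start..<end])
                    self.saveQueue.addOperation { self.save(batch) }
                    start = end
                }
            }
        }
        downloadQueue.waitUntilAllOperationsAreFinished()
        saveQueue.waitUntilAllOperationsAreFinished()
    }
}

let pipeline = ImportPipeline()
pipeline.run(channelIds: Array(1...10))
print(pipeline.savedCount)   // 7000
```

Each simulated download produces batches of 300, 300 and 100 rows, so the smaller last batch is handled without special cases.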

Each insert into Core Data creates its own context, does its own fetches, saves, and then discards the objects. Don't access these objects from anywhere else or you will get crashes: Core Data managed objects are not thread safe. Also, write to Core Data only through this queue, or you will get write conflicts. (See "NSPersistentContainer concurrency for saving to core data" for more information about this setup.)

Now you are trying to insert 300 or so entities, and each one has to find or create its related entities. You might have a few functions scattered around that accomplish this. If you program this without considering performance, you will easily do 300 or even 600 fetch requests. Instead, do a single fetch: fetchRequest.predicate = NSPredicate(format: "channelId IN %@", objectIdsIamDealingWithNow). After the fetch, convert the array to a dictionary with the id as the key:

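  // The fetch above returns channels, so the dictionary maps channelId to Channels.
  // Use whatever key type your model declares for channelId.
  var lookup: [String: Channels] = [:]
  if let results = try? context.fetch(fetchRequest) {
      results.forEach { if let channelId = $0.channelId { lookup[channelId] = $0 } }
  }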

Once you have this lookup map, do not lose it. Pass it to every function that needs it. If you create objects, consider inserting them into the dictionary afterwards. Inside the Core Data operation this lookup dictionary is your best friend. Be careful, though: it contains managed objects, which are not thread safe. Create it at the beginning of your database block and discard it at the end.
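Putting these pieces together, one batch import might look roughly like the sketch below: a fresh background context, a single `IN` fetch, a lookup dictionary, one save, then everything is discarded. The function name `importScheduleBatch`, the tuple row type and the `Int64` id type are assumptions. Entity and key names follow the question, and key-value coding is used only so the sketch compiles without the generated model classes.

```swift
import CoreData

/// Imports one batch of schedule rows on its own background context.
/// Assumptions: entities "Channels" and "TvSchedule", Int64 ids, and a
/// to-one "channel" relationship on TvSchedule.
func importScheduleBatch(_ rows: [(channelId: Int64, programId: Int64)],
                         into container: NSPersistentContainer) throws {
    let context = container.newBackgroundContext()
    var thrown: Error?

    context.performAndWait {
        do {
            // 1. One fetch for every channel referenced by this batch.
            let ids = Set(rows.map { $0.channelId })
            let request = NSFetchRequest<NSManagedObject>(entityName: "Channels")
            request.predicate = NSPredicate(format: "channelId IN %@", Array(ids) as NSArray)

            // 2. Convert the result into a dictionary keyed by channelId.
            var lookup: [Int64: NSManagedObject] = [:]
            for channel in try context.fetch(request) {
                if let id = channel.value(forKey: "channelId") as? Int64 {
                    lookup[id] = channel
                }
            }

            // 3. Insert the schedules; the relationship comes from the
            //    dictionary, not from another fetch.
            for row in rows {
                let schedule = NSEntityDescription.insertNewObject(
                    forEntityName: "TvSchedule", into: context)
                schedule.setValue(row.channelId, forKey: "channelId")
                schedule.setValue(row.programId, forKey: "programId")
                schedule.setValue(lookup[row.channelId], forKey: "channel")
            }

            // 4. Save once per batch, then release the objects the context holds.
            try context.save()
            context.reset()
        } catch {
            thrown = error
        }
    }
    if let thrown = thrown { throw thrown }
}
```

The lookup dictionary lives only inside the `performAndWait` block, so no managed object escapes the context that owns it. Calling this function only from a serial save queue keeps writes from overlapping.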

You don't have any code that explicitly deals with this, but I wouldn't be surprised if you run into it. Let's say you have a particular TvSchedule and you want to find all of the programs in that schedule in a particular language. The natural way to do this would be to create a predicate that looks something like "TvSchedule == %@ AND langId == %@". But it is actually much faster to do mySchedule.programs.filter { $0.langId == myLangId }

I see you are already adding logs to the code to see how long things take, which is really good. I would also recommend using the profiling tools in Xcode. They can be really good for finding the functions that take up most of the time.
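For the logging side, a small wrapper can keep the timing code out of the import functions. This is only a sketch: `timed` is a hypothetical helper built on the same `DispatchTime` calls the question already uses, and the summation stands in for real import work.

```swift
import Foundation

/// Runs `work`, prints how long it took in milliseconds, and returns its result.
/// (Hypothetical helper; the name and output format are assumptions.)
@discardableResult
func timed<T>(_ label: String, _ work: () throws -> T) rethrows -> T {
    let start = DispatchTime.now()
    defer {
        let elapsed = DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds
        print("\(label): \(Double(elapsed) / 1_000_000) ms")
    }
    return try work()
}

// Stand-in workload; in the app this would wrap one batch save.
let total = timed("sum") { (1...1_000_000).reduce(0, +) }
```

Wrapping each stage (download, lookup fetch, save) separately shows which one dominates before reaching for the profiler.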
