Core Data cloud sync - need help with logic


Problem description


I'm in the middle of brainstorming a cloud sync solution for a Core Data app that I am currently developing. I'm planning to open source the code for this once it's done, for anyone to use with their Core Data apps, so input from the community on how this system should work is much appreciated :-) Here's what I'm thinking:

Server Side


Storage Provider

As with all cloud sync systems, storage is a major piece of the puzzle. There are many ways to handle this. I could set up my own server for storage, or use a service like Amazon S3, but because I'm starting out with $0 capital, a paid storage solution isn't a viable option at the moment. After some thought, I decided to settle on Dropbox (an already well-established cloud sync application and storage provider). The pros of using Dropbox are:

  • It's free (for a limited amount of space)
  • In addition to being a storage service, it also handles cloud sync
  • They recently released an Objective-C SDK which makes it much easier to interface with it in Mac and iPhone apps

In case I decide to switch to a different storage provider in the future, I intend to add "services" to this cloud sync framework, basically allowing anyone to create a service class to interface with their choice of storage provider, which can then simply be plugged into the framework.
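A minimal sketch of what such a pluggable "service" could look like, written in Python only to keep the logic short. The `StorageService` interface, its method names, and the `InMemoryService` stand-in are all hypothetical; nothing here comes from the Dropbox SDK.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class StorageService(ABC):
    """Hypothetical interface each storage provider plugin would implement."""

    @abstractmethod
    def upload(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def download(self, path: str) -> bytes: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...

    @abstractmethod
    def list_folder(self, path: str) -> List[str]: ...


class InMemoryService(StorageService):
    """Stand-in provider that keeps files in a dict (handy for tests)."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def upload(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def download(self, path: str) -> bytes:
        return self._files[path]

    def delete(self, path: str) -> None:
        self._files.pop(path, None)  # deleting a missing file is a no-op

    def list_folder(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self._files if p.startswith(prefix))
```

The rest of the framework would talk only to `StorageService`, so swapping Dropbox for another provider means writing one new subclass.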

Storage Structure

This is a really difficult part to figure out, so I need as much input as I can here. I've been thinking about a structure like this:

CloudSyncFramework
======> [app name]
==========> devices
=============> (device id)
================> deviceinfo
================> changeset
==========> entities
=============> (entity name)
================> (object id)

A quick explanation of this structure:

  • The master "CloudSyncFramework" (name undecided) folder will contain separate folders for each app that uses the framework
  • Each app folder contains a devices folder and an entities folder
  • The devices folder will contain a folder for each device that is registered with the account. The device folder will be named according to the device ID, obtained using something like [[UIDevice currentDevice] uniqueIdentifier] (on iOS) or a serial number (on Mac OS).
  • Each device folder contains two files: deviceinfo and changeset. deviceinfo contains information about the device (e.g. OS version, last sync date, model, etc.) and the changeset file contains information about objects that have changed since the device last synchronized. Both files will just be simple NSDictionaries archived into files using NSKeyedArchiver.
  • Each Core Data entity has a subfolder under the entities folder
  • Under each entity folder, every object that belongs to that entity will have a separate file. This file will contain a JSON dictionary with the key-value pairs.
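The layout above can be captured in two small path helpers. This is a Python sketch; the function names are made up for illustration, and the root folder name is the undecided placeholder from the tree.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("CloudSyncFramework")  # placeholder name from the tree above


def device_file(app: str, device_id: str, name: str) -> PurePosixPath:
    """Path of the 'deviceinfo' or 'changeset' file for one device."""
    if name not in ("deviceinfo", "changeset"):
        raise ValueError(f"unexpected device file: {name}")
    return ROOT / app / "devices" / device_id / name


def object_file(app: str, entity: str, object_id: str) -> PurePosixPath:
    """Path of the JSON file that holds one object of an entity."""
    return ROOT / app / "entities" / entity / object_id
```

Keeping path construction in one place means the folder structure can change later without touching the sync code.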

Simultaneous Sync

This is one of the areas where I am almost completely clueless. How would I handle 2 devices connecting and syncing with the cloud at the same time? There seems to be a high risk of things getting out of sync here, or even data corruption.

Handling migrations

Once again, another clueless area here. How would I handle migrations of the Core Data managed object model? The easiest thing to do here seems to be just to wipe the cloud data store clean and upload a new copy of the data from a device which has undergone the migration process, but this seems somewhat risky, and there may be a better way.

Client Side


Converting NSManagedObjects into JSON

Converting attributes into JSON isn't a very hard task (there's lots of code for it floating around the web). Relationships are the key problem here. In this Stack Overflow post, Marcus Zarra posts code in which the relationship objects themselves are added to the JSON dictionary. However, he mentions that this can cause an infinite loop depending on the structure of the model, and I'm not sure if this would work with my method, because I store each object as an individual file.

I've been trying to find a way to get an ID as a string for an NSManagedObject. Then I could save relationships in JSON as an array of IDs. The closest thing I found was [[managedObject objectID] URIRepresentation], but this isn't really an ID for an object; it's more of a location for the object in the persistent store, and I don't know if it's concrete enough to use as a reference for an object.

I suppose I could generate a UUID string for each object and save it as an attribute, but I'm open to suggestions.
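To illustrate the UUID idea, here is a Python sketch in which each object carries its own generated ID and relationships are written as arrays of those IDs. `Record` is a hypothetical stand-in for a managed object, not a Core Data type. Because related objects are never embedded, a cyclic model cannot cause infinite recursion.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)  # compare by identity, like managed objects
class Record:
    """Hypothetical stand-in for a managed object."""
    attributes: Dict[str, object]
    relationships: Dict[str, List["Record"]] = field(default_factory=dict)
    sync_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def to_json(record: Record) -> str:
    """Serialize one record; related records appear only as their sync IDs."""
    payload = {
        "id": record.sync_id,
        "attributes": record.attributes,
        "relationships": {
            name: [target.sync_id for target in targets]
            for name, targets in record.relationships.items()
        },
    }
    return json.dumps(payload, sort_keys=True)


# A cycle (author <-> book) serializes without recursion.
author = Record({"name": "A"})
book = Record({"title": "B"})
author.relationships["books"] = [book]
book.relationships["author"] = [author]
print(to_json(author))
```

This also fits the one-file-per-object layout: each file names its neighbours by ID, and the reader resolves them by opening the matching files.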

Syncing changes to the cloud

The first (and still best) solution that popped into my head for this was to listen for the NSManagedObjectContextObjectsDidChangeNotification to get a list of changed objects, then update/delete/insert those objects in the cloud data store. After the changes have been saved, I would need to update the changeset file for every other registered device to reflect the newly changed objects.

One problem that comes up here is, how would I handle a failed or interrupted sync? One idea I have is to first push changes to a temporary directory on the cloud, then, once that has been confirmed as successful, merge it with the master data on the cloud so that an interruption in the middle of the sync won't corrupt data. I would also save records of the objects that still need to be updated in the cloud into a plist file or something similar, to be pushed the next time the app is connected to the internet.
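The "temporary directory first" idea can be sketched as follows in Python, using a local folder as a stand-in for the cloud folder. `push_changes` is a hypothetical name, and file names are assumed to be flat (no subfolders). Note the limit of the approach: each individual move is atomic, but the batch of moves as a whole is not.

```python
import os
import tempfile
from typing import Dict


def push_changes(master_dir: str, changes: Dict[str, bytes]) -> None:
    """Write every file to a staging folder first, then move them into place."""
    staging = tempfile.mkdtemp(prefix="staging-", dir=master_dir)
    try:
        for name, data in changes.items():
            with open(os.path.join(staging, name), "wb") as fh:
                fh.write(data)
        # Reached only if every staged write succeeded.
        for name in changes:
            os.replace(os.path.join(staging, name),
                       os.path.join(master_dir, name))
    finally:
        # Clean up; if staging failed, the master files were never touched.
        for leftover in os.listdir(staging):
            os.remove(os.path.join(staging, leftover))
        os.rmdir(staging)
```

If an upload fails halfway, the exception propagates, the staging folder is discarded, and the caller can queue the same changes for the next attempt.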

Retrieving changed objects

This is fairly simple, the device downloads its changeset file, figures out which objects need to be updated/inserted/deleted, then acts accordingly.
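As a rough Python sketch of that step, assume a changeset is an ordered list of dictionaries with `op`, `id`, and `values` keys. That format is an assumption made for illustration; the post does not define one.

```python
from typing import Dict, List

Store = Dict[str, dict]  # object id -> attribute dictionary


def apply_changeset(store: Store, changeset: List[dict]) -> None:
    """Apply insert/update/delete entries, in order, to a local store."""
    for change in changeset:
        op, object_id = change["op"], change["id"]
        if op == "insert":
            store[object_id] = dict(change["values"])
        elif op == "update":
            # An update for an unknown object is treated as an insert.
            store.setdefault(object_id, {}).update(change["values"])
        elif op == "delete":
            store.pop(object_id, None)  # deleting twice is harmless
        else:
            raise ValueError(f"unknown operation: {op}")
```

Making each operation tolerant of repeats means a changeset that is applied twice after an interrupted sync leaves the store in the same state.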

And that sums up my thoughts for the logic that this system will use :-) Any insight, suggestions, answers to problems, etc. is greatly appreciated.

UPDATE

After lots of thinking, and reading TechZen's suggestions, I have come up with some modifications to my concept.

The largest change I've thought up is to make each device have a separate data store in the cloud. Basically, every time the managed object context saves (thanks TechZen), it will upload the changes to that device's data store. After those changes are updated, it will create a "changeset" file with change details, and save it into the changeset folders of the OTHER devices that are using the application. When the other devices connect to sync, they will go through the changeset folder and apply each changeset to the local data store, then update their respective data stores in the cloud as well.

Now, if a new device is registered with the account, it will find the newest copy of the data out of all the devices and download that for use as its local storage. This solves the problem of simultaneous sync and reduces the chances of data corruption because there is no "central" data store: each device touches only its own data and just uploads changes, rather than every device accessing and modifying the same data at the same time.
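The per-device changeset folders behave like inboxes. A small Python sketch of that flow, with plain lists standing in for the folders and hypothetical function names:

```python
from typing import Dict, List


def fan_out(changeset: dict, source: str, inboxes: Dict[str, List[dict]]) -> None:
    """Append a changeset to the inbox of every device except its source."""
    for device_id, inbox in inboxes.items():
        if device_id != source:
            inbox.append(changeset)


def drain(device_id: str, inboxes: Dict[str, List[dict]]) -> List[dict]:
    """Return and clear the pending changesets for one device, oldest first."""
    pending, inboxes[device_id] = inboxes[device_id], []
    return pending
```

Each device only ever empties its own inbox, which is what removes the shared, contended data store. Two devices can still produce conflicting changesets for the same object, so conflict handling is needed on top of this.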

There are some obvious conflict situations to deal with, mainly in relation to deleting objects. For example, if a downloaded changeset instructs the app to delete an object that is currently being edited, there needs to be a way to deal with this.
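One possible policy, sketched in Python, is to hold back any delete that targets an object with an open edit and decide later (for example by asking the user). The policy, the `split_deletes` name, and the changeset format (dictionaries with `op` and `id` keys) are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def split_deletes(changeset: List[dict],
                  editing: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Separate changes that are safe to apply now from deletes that hit an open edit."""
    apply_now: List[dict] = []
    deferred: List[dict] = []
    for change in changeset:
        if change["op"] == "delete" and change["id"] in editing:
            deferred.append(change)
        else:
            apply_now.append(change)
    return apply_now, deferred
```

The deferred list would be revisited once the edit is saved or cancelled.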

Solution

You want to look at this pessimistic take on cloud sync: Why Cloud Sync Will Never Work. It covers a lot of the issues that you are wrestling with. Many of them are largely intractable.

It is very, very, very difficult to synchronize information, period. Adding in different devices, different operating systems, different data structures, etc. snowballs the complexity, often fatally. People have been working on variants of this problem since the 70s and things really haven't improved much.

The fundamental problem is that if you leave the system flexible and customizable, then the complexity of synchronizing all the variations explodes exponentially as a function of the number of customizations. If you make it rigid, you can sync, but you are limited in what you can sync.

How would I handle 2 devices connecting and syncing with the cloud at the same time?

If you figure that out, you will be rich. It's a big issue for current cloud sync providers. The real problem here is that you're not "syncing", you're merging. Software sucks at merging because it's very hard to establish a predefined rule set that describes all the possible merges.

The simplest system is to establish either a canonical device or a device hierarchy such that the system always knows which input to choose. This, however, destroys flexibility.
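A device hierarchy reduces conflict resolution to a lookup. A minimal Python sketch, with a hypothetical `pick_winner` helper and made-up device names:

```python
from typing import Dict, List


def pick_winner(versions: Dict[str, dict], priority: List[str]) -> dict:
    """Return the version from the highest-ranked device that supplied one."""
    for device_id in priority:
        if device_id in versions:
            return versions[device_id]
    raise KeyError("no version from any ranked device")


# The Mac outranks the phone, so its edit wins the conflict.
winner = pick_winner(
    {"phone": {"title": "from phone"}, "mac": {"title": "from mac"}},
    priority=["mac", "phone"],
)
assert winner == {"title": "from mac"}
```

The rule is trivial to implement and always terminates, but the lower-ranked device's edit is silently discarded, which is the loss of flexibility described above.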

How would I handle migrations of the Core Data managed object model?

The migration of the Core Data model is largely irrelevant to the server. That's something that Core Data manages internally to itself. Model migration updates the model i.e. the entity graph, not the actual data.

Converting NSManagedObjects into JSON

Modeling relationships is hard, especially with tools that don't support it as easily as Core Data does. However, the URI of a permanent managed object ID is supposed to serve as a UUID that nails the object down to a specific location in a specific store on a specific device. It's not technically guaranteed to be universally unique, but it's close enough for all practical purposes.

Syncing changes to the cloud

I think you're confusing implementation details of Core Data with the cloud itself. If you use NSManagedObjectContextObjectsDidChangeNotification, you will trigger network traffic every time the observed context changes, regardless of whether those changes are persisted or not. Depending on the app, this could drive connections thousands of times in a few minutes. Instead, you only want to sync when the context is saved, at the most.
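The difference can be sketched in Python with a buffer that records changes locally and contacts the network only on save. `ChangeBuffer` and its method names are hypothetical. In a Core Data app, `saved` would presumably be driven by NSManagedObjectContextDidSaveNotification rather than the objects-did-change notification.

```python
from typing import Callable, Dict


class ChangeBuffer:
    """Collects object changes and hands them to the uploader only on save."""

    def __init__(self, upload: Callable[[Dict[str, dict]], None]) -> None:
        self._upload = upload
        self._pending: Dict[str, dict] = {}  # object id -> latest values

    def object_changed(self, object_id: str, values: dict) -> None:
        self._pending[object_id] = values  # no network traffic here

    def saved(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, {}
            self._upload(batch)  # one upload per save, however many edits

    def rolled_back(self) -> None:
        self._pending.clear()  # unsaved changes never reach the cloud
```

Repeated edits to the same object collapse into a single entry, so a burst of keystrokes costs one upload instead of thousands.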

One problem that comes up here is, how would I handle a failed or interrupted sync?

You don't commit changes until the sync completes. An interrupted sync is a big problem because it leads to corrupt data. Again, you can have flexibility, complexity and fragility, or inflexibility, simplicity and robustness.

Retrieving changed objects: This is fairly simple, the device downloads its changeset file, figures out which objects need to be updated/inserted/deleted, then acts accordingly

It's only simple if you have an inflexible data structure. Describing changes to a flexible data structure is a nightmare.

Not sure if I have helped any. None of the problems have elegant solutions. Most designers end up with rigidity and/or slow, brute-force iterative merging.
