Creating a data migration tool

Problem description

I have been given a task at work to create a data migration tool for the new version of my company's software. The purpose of this tool is to help the clients who use the software migrate data from table(s) in the old database to the new database.

I have already worked out the mapping for the whole process, but the challenge I am facing is that both the old and the new system have relationships to other tables, so I have to migrate both the data and the relationships from the old system. To make matters even more annoying, the table in the old database that this migration tool will read from has a couple of columns that indicate whether a record belongs to a certain category; each of these columns holds a 1 or a 0 to show whether the record belongs to that category or not. The new system, however, is more normalized: we have a Category_Table (which holds the category description and category_type ID) and a Category_Types table (which holds the Record ID and category_type ID). These sorts of complex issues are getting to me and are mentally exhausting.

The help I need is from someone who has built a data migration tool (or something similar) before: some advice on how to start this project, and perhaps a few useful tips or even tutorials. I am still new to enterprise software development and still getting used to complex projects, so any helpful tip is appreciated.

NB: The database server we use is Sybase SQL Anywhere 16, while the old system runs on Sybase SQL Anywhere 10.

Recommended answer

One of the best tools for data migration is SSIS (SQL Server Integration Services), which is designed to Extract, Transform and Load (ETL) data.
Check the links and start creating your ETL package:

SSIS Overview - Part I
Steps to Create your First SSIS Package

http://technet.microsoft.com/en-us/library/ms169917.aspx

You need to start this project with some research. One of the best technical aids you can have is a sandbox, where you can create databases from scratch and develop your technique without worrying about breaking anything or wasting time.

It looks like your problem could be called "welcome to the world of relational algebra"; it is a very common one. First of all, you need to develop a formal model that maps two databases with different schemas onto each other in a semantically meaningful way. Start by determining where the information about the schema and the semantics is stored. The semantics is stored only in the developer's head (perhaps accompanied by some documentation, and somehow reflected in the application code), but the schema (metadata) is stored in the database itself. You can write code to retrieve it from the database, though how you do that depends on the particular RDBMS.
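
As a minimal sketch of retrieving the schema from the database itself, the Python snippet below uses an in-memory SQLite sandbox rather than SQL Anywhere, because catalog queries are RDBMS-specific. The table `old_record` and its columns are invented for illustration; against SQL Anywhere you would query its own system catalog instead of `sqlite_master` and `PRAGMA table_info`.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for every user table."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical stand-in for the old database (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE old_record ("
    "id INTEGER PRIMARY KEY, name TEXT, is_retail INTEGER, is_wholesale INTEGER)"
)

schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

Running the same function against both the old and the new sandbox gives two dictionaries that can be compared side by side while the mapping is worked out.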

Now, about the complexity of the problem itself. Have you ever heard of the "algorithm of the invention"? One of its approaches is: replace your problem, which seems too difficult, with a generalization of that problem. Naturally, if you solve the more general problem, you automatically solve your current one. But how can a more general problem, which by definition is at least as difficult as the particular one, be easier to solve? Surprisingly, there are many cases where this approach helps. The difference is psychological: looking at a particular problem clouds our vision with specific details that look important but are in fact unrelated to the real problem and only distract you from the real solution. Your Category_Table (using the word "Table" in the name of a table is ridiculous: who invented it?! It is the same as if Microsoft named its executable files differently: "WinWord_Application" instead of "WinWord.EXE", and "IEApplication.exe" instead of "IExplore.EXE" :-); in future, never use such names) and Category_Types are just one example of cross-related tables.

This leads to another idea: try to abstract away from the semantics of the problem. If you did not know that the category tables contain "categories", what would you do? You would probably be able to formalize the task, given some additional information that is itself abstracted from the semantics: how are certain tables in the old database formally mapped onto certain tables in the new one?
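
One way to picture such a formal mapping is a small table of "old flag column to new category" pairs that a generic routine applies without knowing what the categories mean. The Python sketch below runs against in-memory SQLite databases as a sandbox. The table `old_record`, the flag columns `is_retail` and `is_wholesale`, and the column names inside Category_Table and Category_Types are assumptions made for illustration, not the real schema.

```python
import sqlite3

# Hypothetical mapping: old flag column -> category description in the new schema.
FLAG_TO_CATEGORY = {
    "is_retail": "Retail",
    "is_wholesale": "Wholesale",
}


def migrate_flags(old, new, flag_map):
    """Turn every flag column set to 1 into one (record_id, category_type_id) row."""
    for flag, description in flag_map.items():
        cur = new.execute(
            "INSERT INTO Category_Table (description) VALUES (?)", (description,)
        )
        category_type_id = cur.lastrowid
        # Column names come from our own mapping, not from user input.
        rows = old.execute(f'SELECT id FROM old_record WHERE "{flag}" = 1').fetchall()
        new.executemany(
            "INSERT INTO Category_Types (record_id, category_type_id) VALUES (?, ?)",
            [(record_id, category_type_id) for (record_id,) in rows],
        )
    new.commit()


old = sqlite3.connect(":memory:")
old.executescript("""
    CREATE TABLE old_record (
        id INTEGER PRIMARY KEY, name TEXT, is_retail INTEGER, is_wholesale INTEGER);
    INSERT INTO old_record VALUES (1, 'A', 1, 0), (2, 'B', 1, 1), (3, 'C', 0, 0);
""")

new = sqlite3.connect(":memory:")
new.executescript("""
    CREATE TABLE Category_Table (category_type_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE Category_Types (record_id INTEGER, category_type_id INTEGER);
""")

migrate_flags(old, new, FLAG_TO_CATEGORY)
links = new.execute(
    "SELECT record_id, category_type_id FROM Category_Types "
    "ORDER BY record_id, category_type_id"
).fetchall()
print(links)
```

Because the mapping is data rather than code, adding another flag column only means adding another entry to `FLAG_TO_CATEGORY`.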

Finally, creating a generalized problem should, on the other hand, reduce the size of the schema. This alone can give you a better understanding of how to solve the problem. Replace the model that currently involves tens (or hundreds?) of tables in each of the two databases with a more general one that nevertheless operates with only four tables in the old database and five (or so) in the new one. Separate the complexity of principle from the complexity of large numbers; they are often unrelated.
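
A reduced model of this kind can be very small. The Python sketch below, again on in-memory SQLite as a sandbox, uses one parent table and one child table per side to isolate a single principle: carrying relationships across when the new database assigns its own keys. All table and column names (`customer`, `invoice`, `client`, `bill`) are hypothetical.

```python
import sqlite3

old = sqlite3.connect(":memory:")
old.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (10, 'Acme'), (20, 'Globex');
    INSERT INTO invoice VALUES (1, 20, 99.5), (2, 10, 12.0), (3, 20, 7.25);
""")

new = sqlite3.connect(":memory:")
new.executescript("""
    CREATE TABLE client (client_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bill (
        bill_id INTEGER PRIMARY KEY,
        client_id INTEGER REFERENCES client(client_id),
        total REAL);
""")

# Parents first: remember which new key each old key received.
id_map = {}
for old_id, name in old.execute("SELECT id, name FROM customer ORDER BY id"):
    cur = new.execute("INSERT INTO client (name) VALUES (?)", (name,))
    id_map[old_id] = cur.lastrowid

# Children second: rewrite each foreign key through the map.
for customer_id, total in old.execute(
    "SELECT customer_id, total FROM invoice ORDER BY id"
):
    new.execute(
        "INSERT INTO bill (client_id, total) VALUES (?, ?)",
        (id_map[customer_id], total),
    )
new.commit()

# Check that the relationship survived: totals per client in the new schema.
totals = new.execute("""
    SELECT c.name, SUM(b.total)
    FROM client c JOIN bill b ON b.client_id = c.client_id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(totals)
```

Once the parent-before-child ordering and the key map work on two tables, the same pattern can be repeated table by table on the full schema.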

One more piece of "replacing" advice: mentally replace your adjective "annoying" with "challenging". Treat every hassle as an opportunity.

—SA