哪些算法可以分析库裂变的调用依赖关系? [英] What algorithms can analyse call dependencies for library fission?

查看:45
本文介绍了哪些算法可以分析库裂变的调用依赖关系?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有一个包含一堆相互依赖的函数的库,这个库太大了,我想拆分它.有哪些算法可以找到合适的分区?

Suppose I have a library which contains a bunch of interdependent functions, this library is too big and I want to split it up. What algorithms are there to find suitable partitions?

简单的例子,里面有四个函数:alpha、beta、gamma和delta.

Simple example, there are four functions in it: alpha, beta, gamma, and delta.

  • beta 和 gamma 都调用 delta.
  • module1 调用 alpha 和 beta.
  • module2 调用 gamma.
  • module3 调用 alpha、beta 和 gamma.

算法的输出可以是:

  • LibA 包含(alpha、beta)
  • LibB 包含 (gamma)
  • LibC 包含(增量)
  • module1 依赖于 LibA
  • module2 依赖于 LibB
  • module3 依赖于 LibA 和 LibB
  • LibA 依赖于 LibC
  • LibB 依赖于 LibC

即它找到具有以下属性的最细粒度的 Lib* 分区

i.e. it finds the most-finely-grained Lib* partition with the following property

对于所有 X,如果 LibX 被任何方法划分为 LibY 和 LibZ,那么所有依赖于 LibY 的模块/库也依赖于 LibZ,反之亦然.

For all X, if LibX is partitioned by any method into LibY and LibZ then all modules/libraries which depend on LibY also depend on LibZ and vice-versa.

有没有标准的解决方案?

Is there a standard solution for this?

推荐答案

(这也是人们在 C 和 C++ 程序中遇到的头文件问题.)

(This is the same kind of problem that people have with header files in C and C++ programs, too.)

创建依赖的不仅仅是调用";它是对成员变量、静态变量甚至常量定义的任何 类型的引用.

It isn't just "calls" which create dependencies; it is any kind of reference, to a member variable, a static variable or even a constant definition.

基本上你需要做的是发现所有细粒度的依赖关系(这通常需要一个类似编译器的分析工具来读取代码并发现声明的语言元素(声明、字段、方法、类、包)之间的这种依赖关系如果您以 Java 为中心,等等)和其他语言元素.使用编写库的语言的语义.(这样的分析可能是保守的.这就是本质给你一个巨大的图,节点是语言元素和弧被使用".

Basically what you need to do is to discover all the fine grain dependencies (this usually requires a compiler-like analysis tool that reads the code and discovers such dependencies, between declared language elements (declarations, fields, methods, classes, packages if you are java-centric, etc.) and other language elements. using the semantics of the language in which the libraries are written. (Such an analyis is probably conservative). This is essence gives you a giant graph, with nodes being language elements, and arcs being "uses".

抽象的库打包问题是将这个图分解成块,最小化跨块依赖弧.这可能会为您提供大量的小型库.

The library packaging problem in the abstract is breaking this graph apart into chunks, minimizing cross-chunk dependency arcs. This may give you huge numbers of small libraries.

实际问题是将一些相互之间没有实际依赖关系但通常一起使用的块组合在一起.例如,一组缓冲区访问过程可能对默认缓冲区大小的定义没有任何显式依赖,但您可能想要一个包含两者的库,而不是两个库,其中一个库只包含默认缓冲区大小声明.这种一起使用的概念实际上是一个问题域工件,在代码中的任何地方都看不到,除了一些统计上的共同使用.

The practical problem is grouping together some chunks which have no actual dependency on one another, but are commonly used together. For instance, a set of buffer access procedures may not have any explicit dependency on a definition of default buffersize, but you probably want one library containing both, rather than two libraries with one containing just the default buffersize declaration. This notion of used-together is really a problem domain artifact, and isn't visible anywhere in the code except for perhaps some statistical co-occurence of uses.

这个问题的难点在于发现细粒度的语义依赖.您可以手动进行近似处理,但如果问题有任何规模,您就没有兴趣去做.(人们不会出于同样的原因重新组织头文件).您几乎需要语言工具来进行分析、大图管理来提出块、统计分析来获得色调分组,还可能需要一个用户界面来允许领域专家编辑分组以生成修订后的库.

The hard part of this problem is discovering the fine-grain semantic dependencies. You can approximate this by hand, but if there is any scale to the problem, you won't have the appetite to do it. (People don't reorganize header files for the same reason). Pretty much you need language tools to do the analysis, big graph management to propose chunks, statistical analysis to get a hueristic grouping, and probably a UI to allow a domain expert to editing the grouping to produce the revised libraries.

然后您需要一个工具来返回使用旧库的代码,并修改它们以使用修订后的库.库重构和代码库修订都需要大量的代码分析和更改,需要自动化.

Then you need a tool to go back to code that uses the legacy libraries, and modify them to use the revised libraries. Both the library refactoring and the code base revision requires massive code analysis and change, which needs automation.

我们的 DMS 软件再造工具包及其许多语言前端 可能是实现这种库重组的良好基础.我们已经考虑过为 C 和 C++ 做这件事 [这就是我有这个回应的原因],但即使对我们来说,这也是一项艰巨的任务.我们希望获得一些严肃的额外动力!

Our DMS Software Reengineering Toolkit with its many language front ends is probably a good foundation for implementing such a library reorganization. We've considered doing this for C and C++ [which is why I have this response], but its a big task even for us to do. We'd love some serious additional motivation!.

这篇关于哪些算法可以分析库裂变的调用依赖关系?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆