What duplication detection threshold do you use?

Question

We all agree that duplication is evil and should be avoided (the Don't Repeat Yourself principle). To enforce that, a static code analysis tool such as Simian (multi-language) or Clone Detective (a Visual Studio add-in) should be used.

I just read Ayende's post about Kobe, in which he says:

8.5% of Kobe is copy & pasted code. And that is with the sensitivity dialed high, if we set the threshold to 3, which is what I commonly do, it goes up to 12.5%.

I think 3 is a very low threshold. My company offers code quality analysis as a service; our default duplication threshold is set to 20, and even then a lot of duplicates are reported. If we set it to 3, I can't imagine our customers even considering fixing them all.
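To make the effect of the threshold concrete, here is a minimal sketch of a line-based duplicate detector. It is not how Simian or Clone Detective work internally; the function name `duplicated_fraction`, the sliding-window approach, and the sample input are all assumptions made for illustration, and the threshold is taken to mean "minimum number of consecutive matching lines".

```python
from collections import defaultdict


def duplicated_fraction(lines, threshold):
    """Fraction of non-blank lines that sit inside a repeated run of
    `threshold` consecutive lines (hypothetical, naive detector)."""
    # Ignore blank lines and leading/trailing whitespace.
    code = [line.strip() for line in lines if line.strip()]
    if threshold < 1 or len(code) < threshold:
        return 0.0

    # Record where each window of `threshold` consecutive lines starts.
    windows = defaultdict(list)
    for start in range(len(code) - threshold + 1):
        windows[tuple(code[start:start + threshold])].append(start)

    # Mark every line covered by a window that occurs more than once.
    covered = set()
    for starts in windows.values():
        if len(starts) > 1:
            for start in starts:
                covered.update(range(start, start + threshold))
    return len(covered) / len(code)


# Made-up sample: a three-line block pasted twice.
SAMPLE = """
a = load()
b = clean(a)
save(b)
print("first")
a = load()
b = clean(a)
save(b)
print("second")
""".splitlines()

low = duplicated_fraction(SAMPLE, 3)    # the pasted block is reported
high = duplicated_fraction(SAMPLE, 20)  # nothing is long enough to report
```

With the same input, a lower threshold can only report the same amount of duplication or more, which is why the percentage in the quote rises when the threshold is set to 3.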

I understand Ayende's opinion about Kobe: it's an official sample, marketed as "intended to guide you with the planning, architecting, and implementing of Web 2.0 applications and services," so the expectation of quality is high.

But for your own projects, what minimum duplication threshold do you use?

Related question: How fanatically do you eliminate code duplication?

Recommended answer

Three is a good rule of thumb, but it depends. Refactoring to eliminate duplication often involves trading conceptual simplicity of the codebase and API for a smaller codebase that is more maintainable once someone does understand it. I generally evaluate things in this light.

At one extreme, if fixing the duplication makes the code more readable and adds little or nothing to the conceptual complexity of the code, then any duplication is unacceptable. An example of this would be whenever the duplicated code factors out neatly into a simple referentially transparent function that does something that's easy to explain and name.
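As a small illustration of that easy case, the sketch below factors a repeated bounds check into one pure function. The names `clamp`, `volume_percent`, and `brightness_level` are hypothetical and chosen only for this example.

```python
def clamp(value, low, high):
    """Pure helper: the result depends only on the arguments and the
    function has no side effects, so any call can be replaced by its value."""
    return max(low, min(value, high))


def volume_percent(raw):
    # Previously an inline "if raw < 0: ... elif raw > 100: ..." block.
    return clamp(raw, 0, 100)


def brightness_level(raw):
    # Previously the same inline block, copied with different bounds.
    return clamp(raw, 1, 10)
```

The helper is easy to name and explain, so removing the duplication here costs almost nothing in conceptual complexity.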

When a more complex, heavyweight solution such as metaprogramming or OO design patterns is necessary, I may allow 4 or 5 instances, especially if the duplicated snippet is small. In these cases I feel that the conceptual complexity of the solution makes the cure worse than the disease until there are really a lot of instances.

In the most extreme case, where the codebase I'm working with is a very rapidly evolving prototype and I don't know enough about what direction the project may evolve in to draw abstraction lines that are both reasonably simple and reasonably future-proof, I just give up. In a codebase like this, I think it's better to just focus on expediency and getting things done than good design, even if the same piece of code is duplicated 20 times. Often the parts of the prototype that are creating all that duplication are the ones that will be discarded relatively soon anyhow, and once you know what parts of the prototype will be kept, you can always refactor these. Without the additional constraints created by the parts that will be discarded, refactoring is often easier at this stage.