How do medium to large development teams constantly push changes to a DVCS?


Question

The company I work for is attempting to get off of StarTeam, a centralized VCS, and hopefully into something better. I don't want to just pick SVN because it's the "easy option" (most similar to what we do today). Can someone help me understand how a medium to large team of developers changing roughly 150 files per day can operate on a DVCS like Git or Mercurial?

We have:

  • 50 developers working on a single repository that has
  • 15 years of history
  • 15,000 files (mostly code/text)
  • Approximately 150 single file check-ins a day
  • Need to pull changes from central every day due to our unique language's requirements and data structure

After days of research and experimenting, I understand the processes and workflows of both Git and Hg. What I don't understand is how large teams that need to make changes very quickly, often to single files, can operate within the constraints of these systems.

For example: I might be working on 3 different things and a developer comes up to me presenting a small issue. I do a little research, make the necessary 1 line change, and check in the file. From what I can tell, with git I would need to commit the change, stash all my other changes, pull, push, apply stash, deal with any merges applying my stash over recently pulled code might present. Is there a better way?

I've read the article about Facebook switching from Git to Mercurial. How do they manage thousands of check-ins a day on a 40,000 file repository? I can't fathom how that would work with the pull/push model of a DVCS. There must be something I'm missing about the workflow or functionality of these systems. Maybe they don't need to pull/re-base every day?

Any help is appreciated.

Solution

> I've read the article about Facebook switching from Git to Mercurial. How do they manage thousands of check-ins a day on a 40,000 file repository? I can't fathom how that would work with the pull/push model of a DVCS. There must be something I'm missing about the workflow or functionality of these systems. Maybe they don't need to pull/re-base every day?

First of all, Facebook doesn't use Mercurial in a purely DVCS fashion. They use the remotefilelog extension (which they wrote themselves) to only pull changesets on demand (this works because of how Mercurial stores history so that history accesses can mostly be localized). They also use MySQL and memcached for their backend servers to improve scalability.

For the rebase/merge bottleneck (when dozens of developers need to integrate their work into the main branch simultaneously), they have a work-in-progress feature that does most of this server-side. Note that this bottleneck is in part the result of (1) using a monorepo and (2) using trunk-based development, so it is a problem that not every large organization will have.

> For example: I might be working on 3 different things and a developer comes up to me presenting a small issue. I do a little research, make the necessary 1 line change, and check in the file. From what I can tell, with git I would need to commit the change, stash all my other changes, pull, push, apply stash, deal with any merges applying my stash over recently pulled code might present. Is there a better way?

You can use the new Git worktree feature (or, for Mercurial, hg share) to have multiple checkouts of the same repository and work on them independently. Bazaar and Fossil have always been able to do that natively; Fossil also has the autosync feature to operate in an SVN-like fashion (with caveats), and Bazaar has always been able to work in a quasi-centralized fashion with commits going directly to the server. In particular, having "only a single checkout per repository" is largely a historical Git artifact and not representative of DVCSs in general (and, as I noted, it is gone as of the more recent Git versions).
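As a rough illustration of the worktree approach, the sketch below builds the Git commands that would give a one-line fix its own checkout, leaving the in-progress work in the main checkout untouched. The remote name `origin`, the base `origin/main`, the directory and branch names, and the helper functions themselves are assumptions made for this example; they do not come from the question or the answer.

```python
import subprocess


def worktree_hotfix_commands(repo, hotfix_dir, branch, base="origin/main"):
    """Return the git commands that set up a separate checkout for a quick fix.

    All names here (remote, base branch, directories) are illustrative.
    """
    return [
        # Refresh remote-tracking branches without touching any working tree.
        ["git", "-C", repo, "fetch", "origin"],
        # Create a new branch from the base and check it out in its own directory.
        ["git", "-C", repo, "worktree", "add", "-b", branch, hotfix_dir, base],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is safe to try.
    for argv in worktree_hotfix_commands(".", "../hotfix", "hotfix/one-line-fix"):
        print(" ".join(argv))
```

The fix is then edited, committed, and pushed from the new directory, and the extra checkout can be deleted afterwards with `git worktree remove`. Unfinished changes in the original checkout never need to be stashed.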

That said, you generally have to commit -> pull -> merge or rebase -> push for changes to the main branch; this is a tradeoff. The distributed model allows you to have easy local versioning so that work in progress doesn't go straight to the main repository before it is ready. It also generally assumes that most of your work will happen on a personal branch, so that 90% of the time it's really just commits (with the occasional push).
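For Git, the commit -> pull -> rebase -> push cycle for a single-file fix could be sketched as follows. This is one possible sequence under stated assumptions, not the only workflow: the remote `origin`, the branch `main`, the file path, and the helper function are all hypothetical. Passing a path to `git commit` commits only that file, and `git pull --rebase --autostash` sets aside other uncommitted changes for the duration of the rebase and reapplies them afterwards, which removes the manual stash steps described in the question.

```python
import subprocess


def single_file_fix_commands(path, message, remote="origin", branch="main"):
    """Return the git commands for committing one file and integrating it.

    The remote and branch names are assumptions for this sketch.
    """
    return [
        # Commit only the named file; other modified files stay uncommitted.
        ["git", "commit", "-m", message, "--", path],
        # Replay local commits on top of the latest remote branch, stashing
        # and restoring any remaining uncommitted changes automatically.
        ["git", "pull", "--rebase", "--autostash", remote, branch],
        # Publish the current branch's commits to the shared branch.
        ["git", "push", remote, f"HEAD:{branch}"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them.
    for argv in single_file_fix_commands("src/example.txt", "Fix one-line issue"):
        print(" ".join(argv))
```

If reapplying the stashed changes conflicts with the freshly pulled code, Git still stops and asks for a manual resolution, so the sequence shortens the routine case rather than removing merges altogether.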

Note that you won't have more or fewer conflicts than in a centralized VCS; they may just appear at different points.


All that said, if you're happy with your current VCS and you can't identify unambiguous benefits from switching, it's probably not worth changing your VCS. Switching VCSs incurs a measurable cost (rebuilding your repo, retraining your developers, adjusting workflows, adjusting the surrounding tooling, unexpected transition issues) and these costs need to be offset by equally measurable benefits to make them worthwhile.
