TFS 2010 Build: Sporadic failure in the process

Problem description

We have a situation where our builds have stopped executing in a stable manner.
About one build in three fails with either a TF215096 or a TF215097 error.
If we then restart the Build controller, it works again - until next time.

The errors we get are:

TF215096: An error occurred while connecting to controller vstfs:///Build/Controller/1: There was no endpoint listening at ht*p://XXXX that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.


TF215096: An error occurred while connecting to controller XXX - Controller: Could not connect to ht*p://XXX. TCP error code 10061: No connection could be made because the target machine actively refused it 192.168.XXX.XXX:XXX.


TF215097: An error occurred while initializing a build for build definition \XXX: Team Foundation services are not available from server ht*p://XXX. Technical information (for administrator): The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.


TF215097: An error occurred while initializing a build for build definition \YYY: An error occurred while receiving the HTTP response to ht*p://XXX. This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details.

The server logs provide little information; at least, we have found nothing in them that helps us resolve the situation. Various searches on the net were not productive either.

Has anybody had these or similar issues? Any ideas on how or where to look for a resolution?
Thank you very much in advance for any input!

Solution

Today is a happy day, since we managed to get to the bottom of the matter. Sorry @Duat that I'm taking away the 'answer' checkmark, but it turned out that the problem was quite different from what you (and everybody else) had predicted.

In my last update I was about to forward the matter to MS, when we realized that our firewall was misbehaving in name resolution. We assumed this was the culprit and waited for it to be fixed. After it was fixed, we STILL had the same issues, so we went back to re-examining the situation.

We isolated the problem to our build process, more specifically to a custom code activity included in our build solution.

I had implemented a code activity that would kick in during the final steps of every build. This activity gathered the BuildDetails of the running build and added them as a new row in a 'BuildLog.xls'.
The implementation made use of Microsoft.Office.Interop.Excel.
This Excel sheet resides on another server (NOT on the servers where the controller/agents reside).

During development of this activity I ran into issues like this, but once I was done no instances of EXCEL were left hanging, so I considered the matter dealt with.

By trial and error, we observed that when this activity did not run, no problems occurred.
With this activity running, the very first build after a build-controller reset would succeed, and every subsequent build had a certain chance of failing. Once any build failed, no other build would succeed until the next build-controller reset.

I have only a general understanding of what the problem was (the Excel call is DCOM, the TFS services are WCF: how on earth would they interfere? Why would this sometimes succeed and sometimes fail?).
The provided diagnostics were no help either; in fact, they misled us into a loop that continued for months.
If I ever find the time, I'd like to cleanly reproduce the error and make a Server Fault question out of it...


After removing this activity, it works! I have now searched SO and found this, where J.Saunders comments: "In general, you should never use Office Interop from a server environment".
It's ironic that once you get to the bottom of any difficult issue, the whole universe seems to have known about it except you...
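
For anyone who wants the same kind of build log without automating Excel on the build server, one option is to append a row to a plain CSV file, which Excel can still open afterwards. The following is only a minimal sketch in Python, not the activity we ran (that was a .NET code activity); the function name `append_build_record`, the column names, and the sample values are hypothetical and chosen purely for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for the build log.
FIELDS = ["build_number", "definition", "status", "finished_utc"]


def append_build_record(log_path, build_number, definition, status, finished=None):
    """Append one build record to a CSV log, writing the header on first use."""
    path = Path(log_path)
    finished = finished or datetime.now(timezone.utc)
    # Write the header only when the file is missing or still empty.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {
                "build_number": build_number,
                "definition": definition,
                "status": status,
                "finished_utc": finished.isoformat(),
            }
        )


if __name__ == "__main__":
    # Sample values only; a real build step would pass its own details.
    append_build_record("BuildLog.csv", "Build_1", "XXX", "Succeeded")
```

Plain file I/O like this starts no Office process and involves no COM or DCOM. If several builds can finish at the same moment, the appends would still need some form of locking, which this sketch leaves out.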
