Streaming a webcam from Silverlight 4 (Beta)

Question

The new webcam stuff in Silverlight 4 is darned cool. By exposing it as a brush, it allows scenarios that are way beyond anything that Flash has.

At the same time, accessing the webcam locally seems like it's only half the story. Nobody buys a webcam so they can take pictures of themselves and make funny faces out of them. They buy a webcam because they want other people to see the resulting video stream, i.e., they want to stream that video out to the Internet, à la Skype or any of the dozens of other video chat sites/applications. And so far, I haven't figured out how to do that with Silverlight.

It turns out that it's pretty simple to get a hold of the raw (Format32bppArgb formatted) bytestream, as demonstrated here.
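
Roughly, the capture side looks like this. This is a minimal sketch against the Silverlight 4 Beta API as I understand it (a CaptureSource plus a custom VideoSink subclass); treat the exact member names as approximate:

    // Sketch only: a VideoSink subclass that hands each raw
    // Format32bppArgb frame to whoever wants it.
    using System;
    using System.Windows.Media;

    public class RawFrameSink : VideoSink
    {
        public event Action<byte[]> FrameCaptured;

        protected override void OnCaptureStarted() { }
        protected override void OnCaptureStopped() { }
        protected override void OnFormatChange(VideoFormat videoFormat) { }

        // Called on a background thread; sampleData is one uncompressed frame.
        protected override void OnSample(long sampleTime, long frameDuration, byte[] sampleData)
        {
            var handler = FrameCaptured;
            if (handler != null) handler(sampleData);
        }
    }

    // Wiring it up (RequestDeviceAccess must run inside a user-initiated event):
    var source = new CaptureSource();
    source.VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice();
    if (CaptureDeviceConfiguration.RequestDeviceAccess())
    {
        var sink = new RawFrameSink();
        sink.CaptureSource = source;
        source.Start();
    }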

But unless we want to transmit that raw bytestream to a server (which would chew up way too much bandwidth), we need to encode that in some fashion. And that's more complicated. MS has implemented several codecs in Silverlight, but so far as I can tell, they're all focused on decoding a video stream, not encoding it in the first place. And that's apart from the fact that I can't figure out how to get direct access to, say, the H.264 codec in the first place.

There are a ton of open-source codecs (for instance, in the ffmpeg project here), but they're all written in C, and they don't look easy to port to C#. Unless translating 10000+ lines of code that look like this is your idea of fun :-)

const int b_xy= h->mb2b_xy[left_xy[i]] + 3;
const int b8_xy= h->mb2b8_xy[left_xy[i]] + 1;
*(uint32_t*)h->mv_cache[list][cache_idx ]= *(uint32_t*)s->current_picture.motion_val[list][b_xy + h->b_stride*left_block[0+i*2]];
*(uint32_t*)h->mv_cache[list][cache_idx+8]= *(uint32_t*)s->current_picture.motion_val[list][b_xy + h->b_stride*left_block[1+i*2]];
h->ref_cache[list][cache_idx ]= s->current_picture.ref_index[list][b8_xy + h->b8_stride*(left_block[0+i*2]>>1)];
h->ref_cache[list][cache_idx+8]= s->current_picture.ref_index[list][b8_xy + h->b8_stride*(left_block[1+i*2]>>1)];

The mooncodecs folder within the Mono project (here) has several audio codecs in C# (ADPCM and Ogg Vorbis), and one video codec (Dirac), but they all seem to implement just the decode portion of their respective formats, as do the Java implementations from which they were ported.

I found a C# codec for Ogg Theora (csTheora, http://www.wreckedgames.com/forum/index.php?topic=1053.0), but again, it's decode only, as is the jheora codec on which it's based.

Of course, it would presumably be easier to port a codec from Java than from C or C++, but the only Java video codecs I found were decode-only (such as jheora or jirac).

So I'm kinda back at square one. It looks like our options for hooking up a webcam (or microphone) through Silverlight to the Internet are:

(1) Wait for Microsoft to provide some guidance on this;

(2) Spend the brain cycles porting one of the C or C++ codecs over to Silverlight-compatible C#;

(3) Send the raw, uncompressed bytestream up to a server (or perhaps compressed slightly with something like zlib; see the sketch after this list), and then encode it server-side; or

(4) Wait for someone smarter than me to figure this out and provide a solution.
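
A quick illustration of option (3): Silverlight 4 doesn't ship a deflate encoder of its own, so this sketch assumes a managed zlib port such as SharpZipLib is referenced. It's a hypothetical helper, not code from a shipping app:

    using System.IO;
    using ICSharpCode.SharpZipLib.Zip.Compression;
    using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

    public static byte[] DeflateFrame(byte[] rawFrame)
    {
        using (var buffer = new MemoryStream())
        {
            // BEST_SPEED keeps the CPU cost down; raw ARGB frames are
            // repetitive enough that even fast deflate shrinks them noticeably.
            using (var deflate = new DeflaterOutputStream(buffer, new Deflater(Deflater.BEST_SPEED)))
            {
                deflate.Write(rawFrame, 0, rawFrame.Length);
            }
            return buffer.ToArray(); // ToArray still works after the stream is closed
        }
    }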

Does anybody else have any better guidance? Have I missed something that's just blindingly obvious to everyone else? (For instance, does Silverlight 4 somewhere have some classes I've missed that take care of this?)

Answer

I thought I'd let interested folks know the approach I actually took. I'm using CSpeex to encode the voice, but I wrote my own block-based video codec to encode the video. It divides each frame up into 16x16 blocks, determines which blocks have sufficiently changed to warrant transmitting, and then Jpeg-encodes the changed blocks using a heavily modified version of FJCore. (FJCore is generally well done, but it needed to be modified to not write the JFIF headers, and to speed up initialization of the various objects.) All of this is being passed up to a proprietary media server using a proprietary protocol roughly based on RTP.
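
For anyone curious, the change-detection step can be sketched like this. It's illustrative rather than my actual (proprietary) code; the sum-of-absolute-differences metric and the threshold are assumptions:

    using System;
    using System.Collections.Generic;

    // Split a Format32bppArgb frame into 16x16 blocks and return the indices
    // of the blocks that differ enough from the previous frame to be worth
    // re-encoding and transmitting.
    public static List<int> FindChangedBlocks(byte[] current, byte[] previous,
                                              int width, int height, int threshold)
    {
        const int blockSize = 16;
        const int bytesPerPixel = 4; // Format32bppArgb
        var changed = new List<int>();
        int blocksPerRow = width / blockSize;

        for (int by = 0; by < height / blockSize; by++)
        {
            for (int bx = 0; bx < blocksPerRow; bx++)
            {
                long diff = 0;
                for (int y = 0; y < blockSize; y++)
                {
                    int rowStart = ((by * blockSize + y) * width + bx * blockSize) * bytesPerPixel;
                    for (int i = 0; i < blockSize * bytesPerPixel; i++)
                        diff += Math.Abs(current[rowStart + i] - previous[rowStart + i]);
                }
                if (diff > threshold)
                    changed.Add(by * blocksPerRow + bx); // this block gets JPEG-encoded and sent
            }
        }
        return changed;
    }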

With one stream up and four streams down at 144x176, I'm currently getting 5 frames per second, using a total of 474 Kbps (~82 Kbps / video stream + 32 Kbps / audio), and chewing up about 30% CPU on my dev box. The quality's not great, but it's acceptable for most video chat applications.

Since I posted my original question, there have been several attempts to implement a solution. Probably the best is at the SocketCoder website here (and here).

However, because the SocketCoder motion JPEG-style video codec translates the entirety of every frame rather than just the blocks that have changed, my assumption is that CPU and bandwidth requirements are going to be prohibitive for most applications.

Unfortunately, my own solution is going to have to remain proprietary for the foreseeable future :-(.

Edit 7/3/10: I just got permissions to share my modifications to the FJCore library. I've posted the project (without any sample code, unfortunately) here:

http://www.alanta.com/Alanta.Client.Media.Jpeg.zip

A (very rough) example of how to use it:

    // Encode one video frame as a headerless JPEG using the modified FJCore classes.
    // colorModel, EncodedStream, height, width and MediaConstants.JpegQuality are
    // fields of the surrounding class and aren't shown here.
    public void EncodeAsJpeg()
    {
        byte[][,] raster = GetSubsampledRaster(); // one 2D plane per color component
        var image = new Alanta.Client.Media.Jpeg.Image(colorModel, raster);
        EncodedStream = new MemoryStream();
        var encoder = new JpegFrameEncoder(image, MediaConstants.JpegQuality, EncodedStream);
        encoder.Encode(); // writes the frame without any JFIF headers
    }

    public void DecodeFromJpeg()
    {
        EncodedStream.Seek(0, SeekOrigin.Begin);
        // Width, height and quality have to be supplied out-of-band, since the
        // frame carries no JFIF headers to describe itself.
        var decoder = new JpegFrameDecoder(EncodedStream, height, width, MediaConstants.JpegQuality);
        var raster = decoder.Decode();
    }

Most of my changes are around the two new classes JpegFrameEncoder (instead of JpegEncoder) and JpegFrameDecoder (instead of JpegDecoder). Basically, the JpegFrameEncoder writes the encoded frame without any JFIF headers, and the JpegFrameDecoder decodes the frame without expecting any JFIF headers to tell it what values to use (it assumes you'll share the values in some other, out-of-band manner). It also instantiates whatever objects it needs just once (as "static"), so that you can instantiate the JpegFrameEncoder and JpegFrameDecoder quickly, with minimal overhead. The pre-existing JpegEncoder and JpegDecoder classes should work pretty much the same as they always have, though I've only done a very little bit of testing to confirm that.

There are lots of things I'd like to improve about it (I don't like the static objects -- they should be instantiated and passed in separately), but it works well enough for our purposes at the moment. Hopefully it's helpful for someone else. I'll see if I can improve the code/documentation/sample code/etc. if I have time.
