Parsing an HLS m3u8 file using regular expressions


Problem description


I want to parse an HLS master m3u8 file and get the bandwidth, resolution and file name of each stream from it. Currently I am using plain String parsing: I search the string for certain patterns and take substrings to get the values.

Example File:

    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234
    Stream1/index.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=763319,RESOLUTION=480x270
    Stream2/index.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1050224,RESOLUTION=640x360
    Stream3/index.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1910937,RESOLUTION=640x360
    Stream4/index.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3775816,RESOLUTION=1280x720
    Stream5/index.m3u8

But I found that it can be parsed with regular expressions, as mentioned in this question: Problem matching regex pattern in Android

I have no experience with regular expressions, so could someone please guide me on how to parse this with one?

Or could someone help me write a regexp that extracts the BANDWIDTH and RESOLUTION values from the string below?

    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234

Solution

You could try something like this:

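    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Group 1 captures the BANDWIDTH digits, group 2 the RESOLUTION value.
    final Pattern pattern = Pattern.compile("^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+).*RESOLUTION=([\\dx]+).*");

    Matcher matcher = pattern.matcher("#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234");
    String bandwidth = "";
    String resolution = "";

    // find() returns true when the pattern matches somewhere in the input.
    if (matcher.find()) {
        bandwidth = matcher.group(1);
        resolution = matcher.group(2);
    }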

This sets bandwidth to "476416" and resolution to "416x234" (both as Strings).

I haven't tried this on an Android device or emulator, but judging from the link you sent and the Android API, it should work the same as the plain old Java above.

The regex matches strings that start with #EXT-X-STREAM-INF: and contain BANDWIDTH and RESOLUTION, each followed by a value in the expected format. Those values are captured in groups 1 and 2, so we can extract them.
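The question also asks for the file name, which the snippet above does not extract. A minimal sketch of one way to get it, assuming the URI always sits on the line immediately after its #EXT-X-STREAM-INF tag (as in the example file). The class MasterPlaylistParser and its Variant holder are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MasterPlaylistParser {

    // Hypothetical holder for one variant stream; not part of any HLS library.
    public static final class Variant {
        public final String bandwidth;
        public final String resolution;
        public final String uri;

        Variant(String bandwidth, String resolution, String uri) {
            this.bandwidth = bandwidth;
            this.resolution = resolution;
            this.uri = uri;
        }
    }

    // Same pattern as in the answer: group 1 = bandwidth, group 2 = resolution.
    private static final Pattern STREAM_INF = Pattern.compile(
            "^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+).*RESOLUTION=([\\dx]+).*");

    public static List<Variant> parse(String playlist) {
        List<Variant> variants = new ArrayList<>();
        String[] lines = playlist.split("\\r?\\n");
        // Stop one line early so that lines[i + 1] always exists.
        for (int i = 0; i < lines.length - 1; i++) {
            Matcher matcher = STREAM_INF.matcher(lines[i].trim());
            if (matcher.find()) {
                // Assumption: the URI is on the line right after the tag.
                String uri = lines[i + 1].trim();
                variants.add(new Variant(matcher.group(1), matcher.group(2), uri));
                i++; // skip the URI line that was just consumed
            }
        }
        return variants;
    }

    public static void main(String[] args) {
        String playlist = "#EXTM3U\n"
                + "#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234\n"
                + "Stream1/index.m3u8\n"
                + "#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=763319,RESOLUTION=480x270\n"
                + "Stream2/index.m3u8\n";
        for (Variant v : parse(playlist)) {
            System.out.println(v.bandwidth + " " + v.resolution + " " + v.uri);
        }
    }
}
```

Lines that do not match the pattern (such as #EXTM3U) are simply skipped, so the loop tolerates other tags between variants.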

Edit:

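If RESOLUTION isn't always present, you can make that portion optional. Note that the preceding .* has to go inside the optional group; left outside, the greedy .* would consume the rest of the line and the optional group would always match nothing:

    "^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+)(?:.*RESOLUTION=([\\dx]+))?.*"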

The resolution string would be null in cases where only BANDWIDTH is present.
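A small sketch of the optional case, to show what group 2 returns when RESOLUTION is missing. It keeps the .* inside the optional group so that a greedy .* cannot swallow the RESOLUTION attribute before the group gets a chance to match. The class and method names are made up for this example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalResolutionDemo {

    // The optional non-capturing group wraps ".*RESOLUTION=..." as a whole.
    private static final Pattern STREAM_INF = Pattern.compile(
            "^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+)(?:.*RESOLUTION=([\\dx]+))?.*");

    // Returns {bandwidth, resolution}; resolution is null when the attribute
    // is absent. Returns null when the line is not a matching tag at all.
    public static String[] extract(String line) {
        Matcher matcher = STREAM_INF.matcher(line);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] full = extract(
                "#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234");
        String[] partial = extract("#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416");
        System.out.println(full[0] + " / " + full[1]);
        System.out.println(partial[0] + " / " + partial[1]);
    }
}
```

Because group(2) can be null, any code that later calls methods on the resolution string should check for null first.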

Edit2:

? makes the preceding element optional, and (?:___) is a non-capturing (passive) group, as opposed to a capturing group (___). So (?:___)? is an optional non-capturing group, and yes, everything inside it is optional.

A . matches any single character (except a line terminator, by default), and a * means the preceding element is repeated zero or more times. So .* matches zero or more characters. We need it to consume whatever lies between the parts we care about, e.g. anything between #EXT-X-STREAM-INF: and BANDWIDTH. There are many ways of doing this, but .* is the most generic and broad one.
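As one illustration of an alternative to skipping text with .*, the attribute list can be walked pair by pair with repeated find() calls. This is only a sketch under the assumption that attributes look like NAME=value and are separated by commas, with values optionally wrapped in double quotes; AttributeListParser is a hypothetical name:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeListParser {

    private static final String TAG = "#EXT-X-STREAM-INF:";

    // Group 1 = attribute name; group 2 = either a quoted string (which may
    // contain commas) or everything up to the next comma.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([A-Z0-9-]+)=(\"[^\"]*\"|[^,]*)");

    public static Map<String, String> parse(String line) {
        Map<String, String> attributes = new LinkedHashMap<>();
        if (!line.startsWith(TAG)) {
            return attributes; // not a stream-inf line: return an empty map
        }
        Matcher matcher = ATTRIBUTE.matcher(line.substring(TAG.length()));
        // Each find() call resumes where the previous match ended.
        while (matcher.find()) {
            attributes.put(matcher.group(1), matcher.group(2));
        }
        return attributes;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = parse(
                "#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234");
        System.out.println(attributes.get("BANDWIDTH"));
        System.out.println(attributes.get("RESOLUTION"));
    }
}
```

With this approach a missing attribute simply yields null from the map, and the order of attributes on the line does not matter. Quoted values keep their surrounding quotes.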

\d is a shorthand character class that matches a single digit (0-9). Because the pattern is written as a Java string literal, the backslash has to be doubled (\\d): a lone \d is not a valid escape sequence in a Java string, so the code would not compile. The compiler turns \\ into a single \, so the string that reaches Pattern.compile contains \d.
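The effect of the doubled backslash can be checked directly by printing the string. A tiny sketch (the class name EscapeDemo is arbitrary):

```java
public class EscapeDemo {

    // In source code the literal is written with two backslashes...
    public static String digitsPattern() {
        return "\\d+";
    }

    public static void main(String[] args) {
        String regex = digitsPattern();
        // ...but the resulting String holds only one: the three characters \ d +
        System.out.println(regex);                   // prints \d+
        System.out.println(regex.length());          // prints 3
        System.out.println("476416".matches(regex)); // prints true
    }
}
```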

[\dx]+ means one or more characters (+), each of which is either a digit (0-9) or x. [\dx] on its own would match exactly one such character (no +).
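Since [\dx]+ accepts any mix of digits and x (including strings such as "xx"), a stricter pattern may be preferable once the resolution has been extracted. A possible sketch that splits it into width and height, assuming the WIDTHxHEIGHT form shown in the example file; ResolutionParser is a hypothetical helper:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResolutionParser {

    // Digits, a literal x, then digits again; nothing else is allowed.
    private static final Pattern RESOLUTION = Pattern.compile("(\\d+)x(\\d+)");

    // Returns {width, height}, or null if the text is not WIDTHxHEIGHT.
    public static int[] parse(String resolution) {
        Matcher matcher = RESOLUTION.matcher(resolution);
        // matches() requires the whole input to fit the pattern.
        if (!matcher.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(matcher.group(1)),
            Integer.parseInt(matcher.group(2))
        };
    }

    public static void main(String[] args) {
        int[] size = parse("416x234");
        System.out.println(size[0] + " by " + size[1]);
    }
}
```

Using matches() instead of find() avoids the need for explicit ^ and $ anchors here.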

If you are interested in regex, you could check out regular-expressions.info and/or regexone.com, where you will find much more in-depth answers to all your questions.
