Detect text in a script and obtain text within a script tag using Jsoup


Question

I want to extract the stream URL from the page at http://m.xemtvhd.com/vtv1.php, but my code fails.

How can I get the stream URL http://123.30.215.65/hls/4545780bfa790819/5/3/d836ad614748cdab11c9df291254cf836f21144da20bf08142455a8735b328ca/dnR2MQ==_m.m3u8 using Jsoup? The page source is:

    <html>
 <head>
  <style>html,body{margin:0;padding:0;background:#000;;}</style>
  <meta charset="utf-8">
  <script src="https://code.jquery.com/jquery-2.1.4.js"></script>
  <script type="text/javascript" src="https://cdn.jsdelivr.net/clappr/latest/clappr.min.js"></script>
  <meta name="referrer" content="no-referrer">
 </head>      
 <body> 
  <div style="width: 100%;"> 
  </div> 
  <div id="player"></div> 
  <script>
	player = new Clappr.Player({source: "http://123.30.215.65/hls/4545780bfa790819/5/3/d836ad614748cdab11c9df291254cf836f21144da20bf08142455a8735b328ca/dnR2MQ==_m.m3u8",
			parentId: '#player',
			width: '100%', height: "100%",
		    hideMediaControl: true,
		    autoPlay: true
					        });	
	</script>   
 </body>
</html>

My Java code:

Elements script = doc.select("script");
Pattern p = Pattern.compile("player = new Clappr.Player(\\(\"source:{\", \"(.*)\", false\\)");
                                                    //  ^^ is the capturing group
String url = "";

for (Element element : script) {
    Matcher m = p.matcher(element.data());
    if (m.find()){
        url = m.group(1);
    }
}
System.out.println(url);

Please help. Thank you!

Recommended answer

There are some mistakes in your regex. For example, you put the { after source, when in the script it comes before it, and you need to escape it with a double backslash (\\{ in a Java string) because {i,j} is quantifier syntax. The , is also in the wrong place. Overall, the pattern does not follow the order of the text it is supposed to match.
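The escaping point can be checked in isolation. The following is a minimal sketch, assuming the standard JDK java.util.regex implementation; the class name BraceEscapeDemo and the helper compiles are made up for illustration and are not part of the original question or answer.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BraceEscapeDemo {
    /** Returns true if the regex compiles, false if java.util.regex rejects it. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped brace: read as the start of a {i,j} quantifier and rejected.
        System.out.println(compiles("\\(\"source:{\""));

        // Escaped brace (\\{ in Java source, \{ in the regex): a literal '{'.
        System.out.println(compiles("\\(\\{source: \""));

        // The escaped form finds the literal text "({source: " in the script.
        System.out.println(
                Pattern.compile("\\(\\{source: ").matcher("Player({source: \"x\"").find());
    }
}
```

Wrapping Pattern.compile in a try/catch like this is only a convenience for experimenting; in real code a malformed pattern is usually a programming error that should surface immediately.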

To fix it, use: Pattern.compile("player = new Clappr\\.Player\\(\\{source: \"(http:.*)\",.*");
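As a rough sketch of how the corrected pattern behaves, the example below runs it against the inline script text from the question. To stay runnable without extra dependencies it takes the script content as a plain String; in the asker's loop that string would come from element.data(). The class name StreamUrlExtractor and the method extractStreamUrl are assumptions made for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StreamUrlExtractor {
    // Corrected pattern from the answer; group 1 captures the URL.
    // '.' does not match line breaks by default, so the greedy (http:.*)
    // stays on the "source" line and backtracks to the closing quote and comma.
    private static final Pattern SOURCE_URL = Pattern.compile(
            "player = new Clappr\\.Player\\(\\{source: \"(http:.*)\",.*");

    /** Returns the stream URL found in the script text, or empty if nothing matches. */
    static Optional<String> extractStreamUrl(String scriptData) {
        Matcher m = SOURCE_URL.matcher(scriptData);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for element.data() of the inline <script> shown in the question.
        String scriptData = "\n\tplayer = new Clappr.Player({source: \"http://123.30.215.65/hls/"
                + "4545780bfa790819/5/3/"
                + "d836ad614748cdab11c9df291254cf836f21144da20bf08142455a8735b328ca/"
                + "dnR2MQ==_m.m3u8\",\n"
                + "\t\t\tparentId: '#player',\n"
                + "\t\t\twidth: '100%', height: \"100%\",\n"
                + "\t\t    hideMediaControl: true,\n"
                + "\t\t    autoPlay: true\n"
                + "\t\t\t\t\t        });\t\n";

        // Prints the captured URL, or a fallback message if the pattern did not match.
        System.out.println(extractStreamUrl(scriptData).orElse("no stream URL found"));
    }
}
```

Returning an Optional rather than an empty string makes the "no match" case explicit, which is useful when a page contains several script tags and most of them do not define the player.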

When you write a regex, test it in an online tool such as regex101 and build it up piece by piece against the real text. Once it matches, you can replace literal words or groups with classes such as \d, \w, ... to shorten it.
