How to parse HTML tags in Matlab using regexp?

Problem description

I'm short on time and specifically want to extract a string like the one below. The problem is that the tag isn't of the form <a> data </a>.

Given

s = '<em style="font-size:medium"> 5,888 </em>';

how can I extract just 5,888 in Matlab?

Solution

Thanks, folks, for your help. I'm basically trying to get the population of a US county in Matlab. Thought I'd share my code, though it's not the most elegant. Might help some soul. :)

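county = 'morris';
state = 'ks';

% Replace spaces with '+' so a multi-word county name fits in the query string
county = strrep(county, ' ', '+');
str = sprintf('https://www.google.com/search?&q=population+%s+%s', county, state);
s = urlread(str);  % newer Matlab releases recommend webread instead

% Capture the text of the first <em ...> element, whatever its attributes
pop = regexp(s, '<em[^>]*>(.*?)</em>', 'tokens', 'once');
pop = strtrim(char(pop));     % token cell -> trimmed char vector
pop = strrep(pop, ',', '');   % drop the thousands separator
pop = str2double(pop);        % NaN if the captured text is not a number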
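
The same pattern can be checked without a network request by applying it to the sample string from the question. This is only a minimal sketch, assuming the markup looks exactly like that sample; the variable names tok and value are illustrative and not part of the original code.

```matlab
% Sample markup from the question, stored as a char vector
s = '<em style="font-size:medium"> 5,888 </em>';

% [^>]* skips any attributes inside the opening tag; (.*?) lazily
% captures everything up to the first closing </em>
tok = regexp(s, '<em[^>]*>(.*?)</em>', 'tokens', 'once');

% tok is a 1x1 cell holding ' 5,888 '; trim it and drop the comma
value = str2double(strrep(strtrim(tok{1}), ',', ''));

disp(value)   % displays 5888
```

Because the attribute part is matched by [^>]*, the same expression also works for a bare <em> tag. It would also match other tags whose names start with "em", such as <embed>, so a stricter pattern may be needed on pages that contain them.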
