CPP + Regular Expression to Validate URL


Problem description





I want to build a regular expression in C++ (MFC) that validates URLs.

The regular expression must satisfy the following conditions.

Valid URLs:

  http://cu-241.dell-tech.co.in/MyWebSite/ISAPIWEBSITE/Denypage.aspx/
  http://www.google.com
  http://www.google.co.in

Invalid URLs:

  1. http://cu-241.dell-tech.co.in/\MyWebSite/\ISAPIWEBSITE/\Denypage.aspx/ = The regex must detect and invalidate this URL because of the '\' characters in "/\MyWebSite/\ISAPIWEBSITE/\Denypage.aspx/".

  2. http://cu-241.dell-tech.co.in//////MyWebSite/ISAPIWEBSITE/Denypage.aspx/ = The regex must detect and invalidate this URL because of the repeated "/" characters in the path.

  3. http://news.google.co.in/%5Cnwshp?hl=en&tab=wn = The regex must detect and invalidate this URL because of the inserted %5C and %2F characters.

How can we develop a generic regular expression that satisfies the above conditions? Please help us by providing a regular expression that handles the above scenarios in C++ (MFC).

Solution

Have you tried the regular expression suggested in RFC 3986? If you can use GCC 4.9 or later, you can use <regex> directly.

RFC 3986 (Appendix B) states that with ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? you get the following submatches:

  scheme    = $2
  authority = $4
  path      = $5
  query     = $7
  fragment  = $9

For example:

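#include <cstdlib>
#include <iostream>
#include <regex>
#include <string>

int main(int argc, char *argv[])
{
  // Expect the URL to check as the only command-line argument.
  if (argc < 2) {
    std::cerr << "Usage: " << argv[0] << " <url>" << std::endl;
    return EXIT_FAILURE;
  }

  std::string url (argv[1]);
  unsigned counter = 0;

  // Component-splitting pattern from RFC 3986, Appendix B.
  std::regex url_regex (
    R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)",
    std::regex::extended
  );
  std::smatch url_match_result;

  std::cout << "Checking: " << url << std::endl;

  if (std::regex_match(url, url_match_result, url_regex)) {
    // Print every submatch; index 0 is the whole match.
    for (const auto& res : url_match_result) {
      std::cout << counter++ << ": " << res << std::endl;
    }
  } else {
    std::cerr << "Malformed url." << std::endl;
  }

  return EXIT_SUCCESS;
}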

Then:

./url-matcher http://localhost.com/path\?hue\=br\#cool

Checking: http://localhost.com/path?hue=br#cool
0: http://localhost.com/path?hue=br#cool
1: http:
2: http
3: //localhost.com
4: localhost.com
5: /path
6: ?hue=br
7: hue=br
8: #cool
9: cool
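
Note that the RFC 3986 pattern only splits a URL into components; it accepts nearly any input, so on its own it does not reject the three invalid cases from the question. One possible way to cover them is to add a few checks on the extracted submatches. The following is a minimal sketch, not a complete validator: the helper name is_acceptable_url is hypothetical, and restricting input to http/https URLs with a non-empty host is an assumption that the question does not state.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original answer): returns true when
// the URL splits with the RFC 3986 pattern and passes the three extra
// checks described in the question.
bool is_acceptable_url(const std::string& url)
{
  // RFC 3986 component pattern, default ECMAScript grammar.
  static const std::regex url_regex(
    R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
  // Percent-encoded backslash (%5C) or slash (%2F), upper or lower case.
  static const std::regex encoded_separator("%(5[Cc]|2[Ff])");

  std::smatch m;
  if (!std::regex_match(url, m, url_regex)) return false;

  const std::string scheme = m[2].str();
  const std::string authority = m[4].str();
  const std::string path = m[5].str();

  // Assumption: only absolute http/https URLs with a host are wanted.
  if (scheme != "http" && scheme != "https") return false;
  if (authority.empty()) return false;

  // Case 1: reject a literal backslash anywhere in the URL.
  if (url.find('\\') != std::string::npos) return false;
  // Case 2: reject empty path segments, i.e. "//" inside the path.
  if (path.find("//") != std::string::npos) return false;
  // Case 3: reject percent-encoded '\' or '/' in the path.
  if (std::regex_search(path, encoded_separator)) return false;

  return true;
}
```

With these checks, the three valid URLs from the question should be accepted and the three invalid ones rejected. In an MFC project that uses Unicode CString values, the same approach would presumably need std::wstring and std::wregex in place of std::string and std::regex.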
