Python - How to split a string by non alpha characters


Question


I'm trying to use Python to parse lines of C++ source code. The only thing I am interested in is the include directives.

    #include "header.hpp"

I want it to be flexible and still work with poor coding styles like:

          #   include"header.hpp"  

I have gotten to the point where I can read lines and trim whitespace before and after the #. However, I still need to find out which directive it is by reading the string until a non-alpha character is encountered, regardless of whether it is a space, quote, tab or angle bracket.

So basically my question is: how can I split a string that starts with alphabetic characters at the first non-alphabetic character?

I think I might be able to do this with a regex, but I have not found anything in the documentation that looks like what I want.

Also, if anyone has advice on how I would get the file name inside the quotes or angle brackets, that would be a plus.

Solution

You can do that with a regex. However, you can also use a simple while loop.

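def splitnonalpha(s):
    # Start at index 1 to skip the leading '#', which is not alphabetic.
    pos = 1
    # Advance while the current character is a letter.
    while pos < len(s) and s[pos].isalpha():
        pos += 1
    # Everything before pos is the directive; the rest is its argument.
    return (s[:pos], s[pos:])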

Test:

>>> splitnonalpha('#include"blah.hpp"')
('#include', '"blah.hpp"')
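For comparison, here is a rough sketch of the regex route mentioned above, which also covers the follow-up about pulling the file name out of the quotes or angle brackets. The patterns and the names `DIRECTIVE_RE`, `INCLUDE_TARGET_RE`, `split_directive` and `include_target` are made up for this illustration and are not part of the original answer. It assumes the directive name consists of ASCII letters only (unlike `str.isalpha()`, which also accepts non-ASCII letters), and it tolerates whitespace around the `#`, so the line does not have to be trimmed first.

```python
import re

# Hypothetical patterns for illustration only.
# Optional whitespace, '#', optional whitespace, the directive name
# (ASCII letters), then the remainder of the line.
DIRECTIVE_RE = re.compile(r'\s*#\s*([A-Za-z]+)(.*)', re.DOTALL)

# A file name enclosed in "..." or <...>, optionally preceded by whitespace.
INCLUDE_TARGET_RE = re.compile(r'\s*(?:"([^"]*)"|<([^>]*)>)')


def split_directive(line):
    """Return (directive, rest) for a preprocessor line, or None otherwise."""
    match = DIRECTIVE_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def include_target(line):
    """Return the file name of an #include line, or None for any other line."""
    parts = split_directive(line)
    if parts is None or parts[0] != 'include':
        return None
    match = INCLUDE_TARGET_RE.match(parts[1])
    if match is None:
        return None
    # Exactly one of the two groups matched; the other one is None.
    return match.group(1) if match.group(1) is not None else match.group(2)


print(split_directive('      #   include"header.hpp"  '))
print(include_target('#include <vector>'))
```

Note that `split_directive` returns the directive without the `#`, whereas `splitnonalpha` keeps it in the first element. Which of the two is more convenient depends on how the rest of the parser compares directive names.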
