How to remove \n and \r from a string

Question

I am currently trying to get the code from this website: http://netherkingdom.netai.net/pycake.html. A Python script then parses out all the code inside the HTML div tags and writes the text between the div tags to a file. The problem is that it adds a bunch of \r and \n to the file. How can I either avoid this or remove the \r and \n? Here is my code:

import urllib.request
from html.parser import HTMLParser
import re
page = urllib.request.urlopen('http://netherkingdom.netai.net/pycake.html')
t = page.read()
class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print(data)
        f = open('/Users/austinhitt/Desktop/Test.py', 'r')
        t = f.read()
        f = open('/Users/austinhitt/Desktop/Test.py', 'w')
        f.write(t + '\n' + data)
        f.close()
parser = MyHTMLParser()
t = t.decode()
parser.feed(t)

And here is the resulting file it makes:

b'
import time as t\r\n
from os import path\r\n
import os\r\n
\r\n
\r\n
\r\n
\r\n
\r\n'

Preferably, I would also like the leading b' and the trailing ' removed. I am using Python 3.5.1 on a Mac.

Recommended answer

A simple solution is to strip trailing whitespace:

with open('gash.txt', 'r') as var:
    for line in var:
        line = line.rstrip()
        print(line)

The advantage of rstrip() over a [:-2] slice is that it is also safe for UNIX-style files, where each line ends in a single \n rather than \r\n.
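To make the difference concrete, here is a minimal sketch comparing the two approaches. The sample strings are made up for illustration and do not come from the question's file.

```python
# Hypothetical sample lines: one Windows-style ending, one UNIX-style ending.
windows_line = "import os\r\n"
unix_line = "import os\n"

# rstrip() removes all trailing whitespace, whatever the line ending is.
stripped_windows = windows_line.rstrip()
stripped_unix = unix_line.rstrip()

# A fixed [:-2] slice assumes a two-character ending, so it cuts off a
# real character when the line ends in a single "\n".
sliced_windows = windows_line[:-2]
sliced_unix = unix_line[:-2]

print(repr(stripped_windows), repr(stripped_unix))
print(repr(sliced_windows), repr(sliced_unix))
```

Note that rstrip() with no argument also removes trailing spaces and tabs. If those need to be kept, rstrip('\r\n') strips only the line-ending characters.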

However, if you only want to get rid of \r and they might not be at the end-of-line, then str.replace() is your friend:

line = line.replace('\r', '')
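Since the question asks about both \r and \n, the same idea can be extended. The following is only a sketch with made-up sample text; it shows chained replace() calls and, as an alternative, str.translate() with a deletion table.

```python
# Hypothetical text with Windows line endings, for illustration only.
text = "import time as t\r\nfrom os import path\r\n"

# Removing only "\r" converts Windows line endings to UNIX ones.
unix_text = text.replace("\r", "")

# Chained replace() calls remove every "\r" and every "\n".
no_breaks = text.replace("\r", "").replace("\n", "")

# str.translate() does the same in one pass: mapping a code point to None
# deletes that character.
no_breaks_translate = text.translate({ord("\r"): None, ord("\n"): None})

print(repr(unix_text))
print(repr(no_breaks))
```

Removing every \n joins all lines into one, which is rarely what you want for source code. Dropping only the \r is usually the safer choice.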

If you have a bytes object (that's the leading b'), then you can convert it to a native Python 3 string using:

line = line.decode()
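A literal b'...' wrapper and visible \r\n escapes in a text file typically appear when bytes are converted with str() instead of decode(). Whether that happened in the asker's script is an assumption, since the code shown does call decode(). The sketch below uses a made-up bytes value and assumes UTF-8 encoding.

```python
# Hypothetical bytes, similar in shape to what page.read() returns.
raw = b"import os\r\n"

# str() on a bytes object gives its repr: the b'...' wrapper plus
# backslash escapes written out as literal characters.
as_repr = str(raw)

# decode() gives the actual text; UTF-8 is an assumption here.
as_text = raw.decode("utf-8")

# Drop carriage returns, then the trailing newline.
clean = as_text.replace("\r", "").rstrip("\n")

print(as_repr)
print(repr(clean))
```

Decoding first and cleaning afterwards keeps all the string methods from the answer (rstrip(), replace()) working on real text rather than on a repr.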
