read_table in pandas: how to get input from text into a DataFrame
Question
Alabama[edit]
Auburn (Auburn University)[1]
Florence (University of North Alabama)
Jacksonville (Jacksonville State University)[2]
Alaska[edit]
Fairbanks (University of Alaska Fairbanks)[2]
Arizona[edit]
Flagstaff (Northern Arizona University)[6]
Tempe (Arizona State University)
Tucson (University of Arizona)
This is my text. I need to create a DataFrame with one column for the state name and another column for the town name. I know how to remove the university names, but how do I tell pandas that every line ending in [edit] starts a new state?
Expected output DataFrame:
Alabama Auburn
Alabama Florence
Alabama Jacksonville
Alaska Fairbanks
Arizona Flagstaff
Arizona Tempe
Arizona Tucson
I am not sure whether I can use read_table; if I can, how? I did import everything into a DataFrame, but the state and the city end up in the same column. I also tried with a list, but the problem is the same.
I need something that works like this: if a line contains [edit], that line names the state for every line after it, up to the next [edit] line.
Recommended answer
Maybe pandas can do it directly, but it is easy to do in plain Python.
data = '''Alabama[edit]
Auburn (Auburn University)[1]
Florence (University of North Alabama)
Jacksonville (Jacksonville State University)[2]
Alaska[edit]
Fairbanks (University of Alaska Fairbanks)[2]
Arizona[edit]
Flagstaff (Northern Arizona University)[6]
Tempe (Arizona State University)
Tucson (University of Arizona)'''
# ---
result = []
state = None
for line in data.split('\n'):
    if line.endswith('[edit]'):
        # remember the new state
        state = line[:-6]  # strip the trailing `[edit]`
    else:
        # add [state, city] to the result
        city = line.split(' ', 1)[0]  # first word of the line
        result.append([state, city])
# --- display ---
for state, city in result:
    print(state, city)
If you read from a file:
result = []
state = None
with open('your_file') as f:
    for line in f:
        line = line.strip()  # remove '\n'
        if not line:
            continue  # skip blank lines
        if line.endswith('[edit]'):
            # remember the new state
            state = line[:-6]  # strip the trailing `[edit]`
        else:
            # add [state, city] to the result
            city = line.split(' ', 1)[0]  # first word of the line
            result.append([state, city])
# --- display ---
for state, city in result:
    print(state, city)
Now you can use result to create a DataFrame.
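As a rough sketch of that last step, the rows collected in result can be passed straight to the DataFrame constructor. The column names State and Town are assumptions for illustration and do not come from the question. The second half shows one possible pandas-only variant of the same idea: load each line into a single column with read_table, mark the [edit] lines, and forward-fill the state name onto the lines below it. Like the loop above, it keeps only the first word of each town line, so multi-word town names would need a different split.

```python
import io
import pandas as pd

# Sample text from the question.
data = '''Alabama[edit]
Auburn (Auburn University)[1]
Florence (University of North Alabama)
Jacksonville (Jacksonville State University)[2]
Alaska[edit]
Fairbanks (University of Alaska Fairbanks)[2]
Arizona[edit]
Flagstaff (Northern Arizona University)[6]
Tempe (Arizona State University)
Tucson (University of Arizona)'''

# 1) Plain-Python parse, as in the answer, then hand the rows to pandas.
result = []
state = None
for line in data.split('\n'):
    if line.endswith('[edit]'):
        state = line[:-len('[edit]')]
    else:
        result.append([state, line.split(' ', 1)[0]])

# 'State' and 'Town' are assumed column names.
df = pd.DataFrame(result, columns=['State', 'Town'])

# 2) pandas-only variant: one raw column, then forward-fill the state.
# The lines contain no tabs, so read_table yields a single column.
raw = pd.read_table(io.StringIO(data), header=None, names=['raw'])['raw']
is_state = raw.str.endswith('[edit]')
# State name on [edit] lines, missing elsewhere, then filled downwards.
states = raw.where(is_state).str.replace('[edit]', '', regex=False).ffill()
towns = raw.str.split(' ', n=1).str[0]
df2 = pd.DataFrame({'State': states, 'Town': towns})[~is_state]
df2 = df2.reset_index(drop=True)

print(df)
print(df2)
```

For a real file, the io.StringIO(data) argument would be replaced by the file path. Both variants should produce the same seven state and town rows listed in the expected output.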