How can I split a column into 2 in the correct way?

Problem description

I am web-scraping tables from a website and writing them to an Excel file. My goal is to split one column into two columns in the correct way.

The column I want to split: "FLIGHT"

I want this form:

First example: KL744 --> KL and 0744

Second example: BE1013 --> BE and 1013

So I need to separate the first 2 characters (into the first column), and then the remaining characters, which can be 1 to 4 characters long. If there are 4, I keep them as they are; if there are 3, I want to put a 0 in front of them; if there are 2, I want to put 00 in front of them (so my goal is to get 4 characters/digits in the second column).

How can I do this?

Here is my relevant code, which already contains some formatting code.

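import os
from datetime import datetime

import numpy as np
import pandas as pd

# datatable and cols come from the scraping step (not shown)
df2 = pd.DataFrame(datatable, columns=cols)
df2["UPLOAD_TIME"] = datetime.now()
# Flag rows in which any column contains "Scheduled", then drop them
mask = np.column_stack([df2[col].astype(str).str.contains(r"Scheduled", na=True) for col in df2])
df3 = df2.loc[~mask.any(axis=1)]

# Append to output.csv if it already exists, otherwise create it
if os.path.isfile("output.csv"):
    df1 = pd.read_csv("output.csv", sep=";")
    df4 = pd.concat([df1, df3])
    df4.to_csv("output.csv", index=False, sep=";")
else:
    df3.to_csv("output.csv", index=False, sep=";")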

Here is an Excel screenshot of my table:

Solution

You can use indexing with str together with zfill:

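import pandas as pd

df = pd.DataFrame({'FLIGHT': ['KL744', 'BE1013']})

# First two characters of each value
df['a'] = df['FLIGHT'].str[:2]
# Remaining characters, left-padded with zeros to a width of 4
df['b'] = df['FLIGHT'].str[2:].str.zfill(4)
print(df)
   FLIGHT   a     b
0   KL744  KL  0744
1  BE1013  BE  1013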
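
To see how the same two steps behave for shorter numbers (the 2-digit and 1-digit cases from the question), here is a minimal sketch. The helper name split_flight and the extra sample values LH12 and AF7 are made up for illustration and are not part of the original answer; the column names a and b follow the example above.

```python
import pandas as pd


def split_flight(df, column="FLIGHT"):
    """Return a copy of df with two extra columns, 'a' and 'b'.

    'a' holds the first two characters of `column`; 'b' holds the rest,
    left-padded with zeros to a width of 4. (Hypothetical helper.)
    """
    out = df.copy()
    codes = out[column].astype(str)
    out["a"] = codes.str[:2]
    out["b"] = codes.str[2:].str.zfill(4)
    return out


# LH12 and AF7 are invented values to exercise the shorter cases
sample = pd.DataFrame({"FLIGHT": ["KL744", "BE1013", "LH12", "AF7"]})
result = split_flight(sample)
print(result)
```

Because zfill only pads and never truncates, a number part that is already 4 characters or longer is left unchanged.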

I believe your code needs:

df2 = pd.DataFrame(datatable,columns = cols)
df2['a'] = df2['FLIGHT'].str[:2]
df2['b'] = df2['FLIGHT'].str[2:].str.zfill(4)
df2["UPLOAD_TIME"] = datetime.now()
...
...
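
One thing that may be worth checking, since the question's script reads output.csv back with pd.read_csv before appending: by default pandas infers an all-digit column as integers, which would drop the leading zeros again. The sketch below is an assumption about how that round trip could be handled, not part of the original answer; it uses an in-memory buffer in place of output.csv and passes dtype for the b column.

```python
import io

import pandas as pd

df = pd.DataFrame({"FLIGHT": ["KL744", "BE1013"]})
df["a"] = df["FLIGHT"].str[:2]
df["b"] = df["FLIGHT"].str[2:].str.zfill(4)

# Write to an in-memory buffer that stands in for output.csv
buffer = io.StringIO()
df.to_csv(buffer, index=False, sep=";")

# Default read: "b" is inferred as an integer column, so "0744" becomes 744
buffer.seek(0)
inferred = pd.read_csv(buffer, sep=";")

# Reading "b" as text keeps the zero padding
buffer.seek(0)
as_text = pd.read_csv(buffer, sep=";", dtype={"b": str})

print(inferred["b"].tolist())
print(as_text["b"].tolist())
```

In the question's script this would mean adding dtype={"b": str} to the existing pd.read_csv("output.csv", sep=";") call, assuming the padded column is named b.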
