Import data from a text file into a pandas dataframe


Question

I'm building a web app using Django. I uploaded a text file using

csv_file = request.FILES['file']

I can't read the CSV into pandas. The file I'm trying to import contains both text and data, but I only want the data.

I have tried the following:

  1. df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"], skiprows = 2), to skip the comment lines and read only the numbers.

Error: pandas does not read all 3 columns. It reads only 1 column.

  2. df = pd.read_csv(csv_file, sep="\s{2}", sep=" ", header=None, names=["col1","col2","col3"], skiprows =2), to skip the comment lines and read only the numbers.

Error: cannot use a string pattern on a bytes-like object

  3. df = pd.read_csv(csv_file.read(), sep=" ", header=None, names=["col1","col2","col3"], skiprows = 2), to skip the comment lines and read only the numbers.

The file I uploaded:

% filename
% username
2.0000  117.441  -0.430
2.0100  117.499  -0.337
2.0200  117.557  -0.246
2.0300  117.615  -0.157
2.0400  117.672  -0.069

views.py

def new_measurement(request, pk):
    material = Material.objects.get(pk=pk)
    if request.method == 'POST':
        form = NewTopicForm(request.POST)
        if form.is_valid():
            topic = form.save(commit=False)
            topic.material = material
            topic.message=form.cleaned_data.get('message')
            csv_file = request.FILES['file']
            df = genDataFrame(csv_file)
            topic.data = df
            topic.created_by = request.user
            topic.save()
            return redirect('topic_detail', pk =  material.pk)
    else:
        form = NewTopicForm()
    return render(request, 'new_topic.html', {'material': material, 'form': form})

def genDataFrame(csv_file):
    df = pd.read_csv(csv_file, sep=" ", header=None, names=["col1","col2","col3"])
    df = df.convert_objects(convert_numeric=True)
    df = df.dropna()
    df = df.reset_index(drop = True)
    return df

I would like to get a DataFrame like this:

col1   col2     col3
2.0000  117.441  -0.430
2.0100  117.499  -0.337
2.0200  117.557  -0.246
2.0300  117.615  -0.157
2.0400  117.672  -0.069

Recommended answer

Your approach in point #2 was almost right. My answer just adds a regex separator to @prooffreader's answer, which makes the statement less error-prone: \s+ matches any run of whitespace, so the columns are split correctly whether they are separated by one space or several.

df = pd.read_csv('file_path', sep=r"\s+", header=None,
                 names=['col1', 'col2', 'col3'], skiprows=2)
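
For the Django case in the question, here is a minimal sketch of how the same call might be applied to an uploaded file rather than a path. It assumes the upload is UTF-8 text and that reading it returns bytes, which is likely what triggered the "bytes-like object" error in point #2. The function name gen_data_frame and the SAMPLE data are hypothetical, and io.BytesIO only stands in for request.FILES['file'].

```python
import io

import pandas as pd

# Stand-in for the uploaded file's contents (two comment lines, then data).
SAMPLE = (
    b"% filename\n"
    b"% username\n"
    b"2.0000  117.441  -0.430\n"
    b"2.0100  117.499  -0.337\n"
    b"2.0200  117.557  -0.246\n"
)


def gen_data_frame(uploaded_file, encoding="utf-8"):
    """Read a whitespace-separated upload into a three-column DataFrame.

    `uploaded_file` is any object with a .read() method; the encoding is
    an assumption and may need to change for other files.
    """
    raw = uploaded_file.read()
    # Decode bytes to str so a regex separator is applied to text.
    if isinstance(raw, bytes):
        raw = raw.decode(encoding)
    return pd.read_csv(
        io.StringIO(raw),
        sep=r"\s+",        # one or more whitespace characters
        header=None,
        names=["col1", "col2", "col3"],
        skiprows=2,        # skip the two leading '%' lines
    )


df = gen_data_frame(io.BytesIO(SAMPLE))
print(df)
```

In the view, this would be called as gen_data_frame(request.FILES['file']). Decoding first keeps the parser working on text regardless of how the upload was opened.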
