"No such file" when loading CSV data stored in Google Drive into torchtext format using torchtext.data.TabularDataset
Problem description
I have stored a CSV file in Google Drive and am trying to load it into torchtext's data.TabularDataset. The error message is "FileNotFoundError: [Errno 2] No such file or directory: 'https://.....'"
Is it impossible to load a CSV file from Google Drive directly into a torchtext TabularDataset?
Here is the code. I have also made a public Colab notebook with publicly available data.
import torch
from torchtext import data, datasets
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
TEXT = data.Field(tokenize = 'spacy', batch_first = True, lower=False)
LABEL = data.LabelField(sequential=False, dtype = torch.float)
train = data.TabularDataset(path = 'https://drive.google.com/open?id=1eWMjusU3H34m0uml5SdJvYX6gQuB8zta',
format = 'csv',
fields = [('Insult', LABEL), (None, None), ('Comment', TEXT)],
skip_header=False)
Recommended answer
Assuming you can afford to download this CSV file, I would suggest using a utility that is built into torchtext: download_from_url.
import os
import torch
from torchtext import data, datasets
from torchtext.utils import download_from_url
# Download the file into the current working directory
CSV_FILENAME = 'data.csv'
CSV_GDRIVE_URL = 'https://drive.google.com/uc?export=download&id=1eWMjusU3H34m0uml5SdJvYX6gQuB8zta'
download_from_url(CSV_GDRIVE_URL, CSV_FILENAME)
TEXT = data.Field(tokenize='spacy', batch_first=True, lower=False)
LABEL = data.LabelField(sequential=False, dtype=torch.float)
# On Colab the working directory is /content, so the file lands there
train = data.TabularDataset(path=os.path.join('/content', CSV_FILENAME),
                            format='csv',
                            fields=[('Insult', LABEL), (None, None), ('Comment', TEXT)],
                            skip_header=False)
Note that the Google Drive link should not be the one with open?id; change it to the uc?export=download&id form instead.
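If you have several share links to convert, the rewrite from open?id to uc?export=download&id can be done in code. The following is only a minimal sketch using the standard library; the helper name to_direct_download_url is made up for illustration, and it assumes the share link carries the file ID in an id query parameter, as the link in the question does.

```python
from urllib.parse import urlparse, parse_qs


def to_direct_download_url(share_url):
    """Rewrite a Drive 'open?id=...' link as a 'uc?export=download&id=...' link.

    Hypothetical helper (not part of torchtext). Assumes the file ID is
    passed in the 'id' query parameter of the share link.
    """
    query = parse_qs(urlparse(share_url).query)
    ids = query.get('id')
    if not ids:
        raise ValueError('no id parameter found in: ' + share_url)
    return 'https://drive.google.com/uc?export=download&id=' + ids[0]


share_url = 'https://drive.google.com/open?id=1eWMjusU3H34m0uml5SdJvYX6gQuB8zta'
direct_url = to_direct_download_url(share_url)
print(direct_url)
```

The resulting string can then be passed to download_from_url in place of the hand-edited URL.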