How to do text classification using TensorFlow?


Problem description

I am new to TensorFlow and machine learning. I am having trouble writing TensorFlow code that performs text classification similar to what I did with the sklearn libraries. My main difficulties are vectorising the dataset and feeding the result into TensorFlow layers.

I remember successfully one-hot encoding the labels, but the TensorFlow layer that followed did not accept the resulting array. Please note that I have read most of the answered text classification questions on Stack Overflow, but they are either too specific or address more complex needs. My case is narrow and requires only a very basic solution.

It would be a great help if anyone could tell me the steps, or TensorFlow code, equivalent to my sklearn machine learning algorithm.

The dataset used is available at: https://www.kaggle.com/virajgala/classifying-text


import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Read the CSV dataset and shuffle the rows
df = pd.read_csv('/Classifyimg_text.csv', index_col=False).sample(frac=1)

# Split into training and test sets (80/20)
train_data, test_data, train_labels, test_labels = train_test_split(df['sentence'], df['label'], test_size=0.2)

# Vectorization (TF-IDF) and classification (linear model trained with SGD)
streamline = Pipeline([('vect', TfidfVectorizer(max_features=int(1e8))),
                       ('clf', SGDClassifier())]).fit(train_data, train_labels)

# Prediction
Output = streamline.predict(["This is my action to classify the text."])

Recommended answer

This question is a bit broad. Perhaps you can take a look at the tutorial on TensorFlow's website for binary text classification (positive and negative) and try to implement it. If, along the way, you come across any problems or concepts that need further explanation, search Stack Overflow to see whether someone has asked a similar question. If not, take the time to write a question following these guidelines so that people able to answer have all the information they need. I hope this gets you off to a good start, and welcome to Stack Overflow!
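
As a starting point, here is a minimal sketch of how the sklearn pipeline in the question might map onto Keras. It is not the code from the TensorFlow tutorial. It assumes TensorFlow 2.6 or later, uses a tiny made-up dataset in place of the Kaggle CSV, and the label names "action" and "statement" are hypothetical. A `TextVectorization` layer in `tf_idf` mode stands in for `TfidfVectorizer`, and a single softmax `Dense` layer stands in for the linear `SGDClassifier`. Labels are passed as integer ids with a sparse loss, which sidesteps the one-hot array problem described in the question.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in data (hypothetical). With the real dataset these would be
# df['sentence'] and df['label'] from the question.
sentences = tf.constant([
    "please open the door",
    "turn off the lights now",
    "the weather was nice yesterday",
    "she read a book on the train",
])
labels = np.array(["action", "action", "statement", "statement"])

# Map string labels to integer ids. The sparse categorical loss used below
# takes integer ids directly, so no one-hot array is required.
class_names, label_ids = np.unique(labels, return_inverse=True)

# TF-IDF vectorization inside TensorFlow (the TfidfVectorizer step).
vectorizer = tf.keras.layers.TextVectorization(max_tokens=10000, output_mode="tf_idf")
vectorizer.adapt(sentences)                  # learns vocabulary and IDF weights
features = vectorizer(sentences).numpy()     # shape: (n_samples, vocab_size)

# One Dense softmax layer is a linear classifier (the SGDClassifier step).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(features.shape[1],)),
    tf.keras.layers.Dense(len(class_names), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(features, label_ids, epochs=20, verbose=0)

# Prediction on a new sentence: vectorize first, then take the most likely class.
new_text = tf.constant(["This is my action to classify the text."])
probs = model.predict(vectorizer(new_text).numpy(), verbose=0)
predicted = class_names[int(np.argmax(probs, axis=1)[0])]
print(predicted, probs)
```

With the real data, the same `train_test_split` call from the question could be applied before `adapt`, so that the vocabulary is learned from the training sentences only.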
