Machine learning with multiple feature types in Python

Views: 67 | Published: 2020/5/4 9:56:54 | Tags: python machine-learning scikit-learn nltk feature-extraction


Problem description

I am able to do some simple machine learning using the scikit-learn and NLTK modules in Python, but I run into problems when training with multiple features that have different value types (number, list of strings, yes/no, etc.). In the following data, I have a word/phrase column from which I extract information to create the other columns (for example, the length column is the character length of 'word/phrase'). The Label column is the label.

Word/phrase   Length  '2-letter substring'                              'First letter'  'With space?'  Label
take action   10      ['ta', 'ak', 'ke', 'ac', 'ct', 'ti', 'io', 'on']  t               Yes            A
sure          4       ['su', 'ur', 're']                                s               No             A
That wasn't   10      ['th', 'ha', 'at', 'wa', 'as', 'sn', 'nt']        t               Yes            B
simply        6       ['si', 'im', 'mp', 'pl', 'ly']                    s               No             C
a lot of      6       ['lo', 'ot', 'of']                                a               Yes            D
said          4       ['sa', 'ai', 'id']                                s               No             B

Should I put them into one dictionary and then use sklearn's DictVectorizer to hold them in working memory, and then treat these features as one X vector when training the ML algorithms?
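To make the idea in the question concrete, here is a minimal sketch of the one-dictionary-per-sample approach. The helper names (`bigrams`, `to_feature_dict`, `build_matrix`) and the feature keys are hypothetical and not from the original post. The length rule (count non-space characters) and the bigram rule (letters only, within each word) are assumptions inferred from the table above. Each 2-letter substring is expanded into its own key because a dictionary value that is a list is not handled by every scikit-learn version.

```python
import re


def bigrams(phrase):
    """2-letter substrings taken within each word (letters only, lowercased).

    Assumption: this mirrors the '2-letter substring' column in the table.
    """
    out = []
    for word in phrase.lower().split():
        letters = re.sub(r"[^a-z]", "", word)  # drop apostrophes etc.
        out.extend(letters[i:i + 2] for i in range(len(letters) - 1))
    return out


def to_feature_dict(phrase):
    """Map one word/phrase to a flat dict of mixed-type features."""
    feats = {
        # Numeric feature; assumption: spaces are not counted.
        "length": len(phrase.replace(" ", "")),
        # String feature; DictVectorizer one-hot encodes string values.
        "first_letter": phrase[0].lower(),
        # Yes/no feature stored as 0/1.
        "with_space": int(" " in phrase),
    }
    # One indicator key per 2-letter substring instead of a list value.
    for bg in bigrams(phrase):
        feats["bigram=" + bg] = 1
    return feats


def build_matrix(phrases):
    """Vectorize the feature dicts into a single numeric matrix X."""
    # Imported here so the helpers above can be used without scikit-learn.
    from sklearn.feature_extraction import DictVectorizer

    vec = DictVectorizer(sparse=False)
    X = vec.fit_transform([to_feature_dict(p) for p in phrases])
    return vec, X
```

Calling `build_matrix(["take action", "sure", "simply"])` returns the fitted vectorizer and a matrix with one row per phrase; the column order is available in `vec.feature_names_`. The matrix and the Label column can then be passed to a classifier's `fit` method, and new phrases should go through `vec.transform` (not `fit_transform`) so the columns stay aligned.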
