How do I write a function in Python?


Problem description

I have this script. It reads a file (the file contains collected tweets), cleans it, computes a frequency distribution, and creates a plot. Right now it works with only one file. I need to turn it into a function so that I can pass in more files, then build a DataFrame from the FreqDist results of several files and plot it.

f = open(.......)
text = f.read()
text = text.lower()
for p in list(punctuation):
    text = (text.replace(p, ''))

allWords = nltk.tokenize.word_tokenize(text)
allWordDist = nltk.FreqDist(w.lower() for w in allWords)
stopwords = set(stopwords.words('english'))

allWordExceptStopDist = nltk.FreqDist(w.lower() for w in allWords if w not in stopwords)
mostCommon = allWordExceptStopDist.most_common(25)

frame = pd.DataFrame(mostCommon, columns=['word', 'frequency'])
frame.set_index('word', inplace=True)
print(frame)
histog = frame.plot(kind='barh')
plt.show()

Thank you very much for any help!

Recommended answer

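# Assumed imports; the original post omitted them.
import nltk
import pandas as pd
import matplotlib.pyplot as plt
from nltk.corpus import stopwords
from string import punctuation

def readStuff(filename):
    # Read the whole file; the with-block closes it automatically.
    with open(filename) as f:
        text = f.read()
    text = text.lower()
    for p in punctuation:
        text = text.replace(p, '')

    allWords = nltk.tokenize.word_tokenize(text)
    allWordDist = nltk.FreqDist(w.lower() for w in allWords)
    # Use a new name here: assigning to `stopwords` inside the function
    # would make it a local variable and raise UnboundLocalError.
    stopWords = set(stopwords.words('english'))

    allWordExceptStopDist = nltk.FreqDist(w.lower() for w in allWords if w not in stopWords)
    mostCommon = allWordExceptStopDist.most_common(25)

    frame = pd.DataFrame(mostCommon, columns=['word', 'frequency'])
    frame.set_index('word', inplace=True)
    print(frame)
    histog = frame.plot(kind='barh')
    plt.show()
    # Return the frame so results from several files can be combined.
    return frame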
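
Once the per-file work lives in a function, handling more files is a matter of calling it in a loop. The following is only a rough sketch of that idea, not the answer's own code: to stay dependency-free it swaps in collections.Counter for nltk.FreqDist, str.split for nltk's tokenizer, and a tiny hard-coded STOP_WORDS set for NLTK's English stop word list. The names word_frequencies, frequencies_for_files, and STOP_WORDS are hypothetical.

```python
import string
from collections import Counter

# Hypothetical stand-in for NLTK's English stop word list, kept tiny on purpose.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in"}


def word_frequencies(filename, top_n=25):
    """Return the top_n (word, count) pairs for one text file."""
    with open(filename, encoding="utf-8") as f:
        text = f.read().lower()
    # Strip all punctuation in one pass instead of one replace() per character.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Whitespace splitting is a simplification of nltk.tokenize.word_tokenize.
    words = [w for w in text.split() if w not in STOP_WORDS]
    return Counter(words).most_common(top_n)


def frequencies_for_files(filenames, top_n=25):
    """Map each filename to its list of most common (word, count) pairs."""
    return {name: word_frequencies(name, top_n) for name in filenames}
```

Each list in the returned dict has the same shape as mostCommon above, so it could be passed to pd.DataFrame(..., columns=['word', 'frequency']) in the same way, one file at a time.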
