Counting the words a character said in a movie script


Problem description

I already managed to extract the spoken words with some help. Now I'm looking for a way to get the text spoken by a chosen character, so that I can type in MIA and get every single word she says in the movie, like this:

name = input("Enter name:")
wordsspoken(script, name)
name1 = input("Enter another name:")
wordsspoken(script, name1)

That way I can count the words afterwards.

This is what the movie script looks like:

An awkward beat. They pass a wooden SALOON -- where a WESTERN
 is being shot. Extras in COWBOY costumes drink coffee on the
 steps.
                     Revision                        25.


                   MIA (CONT'D)
      I love this stuff. Makes coming to work
      easier.

                   SEBASTIAN
      I know what you mean. I get breakfast
      five miles out of the way just to sit
      outside a jazz club.

                   MIA
      Oh yeah?

                   SEBASTIAN
      It was called Van Beek. The swing bands
      played there. Count Basie. Chick Webb.
             (then,)
      It's a samba-tapas place now.

                   MIA
      A what?

                   SEBASTIAN
      Samba-tapas. It's... Exactly. The joke's on
      history.


Recommended answer

If you want to compute your tally with only one pass over the script (which I imagine could be pretty long), you can simply track which character is currently speaking, setting things up like a little state machine:

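import re
from collections import Counter, defaultdict

# SCRIPT is assumed to hold the whole script as a single string.
# Maps each speaker to a Counter of the words attributed to them.
words_spoken = defaultdict(Counter)
# Text before the first character cue is attributed to a 'Narrator'.
currently_speaking = 'Narrator'

for line in SCRIPT.split('\n'):
    # Drop the continuation marker so "MIA (CONT'D)" reduces to "MIA".
    name = line.replace("(CONT'D)", '').strip()
    if re.match('^[A-Z]+$', name):
        # A line that is only capital letters is treated as a character cue.
        currently_speaking = name
    else:
        # Everything else (including action lines and parentheticals)
        # is counted as words of the current speaker.
        words_spoken[currently_speaking].update(line.split())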
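To connect this back to the wordsspoken(script, name) call from the question, here is a minimal sketch that wraps the same one-pass logic in a function and looks up a single character. The names tally_words and sample are made up for illustration, and wordsspoken is assumed to return a total word count; the question does not say exactly what it should return.

```python
import re
from collections import Counter, defaultdict


def tally_words(script):
    """Return {speaker: Counter of words} using the one-pass state machine."""
    words_spoken = defaultdict(Counter)
    currently_speaking = 'Narrator'
    for line in script.split('\n'):
        name = line.replace("(CONT'D)", '').strip()
        if re.match('^[A-Z]+$', name):
            currently_speaking = name
        else:
            words_spoken[currently_speaking].update(line.split())
    return words_spoken


def wordsspoken(script, name):
    """Hypothetical helper: total number of words attributed to `name`."""
    return sum(tally_words(script)[name.upper()].values())


# A shortened excerpt in the same layout as the script above.
sample = """\
                   MIA (CONT'D)
      I love this stuff. Makes coming to work
      easier.

                   SEBASTIAN
      I know what you mean.

                   MIA
      Oh yeah?
"""

print(wordsspoken(sample, 'MIA'))        # 11
print(wordsspoken(sample, 'SEBASTIAN'))  # 5
```

Note that this re-parses the script on every call. For several lookups it is probably better to call tally_words once and reuse the result.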

You could use a more sophisticated regex to detect when the speaker changes, but this should do the trick.
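As one possible take on a "more sophisticated regex", the sketch below recognises cues with multi-word names and any trailing parenthetical, and separately detects stage directions such as "(then,)" so they can be skipped. The patterns CUE and DIRECTION and the helper speaker_of are assumptions for illustration, not part of the original answer, and the cue "MIA'S MOM (O.S.)" is an invented example. The pattern would still mistake an all-caps dialogue line (for example "OK.") or a scene heading for a cue, so treat it as a starting point.

```python
import re

# Assumed cue format: an all-caps name (letters, spaces, dots, apostrophes,
# hyphens), optionally followed by one parenthetical such as (CONT'D).
CUE = re.compile(r"^\s*([A-Z][A-Z .'\-]*?)\s*(?:\([^)]*\))?\s*$")

# A parenthetical stage direction on a line of its own, e.g. "(then,)".
DIRECTION = re.compile(r"^\s*\([^)]*\)\s*$")


def speaker_of(line):
    """Return the speaker name if `line` looks like a character cue, else None."""
    match = CUE.match(line)
    return match.group(1) if match else None


print(speaker_of("                   MIA (CONT'D)"))  # MIA
print(speaker_of("MIA'S MOM (O.S.)"))                 # MIA'S MOM
print(speaker_of("      Oh yeah?"))                   # None
print(bool(DIRECTION.match("             (then,)")))  # True
```

In the loop above, speaker_of(line) could stand in for the replace/strip/re.match step, and lines matching DIRECTION could be skipped before counting.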

