Python Pig Latin converter

Question

Please help me!

I am converting a multi-line text file to Pig Latin.

Example: the Pig Latin translation of "This is an example." should be "Histay siay naay xampleeay."

I need any punctuation to stay where it belongs (at the end of the sentence in most cases). I also need any word that starts with a capital letter in the original to start with a capital letter in the Pig Latin version, with the rest of its letters in lowercase.

Here is my code:

def main():
    fileName= input('Please enter the file name: ')

    validate_file(fileName)
    newWords= convert_file(fileName)
    print(newWords)


def validate_file(fileName):
    try:
        inputFile= open(fileName, 'r')
        inputFile.close()
    except IOError:
        print('File not found.')


def convert_file(fileName):
    inputFile= open(fileName, 'r')
    line_string= [line.split() for line in inputFile]

    for line in line_string:
        for word in line:
            endString= str(word[1:])
            them=endString, str(word[0:1]), 'ay'
            newWords="".join(them)
            return newWords

My text file is:

This is an example. 

My name is Kara!

And the program returns:

Please enter the file name: piglatin tester.py
hisTay
siay
naay
xample.eay
yMay
amenay
siay
ara!Kay
None

How do I get the words to print on the lines they came from? And how do I deal with the punctuation and the capitalization?

Recommended answer

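Here is my reworking of your code. It collects the converted words of each line into one sentence, so the output keeps the original line structure. You should also consider working with nltk, which handles word tokenisation much more robustly.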

def main():
    fileName = input('Please enter the file name: ')

    # Stop early if the file cannot be opened.
    if not validate_file(fileName):
        return
    new_lines = convert_file(fileName)
    for line in new_lines:
        print(line)

def validate_file(fileName):
    # Return True if the file can be opened for reading, False otherwise.
    try:
        inputFile = open(fileName, 'r')
        inputFile.close()
        return True
    except IOError:
        print('File not found.')
        return False

def strip_punctuation(line):
    # Split off a single trailing '.', '!' or '?' so it can be re-attached later.
    punctuation = ''
    line = line.strip()
    if len(line) > 0:
        if line[-1] in ('.', '!', '?'):
            punctuation = line[-1]
            line = line[:-1]
    return line, punctuation

def convert_file(fileName):
    converted_lines = []
    with open(fileName, 'r') as inputFile:
        for line in inputFile:
            line, punctuation = strip_punctuation(line)
            new_words = []
            for word in line.split():
                # Move the first letter to the end and append 'ay'.
                new_words.append(word[1:] + word[0:1] + 'ay')
            # Lowercase the whole sentence, then capitalise its first letter only.
            new_sentence = ' '.join(new_words).lower()
            if len(new_sentence):
                new_sentence = new_sentence[0].upper() + new_sentence[1:]
            converted_lines.append(new_sentence + punctuation)
    return converted_lines

if __name__ == '__main__':
    main()
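
The reworked code capitalises only the first letter of each line and re-attaches a single trailing punctuation mark, so a name such as Kara in the middle of a sentence would come out lowercase. The question asks for every originally capitalised word to keep its capital. One possible way to get that, sketched here rather than taken from the answer, is to convert word by word with a regular expression and leave every non-letter character in place. The names WORD, pig_word and pig_line are made up for this illustration, and the definition of a "word" (ASCII letters, optionally joined by apostrophes) is an assumption.

```python
import re

# Assumption: a word is a run of ASCII letters, optionally joined by apostrophes.
# Everything else (spaces, punctuation, digits) is left untouched by re.sub.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

def pig_word(match):
    word = match.group(0)
    # Same rule as the code above: move the first letter to the end, add 'ay'.
    converted = (word[1:] + word[0] + 'ay').lower()
    # Restore the capital if the original word started with one.
    if word[0].isupper():
        converted = converted.capitalize()
    return converted

def pig_line(line):
    # Replace each word in place; punctuation stays where it was.
    return WORD.sub(pig_word, line)

print(pig_line('This is an example.'))
print(pig_line('My name is Kara!'))
```

Calling pig_line on each line of the file, in place of the inner loop of convert_file, should keep the line structure as before while also handling commas and other punctuation inside a line.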
