Altering text in a .txt file and creating a new file output in MATLAB


Problem description


I apologize in advance if the title seems a bit off. I was having trouble deciding what exactly I should name it. Anyway, what I am doing now is homework that deals with low-level I/O. For one assignment, I was given two .txt files: one contains a list of email addresses, and the other contains a list of members who no longer want to be on the email list. What I have to do is delete the emails of the members in the second list from the first. Additionally, there may be some nasty surprises in the .txt files. I have to clean up the emails and take out any unwanted punctuation after them, such as semicolons, commas and spaces. Furthermore, I need to lowercase all of the text. I'm struggling with this problem in more ways than one (I'm not entirely sure how to get my file to write what I need in the output), but right now my main concern is outputting the unsubscribe messages in the correct order. sortrows doesn't seem to work.

Here are some test cases:

Test Cases
unsubscribe('Grand Prix Mailing List.txt', ...
              'Unsubscribe from Grand Prix.txt')
     => output file named 'Grand Prix Mailing List_updated.txt' that looks
        like 'Grand Prix Mailing List_updated_soln.txt'
     => output file named 'Unsubscribe from Grand Prix_messages.txt' that 
        looks like 'Unsubscribe from Grand Prix_messages_soln.txt'

The original mailing list

Grand Prix Mailing List:
MPLUMBER3@gatech.edu, 
lplumber3@gatech.edu 
Ttoadstool3@gatech.edu;
bkoopa3@gatech.edu
ppeach3@gatech.edu,
ydinosaur3@gatech.edu
kBOO3@gatech.edu
WBadguy3@gatech.edu;
FKong3@gatech.edu
dkong3@gatech.edu
dbones3@gatech.edu

People who are like nope:

MARIO PLUMBER; 
bowser koopa 
Luigi Plumber,
Donkey Kong 
King BOO;
Princess Peach

What it's supposed to look like afterwards:

ttoadstool3@gatech.edu
ydinosaur3@gatech.edu
wbadguy3@gatech.edu
fkong3@gatech.edu
dbones3@gatech.edu

My file output:

Mario, you have been unsubscribed from the Grand Prix mailing list.
Luigi, you have been unsubscribed from the Grand Prix mailing list.
Bowser, you have been unsubscribed from the Grand Prix mailing list.
Princess, you have been unsubscribed from the Grand Prix mailing list.
King, you have been unsubscribed from the Grand Prix mailing list.
Donkey, you have been unsubscribed from the Grand Prix mailing list.

So Amro has been kind enough to provide a solution, though it's a little above what I know right now. My main issue now is that when I output the unsubscribe message, I need it to be in the same order as the original email list. For instance, while Bowser was on the complaining list before Luigi, in the unsubscribe message, Luigi needs to come before him.

Here is my original code:

function[] = unsubscribe(email_ids, member_emails)
    Old_list = fopen(email_ids, 'r'); %// opens my email list
    Old_Members = fopen(member_emails, 'r'); %// Opens up the names of people who want to unsubscribe
    emails = fgets(Old_list); %// Reads first line of emails
    member_emails = [member_emails]; %// Creates an array to populate
while ischar(emails) %// Starts my while loop
%// Pulls out a line in the email
    emails = fgets(Old_list);
%// Quits when it sees this jerk
    if emails == -1
        break;
    end

%// I go in to clean stuff up here, but it doesn't do any of it. It's still in the while loop though, so I am not sure where the error is
proper_emails = lower(member_emails); %// This is supposed to lowercase the emails, but it's not working
unwanted = findstr(member_emails, ' ,;');
member_emails(unwanted) = '';
member_emails = [member_emails, emails];
end

while ischar(Old_Members) %// Does the same for the members who want to unsubscribe
    names = fgetl(member_emails);
    if emails == -1
        break
    end
proper_emails = lower(names); %// Lowercases everything
unwanted = findstr(names, ' ,;');
names(unwanted) = '';
end

Complainers = find(emails);

New_List = fopen('Test2', 'w'); %// Creates a file to be written to
fprintf(New_List, '%s', member_emails); %// Writes to it
Sorry_Message = fopen('Test.txt', 'w');
fprintf(Sorry_Message, '%s', Complainers);

%// Had an issue with these, so I commented them out temporarily
%// fclose(New_List);
%// fclose(Sorry_Message);
%// fclose(email_ids); 
%// fclose(members);

end
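
A few things in the code above likely explain why the cleanup never takes effect: the first line read by fgets is overwritten before it is used, ischar(Old_Members) tests a numeric file identifier (so the second loop never runs), fgetl is given a file name instead of a file identifier, the result of lower is stored in proper_emails and never used, and findstr searches for the whole substring ' ,;' rather than for any one of those characters. Below is a minimal sketch of a read-and-clean loop that stays with low-level I/O. The temporary demo file and the variable names (demo_file, lines, tline) are made up for illustration and are not part of the assignment.

```matlab
% Write a small demo file so the sketch is self-contained (hypothetical data).
demo_file = [tempname '.txt'];
fid = fopen(demo_file, 'wt');
fprintf(fid, '%s\n', 'MPLUMBER3@gatech.edu, ', 'lplumber3@gatech.edu ', ...
    'Ttoadstool3@gatech.edu;');
fclose(fid);

% Read the file line by line with fgetl, cleaning each line as it comes in.
fid = fopen(demo_file, 'rt');
lines = {};
tline = fgetl(fid);               % fgetl returns -1 (not a char) at end of file
while ischar(tline)
    tline = lower(strtrim(tline));                 % lowercase, trim outer spaces
    tline = regexprep(tline, '[,;\s]+$', '');      % drop trailing , ; and spaces
    if ~isempty(tline)
        lines{end+1, 1} = tline;                   % keep non-empty lines only
    end
    tline = fgetl(fid);           % read the next line at the END of the loop body
end
fclose(fid);
delete(demo_file);
```

Reading the next line at the bottom of the loop means every line, including the first, gets processed before the ischar test runs again.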

Solution

Below is my implementation for the problem. The code is commented at each step and should be easy to understand. I'm using regular expressions when I can because this is the sort of thing they're good at... Also note that I don't have any loops in the code :)

unsubscribe.m

function unsubscribe(mailinglist_file, names_file)

    %%
    % read list of names of those who want to unsubscribe
    names = read_file(names_file);

    % break names into first/last parts
    first_last = regexp(names, '(\w+)\s+(\w+)', 'tokens', 'once');
    first_last = vertcat(first_last{:});

    % build email handles (combination of initials + name + domain)
    emails_exclude = strcat(cellfun(@(str) str(1), first_last(:,1)), ...
        first_last(:,2), '3@gatech.edu');

    %%
    % read emails in mailing list
    emails = read_file(mailinglist_file);

    % update emails by removing those who wish to unsubscribe
    emails(ismember(emails, emails_exclude)) = [];

    %%
    % write updated mailing list
    [~,fName,fExt] = fileparts(mailinglist_file);
    fid = fopen([fName '_updated' fExt], 'wt');
    fprintf(fid, '%s\n', emails{:});
    fclose(fid);

    % write list of names removed
    % capitalize first letter of first name
    first_names = cellfun(@(str) [upper(str(1)) str(2:end)], ...
        first_last(:,1), 'UniformOutput',false);
    msg = strcat(first_names, ...
        ', you have been unsubscribed from the mailing list.');
    fid = fopen([fName '_messages' fExt], 'wt');
    fprintf(fid, '%s\n', msg{:});
    fclose(fid);

end

function C = read_file(filename)
    % read lines from file into a cell-array of strings
    fid = fopen(filename, 'rt');
    C = textscan(fid, '%s', 'Delimiter','');
    fclose(fid);

    % clean up lines by removing trailing punctuation
    C = lower(regexprep(C{1}, '[,;\s]+$', ''));
end
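
To make the name-to-handle step easier to follow, here is a small trace on inline data. It assumes, as the code above does, that every address is the first initial plus the last name plus '3@gatech.edu'; that rule fits the sample files but is an assumption about the data, not something MATLAB knows. The variable names tokens and handles are only for this sketch.

```matlab
% Names as they look after read_file has lowercased and trimmed them.
names = {'mario plumber'; 'bowser koopa'; 'king boo'};

% Each match yields a 1x2 cell {first, last}; stack them into an N-by-2 cell.
tokens = regexp(names, '(\w+)\s+(\w+)', 'tokens', 'once');
first_last = vertcat(tokens{:});

% Build the handle: first initial + last name + assumed suffix.
handles = cellfun(@(f, l) [f(1) l '3@gatech.edu'], ...
    first_last(:,1), first_last(:,2), 'UniformOutput', false);

disp(handles)
```

Using cellfun with 'UniformOutput', false keeps the result a cell array of character vectors throughout, which sidesteps any question about how strcat mixes char arrays and cell arrays.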


Given the following text files:

list.txt

MPLUMBER3@gatech.edu, 
lplumber3@gatech.edu 
Ttoadstool3@gatech.edu;
bkoopa3@gatech.edu
ppeach3@gatech.edu,
ydinosaur3@gatech.edu
kBOO3@gatech.edu
WBadguy3@gatech.edu;
FKong3@gatech.edu
dkong3@gatech.edu
dbones3@gatech.edu

names.txt

MARIO PLUMBER; 
bowser koopa 
Luigi Plumber,
Donkey Kong 
King BOO;
Princess Peach

Here is what I get when running the code:

>> unsubscribe('list.txt', 'names.txt')

list_messages.txt

Mario, you have been unsubscribed from the mailing list.
Bowser, you have been unsubscribed from the mailing list.
Luigi, you have been unsubscribed from the mailing list.
Donkey, you have been unsubscribed from the mailing list.
King, you have been unsubscribed from the mailing list.
Princess, you have been unsubscribed from the mailing list.
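
The messages above come out in the order of names.txt, while the question asks for the order of the original mailing list (Luigi before Bowser). One possible way to get that, sketched here on inline data, is to ask ismember for its second output, which gives the position of each mailing-list entry inside emails_exclude. This has to be done before the matching entries are deleted from emails. The variable names is_removed, idx and ordered_names are assumptions made for this sketch.

```matlab
% Cleaned mailing list, in file order.
emails = {'mplumber3@gatech.edu'; 'lplumber3@gatech.edu'; ...
    'ttoadstool3@gatech.edu'; 'bkoopa3@gatech.edu'; 'ppeach3@gatech.edu'; ...
    'ydinosaur3@gatech.edu'; 'kboo3@gatech.edu'; 'wbadguy3@gatech.edu'; ...
    'fkong3@gatech.edu'; 'dkong3@gatech.edu'; 'dbones3@gatech.edu'};

% First names and derived handles, in names.txt order.
first_names = {'Mario'; 'Bowser'; 'Luigi'; 'Donkey'; 'King'; 'Princess'};
emails_exclude = {'mplumber3@gatech.edu'; 'bkoopa3@gatech.edu'; ...
    'lplumber3@gatech.edu'; 'dkong3@gatech.edu'; 'kboo3@gatech.edu'; ...
    'ppeach3@gatech.edu'};

% idx(k) is the row of emails_exclude matching emails{k}, or 0 if none.
[is_removed, idx] = ismember(emails, emails_exclude);

% Walking the mailing list top to bottom yields names in mailing-list order.
ordered_names = first_names(idx(is_removed));
msg = strcat(ordered_names, ...
    ', you have been unsubscribed from the mailing list.');

% The updated list is whatever was not matched.
updated = emails(~is_removed);
```

A side effect of this approach is that people in names.txt who are not on the mailing list get no message at all, which may or may not be what the assignment wants.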

list_updated.txt

ttoadstool3@gatech.edu
ydinosaur3@gatech.edu
wbadguy3@gatech.edu
fkong3@gatech.edu
dbones3@gatech.edu
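
One more difference from the test cases in the question: there, the messages file is named after the names file ('Unsubscribe from Grand Prix_messages.txt'), whereas the code above builds both output names from the mailing list file. If the test-case naming is what is required, a small adjustment along these lines might do it. The variable names here are hypothetical.

```matlab
mailinglist_file = 'Grand Prix Mailing List.txt';
names_file = 'Unsubscribe from Grand Prix.txt';

% Split each input name into base name and extension separately.
[~, list_name, list_ext] = fileparts(mailinglist_file);
[~, names_name, names_ext] = fileparts(names_file);

% Updated list is named after the mailing list, messages after the names file.
updated_file = [list_name '_updated' list_ext];
messages_file = [names_name '_messages' names_ext];

disp(updated_file)
disp(messages_file)
```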
