Text file to JSON: last few lines are not included in the result
Problem description
I'm reading a text file and converting it to JSON using a regex in my React project. It mostly works, but the last 20-30 lines of the text file are missing from the result. Something goes wrong during the conversion to JSON, but I can't work out what.
Here is my code:
readTextFile = file => {
  let rawFile = new XMLHttpRequest();
  // third argument `false` makes the request synchronous
  rawFile.open("GET", file, false);
  rawFile.onreadystatechange = () => {
    if (rawFile.readyState === 4) {
      if (rawFile.status === 200 || rawFile.status === 0) {
        let allText = rawFile.responseText;
        // console.log(allText)
        // matches a "YYYY-MM-DD HH:MM:SS" timestamp at the start of each record
        let reg = /\d\d\d\d-(0?[1-9]|1[0-2])-(0?[1-9]|[12][0-9]|3[01]) (00|[0-9]|1[0-9]|2[0-3]):([0-9]|[0-5][0-9]):([0-9]|[0-5][0-9])/g;
        let arr = [];
        let start = null;
        let line, lastSpacePos;
        let match;
        while ((match = reg.exec(allText)) != null) {
          if (start) {
            // a record runs from the previous timestamp to the current one
            line = allText.slice(start, match.index).trim();
            lastSpacePos = line.lastIndexOf(' ');
            arr.push({
              date: line.slice(0, 19),
              text: line.slice(20, lastSpacePos).trim(),
              user_id: line.slice(lastSpacePos).trim()
            });
          }
          start = match.index;
        }
        console.log(arr);
        this.setState({
          // text: JSON.stringify(arr)
          text: allText
        });
      }
    }
  };
  // send the request; the handler above fires when it completes
  rawFile.send(null);
};
Recommended answer
I'm not certain what the issue is with the existing code in the question.
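One thing that stands out in the question's loop, offered as a possible contributor rather than a confirmed diagnosis: a record is pushed only when the next timestamp is found, so the record that runs from the last timestamp to the end of the text is never pushed. In addition, `if (start)` is false when the first timestamp sits at index 0, which skips the first record. A minimal sketch that keeps the question's regex and slicing logic but collects all start positions first; the function name `parseRecords` is hypothetical, and the sketch assumes every record begins with a 19-character timestamp:

```javascript
// Hypothetical helper (not from the question): same slicing logic, but the
// final record is sliced up to the end of the text instead of being dropped.
function parseRecords(allText) {
  const reg = /\d\d\d\d-(0?[1-9]|1[0-2])-(0?[1-9]|[12][0-9]|3[01]) (00|[0-9]|1[0-9]|2[0-3]):([0-9]|[0-5][0-9]):([0-9]|[0-5][0-9])/g;

  // Collect the index at which every timestamp starts.
  const starts = [];
  let match;
  while ((match = reg.exec(allText)) !== null) {
    starts.push(match.index);
  }

  // Each record runs from its own start to the next start (or end of text).
  return starts.map((start, i) => {
    const end = i + 1 < starts.length ? starts[i + 1] : allText.length;
    const line = allText.slice(start, end).trim();
    const lastSpacePos = line.lastIndexOf(" ");
    return {
      date: line.slice(0, 19), // assumes "YYYY-MM-DD HH:MM:SS" (19 characters)
      text: line.slice(20, lastSpacePos).trim(),
      user_id: line.slice(lastSpacePos).trim()
    };
  });
}
```

If this is the cause, only one record would be lost at the end, so it would explain 20-30 missing lines only if the final record is unusually long or later timestamps fail to match the regex.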
To get the expected result described in the question with an alternative approach, you can use these RegExps:
/\s{2,}|\n+/g to replace runs of two or more whitespace characters, and newline characters, with a single space;
/[\d-]+\s[\d:]+/g to get the dates;
/.+(?=\s\w+\s$|\s\w+$)|\w+\s$|\w+$/g to match the message text (everything before the final word, which may be followed by one whitespace character at the end of the string) and then that final word itself, the user id.
Then return an object from .map() with a property set for each element of the resulting array.
let allText = `2014-06-01 23:07:58 President Resigns in Georgia’s Breakaway Region of
Abkhazia t.co/DAploRvCvV nytimes
2014-06-01 23:48:06 The NYT FlipBoard guide to understanding climate
change and its consequences t.co/uPGTuYiSmQ nytimes
2014-06-01 23:59:06 For all the struggles that young college grads
face, a four-year degree has probably never been more valuable
t.co/Gjf6wrwMsS nytimes
2014-06-01 23:35:09 It's better to be a community-college graduate than
a college dropout t.co/k3CO7ClmIG nytimes
2014-06-01 22:47:04 Share your experience with Veterans Affairs health
care t.co/PrDhLC20Bt nytimes
2014-06-01 22:03:27 Abandon Hope, Almost All Ye Who Enter the N.B.A.
Playoffs t.co/IQAJ5XNddR nytimes`;
// replace more than one consecutive space character and new line characters
allText = allText.replace(/\s{2,}|\n+/g, " ");
// get dates
let dates = allText.match(/[\d-]+\s[\d:]+/g);
// get characters that are not dates
// spread `dates` to resulting array
// return object
let res = allText
.split(/[\d-]+\s[\d:]+\s/)
.filter(Boolean)
.map((text, index) =>
[dates[index], ...text.match(/.+(?=\s\w+\s$|\s\w+$)|\w+\s$|\w+$/g)])
.map(([date, text, user_id]) => ({date, text, user_id}));
console.log(res);
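As a variation on the same idea, the records can also be separated by splitting on the space that precedes each timestamp, using a lookahead so the timestamp itself stays in the record. This is only a sketch: the function name `toRecords` is hypothetical, and it assumes the text begins with a timestamp, that every timestamp has the zero-padded form `YYYY-MM-DD HH:MM:SS`, and that each record ends with a one-word user id.

```javascript
// Hypothetical helper: split the normalized text into one string per record,
// then slice each record into date, text, and user_id.
function toRecords(allText) {
  return allText
    .replace(/\s+/g, " ") // collapse newlines and repeated spaces
    .trim()
    // split on a space that is followed by a timestamp and another space
    .split(/ (?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} )/)
    .filter(Boolean)
    .map(record => {
      const lastSpace = record.lastIndexOf(" ");
      return {
        date: record.slice(0, 19),
        text: record.slice(20, lastSpace),
        user_id: record.slice(lastSpace + 1)
      };
    });
}
```

Because the input is trimmed and the separator space is consumed by the split, `user_id` carries no trailing whitespace. The result can be serialized with `JSON.stringify(toRecords(allText))`.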