How to append to JSON without duplicating (including CasperJS code)?


Problem description

I'm using CasperJS to parse the inner text of web pages and save it to a JSON file.

Here is my code. The result (and the problem I have with it) is shown below.

var words = [];
var casper = require('casper').create();
var x = require('casper').selectXPath;
var fs = require('fs');

function getWords() {
    var words = document.querySelectorAll('span.inner_tit');
    return Array.prototype.map.call(words, function(e) {
        return e.innerHTML;
    });
}

function createFinal(wordArray) {
    var out = [];
    wordArray.forEach(function(word) {
        out.push({"type": "river", "name": word, "spell": word.length});
    });
    return out;
}    

casper.start('http://dic.daum.net/index.do?dic=kor');


casper.thenClick(x('//*[@id="searchSubmit"]'), function(){
    console.log('searching');
});

casper.wait(2000, function() {
    casper.then(function() {
        words = this.evaluate(getWords);
    });
});

casper.wait(3000, function() {
    casper.thenClick(x('//*[@id="mArticle"]/div[2]/a[2]'), function (){
        words = words.concat(this.evaluate(getWords));
    });
});

casper.run(function() {
    var my_object = { "my_initial_words": createFinal(words)};
    this.echo(JSON.stringify(my_object, null, '\t'))
    var result = JSON.stringify(my_object, null, '\t')
    fs.write('myresults.json', result, 'a');
    this.exit();

});

The problem with this code is that when the file already contains JSON like this,

{
    "my_initial_words": [
        {
            "type": "river",
            "name": "apple",
            "spell": "5"
        },
        {
            "type": "river",
            "name": "banana",
            "spell": "6"
        }   
    ]
}

my code appends the whole thing again, including the name of the JSON array, like this:

{
    "my_initial_words": [
        {
            "type": "river",
            "name": "apple",
            "spell": "5"
        },
        {
            "type": "river",
            "name": "banana",
            "spell": "6"
        }   
    ]
}  {
    "my_initial_words": [
        {
            "type": "river",
            "name": "apple",
            "spell": "5"
        },
        {
            "type": "river",
            "name": "banana",
            "spell": "6"
        }   
    ]
}

So I don't want to append all of it. I only want to add these elements (without the "my_initial_words": [] wrapper):

{"type": "river",   "name": "apple","spell": "5"},
{"type": "river",   "name": "banana","spell": "6"}  

Recommended answer

Updating an object in file

JSON is defined in such a way that you can't append an object to an existing object and expect to get valid JSON out of it. You can, however:

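  1. read the previously serialized JSON string,
  2. parse it into an object,
  3. add your new values to the array,
  4. serialize the object again, and
  5. completely overwrite the existing file with it.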

For example, like this:

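// Steps 1 and 2: read the existing file and parse it into an object
var previousDataString = fs.read('myresults.json');
var previousData = JSON.parse(previousDataString);
// Step 3: add the new items to the existing array
previousData["my_initial_words"] = previousData["my_initial_words"].concat(createFinal(words));
// Steps 4 and 5: serialize again and overwrite the file ('w' instead of 'a')
var newData = JSON.stringify(previousData, null, '\t');
fs.write('myresults.json', newData, 'w');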
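
The title also asks about avoiding duplicates, which the snippet above does not handle: running the script twice stores the same words twice. Below is a minimal sketch of one way to do the merge step, under some assumptions. The helper name `appendItems` is hypothetical, items are treated as duplicates when their `name` matches, and an empty string stands in for a file that has no content yet. It is plain ES5 with no file access, so it can be tried outside CasperJS.

```javascript
// Hypothetical helper (not part of the original answer): merges new items into
// the array stored under `key` in a previously serialized JSON string.
function appendItems(previousJsonString, newItems, key) {
    // Assumption: an empty or whitespace-only string means "no previous data".
    var hasData = previousJsonString && previousJsonString.trim() !== '';
    var data = hasData ? JSON.parse(previousJsonString) : {};
    var existing = data[key] || [];

    // Remember which names are already stored so they are not added twice.
    // Assumption: two items are duplicates when their `name` is equal.
    var seen = Object.create(null);
    existing.forEach(function (item) { seen[item.name] = true; });

    newItems.forEach(function (item) {
        if (!seen[item.name]) {
            seen[item.name] = true;
            existing.push(item);
        }
    });

    data[key] = existing;
    return JSON.stringify(data, null, '\t');
}

// Example input: "apple" is already stored, "banana" is new.
var before = '{"my_initial_words":[{"type":"river","name":"apple","spell":5}]}';
var after = appendItems(before, [
    {type: 'river', name: 'apple', spell: 5},
    {type: 'river', name: 'banana', spell: 6}
], 'my_initial_words');
console.log(after);
```

In the CasperJS script, the first argument would presumably come from fs.read('myresults.json') and the returned string would be written back with fs.write('myresults.json', ..., 'w'), as in the answer's snippet.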

Writing chunks to file

If you still want to write your data file as separate chunks of JSON, then you can do this:

// Combine all items into a single string
var newItemsString = createFinal(words).reduce(function(combinedString, currentItem){
    return combinedString + JSON.stringify(currentItem) + "\n";
}, "");
// append new items to previous items
fs.write('myresults.json', newItemsString, 'a');

Each item (a word's object) is written on exactly one line. Note that the file as a whole is then no longer a single valid JSON document; only each individual line is. When you read the file in some other process, you can use a function such as readLine() to read exactly one item at a time.
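
As a rough illustration of reading such a file back, the sketch below splits the text into lines and parses each one separately. The helper name `parseJsonLines` and the sample contents are assumptions made for the example, and the text is given as a string literal rather than read from disk.

```javascript
// Hypothetical helper: turns line-delimited JSON text back into an array.
function parseJsonLines(text) {
    return text.split('\n')
        // Skip blank lines, such as the one after the final newline.
        .filter(function (line) { return line.trim() !== ''; })
        // Each remaining line is one complete JSON object.
        .map(function (line) { return JSON.parse(line); });
}

// Sample contents in the shape the chunk-writing code produces.
var fileContents = '{"type":"river","name":"apple","spell":5}\n' +
                   '{"type":"river","name":"banana","spell":6}\n';

var items = parseJsonLines(fileContents);

// If a single JSON document is needed later, wrap the items again.
var wrapped = { my_initial_words: items };
console.log(JSON.stringify(wrapped, null, '\t'));
```

The trade-off is that appending stays cheap, because nothing has to be re-read before writing, while the consumer takes on the job of reassembling the array.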

You also have to keep in mind how you're exiting CasperJS. If you provide a callback to casper.run(), then you need to explicitly call casper.exit() in order to exit the process. The problem is that you're doing that too early:

this.echo(JSON.stringify(previousData, null, '\t')).exit();
//                                                 ^^^^^^^ calling exit
var newData = JSON.stringify(previousData, null, '\t'); // not executed
fs.write('myscript.json', newData, 'w');  // not executed

Either you need to put the exit at the end of the callback:

this.echo(JSON.stringify(previousData, null, '\t'));
var newData = JSON.stringify(previousData, null, '\t');
fs.write('myscript.json', newData, 'w');
this.exit();

or put your final code into casper.then() instead of casper.run(), so that no explicit exit call is needed:

casper.then(function() {
    var previousDataString = fs.read('myscript.json');
    var previousData = JSON.parse(previousDataString);
    previousData["my_initial_words"] = previousData["my_initial_words"].concat(createFinal(words));
    this.echo(JSON.stringify(previousData, null, '\t'));
    var newData = JSON.stringify(previousData, null, '\t')
    fs.write('myscript.json', newData, 'w');
});
casper.run();
