How to load very large csv files in nodejs?


Problem description


I'm trying to load 2 big CSV files into nodejs; the first one is 257,597 KB and the second one 104,330 KB. I'm using the filesystem (fs) and csv modules. Here's my code:

const fs = require('fs')
const csv = require('csv') // the "csv" npm module

let myData

// Read the entire file into memory, then parse it in one go
fs.readFile('path/to/my/file.csv', (err, data) => {
  if (err) console.error(err)
  else {
    csv.parse(data, (err, dataParsed) => {
      if (err) console.error(err)
      else {
        myData = dataParsed
        console.log('csv loaded')
      }
    })
  }
})


And after ages (1-2 hours) it just crashes with this error message:

<--- Last few GCs --->

[1472:0000000000466170]  4366473 ms: Mark-sweep 3935.2 (4007.3) -> 3935.2 (4007.3) MB, 5584.4 / 0.0 ms  last resort GC in old space requested
[1472:0000000000466170]  4371668 ms: Mark-sweep 3935.2 (4007.3) -> 3935.2 (4007.3) MB, 5194.3 / 0.0 ms  last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 000002BDF12254D9 <JSObject>
    1: stringSlice(aka stringSlice) [buffer.js:590] [bytecode=000000810336DC91 offset=94](this=000003512FC822D1 <undefined>,buf=0000007C81D768B9 <Uint8Array map = 00000352A16C4D01>,encoding=000002BDF1235F21 <String[4]: utf8>,start=0,end=263778854)
    2: toString [buffer.js:664] [bytecode=000000810336D8D9 offset=148](this=0000007C81D768B9 <Uint8Array map = 00000352A16C4D01>,encoding=000002BDF1...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::DecodeWrite
 2: node_module_register
 3: v8::internal::FatalProcessOutOfMemory
 4: v8::internal::FatalProcessOutOfMemory
 5: v8::internal::Factory::NewRawTwoByteString
 6: v8::internal::Factory::NewStringFromUtf8
 7: v8::String::NewFromUtf8
 8: std::vector<v8::CpuProfileDeoptFrame,std::allocator<v8::CpuProfileDeoptFrame> >::vector<v8::CpuProfileDeoptFrame,std::allocator<v8::CpuProfileDeoptFrame> >
 9: v8::internal::wasm::SignatureMap::Find
10: v8::internal::Builtins::CallableFor
11: v8::internal::Builtins::CallableFor
12: v8::internal::Builtins::CallableFor
13: 00000081634043C1


The biggest file is loaded, but node runs out of memory for the other one. It's probably easy to allocate more memory, but the main issue here is the loading time: it seems very long even for files of this size. So what is the correct way to do it? Python loads these csv files really fast with pandas, btw (3-5 seconds).
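For reference, the heap ceiling itself can be raised with Node's --max-old-space-size flag (value in megabytes), though that only works around the crash and does nothing for the load time. A rough example, where index.js is a placeholder for the entry script and roughly 8 GB of RAM is assumed to be available:

node --max-old-space-size=8192 index.js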

Recommended answer


Streaming works perfectly; it took only 3-5 seconds:

var fs = require('fs')
var csv = require('csv-parser')

var data = []

// Stream the file and parse it row by row instead of
// reading the whole thing into memory at once
fs.createReadStream('path/to/my/data.csv')
  .pipe(csv())
  .on('data', function (row) {
    data.push(row)
  })
  .on('end', function () {
    console.log('Data loaded')
  })
