Laravel insert millions of database rows from models


Question

I have a text file of comma-delimited values, with each line representing one data set. There are about 2 million of them, and I want to parse each line, create a Laravel model from it, and store each model as a row in my database.

At this time, I have a class that parses the file line by line and creates a model for each, as follows:

class LargeFileParser{

    // File Reference
    protected $file;

    // Check if file exists and create File Object
    public function __construct($filename, $mode="r"){
        if(!file_exists($filename)){
            throw new Exception("File not found");
        }

        $this->file = new SplFileObject($filename, $mode);
    }

    // Iterate through the text or binary document
    public function iterate($type = "Text", $bytes = NULL)
    {
        if ($type == "Text") {

            return new NoRewindIterator($this->iterateText());

        } else {

            return new NoRewindIterator($this->iterateBinary($bytes));
        }

    }

    // Handle Text iterations
    protected function iterateText()
    {
        $count = 0;

        while (!$this->file->eof()) {

            yield $this->file->fgets();

            $count++;
        }
        return $count;
    }

    // Handle binary iterations
    protected function iterateBinary($bytes)
    {
        $count = 0;

        while (!$this->file->eof()) {

            yield $this->file->fread($bytes);

            $count++;
        }
        return $count;
    }
}
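As a sanity check, the generator-based iterator can be exercised against an in-memory file. This is a minimal sketch, not part of the original class: a standalone `iterateText()` function mirrors the method above, and `SplTempFileObject` stands in for the real CENSUS.txt so the snippet is self-contained.

```php
<?php
// Sketch: lazily iterate lines the same way LargeFileParser::iterateText() does.
// SplTempFileObject (an in-memory file) stands in for the real CENSUS.txt.

function iterateText(SplFileObject $file): Generator
{
    while (!$file->eof()) {
        yield $file->fgets();
    }
}

$file = new SplTempFileObject();
$file->fwrite("dot_number,legal_name\n1,\"Acme Trucking\"\n2,\"Beta Freight\"\n");
$file->rewind();

$lines = [];
foreach (new NoRewindIterator(iterateText($file)) as $line) {
    $line = trim($line);
    if ($line !== '') {       // skip the empty read at end-of-file
        $lines[] = $line;
    }
}

$count = count($lines);       // header row plus two data rows
```

Because `iterateText()` is a generator, only one line is held in memory at a time, which is why the parser itself is not the bottleneck here.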

I then have a controller (I want to be able to run this migration via a route occasionally) that handles creating the models and inserting them into the database:

class CarrierDataController extends Controller
{
    // Store the data keys for a carrier model
    protected $keys;

    //Update the Carrier database with the census info
    public function updateData(){
        // File reference
        $file = new LargeFileParser('../storage/app/CENSUS.txt');

        //Get iterator for the file 
        $iterator = $file->iterate("Text");

        // For each iterator, store the data object as a carrier in the database
        foreach ($iterator as $index => $line) {   
            // First line sets the keys specified in the file 
            if($index == 0){
                $this->keys = str_getcsv(strtolower($line), ",", '"');
            }
            // The rest hold the data for each model
            else{                
                if ($index <= 100) {
                    // Parse the data to an array
                    $dataArray = str_getcsv($line, ",", '"');

                    // Get a data model
                    $dataModel = $this->createCarrierModel(array_combine($this->keys, $dataArray));

                    // Store the data
                    $this->storeData($dataModel);
                }
                else{
                    break;
                }
            }   
        }
    }

    // Return a model for the data
    protected function createCarrierModel($dataArray){

        $carrier = Carrier::firstOrNew($dataArray);

        return $carrier;
    }

    // Store the carrier data in the database
    protected function storeData($data){    
        $data->save();
    }
}

This works perfectly... that is, while I'm limiting the function to 100 inserts. If I remove that check and let it run over the entire 2 million data sets, it no longer works. Either the request times out, or, if I remove the timeout via something like ini_set('max_execution_time', 6000); I eventually get a "failed to respond" message from the browser.

My assumption is that some sort of chunking needs to be in place, but I'm honestly not sure of the best approach for handling this volume.

Thank you in advance for any suggestions you may have.

Answer

I would create an artisan command that handles the import rather than doing this via the browser. Do you really want to make the user wait until this big file has been imported? What happens if they use the back button or close the page?

If you want or need some kind of user interaction, such as the user uploading the file and clicking an "Import" button, push the import onto a job queue using e.g. Beanstalkd. The aforementioned artisan command will run, import the data, and when it is done you can send the user an e-mail or a Slack notification. If you need some UI interaction, you can make the request via AJAX and have that script poll an API endpoint for the status of the import; since it is asynchronous, the page waits for completion and then shows a UI notification, stops a spinner, or, in the error case, displays an error message.
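The slow part of the original controller is one Eloquent `firstOrNew()` + `save()` round trip per row. Whatever runs the import (the artisan command above), a common pattern is to buffer parsed rows and flush each fixed-size chunk with a single bulk insert; in Laravel that flush would typically be `DB::table('carriers')->insert($chunk)`. The sketch below is not the poster's code: the database write is replaced by a counting closure so the batching logic is self-contained and runnable, and the chunk size of 2 is only for illustration (1,000 or so is more realistic).

```php
<?php
// Sketch: buffer parsed CSV rows and flush them in fixed-size chunks.
// $flush stands in for a bulk insert such as DB::table('carriers')->insert($chunk).

function importInChunks(iterable $lines, callable $flush, int $chunkSize = 2): int
{
    $keys = null;
    $buffer = [];
    $flushed = 0;

    foreach ($lines as $line) {
        $fields = str_getcsv($line, ',', '"');
        if ($keys === null) {
            // The first line holds the column names, as in the original controller.
            $keys = array_map('strtolower', $fields);
            continue;
        }
        $buffer[] = array_combine($keys, $fields);
        if (count($buffer) >= $chunkSize) {
            $flush($buffer);
            $flushed += count($buffer);
            $buffer = [];
        }
    }
    if ($buffer !== []) {      // flush the final partial chunk
        $flush($buffer);
        $flushed += count($buffer);
    }
    return $flushed;
}

$lines = [
    'DOT_NUMBER,LEGAL_NAME',
    '1,"Acme Trucking"',
    '2,"Beta Freight"',
    '3,"Gamma Haulage"',
];

$batches = 0;
$total = importInChunks($lines, function (array $chunk) use (&$batches) {
    $batches++;                // a real implementation would bulk-insert $chunk here
});
```

With 2 million rows and a chunk size of 1,000, this turns 2 million individual INSERT statements into about 2,000, which is usually the difference between a timeout and a few minutes of work. Note that a plain bulk insert skips Eloquent entirely, so you lose model events and `firstOrNew()` deduplication; if you need upsert behavior, that has to be handled at the query level instead.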
