How can I count and log the number of rows in a sheet with a specific month/year value

Question

I've downloaded a dataset which details all of the car accidents reported in England between January 1979 and December 2004. The file is in csv format and is understandably quite large (6,224,199 rows, to be exact). Because the file exceeds the number of rows that Excel 2010 can handle, I'd have to split it into smaller files, since I can't open it all at once in Excel. I tried using Notepad and Notepad++, but Notepad crashed, and Notepad++ refused to open such a large (720MB) file. I've considered using an Excel replacement like Delimit, but it doesn't support macros.

Now, overlooking the size issue, I need to count the total number of crashes in each month and make a note of them. There's a column that specifies the date of each crash, but the rows aren't sorted by crash date. I was considering using CTRL+F to count the number of rows with a specific month/year value and then logging the number of results for each search, but considering that the data spans 26 years, I'd have to manually search and record the results for 312 months.

Recommended answer

I agree with Jeanno and Brad: Access is a better tool than Excel for this type of requirement. However, I wondered whether reading such a large file with Excel would take a realistic amount of time.

I concatenated some large text files to create a 663 MB file, which I thought was close enough. The macro below reads each line of the file and splits it into fields ready for analysis. Note: my file uses "|" as a delimiter instead of ",".

The macro reads 7,782,013 records in a little over 100 seconds. Access is still the better option, but Excel is feasible if Access is not available.

Note: this macro needs a reference to "Microsoft Scripting Runtime".

Sub ReadAndSplit()

  Dim FileStream As TextStream
  Dim FileSysObj As FileSystemObject
  Dim Line As String
  Dim LinePart() As String
  Dim NumLines As Long
  Dim TimeStart As Double

  TimeStart = Timer

  Set FileSysObj = CreateObject("Scripting.FileSystemObject")
  NumLines = 0

  ' 1 means open read only
  Set FileStream = FileSysObj.OpenTextFile(ThisWorkbook.Path & "\Test4.txt", 1)

  Do While Not FileStream.AtEndOfStream
    Line = FileStream.ReadLine
    NumLines = NumLines + 1
    LinePart = Split(Line, "|")
  Loop

  FileStream.Close

  Debug.Print NumLines
  Debug.Print Timer - TimeStart

End Sub
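
The macro above only reads and splits the lines; it does not yet produce the per-month counts the question asks for. One possible way to extend the same loop is sketched below, using a Scripting.Dictionary keyed by year and month. Everything specific to the data is an assumption, not something stated in the question or the answer: the file name "Accidents.csv", a single header row, a comma delimiter with no quoted fields, the date sitting in the first field, and dates written as dd/mm/yyyy. Adjust these to match the real file. Like the original macro, it needs a reference to "Microsoft Scripting Runtime".

```vba
Sub CountRowsPerMonth()

  ' ASSUMPTIONS (hypothetical, adjust to the real file):
  '   - the file is ThisWorkbook.Path & "\Accidents.csv"
  '   - it has one header row and is comma-delimited, with no quoted fields
  '   - the date is in field DATE_FIELD (0-based) in dd/mm/yyyy form
  Const DATE_FIELD As Long = 0

  Dim FileSysObj As FileSystemObject
  Dim FileStream As TextStream
  Dim Counts As Dictionary
  Dim LinePart() As String
  Dim DateBits() As String
  Dim MonthKey As String
  Dim Key As Variant
  Dim Ws As Worksheet
  Dim RowNum As Long

  Set FileSysObj = New FileSystemObject
  Set Counts = New Dictionary

  Set FileStream = FileSysObj.OpenTextFile( _
      ThisWorkbook.Path & "\Accidents.csv", ForReading)

  ' Skip the header row
  If Not FileStream.AtEndOfStream Then FileStream.SkipLine

  Do While Not FileStream.AtEndOfStream
    LinePart = Split(FileStream.ReadLine, ",")
    ' Ignore blank or short lines that have no date field
    If UBound(LinePart) >= DATE_FIELD Then
      DateBits = Split(Trim$(LinePart(DATE_FIELD)), "/")
      If UBound(DateBits) = 2 Then
        ' Build a key such as "1979-01" (year, then zero-padded month)
        MonthKey = DateBits(2) & "-" & Right$("0" & DateBits(1), 2)
        ' Reading a missing key creates it as Empty, so Empty + 1 gives 1
        Counts(MonthKey) = Counts(MonthKey) + 1
      End If
    End If
  Loop

  FileStream.Close

  ' Write one row per month to a new worksheet
  Set Ws = ThisWorkbook.Worksheets.Add
  Ws.Columns(1).NumberFormat = "@"   ' keep keys as text, not dates
  Ws.Range("A1:B1").Value = Array("Month", "Rows")
  RowNum = 2
  For Each Key In Counts.Keys
    Ws.Cells(RowNum, 1).Value = Key
    Ws.Cells(RowNum, 2).Value = Counts(Key)
    RowNum = RowNum + 1
  Next Key

End Sub
```

The results go to a new worksheet rather than the Immediate window because a few hundred lines of Debug.Print output may scroll out of that window. The dictionary returns keys in the order they were first seen, so the output can be sorted on the Month column afterwards; the year-first key format makes a plain text sort chronological.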
