Solution to perform lots of calculations on 3 million data points and make charts

Problem description

I have an Excel spreadsheet with about 300,000 rows and about 100 columns.

I need to perform various functions on this spreadsheet, and from it I need to create about 3000 other spreadsheets, each of which is SIGNIFICANTLY smaller.

For every created spreadsheet I will need a separate PowerPoint file with an automatically generated graph.

I've done lots of VBA programming, but I am a little lost with this project.

  1. If I dump the data into a MySQL database, would it be easier for me to handle my task?
  2. Is it feasible to do this all in Excel VBA?
  3. Is it possible to easily add graphs from Excel into PowerPoint programmatically? Or should I perhaps use a different solution for graphs?

Solution

  1. It depends strongly on how you plan to process the data. If you plan to write code in Excel, it makes much more sense to leave it in Excel. Having said that, I would dump the data to CSV (comma-delimited) for further processing with a different tool, like Python.

  2. Everything is always feasible given enough time and money. If you're like most other programmers, you don't have too much of either, so you want the most efficient solution, or close to it. If it were me, I would write code in Python to read the data from a CSV file, perform all required operations, and save the 3000 separate output sets as individual CSV files which can be imported back into Excel.

  3. Charts can be tricky to create and manipulate from VBA. I would use a Python library like Matplotlib to produce all graphical output and save it to disk as PNG images, which can then be inserted into the PowerPoint presentation(s).
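
As a rough illustration of point 2, here is a minimal sketch that splits one large CSV into many small ones using only the Python standard library. The question does not say how the 3000 smaller spreadsheets are derived, so this assumes each one corresponds to a distinct value in a single column. The function name `split_csv_by_key` and its parameters are made up for this example.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_csv_by_key(source_path, key_column, output_dir):
    """Write one small CSV per distinct value of key_column.

    Returns a dict mapping each key value to its number of data rows.
    Assumption: key values are safe to use as file names.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Collect rows per key. This holds the whole file in memory, which is
    # simple but may need rethinking if the data does not fit in RAM.
    groups = defaultdict(list)
    with open(source_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        key_index = header.index(key_column)
        for row in reader:
            groups[row[key_index]].append(row)

    # Write each group to its own file, repeating the header row.
    for key, rows in groups.items():
        out_path = output_dir / f"{key}.csv"
        with open(out_path, "w", newline="", encoding="utf-8") as dst:
            writer = csv.writer(dst)
            writer.writerow(header)
            writer.writerows(rows)

    return {key: len(rows) for key, rows in groups.items()}
```

A call might look like `split_csv_by_key("data.csv", "region", "out")`, where the file name, the column name, and the output folder are all placeholders. Any per-group calculations would go between the grouping step and the writing step.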

Python is mentioned here only as an example. You should use a tool that you feel most familiar with; however, the concepts of processing the data programmatically (not via interconnected cell references and formulas with a little VBA thrown in to copy sheets and so on) should still apply, and will be your best way forward here. I have done a ton of the kind of work you describe. Get the data into CSV and process the data with code.
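
For the chart step in point 3, a minimal sketch with Matplotlib might look like the following. It assumes a simple line chart is wanted, which the question does not specify. The function name `save_line_chart`, the axis labels, and the figure size are placeholders chosen for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # render straight to files; no display is needed
import matplotlib.pyplot as plt


def save_line_chart(x_values, y_values, title, png_path):
    """Draw a single line chart and save it as a PNG file."""
    fig, ax = plt.subplots(figsize=(8, 4.5))
    ax.plot(x_values, y_values, marker="o")
    ax.set_title(title)
    ax.set_xlabel("x")      # placeholder axis labels
    ax.set_ylabel("value")
    fig.tight_layout()
    fig.savefig(png_path, dpi=150)
    # Close the figure so memory is released; this matters when the
    # function is called in a loop for thousands of charts.
    plt.close(fig)
    return png_path
```

Calling this once per output data set would produce one PNG per spreadsheet. Placing the image on a slide is a separate step; python-pptx is one library that can do it, though the answer above does not name a specific tool for that part.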
