Combine folder of CSV files with different number of columns in Excel Query Editor

Question

I have a folder of many CSV files in which the first eight columns have the same headers, but the number of columns after the first eight varies. I am attempting to use Queries and Connections in Excel 2016 to combine them.

Here is an example of the format of the CSV files (pretend there are string values in A - H):

----------------------------------------------------------------------------------------------------------------
|     A     |  B  |  C  |  D  |  E  |  F  |  G  |  H  | Company 1 | Value (1) | Company 2 | Value (2) | etc... |  
----------------------------------------------------------------------------------------------------------------  
| Product 1 |     |     |     |     |     |     |     |    .05    |   25.00   |    .08    |   14.00   |        |  
----------------------------------------------------------------------------------------------------------------  
| Product 2 |     |     |     |     |     |     |     |    .16    |   43.00   |    .06    |   18.00   |        |  
----------------------------------------------------------------------------------------------------------------  

Once again: columns A - H are the same for each CSV file, but each file has a different number of Company/Value columns (the company names are different for each file).

I have done the necessary steps for one CSV file and was hoping there was some way I could possibly use it as a template for the others.

Here are the steps I need to do in Excel's Query editor:

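  • Remove some of the "A - H" columns (these headers are the same in every file, so this should not be hard)
  • Remove all "Value (#)" columns; the number of these differs in each CSV file
  • Unpivot all "Company #" columns (because the "Value (#)" columns were removed in the previous step, these are now every column after the matching A - H columns)
  • Rename the column that holds all the company names to "Company"
  • Rename the column that holds the values previously under each company to "Cost"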


Here are the applied steps from the Advanced Editor:

#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"A", type text}, {"B", type text}, {"C", type text}, {"D", type text}, {"E", type number}, {"F", type text}, {"G", Currency.Type}, {"H", Currency.Type}, {"Company 1", type text}, {"Value (1)", Currency.Type}, {"Company 2", type text}, {"Value (2)", Currency.Type}, {"Company 3", type text}, {"Value (3)", Currency.Type}, {"Company 4", type text}, {"Value (4)", Currency.Type}, {"Company 5", type text}, {"Value (5)", Currency.Type}, {"Company 6", type text}, {"Value (6)", Currency.Type}, {"Company 7", type text}, {"Value (7)", Currency.Type}, {"Company 8", type text}, {"Value (8)", Currency.Type}, {"Company 9", type text}, {"Value (9)", Currency.Type}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"A", "C", "D", "E", "Company 1", "Company 2", "Company 3", "Company 4", "Company 5", "Company 6", "Company 7", "Company 8", "Company 9"}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Other Columns", {"A", "C", "D", "E"}, "Attribute", "Value")

Is there any way to import the folder of CSV files and apply the steps stated above to each one so there is just one query of all the CSV files? (brief example of how I want it to look below)

----------------------------------------------------
|     A     |  C  |  D  |  E  |  Company  |  Cost  |  
----------------------------------------------------
| Product 1 |     |     |     | Company 1 |  .05   |
----------------------------------------------------
| Product 1 |     |     |     | Company 2 |  .08   |
----------------------------------------------------
| Product 2 |     |     |     | Company 1 |  .16   |
----------------------------------------------------
| Product 2 |     |     |     | Company 2 |  .06   |
----------------------------------------------------

Recommended answer

I believe so. This seems to work.

Start by selecting Get Data (or New Source) > File > Folder. Browse to the folder that contains your CSV files, select it, and click OK. Then click OK again.

A dialog listing the files in the folder appears. Click the Edit button.

Right-click the Content column and choose Remove Other Columns, so that only the Content column remains.

Expand the Content column by clicking the button at its top right, then click OK in the dialog box that appears.

On the left side of your screen, under Queries, you'll see several helper queries that Excel generated. The one where you want to do your work has a table icon and a name that starts with Transform Sample File. Click on it.

Under APPLIED STEPS on the right side of the screen, click on Source.

In the formula bar, you should see something like this: = Csv.Document(#"Sample File Parameter1",[Delimiter=",", Columns=12, Encoding=1252, QuoteStyle=QuoteStyle.None]). Delete the Columns=12, part (everything from the C through the trailing comma; your number will most likely differ from 12). With the fixed column count gone, each file loads however many columns it actually contains.
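As a rough illustration of that edit, here is a minimal sketch of a Csv.Document call with the Columns option left out. The inline CSV text and the step names SampleCsv and SampleBinary are hypothetical stand-ins for #"Sample File Parameter1"; they are not part of the original answer.

```powerquery
let
    // Hypothetical inline CSV text standing in for one file's content
    SampleCsv = "A,B,Company 1,Value (1),Company 2,Value (2)#(lf)Product 1,x,.05,25.00,.08,14.00",
    // Convert the text to binary, as a file read from the folder would be
    SampleBinary = Text.ToBinary(SampleCsv, 1252),
    // No Columns option: the number of columns comes from the file itself
    Source = Csv.Document(SampleBinary, [Delimiter = ",", Encoding = 1252, QuoteStyle = QuoteStyle.None])
in
    Source
```

Pasting this into a blank query's Advanced Editor should return a table with six columns for this sample; a file with more Company/Value pairs would simply yield more columns.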

Then remove the columns you don't need from the table.

Click Transform > Use First Row as Headers dropdown arrow > Use Headers as First Row.

Click Transform > Transpose to flip the table.

Click the Column1 dropdown arrow > Text Filters > Does Not Begin With, type Value, and click OK.

Click Transform > Transpose to flip the table back to its original orientation.

Click Transform > Use First Row as Headers.
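The five clicks from Use Headers as First Row through Use First Row as Headers correspond roughly to the M steps below. This is only a sketch: the sample table and the step names are made up for illustration, and the code Excel generates for your files will look somewhat different.

```powerquery
let
    // Hypothetical sample table standing in for one parsed CSV file
    Source = #table(
        {"A", "Company 1", "Value (1)", "Company 2", "Value (2)"},
        {{"Product 1", 0.05, 25.00, 0.08, 14.00}, {"Product 2", 0.16, 43.00, 0.06, 18.00}}
    ),
    // Use Headers as First Row
    Demoted = Table.DemoteHeaders(Source),
    // Transpose: each original column becomes a row, and Column1 holds the old header
    Transposed = Table.Transpose(Demoted),
    // Text Filters > Does Not Begin With "Value"
    Filtered = Table.SelectRows(Transposed, each not Text.StartsWith([Column1], "Value")),
    // Transpose back, then Use First Row as Headers
    Restored = Table.PromoteHeaders(Table.Transpose(Filtered), [PromoteAllScalars = true])
in
    Restored
```

Because the filter works on the header text rather than on named columns, it does not matter how many Value (#) columns a given file has.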

Select all of the columns except your Company columns and click the Unpivot Columns dropdown arrow > Unpivot Other Columns. (Don't select and unpivot the Company columns directly. Unpivoting them indirectly this way means the step names only the fixed columns, so it should capture the varying numbers of Company columns from the various files.)
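In M terms, Unpivot Other Columns produces a Table.UnpivotOtherColumns step that lists only the columns you selected. The sketch below uses a hypothetical sample table and passes "Company" and "Cost" as the output column names directly, which is an assumption made here to fold the two rename steps from the question into one call; the UI itself defaults to "Attribute" and "Value".

```powerquery
let
    // Hypothetical sample table after the Value (#) columns were removed
    Source = #table(
        {"A", "C", "Company 1", "Company 2"},
        {{"Product 1", "x", 0.05, 0.08}, {"Product 2", "y", 0.16, 0.06}}
    ),
    // Only the fixed columns are named; every other column is unpivoted
    Unpivoted = Table.UnpivotOtherColumns(Source, {"A", "C"}, "Company", "Cost")
in
    Unpivoted
```

Since no Company column is named in the step, a file with nine companies and a file with two go through the same code unchanged.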

Under APPLIED STEPS on the right side of the screen, delete each Changed Type step. These steps hard-code column names, so if you leave them in, they can cause your query to choke on files that don't have those exact columns.

Now go back to your original "main" query, the one that was created when you connected to the folder. Yours will have its own name, but you can recognize it by its position in the Queries pane, outside the Transform File helper group. Click on it.

You'll likely see an error at this point.

Delete any Changed Type steps in the APPLIED STEPS on the right side of the screen.

You should now see the combined table, with one row per product and company.
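To see the whole idea in one place, the sketch below applies a per-file transform to two hypothetical tables with different numbers of Company columns and then appends the results. Note that it swaps in a different technique from the answer: instead of the transpose round-trip, it filters the column names with List.Select. The names TransformFile, File1, and File2 are invented for illustration, and only column A is kept as a fixed column to keep the example short.

```powerquery
let
    // Hypothetical per-file transform: drop Value (#) columns, then unpivot
    TransformFile = (file as table) as table =>
        let
            // Keep every column whose name does not start with "Value"
            KeepNames = List.Select(Table.ColumnNames(file), each not Text.StartsWith(_, "Value")),
            Kept = Table.SelectColumns(file, KeepNames),
            // Unpivot everything except the fixed column
            Unpivoted = Table.UnpivotOtherColumns(Kept, {"A"}, "Company", "Cost")
        in
            Unpivoted,
    // Two hypothetical files with different numbers of Company/Value pairs
    File1 = #table({"A", "Company 1", "Value (1)"}, {{"Product 1", 0.05, 25.00}}),
    File2 = #table(
        {"A", "Company 7", "Value (1)", "Company 8", "Value (2)"},
        {{"Product 2", 0.16, 43.00, 0.06, 18.00}}
    ),
    // Apply the transform to each file and append the results
    Combined = Table.Combine(List.Transform({File1, File2}, TransformFile))
in
    Combined
```

In the real workbook, the Transform Sample File query plays the role of TransformFile and the main query does the appending, so this sketch is meant for checking the logic on small data rather than as a replacement for the generated queries.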
