CosmosDB, very long index that's also the partition key
Question
We are storing a folder tree. The number of items is huge, so we partition the collection on the parent folder.
When we issue
SELECT * FROM root WHERE root.parentPath = "\\server\share\shortpath" AND root.isFile
the RU charge is very low and performance is very good.
But when we have a long path, e.g.
SELECT * FROM root WHERE root.parentPath = "\\server\share\a very\long\path\longer\than\this" AND root.isFile
the RU charge goes up to 5000 and performance suffers.
parentPath works well as a partition key, as all queries include this field in the filter.
If I add another clause to the query it also becomes very fast, e.g. if I append something like AND root.name = 'filename'.
It's almost as if it's scanning the entire partition based on the hash that's derived from the key.
The query returns NO DATA, which is fine, because it is looking for subfolders under a given node. It just gets very slow once you go deep.
x-ms-documentdb-query-metrics:
totalExecutionTimeInMs=1807.61;
queryCompileTimeInMs=0.08;
queryLogicalPlanBuildTimeInMs=0.04;
queryPhysicalPlanBuildTimeInMs=0.06;
queryOptimizationTimeInMs=0.01;
VMExecutionTimeInMs=1807.11;
indexLookupTimeInMs=0.65;
documentLoadTimeInMs=1247.08;
systemFunctionExecuteTimeInMs=0.00;
userFunctionExecuteTimeInMs=0.00;
retrievedDocumentCount=72554;
retrievedDocumentSize=59561577;
outputDocumentCount=0;
outputDocumentSize=49;
writeOutputTimeInMs=0.00;
indexUtilizationRatio=0.00
Taken from the raw header string:
x-ms-documentdb-query-metrics: totalExecutionTimeInMs=1807.61;queryCompileTimeInMs=0.08;queryLogicalPlanBuildTimeInMs=0.04;queryPhysicalPlanBuildTimeInMs=0.06;queryOptimizationTimeInMs=0.01;VMExecutionTimeInMs=1807.11;indexLookupTimeInMs=0.65;documentLoadTimeInMs=1247.08;systemFunctionExecuteTimeInMs=0.00;userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=72554;retrievedDocumentSize=59561577;outputDocumentCount=0;outputDocumentSize=49;writeOutputTimeInMs=0.00;indexUtilizationRatio=0.00
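For anyone reading the header above, here is a minimal sketch (plain Python, no Cosmos DB SDK involved) that parses the metrics string and pulls out the numbers that point to a scan: how many documents were loaded versus returned, and how much of the time went into loading documents. The function name parse_query_metrics and the derived variable names are my own, not part of any API.

```python
def parse_query_metrics(header: str) -> dict[str, float]:
    """Parse 'key=value;key=value' pairs from the query metrics header."""
    # Drop the 'x-ms-documentdb-query-metrics: ' prefix if it is present.
    _, _, body = header.rpartition(": ")
    metrics: dict[str, float] = {}
    for pair in body.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        metrics[key] = float(value)
    return metrics


# The header string exactly as posted in the question.
HEADER = (
    "x-ms-documentdb-query-metrics: totalExecutionTimeInMs=1807.61;"
    "queryCompileTimeInMs=0.08;queryLogicalPlanBuildTimeInMs=0.04;"
    "queryPhysicalPlanBuildTimeInMs=0.06;queryOptimizationTimeInMs=0.01;"
    "VMExecutionTimeInMs=1807.11;indexLookupTimeInMs=0.65;"
    "documentLoadTimeInMs=1247.08;systemFunctionExecuteTimeInMs=0.00;"
    "userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=72554;"
    "retrievedDocumentSize=59561577;outputDocumentCount=0;"
    "outputDocumentSize=49;writeOutputTimeInMs=0.00;"
    "indexUtilizationRatio=0.00"
)

m = parse_query_metrics(HEADER)
retrieved = int(m["retrievedDocumentCount"])   # documents loaded from storage
returned = int(m["outputDocumentCount"])       # documents that matched the filter
load_share = m["documentLoadTimeInMs"] / m["totalExecutionTimeInMs"]
avg_doc_bytes = m["retrievedDocumentSize"] / retrieved

print(f"retrieved={retrieved}, returned={returned}")
print(f"document load share of total time: {load_share:.0%}")
print(f"average retrieved document size: {avg_doc_bytes:.0f} bytes")
print(f"index utilization ratio: {m['indexUtilizationRatio']}")
```

Tens of thousands of documents retrieved, none returned, and an index utilization ratio of zero is the pattern you would expect when the filter is applied by loading documents rather than by the index.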
Recommended answer
This is because of a path length limit in Indexing v1.
We have increased the path length limit to a larger value in the new index layout, so migrating the collection to the new layout would fix the issue and provide many performance benefits.
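To picture why a length limit could produce the symptoms in the question, here is a toy model in Python. It is only an illustration under my own assumptions and does not describe how Cosmos DB actually implements its index: the limit MAX_TERM_LEN = 30, the functions build_index and query, and the sample documents are all hypothetical. In this model the index keeps only the first MAX_TERM_LEN characters of parentPath, so long paths that share a prefix land in the same bucket and every candidate has to be loaded and compared.

```python
from collections import defaultdict

MAX_TERM_LEN = 30  # hypothetical limit, chosen only for this illustration


def build_index(docs, max_len):
    """Index documents by parentPath, truncated to max_len characters."""
    index = defaultdict(list)
    for doc in docs:
        index[doc["parentPath"][:max_len]].append(doc)
    return index


def query(index, parent_path, max_len):
    """Return (matching docs, number of candidate docs that had to be loaded)."""
    candidates = index.get(parent_path[:max_len], [])
    # The truncated key cannot tell long paths apart, so compare full values.
    matches = [d for d in candidates if d["parentPath"] == parent_path]
    return matches, len(candidates)


SHORT = r"\\server\share\shortpath"
LONG_PREFIX = r"\\server\share\a very\long\path\longer\than"

docs = [{"parentPath": SHORT, "name": f"file{i}"} for i in range(3)]
docs += [
    {"parentPath": LONG_PREFIX + "\\" + f"dir{i}", "name": "file"}
    for i in range(1000)
]

index = build_index(docs, MAX_TERM_LEN)

short_matches, short_loaded = query(index, SHORT, MAX_TERM_LEN)
long_matches, long_loaded = query(index, LONG_PREFIX + "\\this", MAX_TERM_LEN)

print(f"short path: loaded {short_loaded}, matched {len(short_matches)}")
print(f"long path:  loaded {long_loaded}, matched {len(long_matches)}")
```

In the toy model the short path fits under the limit and is resolved exactly, while the long path loads every document that shares its truncated prefix and still matches nothing. That is the same shape as a high retrievedDocumentCount with outputDocumentCount=0.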
We have rolled out the new index layout by default for new collections. If you can recreate the current collection and migrate the existing data into it, that would be the best option. Otherwise, an alternative is to trigger the migration process, which moves an existing collection to the new index layout. The following C# method can be used to do that:
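using System;
using System.Globalization;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Requests a migration of an existing collection to the v2 index layout.
static async Task UpgradeCollectionToIndexV2Async(
    DocumentClient client,
    string databaseId,
    string collectionId)
{
    // Read the current collection definition.
    DocumentCollection collection = (await client.ReadDocumentCollectionAsync(string.Format("/dbs/{0}/colls/{1}", databaseId, collectionId))).Resource;
    // Set the index version to 2 and replace the collection to trigger the migration.
    collection.SetPropertyValue("IndexVersion", 2);
    ResourceResponse<DocumentCollection> replacedCollection = await client.ReplaceDocumentCollectionAsync(collection);
    Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "Upgraded indexing version for database {0}, collection {1} to v2", databaseId, collectionId));
}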
The migration could take several hours to complete, depending on the amount of data in the collection. The issue should be resolved once it has finished.
(This was copied and pasted from an email conversation we had to resolve this issue.)