Azure BlobStorage blobs to Index
Question
Is it possible to upload a document to blob storage and then do the following:
- Get the content of the document and add it to an index.
- Get the key phrases from the content in point 1 and add them to the index.
I want the key phrases to then be searchable.
I have code that uploads documents to blob storage, and it works perfectly, but the only way I know of to get them indexed is the "Import Data" option in the Azure Search service, which creates an index with predefined fields, as below:
This works great when I only need those fields, and the index is updated automatically every 5 minutes. But it becomes a problem when I want a custom index.
The only fields I DO want are the following:
- fileId
- fileText (the content of the document)
- blobURL (to allow downloading the document)
- keyPhrases (to be extracted from fileText; I already have code that does this)
The only issue is that I need to retrieve the document content (fileText) in order to extract the keyPhrases, but as I understand it, I can only access that content once it is already in an index. Is that right?
I have very limited knowledge of Azure and am struggling to find anything similar to what I want to do.
The code I am using to upload a document to my blob storage is as follows:
    public CloudBlockBlob UploadBlob(HttpPostedFileBase file)
    {
        string searchServiceName = ConfigurationManager.AppSettings["SearchServiceName"];
        string blobStorageKey = ConfigurationManager.AppSettings["BlobStorageKey"];
        string blobStorageName = ConfigurationManager.AppSettings["BlobStorageName"];
        string blobStorageURL = ConfigurationManager.AppSettings["BlobStorageURL"];
        string UserID = User.Identity.GetUserId();
        // "HH" is the 24-hour clock; "hh" would produce the same name for AM and PM uploads.
        string UploadDateTime = DateTime.Now.ToString("yyyyMMddHHmmss");
        try
        {
            var path = Path.Combine(Server.MapPath("~/App_Data/Uploads"), UserID + "_" + UploadDateTime + "_" + file.FileName);
            file.SaveAs(path);
            // The first argument is the storage account name; here it is read from the "SearchServiceName" setting.
            var credentials = new StorageCredentials(searchServiceName, blobStorageKey);
            var client = new CloudBlobClient(new Uri(blobStorageURL), credentials);
            // Retrieve a reference to a container. (Create it in the management portal, or call container.CreateIfNotExists().)
            var container = client.GetContainerReference(blobStorageName);
            // Retrieve a reference to a block blob named after the user, timestamp, and file name.
            var blockBlob = container.GetBlockBlobReference(UserID + "_" + UploadDateTime + "_" + file.FileName);
            // Create or overwrite the blob with the contents of the local file.
            using (var fileStream = System.IO.File.OpenRead(path))
            {
                blockBlob.UploadFromStream(fileStream);
            }
            System.IO.File.Delete(path);
            return blockBlob;
        }
        catch (Exception e)
        {
            // The exception is swallowed here; the caller only sees null.
            var r = e.Message;
            return null;
        }
    }
I hope I haven't given too much information, but I don't know how else to explain what I am looking for. If I am not making sense, please let me know so that I can fix my question.
I am not looking for handout code, just a shove in the right direction.
Any help would be appreciated.
Thanks!
Recommended answer
We can use Azure Search to index documents through the Azure Search REST API or the .NET SDK. Based on your description, I created a demo with the .NET SDK and tested it successfully. My detailed steps are as follows:
- Create an Azure Search service from the Azure portal
- Get the search key from the Azure portal
- Create a custom index field model
    [SerializePropertyNamesAsCamelCase]
    public class TomTestModel
    {
        [Key]
        [IsFilterable]
        public string fileId { get; set; }
        [IsSearchable]
        public string fileText { get; set; }
        public string blobURL { get; set; }
        [IsSearchable]
        public string keyPhrases { get; set; }
    }
- Create a data source
    string searchServiceName = ConfigurationManager.AppSettings["SearchServiceName"];
    string adminApiKey = ConfigurationManager.AppSettings["SearchServiceAdminApiKey"];
    SearchServiceClient serviceClient = new SearchServiceClient(searchServiceName, new SearchCredentials(adminApiKey));
    // Placeholders: data source name, storage account connection string, blob container name.
    var dataSource = DataSource.AzureBlobStorage("data source name", "storage connection string", "container name");
    // Create the data source, replacing any existing one with the same name.
    if (serviceClient.DataSources.Exists(dataSource.Name))
    {
        serviceClient.DataSources.Delete(dataSource.Name);
    }
    serviceClient.DataSources.Create(dataSource);
- Create a custom index
    var definition = new Index()
    {
        Name = "tomcustomindex",
        Fields = FieldBuilder.BuildForType<TomTestModel>()
    };
    // Create the index, replacing any existing one with the same name.
    if (serviceClient.Indexes.Exists(definition.Name))
    {
        serviceClient.Indexes.Delete(definition.Name);
    }
    var index = serviceClient.Indexes.Create(definition);
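A data source and an index do not populate anything by themselves; in Azure Search, an indexer is what pulls blobs from the data source into the index. The steps as posted do not show one, so the following is only a minimal sketch of what that step might look like with the same SDK. The class name IndexerSetup, the method name, and the indexer name "tomcustomindexer" are all hypothetical, and the field mappings assume the blob indexer's standard `content` and `metadata_storage_path` fields should feed fileText, blobURL, and (base64-encoded, since document keys cannot contain URL characters such as "/") fileId.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class IndexerSetup
{
    // Hypothetical helper: connects the blob data source to the custom index.
    public static void CreateBlobIndexer(SearchServiceClient serviceClient,
                                         string dataSourceName,
                                         string indexName)
    {
        var indexer = new Indexer
        {
            Name = "tomcustomindexer", // assumed name, not from the original answer
            DataSourceName = dataSourceName,
            TargetIndexName = indexName,
            // Re-run every 5 minutes, matching the interval mentioned in the question.
            Schedule = new IndexingSchedule(TimeSpan.FromMinutes(5)),
            FieldMappings = new List<FieldMapping>
            {
                // Blob URL, base64-encoded so it is a valid document key.
                new FieldMapping("metadata_storage_path", "fileId", FieldMappingFunction.Base64Encode()),
                // Text extracted from the blob by the indexer.
                new FieldMapping("content", "fileText"),
                // Plain blob URL, kept for downloading the document.
                new FieldMapping("metadata_storage_path", "blobURL")
            }
        };

        // Replace any existing indexer with the same name.
        if (serviceClient.Indexers.Exists(indexer.Name))
        {
            serviceClient.Indexers.Delete(indexer.Name);
        }
        serviceClient.Indexers.Create(indexer);
    }
}
```

Under these assumptions it would be called as `IndexerSetup.CreateBlobIndexer(serviceClient, dataSource.Name, definition.Name);` once the data source and index exist. Nothing in this sketch fills keyPhrases, because the blob indexer has no source field for it.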
- Check the search result in Search explorer.
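The same check can also be made from code rather than the portal. The sketch below is an illustration, not part of the tested demo: the class name SearchDemo and the method PrintMatches are hypothetical, it assumes the index is named "tomcustomindex" as above, and it assumes the TomTestModel class is in scope.

```csharp
using System;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

public static class SearchDemo
{
    // Hypothetical helper: runs a full-text query and prints the matching files.
    public static void PrintMatches(SearchServiceClient serviceClient, string searchText)
    {
        // Get a client scoped to the custom index.
        var indexClient = serviceClient.Indexes.GetClient("tomcustomindex");

        var parameters = new SearchParameters
        {
            // Search only the two fields marked [IsSearchable] in the model.
            SearchFields = new[] { "fileText", "keyPhrases" },
            // Return only the fields needed to list and download a file.
            Select = new[] { "fileId", "blobURL", "keyPhrases" },
            Top = 10
        };

        DocumentSearchResult<TomTestModel> response =
            indexClient.Documents.Search<TomTestModel>(searchText, parameters);

        foreach (SearchResult<TomTestModel> result in response.Results)
        {
            Console.WriteLine("{0} -> {1}", result.Document.fileId, result.Document.blobURL);
        }
    }
}
```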
packages.config file:
    <?xml version="1.0" encoding="utf-8"?>
    <packages>
      <package id="Microsoft.Azure.KeyVault.Core" version="1.0.0" targetFramework="net452" />
      <package id="Microsoft.Azure.Search" version="3.0.0-rc" targetFramework="net452" />
      <package id="Microsoft.Data.Edm" version="5.6.4" targetFramework="net452" />
      <package id="Microsoft.Data.OData" version="5.6.4" targetFramework="net452" />
      <package id="Microsoft.Data.Services.Client" version="5.6.4" targetFramework="net452" />
      <package id="Microsoft.Rest.ClientRuntime" version="2.3.4" targetFramework="net452" />
      <package id="Microsoft.Rest.ClientRuntime.Azure" version="3.3.4" targetFramework="net452" />
      <package id="Microsoft.Spatial" version="6.15.0" targetFramework="net452" />
      <package id="Newtonsoft.Json" version="7.0.1" targetFramework="net452" />
      <package id="System.Spatial" version="5.6.4" targetFramework="net452" />
      <package id="WindowsAzure.Storage" version="7.2.1" targetFramework="net452" />
    </packages>
TomTestModel file:
    using System.ComponentModel.DataAnnotations;
    using Microsoft.Azure.Search;
    using Microsoft.Azure.Search.Models;

    namespace TomAzureSearchTest
    {
        [SerializePropertyNamesAsCamelCase]
        public class TomTestModel
        {
            [Key]
            [IsFilterable]
            public string fileId { get; set; }
            [IsSearchable]
            public string fileText { get; set; }
            public string blobURL { get; set; }
            [IsSearchable]
            public string keyPhrases { get; set; }
        }
    }
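The model above has a keyPhrases field, but indexing the blobs does not fill it. One possible way to close that gap, sketched here as an assumption rather than as part of the tested demo, is to read fileText back from the index, run the asker's own key phrase extraction over it, and merge the result into the same document. The class KeyPhraseUpdater and the `extractKeyPhrases` delegate are hypothetical stand-ins; the delegate represents the extraction code mentioned in the question.

```csharp
using System;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;
using TomAzureSearchTest;

public static class KeyPhraseUpdater
{
    // Hypothetical helper: fills keyPhrases for one already-indexed document.
    public static void UpdateKeyPhrases(ISearchIndexClient indexClient,
                                        string fileId,
                                        Func<string, string> extractKeyPhrases)
    {
        // Look up the indexed document by its key to get the extracted text.
        TomTestModel indexed = indexClient.Documents.Get<TomTestModel>(fileId);
        string phrases = extractKeyPhrases(indexed.fileText);

        // Use a loose Document holding only the key and the changed field,
        // so the merge does not send null values for the other fields.
        var partial = new Document
        {
            { "fileId", fileId },
            { "keyPhrases", phrases }
        };

        // Merge updates the listed fields of an existing document.
        var batch = IndexBatch.Merge(new[] { partial });
        indexClient.Documents.Index(batch);
    }
}
```

The `fileId` passed in has to be the key exactly as it is stored in the index. How a later indexing run of the same blob interacts with a merged keyPhrases value is not covered here and would need to be checked.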