Read header data from files on a remote server


Problem description


I'm working on a project right now where I need to read header data from files on remote servers. There are many files and they are large, so I can't read whole files, only the header data I need.

The only solution I have is to mount the remote server with FUSE and then read the headers from the files as if they were on my local computer. I've tried it and it works, but it has some drawbacks, especially with FTP:

  • Really slow (comparing FTP via curlftpfs with SSH). From the same server, 90 files were read in 18 seconds over SSH, but only 10 files in 39 seconds over FTP.
  • Not dependable. Sometimes the mountpoint will not be unmounted.
  • If the server is active and a passive mount is done, the mountpoint and its parent folder get locked for about 3 minutes.
  • It times out even while data is being transferred (I guess this is the FTP protocol and not curlftpfs).

FUSE is a solution, but I don't like it very much because I don't feel that I can trust it. So my question is basically whether there are any other solutions to the problem. The language is preferably Ruby, but any other will work if Ruby does not support the solution.

Thanks!

Solution

What type of information are you looking for?

You could try using Ruby's open-uri module. The following example is adapted from http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/index.html

require 'open-uri'

# Kernel#open no longer accepts URLs as of Ruby 3.0; use URI.open instead.
URI.open("http://www.ruby-lang.org/en") {|f|
  p f.base_uri         # <URI::HTTP:0x40e6ef2 URL:http://www.ruby-lang.org/en/>
  p f.content_type     # "text/html"
  p f.charset          # "iso-8859-1"
  p f.content_encoding # []
  p f.last_modified    # Thu Dec 05 02:45:02 UTC 2002
}
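
Note that open-uri fetches the whole response body before yielding, which is what the question wants to avoid. If the files happen to be reachable over HTTP and the server honours Range requests (both are assumptions, since the question only mentions FTP and SSH), a rough sketch of fetching just the leading bytes with Net::HTTP could look like this. The helper names `range_request` and `fetch_first_bytes` are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Build a GET request that asks only for the first `length` bytes.
def range_request(uri, length)
  request = Net::HTTP::Get.new(uri)
  request['Range'] = "bytes=0-#{length - 1}"
  request
end

# Fetch the first `length` bytes of a remote file (hypothetical helper).
# The status is checked before the body is read, so a server that ignores
# the Range header causes an error instead of a full download.
def fetch_first_bytes(url, length)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(range_request(uri, length)) do |response|
      unless response.is_a?(Net::HTTPPartialContent)
        raise "server ignored the Range header (status #{response.code})"
      end
      return response.body
    end
  end
end
```

Calling something like `fetch_first_bytes(url, 10)` would then return only the first ten bytes, provided the server answers with 206 Partial Content.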

EDIT: It seems that the OP wanted to retrieve ID3 tag information from the remote files. This is more complex.

This appears to be a difficult problem. From the wiki:

Tag location within file

Only with the ID3v2.4 standard has it been possible to place the tag data at the end of the file, in common with ID3v1. ID3v2.2 and 2.3 require that the tag data precede the file. Whilst for streaming data this is absolutely required, for static data it means that the entire audio file must be updated to insert data at the front of the file. For initial tagging this incurs a large penalty as every file must be re-written. Tag writers are encouraged to introduce padding after the tag data in order to allow for edits to the tag data without requiring the entire audio file to be re-written, but these are not standard and the tag requirements may vary greatly, especially if APIC (associated pictures) are also embedded.

This means that depending on the ID3 tag version of the file, you may have to read different parts of the file.
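
As a rough illustration of what "different parts of the file" means in practice: an ID3v2 tag at the front of a file starts with a 10-byte header whose last four bytes give the tag size as a synchsafe integer, so the first 10 bytes tell you how much more to read. The sketch below only parses that header from bytes you have already fetched; the function names are hypothetical and it does not handle ID3v2.4 tags appended at the end of the file.

```ruby
# Decode a 4-byte synchsafe integer (7 significant bits per byte),
# which is how ID3v2 stores the tag size.
def synchsafe_to_int(bytes)
  bytes.inject(0) { |total, byte| (total << 7) | (byte & 0x7F) }
end

# Parse the 10-byte ID3v2 header found at the start of a file.
# Returns nil if the data does not begin with an ID3v2 tag.
def parse_id3v2_header(data)
  return nil if data.nil? || data.bytesize < 10

  magic, major, revision, flags, *size = data.unpack('a3C3C4')
  return nil unless magic == 'ID3'

  {
    version: "2.#{major}.#{revision}",
    flags: flags,
    # Size of the tag body, not counting the 10 header bytes themselves.
    tag_size: synchsafe_to_int(size)
  }
end
```

With the header parsed, reading `10 + tag_size` bytes from the start of the file should cover the whole front-of-file tag.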

Here's an article that outlines the basics of reading ID3 tags using Ruby. It covers ID3v1.1, but it should serve as a good starting point: http://rubyquiz.com/quiz136.html
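
For reference, an ID3v1 tag is a fixed 128-byte block at the very end of the file that starts with "TAG", so only those last 128 bytes need to be fetched. The following is a minimal sketch, not the code from the linked article, that parses such a block once you have it. `parse_id3v1` is a made-up name, and how you obtain the block (SFTP, FTP, an HTTP suffix range, and so on) is left open.

```ruby
# Parse a 128-byte ID3v1 / ID3v1.1 block (the last 128 bytes of a file).
# Returns nil if the block is not an ID3v1 tag.
def parse_id3v1(block)
  return nil unless block && block.bytesize == 128
  return nil unless block.byteslice(0, 3) == 'TAG'

  # Layout: "TAG", title(30), artist(30), album(30), year(4), comment(30), genre(1)
  title, artist, album, year, comment, genre =
    block.unpack('x3Z30Z30Z30Z4a30C')

  tag = { title: title.strip, artist: artist.strip, album: album.strip,
          year: year.strip, genre: genre }

  # ID3v1.1: a zero byte at comment offset 28 followed by a non-zero byte
  # means the final comment byte holds the track number.
  if comment.getbyte(28) == 0 && comment.getbyte(29) != 0
    tag[:track] = comment.getbyte(29)
    comment = comment.byteslice(0, 28)
  end

  tag[:comment] = comment.delete("\0").strip
  tag
end
```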

You could also look into using an ID3 parsing library, such as id3.rb or id3lib-ruby; however, I'm not sure whether either supports parsing a remote file (most likely they could with some modifications).
