How to get book metadata?


Problem description

    My application needs to retrieve information about any published book based on a provided ISBN, title, or author. This is hardly a unique requirement---sites like Amazon.com, Chegg.com, and even software like Book Collector seem to be able to do this easily. But I have not been able to replicate it.

    To clarify, I do not need to search the entire database of books---only the limited subset that has been entered, as in a book collection. The database would simply allow me to tag the entered books with the necessary metadata to enable search on that subset of books. So scale is not the issue here---getting the metadata is.

    The options I have tried are:

    1. Scrape Amazon. Scraping the regular Amazon pages was not very robust to things like missing authors, and while scraping the smaller mobile pages was faster, they shared the same issues with robustness of extraction. Plus, building this into an application is a clear violation of Amazon's Terms of Service.
    2. Scrape the Library of Congress. While this seems to have fewer legal ramifications, ease and robustness were again issues.
    3. ISBNdb.com API. While the service is free up to a point, and does a good job of returning the necessary metadata, I need to do this for over 500 books on a daily basis, at which point this service costs money proportional to use. I'd prefer a free or one-time payment solution that allows me to do the same.
    4. Google Book Data API. While this seems to provide the information I need, I cannot display the book preview as their terms of service require.
    5. Buy a license to a database of books. For example, companies like Ingram or Baker & Taylor provide these catalogs to retailers and libraries. This solution is obviously expensive, so I'm hoping that there's a more elegant solution I've missed. But if not, and someone on SO has had a good experience with a particular database, I'm willing to go with that.

    I've tried to describe my approach in detail so others with fewer books can take advantage of the above solutions. But given my requirements, I'm at my wits' end for retrieving book metadata, so any pointers are greatly appreciated.

    Solution

    Since you are unlikely to need the same 500 books every day, store the data retrieved from isbndb.com in a database and fill it up book by book, so that each book is looked up remotely only once.
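
The caching idea in this answer could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the `MetadataCache` class, the `books` table layout, and the `fetch_remote` callable are all assumptions made for the example. `fetch_remote` stands in for whatever function actually queries isbndb.com (or another source) and returns a dictionary of metadata; no real API call is shown here.

```python
import json
import sqlite3
from typing import Callable, Dict


class MetadataCache:
    """Local SQLite cache keyed by ISBN; the remote source is queried only on a miss."""

    def __init__(self, db_path: str, fetch_remote: Callable[[str], Dict]):
        # fetch_remote is a hypothetical stand-in for the real metadata lookup.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS books ("
            "isbn TEXT PRIMARY KEY, metadata TEXT NOT NULL)"
        )
        self.fetch_remote = fetch_remote

    @staticmethod
    def normalize(isbn: str) -> str:
        # Strip hyphens and spaces so differently formatted ISBNs share one row.
        return isbn.replace("-", "").replace(" ", "").upper()

    def get(self, isbn: str) -> Dict:
        key = self.normalize(isbn)
        row = self.conn.execute(
            "SELECT metadata FROM books WHERE isbn = ?", (key,)
        ).fetchone()
        if row is not None:
            # Cache hit: no remote request is made.
            return json.loads(row[0])
        # Cache miss: fetch once, store, and return.
        metadata = self.fetch_remote(key)
        self.conn.execute(
            "INSERT INTO books (isbn, metadata) VALUES (?, ?)",
            (key, json.dumps(metadata)),
        )
        self.conn.commit()
        return metadata
```

With something like this in place, only books not yet in the local table count against the remote service's quota, so the daily number of paid lookups depends on how many new books arrive rather than on how many are processed.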
