How to get metadata from a URL


Problem description

I have just started using JavaScript and I want to fetch metadata from a URL. When any URL is entered into the input field, the page should pull the metadata from it. Below is the basic HTML and JavaScript I am using; it throws an error when executed.

I have been searching for alternatives, but nothing has helped. Please suggest how to achieve this functionality.

<!DOCTYPE html>
<html>
<head>
  <meta name="description" content="Free Web tutorials">
  <meta name="keywords" content="HTML5,CSS,JavaScript">
  <meta name="author" content="John Doe">
  <meta content="http://stackoverflow.com/favicon.ico">
</head>
<body>

<p>Click the button to return the value of the content attribute of all meta elements.</p>

<button onclick="myFunction()">Try it</button>

<p id="demo"></p>

<script>
function myFunction() {
  var x = "https://www.amazon.in/";
  // var x = document.getElementsByTagName("META");
  var txt = "";
  var i;
  for (i = 0; i < x.length; i++) {
    txt = txt + "Content of " + (i + 1) + ". meta tag: " + x[i].content + "<br>";
  }

  document.getElementById("demo").innerHTML = txt;
}
</script>

</body>
</html>

Solution

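I assume you are trying to build a metadata scraper in JavaScript.
Before going any further, you need to take the CORS policy into account: when a page requests data from a URL on another origin, the browser lets the script read the response only if that server allows it through CORS headers.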

Reference URLs:

  1. https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
  2. https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS/Errors
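
Whether CORS applies depends on whether the target URL has a different origin (scheme, host, and port) from the page making the request. As a minimal sketch, the standard `URL` API can compare the two. `isCrossOrigin` is a hypothetical helper name, not part of any library, and the example.com URLs are placeholders for illustration.

```javascript
// Hypothetical helper: returns true when targetUrl has a different origin
// (scheme + host + port) than pageUrl, which is when CORS rules apply.
function isCrossOrigin(pageUrl, targetUrl) {
  const page = new URL(pageUrl);
  // Relative target URLs are resolved against the page URL.
  const target = new URL(targetUrl, page);
  return page.origin !== target.origin;
}

console.log(isCrossOrigin("https://example.com/index.html", "https://www.amazon.in/")); // true
console.log(isCrossOrigin("https://example.com/index.html", "/about.html"));            // false
```

In a browser, `window.location.href` would typically serve as the page URL. When the result is `true`, the script can read the response only if the target server sends CORS headers that permit the page's origin.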

JSFiddle: http://jsfiddle.net/pgrmL73h/

The fiddle demonstrates how to fetch the meta tags from a given URL. For demo purposes I used https://jsfiddle.net/ as the URL; you can change it as needed.

I followed the steps below to retrieve the META tags from a website.

  1. To retrieve the page source of a website, you first need to request that URL. You can do this with the jQuery AJAX method.
    Reference URL: https://api.jquery.com/jquery.ajax/

  2. The $.parseHTML method from jQuery is used to turn the HTML string into DOM elements.
    Reference URL: https://api.jquery.com/jquery.parsehtml/

  3. Once the AJAX request has retrieved the page source successfully, each DOM element from the page source is checked, only the META nodes we need are kept, and their data is stored in a "txt" variable.
    E.g. tags such as keywords and description will be retrieved.

  4. Once the AJAX request has completed, the contents of the "txt" variable are displayed inside a paragraph tag.
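
The same steps can also be sketched without jQuery. This is an illustrative variant, not the code from the fiddle: it assumes a browser that provides `fetch` and `DOMParser`, and `collectMeta` and `fetchMeta` are hypothetical helper names. The CORS restriction applies to `fetch` just as it does to `$.ajax`.

```javascript
// Step 3: keep only the elements that have a name attribute and pair each
// name with its content (falling back to value, then to an empty string).
// Works on any array-like of objects that expose getAttribute().
function collectMeta(elements) {
  const result = [];
  for (const el of Array.from(elements)) {
    const name = el.getAttribute("name");
    if (name !== null && name !== undefined) {
      const value = el.getAttribute("content") || el.getAttribute("value") || "";
      result.push({ name: name, content: value });
    }
  }
  return result;
}

// Steps 1 and 2 (browser only): download the page source, parse it into a
// document, and pass its <meta> elements to collectMeta.
async function fetchMeta(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error("Unable to retrieve webpage source HTML");
  }
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, "text/html");
  return collectMeta(doc.querySelectorAll("meta"));
}
```

Step 4 would then consume the promise returned by `fetchMeta("https://jsfiddle.net/")` and write each `name=content` pair into the `demo` paragraph.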

JS Code:

function myFunction() {
  var txt = "";
  document.getElementById("demo").innerHTML = txt;
  // Sample URL; make this dynamic as needed (for example, read it from an input field).
  // AJAX is used here only to request the URL and get the page source, much like
  // cURL or file_get_contents (https://www.php.net/manual/en/function.file-get-contents.php) in PHP.
  $.ajax({
    url: "https://jsfiddle.net/",
    dataType: "html", // treat the response as an HTML string
    error: function () {
      txt = "Unable to retrieve webpage source HTML";
    },
    success: function (response) {
      // The response arrives as a string; $.parseHTML turns it into an array of DOM nodes.
      // Reference: https://api.jquery.com/jquery.parsehtml/
      var nodes = $.parseHTML(response);
      $.each(nodes, function (i, el) {
        // Keep only META elements that carry a name attribute.
        if (el.nodeName.toLowerCase() === "meta" && $(el).attr("name") != null) {
          var name = $(el).attr("name");
          // Prefer the content attribute, fall back to value, else an empty string.
          var value = $(el).attr("content") || $(el).attr("value") || "";
          txt += name + "=" + value + "<br>";
          console.log(name, "=", value, el);
        }
      });
    },
    complete: function () {
      // Runs after success or error, so txt holds either the tags or the error message.
      document.getElementById("demo").innerHTML = txt;
    }
  });
}
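
To make the URL dynamic, as the question asks, the hard-coded string can be replaced with the value of an input field. The sketch below only validates that value before it is handed to `$.ajax`. `normalizeUrl` and the `urlInput` element id are assumptions for illustration and do not appear in the original fiddle.

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null when
// the input is empty, malformed, or uses another scheme (e.g. javascript:).
function normalizeUrl(input) {
  const text = String(input).trim();
  if (text === "") {
    return null;
  }
  try {
    const parsed = new URL(text);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return null;
    }
    return parsed.href;
  } catch (e) {
    // new URL() throws a TypeError for strings it cannot parse.
    return null;
  }
}

console.log(normalizeUrl("  https://www.amazon.in  ")); // "https://www.amazon.in/"
console.log(normalizeUrl("not a url"));                 // null
```

Inside `myFunction`, this could be used as `var url = normalizeUrl(document.getElementById("urlInput").value);`, passing `url` to `$.ajax` only when it is not `null`.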
