Canonical Header Links for PDF and Image files in .htaccess

Question

I'm attempting to set up canonical links for a number of PDF and image files on my website.

Example folder structure:

/index.php
/docs/
    file.pdf
    /folder1/
        file.pdf
    /folder2/
        file1.pdf
        file2.pdf
/img/
    sprite.png
    /slideshow/
        slide1.jpg
        slide2.jpg

Example PDF URL to canonical URL: http://www.example.com/docs/folder1/file.pdf --> http://www.example.com/products/folder1/

I am trying to avoid having to put individual .htaccess files in each of the sub-folders that contain my images and PDFs. I currently have 7 "main" folders, each of which has anywhere from 2 to 10 sub-folders, and most sub-folders have their own sub-folders. I have roughly 80 PDFs, and even more images.

I'm looking for a (semi-)dynamic solution where all files in a certain folder will have the canonical link set to a single URL. I want to keep as much as possible in a single .htaccess file.

I know that <Files> and <FilesMatch> do not understand paths, and that <Directory> and <DirectoryMatch> don't work in .htaccess files.

Is there a fairly simple way to accomplish this?

Recommended answer

I don't know of a way to solve this with Apache rules alone, as it would require some sort of regex matching and reusing the result of the match in a directive, which isn't possible.

However, it's pretty simple if you introduce a PHP script into the mix:

RewriteEngine On
RewriteCond %{REQUEST_URI} \.(jpg|png|pdf)$
RewriteRule (.*) /canonical-header.php?path=$1

Note that this sends requests for all jpg, png and pdf files to the script, regardless of the folder name. If you want to include only specific folders, you could add another RewriteCond to accomplish that.
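
As a rough sketch of that extra condition, the rules could be limited to the /docs/ and /img/ folders from the example structure in the question. The folder names are assumptions taken from that example, and this is untested; consecutive RewriteCond lines are combined with AND by default, so both must match before the rule fires.

```apache
RewriteEngine On
# Assumption: only files under /docs/ or /img/ should get the canonical header
RewriteCond %{REQUEST_URI} ^/(docs|img)/
# Same extension check as in the rules above
RewriteCond %{REQUEST_URI} \.(jpg|png|pdf)$
RewriteRule (.*) /canonical-header.php?path=$1
```

Requests for files outside those two folders would then be served by Apache as usual, without passing through the script.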

Now the canonical-header.php script:

<?php

// Checking for the presence of the path variable in the query string allows us to easily 404 any requests that
// come directly to this script, just to be safe.
if (!empty($_GET['path'])) {
    // Be sure to add any new file types you want to handle here so the correct content-type header will be sent.
    $mimeTypes = array(
        'pdf' => 'application/pdf',
        'jpg' => 'image/jpeg',
        'png' => 'image/png',
    );

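    $path         = filter_input(INPUT_GET, 'path', FILTER_SANITIZE_URL);
    // This script sits in the document root, so resolve the requested path against its own directory.
    $root         = __DIR__;
    $file         = realpath($root . '/' . $path); // false if the file does not exist
    $extension    = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // The canonical URL is the folder containing the file; adjust this if it should map elsewhere.
    $canonicalUrl = 'http://' . $_SERVER['HTTP_HOST'] . '/' . dirname($path);

    // Verify that the type is known and that the file is readable and inside the document root, or send 404
    if (isset($mimeTypes[$extension]) && $file !== false
        && strpos($file, $root . DIRECTORY_SEPARATOR) === 0 && is_readable($file)) {
        header('Content-Type: ' . $mimeTypes[$extension]);
        header('Link: <' . $canonicalUrl . '>; rel="canonical"');
        readfile($file);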
    } else {
        header('HTTP/1.0 404 Not Found');
        echo "File not found";
    }
} else {
    header('HTTP/1.0 404 Not Found');
    echo "File not found";
}

Please consider this code untested, and check that it works as expected across browsers before releasing it to production.
