Break up/parse a URL into its constituent parts in PHP

Question

What is the best way to parse a URL into its constituent parts, such that

https://www.example.com/path/to/directory/file.jpeg?param1=foo&param2=bar

results in an array holding

Array(
    ["scheme"] => "https",
    ["host"] => "www.example.com",
    ["directory"] => "path/to/directory",
    ["filename"] => "file",
    ["extension"] => "jpeg",
    ["path"] => "path/to/directory/file.jpeg",
    ["file"] => "file.jpeg",
    ["params"] => Array(
        ["param1"] => "foo",
        ["param2"] => "bar"
    )
)

Note: The keys do not need to be named like this; they are just an example.

I have looked into parse_url, but it doesn't split the path finely enough, so further manual processing seems inevitable.

Sidenote: I have looked into many questions and answers, but I can't find a definitive reference, hence my question.

Recommended answer

The best way is to combine several built-in PHP functions: parse_url (for the basic URL parts), parse_str (for the query parameters) and pathinfo (for the directory, filename and extension parts).

parse_url parses the URL and splits it into an associative array containing the following keys (only those present in the URL):

  • scheme (http, https, ftp, ...)
  • host (www.example.com)
  • port
  • user
  • pass
  • path (this will need further processing)
  • query (this will need further processing)
  • fragment (the anchor/hashbang part, anything after the hash mark)
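
As a minimal sketch of the keys listed above, the snippet below runs parse_url on a made-up URL (the credentials, port and fragment are illustrative assumptions, not values from the question) and prints each component it finds. It also shows the optional second argument, which returns a single component instead of the whole array.

```php
<?php
// Hypothetical URL, chosen so that every key is present.
$url = 'https://user:secret@www.example.com:8080/path/to/file.jpeg?param1=foo#anchor';

$parts = parse_url($url);

// Only components that occur in the URL show up as keys.
foreach ($parts as $key => $value) {
    echo $key, ' => ', $value, PHP_EOL;
}

// A single component can be requested with a PHP_URL_* constant;
// the result is null when that component is absent.
$host = parse_url($url, PHP_URL_HOST);
echo $host, PHP_EOL;
```

Note that the port comes back as an integer while the other components are strings, and that a URL without a path (such as http://www.example.com) yields no path key at all.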

parse_str can be used to parse the query part returned by parse_url into an associative array, which is multidimensional when the query uses bracket syntax.
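
To illustrate, here is a short sketch that feeds a query string (the same shape as the one used in the test URLs further down) to parse_str. The variable names are only placeholders.

```php
<?php
// Query string as it would appear under the 'query' key from parse_url.
$query = 'param1=foo&param2=bar&param3[1]=abc&param3[2]=def';

// Pass the second argument: the parsed result is written into it by
// reference. Since PHP 8.0 this argument is mandatory.
parse_str($query, $params);

// Bracket syntax in the query becomes a nested array.
var_export($params);
echo PHP_EOL;
```

parse_str also URL-decodes names and values, so the array holds the decoded strings rather than the raw query text.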

pathinfo can be used to parse the path part returned by parse_url into an associative array, which can contain the following keys:

[dirname] => /path/to/directory
[basename] => file.jpeg
[extension] => jpeg
[filename] => file
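
A small sketch of this step, assuming the path has already been extracted with parse_url. The second call uses a path without a dot in its last segment to show that the extension key is then omitted, which is why code consuming the result should check for the key before reading it.

```php
<?php
// Path as it would appear under the 'path' key from parse_url.
$info = pathinfo('/path/to/directory/file.jpeg');

foreach ($info as $key => $value) {
    echo $key, ' => ', $value, PHP_EOL;
}

// No dot in the last segment: pathinfo leaves out the 'extension' key.
$noExtension = pathinfo('/passwords');
var_export(array_key_exists('extension', $noExtension));
echo PHP_EOL;
```

pathinfo is designed for filesystem paths, so it knows nothing about URL semantics; it simply splits the string on directory separators and the last dot.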

Putting it all together

function decompose_url($url) {
    $parts = parse_url($url);
    if (!$parts) {
        # For seriously malformed urls
        return false;
    }
    # Just for good measure, throw in the top level domain, if there is a host with a top level domain
    if (array_key_exists('host', $parts) && strrpos($parts['host'], '.') !== false) {
        $domain_parts = explode('.', $parts['host']);
        $parts['tld'] = end($domain_parts);
    }
    if (array_key_exists('path', $parts)) {
        $pathinfo = pathinfo($parts['path']);
        if (empty($pathinfo['basename'])) {
            # With an empty basename, extension and filename will also be empty
            unset($pathinfo['basename'], $pathinfo['extension'], $pathinfo['filename']);
        }
        $parts = array_merge($parts, $pathinfo);
    }
    if (array_key_exists('query', $parts)) {
        parse_str($parts['query'], $query_parts);
        $parts['query_parts'] = $query_parts;
    }
    return $parts;
}

Testing it

$urls = [
    'http://www.example.com/',
    'http://www.example.com',
    'http://www.example.com/test/.jpg',
    'http://www.example.com/test/.',
    'https://anonymous:dCU7egW1A1L0a6pxU3qu9@www.example.com:8080/path/to/directory/file.jpeg?param1=foo&param2=bar&param3[1]=abc&param3[2]=def#anchor',
    'ftp://anonymous@ftp.example.com/pub/test.jpg',
    'file:///home/user/.config/test.config',
    'chrome://settings/passwords',
];

foreach ($urls as $url) {
    echo $url, PHP_EOL;
    var_export(decompose_url($url));
    echo PHP_EOL, PHP_EOL;
}

will yield these corresponding results:

http://www.example.com/
array (
  'scheme' => 'http',
  'host' => 'www.example.com',
  'path' => '/',
  'tld' => 'com',
  'dirname' => '/',
)

http://www.example.com
array (
  'scheme' => 'http',
  'host' => 'www.example.com',
  'tld' => 'com',
)

http://www.example.com/test/.jpg
array (
  'scheme' => 'http',
  'host' => 'www.example.com',
  'path' => '/test/.jpg',
  'tld' => 'com',
  'dirname' => '/test',
  'basename' => '.jpg',
  'extension' => 'jpg',
  'filename' => '',
)

http://www.example.com/test/.
array (
  'scheme' => 'http',
  'host' => 'www.example.com',
  'path' => '/test/.',
  'tld' => 'com',
  'dirname' => '/test',
  'basename' => '.',
  'extension' => '',
  'filename' => '',
)

https://anonymous:dCU7egW1A1L0a6pxU3qu9@www.example.com:8080/path/to/directory/file.jpeg?param1=foo&param2=bar&param3[1]=abc&param3[2]=def#anchor
array (
  'scheme' => 'https',
  'host' => 'www.example.com',
  'port' => 8080,
  'user' => 'anonymous',
  'pass' => 'dCU7egW1A1L0a6pxU3qu9',
  'path' => '/path/to/directory/file.jpeg',
  'query' => 'param1=foo&param2=bar&param3[1]=abc&param3[2]=def',
  'fragment' => 'anchor',
  'tld' => 'com',
  'dirname' => '/path/to/directory',
  'basename' => 'file.jpeg',
  'extension' => 'jpeg',
  'filename' => 'file',
  'query_parts' => 
  array (
    'param1' => 'foo',
    'param2' => 'bar',
    'param3' => 
    array (
      1 => 'abc',
      2 => 'def',
    ),
  ),
)

ftp://anonymous@ftp.example.com/pub/test.jpg
array (
  'scheme' => 'ftp',
  'host' => 'ftp.example.com',
  'user' => 'anonymous',
  'path' => '/pub/test.jpg',
  'tld' => 'com',
  'dirname' => '/pub',
  'basename' => 'test.jpg',
  'extension' => 'jpg',
  'filename' => 'test',
)

file:///home/user/.config/test.config
array (
  'scheme' => 'file',
  'path' => '/home/user/.config/test.config',
  'dirname' => '/home/user/.config',
  'basename' => 'test.config',
  'extension' => 'config',
  'filename' => 'test',
)

chrome://settings/passwords
array (
  'scheme' => 'chrome',
  'host' => 'settings',
  'path' => '/passwords',
  'dirname' => '/',
  'basename' => 'passwords',
  'filename' => 'passwords',
)
