How to convert an integer to the shortest URL-safe string in Python?

Problem description

I want the shortest possible way of representing an integer in a URL. For example, 11234 can be shortened to '2be2' using hexadecimal. Since base64 is a 64-character encoding, it should be possible to represent an integer in base64 using even fewer characters than hexadecimal. The problem is I can't figure out the cleanest way to convert an integer to base64 (and back again) using Python.

The base64 module has methods for dealing with bytestrings - so maybe one solution would be to convert an integer to its binary representation as a Python string... but I'm not sure how to do that either.
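
For reference, the byte-string route described above could look roughly like the following in Python 3. This is only a sketch: the helper names int_to_b64 and b64_to_int are made up for illustration, it relies on int.to_bytes / int.from_bytes (Python 3 only), and it handles non-negative integers only.

```python
import base64

def int_to_b64(n):
    """Encode a non-negative integer via its big-endian bytes and URL-safe Base64."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Smallest number of bytes that can hold n (at least one byte, so 0 works).
    length = max(1, (n.bit_length() + 7) // 8)
    raw = n.to_bytes(length, "big")
    # Strip the '=' padding, which carries no information here.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def b64_to_int(s):
    """Inverse of int_to_b64: restore the padding, decode, rebuild the integer."""
    padded = s + "=" * (-len(s) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")

print(int_to_b64(11234), b64_to_int(int_to_b64(11234)))
```

Because the integer is first rounded up to whole bytes, this route can produce a string one character longer than a direct base-64 conversion for some values.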

Recommended answer

This answer is similar in spirit to Douglas Leeder's, with the following changes:


  • It doesn't use actual Base64, so there are no padding characters
  • Instead of converting the number first to a byte-string (base 256), it converts it directly to base 64, which has the advantage of letting you represent negative numbers using a sign character.

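import string

# 64-symbol alphabet, in the same order as Python's urlsafe_b64encode
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + \
           string.digits + '-_'
# Reverse lookup: character -> digit value
ALPHABET_REVERSE = dict((c, i) for (i, c) in enumerate(ALPHABET))
BASE = len(ALPHABET)
# Prefix marking a negative number; it must not appear in ALPHABET
SIGN_CHARACTER = '$'

def num_encode(n):
    if n < 0:
        return SIGN_CHARACTER + num_encode(-n)
    s = []
    # Peel off base-64 digits, least significant first
    while True:
        n, r = divmod(n, BASE)
        s.append(ALPHABET[r])
        if n == 0:
            break
    # Digits were collected in reverse order
    return ''.join(reversed(s))

def num_decode(s):
    if s[0] == SIGN_CHARACTER:
        return -num_decode(s[1:])
    n = 0
    # Accumulate digits, most significant first
    for c in s:
        n = n * BASE + ALPHABET_REVERSE[c]
    return n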


    >>> num_encode(0)
    'A'
    >>> num_encode(64)
    'BA'
    >>> num_encode(-(64**5-1))
    '$_____'
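
To get a feel for how much shorter base 64 actually is, the small sketch below counts digits in bases 10, 16 and 64 for a few values. The helper digits_needed is hypothetical and is not part of the answer's code; it just repeats the same division the encoder performs.

```python
def digits_needed(n, base):
    """Count the digits of a non-negative integer n written in the given base."""
    if n < 0 or base < 2:
        raise ValueError("expected n >= 0 and base >= 2")
    count = 1
    # Each division by the base removes one digit.
    while n >= base:
        n //= base
        count += 1
    return count

for value in (11234, 2**32 - 1, 2**64 - 1):
    # Digit counts in decimal, hexadecimal and base 64, in that order.
    print(value, [digits_needed(value, b) for b in (10, 16, 64)])
```

For 11234 this gives 5, 4 and 3 characters respectively, so the saving over hexadecimal is real but modest.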


A few side notes:


  • You could (marginally) increase the human readability of the base-64 numbers by putting string.digits first in the alphabet (and making the sign character '-'); I chose the order that I did based on Python's urlsafe_b64encode.
  • If you're encoding a lot of negative numbers, you could increase the efficiency by using a sign bit or one's/two's complement instead of a sign character.
  • You should be able to easily adapt this code to different bases by changing the alphabet, either to restrict it to only alphanumeric characters or to add additional "URL-safe" characters.
  • I would recommend against using a representation other than base 10 in URIs in most cases, unless you're going for something TinyURL-esque: it adds complexity and makes debugging harder without significant savings compared to the overhead of HTTP.
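
As a rough illustration of the first three notes, here is one possible variant. Everything in it is an assumption made for the example rather than part of the answer: the digits-first alphabet, the use of '~' as the 64th symbol (so that '-' is free to act as the sign character), and the helper names encode, decode, with_sign_bit and from_sign_bit.

```python
import string

# Hypothetical alphabet: digits first so small numbers read like decimal.
# '-' is reserved for the sign, so '~' (another unreserved URI character)
# takes its place to keep 64 symbols.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "_~"
INDEX = {c: i for i, c in enumerate(ALPHABET)}
BASE = len(ALPHABET)
SIGN = "-"

def encode(n):
    """Same algorithm as num_encode, parameterised only by ALPHABET and SIGN."""
    if n < 0:
        return SIGN + encode(-n)
    digits = []
    while True:
        n, r = divmod(n, BASE)
        digits.append(ALPHABET[r])
        if n == 0:
            break
    return "".join(reversed(digits))

def decode(s):
    if s.startswith(SIGN):
        return -decode(s[1:])
    n = 0
    for c in s:
        n = n * BASE + INDEX[c]
    return n

def with_sign_bit(n):
    """Fold the sign into the lowest bit so no separate sign character is needed."""
    return (abs(n) << 1) | (1 if n < 0 else 0)

def from_sign_bit(u):
    magnitude = u >> 1
    return -magnitude if u & 1 else magnitude

print(encode(9), encode(64), encode(-64), encode(with_sign_bit(-5)))
```

With the sign bit, a negative number costs one extra bit rather than one extra character, which only pays off when negative values are common.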
