Understanding WeightedKappaLoss using Keras


Question


I'm using Keras to try to predict a vector of scores (0-1) using a sequence of events.

For example, X is a sequence of 3 vectors comprised of 6 features each, while y is a vector of 3 scores:

X
[
  [1,2,3,4,5,6], <--- dummy data
  [1,2,3,4,5,6],
  [1,2,3,4,5,6]
]

y
[0.34 ,0.12 ,0.46] <--- dummy data

I want to address the problem as ordinal classification, so if the actual values are [0.5,0.5,0.5], the prediction [0.49,0.49,0.49] is better than [0.3,0.3,0.3]. My original solution was to use a sigmoid activation on my last layer and mse as the loss function, so the output of each output neuron is in the range 0-1:

def get_model(num_samples, num_features, output_size):
    opt = Adam()
    model = Sequential()
    
    model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation'], input_shape=(num_samples, num_features)))
    model.add(Dropout(config['dropout_rate']))

    for layer in config['dense_layers']:
      model.add(Dense(layer['neurons'], activation=layer['activation']))

    model.add(Dense(output_size, activation='sigmoid'))
    model.compile(loss='mse', optimizer=opt, metrics=['mae', 'mse'])

    return model

My Goal is to understand the usage of WeightedKappaLoss and to implement it on my actual data. I've created this Colab to fiddle around with the idea. In the Colab, my data is a sequence shaped (5000,3,3) and my targets shape is (5000, 4) representing 1 of 4 possible classes.

I want the model to understand that it needs to truncate the floating-point values of X in order to predict the right y class:

[[3.49877793, 3.65873511, 3.20218196],
 [3.20258153, 3.7578669 , 3.83365481],
 [3.9579924 , 3.41765455, 3.89652426]], ----> y is 3 [0,0,1,0]

[[1.74290875, 1.41573056, 1.31195701],
 [1.89952004, 1.95459796, 1.93148095],
 [1.18668981, 1.98982041, 1.89025326]], ----> y is 1 [1,0,0,0]

New model code:

def get_model(num_samples, num_features, output_size):
    opt = Adam(learning_rate=config['learning_rate'])
    model = Sequential()
    
    model.add(LSTM(config['lstm_neurons'], activation=config['lstm_activation'], input_shape=(num_samples, num_features)))
    model.add(Dropout(config['dropout_rate']))

    for layer in config['dense_layers']:
      model.add(Dense(layer['neurons'], activation=layer['activation']))

    model.add(Dense(output_size, activation='softmax'))
    model.compile(loss=tfa.losses.WeightedKappaLoss(num_classes=4), optimizer=opt, metrics=[tfa.metrics.CohenKappa(num_classes=4)])

    return model

When fitting the model I can see the following metrics on TensorBoard:

I'm not sure about the following points and would appreciate clarification:

  • Am I using it right?
  • In my original problem, I'm predicting 3 scores, as opposed to the Colab example, where I'm predicting only 1. If I'm using WeightedKappaLoss, does that mean I need to convert each of the scores into a 100-element one-hot encoded vector?
  • Is there a way to use the WeightedKappaLoss on the original floating point scores without converting to a classification problem?

Solution

Let's separate the goal into two sub-goals: we first walk through the purpose, concept, and mathematical details of Weighted Kappa, and then summarize the things to note when trying to use WeightedKappaLoss in TensorFlow

PS: you can skip the explanation part if you only care about usage


Weighted Kappa detailed explanation

Since Weighted Kappa can be seen as Cohen's kappa + weights, we need to understand Cohen's kappa first

Example of Cohen's kappa

Suppose we have two classifiers (A and B) trying to classify 50 statements into two categories (True and False). The way they classify those statements with respect to each other is shown in a contingency table:

               B
             True  False
A   True      20     5     25 statements A thinks are true
    False     10    15     25 statements A thinks are false
              30    20
   (30 statements B thinks are true, 20 statements B thinks are false)

Now suppose we want to know: how reliable are the predictions A and B made?

What we can do is simply take the percentage of classified statements on which A and B agree with each other, i.e. the proportion of observed agreement, denoted Po, so:

Po = (20 + 15) / 50 = 0.7

But this is problematic, because there is some probability that A and B agree with each other by random chance, i.e. the proportion of expected chance agreement, denoted Pe. If we use the observed percentages as the expected probabilities, then:

Pe = (probability A thinks true)  * (probability B thinks true) +
     (probability A thinks false) * (probability B thinks false)
   = (25 / 50) * (30 / 50) +
     (25 / 50) * (20 / 50)
   = 0.5

Cohen's kappa coefficient, denoted K, incorporates Po and Pe to give us a more robust estimate of how reliable the predictions made by A and B are:

K = (Po - Pe) / (1 - Pe) = 1 - (1 - Po) / (1 - Pe) = 1 - (1 - 0.7) / (1 - 0.5) = 0.4

We can see that the more A and B agree with each other (higher Po) and the less they agree by chance (lower Pe), the more Cohen's kappa "thinks" the result is reliable

Now assume A is the labels (ground truth) of the statements; then K tells us how reliable B's predictions are, i.e. how much the predictions agree with the labels when random chance is taken into consideration
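The arithmetic above can be replayed in a few lines; here is a small NumPy sketch (an illustration, not part of the original answer) that computes Po, Pe, and K from the contingency table:

```python
import numpy as np

# Contingency table from the example: rows = classifier A, cols = classifier B
table = np.array([[20, 5],
                  [10, 15]])

N = table.sum()                      # 50 statements in total
po = np.trace(table) / N             # observed agreement: (20 + 15) / 50
row_marg = table.sum(axis=1) / N     # P(A = class), the row marginals
col_marg = table.sum(axis=0) / N     # P(B = class), the column marginals
pe = (row_marg * col_marg).sum()     # expected chance agreement
kappa = (po - pe) / (1 - pe)

print(po, pe, kappa)                 # Po = 0.7, Pe = 0.5, K ≈ 0.4
```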

Weights for Cohen's kappa

We formally define the contingency table with k classes:

                                    classifier 2
                       class.1  class.2  class... class.k  Sum over row
               class.1   n11      n12      ...      n1k      n1+  
               class.2   n21      n22      ...      n2k      n2+  
classifier 1   class...  ...      ...      ...      ...      ...  
               class.k   nk1      nk2      ...      nkk      nk+  
       Sum over column   n+1      n+2      ...      n+k      N   # total sum of all table cells

The table cells contain the counts of cross-classified categories, denoted nij, where i and j are the row and column indices respectively

Now consider k ordinal classes instead of two categorical classes, e.g. splitting the range 1, 0 into five classes 1, 0.75, 0.5, 0.25, 0, which form a smooth, ordered transition. We can no longer say the classes are independent of one another: e.g. for very good, good, normal, bad, very bad, the classes very good and good are not independent, and good should be closer to bad than to very bad

Since adjacent classes are interdependent, in order to calculate the quantities related to agreement we need to define this dependency, i.e. Weights, denoted Wij, assigned to each cell in the contingency table. The value of a weight (within the range [0, 1]) depends on how close the two classes are

Now let's look at the Po and Pe formulas in Weighted Kappa:

Po = (1 / N)  * Σij Wij * nij
Pe = (1 / N²) * Σij Wij * ni+ * n+j

And the Po and Pe formulas in Cohen's kappa:

Po = (1 / N)  * Σi nii
Pe = (1 / N²) * Σi ni+ * n+i

We can see that the Po and Pe formulas in Cohen's kappa are a special case of the formulas in Weighted Kappa, with weight = 1 assigned to all diagonal cells and weight = 0 elsewhere. When we calculate K (Cohen's kappa coefficient) using the Po and Pe formulas of Weighted Kappa, we also take the dependency between adjacent classes into consideration
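As a sanity check of the special-case claim above, here is a small NumPy sketch (an illustration, not library code) that computes kappa from a contingency table and an arbitrary weight matrix; with identity weights it reduces to plain Cohen's kappa:

```python
import numpy as np

def weighted_kappa(table, weights):
    """Weighted kappa for a k x k contingency table of counts.

    `weights` holds agreement weights in [0, 1];
    np.eye(k) recovers plain Cohen's kappa.
    """
    N = table.sum()
    observed = table / N                                            # p_ij
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / N**2
    po = (weights * observed).sum()
    pe = (weights * expected).sum()
    return (po - pe) / (1 - pe)

table = np.array([[20, 5],
                  [10, 15]])
weighted_kappa(table, np.eye(2))   # identity weights -> Cohen's kappa ≈ 0.4
```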

Here are two commonly used weighting systems:

  1. Linear weight: Wij = 1 - |i - j| / (k - 1)

  2. Quadratic weight: Wij = 1 - (i - j)² / (k - 1)²

Where |i - j| is the distance between classes and k is the number of classes
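Both weight matrices are easy to generate; a minimal sketch, assuming the standard linear and quadratic agreement-weight definitions above:

```python
import numpy as np

k = 5                                            # number of classes
i, j = np.indices((k, k))                        # row/column index grids

linear_w = 1 - np.abs(i - j) / (k - 1)           # Wij = 1 - |i-j|/(k-1)
quadratic_w = 1 - (i - j) ** 2 / (k - 1) ** 2    # Wij = 1 - (i-j)^2/(k-1)^2

# Diagonal cells (i == j) get weight 1; the farthest corners get weight 0
print(linear_w[0])      # -> [1, 0.75, 0.5, 0.25, 0]
```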

Weighted Kappa Loss

This loss is used in the case we mentioned before, where one classifier is the labels. The purpose of this loss is to make the model's (the other classifier's) predictions as reliable as possible, i.e. to encourage the model to make predictions that agree with the labels while making fewer random guesses, taking the dependency between adjacent classes into consideration

The formula of Weighted Kappa Loss is given by:

Loss = log( Σij dij * Oij / Σij dij * Eij )

It simply takes the formula of the negative Cohen's kappa coefficient, gets rid of the constant -1, and applies the natural logarithm to it, where dij = |i - j| for linear weight and dij = (i - j)² for quadratic weight, Oij is the observed (confusion) matrix, and Eij is the expected matrix built from the marginals

Following is the source code of Weighted Kappa Loss from TensorFlow Addons; as you can see, it just implements the formula of Weighted Kappa Loss above:

import warnings
from typing import Optional

import tensorflow as tf
from typeguard import typechecked

from tensorflow_addons.utils.types import Number

class WeightedKappaLoss(tf.keras.losses.Loss):
    @typechecked
    def __init__(
        self,
        num_classes: int,
        weightage: Optional[str] = "quadratic",
        name: Optional[str] = "cohen_kappa_loss",
        epsilon: Optional[Number] = 1e-6,
        dtype: Optional[tf.DType] = tf.float32,
        reduction: str = tf.keras.losses.Reduction.NONE,
    ):
        super().__init__(name=name, reduction=reduction)
        warnings.warn(
            "The data type for `WeightedKappaLoss` defaults to "
            "`tf.keras.backend.floatx()`."
            "The argument `dtype` will be removed in Addons `0.12`.",
            DeprecationWarning,
        )
        if weightage not in ("linear", "quadratic"):
            raise ValueError("Unknown kappa weighting type.")

        self.weightage = weightage
        self.num_classes = num_classes
        self.epsilon = epsilon or tf.keras.backend.epsilon()
        label_vec = tf.range(num_classes, dtype=tf.keras.backend.floatx())
        self.row_label_vec = tf.reshape(label_vec, [1, num_classes])
        self.col_label_vec = tf.reshape(label_vec, [num_classes, 1])
        col_mat = tf.tile(self.col_label_vec, [1, num_classes])
        row_mat = tf.tile(self.row_label_vec, [num_classes, 1])
        if weightage == "linear":
            self.weight_mat = tf.abs(col_mat - row_mat)
        else:
            self.weight_mat = (col_mat - row_mat) ** 2

    def call(self, y_true, y_pred):
        y_true = tf.cast(y_true, dtype=self.col_label_vec.dtype)
        y_pred = tf.cast(y_pred, dtype=self.weight_mat.dtype)
        batch_size = tf.shape(y_true)[0]
        cat_labels = tf.matmul(y_true, self.col_label_vec)
        cat_label_mat = tf.tile(cat_labels, [1, self.num_classes])
        row_label_mat = tf.tile(self.row_label_vec, [batch_size, 1])
        if self.weightage == "linear":
            weight = tf.abs(cat_label_mat - row_label_mat)
        else:
            weight = (cat_label_mat - row_label_mat) ** 2
        numerator = tf.reduce_sum(weight * y_pred)
        label_dist = tf.reduce_sum(y_true, axis=0, keepdims=True)
        pred_dist = tf.reduce_sum(y_pred, axis=0, keepdims=True)
        w_pred_dist = tf.matmul(self.weight_mat, pred_dist, transpose_b=True)
        denominator = tf.reduce_sum(tf.matmul(label_dist, w_pred_dist))
        denominator /= tf.cast(batch_size, dtype=denominator.dtype)
        loss = tf.math.divide_no_nan(numerator, denominator)
        return tf.math.log(loss + self.epsilon)

    def get_config(self):
        config = {
            "num_classes": self.num_classes,
            "weightage": self.weightage,
            "epsilon": self.epsilon,
        }
        base_config = super().get_config()
        return {**base_config, **config}


Usage of Weighted Kappa Loss

We can use Weighted Kappa Loss whenever we can form our problem as an ordinal classification problem, i.e. the classes form a smooth ordered transition and adjacent classes are interdependent, like ranking something with very good, good, normal, bad, very bad, and the output of the model should be Softmax-like
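If you do want to recast continuous 0-1 scores as ordinal classes (the second question above), one possible sketch is to bucket each score into one of num_classes ordinal bins and one-hot encode it; score_to_onehot below is a hypothetical helper for illustration, not part of tensorflow_addons:

```python
import numpy as np

def score_to_onehot(scores, num_classes):
    """Bucket continuous scores in [0, 1] into ordinal classes, one-hot encoded."""
    idx = (np.asarray(scores) * num_classes).astype(int)
    idx = np.minimum(idx, num_classes - 1)   # a score of exactly 1.0 falls into the last bin
    return np.eye(num_classes)[idx]

score_to_onehot([0.34, 0.12, 0.46], 4)
# -> one-hot rows for classes 1, 0, 1
```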

We cannot use Weighted Kappa Loss when we try to predict a vector of scores (0-1), even if they sum to 1, since the weight of each element in the vector is different, and this loss does not ask how different the values are by subtraction, but rather how much probability mass falls on each class by multiplication, e.g.:

import tensorflow as tf
from tensorflow_addons.losses import WeightedKappaLoss

y_true = tf.constant([[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
                      [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])
y_pred_0 = tf.constant([[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
                      [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])
y_pred_1 = tf.constant([[0.0, 0.1, 0.9, 0.0], [0.1, 0.5, 0.3, 0.1],
                      [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])

kappa_loss = WeightedKappaLoss(weightage='linear', num_classes=4)
loss_0 = kappa_loss(y_true, y_pred_0)
loss_1 = kappa_loss(y_true, y_pred_1)
print('Loss_0: {}, loss_1: {}'.format(loss_0.numpy(), loss_1.numpy()))

Outputs:

# y_pred_0 equal to y_true yet loss_1 is smaller than loss_0
Loss_0: -0.7053321599960327, loss_1: -0.8015820980072021
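To see where those numbers come from, the loss can be replayed in plain NumPy; the sketch below mirrors the TFA call method shown earlier (an illustration, not the library code) for the linear-weight case and reproduces Loss_0:

```python
import numpy as np

def np_weighted_kappa_loss(y_true, y_pred, num_classes, eps=1e-6):
    # Linear-weight variant, mirroring tfa.losses.WeightedKappaLoss.call
    labels = np.arange(num_classes, dtype=float)
    cat_labels = y_true @ labels                          # expected class index per row
    weight = np.abs(cat_labels[:, None] - labels[None, :])
    numerator = (weight * y_pred).sum()                   # sum of d_ij * O_ij

    weight_mat = np.abs(labels[:, None] - labels[None, :])
    label_dist = y_true.sum(axis=0)                       # column mass of the labels
    pred_dist = y_pred.sum(axis=0)                        # column mass of the predictions
    denominator = label_dist @ weight_mat @ pred_dist / y_true.shape[0]
    return np.log(numerator / denominator + eps)

y_true = np.array([[0.1, 0.2, 0.6, 0.1], [0.1, 0.5, 0.3, 0.1],
                   [0.8, 0.05, 0.05, 0.1], [0.01, 0.09, 0.1, 0.8]])
print(np_weighted_kappa_loss(y_true, y_true, num_classes=4))  # ≈ -0.7053, matching Loss_0
```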

Your code in the Colab works correctly in the context of ordinal classification problems, since the function X->Y you formed is very simple (the int of X is the Y index + 1), so the model learns it fairly quickly and accurately, as we can see from K (Cohen's kappa coefficient) going up to 1.0 and the Weighted Kappa Loss dropping below -13.0 (which in practice is usually the minimum we can expect)

In summary, you can use Weighted Kappa Loss only if you can form your problem as an ordinal classification problem with labels in one-hot fashion. If you can, and you are trying to solve an LTR (learning to rank) problem, then you can check this tutorial on implementing ListNet and this tutorial of tensorflow_ranking for better results; otherwise you shouldn't use Weighted Kappa Loss. If you can only form your problem as a regression problem, then you should do the same as in your original solution


Reference:

Cohen's kappa on Wikipedia

Weighted Kappa in R: For Two Ordinal Variables

Source code of WeightedKappaLoss in tensorflow-addons

Documentation of tfa.losses.WeightedKappaLoss

Difference between categorical, ordinal and numerical variables
