SQL - How to find optimal performance numbers for query

Question

This is my first time posting here, so forgive any faux pas. I have a question about the limitations of SQL. I am new to the language, and I believe what I need is rather complex.

Is it possible to automate finding the optimal data for a specific query? For example, say I have the following columns:

1) Vehicle type (text), e.g. car, bike, bus

2) Number of passengers (numeric), e.g. 0-7

3) Was in an accident (boolean), e.g. t or f

From here, I would like to get percentages. For example, if I select only cars with 3 passengers, what percentage of the total accidents does that account for?

I understand how to get this as a one-off or calculate it by hand. My question is how to automate the process to find the optimum number.

So, keeping with this example, say I look at just cars: what number of passengers covers the highest percentage of accidents?

At the moment I am testing number by number. Is there a way to 'find' the optimal number? That is easy when the range is just 0-7 as in the example, but I would naturally like to deal with larger ranges and even multiple ranges. For example, say we add another variable:

4) Number of doors (numeric), e.g. 0-3

Would there be a way to find the combination of values for these two variables that covers the highest percentage of accidents?

So say we took: car, >2 passengers, <3 doors on the vehicle. Out of the accidents variable, 50% were true.

But if we change that to: car, >4 passengers, <3 doors, then out of the accidents variable, 80% were true.

I hope I have explained this well. I understand that this is most likely not possible with SQL. If so, is there another way to find these optimum numbers?

Thanks in advance.

Recommended answer

Here's an example that will give you an answer for all possibilities. You could add a LIMIT clause to show only the top answer, or add a WHERE clause to limit the results to specific terms.

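-- Accident rate for every (vehicle_type, num_passengers) group, highest first.
SELECT
    `vehicle_type`,
    `num_passengers`,
    sum(if(`in_accident`,1,0)) as `num_accidents`,   -- rows flagged as accidents
    count(*) as `num_in_group`,                      -- all rows in the group
    100 * sum(if(`in_accident`,1,0)) / count(*) as `percent_accidents`
FROM `accidents`
GROUP BY `vehicle_type`,
    `num_passengers`
ORDER BY `percent_accidents` DESC;                   -- DESC so LIMIT 1 returns the top group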
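The query above groups on exact values. The second half of the question asks about ranges such as ">2 passengers, <3 doors", and one way to search those in SQL is to join the table against every candidate pair of thresholds. The following is only a minimal sketch under several assumptions: it uses Python's built-in `sqlite3` with an in-memory table so it can be run as-is, the `num_doors` column name and the sample rows are made up for illustration, and it reads the 50%/80% figures in the question as the accident rate among the rows that match the thresholds. The `min_rows` parameter is also an assumption, added so that a tiny group with a single accident does not win by chance.

```python
import sqlite3

# Hypothetical sample rows: (vehicle_type, num_passengers, num_doors, in_accident).
ROWS = [
    ("car", 1, 2, 0),
    ("car", 2, 2, 0),
    ("car", 3, 2, 1),
    ("car", 3, 4, 0),
    ("car", 5, 2, 1),
    ("car", 5, 2, 1),
    ("car", 6, 2, 1),
    ("car", 6, 4, 0),
    ("bus", 7, 2, 1),
]

# Every distinct passenger count is tried as a lower bound (num_passengers > min_p)
# and every distinct door count as an upper bound (num_doors < max_d).
QUERY = """
WITH p AS (SELECT DISTINCT num_passengers AS min_p
           FROM accidents WHERE vehicle_type = :vt),
     d AS (SELECT DISTINCT num_doors AS max_d
           FROM accidents WHERE vehicle_type = :vt)
SELECT p.min_p,
       d.max_d,
       COUNT(*) AS num_in_group,
       SUM(a.in_accident) AS num_accidents,
       100.0 * SUM(a.in_accident) / COUNT(*) AS percent_accidents
FROM p
CROSS JOIN d
JOIN accidents AS a
  ON a.vehicle_type = :vt
 AND a.num_passengers > p.min_p
 AND a.num_doors < d.max_d
GROUP BY p.min_p, d.max_d
HAVING COUNT(*) >= :min_rows
ORDER BY percent_accidents DESC, num_in_group DESC
LIMIT 1
"""


def build_demo_db():
    """Create an in-memory table filled with the hypothetical rows above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accidents ("
        "vehicle_type TEXT, num_passengers INTEGER, "
        "num_doors INTEGER, in_accident INTEGER)"
    )
    conn.executemany("INSERT INTO accidents VALUES (?, ?, ?, ?)", ROWS)
    return conn


def best_thresholds(conn, vehicle_type, min_rows):
    """Return (min_p, max_d, num_in_group, num_accidents, percent_accidents)
    for the threshold pair with the highest accident rate, or None when no
    pair matches at least min_rows rows."""
    params = {"vt": vehicle_type, "min_rows": min_rows}
    return conn.execute(QUERY, params).fetchone()


conn = build_demo_db()
best = best_thresholds(conn, "car", 3)
print(best)
```

The cross join grows with the product of the distinct values in each column, so this brute-force search suits small ranges like the ones in the question. For much larger ranges it may be more practical to bucket the values first or to do the search outside the database.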
