SQL - How to find optimal performance numbers for query
Question
First time here, so forgive me for any faux pas. I have a question about the limitations of SQL. I am new to the language, and I believe what I need is rather complex.
Is it possible to automate finding the optimal data for a specific query? For example, say I have the following columns:
1) Vehicle type (Text) e.g. car, bike, bus
2) Number of passengers (Numeric) e.g. 0-7
3) Was in an accident (Boolean) e.g. t or f
From here, I would like to get percentages. So if I were to select only cars with 3 passengers, what percentage of the total accidents does that account for?
I understand how to get this as a one-off or calculate it by hand; my question is how to automate the process to find the optimum number.
So, keeping with this example, say I look at just cars: what number of passengers covers the highest percentage of accidents?
At the moment, I am going through and testing number by number. Is there a way to 'find' the optimal number? It is easy when the range is just 0-7 as in the example, but I would naturally like to deal with a larger range and even multiple ranges. For example, say we add another variable titled:
4) Number of doors (Numeric) e.g. 0-3
Would there be a way of finding the combination of values for these two variables that covers the highest percentage of accidents?
So say we took: Car, >2 passengers, <3 doors on the vehicle. Of the matching rows, 50% have the accident variable set to true.
But if we change that to: Car, >4 passengers, <3 doors, then 80% have the accident variable set to true.
I hope I have explained this well. I understand that this is most likely not possible with SQL, but is there another way to find these optimum numbers?
Thanks in advance.
Recommended answer
Here's an example that will give you an answer for all possibilities. You could add a LIMIT clause to show only the top answer, or add a WHERE clause to restrict the results to specific values.
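-- Accident rate for every (vehicle type, passenger count) group, highest first
SELECT
    `vehicle_type`,
    `num_passengers`,
    SUM(IF(`in_accident`, 1, 0)) AS `num_accidents`,  -- accidents in this group
    COUNT(*) AS `num_in_group`,                        -- all rows in this group
    -- fraction between 0 and 1; multiply by 100 for a percentage
    SUM(IF(`in_accident`, 1, 0)) / COUNT(*) AS `percent_accidents`
FROM `accidents`
GROUP BY `vehicle_type`, `num_passengers`
ORDER BY `percent_accidents` DESC;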
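For the range-style filters in the question (for example "more than 2 passengers, fewer than 3 doors"), one possible approach is to try every pair of thresholds and rank the results. The following is only a rough sketch under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real database, the `num_doors` column and all sample rows are hypothetical, `in_accident` is stored as 0/1, and the bounds are inclusive (`>=` and `<=`), which for whole numbers is equivalent to the `>` and `<` filters in the question.

```python
import sqlite3

# Hypothetical sample rows: (vehicle_type, num_passengers, num_doors, in_accident).
# The num_doors column and every value here are made up for illustration.
SAMPLE_ROWS = [
    ("car", 1, 2, 0),
    ("car", 3, 2, 1),
    ("car", 3, 4, 0),
    ("car", 5, 2, 1),
    ("car", 5, 2, 1),
    ("car", 6, 4, 0),
    ("bus", 7, 2, 0),
    ("bike", 0, 0, 1),
]


def make_demo_db():
    """Build an in-memory table shaped like the `accidents` table above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accidents ("
        "vehicle_type TEXT, num_passengers INTEGER, "
        "num_doors INTEGER, in_accident INTEGER)"
    )
    conn.executemany("INSERT INTO accidents VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def distinct_values(conn, column, vehicle_type):
    """Return the sorted distinct values of one column for a vehicle type."""
    # `column` is interpolated into the SQL text, so pass trusted names only.
    sql = (
        f"SELECT DISTINCT {column} FROM accidents "
        f"WHERE vehicle_type = ? ORDER BY {column}"
    )
    return [row[0] for row in conn.execute(sql, (vehicle_type,))]


def rank_thresholds(conn, vehicle_type, min_group_size=1):
    """Try every (passengers >= p, doors <= d) filter and rank the results."""
    total_accidents = conn.execute(
        "SELECT COALESCE(SUM(in_accident), 0) FROM accidents"
    ).fetchone()[0]
    results = []
    for p in distinct_values(conn, "num_passengers", vehicle_type):
        for d in distinct_values(conn, "num_doors", vehicle_type):
            group_size, accidents = conn.execute(
                "SELECT COUNT(*), COALESCE(SUM(in_accident), 0) "
                "FROM accidents "
                "WHERE vehicle_type = ? AND num_passengers >= ? "
                "AND num_doors <= ?",
                (vehicle_type, p, d),
            ).fetchone()
            if group_size < min_group_size:
                continue  # skip empty or too-small groups
            results.append({
                "min_passengers": p,
                "max_doors": d,
                "group_size": group_size,
                # share of rows matching the filter that were in an accident
                "accident_rate": accidents / group_size,
                # share of ALL accidents in the table that the filter covers
                "share_of_all_accidents": (
                    accidents / total_accidents if total_accidents else 0.0
                ),
            })
    # Highest accident rate first; larger groups win ties.
    results.sort(
        key=lambda r: (r["accident_rate"], r["group_size"]), reverse=True
    )
    return results


if __name__ == "__main__":
    for row in rank_thresholds(make_demo_db(), "car")[:3]:
        print(row)
```

One design note: the share of all accidents that a filter covers can only grow as the filter gets wider, so ranking on that number alone would always pick the loosest filter. That is why this sketch ranks by the accident rate inside the filter and uses group size only as a tie-breaker. Raising `min_group_size` is one way to keep very small groups from topping the list.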