How to extract a particular value from an existing value


Problem description


I have a table with a column whose values follow several different patterns.

I need to extract certain values from those patterns.


input

8.92 mm (0.3510)			
31.00 mm (1.2210)			
0.3583
1-1/32" (1.0312)
#77 (0.0180 in)			
J (0.2770)				
11/64" (0.1719)
21/32 in (0.6562 in)



output

0.3510
1.2210
0.3583
1.0312
0.0180
0.2770
0.1719
0.6562



Please help!!

What I have tried:

CHARINDEX and SUBSTRING, but I haven't been able to get them to work.

Solution

Solution 1 is the best advice you are going to receive, but in this solution I am going to assume that you already have this data and want to convert it to the more sensible format suggested in Solution 1.

First, examine the data. At first glance it looks as though there are seven or more different formats, but the numbers you are interested in appear in only two ways: either nothing else is in the column (e.g. your row 3, 0.3583), or they are surrounded by brackets (...).

If surrounded by brackets, there are only two versions: the number alone (e.g. your row 1, 8.92 mm (0.3510)) or the number followed by ' in' (e.g. your row 5, #77 (0.0180 in)).

Everything else in the data is ignored, so what you are trying to do is:
1. Remove anything not related to the number required
2. Remove anything in brackets that is not a number
3. Remove the brackets

When approaching problems of this kind it is really important to do this up-front analysis so that you can be sure you are covering every eventuality.
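
As a rough illustration of that up-front analysis, the query below labels each row with the shape it appears to have, so that anything unexpected stands out before any extraction is attempted. It is only a sketch: the table variable @sample and the column name shape are made up for this example, and the data is copied from the question.

```sql
-- Sample data copied from the question; @sample is a hypothetical name.
DECLARE @sample TABLE (id INT IDENTITY(1,1), dat NVARCHAR(50));
INSERT INTO @sample (dat) VALUES
 ('8.92 mm (0.3510)'), ('31.00 mm (1.2210)'), ('0.3583'),
 ('1-1/32" (1.0312)'), ('#77 (0.0180 in)'), ('J (0.2770)'),
 ('11/64" (0.1719)'), ('21/32 in (0.6562 in)');

-- Label each row with the shape the analysis expects.
SELECT id, dat,
       CASE
           WHEN dat LIKE '%(%)%'         THEN 'number in brackets'
           WHEN dat NOT LIKE '%[^0-9.]%' THEN 'bare number'   -- only digits and dots
           ELSE 'unexpected - inspect by hand'
       END AS shape
FROM @sample
ORDER BY id;
```

Any row that lands in the last bucket is a format the rest of the solution has not accounted for.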

You can do step 1 with SQL like this:

select id, s.[value]
FROM @test
CROSS APPLY string_split (dat, '(') s
WHERE s.value LIKE '%)%'
	
UNION 
	
select id, dat
FROM @test
WHERE dat NOT LIKE '%(%'

Points to note:
a) If you are using a version of SQL Server prior to 2016, you will have to write your own UDF to do the string split (STRING_SPLIT also requires database compatibility level 130 or higher). There are hundreds of examples available via your favourite search engine.
b) In the first query I am only looking for the numbers in brackets. I ignore anything that wasn't in brackets by checking for the closing bracket.
c) The second query picks up the rows where the column contains nothing but the number, i.e. there is no opening bracket at all.

So far we get these results:

id	[value]
1	0.3510)
2	1.2210)
3	0.3583
4	1.0312)
5	0.0180 in)
6	0.2770)
7	0.1719)
8	0.6562 in)

So we still need to get rid of the closing bracket and " in" which we can do like this

;with cte as 
(
	select id, REPLACE(REPLACE(s.value, ')',''), ' in','')  AS [value]
	FROM @test
	CROSS APPLY string_split (dat, '(') s
	WHERE s.value LIKE '%)%'
	
	UNION 
	
	select id, dat
	FROM @test
	WHERE dat NOT LIKE '%(%'
)
SELECT id, [value]
from cte

A really important thing to note here is that these values are strings - you may need to convert them to numbers before they are used.
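
One possible way to do that conversion, assuming SQL Server 2012 or later, is TRY_CAST, which returns NULL instead of raising an error when a string is not a valid number. The sketch below uses a made-up table variable @cleaned holding a few values in the shape the query above produces, plus one deliberately uncleaned value; DECIMAL(10,4) is an assumed target type, not something the question specifies.

```sql
-- @cleaned is a hypothetical stand-in for the output of the CTE above.
DECLARE @cleaned TABLE (id INT, [value] NVARCHAR(50));
INSERT INTO @cleaned (id, [value]) VALUES
 (1, '0.3510'),
 (3, '0.3583'),
 (5, '0.0180'),
 (9, '0.0180 in');   -- deliberately left uncleaned for illustration

-- TRY_CAST yields NULL for anything that is not a valid number,
-- so leftover text shows up as NULL rather than as a runtime error.
SELECT id, [value],
       TRY_CAST([value] AS DECIMAL(10,4)) AS numeric_value
FROM @cleaned;
```

Filtering on numeric_value IS NULL afterwards is a simple way to list the rows that still need attention.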

Now I am going to refer back to Solution 1 where @OriginalGriff points out

Quote:

you will find inputs that don't meet any of the examples you show, so you will be changing complicated SQL code to detect and add new cases frequently.

Fortunately you are going to follow his advice so this will be a one-off exercise rather than a frequent occurrence. However, you will need to examine your converted data to make sure you have captured everything.

EDIT: I just changed my test data to check for other problems - note the new format in the last value

declare @test table (id int identity(1,1), dat nvarchar(50))
insert into @test (dat) values
('8.92 mm (0.3510)' ),
('31.00 mm (1.2210)'), 
('0.3583'),
('1-1/32" (1.0312)'),
('#77 (0.0180 in)' ),
('J (0.2770)'),
('11/64" (0.1719)'),
('21/32 in (0.6562 in)')
,('0.3583 in')

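This caused my code above to throw an error. The second branch of the UNION returns dat unchanged, so '0.3583 in' keeps its ' in' suffix and cannot be converted to a number: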

Msg 8114, Level 16, State 5, Line 28
Error converting data type nvarchar to numeric.

This can be avoided (in this instance) by doing the replacements outside of the CTE, so that they apply to the rows from both branches of the UNION, i.e.

;with cte as 
(
	select id, s.[value]
	FROM @test
	CROSS APPLY string_split (dat, '(') s
	WHERE s.value LIKE '%)%'
	
	UNION 
	
	select id, dat
	FROM @test
	WHERE dat NOT LIKE '%(%'
)
SELECT id, REPLACE(REPLACE([value], ')',''), ' in','')  
from cte

But this really does make the point that @OriginalGriff made and which I quoted above!


Basically, don't store it like that at all: SQL string handling is ... ummm ... poor, at best - and you should decide on a storage type (metric or imperial) and convert the input in your presentation software to that, then store it as a consistent number in a floating point column.
Storing numeric values as strings is just a recipe for extremely painful processing later - particularly when the "numeric value" is free form and not in a defined measurement system.

To leave the storage as you have it at the moment is a nightmare - you will find inputs that don't meet any of the examples you show, so you will be changing complicated SQL code to detect and add new cases frequently.
Unify and check your inputs; store the data in numeric fields; be consistent. Your life will be a whole load easier in future!
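
To make that advice concrete, here is a minimal sketch of what such storage could look like. Everything in it is an assumption made for illustration: the temp table #measurement, its column names, the choice of inches as the single unit, and the use of DECIMAL(10,4) rather than the floating point column mentioned above (chosen here only so that four decimal places are kept exactly).

```sql
-- Hypothetical table: the number lives in one numeric column, in one agreed unit.
CREATE TABLE #measurement
(
    id        INT IDENTITY(1,1) PRIMARY KEY,
    label     NVARCHAR(50)  NOT NULL,  -- original description, for display only
    size_in   DECIMAL(10,4) NOT NULL   -- always inches, always numeric
);

INSERT INTO #measurement (label, size_in) VALUES
 ('8.92 mm', 0.3510),
 ('#77',     0.0180),
 ('J',       0.2770);

-- Once the value is numeric, filtering and unit conversion need no string handling.
SELECT label, size_in, size_in * 25.4 AS size_mm
FROM #measurement
WHERE size_in < 0.3;

DROP TABLE #measurement;
```

The conversion from whatever the user typed to the agreed unit would then happen once, at input time, in the presentation layer.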


DECLARE @table TABLE(INputval NVARCHAR(100)); 

INSERT INTO @table(INputval)

VALUES
 ('8.92 mm (0.3510)')	
,('31.00 mm (1.2210)')
,('0.3583')
,('1-1/32" (1.0312)')
,('#77 (0.0180 in)')	
,('J (0.2770)')	
,('11/64" (0.1719)')
,('21/32 in (0.6562 in)')


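-- Replacing 'in' and ')' with the same number of spaces keeps every character
-- position unchanged, so CHARINDEX on the original string still points at '('.
-- Rows without '(' give CHARINDEX = 0, so the whole value is returned.
-- LTRIM/RTRIM remove the padding spaces left behind by the replacements.
SELECT LTRIM(RTRIM(SUBSTRING(REPLACE(REPLACE(INputval, 'in', '  '), ')', ' '),
                             CHARINDEX('(', INputval) + 1,
                             LEN(INputval)))) AS INputval
FROM @table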

