哪些代码在db中有对应关系 [英] Which codes have correspondence in the db

查看:178
本文介绍了哪些代码在db中有对应关系的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我面临着必须查看数百万条记录的数据库的任务,一组约1500条的代码具有相应的记录,这些记录存在于db中。例如,我有一个csv文件中的1500个ID。我想知道这些ID中的哪些ID存在于数据库中,因此是正确的,哪些不正确。

I'm faced with the task of having to look in a database with millions of records, which codes of a set of about 1500 have a corresponding record, which ones of those exist in the db. For example i have 1500 IDs in a csv file. I want to know which ones of those IDs exist in the database, and are therefore correct, and which ones don't.

有没有更好的方法没有... WHERE id IN(1,2,3,...,1500) ;
有问题的DB /语言是ORACLE PL / SQL。

Is there a better way of doing this without "... WHERE id IN (1, 2, 3, ..., 1500);" ? The DB/language in question is ORACLE PL/SQL.

感谢您提供任何帮助。

推荐答案

表格。这些都是非常整洁的东西,使我们能够查询SQL中的操作系统文件的内容。 了解详情

Build an external table on your CSV file. These are highly neat things which allow us to query the contents of an OS file in SQL. Find out more.

然后是发出查询的简单问题:

Then it's a simple matter of issuing a query:

select csv.id
       , case ( when tgt.id is null then 'invalid' else 'valid') end as valid_id
from your_external_tab csv
       left join target_table tgt on (csv.id = tgt.id)








从表现的角度来看,CSV表是不太理想的

"CSV table is hardly ideal from a performance point of view"

性能是上下文的问题。在这种情况下,它取决于CSV中的数据变化的频率和我们需要查询它的频率。如果文件每天生成一次,并且我们只需要在传递后检查值,那么外部表是最有效的解决方案。但是如果这个数据集是一个需要经常查询的永久存储库,那么写入堆表的开销显然是合理的。

Performance is a matter of context. In this case it depends on how often the data in the CSV changes and how often we need to query it. If the file is produced once a day and we only need to check the values after it has been delivered then an external table is the most efficient solution. But if this data set is a permanent repository which needs to be queried often then the overhead of writing to a heap table is obviously justified.

对我来说,一个由多个ID组成的CSV文件,没有什么听起来像暂态数据,因此
适合外部表的用例。但是OP可能有其他没有提到的要求。

To me, a CSV file consisting of a bunch IDs and nothing else sounds like transient data and so fits the use case for external tables. But the OP may have additional requirements which they haven't mentioned.

这是一种不需要创建任何永久数据库对象的替代方法。

Here is an alternative approach which doesn't require creating any permanent database objects. Consequently it is less elegant, and probably will perform worse.

它使用UTL_FILE读取CSV文件并填充基于SYSTEM.NUMBER_TBL_TYPE的集合,这是一个预定义的集合(NUMBER的嵌套表),应该在Oracle数据库中可用。

It reads the CSV file labouriously using UTL_FILE and populates a collection based on SYSTEM.NUMBER_TBL_TYPE, a pre-defined collection (nested table of NUMBER) which should be available in your Oracle database.

declare
    ids system.number_tbl_type;
    fh utl_file.file_handle;
    idx pls_integer := 0;
    n pls_integer;
 begin
    fh := utl_file.fopen('your_data_directory', 'your_data.csv', 'r');
    begin
        utl_file.get_line(fh, n);
        loop  
            idx := idx + 1;
            ids.extend();
            ids(idx) := n;
            utl_file.get_line(fh, n);
        end loop;
   exception
      when no_data_found then
          if utl_file.is_open(fh) then
             utl_file.fclose(fh);
          end if;
     when others then
          raise;
  end;
  for id_recs in  in  ( select csv.column_value 
              , case ( when tgt.id is null then 'invalid' else 'valid') end as valid_id
                from (select * from table(ids)) csv
            left join target_table tgt on (csv.column_value = tgt.id)
  ) loop
  dbms_output.put_line '(ID '||id_recs.column_value || ' is '||id_recs.valid_id);
  end loop;
end;

注意:我没有测试这个代码。原则是声音,但细节可能需要调试;)

Note: I have not tested this code. The principle is sound but the details may need debugging ;)

这篇关于哪些代码在db中有对应关系的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆