What is the purpose of $CONDITIONS under --query?
Question
I am using Cloudera QuickStart, CDH 5.7.
I ran the following command in a terminal window:
sqoop import \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where \$CONDITIONS" \
--target-dir /user/cloudera/order_join \
--split-by order_id \
--num-mappers 4
Q: What is the purpose of $CONDITIONS? Why is it used in this query? Can anybody explain it to me?
Recommended answer
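$CONDITIONS is a placeholder that Sqoop uses internally: it rewrites the query at this point to fetch metadata and to split the import into tasks. The backslash in \$CONDITIONS only stops the shell from expanding it as a variable inside double quotes.
To fetch metadata, Sqoop replaces $CONDITIONS with 1 = 0, so the query returns the column definitions but no rows: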
select * from table where 1 = 0
To fetch all the data with a single mapper, Sqoop replaces $CONDITIONS with 1 = 1:
select * from table where 1 = 1
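The two substitutions above can be mimicked with plain string replacement. This is only a sketch in Python, not Sqoop's actual code: the function name substitute and the added parentheses are assumptions made for illustration.

```python
TOKEN = "$CONDITIONS"


def substitute(query: str, condition: str) -> str:
    """Return the query with the placeholder replaced by a parenthesized condition.

    Hypothetical helper for illustration; it is not part of Sqoop.
    """
    if TOKEN not in query:
        # A free-form query without the placeholder gives nothing to rewrite.
        raise ValueError(f"query must contain {TOKEN}")
    return query.replace(TOKEN, f"({condition})")


query = "select * from table where $CONDITIONS"

# Always-false condition: no rows come back, only the column metadata.
metadata_query = substitute(query, "1 = 0")

# Always-true condition: a single mapper reads every row.
full_query = substitute(query, "1 = 1")

print(metadata_query)
print(full_query)
```

Wrapping the condition in parentheses keeps it from interacting with any other predicates that the query's WHERE clause may already contain.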
With multiple mappers, Sqoop replaces $CONDITIONS with a range condition on the --split-by column, so that each mapper fetches a subset of the data from the RDBMS.
For example, suppose id ranges from 1 to 100 and we use 4 mappers. The mappers then run:
Select * From table WHERE id >= 1 AND id < 25
Select * From table WHERE id >= 25 AND id < 50
Select * From table WHERE id >= 50 AND id < 75
Select * From table WHERE id >= 75 AND id <= 100
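The range conditions in this example can be reproduced with a short Python sketch. It assumes an integer split column whose minimum and maximum are already known, and it divides the interval evenly. The helper names split_ranges and split_conditions are hypothetical, and Sqoop's own splitter may choose slightly different boundaries.

```python
def split_ranges(lo: int, hi: int, num_splits: int) -> list[tuple[int, int]]:
    """Divide [lo, hi] into num_splits contiguous (lower, upper) pairs."""
    span = hi - lo
    # Integer arithmetic keeps every boundary an exact integer.
    points = [lo + (span * i) // num_splits for i in range(num_splits)] + [hi]
    return list(zip(points[:-1], points[1:]))


def split_conditions(column: str, lo: int, hi: int, num_splits: int) -> list[str]:
    """Build one range condition per mapper for the given split column."""
    ranges = split_ranges(lo, hi, num_splits)
    conditions = []
    for i, (lower, upper) in enumerate(ranges):
        # Only the last range includes its upper bound, so no row is read twice
        # and the maximum value is not lost.
        op = "<=" if i == len(ranges) - 1 else "<"
        conditions.append(f"{column} >= {lower} AND {column} {op} {upper}")
    return conditions


query = "Select * From table WHERE $CONDITIONS"
for condition in split_conditions("id", 1, 100, 4):
    print(query.replace("$CONDITIONS", condition))
```

Each printed statement corresponds to the work of one mapper. Together the four ranges cover ids 1 through 100 exactly once.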