Hi Shengjie, in addition to what Abe mentioned, it sounds like you have
a perfect use case for the lastmodified incremental mode.
Internally, the lastmodified import consists of two standalone
MapReduce jobs. The first job imports the delta of changed data the
same way a normal import does, and saves it in a temporary directory
on HDFS. The second job takes both the old and the new data and merges
them into the final output, preserving only the most recently updated
value for each row.
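In case it helps to picture what the second job does, here is a rough local sketch of the "keep only the latest version of each row" idea. This is not Sqoop's actual merge code, just an illustration on small CSV files. The helper name merge_latest, the file names, and the id,page,last_update_date layout are all made up for the example.

```shell
# Hypothetical helper (not part of Sqoop): merge two CSV files whose rows
# look like "id,page,last_update_date" and keep only the newest row per id.
merge_latest() {
  # Sort by id (field 1), then by timestamp (field 3) in descending order,
  # so the newest version of each id comes first; awk keeps the first row
  # it sees for every id. Ids are compared as strings here.
  cat "$1" "$2" | LC_ALL=C sort -t, -k1,1 -k3,3r | awk -F, '!seen[$1]++'
}

# Tiny demo with made-up data.
demo_dir=$(mktemp -d)
printf '%s\n' \
  '1,home,2013-05-20 10:00:00' \
  '2,about,2013-05-21 09:00:00' > "$demo_dir/old.csv"
printf '%s\n' \
  '2,contact,2013-05-23 08:00:00' \
  '3,blog,2013-05-23 08:30:00' > "$demo_dir/new.csv"

merge_latest "$demo_dir/old.csv" "$demo_dir/new.csv"
# prints:
# 1,home,2013-05-20 10:00:00
# 2,contact,2013-05-23 08:00:00
# 3,blog,2013-05-23 08:30:00
rm -rf "$demo_dir"
```

Row 2 exists in both files, and only the copy with the later last_update_date survives, which is the behaviour described above for the final output.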
Here's a sample command:
sqoop import \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \
--table visits \
--incremental lastmodified \
--check-column last_update_date \
--last-value "2013-05-22 01:01:01"
Hope this helps,
On Mon, Aug 5, 2013 at 8:40 AM, Abraham Elmahrek <[EMAIL PROTECTED]> wrote: