Jose M CV February 2016

Sqoop job fails with KiteSDK validation error for Oracle import

I am attempting to run a Sqoop job to load from an Oracle db and into Parquet format to a Hadoop cluster. The job is incremental.

Sqoop version is 1.4.6. Oracle version is 12c. Hadoop version is 2.6.0 (distro is Cloudera 5.5.1).

The Sqoop command is (this creates the job, and executes it):

$ sqoop job -fs hdfs://<HADOOPNAMENODE>:8020 \
--create myJob \
-- import \
--connect jdbc:oracle:thin:@<DBHOST>:<DBPORT>/<DBNAME> \
--username <USERNAME> \
-P \
--as-parquetfile \
--table <USERNAME>.<TABLENAME> \
--target-dir <HDFSPATH>  \
--incremental append  \
--check-column <TABLEPRIMARYKEY>

$ sqoop job --exec myJob

Error on execute:

16/02/05 11:25:30 ERROR sqoop.Sqoop: Got exception running Sqoop:
org.kitesdk.data.ValidationException: Dataset name
05112528000000918_2088_<USERNAME>.<TABLENAME>
is not alphanumeric (plus '_')
    at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
    at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:103)
    at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:66)
    at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
    at org.kitesdk.data.Datasets.create(Datasets.java:239)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:80)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:106)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:668)
    at org.apache.sqoop.manager.OracleMa        

Answers


Jose M CV February 2016

Resolved. There seems to be a known issue with the JDBC connectivity to Oracle 12c. Using a specific OJDBC6 (instead of 7) did the trick. FYI - the OJDBC is installed in /usr/share/java/ and a symbolic link is created in /installpath.../lib/sqoop/lib/

Post Status

Asked in February 2016
Viewed 1,157 times
Voted 11
Answered 1 times

Search




Leave an answer