Yura P. February 2016

SQL queries in RDD

I need to work with an RDD both through Scala/Spark methods and through SQL queries.

Is it possible to run SQL queries directly against an RDD?

The suggested approaches (SchemaRDD or DataFrame) require extra memory.

After such a conversion I end up with two identical huge objects in memory.


eliasah February 2016

Yes, in a way, you may be able to do so. But you'll need to create your own version of DataFrame.

DataFrame is an abstraction over RDDs. The operations you get with Spark SQL (joins, filters, etc.) are optimized for DataFrames, but they were implemented on RDDs first.
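
For reference, here is a minimal sketch of the usual DataFrame route, since it is the one the question is weighing. It assumes Spark 2.x or later (`SparkSession`, `createOrReplaceTempView`); on the 1.x releases current when this was asked, the equivalents would be `SQLContext` and `registerTempTable`. The `Person` case class, the `people` view name, and the sample rows are hypothetical, made up for illustration only. As far as I know, `toDF()` is lazy and only wraps the RDD in a query plan, so a second full copy should appear only if both the RDD and the DataFrame are cached.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object RddSqlSketch {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch; a real job would get its master from spark-submit.
    val spark = SparkSession.builder()
      .appName("rdd-sql-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // brings toDF() into scope for RDDs of case classes

    // An ordinary RDD that can still be used with the Scala/Spark API.
    val peopleRdd = spark.sparkContext.parallelize(
      Seq(Person("Ann", 31), Person("Bob", 17)))

    // Wrap the RDD in a DataFrame and expose it to SQL under a view name.
    // Nothing is computed or cached at this point.
    val peopleDf = peopleRdd.toDF()
    peopleDf.createOrReplaceTempView("people")

    // Query the same data with SQL.
    val adults = spark.sql("SELECT name FROM people WHERE age >= 18")
    adults.show()

    // Go back to an RDD when the Scala/Spark API is needed again.
    val adultNames = adults.rdd.map(row => row.getString(0))
    println(adultNames.collect().mkString(", "))

    spark.stop()
  }
}
```

If memory is the concern, one option is to avoid calling `cache()` or `persist()` on both `peopleRdd` and `peopleDf`, and to keep only the representation that is reused most often.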
