Yura P. February 2016
SQL queries on an RDD
I need to work with an RDD both through Scala/Spark methods and through SQL queries.
Is it possible to run SQL queries against an RDD directly?
The suggested approaches (SchemaRDD or DataFrame) require extra memory.
After such a conversion I have two identical huge objects in memory.
eliasah February 2016
Yes, in a way, you may be able to do so, but you would need to write your own version of DataFrame.
DataFrame is an abstraction over RDDs. The features you find in Spark SQL (joins, filters, etc.) are optimized for DataFrames, but they were implemented on RDDs first.
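For comparison, here is a minimal sketch of the DataFrame route mentioned in the question, next to the same query written with plain RDD methods. It assumes Spark 2.x or later (SparkSession and createOrReplaceTempView; on 1.x the equivalents would be SQLContext and registerTempTable). The Person case class, the "people" view name, and the sample data are made up for illustration. As far as I know, toDF() is lazy and does not copy the data by itself, so a second in-memory copy should only appear if both the RDD and the DataFrame are cached.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object RddSqlSketch {
  // Runs the same "adults" query twice: once with RDD methods, once with SQL.
  def adultNames(spark: SparkSession): (Array[String], Array[String]) = {
    import spark.implicits._ // brings toDF() and the String encoder into scope

    // Made-up sample data held in an ordinary RDD.
    val rdd = spark.sparkContext.parallelize(
      Seq(Person("Ann", 31), Person("Bob", 17)))

    // 1) Plain RDD API.
    val viaRdd = rdd.filter(_.age >= 18).map(_.name).collect()

    // 2) SQL: wrap the RDD in a DataFrame and register it under a view name.
    val df = rdd.toDF()
    df.createOrReplaceTempView("people")
    val viaSql = spark
      .sql("SELECT name FROM people WHERE age >= 18")
      .as[String]
      .collect()

    (viaRdd, viaSql)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-sql-sketch")
      .master("local[*]") // local mode, for trying the sketch out only
      .getOrCreate()
    val (viaRdd, viaSql) = adultNames(spark)
    println(viaRdd.mkString(","))
    println(viaSql.mkString(","))
    spark.stop()
  }
}
```

Both calls should give the same names, so the RDD stays usable through its own methods while the view serves the SQL side.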
Asked in February 2016