Fihop February 2016

Calculate the running time for Spark SQL

I'm trying to run a couple of Spark SQL statements and want to calculate their running time.

One solution is to resort to the logs, but I'm wondering whether there is a simpler way to do it, something like the following:

import time

startTimeQuery = time.time()  # wall-clock time; time.clock() measures CPU time on some platforms
df = sqlContext.sql(query)
df.show()  # show() is an action, so it forces the query to actually execute
endTimeQuery = time.time()
runTimeQuery = endTimeQuery - startTimeQuery

Answers


femibyte February 2016

If you're using the spark-shell (Scala), you could try defining a timing function like this:

def show_timing[T](proc: => T): T = {
    val start = System.nanoTime()
    val res = proc // evaluate the by-name block
    val end = System.nanoTime()
    println("Time elapsed: " + (end - start) / 1000 + " microsecs")
    res
}

Then you can try:

val df = show_timing { sqlContext.sql(query) }
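
Note that Spark evaluates DataFrames lazily, so the call above mostly measures parsing and query planning. To include execution time as well, you can wrap an action such as count() or show() in the same block; a minimal sketch reusing show_timing:

// count() is an action, so this times planning plus actual execution:
val rowCount = show_timing { sqlContext.sql(query).count() }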

Post Status

Asked in February 2016
Viewed 1,579 times
Voted 14
Answered 1 time
