Stefan Papp February 2016

How to use the results of one Hive query as input for another in a hive script?

I want to use the count(*) result from a Hive Query as input for a second hive query. The query is simplified as:

set LIM = SELECT count(*) from default.mytable* 0.8;
select * from default.mytable LIMIT ${hiveconf:LIM};

The above code will lead to an error as the first query does not get executed and there the LIM variable will not get substituted with a numeric value.

Is there a way to force Hive to substitute the variable LIM so that I have a numeric value in the second query?

Answers


Samson Scharfrichter February 2016

## WARNING - verbose explanation follows; the short answer is "no way" ##

In terms of IT Architecture, this kind of tricks is not done in the database tier but in the application tier.

Since I don't know nuthin' about your Teradata stack (fondly nicknamed "taratata" by some of your French-speaking colleagues) I'll take the Oracle stack as an example.

A. Inside a PL/SQL block, you can retrieve the (scalar) result of a query into a variable, and use it later -- as an input bind variable in a prepared statement, or as a way to build dynamically a string to be parsed dynamically as a SQL query. That PL/SQL block an "application", with application logic of arbitrary complexity; it just happens to run inside an Oracle session, on the same host that also runs the database tier.

B. Inside the SQL*Plus client (and maybe compatible tools e.g. SQL Developer) you can use a weird syntax to retrieve a value in a kind of macro-variable, that can be used to stuff the value as-is in further SQL queries. That trick allows some crude "application" logic to be applied to an otherwise static SQL script, client-side. But that is clearly a non-portable trick.

Bottom line - since Hive has no procedural language, and will probably (hopefully) never have one, the best way to do what you want would be to develop your own custom Hive client all by yourself, with whatever business logic you want. After all, there must be thousands of people around the world who are developing Java code to access Hive with JDBC, so you would not be alone...


Yuvi February 2016

Well.. you can do that if you are comfy with writing a shell script. Take the query output and store it into a variable & use the variable for your second query.

Post Status

Asked in February 2016
Viewed 3,890 times
Voted 10
Answered 2 times

Search




Leave an answer