amchang87 February 2016

Pig: Max Value per Bag of Tuples

I have a grouped relation with the following schema:

grunt> describe aliveevents_patient_id;
aliveevents_patient_id: {group: int,aliveevents: {(events::patientid: int,events::eventid: chararray,events::etimestamp: datetime,events::value: float,mortality::patientid: int,mortality::mtimestamp: datetime,mortality::label: int)}}

How can I get the largest etimestamp per group, that is, per patient_id?

Essentially, given the following input:

patient_id, etimestamp
1, 10
1, 20
2, 30

Expected output:

patient_id, etimestamp
1, 20
2, 30

Answers


Ankur Singh February 2016

Going by the example in your question, assume aliveevents_patient_id is a flat relation with two fields, (patient_id, etimestamp).

Then the script is:

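-- Group the rows by patient so that each group holds one bag of tuples.
A = GROUP aliveevents_patient_id BY patient_id;
DUMP A;
-- (1,{(1,10),(1,20)})
-- (2,{(2,30)})

-- For each group, take the maximum etimestamp found in its bag.
B = FOREACH A GENERATE group, MAX(aliveevents_patient_id.etimestamp);
DUMP B;
-- (1,20)
-- (2,30)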
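
Note that the relation in the question's describe output is already grouped: it has a group field and a bag named aliveevents. In that case the extra GROUP step should not be needed, and MAX can probably be applied to the bag directly. The following is only a sketch under that assumption. The alias latest_event and the output names patient_id and max_etimestamp are made up for illustration. It also assumes your Pig version's MAX accepts datetime values; if it does not, converting the timestamp to a number first (for example with ToUnixTime) may be an option.

```pig
-- aliveevents_patient_id is already grouped:
--   {group: int, aliveevents: {(events::patientid, events::eventid, events::etimestamp, ...)}}
-- Project the etimestamp column out of each group's bag and take its maximum.
-- latest_event, patient_id and max_etimestamp are illustrative names only.
latest_event = FOREACH aliveevents_patient_id GENERATE
    group AS patient_id,
    MAX(aliveevents.events::etimestamp) AS max_etimestamp;

DUMP latest_event;
```

The events:: prefix is kept because the bag came from a join, so the field names in it carry the name of the relation they originated from.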

Post Status

Asked in February 2016
Viewed 3,920 times
Voted 10
Answered 1 times
