zjbu February 2016

Perform actions in supervisor when worker exits

Is there a way to perform actions in the supervisor after a worker exits (in addition to just restarting the worker)?

At the moment I have some code inside the worker (below) that seems to run about half the time.

try do
  some_work()
catch
  :exit, reason ->
    save_reason_to_db()
    exit(reason)
end

The supervisor uses a one_for_one strategy with only one worker running, so every time the worker exits the supervisor starts a new one.

It seems like every second time this worker exits it runs save_reason_to_db(), and the other times it doesn't.

Answers


michalmuskala February 2016

I'm not an OTP expert, but the simplest solution to something like that would probably be to set up a second process in the supervisor that monitors the failing process and does the work (like saving the reason to the db or issuing a notification).

This way concerns are nicely separated between application structure (supervisors) and reacting to process lifetime events.
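A minimal sketch of that idea, assuming the worker registers itself with the watcher (for example from its own init/1). The ExitWatcher module name and the on_exit callback are hypothetical and not from this thread; on_exit stands in for something like save_reason_to_db().

```elixir
defmodule ExitWatcher do
  # Hypothetical helper process: monitors registered pids and runs a
  # callback when any of them goes down.
  use GenServer

  # `on_exit` is an assumed 2-arity function: fn pid, reason -> ... end
  def start_link(on_exit) when is_function(on_exit, 2) do
    GenServer.start_link(__MODULE__, on_exit, name: __MODULE__)
  end

  # Synchronous, so the monitor is in place before this call returns.
  def watch(pid \\ self()) do
    GenServer.call(__MODULE__, {:watch, pid})
  end

  @impl true
  def init(on_exit), do: {:ok, on_exit}

  @impl true
  def handle_call({:watch, pid}, _from, on_exit) do
    Process.monitor(pid)
    {:reply, :ok, on_exit}
  end

  # Delivered by the VM when a monitored process exits.
  @impl true
  def handle_info({:DOWN, _ref, :process, pid, reason}, on_exit) do
    on_exit.(pid, reason)
    {:noreply, on_exit}
  end
end
```

The watcher would need to be started before the worker in the supervisor's child list, so that it is already running when the restarted worker calls ExitWatcher.watch().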


zjbu February 2016

This isn't really an answer to my own question, but I found the bug that was causing the issue I was having.

The issue was that the worker was actually a RabbitMQ consumer, and one of the actions in the catch block was to ack/reject the message with RabbitMQ.

RabbitMQ then immediately delivered the next message to the worker, which called exit(reason) straight away and died without actually trying to process that message.

The solution for me was to remove the exit(reason) line, and match more tightly on the reason since there is really no reason for the worker to die.

try do
  some_work()
catch
  :exit, :timeout ->
    save_reason_to_db()
end

Thanks for the input - good to have options available :)


stephen_m February 2016

You can do some cleanup, or whatever else you need to do, in the terminate/2 callback function, which is part of the GenServer behaviour.

See here and read more here too

In the Supervisor worker spec, make sure that the shutdown option is not set to :brutal_kill. If it is, the terminate/2 callback in the GenServer worker will not run. Read more here. You should probably set the shutdown option to a timeout that suits your application, e.g. shutdown: 10_000
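As a rough illustration of setting that option with current Elixir child specs (the thread predates them, so this is an assumption about how you would write it today): Supervisor.child_spec/2 can override the shutdown value of any module's child spec. The ShutdownSpecSketch module and its function name are hypothetical.

```elixir
defmodule ShutdownSpecSketch do
  # Hypothetical helper: builds a child spec for `worker_module` and
  # overrides its :shutdown value (in milliseconds), so the worker gets
  # time to run terminate/2 instead of being brutally killed.
  def child_spec_with_timeout(worker_module, arg, timeout_ms \\ 10_000) do
    Supervisor.child_spec({worker_module, arg}, shutdown: timeout_ms)
  end
end
```

The result can go straight into a supervisor's child list, for example children = [ShutdownSpecSketch.child_spec_with_timeout(MyWorker, [])], where MyWorker is a placeholder for your own worker module.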

In your GenServer's init/1 callback, set the trap_exit flag to true:

:erlang.process_flag(:trap_exit, true)

or, in pure Elixir:

Process.flag(:trap_exit, true)

Then you can declare a terminate callback function in your genserver worker:

  # Underscored arguments avoid unused-variable warnings.
  def terminate(_reason, _state) do
    save_reason_to_db()
    :ok
  end
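Putting those pieces together, a minimal sketch of such a worker might look like the following. The TrappingWorker name and the on_terminate callback are hypothetical; the callback stands in for save_reason_to_db() and receives the exit reason.

```elixir
defmodule TrappingWorker do
  use GenServer

  # `on_terminate` is an assumed 1-arity function: fn reason -> ... end
  def start_link(on_terminate) when is_function(on_terminate, 1) do
    GenServer.start_link(__MODULE__, on_terminate)
  end

  @impl true
  def init(on_terminate) do
    # Trap exits so the supervisor's shutdown signal arrives as a
    # message and the GenServer gets to run terminate/2.
    Process.flag(:trap_exit, true)
    {:ok, on_terminate}
  end

  @impl true
  def terminate(reason, on_terminate) do
    on_terminate.(reason)
    :ok
  end
end
```

Note that terminate/2 is still skipped if the process is killed outright, for example when the shutdown timeout expires or shutdown is :brutal_kill, so it is best suited to best-effort cleanup.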

