rwallace February 2016

Torch CrossEntropyCriterion error

I'm trying to train a simple test network on the XOR function in Torch. It works when I use MSECriterion, but when I try CrossEntropyCriterion it fails with the following error message:

/home/a/torch/install/bin/luajit: /home/a/torch/install/share/lua/5.1/nn/THNN.lua:699: Assertion `cur_target >= 0 && cur_target < n_classes' failed.  at /tmp/luarocks_nn-scm-1-6937/nn/lib/THNN/generic/ClassNLLCriterion.c:31
stack traceback:
    [C]: in function 'v'
    /home/a/torch/install/share/lua/5.1/nn/THNN.lua:699: in function 'ClassNLLCriterion_updateOutput'
    ...e/a/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:41: in function 'updateOutput'
    ...torch/install/share/lua/5.1/nn/CrossEntropyCriterion.lua:13: in function 'forward'
    .../a/torch/install/share/lua/5.1/nn/StochasticGradient.lua:35: in function 'train'
    a.lua:34: in main chunk
    [C]: in function 'dofile'
    /home/a/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

I get the same error message when decomposing it into LogSoftMax and ClassNLLCriterion. Code is:

dataset={};
function dataset:size() return 100 end -- 100 examples
for i=1,dataset:size() do
  local input = torch.randn(2);     -- normally distributed example in 2d
  local output = torch.Tensor(2);
  if input[1]<0 then
      input[1]=-1
  else
      input[1]=1
  end
  if input[2]<0 then
      input[2]=-1
  else
      input[2]=1
  end
  if input[1]*input[2]>0 then     -- calculate label for XOR function
    output[2] = 1;
  else
    output[1] = 1
  end
  dataset[i] = {input, output}
end

require "nn"
mlp = nn.Sequential();  -- make a multi-layer perceptron
inputs = 2; outputs = 2; HUs = 20; -- parameters
mlp:add(nn.Linear(inputs, HUs))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(HUs, outputs))

criterion = nn.CrossEntropyCriterion()
trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01
trainer:train(dataset)        

Answers


Alexander Lutsenko February 2016

MSECriterion is designed for regression problems; when it is used for classification, the targets must be one-hot vectors. The cross-entropy and negative log-likelihood criteria are used exclusively for classification, so there is no need to represent the target class as a vector. In Torch, the target for these criteria is just the index of the assigned class (1 to the number of classes). Your dataset passes a 2-element one-hot tensor where the criterion expects such an index, which is what trips the assertion.
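As a rough sketch of that change, the script from the question could build its targets as class indices instead of one-hot tensors. The labelling rule (class 2 when the two signs agree, class 1 otherwise) is copied from the question. This assumes the criterion accepts a plain Lua number as the target for a single example; if your nn version does not, wrapping it as torch.Tensor{target} should be equivalent. Note also that torch.Tensor(2) in the original is uninitialized, so the entry that was never set could hold any value.

```lua
require "torch"
require "nn"

-- XOR-style dataset whose targets are class indices (1 or 2), not one-hot vectors.
local dataset = {}
function dataset:size() return 100 end -- 100 examples

for i = 1, dataset:size() do
  local input = torch.randn(2)
  for j = 1, 2 do
    input[j] = input[j] < 0 and -1 or 1 -- snap each coordinate to -1 or 1
  end
  -- Same rule as the question: class 2 if the signs agree, class 1 otherwise.
  local target = (input[1] * input[2] > 0) and 2 or 1
  dataset[i] = {input, target}
end

local mlp = nn.Sequential()
mlp:add(nn.Linear(2, 20))
mlp:add(nn.Tanh())
mlp:add(nn.Linear(20, 2)) -- raw class scores; the criterion applies LogSoftMax itself

local criterion = nn.CrossEntropyCriterion()
local trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01
trainer:train(dataset)
```

The network still has two outputs, one score per class; only the target format changes.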
