This bug was linked to the use of compute_single_action, where the seq_lens and state parameters were empty. This broke the script and prevented us from simulating learned policies. It has since been fixed. The QMIX LSTM model does not appear to suffer from this bug, so it has been left untouched for now.
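As a rough illustration of the pattern involved, here is a minimal sketch of threading recurrent state through repeated compute_single_action calls. It is not the actual fix: DummyRecurrentPolicy and rollout are hypothetical stand-ins written for this note, and only the method names (get_initial_state, and compute_single_action taking a state argument and returning the action together with the updated state) mirror the RLlib Policy interface. The sketch also assumes the failure came from calling a recurrent model without its hidden state, and it leaves seq_lens out entirely.

```python
from typing import List, Optional, Tuple


class DummyRecurrentPolicy:
    """Hypothetical stand-in for a recurrent policy (NOT the RLlib class).

    Its "hidden state" is simply a running sum of the observations seen so far.
    """

    def get_initial_state(self) -> List[float]:
        # A real recurrent policy would return zero-filled hidden-state arrays.
        return [0.0]

    def compute_single_action(
        self, obs: float, state: Optional[List[float]] = None
    ) -> Tuple[int, List[float], dict]:
        if not state:
            # Assumed failure mode: a recurrent model cannot act
            # when it is handed an empty (or missing) state.
            raise ValueError("recurrent policy called with empty state")
        new_state = [state[0] + obs]
        action = 1 if new_state[0] > 0 else 0
        return action, new_state, {}


def rollout(policy, observations):
    """Run the policy over a sequence, feeding each state output back in."""
    state = policy.get_initial_state()
    actions = []
    for obs in observations:
        # Pass the previous state in; keep the returned state for the next step.
        action, state, _ = policy.compute_single_action(obs, state=state)
        actions.append(action)
    return actions, state
```

For example, rollout(DummyRecurrentPolicy(), [1.0, -2.0, 3.0]) returns ([1, 0, 1], [2.0]), while calling compute_single_action with state=[] raises a ValueError. The point of the design is that the caller, not the policy, owns the state between steps, so the simulation loop has to initialise it and pass it back on every call.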