Chapter 06 DQN pong training #77

moayad-hsn · 2020-06-23T08:48:59Z

Hi,
I ran into this error while running the code that trains the DQN agent on Pong:
8589: done 9 games, mean reward -20.444, eps 0.91, speed 124.21 f/s
9518: done 10 games, mean reward -20.400, eps 0.90, speed 121.48 f/s
Traceback (most recent call last):
  File "02_dqn_pong.py", line 169, in <module>
    loss_t = calc_loss(batch, net, tgt_net, device=device)
  File "02_dqn_pong.py", line 96, in calc_loss
    state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
RuntimeError: index 17179869185 is out of bounds for dimension 1 with size 6

I would like to know the reason for this indexing error. It happens when I start training the network, and I have no idea what its cause is.

ImGonnaDans · 2020-06-23T09:36:23Z

No action index is that large. Valid actions range from 0 to env.action_space.n - 1 (env.action_space.n is 6 on Pong, so the indices run from 0 to 5). I suggest checking the actions_v tensor to make sure it really is the action array you intend to pass to gather.
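One possible reading of the huge index, not confirmed anywhere in this thread: 17179869185 equals 1 + 4 * 2**32, which is what two ordinary 32-bit action values (1 and 4) look like when their bytes are read as a single 64-bit integer. The sketch below only demonstrates that arithmetic with the standard library. The action values 1 and 4 are an assumption chosen to match the number in the error message.

```python
import struct

# Two plausible Pong actions stored as 32-bit little-endian integers.
# The values 1 and 4 are assumed for illustration; they are not taken from the run.
raw = struct.pack("<ii", 1, 4)

# The same 8 bytes read back as one 64-bit little-endian integer.
(as_int64,) = struct.unpack("<q", raw)

print(as_int64)  # 17179869185
```

If this is what happened, the actions themselves would be fine and the problem would lie in the integer type of the index array rather than in its contents.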

DeanReznick · 2020-07-18T22:20:45Z

Hi, guys,

This error also appears when I use the CPU instead of the GPU.
If I use the GPU, the following error appears:

Traceback (most recent call last):
  File "...Chapter06/02_dqn_pong.py", line 176, in <module>
    loss_t = calc_loss(batch, net, tgt_net, device=device)
  File "...Chapter06/02_dqn_pong.py", line 97, in calc_loss
    state_action_values = net(states_v).gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #3 'index' in call to _th_gather_out

If I use '.long()', the code runs, but the speed decreases massively.

And:
print(actions_v.shape) -> torch.Size([32])
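Since gather expects an int64 (Long) index, one option is to request that dtype once, when the action tensor is built, and to check the index range before indexing. This is a minimal sketch under stated assumptions: the arrays `actions` and `q_values` are hypothetical stand-ins for the replay-buffer batch and the network output, not the code from 02_dqn_pong.py.

```python
import numpy as np
import torch

n_actions = 6    # number of Pong actions, as stated in this thread
batch_size = 32  # matches the actions_v shape printed above

# Hypothetical stand-ins for a sampled batch and the network's Q-values.
actions = np.array([1, 4, 0, 5] * 8, dtype=np.int32)  # int32 on purpose
q_values = torch.arange(batch_size * n_actions, dtype=torch.float32)
q_values = q_values.view(batch_size, n_actions)

# Build the index tensor as int64, the dtype gather requires.
actions_v = torch.as_tensor(actions, dtype=torch.int64)

# Sanity checks before indexing: dtype and valid action range.
assert actions_v.dtype == torch.int64
assert 0 <= int(actions_v.min()) and int(actions_v.max()) < n_actions

# Same indexing pattern as the line in the traceback.
state_action_values = q_values.gather(1, actions_v.unsqueeze(-1)).squeeze(-1)
print(state_action_values.shape)
```

Whether doing the conversion at this point changes the slowdown reported above is not something this sketch can confirm.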
