seems like dqn use_dueling = True is broken #4

Open
jt70 opened this issue Oct 6, 2023 · 2 comments

Comments

jt70 commented Oct 6, 2023

I changed the parameters in examples/dqn.py to the following, and I get an error:

def main():
    env_name = 'CartPole-v1'
    # env_name = 'PongNoFrameskip-v4'
    use_prioritization = True
    use_double = False
    use_dueling = True
    # use_dueling = False
    # use_atari = True
    use_atari = False
Traceback (most recent call last):
  File "/home/jason/github_projects/ProtoRL/protorl/examples/dqn.py", line 41, in <module>
    main()
  File "/home/jason/github_projects/ProtoRL/protorl/examples/dqn.py", line 37, in main
    scores, steps_array = ep_loop.run(n_games)
  File "/home/jason/github_projects/ProtoRL/protorl/loops/single.py", line 34, in run
    self.agent.update()
  File "/home/jason/github_projects/ProtoRL/protorl/agents/dqn.py", line 46, in update
    q_pred = self.q_eval.forward(states)[indices, actions]
TypeError: tuple indices must be integers or slices, not tuple
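
For context, the error suggests that with use_dueling = True the network's forward() returns a tuple (separate value and advantage streams) rather than a single Q tensor, so the plain DQN agent's indexing in update() fails. A minimal, self-contained sketch that reproduces the same TypeError (a hypothetical DuelingNetwork for illustration, not ProtoRL's actual class):

import torch
import torch.nn as nn

class DuelingNetwork(nn.Module):
    # Hypothetical stand-in for a dueling Q-network; not ProtoRL's implementation.
    def __init__(self, input_dims=4, n_actions=2):
        super().__init__()
        self.fc = nn.Linear(input_dims, 32)
        self.V = nn.Linear(32, 1)          # state-value stream
        self.A = nn.Linear(32, n_actions)  # advantage stream

    def forward(self, state):
        x = torch.relu(self.fc(state))
        return self.V(x), self.A(x)        # a tuple, not a single Q tensor

states = torch.zeros(64, 4)
indices = torch.arange(64)
actions = torch.zeros(64, dtype=torch.long)

q_out = DuelingNetwork()(states)
q_pred = q_out[indices, actions]  # TypeError: tuple indices must be integers or slices, not tuple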
AlejoCarpentier007 commented

The same thing happens to me. Honestly, I can't find the cause; I'll have to check thoroughly to see where the problem is.


AlejoCarpentier007 commented Jun 26, 2024

@jt70 After some searching and tinkering with the framework I found the solution. To use dueling you have to change the learner, the actor, and the agent: each of those folders contains both a dqn.py and a dueling module, and you need to import the dueling versions. It took me a while to spot the problem because the first thing I did was change the learner, without realizing I also had to change the actor and the agent; if you don't change all three, it fails in the learner's update function. Also, if you use dueling you have to set the dueling parameter to True in the examples' dqn.py file, otherwise it will give an error as well.

from protorl.agents.dueling import DuelingDQNAgent as Agent
from protorl.actor.dueling import DuelingDQNActor as Actor
from protorl.learner.dueling import DuelingDQNLearner as Learner
from protorl.loops.single import EpisodeLoop
from protorl.policies.epsilon_greedy import EpsilonGreedyPolicy
from protorl.utils.network_utils import make_dqn_networks
from protorl.wrappers.common import make_env
from protorl.memory.generic import initialize_memory


def main():
    env_name = 'CartPole-v1'
    # env_name = 'PongNoFrameskip-v4'
    use_prioritization = True
    use_double = True
    use_dueling = True
    use_atari = False
    layers = [32]
    env = make_env(env_name, use_atari=use_atari)
    n_games = 1500
    bs = 64
    # 0.3, 0.5 works okay for cartpole
    # 0.25, 0.25 doesn't seem to work
    # 0.25, 0.75 doesn't work
    memory = initialize_memory(max_size=100_000,
                               obs_shape=env.observation_space.shape,
                               batch_size=bs,
                               n_actions=env.action_space.n,
                               action_space='discrete',
                               prioritized=use_prioritization,
                               alpha=0.3,
                               beta=0.5)

    policy = EpsilonGreedyPolicy(n_actions=env.action_space.n, eps_dec=1e-4)

    q_eval, q_target = make_dqn_networks(env, hidden_layers=layers,
                                         use_double=use_double,
                                         use_dueling=use_dueling,
                                         use_atari=use_atari)
    dqn_actor = Actor(q_eval, q_target, policy)

    q_eval, q_target = make_dqn_networks(env, hidden_layers=layers,
                                         use_double=use_double,
                                         use_dueling=use_dueling,
                                         use_atari=use_atari)
    dqn_learner = Learner(q_eval, q_target, use_double=use_double,
                          prioritized=use_prioritization, lr=1e-4)

    agent = Agent(dqn_actor, dqn_learner, prioritized=use_prioritization)

    sample_mode = 'prioritized' if use_prioritization else 'uniform'
    ep_loop = EpisodeLoop(agent, env, memory, sample_mode=sample_mode,
                          prioritized=use_prioritization)
    scores, steps_array = ep_loop.run(n_games)


if __name__ == '__main__':
    main()
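
For what it's worth, the reason the plain learner and agent break is that a dueling network hands back the value and advantage streams separately, and a dueling learner has to combine them into Q-values before indexing. Roughly (an assumption about the aggregation step, following the dueling DQN paper, not ProtoRL's exact code):

import torch

# Assumption: forward() returns (value, advantage); combine them before indexing.
batch_size, n_actions = 64, 2
V_s = torch.zeros(batch_size, 1)          # value stream, shape (batch, 1)
A_s = torch.zeros(batch_size, n_actions)  # advantage stream, shape (batch, n_actions)
indices = torch.arange(batch_size)
actions = torch.zeros(batch_size, dtype=torch.long)

# Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
q_values = V_s + (A_s - A_s.mean(dim=1, keepdim=True))
q_pred = q_values[indices, actions]       # indexing now works on a single Q tensor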

philtabor added a commit that referenced this issue Jul 27, 2024