Zero reward in Overcooked environment regardless of algorithm/length of training #238

Open
promiseve opened this issue May 31, 2024 · 0 comments


Hi,
I added the Overcooked env as described here: https://marllib.readthedocs.io/en/latest/handbook/env.html#id64.
I then ran the script described in https://marllib.readthedocs.io/en/latest/handbook/quick_start.html#id11, modified for the Overcooked environment.
Script:

```python
from marllib import marl

# prepare env
env = marl.make_env(environment_name="overcooked", map_name="asymmetric_advantages")

# initialize algorithm with appointed hyper-parameters
vdn = marl.algos.vdn(hyperparam_source="common")

# build agent model based on env + algorithms + user preference
model = marl.build_model(env, vdn, {"core_arch": "mlp", "encode_layer": "128-256"})

# start training
vdn.fit(env, model, stop={"timesteps_total": 1000000}, checkpoint_freq=100, share_policy="group", checkpoint_end=True)

# render
# mappo.render(env, model, local_mode=True)
```

I get a reward of 0.0 no matter how long I train or which algorithm I use. So far I have tried MAPPO and VDN; when I train on the MPE environment, the reward does change.
I would appreciate any ideas or suggestions.
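
To help narrow down whether the zero reward comes from the MARLlib side or from the Overcooked environment itself, my plan is to roll out random joint actions directly through overcooked_ai_py and check whether the sparse reward is ever non-zero. This is only a rough sketch; the class and method names below are from the overcooked_ai_py package and may differ between versions:

```python
# Sanity check: step the base Overcooked environment with random joint actions
# and see whether the sparse (soup delivery) reward is ever non-zero.
import random

from overcooked_ai_py.mdp.actions import Action
from overcooked_ai_py.mdp.overcooked_mdp import OvercookedGridworld
from overcooked_ai_py.mdp.overcooked_env import OvercookedEnv

mdp = OvercookedGridworld.from_layout_name("asymmetric_advantages")
env = OvercookedEnv.from_mdp(mdp, horizon=400)

env.reset()
total_sparse_reward = 0
done = False
while not done:
    # one random action per player
    joint_action = (random.choice(Action.ALL_ACTIONS),
                    random.choice(Action.ALL_ACTIONS))
    _state, reward, done, info = env.step(joint_action)
    total_sparse_reward += reward

print("sparse reward over one random episode:", total_sparse_reward)
```

With purely random actions the sparse delivery reward will usually be zero on this layout, so it may also be worth inspecting the shaped reward reported in the step `info` to see whether the agents are getting any learning signal at all.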
