Weird behavior from trained Connect-4 agent #23

Open
tonberry22 opened this issue Sep 3, 2018 · 1 comment

Comments

@tonberry22

Hi there, has anyone else run into this problem? I trained an agent for a couple of days using the high-quality settings (higher self-play and simulation parameters, etc.). When I play test games against it, I sometimes fail to block its win (it already has 3-in-a-row on a diagonal) and play somewhere else; the agent then ignores its own win and also plays elsewhere. This can go on for more than a couple of moves, and it never takes the win. I am loading the weights from my model and using act() to get the agent's moves, with tau set to 0 so that it acts deterministically.
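For context on the tau setting, here is a minimal sketch of how AlphaZero-style agents commonly turn MCTS visit counts into a move. It assumes act() follows that convention, which has not been checked against this repo; `choose_action` and `visit_counts` are hypothetical names used only for illustration.

```python
import random
from typing import Sequence


def choose_action(visit_counts: Sequence[int], tau: float) -> int:
    """Pick an action index from MCTS visit counts at temperature tau.

    Hypothetical helper, not this repo's act(): it only illustrates the
    usual AlphaZero-style convention.
    """
    actions = range(len(visit_counts))
    if tau == 0:
        # Deterministic: take the most-visited action (first one on ties).
        return max(actions, key=lambda a: visit_counts[a])
    # Stochastic: sample with probability proportional to N ** (1 / tau).
    weights = [n ** (1.0 / tau) for n in visit_counts]
    return random.choices(actions, weights=weights, k=1)[0]
```

If act() does work like this, then with tau set to 0 a skipped win is not sampling noise: it means the search itself spent fewer simulations on the winning column than on some other move.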

Is this a problem with the code, or can it be explained in terms of exploitation vs. exploration? That is, is the agent confused in such situations because it has never explored that avenue, since it always blocks 3-in-a-rows when given the opportunity? Is there any way to discourage this behavior, apart from hard-coding a 'win-lose check' that prioritizes completing its own 3-in-a-rows first?
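To make the 'win-lose check' concrete, this is roughly what I have in mind. It is only a sketch under my own assumptions: a standard 6x7 board stored as a list of rows (row 0 at the top), with 0 for empty and 1 / -1 for the two players. The names `drop_row`, `wins_at`, and `forced_move` are made up for this example and are not part of the repo.

```python
from typing import List, Optional

ROWS, COLS = 6, 7
Board = List[List[int]]  # board[row][col]; row 0 is the top; 0 empty, 1 / -1 players


def drop_row(board: Board, col: int) -> Optional[int]:
    """Return the row a piece would land in for this column, or None if it is full."""
    for row in range(ROWS - 1, -1, -1):
        if board[row][col] == 0:
            return row
    return None


def wins_at(board: Board, row: int, col: int, player: int) -> bool:
    """Check whether a piece of `player` placed at (row, col) makes four in a line."""
    # Directions: horizontal, vertical, and the two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1  # the hypothetical piece at (row, col)
        for sign in (1, -1):
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False


def forced_move(board: Board, player: int) -> Optional[int]:
    """Return a column that wins now, else one that blocks a loss, else None."""
    # Check our own immediate wins first, then the opponent's.
    for who in (player, -player):
        for col in range(COLS):
            row = drop_row(board, col)
            if row is not None and wins_at(board, row, col, who):
                return col
    return None
```

The idea would be to call `forced_move` before asking the agent for a move and to fall back to act() only when it returns None. That fixes the symptom at play time but does not explain why the trained agent misses the win in the first place.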

@xuehy

xuehy commented Dec 23, 2018

I have seen similar cases in my experiments, so I am eager to know: has anyone succeeded in training a model that consistently beats humans?
