
My master's research implementation, titled 'Multi-Agent Simulated Environments Generated by a Transformer-Based Generative Model'.

License

BSD-2-Clause: LICENSE-LPIPS
Unknown: LICENSE-NVIDIA

Multi-Agent Simulated Environments Generated by a Transformer-Based Generative Model

[Demo GIFs: generated_pong_2agnts, generated_pong_4agnts, generated_boxing, generated_gtav, generated_carla]
Paper Link
Link to trained weights

Important

The trained weights for the multi-agent models were created from agents following random policies. If you want environments that remain accurate across a wider range of situations, include data from agents of various skill levels, such as random, beginner, intermediate, and expert.

Note

Please also refer to the scripts directory for the first and second training phases and for encoding images after the first training phase. The hyperparameters we applied to each dataset are also listed there.

Tip

We used a single GPU with 48 GB of memory for the second training phase. If your GPU has less memory, reduce the batch size.

Build the Environment

docker build -t [Docker image name] .
docker run --rm -it --ipc=host --gpus '"device=[device id(s)]"' -v $(pwd):/work [Docker image name]:latest
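
As a concrete sketch, assuming the image is tagged env-gen and GPU 0 is used (both names are arbitrary choices, not fixed by the repo):

```shell
# Build the image from the repository's Dockerfile (tag name is arbitrary)
docker build -t env-gen .

# Run an interactive container on GPU 0, mounting the repo at /work
docker run --rm -it --ipc=host --gpus '"device=0"' -v $(pwd):/work env-gen:latest
```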

Supported Datasets

Choose a dataset name among boxing, pong, carla, or gtav.
For the image size, specify 64x64 for boxing, pong, or carla, and 48x80 for gtav.

Data Creation

python data/multi_thread_processing.py --dataset [dataset name] --num_eps 1500 --data_dir datasets --num_threads [number of threads] --num_agents [number of agents]

Then create a file to split the dataset into train, validation, and test datasets.

python data/data_split.py --datapath [dataset path]
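
For example, a full data-preparation pass for 2-agent pong might look like the following; the thread count and the datasets/pong output path are illustrative assumptions:

```shell
# Generate 1500 episodes of 2-agent pong using 8 worker threads
python data/multi_thread_processing.py --dataset pong --num_eps 1500 \
    --data_dir datasets --num_threads 8 --num_agents 2

# Split the generated episodes into train/validation/test sets
# (the exact dataset path depends on where the previous step wrote its output)
python data/data_split.py --datapath datasets/pong
```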

Encoder and Decoder Training

python enc_dec_training.py --log_dir [log path] --use_perceptual_loss --batch [batch size] --data_path [dataset path] --dataset [dataset name]
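
A hypothetical invocation for the pong dataset, assuming the data lives under datasets/pong and a batch size of 32 fits in GPU memory (both are assumptions, not the paper's settings):

```shell
python enc_dec_training.py --log_dir logs/pong_encdec --use_perceptual_loss \
    --batch 32 --data_path datasets/pong --dataset pong
```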

Preparation for Transition Learner Training by Encoding Images into Latent Vectors

python latent_encoder.py --ckpt [checkpoint path] --results_path [output path for encoded dataset] --data_path [dataset path] --dataset [dataset name] --img_size [image size (heightxwidth)]

For debugging, pass --visualize 1 and --vis_bs [batch size for visualization] to inspect the images after encoding and decoding have been applied.
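
For instance, encoding the 64x64 pong dataset might look like this; the checkpoint and output paths are illustrative, and the visualization flags are optional:

```shell
# Encode the pong dataset (64x64 images) into latent vectors;
# checkpoint name and paths are assumptions for illustration
python latent_encoder.py --ckpt logs/pong_encdec/ckpt.pt \
    --results_path datasets/pong_latents --data_path datasets/pong \
    --dataset pong --img_size 64x64 \
    --visualize 1 --vis_bs 8   # optional: dump decoded images for debugging
```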

Transition Learner Training

For the action_space argument, specify 4 for pong, 6 for boxing, 2 for carla, and 3 for gtav.
There are two ways to train the Transition Learner.
[1] Train auto-regressively.

python trans_learner_training_ar.py --batch_size [batch size] --data_dir [dataset directory path] --num_workers [number of data processing workers] --max_seq_len [maximum sequential length for each visual and actions]
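
A hypothetical auto-regressive run, assuming the encoded dataset from the previous step was written to datasets/pong_latents (all numeric values are illustrative, not the paper's hyperparameters):

```shell
python trans_learner_training_ar.py --batch_size 8 --data_dir datasets/pong_latents \
    --num_workers 4 --max_seq_len 32
```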

[2] Train with GAN.

python trans_learner_training_gan.py --batch_size [batch size] --data_dir [dataset directory path] --num_workers [number of data processing workers] --max_seq_len [maximum sequential length for each emb of visual frames and actions] --num_transenclayer [number of transformer layers] --attn_mask_type [attention mask type] --dataset [dataset name] --action_space [number of action space]
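
A sketch of the GAN variant for 2-agent pong (action_space 4); every value here is illustrative, and the mask-type string is a guess, so check the scripts directory for the options and hyperparameters actually used:

```shell
# GAN-based Transition Learner training for pong;
# numeric values and the mask type are assumptions, not the paper's settings
python trans_learner_training_gan.py --batch_size 4 --data_dir datasets/pong_latents \
    --num_workers 4 --max_seq_len 32 --num_transenclayer 6 \
    --attn_mask_type causal --dataset pong --action_space 4
```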

Simulator Execution

sudo is required to run the keyboard module. Use an image located under the init_imgs directory as the initial image required to run the simulator.
The following are the controls for all environments supported by this repo.

  • [GTAⅤ]
    Left: a, Right: d
  • [Pong (2 agents)]
    1st agent: Fire: w, Left: a, Right: d
    2nd agent: Fire: i, Left: j, Right: l
  • [Pong (4 agents)]
    1st agent: Fire: w, Left: a, Right: d
    2nd agent: Fire: t, Left: h, Right: f
    3rd agent: Fire: i, Left: j, Right: l
    4th agent: Fire: s, Left: z, Right: c
  • [Boxing]
    1st agent: Fire: e, Left: a, Right: d, Up: w, Down: s
    2nd agent: Fire: u, Left: j, Right: l, Up: i, Down: k

Press q or Ctrl + C on the terminal to quit the environment you are playing.

Important

The trained weights for the multi-agent models were created from agents following random policies, so the generated results may be inconsistent if the inputs to the agents are not random.

Tip

The pong environment's transitions are rather slow by default. If you find them slow as well, pass --fps 60 for a more challenging transition speed.

sudo python3 simulator.py --encdec_ckpt [encoder decoder checkpoint path] --trans_ckpt [transition learner checkpoint path] --init_img_path [initial image path]
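
For example, launching the pong simulator might look like the following; the checkpoint paths and the initial-image filename are assumptions for illustration:

```shell
# Run the simulator with hypothetical checkpoint and image paths;
# pick the matching initial image from init_imgs for your environment
sudo python3 simulator.py --encdec_ckpt logs/pong_encdec/ckpt.pt \
    --trans_ckpt logs/pong_trans/ckpt.pt \
    --init_img_path init_imgs/pong.png
```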

References
