Dipendra Misra edited this page Dec 25, 2018 · 2 revisions

Overview

The Cornell Instruction Following Framework (CIFF) consists of datasets, simulators and source code for running experiments.

We provide four datasets with CIFF: Blocks (Bisk et al. NAACL 2016), LANI (Misra et al. EMNLP 2018), CHAI (Misra et al. EMNLP 2018) and Touchdown (Chen et al. arXiv 2018). Blocks, LANI and CHAI come with simulators, which are Unity3D executables that generate visual observations among other functionality.

The source code consists of environment dependent packages, core environment agnostic packages and a set of json files for setting hyperparameters (constants*.json) and environment functionalities (config*.json). The source code launches the simulators and communicates with them over sockets. In most cases, if everything is set up properly, you only need to set the PYTHONENV and run everything from a single terminal. We detail the three types of files in the source code below and provide more details on other pages (please use the sidebar to navigate).
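To make the socket communication concrete, here is a minimal sketch of a client exchanging one message with a stand-in simulator on a port chosen by the operating system. It is not CIFF's actual code: the names find_free_port, fake_simulator, send_action and demo_round_trip, and the "ack:" message format, are all assumptions made for illustration, and the real simulators are Unity3D executables rather than Python threads.

```python
import socket
import threading


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def fake_simulator(server_sock):
    """Hypothetical stand-in for a simulator: accept one client and acknowledge its message."""
    conn, _ = server_sock.accept()
    with conn:
        message = conn.recv(1024)
        conn.sendall(b"ack:" + message)


def send_action(host, port, action):
    """Connect to the simulator, send one action string and return the reply."""
    with socket.create_connection((host, port)) as client:
        client.sendall(action.encode("utf-8"))
        return client.recv(1024).decode("utf-8")


def demo_round_trip(host="127.0.0.1"):
    """Start the stand-in simulator on an OS-chosen port and exchange one message."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind((host, 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    thread = threading.Thread(target=fake_simulator, args=(server_sock,))
    thread.start()
    reply = send_action(host, port, "forward")
    thread.join()
    server_sock.close()
    return reply
```

Calling demo_round_trip() sends the string "forward" and returns the stand-in simulator's acknowledgement. Binding to port 0 is a common way to obtain a free port; whether CIFF's own utilities do the same is not stated here.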

Environment Dependent Packages

Environment dependent packages are those that must be implemented for any new environment you want to add. For most packages, there is an abstract package containing a list of interfaces that you need to implement in order to use the core functionality. The convention in CIFF is that if the abstract package is called package, then the domain dependent package is named package_domain_name. For example, server_blocks is the domain dependent package for the abstract package server. The environment dependent packages consist of the following:

  1. dataset_agreement: The abstract package for dataset_agreement is ./src/dataset_agreement/. It contains interfaces consisting of an abstract view of a data point, action space, dataset parser, dataset loader and metadata util.

  2. server: The abstract package for server is ./src/server/. It contains interfaces for communicating with the simulators and for computing a scalar reward and a metadata dictionary (which includes evaluation metrics among other domain specific data). We assume a reinforcement learning setting for describing this interface.

  3. setup_agreement: The abstract package for setup_agreement is ./src/setup_agreement. It contains a single interface that makes sure the setup files (json files) are properly set, i.e. that all required values such as vocabulary size and image size are set.

  4. experiments: There is no abstract package for experiments because its contents depend on what you want to run. This package is implemented differently for each domain and contains the experiments that you want to run for that domain. For example, see experiments_nav_drone, experiments_house, experiments_blocks or experiments_streetview.
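The split between an abstract package and its domain dependent counterpart can be sketched with the setup check described above. This is only an illustration under stated assumptions: the class names AbstractSetupValidator and BlocksSetupValidator, the helper load_json, and the keys vocab_size, image_height and image_width are hypothetical and are not taken from the CIFF source.

```python
import json
from abc import ABC, abstractmethod


class AbstractSetupValidator(ABC):
    """Hypothetical abstract interface, in the spirit of an abstract package."""

    @abstractmethod
    def required_keys(self):
        """Return the keys that must be present in the setup dictionary."""

    def validate(self, constants):
        """Raise KeyError if any required value is missing, else return True."""
        missing = [key for key in self.required_keys() if key not in constants]
        if missing:
            raise KeyError("Missing required setup values: " + ", ".join(missing))
        return True


class BlocksSetupValidator(AbstractSetupValidator):
    """Hypothetical domain dependent implementation for a Blocks-like domain."""

    def required_keys(self):
        # Assumed key names; the real json files may use different ones.
        return ["vocab_size", "image_height", "image_width"]


def load_json(path):
    """Read a json setup file (for example a constants file) into a dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

A new environment would add its own subclass and list the values it requires. The environment agnostic code can then call validate on whichever implementation it is given without knowing the domain.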

Environment Agnostic Packages

The environment agnostic packages provide the real functionality and are designed to be environment independent provided the interfaces are correctly implemented. There are essentially five high-level packages:

  1. model: This contains many models for representing policies and value functions and for embedding images, text, actions etc. The model package is divided along two axes: incremental vs non-incremental, and models vs modules. A model is a complete policy or value function, whereas a module is a part of a model that embeds images, text, actions etc. In general, a model uses several modules in its definition and combines them to compute the final value. A model can also use a module more than once (for example, a model can have two perceptrons for embedding different quantities). An incremental model/module has a state: it takes the old state as input and outputs a new state along with other quantities. A non-incremental model/module is stateless. For example, LSTM is implemented in CIFF as an incremental module that uses the hidden state and cell memory from the previous time step as its state and outputs the new hidden state and cell memory as the new state.

  2. learning: The learning package contains various learning algorithms including behaviour cloning, policy gradient etc. It contains three main sub-packages:

    1. asynchronous: Runs asynchronous learning by communicating with several simulators and using Hogwild style learning. Use this if you want fast, efficient training.

    2. auxiliary_objective: Contains code for computing auxiliary objectives that can be added to the main loss to help learning.

    3. single_client: Contains code for performing synchronized learning using a single client.

  3. agents: Contains code for representing an agent. For example, it contains a class for representing the observed agent state used for making decisions.

  4. utils: Contains utility files that help with running experiments. Examples include a utility for launching simulators and one for finding free ports for socket connections.

  5. baselines: This package contains some baseline code that is not integrated with the CIFF learning package. This package will be deprecated in the future (see issues).
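The difference between incremental and non-incremental modules described under the model package can be sketched as follows. This is a toy illustration in plain Python, not CIFF's implementation: the class names, the init_state and forward methods, and the exponential moving average used as the stateful update are assumptions that stand in for a real recurrent module such as an LSTM.

```python
class NonIncrementalModule:
    """Stateless: the output depends only on the current input."""

    def __init__(self, scale):
        self.scale = scale

    def forward(self, x):
        return [self.scale * value for value in x]


class IncrementalModule:
    """Stateful: takes the old state and returns an output plus the new state."""

    def __init__(self, decay):
        self.decay = decay

    def init_state(self, size):
        """Initial state before any observation has been seen."""
        return [0.0] * size

    def forward(self, x, state):
        # Toy update (exponential moving average) standing in for a recurrent cell.
        new_state = [
            self.decay * old + (1.0 - self.decay) * value
            for old, value in zip(state, x)
        ]
        output = new_state
        return output, new_state


def run_episode(module, observations):
    """Thread the state of an incremental module through a sequence of inputs."""
    state = module.init_state(len(observations[0]))
    outputs = []
    for x in observations:
        output, state = module.forward(x, state)
        outputs.append(output)
    return outputs
```

The caller of an incremental module carries the state from one time step to the next, as run_episode does, while a non-incremental module can be called on each input independently.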
