@titu1994
Hello,
When I use your code, the random-action branch and the controller-prediction branch of the get_action function return differently encoded results.
def get_action(self, state):
    '''
    Gets a one hot encoded action list, either from random sampling or from
    the Controller RNN
    Args:
        state: a list of one hot encoded states, whose first value is used as initial
            state for the controller RNN
    Returns:
        A one hot encoded action list
    '''
    if np.random.random() < self.exploration:
        print("Generating random action to explore")
        actions = []
        for i in range(self.state_size * self.num_layers):
            state_ = self.state_space[i]
            size = state_['size']
            sample = np.random.choice(size, size=1)
            sample = state_['index_map_'][sample[0]]
            action = self.state_space.embedding_encode(i, sample)
            actions.append(action)
        return actions
    else:
        print("Prediction action from Controller")
        initial_state = self.state_space[0]
        size = initial_state['size']
        if state[0].shape != (1, size):
            state = state[0].reshape((1, size)).astype('int32')
        else:
            state = state[0]
        print("State input to Controller for Action : ", state.flatten())
        with self.policy_session.as_default():
            K.set_session(self.policy_session)
            with tf.name_scope('action_prediction'):
                pred_actions = self.policy_session.run(self.policy_actions, feed_dict={self.state_input: state})
            return pred_actions
The random branch returns vectors of the form [0, 1 + index, 0, ...], where the non-zero entry holds the index number plus one.
The prediction branch, however, returns vectors of the form [0, 1, 0, ...], which is a one-hot encoding.
Is this intentional, or is it a mistake?
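To make the difference concrete, here is a minimal sketch of the two encodings as I observe them. It is an illustration only: the helper names `one_hot`, `index_scaled`, and `to_one_hot` are hypothetical and not part of the repository, and `index_scaled` only mimics the output I see from the random branch. It does not reproduce `embedding_encode` itself.

```python
import numpy as np


def one_hot(index, size):
    """Return a (1, size) vector with a 1 at position `index`."""
    vec = np.zeros((1, size), dtype=np.int32)
    vec[0, index] = 1
    return vec


def index_scaled(index, size):
    """Hypothetical: mimic the observed random-branch output,
    a (1, size) vector holding `index + 1` at position `index`."""
    vec = np.zeros((1, size), dtype=np.int32)
    vec[0, index] = index + 1
    return vec


def to_one_hot(vec):
    """Map every non-zero entry to 1, leaving zeros untouched."""
    return (np.asarray(vec) != 0).astype(np.int32)


size = 4
random_branch = index_scaled(2, size)    # shaped like the random-branch result
predicted_branch = one_hot(2, size)      # shaped like the controller result

print(random_branch)                     # [[0 0 3 0]]
print(predicted_branch)                  # [[0 0 1 0]]
print(to_one_hot(random_branch))         # [[0 0 1 0]]
```

If the two branches are meant to agree, something like `to_one_hot` applied to the random-branch output would bring them into the same form. I do not know whether that is the intended behavior.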
I look forward to your answer.
Thanks.