Figure 5 | Robotics and Biomimetics

Figure 5

From: Learning search polices from humans in a partially observable context

Overview of the decision loop. At the top, a strategy is chosen given an initial belief p(x0|z0) over the location of the end effector (initially by sampling the conditional). A speed is then applied along the chosen direction, based on the believed distance to the goal. The resulting velocity is passed on to a low-level impedance controller, which sends out the required torques. The resulting sensation, encoded as a multinomial distribution over the environment features, and the actual displacement are fed back to update the belief.
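The loop in the caption can be illustrated with a minimal sketch, assuming a one-dimensional grid world and a discrete Bayes filter. Everything named here (FEATURES, GOAL, P_CORRECT, the function names, the uniform initial belief, and the "head towards the goal from the believed mean" rule) is a hypothetical stand-in chosen for illustration; it is not the article's learned policy or its impedance controller.

```python
# Hypothetical 1-D world: each cell emits one discrete feature
# (1 = wall contact, 0 = free space). All values are illustrative assumptions.
FEATURES = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
GOAL = 7          # assumed goal cell
P_CORRECT = 0.9   # assumed probability that the sensed feature matches the cell


def normalize(belief):
    total = sum(belief)
    return [p / total for p in belief]


def choose_direction(belief):
    """Stand-in policy: head from the believed mean position towards the goal."""
    mean = sum(i * p for i, p in enumerate(belief))
    direction = 1 if GOAL > mean else -1
    return direction, abs(GOAL - mean)


def speed_from_distance(dist, max_step=2):
    """Speed grows with the believed distance to the goal, capped at max_step."""
    return max(1, min(max_step, round(dist / 2)))


def low_level_controller(true_pos, velocity):
    """Stand-in for the impedance controller: the walls clamp the motion,
    so the actual displacement can differ from the commanded velocity."""
    new_pos = min(len(FEATURES) - 1, max(0, true_pos + velocity))
    return new_pos, new_pos - true_pos


def motion_update(belief, displacement):
    """Shift the belief by the actual displacement; mass piles up at the walls."""
    n = len(belief)
    shifted = [0.0] * n
    for i, p in enumerate(belief):
        j = min(n - 1, max(0, i + displacement))
        shifted[j] += p
    return shifted


def measurement_update(belief, sensed):
    """Reweight each cell by the likelihood of the sensed feature."""
    weighted = [
        (P_CORRECT if FEATURES[i] == sensed else 1.0 - P_CORRECT) * p
        for i, p in enumerate(belief)
    ]
    return normalize(weighted)


def decision_loop(true_pos, steps=20):
    n = len(FEATURES)
    belief = [1.0 / n] * n  # assumed uniform initial belief
    for _ in range(steps):
        direction, dist = choose_direction(belief)
        if dist < 0.5:  # believed to be at the goal
            break
        velocity = direction * speed_from_distance(dist)
        true_pos, displacement = low_level_controller(true_pos, velocity)
        sensed = FEATURES[true_pos]  # noise-free sensing, for simplicity
        belief = motion_update(belief, displacement)
        belief = measurement_update(belief, sensed)
    return true_pos, belief


if __name__ == "__main__":
    final_pos, final_belief = decision_loop(true_pos=2)
    print(final_pos, [round(p, 3) for p in final_belief])
```

The sketch keeps the same order of operations as the caption: choose a direction from the belief, scale the speed by the believed distance, let the low-level controller produce the actual displacement, then feed the displacement and the sensed feature back into the belief.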
