Skip to main content
Figure 5 | Robotics and Biomimetics

Figure 5

From: Learning by imitation with the STIFF-FLOP surgical robot: a biomimetic approach inspired by octopus movements

Figure 5

An example of reward-weighted regression. (From left to right) Subset of iterations during the refinement algorithm. At each iteration, a new Gaussian distribution, depicted by the green ellipse, is fit to the most promising augmented dataset (policy parameters and goal). A regression-based exploration strategy is then used in the augmented-space ζ to iteratively find better policies to achieve the goal. For the regression, we assume that we know the desired goal (blue vertical bar ζ I ∗) but we do not know how to achieve it (namely, the input of the regression is the desired goal).

Back to article page