Fig. 7
From: Selective network discovery via deep reinforcement learning on embedded spaces

Illustration of how the quality of the node embedding affects the quality of the policy and reward functions: (a) highly uncertain reward and policy functions; (b) highly concentrated reward and policy functions. Ideally, our policy is distributed around the best action with minimal variance, so (b) is preferred.