High-confidence error estimates for learned value functions.

[1808.09127] High-confidence error estimates for learned value functions

Aug 28, 2018 · In this paper, we address the largely open problem of how to obtain these high-confidence estimates, for general state-spaces.

[PDF] High-confidence error estimates for learned value functions

auai.org › proceedings › papers

Estimating the value function for a fixed policy is a fundamental problem in reinforcement learning. Policy evaluation algorithms ...

High-confidence error estimates for learned value functions

www.semanticscholar.org › paper › High...

A high-confidence bound on an empirical estimate of the value error to the true value error is provided, which is used to design an offline sampling ...

People's Hypercorrection of High Confidence Errors: Did They Know ...

www.ncbi.nlm.nih.gov › PMC3079415

The hypercorrection effect refers to the finding that when given corrective feedback, errors that are committed with high confidence are easier to correct than ...

Confidence Interval: The Dance of Predictions and Uncertainty - Medium

medium.com › ...

Feb 17, 2024 · A confidence interval is a method that computes an upper and a lower bound around an estimated value. The actual parameter value is either inside or outside ...

Hypercorrection of high confidence errors in children - ScienceDirect.com

www.sciencedirect.com › article › abs › pii

Three experiments investigated whether the hypercorrection effect – the finding that errors committed with high confidence are easier, rather than more ...

The number of sampled returns to obtain high-confidence estimates ...

www.researchgate.net › figure › The-nu...

... High-confidence error estimates for learned value functions | Estimating the value function for a fixed policy is a fundamental problem in reinforcement ...

Confidence prediction errors drive value-based learning in the ... - PLOS

journals.plos.org › ploscompbiol › article

Our results provide evidence that confidence-based learning signals affect instrumentally learned subjective values in the absence of external feedback.

Robust Losses for Learning Value Functions - IEEE Xplore

ieeexplore.ieee.org › iel7

Abstract—Most value function learning algorithms in reinforcement learning are based on the mean squared (projected) Bellman error.

[PDF] Defining Admissible Rewards for High-Confidence Policy Evaluation in ...

finale.seas.harvard.edu › finale › files

Apr 4, 2020 · In this paper, we combine these two conditions to construct tests for admissible functions in reward design using available data. This yields a ...