Neural temporal-difference learning converges to global optima

Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

Research output: Contribution to journal, Conference article, peer-reviewed

43 Scopus citations
