LESSWRONG
royf
Posts
Update Then Forget
5 years ago
royf
9
points
Followup to: How to Be Oversurprised. A Bayesian update needs never lose information. In a dynamic world, though, the update is only half t
Comments
11
How to Be Oversurprised
5 years ago
royf
13
points
Followup to: How to Disentangle the Past and the Future. Some agents are memoryless, reacting to each new observation as it happens, withou
Comments
14
How to Disentangle the Past and the Future
5 years ago
royf
12
points
I'm on my way to an important meeting. Am I worried? I'm not worried. The presentation is on my laptop. I distinctly remember putting it the
Comments
13
Point-Based Value Iteration
5 years ago
royf
9
points
Followup to: The Bayesian Agent. This post explains one interesting and influential algorithm for achieving high utility of the actions of
Comments
0
Internal Availability
5 years ago
royf
2
points
Edit: Following mixed reception, I decided to split this part out of the latest post in my sequence on reinforcement learning. It wasn't cle
Comments
8
The Bayesian Agent
5 years ago
royf
11
points
Followup to: Reinforcement Learning: A Non-Standard Introduction; Reinforcement, Preference and Utility. A reinforcement-learning agent int
Comments
19
Reinforcement, Preference and Utility
5 years ago
royf
7
points
Followup to: Reinforcement Learning: A Non-Standard Introduction. A reinforcement-learning agent is interacting with its environment throug
Comments
5
Reinforcement Learning: A Non-Standard Introduction (Part 2)
5 years ago
royf
9
points
Followup to: Part 1. In part 1 we modeled the dynamics of an agent and its environment as a turn-based discrete-time process. We now start
Comments
7
Reinforcement Learning: A Non-Standard Introduction (Part 1)
5 years ago
royf
20
points
Imagine that the world is divided into two parts: one we shall call the agent and the rest - its environment. Imagine you could describe in
Comments
19
The Perception-Action Cycle
5 years ago
royf
6
points
Would readers be interested in a sequence of posts offering an intuitive explanation of my underway thesis on the application of information
Comments
12