Underlying implementation for explore_adf softmax #4711

@paulusm

Description

I've gone over the docs a few times, but I'm still unclear which bandit evaluation algorithm underlies the explore_adf softmax explorer. In other words, how is the quality of each action predicted?
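
For context, softmax exploration in general turns one score per action into a sampling distribution, with each action's probability proportional to exp(lambda * score). The sketch below illustrates only that generic step. The function name `softmax_probs`, the parameter `lam`, and the example scores are hypothetical, and this is not taken from the Vowpal Wabbit source. Where the scores themselves come from is exactly what this issue asks.

```python
import math


def softmax_probs(scores, lam=1.0):
    """Map per-action scores to probabilities proportional to exp(lam * score).

    Assumption: higher score means a better action. If the scores were
    predicted costs instead, the sign of lam would need to be flipped.
    """
    # Subtract the largest exponent before exponentiating for numerical stability.
    m = max(lam * s for s in scores)
    weights = [math.exp(lam * s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for three actions in one context.
probs = softmax_probs([0.1, 0.5, 0.2], lam=10.0)
```

With `lam=0` every action gets the same probability, and as `lam` grows the distribution concentrates on the highest-scoring action.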

Link to Documentation Page

https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Contextual-Bandit-algorithms

Labels: Documentation
