OmniParser V2

    Turn any LLM into a Computer Use Agent


    Description

    OmniParser "tokenizes" UI screenshots, converting them from pixel space into structured elements that LLMs can interpret. This lets an LLM perform retrieval-based next-action prediction: given the set of parsed interactable elements, it selects the element to act on next.
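
    To make the idea concrete, here is a minimal sketch of how parsed elements might be handed to an LLM and how its answer could be turned back into a click. It does not use OmniParser's actual API or output schema: the `ParsedElement` fields, the normalized bounding-box convention, and the `build_prompt` and `click_point` helpers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedElement:
    """Hypothetical record for one parsed UI element (not OmniParser's schema)."""
    element_id: int
    kind: str            # assumed categories, e.g. "icon" or "text"
    label: str           # short description of the element
    bbox: tuple          # assumed normalized (x1, y1, x2, y2) in [0, 1]
    interactable: bool


def build_prompt(task, elements):
    """List only the interactable elements so the LLM picks one by ID."""
    lines = [f"Task: {task}", "Interactable elements:"]
    for el in elements:
        if el.interactable:
            lines.append(f"[{el.element_id}] {el.kind}: {el.label}")
    lines.append("Reply with the ID of the element to act on next.")
    return "\n".join(lines)


def click_point(element_id, elements, width, height):
    """Map a chosen element ID to the pixel center of its bounding box."""
    by_id = {el.element_id: el for el in elements if el.interactable}
    el = by_id[element_id]  # raises KeyError if the ID is not interactable
    x1, y1, x2, y2 = el.bbox
    return round((x1 + x2) / 2 * width), round((y1 + y2) / 2 * height)


# Hand-written sample data standing in for a parsed screenshot.
elements = [
    ParsedElement(0, "icon", "Settings", (0.25, 0.5, 0.75, 1.0), True),
    ParsedElement(1, "text", "Welcome back", (0.0, 0.0, 0.5, 0.25), False),
]

prompt = build_prompt("Open settings", elements)
# Suppose the LLM replies with ID 0; resolve it on a 1920x1080 screen.
point = click_point(0, elements, 1920, 1080)
```

    Because the LLM only has to choose an ID from a short list, next-action prediction becomes a retrieval problem over the parsed elements rather than a guess at raw pixel coordinates.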
