List of notes on papers and technical articles about x-Use

x-Use

#Pocket #NLP #Supervised-FineTuning (SFT) #LLMAgent
Issue Date: 2025-06-12 Go-Browse: Training Web Agents with Structured Exploration, Apurva Gandhi+, arXiv25
Comment: Original post: https://x.com/gneubig/status/1932786231542493553?s=46&t=Y6UuIHB0Lv0IpmFAjlc2-Q WebArena: #1849 ...

#ComputerVision #Pocket #NLP #Dataset #LanguageModel #Evaluation #MulltiModal #ICLR
Issue Date: 2025-04-18 AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents, Christopher Rawles+, ICLR25
Comment: A benchmark for Phone Use in an Android environment ...

#Article #ComputerVision #Pocket #NLP #LLMAgent #MulltiModal #Blog #Reasoning #OpenWeight
Issue Date: 2025-04-18 Introducing UI-TARS-1.5, ByteDance, 2025.04
Comment: paper: https://arxiv.org/abs/2501.12326 The post covers a lot, but roughly speaking it presents a Computer Use Agent (CUA) from ByteDance, built on a multimodal LLM that takes images and text as input and outputs text. Related: #1794 Original ...

#Article #LLMAgent #Blog
Issue Date: 2025-03-15 browser-useの基礎理解 (Understanding the Basics of browser-use), むさし, 2024.12
Comment: Official repository: https://github.com/browser-use/browser-use If BrowserUse parses the DOM, does that mean it internally has an LLM process the page as text to generate actions? OpenAI's Computer use generates actions from screenshots ...

#Article #NLP #LanguageModel #LLMAgent #Blog
Issue Date: 2025-03-12 OpenAI API での Computer use の使い方 (How to Use Computer use with the OpenAI API), npaka, 2025.03
Comment: A compact summary of what OpenAI's Computer use is. Very informative. Official: https://platform.openai.com/docs/guides/tools-computer-use ...

#Article #NLP #LLMAgent #python #Blog #API
Issue Date: 2025-01-04 browser-use やばいです (browser-use Is Amazing), Syoitu, 2024.12
Comment: It looks very easy to use, but for crawling, a hallucination would cause real trouble, so I have reservations. ...