SAMEER MAURYA
Decoding the Complexities of Compositional Generalization in Web Automation with Language Model Agents

This blog post analyzes how language model agents (LMAs) perform on complex web automation tasks. It focuses on CompWoB, a new benchmark of 50 compositional tasks. Advanced models such as GPT-3.5-Turbo and GPT-4 perform well on simpler tasks, but their performance drops significantly on compositional ones. One model, HTML-T5++, stands out, exceeding human-level performance in some cases. The study exposes the limits of LMAs in handling task complexity and varying instruction sequences, and it underscores the need for better adaptability and generalization before these agents are ready for practical use.
https://www.linkedin.com/pulse/decoding-complexities-compositional-generalization-web-sameer-maurya-y7ryf/?trackingId=cJizL%2FOeERXqFp%2B71Prh9g%3D%3D