Google DeepMind and other researchers present two advances for language models. One line of work shows that Transformers can handle input lengths beyond those seen in training, extending test lengths significantly. The other, ReadAgent, increases the effective context length of Large Language Models by segmenting, compressing, and looking up passages as needed, much like human reading behavior.
📢 New preprint: Can transformers extrapolate to input lengths beyond those seen in training? https://t.co/pNqAWYFnPw For the first time, we show that Transformers can handle decimal addition beyond their training lengths, extending test lengths up to 2.5x with the… https://t.co/wdmqsepivn
Excited to share our work (https://t.co/GGKi08Vz65) on reading long documents that far exceed the context window (up to 20x). Inspired by the human reading paradigm, ReadAgent summarizes the input episodically as gist memories and uses them to retrieve relevant details when needed. https://t.co/RTfBMxU6Nd
We propose ReadAgent 📖, an LLM agent that reads and reasons over text up to 20x longer than the raw context length. Like humans, it decides where to pause, keeps fuzzy episodic memories of past readings, and looks up details as needed. Just by prompting. https://t.co/7RuZE7ThQs https://t.co/scTGg8X03n
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts Reads documents the way humans do: segmenting, compressing, and looking up passages when needed, to increase the effective context length for LLMs by 3-20x. 📝https://t.co/aGip6rkf63 👨🏽💻https://t.co/bhozRrxUF8 https://t.co/RegPKnA720
Google presents A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts paper page: https://t.co/TTvVZukXjx Current Large Language Models (LLMs) are not only limited to some maximum context length but are also unable to robustly consume long inputs. To address… https://t.co/QpJi28B0E4
Google presents: A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts proj: https://t.co/2r4VTpq3G9 abs: https://t.co/vajC2kft8r https://t.co/iyJaQwJyMQ
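For readers who want the mechanics, below is a minimal Python sketch of the segment-gist-lookup loop the ReadAgent posts above describe. Everything here is illustrative: `call_llm` is a hypothetical stand-in for any chat-completion API, and the fixed-size segmentation and prompt wording are assumptions, not the paper's exact implementation (the paper has the LLM itself choose pause points).

```python
# Sketch of a ReadAgent-style read loop: segment a long document into
# episodes, compress each into a gist, then answer a query by re-reading
# only the episodes the model asks for. Hypothetical hook, not the paper's API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API here")

def segment(document: str, max_words: int = 600) -> list[str]:
    """Naive fixed-size segmentation; the paper lets the LLM pick pause points."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def gist(episode: str) -> str:
    """Compress one episode into a short 'fuzzy' gist memory."""
    return call_llm(f"Shorten the following passage into a brief gist:\n\n{episode}")

def answer(document: str, question: str) -> str:
    episodes = segment(document)
    memory = "\n".join(f"[{i}] {gist(e)}" for i, e in enumerate(episodes))
    # Ask the model which episodes it needs to re-read in full (the lookup step).
    picked = call_llm(
        f"Gist memory:\n{memory}\n\nQuestion: {question}\n"
        "Which episode numbers should be re-read? Answer with numbers only."
    )
    ids = [int(t) for t in picked.replace(",", " ").split() if t.isdigit()]
    details = "\n\n".join(episodes[i] for i in ids if i < len(episodes))
    # Answer from the gists plus the retrieved verbatim passages.
    return call_llm(
        f"Gist memory:\n{memory}\n\nRelevant passages:\n{details}\n\n"
        f"Answer the question: {question}"
    )
```

The whole loop is plain prompting, which matches the "Just by prompting" claim above: no fine-tuning or external retriever is required in this sketch.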
Google DeepMind presents Transformers Can Achieve Length Generalization But Not Robustly paper page: https://t.co/RZPUz5j0jH Length generalization, defined as the ability to extrapolate from shorter training sequences to longer test ones, is a significant challenge for language… https://t.co/cuY1Ip1Tdy
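To make the extrapolation claim concrete, here is a hedged Python sketch of the kind of evaluation the length-generalization posts imply: probe a model trained on additions up to some digit length with strictly longer operands (up to 2.5x) and measure exact-match accuracy. `model_predict` is a hypothetical hook for whatever trained Transformer is under test, and the specific lengths are illustrative assumptions, not the paper's setup.

```python
# Sketch of a length-generalization probe on decimal addition: generate
# test problems longer than anything seen in training and score exact match.
import random

def make_example(n_digits: int) -> tuple[str, str]:
    """One addition problem with two uniformly random n-digit operands."""
    a = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
    b = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
    return f"{a}+{b}=", str(a + b)

def length_generalization_eval(model_predict, train_len=40, ratio=2.5, n=200):
    """Exact-match accuracy on operands `ratio` times longer than training."""
    test_len = int(train_len * ratio)  # e.g. 40-digit training -> 100-digit test
    correct = 0
    for _ in range(n):
        prompt, target = make_example(test_len)
        correct += (model_predict(prompt).strip() == target)
    return correct / n
```

The "Not Robustly" in the paper title suggests that any such number should be read with care: accuracy at 2.5x depends heavily on training details, so rerunning this probe across seeds and formats is the point of the exercise.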
Google presents Premise Order Matters in Reasoning with Large Language Models paper page: https://t.co/tK0aEXXaau Large language models (LLMs) have achieved remarkable reasoning performance in various domains. However, in the domain of reasoning tasks, we discover a… https://t.co/cRbj4pfI1a
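As a concrete illustration of the premise-order effect, the following sketch builds the same reasoning prompt twice: once with premises ordered to match the proof, and once with the same premises permuted. The premises and question are invented for illustration (the paper uses logical deduction and math benchmarks); the reported finding is that deviating from the proof order degrades reasoning accuracy even though the facts are identical.

```python
# Sketch of a premise-order probe: same facts, two orderings.
import random

# Toy premises whose proof order is premise 1 -> 2 -> 3; illustrative only.
premises = [
    "If it rains, the ground gets wet.",
    "If the ground gets wet, the match is cancelled.",
    "It rains.",
]
question = "Is the match cancelled? Answer Yes or No."

def build_prompt(order):
    """Assemble one prompt with the premises in the given order."""
    return "\n".join(premises[i] for i in order) + "\n" + question

# Forward order follows the ground-truth proof; the permutation keeps the
# same facts but scrambles their order.
forward_prompt = build_prompt(range(len(premises)))
permuted_prompt = build_prompt(random.sample(range(len(premises)), len(premises)))

# Feed both prompts to the model under test and compare answer accuracy
# across many such pairs to quantify order sensitivity.
print(forward_prompt)
print("---")
print(permuted_prompt)
```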