A new study by Stanford's RegLab and its Human-Centered Artificial Intelligence (HAI) research center, conducted with Yale researchers, reveals that legal AI tools from LexisNexis and Thomson Reuters marketed as hallucination-free still produce a significant number of erroneous citations. The study found that these tools hallucinate in one of two ways: a response may simply be incorrect, or it may be misgrounded, citing a source that does not actually support its claims. The researchers initially found false responses in 1 out of 6 cases from Thomson Reuters' AI tools; after the company objected that they had audited the wrong product and granted access to the 'right' one, Westlaw's AI-Assisted Research, the hallucination rate rose to 1 in 3 responses. These findings highlight the need for rigorous and transparent benchmarking of AI tools in the legal field. The study also notes that LexisNexis made headlines in October 2023 for claiming to offer hallucination-free AI products.
🚨 In Oct 2023, LexisNexis made splashy headlines about hallucination-free AI for law. Turns out it was full of BS 😅 The so-called hallucination-free products can hallucinate up to 33% of the time per this recent work from Stanford + Yale researchers! 1/4 https://t.co/tWwOUDDUJp
"[D]ocuments that might be relevant on their face due to semantic similarity to a query may actually be inapposite for... reasons... unique to the #law. Thus, we also observe hallucinations occurring when these RAG systems fail to identify the truly binding authority." #AI #tech https://t.co/XseLHhzWGG
Appellate Judge Proposes Possible Use of GenAI for Contract Interpretation – Recognizes That AI Hallucinates but Flesh-And-Blood Lawyers Do... https://t.co/RAH13CEqzW | by @SheppardMullin
Even as a non-lawyer on this team, I find fascinating the concrete examples of hallucinated & wrong LLM RAG “facts” we see. Some examples from v2 of the paper—now including Westlaw’s AI-Assisted Research, which we were given access to after posting v1. https://t.co/MnMP30x6YN https://t.co/O28t6PO76q https://t.co/IZYOmL0hAD
"These systems can hallucinate in one of two ways. First, a response from an #AI tool might just be incorrect... Second, a response might be misgrounded—the AI tool describes the law correctly, but cites a source which does not in fact support its claims": https://t.co/4e5iqShVVz
In a wild follow-up to this, Westlaw tried to say the researchers who found 1 in 6 (!) false responses from their AI tools were "auditing the wrong product". So those researchers asked for access to the "right" product and found hallucinations in... 1 in 3 responses 🤦🏾♀️ https://t.co/CHv9Ks2ldc https://t.co/3TtbKGLSJB
A new study reveals that even bespoke legal AI tools that promise "hallucination-free" citations still hallucinate an alarming amount of the time. The findings underscore the need for rigorous and transparent benchmarking of AI tools in law. https://t.co/Ot1AIapa6p
The preprint study by Stanford’s RegLab and its Human-Centered Artificial Intelligence research center found that LexisNexis and Thomson Reuters overstate the extent to which their products are free of hallucinations. https://t.co/wTm8sQbbGb