Researchers are investigating ways to make large language models forget specific kinds of data; Salvatore Raieli explores this question from both a theoretical and a pragmatic perspective, with implications for data privacy and bias mitigation. Separately, new research proposes an architecture based on advective diffusion that combines the computational structure of message-passing neural networks (MPNNs) and Transformers.
Sequence Length Independent Norm-Based Generalization Bounds for Transformers. (arXiv:2310.13088v1 [stat.ML]) https://t.co/M3y1cHYYkn
To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets. (arXiv:2310.13061v1 [cs.LG]) https://t.co/gPibY9RD7o
Mean Estimation Under Heterogeneous Privacy Demands. (arXiv:2310.13137v1) https://t.co/qHgi2KF4Fr
Interaction Screening and Pseudolikelihood Approaches for Tensor Learning in Ising Models. (arXiv:2310.13232v1) https://t.co/8bezNr1NWd
Meta-learning of Physics-informed Neural Networks for Efficiently Solving Newly Given PDEs. (arXiv:2310.13270v1 [stat.ML]) https://t.co/wJbh43eUvg
Non-Negative Spherical Relaxations for Universe-Free Multi-Matching and Clustering. (arXiv:2310.13311v1 [stat.ML]) https://t.co/RzK91NkSM6
DeepFDR: A Deep Learning-based False Discovery Rate Control Method for Neuroimaging Data. (arXiv:2310.13349v1 [stat.ML]) https://t.co/HwXfx5a6iE
Optimal Best Arm Identification with Fixed Confidence in Restless Bandits. (arXiv:2310.13393v1 [stat.ML]) https://t.co/yZ0N8c99UQ
Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability. (arXiv:2310.13402v1 [stat.ML]) https://t.co/M4xzPMdWMz
Random Matrix Analysis to Balance between Supervised and Unsupervised Learning under the Low Density Separation Assumption. (arXiv:2310.13434v1 [cs.LG]) https://t.co/mnwUEGAVjr
Y-Diagonal Couplings: Approximating Posteriors with Conditional Wasserstein Distances. (arXiv:2310.13433v1 [cs.LG]) https://t.co/DLpcfvMS4D
Variational measurement-based quantum computation for generative modeling. (arXiv:2310.13524v1 [quant-ph]) https://t.co/sixo2KnFqM
Towards Understanding Sycophancy in Language Models. (arXiv:2310.13548v1 [cs.CL]) https://t.co/c4cWgCfoIV
Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes. (arXiv:2310.13550v1 [cs.LG]) https://t.co/ik2ZgRkXdS
Optimal Transport for Measures with Noisy Tree Metric. (arXiv:2310.13653v1 [stat.ML]) https://t.co/OjTwdGGloA
Deep neural networks can stably solve high-dimensional, noisy, non-linear inverse problems. (arXiv:2206.00934v5 UPDATED) https://t.co/pmKtZ72TWB
Interpretable Sequence Classification Via Prototype Trajectory. (arXiv:2007.01777v3 [cs.LG] UPDATED) https://t.co/BV3wtdb2wM
Event-Triggered Time-Varying Bayesian Optimization. (arXiv:2208.10790v4 [cs.LG] UPDATED) https://t.co/eseSL1chPD
Trade-off Between Efficiency and Consistency for Removal-based Explanations. (arXiv:2210.17426v3 [cs.LG] UPDATED) https://t.co/GTuVPVX6De
On the Overlooked Structure of Stochastic Gradients. (arXiv:2212.02083v3 [cs.LG] UPDATED) https://t.co/65UInPGogS
Kernel Ridge Regression Inference. (arXiv:2302.06578v2 UPDATED) https://t.co/TswNII1luj
Adaptive Selective Sampling for Online Prediction with Experts. (arXiv:2302.08397v2 [stat.ML] UPDATED) https://t.co/jo35kitJHh
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning. (arXiv:2306.07818v2 [cs.LG] UPDATED) https://t.co/d8Khl3g1Ng
Verifiable Learning for Robust Tree Ensembles. (arXiv:2305.03626v3 [cs.LG] UPDATED) https://t.co/QjXj05jtn2
Trained Transformers Learn Linear Models In-Context. (arXiv:2306.09927v3 [stat.ML] UPDATED) https://t.co/Gb6dbOJNRj
Predicting Battery Lifetime Under Varying Usage Conditions from Early Aging Data. (arXiv:2307.08382v2 [cs.LG] UPDATED) https://t.co/F8voH3qEXN
On the quality of randomized approximations of Tukey's depth. (arXiv:2309.05657v2 [stat.ML] UPDATED) https://t.co/p6CXAoua0y
Modeling Supply and Demand in Public Transportation Systems. (arXiv:2309.06299v2 [cs.LG] UPDATED) https://t.co/5EzORfTw8b
On Double Descent in Reinforcement Learning with LSTD and Random Features. (arXiv:2310.05518v2 [cs.LG] UPDATED) https://t.co/KThYdhm9rK
Label Differential Privacy via Aggregation. (arXiv:2310.10092v2 [cs.LG] UPDATED) https://t.co/9dqd9fobO2
"The Effect of Spoken Language on Speech Enhancement using Self-Supervised Speech Representation Loss Functions. (arXiv:2307.14502v2 UPDATED)," George Close, Thomas Hain, Stefan Goetze, https://t.co/Rcmas47Hph
"Low-latency Speech Enhancement via Speech Token Generation. (arXiv:2310.08981v2 UPDATED)," Huaying Xue, Xiulian Peng, Yan Lu, https://t.co/ELyLU7m3Mb
In their new research, @mmbronstein, @qitianwu_, and @Chenxia58917359 explore "a new architecture based on advective diffusion that combines the computational structure of message-passing neural networks (MPNNs) and Transformers [...]." https://t.co/8pkWPEZA15
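The general idea of combining MPNN-style local computation with Transformer-style global computation can be sketched as a single update step on node features: a graph-Laplacian diffusion term smooths features along edges, while a dense attention term transports them between arbitrary node pairs. This is a minimal illustrative sketch of that generic combination, not the authors' actual architecture; the function name, the explicit Euler discretization, and the toy weights are all assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable row-wise softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def advective_diffusion_step(X, A, Wq, Wk, dt=0.1):
    """One Euler step of a graph PDE with a local diffusion term
    (graph Laplacian, as in MPNNs) and a global attention-based
    transport term (as in Transformers). Illustrative only."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A                    # combinatorial graph Laplacian
    attn = softmax((X @ Wq) @ (X @ Wk).T)   # dense pairwise attention weights
    diffusion = -L @ X                      # smooths features along edges
    advection = attn @ X - X                # moves features between any node pair
    return X + dt * (diffusion + advection)

# toy example: 4-node path graph with 3-dimensional node features
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((4, 3))
d = X.shape[1]
Wq, Wk = rng.standard_normal((d, d)), rng.standard_normal((d, d))
X1 = advective_diffusion_step(X, A, Wq, Wk)
print(X1.shape)  # (4, 3)
```

The Laplacian term only lets information flow across existing edges (the MPNN-like part), while the attention matrix is dense and can couple any two nodes regardless of graph structure (the Transformer-like part); the step size `dt` controls how far one update moves the features.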
[CL] Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective M Zhong, C An, W Chen, J Han, P He [University of Illinois Urbana-Champaign & The University of Hong Kong & Microsoft Azure AI] (2023) https://t.co/r9Rg3XA2wO - The paper… https://t.co/wTRZlLSj84 https://t.co/31T4tkDia1
[LG] Approximating Two-Layer Feedforward Networks for Efficient Transformers R Csordás, K Irie, J Schmidhuber [The Swiss AI Lab IDSIA & Harvard University] (2023) https://t.co/AdUz3ppmaS - The paper presents a unified framework to understand methods for approximating two-layer… https://t.co/birWBw276Z https://t.co/zYJ8hi422C
How can we make large language models forget specific kinds of data? Salvatore Raieli explores a complex question from both a theoretical and pragmatic perspective. https://t.co/FLcgFwRM96