White-box transformers via sparse rate reduction Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, B Haeffele, Y Ma Advances in Neural Information Processing Systems 36, 9422-9457, 2023 | 60 | 2023 |
Independent and decentralized learning in Markov potential games C Maheshwari, M Wu, D Pai, S Sastry arXiv preprint arXiv:2205.14590, 2022 | 22 | 2022 |
Emergence of segmentation with minimalistic white-box transformers Y Yu, T Chu, S Tong, Z Wu, D Pai, S Buchanan, Y Ma arXiv preprint arXiv:2308.16271, 2023 | 19 | 2023 |
Pursuit of a discriminative representation for multiple subspaces via sequential games D Pai, M Psenka, CY Chiu, M Wu, E Dobriban, Y Ma arXiv preprint arXiv:2206.09120, 2022 | 9 | 2022 |
Representation learning via manifold flattening and reconstruction M Psenka, D Pai, V Raman, S Sastry, Y Ma Journal of Machine Learning Research 25 (132), 1-47, 2024 | 8 | 2024 |
Masked completion via structured diffusion with white-box transformers D Pai, ZW Wu, S Buchanan, Y Yu, Y Ma International Conference on Learning Representations, 2023 | 6 | 2023 |
White-box transformers via sparse rate reduction: Compression is all there is Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, H Bai, Y Zhai, ... arXiv preprint arXiv:2311.13110, 2023 | 5 | 2023 |
A Global Geometric Analysis of Maximal Coding Rate Reduction P Wang, H Liu, D Pai, Y Yu, Z Zhu, Q Qu, Y Ma arXiv preprint arXiv:2406.01909, 2024 | 4 | 2024 |
Congestion Pricing for Efficiency and Equity: Theory and Applications to the San Francisco Bay Area C Maheshwari, K Kulkarni, D Pai, J Yang, M Wu, S Sastry arXiv preprint arXiv:2401.16844, 2024 | 4 | 2024 |
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, H Bai, Y Zhai, ... arXiv preprint arXiv:2311.13110, 2023 | 3 | 2023 |
Closed-loop transcription via convolutional sparse coding X Dai, K Chen, S Tong, J Zhang, X Gao, M Li, D Pai, Y Zhai, XI Yuan, ... arXiv preprint arXiv:2302.09347, 2023 | 3 | 2023 |
Scaling White-Box Transformers for Vision J Yang, X Li, D Pai, Y Zhou, Y Ma, Y Yu, C Xie arXiv preprint arXiv:2405.20299, 2024 | 2 | 2024 |
Active-dormant attention heads: Mechanistically demystifying extreme-token phenomena in LLMs T Guo, D Pai, Y Bai, J Jiao, MI Jordan, S Mei arXiv preprint arXiv:2410.13835, 2024 | 1 | 2024 |
Learning Low-Dimensional Structure via Closed-Loop Transcription: Equilibria and Optimization D Pai | | 2023 |