- Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
  Paper • 2412.17739 • Published • 41
- SmoothQuant+: Accurate and Efficient 4-bit Post-Training WeightQuantization for LLM
  Paper • 2312.03788 • Published • 1
- FlatQuant: Flatness Matters for LLM Quantization
  Paper • 2410.09426 • Published • 15
- FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
  Paper • 2501.01005 • Published • 2
xinliu
slothCreepTree
AI & ML interests
LLM Inference, Multimodal LLM
Recent Activity
updated a collection (papers), about 2 months ago
updated a collection (papers), 8 months ago
updated a collection (papers), 8 months ago
Organizations
None yet