Tag: QLoRA
2 articles
AI Research 42 - Multimodal Large Model Quantization: Fro...
Survey outline of quantization schemes for multimodal large models, from FP32 to INT4. The core goals are retaining model capability, a compression efficiency of 50-75%, and an inference speedup of 2-4x. Analyze comparison...
AI Research 39 - Multimodal Large Model Quantization: How...
In multimodal large model optimization, the order in which fine-tuning and quantization are applied directly affects the final model's performance and efficiency. There are three main strategies: fine-tune fi...