User Stories

User Stories#

Read case studies on how users and developers solve real, everyday problems with vLLM Ascend

  • LLaMA-Factory is an easy-to-use and efficient platform for training and fine-tuning large language models. It supports vLLM Ascend to speed up inference since LLaMA-Factory#7739, gaining 2x performance enhancement in inference.

  • Huggingface/trl is a cutting-edge library designed for post-training foundation models using advanced techniques like SFT, PPO and DPO. It uses vLLM Ascend since v0.17.0 to support RLHF on Ascend NPUs.

  • MindIE Turbo is an LLM inference engine acceleration plugin library developed by Huawei on Ascend hardware, which includes self-developed LLM optimization algorithms and optimizations related to the inference engine framework. It supports vLLM Ascend since 2.0rc1.

  • GPUStack is an open-source GPU cluster manager for running AI models. It supports vLLM Ascend since v0.6.2. See more GPUStack performance evaluation information at this link.

  • verl is a flexible, efficient, and production-ready RL training library for LLMs. It uses vLLM Ascend since v0.4.0. See more information on verl x Ascend Quickstart.

More details