On August 25, Alibaba Cloud launched an open-source Large Vision Language Model (LVLM) named Qwen-VL. The LVLM is based on Alibaba Cloud’s 7-billion-parameter foundational language model, Qwen-7B. In addition to capabilities such as image-text recognition, description, and question answering, Qwen-VL introduces new features including visual location recognition and image-text comprehension, the company said in a statement. These functions enable the model to identify locations in pictures and to give users guidance based on information extracted from images, the firm added. The model can be applied in scenarios including image- and document-based question answering, image caption generation, and fine-grained visual recognition. Currently, both Qwen-VL and its visual AI assistant Qwen-VL-Chat are available for free and commercial use on Alibaba’s “Model as a Service” platform ModelScope. [Alibaba Cloud statement, in Chinese]