mplugdocowl2+github

2025-03-27 12:21:19

mPLUG-DocOwl2: New SOTA for OCR-free multi-page document understanding, with only 324 visual tokens per page!

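Paper: https://arxiv.org/abs/2409.03420 Code: https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl2 Model architecture: Text summarization and compression have been studied extensively in NLP. Given that the main information in a document image is its layout and text, and that existing...
mPLUG-DocOwl2: New SOTA for OCR-free multi-page document understanding, with only 324 visual tokens per page...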

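GitHub: X-PLUG/mPLUG-DocOwl: mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding. DocOwl2 multi-page document understanding performance showcase. Model architecture: Text summarization and compression have been studied extensively in NLP. Given that the main information in a document image is its layout and text, and that existing multimodal large language models generally use a vision-to-text module...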
release DocOwl2 model and inference code · X-PLUG/mPLUG-Doc...

Commit 457327e.
release DocOwl2, training data, inference and evaluation code...

Commit fc890c9.
release DocOwl2, training data, inference and evaluation code...

Commit fab2fd7.
Document Understanding Series: mPLUG-DocOwl 2 - Zhihu

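In addition, compared with single-image MLLMs trained on similar data, our DocOwl2 achieves comparable performance on 10 single-image document benchmarks with fewer than 20% of the visual tokens. Code, models, and data: https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl2. 1 Introduction: Understanding multi-page documents or news videos is common in everyday life. To handle such scenarios, Multimodal Large Language Models (ML...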
release DocOwl2, training data, inference and evaluation code...

Commit d4bde9d.
mPLUG-DocOwl2: A new model that needs no OCR brings multi-page document understanding into a new era - AI.x-AIGC...

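https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl2 Challenges of high-resolution document images: Multimodal large language models (MLLMs) face a series of challenges when processing high-resolution document images. As document image resolution increases, a model must generate thousands of visual tokens to understand a single document image, which not only increases GPU memory consumption but also slows inference, especially...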
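To make the token-budget point concrete, here is a rough arithmetic sketch, not code from the DocOwl2 repository. The 324 tokens per page comes from the headline above. The baseline of 3000 tokens per page is a hypothetical stand-in for "thousands of visual tokens" and is not a figure reported by the authors; the function names are also made up for illustration.

```python
# Back-of-the-envelope visual token budget for a multi-page document.
# DOCOWL2_TOKENS_PER_PAGE is taken from the headline (324 per page).
# BASELINE_TOKENS_PER_PAGE is an ASSUMPTION chosen only to illustrate
# "thousands of visual tokens per image"; it is not a measured value.
DOCOWL2_TOKENS_PER_PAGE = 324
BASELINE_TOKENS_PER_PAGE = 3000  # hypothetical


def visual_tokens(num_pages: int, tokens_per_page: int) -> int:
    """Total visual tokens passed to the language model for a document."""
    if num_pages < 0 or tokens_per_page < 0:
        raise ValueError("counts must be non-negative")
    return num_pages * tokens_per_page


def token_ratio(compressed_per_page: int, baseline_per_page: int) -> float:
    """Fraction of the baseline token count that the compressed model uses."""
    if baseline_per_page <= 0:
        raise ValueError("baseline must be positive")
    return compressed_per_page / baseline_per_page


if __name__ == "__main__":
    pages = 20  # hypothetical document length
    compressed = visual_tokens(pages, DOCOWL2_TOKENS_PER_PAGE)
    baseline = visual_tokens(pages, BASELINE_TOKENS_PER_PAGE)
    ratio = token_ratio(DOCOWL2_TOKENS_PER_PAGE, BASELINE_TOKENS_PER_PAGE)
    print(f"compressed: {compressed} tokens, baseline: {baseline} tokens")
    print(f"ratio: {ratio:.1%}")
```

Because the total grows linearly with page count, the per-page figure is what determines how many pages fit in a fixed context window or GPU memory budget.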
release DocOwl2 inference and evaluation code · X-PLUG/mPLUG...

    sys.path.append('/nas-alinlp/anwenhu/code/mPLUG_github/mPLUG-DocOwl2/evaluation')
    print(sys.path)
    import re
    from evaluator import doc_evaluate
    import os
    from tqdm import tqdm
    import random
    from pathlib import Path

    def parser_line(line):
        image = line['image'][0]
        assert len(line['messag...
