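Text clustering with LLM embeddings (arxiv.org/abs/2403.15112) Key points: This paper examines how different text embeddings (particularly those from large language models) and different clustering algorithms affect text clustering results. It runs several sets of experiments that evaluate the effect of the embedding method, dimensionality reduction, and embedding dimension on clustering results. The results show that LLM embeddings are good at capturing the nuances of structured language, and BERT ...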
Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs. This paper proposes a Clustering, Labeling, then Augmenting framework that significantly enhances performance in Semi-Supervised Text Classification (SSTC) tasks, effectively addressing the challenge of vast datasets ...
A Web Crawler based on LLMs implemented with Ray and Huggingface. The embeddings are saved into a vector database for fast clustering and retrieval. Use it for your RAG. - Aavache/LLMWebCrawler
A Web Crawler based on LLMs implemented with Ray and Huggingface. The embeddings are saved into a vector database for fast clustering and retrieval - sebastian-napora/LLMWebCrawler
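The entries above share one pattern: embed texts, cluster the embeddings, and retrieve by similarity. A minimal sketch of that pattern follows, using only the standard library. It is not the method of any paper or repository listed here. The bag-of-words `embed` function is a toy stand-in for an LLM embedding, the clustering is a simple cosine-based (spherical) k-means with hand-picked initial centroids, and the in-memory list stands in for a vector database. All function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy L2-normalized bag-of-words vector (stand-in for an LLM embedding)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid division by zero
    return [v / norm for v in vec]


def cosine(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def kmeans(vectors, init_indices, iters=10):
    """Spherical k-means: assign by cosine similarity, re-normalize centroids."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Update step: centroid becomes the normalized mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(m * m for m in mean)) or 1.0
                centroids[c] = [m / norm for m in mean]
    return labels, centroids


def retrieve(query, vectors, vocab):
    """Return the index of the stored vector most similar to the query."""
    q = embed(query, vocab)
    return max(range(len(vectors)), key=lambda i: cosine(q, vectors[i]))


# Hypothetical sample corpus: two pet-related and two finance-related texts.
docs = [
    "cat kitten purr",
    "cat purr sleep",
    "stock market price",
    "stock price rally",
]
vocab = sorted({w for d in docs for w in d.split()})
store = [embed(d, vocab) for d in docs]  # in-memory stand-in for a vector DB

labels, _ = kmeans(store, init_indices=[0, 2])
print(labels)
print(docs[retrieve("stock rally", store, vocab)])
```

In a real pipeline the `embed` function would be replaced by a call to an embedding model, and the initial centroids would normally be chosen by a seeding strategy rather than fixed by hand; the fixed indices here only keep the sketch deterministic.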