Intro: a Sparse AutoEncoder (SAE), in short, is a very wide linear projection + an activation function + a linear projection back (possibly with an added threshold, i.e. JumpReLU), with the loss designed so that activations become sparse. Per https://transformer-circuits.pub/20…
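A minimal PyTorch sketch of that recipe (the expansion factor, the plain ReLU standing in for JumpReLU, and the L1 coefficient are illustrative assumptions, not values from the linked post):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """A very wide linear projection -> activation -> linear projection back."""
    def __init__(self, d_model: int = 768, expansion: int = 16):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_model * expansion)
        self.decoder = nn.Linear(d_model * expansion, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # JumpReLU would apply a learned threshold here
        return self.decoder(z), z

sae = SparseAutoencoder()
x = torch.randn(32, 768)                 # a batch of model activations
x_hat, z = sae(x)
# The loss design is what induces sparsity: reconstruction error plus an L1 penalty.
loss = ((x_hat - x) ** 2).mean() + 3e-4 * z.abs().sum(dim=-1).mean()
```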
An SAE (Sparse Auto Encoder), as one kind of Auto Encoder, sparsifies its activations and is commonly used to extract interpretable features from large language models. But the recent appearance of cocomix and a series of related works has also revealed the SAE's potential, as an Auto Encoder, for representation and prediction. Below is a brief introduction to how an SAE works and how it differs from other Auto Encoders. How an SAE works (SAE architecture diagram): compared with the traditional Auto Encoder architecture, an SAE...
10:59 [Hands-on Neural Networks] PyTorch high-dimensional Tensor dimension manipulation, einops
23:03 [Hands-on Transformer] Implementing the Transformer Decoder by hand (cross-attention, encoder-decoder cross attention)
14:43 [Hands-on Neural Networks] kSparse AutoEncoder, an explicit implementation of sparse activation (SAE on LLM)
16:22 [...
Introduction: this article provides a simple PyTorch code example of a Sparse Autoencoder (SAE), as well as how to stack SAEs to build Stacked Sparse Autoencoders (SSAE). In deep learning, auto...
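The article's own code is not shown in this excerpt; here is a minimal sketch of the stacking idea under the usual greedy layer-wise recipe (the `make_sae` helper, layer sizes, penalty weight, and step count are all hypothetical):

```python
import torch
import torch.nn as nn

def make_sae(d_in, d_hidden):
    """One SAE stage (hypothetical helper): a linear encoder and decoder."""
    return nn.ModuleDict({"enc": nn.Linear(d_in, d_hidden),
                          "dec": nn.Linear(d_hidden, d_in)})

def sae_forward(sae, x):
    z = torch.relu(sae["enc"](x))        # sparse code
    return sae["dec"](z), z              # reconstruction and code

# Greedy layer-wise stacking: train stage 1 on the data, then train stage 2
# on stage 1's codes. All sizes below are illustrative.
x = torch.randn(256, 784)
stage1, stage2 = make_sae(784, 256), make_sae(256, 64)

for stage, inputs in ((stage1, x), (stage2, None)):
    if inputs is None:                   # stage 2 trains on stage 1's codes
        with torch.no_grad():
            inputs = torch.relu(stage1["enc"](x))
    opt = torch.optim.Adam(stage.parameters(), lr=1e-3)
    for _ in range(100):                 # toy training loop
        x_hat, z = sae_forward(stage, inputs)
        loss = ((x_hat - inputs) ** 2).mean() + 1e-3 * z.abs().mean()
        opt.zero_grad(); loss.backward(); opt.step()
```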
A sparse autoencoder is one of a range of autoencoder artificial neural networks that work on the principle of unsupervised machine learning. Autoencoders are a type of deep network that can be used for dimensionality reduction, learning to reconstruct their input through backpropagation.
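A minimal sketch of that plain (non-sparse) autoencoder usage, with the bottleneck width and step count chosen arbitrarily for illustration:

```python
import torch
import torch.nn as nn

# A plain bottleneck autoencoder: the 16-dim code is the reduced representation.
encoder = nn.Sequential(nn.Linear(784, 16), nn.Tanh())
decoder = nn.Linear(16, 784)
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

x = torch.randn(128, 784)                # unlabeled data: training is unsupervised
for _ in range(200):                     # learn to reconstruct the input via backprop
    x_hat = decoder(encoder(x))
    loss = ((x_hat - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

codes = encoder(x)                       # dimensionality reduction: 784 -> 16
```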
9 Dec 2024 · Bart Bussmann, Patrick Leask, Neel Nanda · Sparse autoencoders (SAEs) have emerged as a powerful tool for interpreting language model activations by decomposing them into sparse, interpretable features. A popular approach is the TopK SAE, which uses a fixed number of the most active ...
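The abstract is cut off here, but the TopK mechanism it names is simple to illustrate. A sketch of the activation step (k = 32 and the dictionary size are arbitrary example choices, not values from the paper):

```python
import torch

def topk_activation(pre_acts: torch.Tensor, k: int = 32) -> torch.Tensor:
    """Keep only the k most active latents per example; zero out the rest."""
    vals, idx = pre_acts.topk(k, dim=-1)
    acts = torch.zeros_like(pre_acts)
    acts.scatter_(-1, idx, torch.relu(vals))  # ReLU keeps latents non-negative
    return acts

pre_acts = torch.randn(8, 16384)             # encoder pre-activations for a batch
z = topk_activation(pre_acts)
assert (z != 0).sum(dim=-1).le(32).all()     # at most k active latents per row
```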
Paper: Sparse Autoencoder Features for Classifications and Transferability (with annotated result tables)
Different from traditional stacked autoencoders, the ESGSAE model exploits the complementarity between the original features and the hidden outputs by embedding the original features into the hidden layers. To alleviate the impact of the small-sample problem on the generalization of the proposed ESGSAE ...
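The ESGSAE architecture itself is not shown in this excerpt; the following is only a hypothetical sketch of the stated idea, concatenating the original features onto each hidden layer's input, with the depth and layer sizes assumed:

```python
import torch
import torch.nn as nn

class FeatureEmbeddedEncoder(nn.Module):
    """Hypothetical sketch: each hidden layer after the first sees
    [previous hidden output, original features], so the original features
    complement the learned hidden outputs (the stated ESGSAE idea;
    layer widths and depth here are assumptions)."""
    def __init__(self, d_in: int = 100, hidden=(64, 32)):
        super().__init__()
        dims = [d_in] + list(hidden)
        self.layers = nn.ModuleList(
            nn.Linear(dims[i] + (d_in if i > 0 else 0), dims[i + 1])
            for i in range(len(hidden))
        )

    def forward(self, x):
        h = x
        for i, layer in enumerate(self.layers):
            inp = h if i == 0 else torch.cat([h, x], dim=-1)  # embed original features
            h = torch.relu(layer(inp))
        return h

enc = FeatureEmbeddedEncoder()
codes = enc(torch.randn(16, 100))            # batch of 16 samples, 100 raw features
```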