Seongland
About Articles Publications Projects Apps

Articles

Research articles and technical writing.

CorrSteer: Steering LLMs via Correlation-based Corrections

2025-08

An interactive exploration of correlation-guided feature selection for controllable language model behavior using Sparse Autoencoders.

25 min read. Tags: Steering, Interpretability, SAE, Alignment

SAE Training Dataset Influence in Feature Matching and a Hypothesis on Position Features

2025-01

Investigating how dataset composition affects sparse autoencoder feature matching and density patterns.

15 min read. Tags: SAE, Dataset, Interpretability

Superposition Hypothesis for Steering LLM with Sparse Autoencoder

2024-10

Exploring how the superposition hypothesis in neural networks relates to steering language models using sparse autoencoders.

10 min read. Tags: SAE, Steering, Superposition

Reversing Transformer to Understand In-Context Learning with Phase Change & Feature Dimensionality

2024-08

Understanding in-context learning by reversing transformer representations, exploring phase changes and feature dimensionality.

12 min read. Tags: Interpretability, In-Context Learning, Transformer

© 2026 Seonglae Cho