3 Ways to Improve Your LLM
Source: https://www.maartengrootendorst.com/blog/improving-llms/
Source: Anthropic: contextual-retrieval
The continuous state space model (SSM) is a fundamental representation defined by two simple equations:
$$ \begin{align*} x'(t) &= A x(t) + B u(t)\\ y(t) &= C x(t) + D u(t) \end{align*} $$Here the input \(u(t)\) is a 1-dimensional signal, the state \(x(t)\) is a $N$-dimensional latent representation satisfying a linear ODE, and the output \(y(t)\) is a simple 1-dimensional projection of the state.
Here \( A \in \mathbb{R}^{N \times N} \) is called the state matrix, and the other parameter shapes are \( B \in \mathbb{R}^{N \times 1}, C \in \mathbb{R}^{1 \times N}, D \in \mathbb{R}^{1 \times 1} \).
Mamba: Linear-Time Sequence Modeling with SSS, Gonzo ML (in Russian)
Shirui Pan, Senior Member, IEEE, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, Xindong Wu, Fellow, IEEE
https://arxiv.org/pdf/2306.08302.pdf
We show the framework of MindMap, which comprises three main parts:
Source: yusu substack
Source: GitHub talk Llama-fast