Mamba: A new era or an ephemeral trend?
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
What Transformers cannot do
Drawbacks of Transformers: a finite context window, and quadratic scaling with respect to the window length
O(L^2)
Generating tokens for a sequence of length L needs roughly L² computations, which becomes costly as the sequence length grows.
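To make the quadratic cost concrete, here is a minimal sketch (not from the original slides; all sizes are hypothetical): the self-attention score matrix alone has L × L entries.

```python
import numpy as np

L, d = 1024, 64                    # hypothetical sequence length and head dimension
Q = np.random.randn(L, d)          # queries
K = np.random.randn(L, d)          # keys

scores = Q @ K.T                   # shape (L, L): L^2 entries to compute and store
print(scores.shape)                # (1024, 1024)
# Doubling L to 2048 quadruples the number of score entries: 2048^2 = 4 * 1024^2.
```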
RNN: recurrent neural network
Fast inference, linear complexity
Drawbacks
- small memory: all past context is squeezed into a fixed-size hidden state
- not parallelizable: each step depends on the previous one (see the sketch below)
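A minimal RNN step illustrating both drawbacks (the shapes and weights below are hypothetical): the fixed-size hidden state must hold all past context, and the update loop is strictly sequential.

```python
import numpy as np

def rnn_step(h_prev, x_t, W, U, b):
    # One recurrent update: the new state depends on the previous one.
    return np.tanh(W @ h_prev + U @ x_t + b)

d_h, d_x, L = 16, 8, 100                 # hypothetical sizes
W = 0.1 * np.random.randn(d_h, d_h)
U = 0.1 * np.random.randn(d_h, d_x)
b = np.zeros(d_h)

h = np.zeros(d_h)                        # all past context lives in this fixed-size vector
for x_t in np.random.randn(L, d_x):      # strictly sequential: step t needs step t-1
    h = rnn_step(h, x_t, W, U, b)
```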
SSM: state space model
Input x, output y, hidden state h
Looks familiar?
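The discrete-time SSM recurrence is h_k = Ā h_{k-1} + B̄ x_k, y_k = C h_k. A minimal sketch (the matrices below are placeholders, not a real discretization) shows why it looks familiar: structurally it is the RNN loop without the tanh().

```python
import numpy as np

N, L = 4, 100                        # hypothetical state size and sequence length
A_bar = 0.9 * np.eye(N)              # placeholder discretized state matrix
B_bar = 0.1 * np.random.randn(N)     # placeholder discretized input matrix
C = np.random.randn(N)               # output matrix
x = np.random.randn(L)               # scalar input channel for simplicity

h = np.zeros(N)
y = np.empty(L)
for k in range(L):
    h = A_bar @ h + B_bar * x[k]     # linear state update: no tanh()
    y[k] = C @ h
```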
Mamba: how to solve the small-memory problem
Matrix A is responsible for updating the hidden state h; how should memory be updated as the sequence continues?
HiPPO: High-order Polynomial Projection Operators, a structured choice of A that keeps a compressed summary of the input history in the hidden state (see the sketch below).
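A sketch of the HiPPO-LegS matrix as constructed in the S4 line of work; treat the exact entries as an assumption rather than a quote from the slides.

```python
import numpy as np

def make_hippo(N):
    # HiPPO-LegS matrix: A[n, k] = -sqrt(2n+1)*sqrt(2k+1) for n > k,
    # -(n+1) on the diagonal, and 0 above the diagonal.
    p = np.sqrt(1 + 2 * np.arange(N))          # sqrt(2n + 1)
    A = p[:, None] * p[None, :]                # outer product
    A = np.tril(A) - np.diag(np.arange(N))     # lower triangle; diagonal becomes n + 1
    return -A

print(make_hippo(4))
```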
SSM: parallel training
What hinders the RNN is its tanh() nonlinearity, which forces training to proceed step by step.
By definition, an SSM is like an RNN without the tanh().
SSM: parallel training
In a way, it’s like a convolution operation: because the recurrence is linear, the whole output can be computed with a precomputed kernel (see the sketch below).
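A sketch of the two equivalent views, using the same placeholder matrices as the earlier SSM sketch: the linear recurrence unrolled as a causal convolution with kernel K = (C·B̄, C·Ā B̄, C·Ā² B̄, …).

```python
import numpy as np

N, L = 4, 100
A_bar, B_bar = 0.9 * np.eye(N), 0.1 * np.random.randn(N)
C, x = np.random.randn(N), np.random.randn(L)

# Recurrent view (inference): one step at a time
h, y_rec = np.zeros(N), np.empty(L)
for k in range(L):
    h = A_bar @ h + B_bar * x[k]
    y_rec[k] = C @ h

# Convolutional view (training): precompute K[i] = C @ A_bar^i @ B_bar, then convolve
K, v = np.empty(L), B_bar.copy()
for i in range(L):
    K[i] = C @ v
    v = A_bar @ v
y_conv = np.convolve(x, K)[:L]           # causal convolution over the whole sequence

print(np.allclose(y_rec, y_conv))        # True: both views produce the same output
```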
Mamba: selective SSM
No content-awareness: with input-independent A, B, C, the model processes every token the same way, which causes problems on tasks that require content-awareness.
In comparison, these tasks are relatively easy for Transformers, since they dynamically change their attention based on the input sequence: they can selectively “look at” or “attend to” different parts of the sequence.
Mamba: selective SSM
Mamba makes the matrices B, C, and the step size Δ depend on the input, similar in spirit to attention in Transformers (see the sketch after this slide).
This raises another issue -> the model can no longer be trained as a convolution like S4.
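A minimal sketch of the input-dependent (“selective”) parameterization described above; the projection shapes and the softplus for Δ follow the paper’s general description, but the exact details here are illustrative assumptions.

```python
import numpy as np

def softplus(z):
    return np.log1p(np.exp(z))

d_model, N, L = 8, 4, 100                  # hypothetical model width, state size, length
W_B = 0.1 * np.random.randn(d_model, N)    # illustrative projection weights
W_C = 0.1 * np.random.randn(d_model, N)
W_delta = 0.1 * np.random.randn(d_model, 1)

x = np.random.randn(L, d_model)
B = x @ W_B                                # (L, N): a different B for every position
C = x @ W_C                                # (L, N): a different C for every position
delta = softplus(x @ W_delta)              # (L, 1): positive, input-dependent step size
# With B, C, delta varying per position, the fixed convolution kernel of S4 no
# longer exists, so training falls back on a parallel scan (next slide).
```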
parallel scan?
As long as the operation satisfies the associative property, it can be parallelized!
t is the index of the parallel thread
Feels like dynamic programming (DP)
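A sketch of why the scan works (scalar/diagonal case; this is the standard combine rule for linear recurrences, not code from the paper): the pairwise combine below is associative, so the prefix results can be computed in O(log L) parallel steps over a balanced tree, much like DP.

```python
import numpy as np

def combine(left, right):
    # Compose two affine updates h -> a*h + b; this operation is associative.
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)

def scan(pairs):
    # Sequential reference scan; a parallel version applies `combine` in a
    # balanced tree, which is valid precisely because combine is associative.
    out = [pairs[0]]
    for p in pairs[1:]:
        out.append(combine(out[-1], p))
    return out

L = 8
a = np.random.rand(L)                      # per-step decay (input-dependent in Mamba)
b = np.random.randn(L)                     # per-step input contribution
h_scan = [h for (_, h) in scan(list(zip(a, b)))]

# Check against the plain sequential recurrence h_t = a_t * h_{t-1} + b_t
h, h_ref = 0.0, []
for t in range(L):
    h = a[t] * h + b[t]
    h_ref.append(h)
print(np.allclose(h_scan, h_ref))          # True
```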
Hardware-aware algorithm
Skip
Results and thoughts
- Useful
- Can it replace Transformers?