Equivariant neural networks and representation theory

VisualMath · 29 slides · Jun 08, 2024

About This Presentation

Or: Beyond linearity.

Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.


Slide Content

What is a Neural Network (NN)?

▶ Task: (illustrated with pictures on the slide). We can see this as a function:
▶ Convert the pictures into grayscale values, e.g. a 28×28 grid of numbers.
▶ Flatten the result into a vector, e.g. 28×28 ↦ a vector with 28^2 = 784 entries.
▶ The output is a vector with 10 entries.
▶ We thus have a function R^784 → R^10.

(Input and output examples are shown as pictures on the slide.)

Example. Here we have two matrices, a 3×1 matrix (w, x, y)^T and a 2×3 matrix (a11 a12 a13; a21 a22 a23), plus two bias terms, as in y = matrix · x + bias.

Example. Here we have four matrices (plus four biases), whose composition gives a map R^3 → R^4 → R^3 → R^3 → R^2.

Actually... we need nonlinear maps as well, say ReLU applied componentwise. With the same four matrices the composition becomes R^3 → R^4 → R^3 → R^3 → R^2, where each arrow is now ReLU ∘ matrix.

Life is not linear! ReLU is crucial and brings in the nonlinearity; the slide compares a fit without ReLU and with ReLU. (A numerical sketch of such a network follows below.)
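
As a concrete illustration of the composition above, here is a minimal NumPy sketch (illustrative only; the weights are random rather than trained) of the chain R^3 → R^4 → R^3 → R^3 → R^2 with componentwise ReLU after each affine map. Turning the nonlinearity off shows why ReLU matters: without it the whole network collapses to a single affine map R^3 → R^2.

import numpy as np

rng = np.random.default_rng(0)

def relu(v):
    # componentwise ReLU
    return np.maximum(v, 0.0)

# Four random weight matrices and biases realizing R^3 -> R^4 -> R^3 -> R^3 -> R^2.
dims = [3, 4, 3, 3, 2]
weights = [rng.standard_normal((m, n)) for n, m in zip(dims, dims[1:])]
biases = [rng.standard_normal(m) for m in dims[1:]]

def network(x, nonlinear=True):
    for W, b in zip(weights, biases):
        x = W @ x + b
        if nonlinear:
            x = relu(x)
    return x

x = rng.standard_normal(3)
print(network(x))                   # with ReLU: a genuinely nonlinear (piecewise linear) function
print(network(x, nonlinear=False))  # without ReLU: a single affine map R^3 -> R^2 in disguise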

▶ Idea: realize the function R^784 → R^10 as an NN.
▶ NN: the matrices (the linear layers) and the PL maps (the componentwise activation functions).

▶ The task of an NN is to approximate this function.
▶ It consists of neurons = entries of vectors, and weights = entries of matrices.

What is an Equivariant Neural Network (ENN)?

▶ Task: build an NN that respects the symmetries of the data.
▶ Examples:
▶ Image recognition is often invariant under translation; for example, whether ice cream is present in an image should not depend on the position of the ice cream in the image.
▶ Point cloud problems are invariant under the symmetric group.

Toy example, part 1. Periodic images, say n-by-n pixels, have an action of C_n^2 = (Z/nZ)^2 by translation. In math, a periodic image is an element of V = Fun(C_n^2, R) (maps C_n^2 → R), and the standard action on V is the shift above: ((a,b)·f)(x,y) = f(x+a, y+b).

Toy example, part 2. What is a C_n^2-equivariant map V → V? For example, convolution, say by c = (1/4)((1,0) + (0,1) + (−1,0) + (0,−1)). By the way, c is an example of what convolution usually does. (A numerical check of this follows below.)

Fact 1. NNs and ENNs need nonlinear maps.
Fact 2. Every NN and ENN can be approximated using PL activation functions.
Facts 1 and 2 motivate the study of PL rep theory.
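
The toy example can be verified numerically. The sketch below (illustrative, not the talk's code) stores periodic n-by-n images as arrays, implements the C_n^2-shift action with np.roll, and checks that convolution with c = (1/4)((1,0) + (0,1) + (−1,0) + (0,−1)) commutes with every shift.

import numpy as np

n = 8
rng = np.random.default_rng(1)
img = rng.standard_normal((n, n))    # a periodic n-by-n image, i.e. an element of V = Fun(C_n^2, R)

def shift(f, a, b):
    # the C_n^2-action: ((a, b) . f)(x, y) = f(x + a, y + b), indices taken mod n
    return np.roll(f, shift=(-a, -b), axis=(0, 1))

def conv_c(f):
    # convolution with c = 1/4 ((1,0) + (0,1) + (-1,0) + (0,-1)):
    # the average of the four neighbouring pixels, with periodic boundary
    return 0.25 * (shift(f, 1, 0) + shift(f, 0, 1) + shift(f, -1, 0) + shift(f, 0, -1))

# Equivariance: conv_c(shift(f)) == shift(conv_c(f)) for every group element (a, b).
for a in range(n):
    for b in range(n):
        assert np.allclose(conv_c(shift(img, a, b)), shift(conv_c(img), a, b))
print("conv_c is C_n^2-equivariant")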

▶ C_n^2-equivariant convolutional NN (layers W = V or R): a chain of equivariant maps and PL activations, as pictured on the slide.
▶ S_n-equivariant point cloud NN (with n = Fun({1,...,n}, R)): likewise, as pictured on the slide (an equivariant point-cloud layer is sketched below).
▶ Enter, rep theory.
▶ Rep theory questions:
▶ Can we decompose the network into smaller ("simple") bits?
▶ The space of all weights is much smaller than for a vanilla NN: can we use rep theory to study the involved maps?
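
The slides do not spell out the point-cloud layers; as an illustrative assumption, a standard S_n-equivariant choice (a DeepSets-style layer) is x ↦ ReLU(λx + γ·mean(x)·1), using the fact that every S_n-equivariant linear map R^n → R^n is a combination of the identity and the all-ones matrix. A short check that such a layer commutes with all permutations:

import numpy as np
from itertools import permutations

n = 4
rng = np.random.default_rng(2)
lam, gam = rng.standard_normal(2)

def layer(x):
    # permutation-equivariant affine map (lam * identity + gam * averaging), then componentwise ReLU
    return np.maximum(lam * x + gam * x.mean(), 0.0)

x = rng.standard_normal(n)
for perm in permutations(range(n)):
    p = list(perm)
    assert np.allclose(layer(x[p]), layer(x)[p])   # layer(sigma . x) == sigma . layer(x)
print("the layer is S_n-equivariant")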

PL rep theory (of cyclic groups)

n = 6:
▶ Let C_n = Z/nZ = ⟨a | a^n = 1⟩.
▶ The simple complex C_n-representations are given by the n-th roots of unity: L_{z^k}.
▶ What about the simple real C_n-representations?

▶ For Θ = 2π/n observe that diag(exp(ikΘ), exp(−ikΘ)) is similar over C to the rotation matrix ( cos(kΘ) −sin(kΘ) ; sin(kΘ) cos(kΘ) ).
▶ ⇒ the simple real C_n-representations are L_0 = L_{z^0}, L_1 = L_{z^1} ⊕ L_{z̄^1}, and so on: a rotation block for each pair of conjugate roots of unity. (A sketch constructing these blocks follows below.)
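
A minimal sketch of this, assuming the standard realization of the regular representation R[C_n] = R^n with a acting by the cyclic shift matrix: conjugating by the real Fourier basis exhibits exactly the rotation blocks above, i.e. it displays R[C_n] ≅ L_0 ⊕ L_1 ⊕ ... explicitly.

import numpy as np

n = 6
theta = 2 * np.pi / n
j = np.arange(n)

# Regular representation of C_n = <a>: a acts on R[C_n] = R^n by the cyclic shift matrix S.
S = np.roll(np.eye(n), 1, axis=0)          # S e_j = e_{j+1 mod n}

# Real Fourier basis: the constant vector, cos/sin pairs, and (for even n) the alternating vector.
cols = [np.ones(n) / np.sqrt(n)]
for k in range(1, (n - 1) // 2 + 1):
    cols.append(np.cos(theta * k * j) * np.sqrt(2 / n))
    cols.append(np.sin(theta * k * j) * np.sqrt(2 / n))
if n % 2 == 0:
    cols.append((-1.0) ** j / np.sqrt(n))
Q = np.column_stack(cols)                  # orthogonal change of basis

B = Q.T @ S @ Q                            # block diagonal: R[C_n] ≅ L_0 ⊕ L_1 ⊕ ...
np.set_printoptions(precision=3, suppress=True)
print(B)
# Expected: a 1x1 block (1), then 2x2 rotation blocks through angle ±k*theta (the L_k),
# and for even n a final 1x1 block (-1) (the sign representation).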

Thus, we can (explicitly) decompose R[C_n] ≅ L_0 ⊕ L_1 ⊕ ... and compute the compositions
L_i --incl.--> R[C_n] --ReLU--> R[C_n] --proj.--> L_j, one projection for each summand L_0, L_1, ... (This computation is sketched below.)
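
Under the same assumed realization (cyclic shift matrix and real Fourier basis), a machine can compute which compositions proj_j ∘ ReLU ∘ incl_i are nonzero; those are recorded as arrows L_i → L_j of the interaction graph Γ discussed next. This is only an illustrative numerical sketch: random samples certify that a map is nonzero, while vanishing is only suggested numerically.

import numpy as np

def real_fourier_basis(n):
    # Orthonormal basis of R[C_n] adapted to R[C_n] ≅ L_0 (+) L_1 (+) ...; returns the
    # change-of-basis matrix Q and the coordinate blocks, one block per simple summand.
    theta, j = 2 * np.pi / n, np.arange(n)
    cols, blocks = [np.ones(n) / np.sqrt(n)], [[0]]
    for k in range(1, (n - 1) // 2 + 1):
        cols += [np.cos(theta * k * j) * np.sqrt(2 / n), np.sin(theta * k * j) * np.sqrt(2 / n)]
        blocks.append([len(cols) - 2, len(cols) - 1])
    if n % 2 == 0:
        cols.append((-1.0) ** j / np.sqrt(n))
        blocks.append([len(cols) - 1])
    return np.column_stack(cols), blocks

n = 6
Q, blocks = real_fourier_basis(n)
rng = np.random.default_rng(3)

arrows = set()
for i, bi in enumerate(blocks):
    for _ in range(200):
        v = np.zeros(n)
        v[bi] = rng.standard_normal(len(bi))   # a random vector in the summand L_i (Q-coordinates)
        w = Q.T @ np.maximum(Q @ v, 0.0)       # include, apply componentwise ReLU, express in Q-basis again
        for jdx, bj in enumerate(blocks):
            if np.linalg.norm(w[bj]) > 1e-9:   # nonzero component in L_j => arrow L_i -> L_j
                arrows.add((i, jdx))
print(sorted(arrows))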

(The slide shows the interaction graphs Γ for n = 5 and n = 7.)

▶ Interaction graph Γ: vertices are the simples L_i, with arrows L_i → L_j recording which of the compositions above are nonzero.
▶ This is a measurement of difficulty.

Play live at https://www.dtubbenhauer.com/pl-reps/site/index.html. It is easy to produce these graphs with a machine, so let us play online.

Write ord(L_i) for the order of the action of a on L_i, and ord*(L_i) for the same number, divided by two if ord(L_i) is even.

Theorem (for ReLU). Every vertex of Γ has a loop, and there is a non-loop edge from L to K in Γ if and only if ord(K) divides ord*(L).

The same can be produced with a machine for any PL map; for Abs, the absolute value, the answer is:

Theorem (for Abs). There is a non-loop edge from L to K in Γ if and only if ord(K) divides ord*(L).

(The slide pictures a piecewise linear but non-linear map from L_1 to the trivial/sign representation.)

Calculation complexity is captured in Γ:

Theorem. The number of calculations needed to evaluate f^k on L_i equals the number of length-k paths ending at L_i in Γ_f.
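
The ReLU theorem makes Γ computable without any numerics. A small sketch (assuming, consistently with the realization above, that a acts on the simple L_k by rotation through 2πk/n, so that ord(L_k) = n / gcd(n, k)):

from math import gcd

def interaction_graph_relu(n):
    # Real simples of C_n are indexed by k = 0, 1, ..., floor(n/2); the generator a acts on L_k
    # by rotation through 2*pi*k/n, so ord(L_k) = n / gcd(n, k) (which is 1 for k = 0).
    simples = list(range(n // 2 + 1))
    ord_ = {k: n // gcd(n, k) for k in simples}
    ord_star = {k: o // 2 if o % 2 == 0 else o for k, o in ord_.items()}
    edges = set()
    for L in simples:
        edges.add((L, L))                                  # every vertex has a loop (ReLU case)
        for K in simples:
            if K != L and ord_star[L] % ord_[K] == 0:      # non-loop edge L -> K iff ord(K) | ord*(L)
                edges.add((L, K))
    return edges

print(sorted(interaction_graph_relu(6)))
# For n = 6 this agrees with the graph computed numerically in the previous sketch.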

Some general facts about PL rep theory:

▶ There are many PL maps: for G-representations V, W we have either hom^pl_G(V, W) = 0 or dim_R hom^pl_G(V, W) > |N|, i.e. the hom space is infinite-dimensional.
▶ Schur's lemma is maximally false: let L and K be two simple real G-representations with action maps ρ_L and ρ_K. Then hom^pl_G(L, K) ≠ 0 ⟺ ker(ρ_L) ⊂ ker(ρ_K).
▶ Any PL activation map has an interaction graph Γ equal to that of ReLU, Abs, or Id; thus Γ depends only on G (up to the 3 states it takes).
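
A worked instance of the second bullet, using the real C_6-simples from above and reading ⊂ as containment (not necessarily proper): ρ_{L_1} is faithful, so ker(ρ_{L_1}) = {1} is contained in every kernel and hom^pl_{C_6}(L_1, K) ≠ 0 for every simple K; at the other extreme ker(ρ_{L_0}) = C_6, so hom^pl_{C_6}(L_0, K) ≠ 0 forces ker(ρ_K) = C_6, i.e. K = L_0.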

(Plots on the slide: sin, ReLU ∘ sin, and Fourier approximations of ReLU ∘ sin of order 2, 4 and 6.)

▶ Linear map representation theory ↭ Fourier approximation of sin.
▶ Piecewise linear map representation theory ↭ Fourier approximation of ReLU ∘ sin.
▶ Higher frequencies ↭ simples with a lot of ingoing arrows.

(A numerical illustration follows below.)
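
A small numerical illustration of the analogy (illustrative only): the Fourier coefficients of sin live in a single frequency, while ReLU ∘ sin acquires a constant term and higher even harmonics, the analogue of simples gaining many ingoing arrows.

import numpy as np

N = 4096
x = np.linspace(0.0, 2 * np.pi, N, endpoint=False)

def fourier_magnitudes(f, k_max=6):
    # magnitudes of the first few Fourier coefficients of a 2*pi-periodic function sampled on x
    return np.abs(np.fft.rfft(f) / N)[: k_max + 1]

np.set_printoptions(precision=3, suppress=True)
print("sin:       ", fourier_magnitudes(np.sin(x)))                  # only frequency 1 is present
print("ReLU o sin:", fourier_magnitudes(np.maximum(np.sin(x), 0.0)))
# ReLU(sin x) = (sin x + |sin x|)/2: frequency 1 plus a constant term and even harmonics 2, 4, 6, ...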

Beyond cyclic groups, the same questions arise for:
▶ Finite abelian groups
▶ Dihedral groups
▶ Symmetric groups
▶ Products of these

There is still much to do... Thanks for your attention!