1. git clone https://github.com/Nardien/samsung_text_classification
2. cd samsung_text_classification
3. jupyter notebook => a browser window opens => click dependencies.ipynb => press Shift+Enter
What is PyTorch?
- Developed by Facebook
- Python-first
- Dynamic neural networks
- This tutorial is for PyTorch 0.2.0
- Endorsed by the Director of AI at Tesla
Installation
PyTorch website: http://pytorch.org/
Packages of PyTorch
- torch: a Tensor library like NumPy, with strong GPU support
- torch.autograd: a tape-based automatic differentiation library that supports all differentiable Tensor operations in torch
- torch.nn: a neural networks library deeply integrated with autograd, designed for maximum flexibility
- torch.optim: an optimization package to be used with torch.nn, with standard optimization methods such as SGD, RMSProp, LBFGS, Adam, etc.
- torch.multiprocessing: Python multiprocessing, but with magical memory sharing of torch Tensors across processes; useful for data loading and Hogwild training
- torch.utils: DataLoader, Trainer, and other utility functions for convenience
- torch.legacy(.nn/.optim): legacy code ported over from torch for backward-compatibility reasons
Outline
- Neural Network in Brief
- Concepts of PyTorch
- RNN
- Comparison with TensorFlow
Neural Network in Brief
Supervised learning: learning a function f such that f(x) = y.
Given data/label pairs (x1, y1), (x2, y2), ..., we try to learn f(.) such that f(x) = y.
Neural Network in Brief
The big dataset is split into batches (Batch 1, Batch 2, ..., Batch N); one epoch processes all N batches, where N = dataset size / batch size.
Forward process: a batch of data passes through the neural network (parameters Wi) to produce a predicted label, Label'.
The loss compares Label' with the true Label.
Backward process: the optimizer uses the loss to update the parameters, Wi -> Wi+1.
Neural Network in Brief: Inside the Neural Network
Forward: data flows through a stack of weight layers W to produce Label'; backward: gradients flow back through the same layers.
Data in the neural network: Tensors (n-dim arrays) and gradients of functions.
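To make the forward/backward picture concrete, here is a minimal sketch using the 0.2-era Variable API; the shapes and names (w, x, label) are illustrative assumptions, not from the slides.

import torch
from torch.autograd import Variable  # 0.2-era autograd wrapper

# Data in the network is a Tensor; wrapping it in a Variable records the
# operations applied to it, so gradients of functions can flow backward.
w = Variable(torch.randn(3, 3), requires_grad=True)   # a weight "layer" W
x = Variable(torch.randn(3, 1))                       # data
label = Variable(torch.randn(3, 1))                   # label

label_pred = w.mm(x)                        # forward: data -> Label'
loss = (label_pred - label).pow(2).sum()    # loss compares Label' with Label
loss.backward()                             # backward: gradient of loss w.r.t. w
print(w.grad)                               # what an optimizer would use to update w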
Concepts of PyTorch
The modules of PyTorch map onto the training loop above:
- Data: Tensor
- Function: NN Modules
- Optimizer
- Loss Function
Concepts of PyTorch: Data (Tensor)
- A Tensor is similar to a NumPy array.
- Operations: z = x + y; torch.add(x, y, out=z); y.add_(x)  # in-place
- NumPy bridge: b = a.numpy() converts a Tensor to a NumPy array; b = torch.from_numpy(a) converts a NumPy array to a Tensor.
- CUDA Tensors: move to the GPU with x = x.cuda(); y = y.cuda(); then x + y runs on the GPU.
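The snippets above, collected into one runnable sketch (the tensor sizes are arbitrary assumptions):

import numpy
import torch

x = torch.ones(5)
y = torch.ones(5)

# operations: three ways to add
z = x + y
torch.add(x, y, out=z)
y.add_(x)                      # in-place: y becomes x + y

# NumPy bridge (both directions share the underlying memory)
a = torch.ones(5)
b = a.numpy()                  # Tensor -> numpy.ndarray
c = numpy.ones(5)
d = torch.from_numpy(c)        # numpy.ndarray -> Tensor

# CUDA Tensors: only run on a machine with a GPU
if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()
    print(x + y)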
NN Modules: Building a Network
Define modules (must have) and build the network in forward (must have); see http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#define-the-network
Network: x -> conv1 -> relu -> pooling -> conv2 -> relu -> pooling -> fc1 -> relu -> fc2 -> relu -> fc3
Shape progression [Channel, H, W]: 1x32x32 -> conv1 -> 6x28x28 -> pooling -> 6x14x14 -> conv2 -> 16x10x10 -> pooling -> 16x5x5, then flatten the 16x5x5 Tensor into a vector for fc1 (Tensors are [Batch N, Channel, H, W]). A code sketch follows below.
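A sketch of this network following the linked tutorial; the layer sizes are read off the shape progression above, and the 10-way fc3 output matches the tutorial.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define modules (must have)
        self.conv1 = nn.Conv2d(1, 6, 5)          # 1x32x32 -> 6x28x28
        self.conv2 = nn.Conv2d(6, 16, 5)         # 6x14x14 -> 16x10x10
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Build network (must have)
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 6x28x28 -> 6x14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 16x10x10 -> 16x5x5
        x = x.view(x.size(0), -1)                    # flatten the 16x5x5 Tensor
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x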
Concepts of PyTorch: NN Modules (torch.nn)
- Modules are built on Variables (Tensors wrapped for gradient tracking); gradients are handled by PyTorch.
- Common modules: convolution layers, linear layers, pooling layers, dropout layers, etc.
NN Modules: Convolution Layer
N-th batch (N), channel (C):
- torch.nn.Conv1d: input [N, C, W]  # kernel moves in 1D
- torch.nn.Conv2d: input [N, C, H, W]  # kernel moves in 2D
- torch.nn.Conv3d: input [N, C, D, H, W]  # kernel moves in 3D
For Conv2d, the input is Cin x Hin x Win. Each kernel has size Cin x k x k; convolving one kernel over the input yields one Hout x Wout output channel, and stacking Cout kernels yields a Cout x Hout x Wout output. The hyperparameters are the kernel size k, stride s (moving step size), padding p, and dilation d; the diagrams used k=3, s=1, p=1, d=1, for which Hout = Hin and Wout = Win. In general, Hout = floor((Hin + 2p - d(k-1) - 1)/s) + 1, and likewise for Wout.
Number of parameters: Cout x (Cin x k x k) weights plus Cout biases.
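A quick sketch checking these numbers against conv1 from the network above (1 -> 6 channels, k=5, s=1, p=0, d=1; the batch size of 4 is an arbitrary assumption):

import torch
import torch.nn as nn
from torch.autograd import Variable  # 0.2-era API

conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
x = Variable(torch.randn(4, 1, 32, 32))    # [N, Cin, Hin, Win]
y = conv1(x)
print(y.size())                            # (4, 6, 28, 28); Hout = (32 - 5)/1 + 1 = 28
n_params = sum(p.data.numel() for p in conv1.parameters())
print(n_params)                            # 6 x (1 x 5 x 5) + 6 = 156 parameters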
NN Modules: Linear Layer
torch.nn.Linear(in_features=3, out_features=5) computes y = Ax + b.
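A minimal sketch of the layer above (the batch size of 2 is an arbitrary assumption):

import torch
import torch.nn as nn
from torch.autograd import Variable

linear = nn.Linear(in_features=3, out_features=5)   # A is 5x3, b has 5 entries
x = Variable(torch.randn(2, 3))                     # a batch of 2 inputs
y = linear(x)                                       # y = Ax + b for each row
print(y.size())                                     # (2, 5)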
NN Modules: Dropout Layer
torch.nn.Dropout(p) randomly zeros elements of the input with probability p; outputs are scaled by 1/(1-p) so the expected activation is unchanged.
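A small sketch showing both the scaling and the train/eval distinction (p=0.5 and the input of ones are illustrative assumptions):

import torch
import torch.nn as nn
from torch.autograd import Variable

drop = nn.Dropout(p=0.5)
x = Variable(torch.ones(1, 10))
print(drop(x))   # about half the entries zeroed; survivors scaled by 1/(1-0.5) = 2
drop.eval()      # dropout does nothing in evaluation mode
print(drop(x))   # all ones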
Concepts of PyTorch: Optimizer (torch.optim) and Loss Functions (torch.nn)
- Optimizers (9 in total): SGD, Adagrad, Adam, RMSprop, ...
- Loss functions (18 in total): L1Loss, MSELoss, CrossEntropyLoss, ...
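Constructing them is one line each; a minimal sketch (the model, learning rate, and momentum are illustrative assumptions):

import torch
import torch.nn as nn

model = nn.Linear(10, 2)                 # stands in for any nn.Module
criterion = nn.CrossEntropyLoss()        # one of the 18 loss functions
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)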
What We Build
A two-layer fully connected network: input (D_in=1000) -> hidden (H=100) -> output (D_out=100), producing y_pred.
Define modules (must have); build the network (must have).
Don't update y (y are labels here). Construct our model, then the optimizer and loss function.
Each training step: reset the gradients, backward, update step; the full loop is sketched below.
http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-optim
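A sketch following the linked pytorch-optim example (layer sizes taken from the slide; batch size N, the loss, optimizer, and iteration count follow the tutorial):

import torch
from torch.autograd import Variable  # 0.2-era API

N, D_in, H, D_out = 64, 1000, 100, 100

x = Variable(torch.randn(N, D_in))
y = Variable(torch.randn(N, D_out), requires_grad=False)   # labels: don't update y

# Construct our model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# Optimizer and loss function
loss_fn = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for t in range(500):
    y_pred = model(x)              # forward
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()          # reset gradient
    loss.backward()                # backward
    optimizer.step()               # update step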
Saving Models
First approach (recommended by PyTorch):
# save only the model parameters
torch.save(the_model.state_dict(), PATH)
# load only the model parameters
the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))
Second approach:
torch.save(the_model, PATH)    # save the entire model
the_model = torch.load(PATH)   # load the entire model
http://pytorch.org/docs/master/notes/serialization.html#recommended-approach-for-saving-a-model
Recurrent Neural Network (RNN)
At each time step, the input and the previous hidden state are concatenated and fed through self.i2h (input_size = 50 + 20 = 70); a second module maps the hidden state to the output. The same module (i.e., the same parameters) is reused across time steps.
http://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#example-2-recurrent-net
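A sketch following the linked tutorial; data_size=50 and hidden_size=20 match the 50 + 20 = 70 on the slide, while output_size=10, the batch size, and the unroll length are illustrative assumptions.

import torch
import torch.nn as nn
from torch.autograd import Variable

class RNN(nn.Module):
    def __init__(self, data_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        input_size = data_size + hidden_size            # 50 + 20 = 70
        self.i2h = nn.Linear(input_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)

    def forward(self, data, last_hidden):
        input = torch.cat((data, last_hidden), 1)       # concatenate input and hidden
        hidden = self.i2h(input)
        output = self.h2o(hidden)
        return hidden, output

rnn = RNN(50, 20, 10)
batch = Variable(torch.randn(3, 50))
hidden = Variable(torch.zeros(3, 20))
for t in range(6):                  # same module (same parameters) at every step
    hidden, output = rnn(batch, hidden)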
Comparison with TensorFlow

Property                        TensorFlow                       PyTorch
Graph                           Static (dynamic via TF Fold)     Dynamic
Ramp-up time                    -                                Win
Graph creation and debugging    -                                Win
Feature coverage                Win                              Catching up quickly
Documentation                   Tie                              Tie
Serialization                   Win (supports other languages)   -
Deployment                      Win (cloud & mobile)             -
Data loading                    -                                Win
Device management               Win                              Needs .cuda()
Custom extensions               -                                Win

Summarized from https://awni.github.io/pytorch-tensorflow/