{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n\uc2e0\uacbd\ub9dd(Neural Networks)\n=======================\n\n\uc2e0\uacbd\ub9dd\uc740 ``torch.nn`` \ud328\ud0a4\uc9c0\ub97c \uc0ac\uc6a9\ud558\uc5ec \uc0dd\uc131\ud560 \uc218 \uc788\uc2b5\ub2c8\ub2e4.\n\n\uc9c0\uae08\uae4c\uc9c0 ``autograd`` \ub97c \uc0b4\ud3b4\ubd24\ub294\ub370\uc694, ``nn`` \uc740 \ubaa8\ub378\uc744 \uc815\uc758\ud558\uace0 \ubbf8\ubd84\ud558\ub294\ub370\n``autograd`` \ub97c \uc0ac\uc6a9\ud569\ub2c8\ub2e4.\n``nn.Module`` \uc740 \uacc4\uce35(layer)\uacfc ``output`` \uc744 \ubc18\ud658\ud558\ub294 ``forward(input)``\n\uba54\uc11c\ub4dc\ub97c \ud3ec\ud568\ud558\uace0 \uc788\uc2b5\ub2c8\ub2e4.\n\n\uc22b\uc790 \uc774\ubbf8\uc9c0\ub97c \ubd84\ub958\ud558\ub294 \uc2e0\uacbd\ub9dd\uc744 \uc608\uc81c\ub85c \uc0b4\ud3b4\ubcf4\uaca0\uc2b5\ub2c8\ub2e4:\n\n.. figure:: /_static/img/mnist.png\n :alt: convnet\n\n convnet\n\n\uc774\ub294 \uac04\ub2e8\ud55c \ud53c\ub4dc-\ud3ec\uc6cc\ub4dc \ub124\ud2b8\uc6cc\ud06c(Feed-forward network)\uc785\ub2c8\ub2e4. \uc785\ub825(input)\uc744 \ubc1b\uc544\n\uc5ec\ub7ec \uacc4\uce35\uc5d0 \ucc28\ub840\ub85c \uc804\ub2ec\ud55c \ud6c4, \ucd5c\uc885 \ucd9c\ub825(output)\uc744 \uc81c\uacf5\ud569\ub2c8\ub2e4.\n\n\uc2e0\uacbd\ub9dd\uc758 \uc804\ud615\uc801\uc778 \ud559\uc2b5 \uacfc\uc815\uc740 \ub2e4\uc74c\uacfc \uac19\uc2b5\ub2c8\ub2e4:\n\n- \ud559\uc2b5 \uac00\ub2a5\ud55c \ub9e4\uac1c\ubcc0\uc218(\ub610\ub294 \uac00\uc911\uce58(weight))\ub97c \uac16\ub294 \uc2e0\uacbd\ub9dd\uc744 \uc815\uc758\ud569\ub2c8\ub2e4.\n- \ub370\uc774\ud130\uc14b(dataset) \uc785\ub825\uc744 \ubc18\ubcf5\ud569\ub2c8\ub2e4.\n- \uc785\ub825\uc744 \uc2e0\uacbd\ub9dd\uc5d0\uc11c \ucc98\ub9ac\ud569\ub2c8\ub2e4.\n- \uc190\uc2e4(loss; \ucd9c\ub825\uc774 \uc815\ub2f5\uc73c\ub85c\ubd80\ud130 \uc5bc\ub9c8\ub098 \ub5a8\uc5b4\uc838\uc788\ub294\uc9c0)\uc744 \uacc4\uc0b0\ud569\ub2c8\ub2e4.\n- \ubcc0\ud654\ub3c4(gradient)\ub97c \uc2e0\uacbd\ub9dd\uc758 \ub9e4\uac1c\ubcc0\uc218\ub4e4\uc5d0 \uc5ed\uc73c\ub85c \uc804\ud30c\ud569\ub2c8\ub2e4.\n- \uc2e0\uacbd\ub9dd\uc758 \uac00\uc911\uce58\ub97c \uac31\uc2e0\ud569\ub2c8\ub2e4. \uc77c\ubc18\uc801\uc73c\ub85c \ub2e4\uc74c\uacfc \uac19\uc740 \uac04\ub2e8\ud55c \uaddc\uce59\uc744 \uc0ac\uc6a9\ud569\ub2c8\ub2e4:\n ``\uac00\uc911\uce58(wiehgt) = \uac00\uc911\uce58(weight) - \ud559\uc2b5\uc728(learning rate) * \ubcc0\ud654\ub3c4(gradient)``\n\n\uc2e0\uacbd\ub9dd \uc815\uc758\ud558\uae30\n---------------\n\n\uc2e0\uacbd\ub9dd\uc744 \uc815\uc758\ud569\ub2c8\ub2e4:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import torch\nfrom torch.autograd import Variable\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass Net(nn.Module):\n\n def __init__(self):\n super(Net, self).__init__()\n # 1 input image channel, 6 output channels, 5x5 square convolution\n # kernel\n self.conv1 = nn.Conv2d(1, 6, 5)\n self.conv2 = nn.Conv2d(6, 16, 5)\n # an affine operation: y = Wx + b\n self.fc1 = nn.Linear(16 * 5 * 5, 120)\n self.fc2 = nn.Linear(120, 84)\n self.fc3 = nn.Linear(84, 10)\n\n def forward(self, x):\n # Max pooling over a (2, 2) window\n x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n # If the size is a square you can only specify a single number\n x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n x = x.view(-1, self.num_flat_features(x))\n x = F.relu(self.fc1(x))\n x = F.relu(self.fc2(x))\n x = self.fc3(x)\n return x\n\n def num_flat_features(self, x):\n size = x.size()[1:] # all dimensions except the batch dimension\n num_features = 1\n for s in size:\n num_features *= s\n return num_features\n\n\nnet = Net()\nprint(net)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "``forward`` \ud568\uc218\ub9cc \uc815\uc758\ud558\uac8c \ub418\uba74, (\ubcc0\ud654\ub3c4\ub97c \uacc4\uc0b0\ud558\ub294) ``backward`` \ud568\uc218\ub294\n``autograd`` \ub97c \uc0ac\uc6a9\ud558\uc5ec \uc790\ub3d9\uc73c\ub85c \uc815\uc758\ub429\ub2c8\ub2e4.\n``forward`` \ud568\uc218\uc5d0\uc11c\ub294 \uc5b4\ub5a0\ud55c Tensor \uc5f0\uc0b0\uc744 \uc0ac\uc6a9\ud574\ub3c4 \ub429\ub2c8\ub2e4.\n\n\ubaa8\ub378\uc758 \ud559\uc2b5 \uac00\ub2a5\ud55c \ub9e4\uac1c\ubcc0\uc218\ub4e4\uc740 ``net.parameters()`` \uc5d0 \uc758\ud574 \ubc18\ud658\ub429\ub2c8\ub2e4.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "params = list(net.parameters())\nprint(len(params))\nprint(params[0].size()) # conv1's .weight" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "forward\uc758 \uc785\ub825\uc740 ``autograd.Variable`` \uc774\uace0, \ucd9c\ub825 \ub610\ud55c \ub9c8\ucc2c\uac00\uc9c0\uc785\ub2c8\ub2e4.\nNote: \uc774 \uc2e0\uacbd\ub9dd(LeNet)\uc758 \uc785\ub825\uc740 32x32\uc785\ub2c8\ub2e4. \uc774 \uc2e0\uacbd\ub9dd\uc5d0 MNIST \ub370\uc774\ud130\uc14b\uc744\n\uc0ac\uc6a9\ud558\uae30 \uc704\ud574\uc11c\ub294, \ub370\uc774\ud130\uc14b\uc758 \uc774\ubbf8\uc9c0\ub97c 32x32\ub85c \ud06c\uae30\ub97c \ubcc0\uacbd\ud574\uc57c \ud569\ub2c8\ub2e4.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "input = Variable(torch.randn(1, 1, 32, 32))\nout = net(input)\nprint(out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\ubaa8\ub4e0 \ub9e4\uac1c\ubcc0\uc218\uc758 \ubcc0\ud654\ub3c4 \ubc84\ud37c(gradient buffer)\ub97c 0\uc73c\ub85c \uc124\uc815\ud558\uace0, \ubb34\uc791\uc704 \uac12\uc73c\ub85c \uc5ed\uc804\ud30c\ub97c \ud569\ub2c8\ub2e4:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "net.zero_grad()\nout.backward(torch.randn(1, 10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
``torch.nn`` \uc740 \ubbf8\ub2c8 \ubc30\uce58(mini-batch)\ub9cc \uc9c0\uc6d0\ud569\ub2c8\ub2e4. ``torch.nn`` \ud328\ud0a4\uc9c0\n \uc804\uccb4\ub294 \ud558\ub098\uc758 \uc0d8\ud50c\uc774 \uc544\ub2cc, \uc0d8\ud50c\ub4e4\uc758 \ubbf8\ub2c8\ubc30\uce58\ub9cc\uc744 \uc785\ub825\uc73c\ub85c \ubc1b\uc2b5\ub2c8\ub2e4.\n\n \uc608\ub97c \ub4e4\uc5b4, ``nnConv2D`` \ub294 ``nSamples x nChannels x Height x Width`` \uc758\n 4\ucc28\uc6d0 Tensor\ub97c \uc785\ub825\uc73c\ub85c \ud569\ub2c8\ub2e4.\n\n \ub9cc\uc57d \ud558\ub098\uc758 \uc0d8\ud50c\ub9cc \uc788\ub2e4\uba74, ``input.unsqueeze(0)`` \uc744 \uc0ac\uc6a9\ud574\uc11c \uac00\uc9dc \ucc28\uc6d0\uc744\n \ucd94\uac00\ud569\ub2c8\ub2e4.