PyTorch/Tensors

Tensors

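The basic object in PyTorch is the tensor. Tensors are similar to numpy matrices, with two important additions: they work with CUDA, and they can compute gradients.

Tensors are created and manipulated much like numpy matrices: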

>>> import time
>>> import numpy as np
>>> import torch
>>> a = np.random.rand(10000, 10000).astype(np.float32)
>>> b = np.random.rand(10000, 10000).astype(np.float32)
>>> t = time.time(); c = np.matmul(a, b); time.time()-t
7.447854280471802

>>> a1 = torch.rand(10000, 10000, dtype=torch.float32) # note how torch.rand supports dtype
>>> b1 = torch.rand(10000, 10000, dtype=torch.float32)
>>> t = time.time(); c1 = torch.matmul(a1, b1); time.time()-t
7.758733749389648

Functions such as np.ones, np.zeros and np.empty, along with the other main functions and the arithmetic operators, also exist in torch:

   >>> torch.ones(2,2)
   tensor([[1., 1.],
           [1., 1.]])
   >>> torch.ones(2,2, dtype=torch.int32)
   tensor([[1, 1],
           [1, 1]], dtype=torch.int32)
   >>> a=torch.ones(2,2) # or torch.ones((2,2)) which is the same
   >>> b=a+1
   >>> c=a*b
   >>> c.reshape(1,4) # or c.view(1,4), which is equivalent for a contiguous tensor
   tensor([[2., 2., 2., 2.]])
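
The following is a minimal sketch of two points that the session above only hints at: the `*` operator multiplies element by element rather than performing a matrix product, and `view` is interchangeable with `reshape` only when the memory layout allows it. The variable names (`x`, `xt`, `flat`, `view_failed`) are illustrative and not part of the original text.

```python
import torch

a = torch.ones(2, 2)
b = a + 1            # every element becomes 2
c = a * b            # elementwise product, not a matrix product
m = a @ b            # matrix product, same as torch.matmul(a, b)

# A transposed tensor is not contiguous in memory.
x = torch.arange(6).reshape(2, 3)
xt = x.t()
flat = xt.reshape(6)     # reshape works and copies the data if it has to

try:
    xt.view(6)           # view never copies, so it refuses this layout
    view_failed = False
except RuntimeError:
    view_failed = True

print(c)
print(m)
print(flat, view_failed)
```

When in doubt, `reshape` is the safer choice; `view` is useful when you want a guarantee that no copy is made.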

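For tensors, size is a method that returns a torch.Size object, not an attribute holding a tuple. This works out well, because torch.Size is a subclass of tuple and defines a few extra operations, such as numel: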

>>> a=torch.ones(2,3,4)
>>> a.size()
torch.Size([2, 3, 4])
>>> a.size().numel()
24
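
Because torch.Size is a tuple subclass, it can be used wherever a tuple is expected. The short sketch below illustrates this under that assumption; the variable names are made up for the example.

```python
import torch

a = torch.ones(2, 3, 4)
size = a.size()

is_tuple = isinstance(size, tuple)   # torch.Size subclasses tuple
d0, d1, d2 = size                    # ordinary tuple unpacking
last = size[-1]                      # ordinary tuple indexing
same = (a.shape == size)             # the shape attribute holds the same torch.Size
second = a.size(1)                   # size(dim) returns a plain int for one dimension

print(is_tuple, (d0, d1, d2), last, same, second)
```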

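Tensor functions such as sum() and mean() return a zero-dimensional tensor rather than a number. Tensor elements are also zero-dimensional tensors, not numbers: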

   >>> a = torch.ones(2,2)
   >>> a.sum()
   tensor(4.)
   >>> a.sum().size()
   torch.Size([])
   >>> a.sum().dim() 
   0
   >>> a[0,0]
   tensor(1.)

To convert a zero-dimensional tensor to a number, call the item method explicitly:
   >>> a.sum().item()
   4.0
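
As a small illustration of getting plain Python values out of tensors, the sketch below uses `item` alongside two related conversions, `float()` and `tolist()`. Neither of the latter two appears in the original text; they are included here only for comparison.

```python
import torch

a = torch.ones(2, 2)

total = a.sum().item()       # zero-dimensional tensor -> Python float
first = a[0, 0].item()       # a single element is also a zero-dimensional tensor
as_float = float(a.mean())   # float() also accepts a single-element tensor
nested = a.tolist()          # whole tensor -> nested Python lists

print(type(total), total, first, as_float, nested)
```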

In torch, the to method takes the place of numpy's astype:

   >>> a.to(torch.int16)
   tensor([[1, 1],
           [1, 1]], dtype=torch.int16)

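The name is different because to can do more than change the element type. It can also move data into and out of CUDA memory, and it works on all kinds of torch objects, including neural networks.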
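
The sketch below shows `to` applied both to a tensor and to a small neural-network module. `torch.nn.Linear` is used here only as a convenient example of a module and is not mentioned in the original text; moving data between devices is covered in the CUDA section.

```python
import torch

a = torch.ones(2, 2)
a16 = a.to(torch.int16)      # like numpy's astype(np.int16); returns a new tensor
a64 = a.to(torch.float64)

layer = torch.nn.Linear(2, 1)          # parameters are float32 by default
layer64 = layer.to(torch.float64)      # converts the module's parameters

print(a16.dtype, a64.dtype, layer.weight.dtype)
```

Note the asymmetry: for a tensor, `to` returns a converted tensor and leaves the original alone, whereas for a module it converts the parameters in place and returns the same module object.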

Tensors and numpy matrices

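Since tensors and numpy matrices are so similar, it would be nice to be able to convert between them. And indeed we can; it is a piece of cake. To convert a tensor to a matrix, just call its numpy method. To go the other way, call the torch.tensor constructor: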

   >>> a=torch.ones(2,2, dtype=torch.float16)
   >>> a.numpy()
   array([[1., 1.],
          [1., 1.]], dtype=float16)
   >>> b=np.ones((2,2), dtype=np.float16)
   >>> torch.tensor(b)
   tensor([[1., 1.],
           [1., 1.]], dtype=torch.float16)
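
One detail worth knowing when converting is whether the data is copied or shared. The sketch below compares `torch.tensor`, which copies, with `torch.from_numpy`, which shares memory with the array. `torch.from_numpy` is not mentioned in the original text and is shown only as an alternative.

```python
import numpy as np
import torch

b = np.ones((2, 2), dtype=np.float32)
copied = torch.tensor(b)       # makes an independent copy of the data
shared = torch.from_numpy(b)   # shares memory with the numpy array

b[0, 0] = 5.0                  # modify the numpy array afterwards
print(copied[0, 0].item())     # the copy is unaffected
print(shared[0, 0].item())     # the shared tensor sees the change

t = torch.zeros(3)
arr = t.numpy()                # for a CPU tensor, numpy() also shares memory
t[0] = 7.0
print(arr[0])
```

Sharing avoids a copy for large arrays, but it also means a change on one side shows up on the other.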

CUDA

You can use PyTorch without CUDA, but CUDA typically speeds up computation by a factor of 10-20, depending on the hardware and the workload.

Before using CUDA, check whether it is available. Type

   torch.cuda.is_available()

If this returns False, you can skip the rest of this section.
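
A common way to write code that runs with or without CUDA is to pick the device once and pass it around. The following is a minimal sketch of that pattern; the variable name `device` is just a convention.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.ones(3, 3, device=device)   # created directly on the chosen device
b = torch.ones(3, 3, device=device)
c = a @ b                             # runs on whichever device was chosen

result = c.cpu()                      # bring the result back to CPU memory
print(device, result)
```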

You can also check the versions of the CUDA and cuDNN libraries:

   >>> torch.version.cuda
   '10.0'
   >>> torch.backends.cudnn.version()
   7401
   >>> torch.backends.cudnn.enabled
   True

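Unlike numpy arrays, tensors can easily be moved into and out of CUDA memory. In CUDA you can perform almost any operation that you can perform outside it. If your computer has a CUDA-capable device and you have installed the drivers (NVIDIA CUDA 10.0 or later), you can do the following: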

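import time, torch
cuda = torch.device('cuda')
a = torch.randn(10000, 10000, device=cuda)
b = torch.randn(10000, 10000, device=cuda)
torch.cuda.synchronize()  # CUDA calls are asynchronous: make sure a and b are ready before timing
t = time.time(); c = torch.matmul(a, b); torch.cuda.synchronize(); print(time.time()-t)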

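On my computer this takes 0.4 seconds. The product requires $10000^3 = 10^{12}$ multiplications, so that is $2.5\times 10^{12}$ multiplications per second.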

You can use the to method to move a tensor into CUDA memory and back out again:

>>> cuda = torch.device('cuda')
>>> cpu = torch.device('cpu')
>>> a = torch.ones(5,5)
>>> b = a.to(cuda) # move to cuda
>>> c = b.to(cpu) # move back to cpu
>>> a.device
device(type='cpu')
>>> b.device
device(type='cuda', index=0)
>>> c.device
device(type='cpu')

You cannot mix CUDA and CPU tensors in an expression:

>>> a+b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: expected backend CPU and dtype Float but got backend CUDA and dtype Float
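
The usual fix is to move one operand so that both live on the same device before combining them. The sketch below shows both directions; it falls back to the CPU when CUDA is not available, in which case the moves are no-ops. The exact wording of the error message above varies between PyTorch versions.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.ones(5, 5)          # lives on the CPU
b = a.to(device)              # lives on the GPU when one is available

on_device = a.to(b.device) + b    # move a to wherever b is, then add
on_cpu = a + b.cpu()              # or bring b back to the CPU first

print(on_device.device, on_cpu.device)
```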

Automatic differentiation

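The autograd module implemented in PyTorch makes computing gradients by backpropagation a breeze. You need to specify the requires_grad argument ("requires" with an -s, "grad" without) and call the backward method.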

>>> a=torch.ones(2,2, requires_grad=True)
>>> b=torch.eye(2,2, requires_grad=True)
>>> c = a*a*(b+1)
>>> d=c.sum() 
>>> d.backward() # calculate gradients
>>> a.grad # gradient of d with respect to a
tensor([[4., 2.],
        [2., 4.]])
>>> b.grad # gradient of d with respect to b
tensor([[1., 1.],
        [1., 1.]])
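
The gradients above can be checked by hand: since d is the sum of the elements of a*a*(b+1), the derivative with respect to a is 2*a*(b+1) and the derivative with respect to b is a*a. The sketch below compares these formulas with what autograd produces, and also illustrates that a second call to backward adds to the stored gradients rather than replacing them. The names `expected_a`, `expected_b` and `d2` are introduced only for this example.

```python
import torch

a = torch.ones(2, 2, requires_grad=True)
b = torch.eye(2, 2, requires_grad=True)
d = (a * a * (b + 1)).sum()
d.backward()

# Analytic gradients, computed without recording them in the autograd graph.
with torch.no_grad():
    expected_a = 2 * a * (b + 1)
    expected_b = a * a

matches_a = torch.equal(a.grad, expected_a)
matches_b = torch.equal(b.grad, expected_b)

# A second backward pass accumulates into .grad instead of overwriting it.
d2 = (a * a * (b + 1)).sum()
d2.backward()
doubled = torch.equal(a.grad, 2 * expected_a)

a.grad.zero_()   # reset the accumulated gradient in place
print(matches_a, matches_b, doubled, a.grad)
```

Because of this accumulation, gradients are normally reset between independent backward passes.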

In-place operators