UP | HOME

Numpy

Table of Contents

1 Numpy

Numpy 是 Python 科学计算的基础包,主要有以下一些功能:

  1. N 维数组
  2. 线性代数,傅里叶变换等
import numpy as np

2 基本用法

Numpy 的数组类是 ndarray ,它的别名是 array 。主要 numpy.array 和 Python 自带的 array.array 是不同的。

属性 含义
arr.ndim 维度
arr.shape 每个维度的大小
arr.size 数组的所有元素个数
arr.dtype 元素的数据类型
arr.itemsize 每个元素的字节数
arr.data 实际数据的引用
a0 = np.arange(24).reshape((2, 3, 4))
print(a0)
print(a0.ndim)
print(a0.shape)
print(a0.size)
print(a0.dtype)
print(a0.itemsize)
print(a0.data)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
3
(2, 3, 4)
24
int32
4
<memory at 0x0000021786B458B8>

Numpy 数组和普通数组的区别如下

a0 = [1, 2, 3]
print(a0)
print(type(a0))
b0 = np.array([6, 7, 8])
print(b0)
print(type(b0))
[1, 2, 3]
<class 'list'>
[6 7 8]
<class 'numpy.ndarray'>

3 创建数组

a1 = np.array([1, 2, 3, 4])
a2 = np.array([(1.5, 2, 3), (4, 5, 6)])
a3 = np.array([[1, 2], [3, 4]], dtype=complex)
print(a1)
print(a2)
print(a3)
[1 2 3 4]
[[1.5 2.  3. ]
 [4.  5.  6. ]]
[[1.+0.j 2.+0.j]
 [3.+0.j 4.+0.j]]

4 初始化数组

Numpy 初始化数组一般是生成一定维度的列表,具体如下

np.zeros((3, 4))
np.ones((2, 3, 4), dtype=np.int16)
np.empty((2, 3))

np.arange(10, 30, 5)  # samillar to range()
np.arange(1, 2, 0.3)  # but accept float

from numpy import pi
np.linspace(0, 2, 9)
x = np.linspace(0, 2 * pi, 10)
y = np.sin(x)

print(list(zip(x, y)))
[(0.0, 0.0), (0.6981317007977318, 0.6427876096865393), (1.3962634015954636, 0.984807753012208), (2.0943951023931953, 0.8660254037844387), (2.792526803190927, 0.3420201433256689), (3.490658503988659, -0.34202014332566866), (4.1887902047863905, -0.8660254037844385), (4.886921905584122, -0.9848077530122081), (5.585053606381854, -0.6427876096865396), (6.283185307179586, -2.4492935982947064e-16)]

随机生成数组,获取一些统计信息

a0 = np.random.random((2, 3))
print(a0)
print(a0.min(), a0.max())
print(a0.sum())
print(a0.mean())

a1 = np.random.normal(0, 0.1, 10)  # mu, sigma, count
print(a1[a1 > 0])
print(a1.mean())
[[0.9252851  0.7129627  0.40223125]
 [0.90696639 0.43173784 0.03253335]]
0.0325333525900483 0.9252851004000952
3.4117166388143234
0.5686194398023873
[0.11057307 0.07345595 0.06929216 0.13555647]
-0.03848054532856991

可以通过 Reshape 快速地修改 Python 数组的维度,这个方法非常有用

a1 = np.arange(12)
print(a1)
a2 = a1.reshape(4, 3)
print(a2)
print(a2.shape)
[ 0  1  2  3  4  5  6  7  8  9 10 11]
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
(4, 3)

当给出数组的维度为 -1 时,代表需要自推导维度大小

a0 = np.arange(30)
a0.shape = 2, -1, 3  # -1 means "whatever is needed"
print(a0.shape)

a1 = a0.reshape(-1, 10)
print(a1.shape)
(2, 5, 3)
(3, 10)

5 拼接数组

可以使用 vstack 或者 hstack 来拼接数组,不过这样的效率比较低,一般建议直接初始 化对应大小的数组,然后使用 reshape

a1 = np.arange(4).reshape((2, -1))
a2 = a1 + 20

a3 = np.vstack((a1, a2))
print(a3)

a4 = np.hstack((a1, a2))
print(a4)
[[ 0  1]
 [ 2  3]
 [20 21]
 [22 23]]
[[ 0  1 20 21]
 [ 2  3 22 23]]

6 基本运算和矩阵相关算法

Numpy 重载了 Python 的数学操作符,基本操作如下

a = np.array([20, 30, 40, 50])
b = np.arange(4)
c = a - b
print(b**2) # power
10 * np.sin(a)
print(a < 35)
[0 1 4 9]
[ True  True False False]

矩阵的乘法和矩阵里面每个元素相乘

A = np.array([[1,1], [0,1]] )
B = np.array([[2,0], [3,4]] )
print(A * B)        # elementwise product
print(A @ B)        # matrix product
print(A.dot(B))     # another matrix product
[[2 0]
 [0 4]]
[[5 4]
 [3 4]]
[[5 4]
 [3 4]]

基本线性代数的操作

a0 = np.array([[1.0, 2.0], [3.0, 4.0]])
a1 = a0.T # same as a0.transpose()
print(a1)

a2 = np.linalg.inv(a0)
print(a2)

I = np.eye(2)
J = np.array([[0.0, -1.0], [1.0, 0.0]])
J2 = J @ J  # matrix product
T0 = np.trace(I)  # trace
y = np.array([[5.], [7.]])
res1 = np.linalg.solve(a0, y)
print(res1)
res2 = np.linalg.eig(j)
print(res2)
[[1. 3.]
 [2. 4.]]
[[-2.   1. ]
 [ 1.5 -0.5]]
[[-3.]
 [ 4.]]
(array([0.69722436, 4.30277564]), array([[-0.60889368, -0.3983218 ],
       [ 0.79325185, -0.91724574]]))

7 深度复制 copy 方法

a0 = np.arange(6)
b0 = a0.copy()        # a new array object with new data is created
print(b0)
[0 1 2 3 4 5]

8 索引

Numpy 的索引是一个比较有用的操作,可以实现快速操作,防止进行使用速度比较慢的 for 循环,同时也可以使代码读起来比较简洁

8.1 使用 Indices 索引

a0 = np.arange(12)**2  # the first 12 square numbers
print(a0)
print(a[2])  # the elements of a at the positions i

b0 = np.array([1, 1, 3, 8, 5])  # an array of indices
print(a0[b0]) # get elements with indices as b0
[  0   1   4   9  16  25  36  49  64  81 100 121]
40
[ 1  1  9 64 25]

在实际工程可以使用下标索引来掩码 (mask),挑选出对应的色块

palette = np.array([
    [0, 0, 0],  # black
    [255, 0, 0],  # red
    [0, 255, 0],  # green
    [0, 0, 255],  # blue
    [255, 255, 255]  # white
])
image = np.array([
    [0, 1, 2, 0],  # each value corresponds to a color in the palette
    [0, 3, 4, 0]
])
res = palette[image]  # the (2,4,3) color image
print(res)
[[[  0   0   0]
  [255   0   0]
  [  0 255   0]
  [  0   0   0]]

 [[  0   0   0]
  [  0   0 255]
  [255 255 255]
  [  0   0   0]]]

a0 = np.arange(12).reshape(3, 4)
i = np.array([[0, 1], [1, 2]])  # indices for the first dim of a
j = np.array([[2, 1], [3, 3]])  # indices for the second dim
print(a0)

a1 = a0[i, j]  # i and j must have equal shape
print(a1)

b1 = a0[i, 2]
print(b1)

b2 = a0[:, j]  # i.e., a[ : , j]
print(b2)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[ 2  5]
 [ 7 11]]
[[ 2  6]
 [ 6 10]]
[[[ 2  1]
  [ 3  3]]

 [[ 6  5]
  [ 7  7]]

 [[10  9]
  [11 11]]]

8.2 使用布尔值索引

Numpy 支持使用条件

a0 = np.arange(6).reshape(3, 2)
b0 = a0 > 4
print(b0)
print(a0[b0])  # 1d array with the selected elements

a0[b0] = 0  # All elements of 'a' higher than 4 become 0
print(a0)
[[False False]
 [False False]
 [False  True]]
[5]
[[0 1]
 [2 3]
 [4 0]]

多维度的索引

a0 = np.arange(10).reshape(2, -1)
print(a0)
b1 = np.array([False, True])  # first dim selection
b2 = np.array([True, False, True, False, False])  # second dim selection
print(a0[b1, :])  # selecting rows
a0[b1]  # same thing
a0[:, b2]  # selecting columns
print(a0[b1, b2])  # a weird thing to do
[[0 1 2 3 4]
 [5 6 7 8 9]]
[[5 6 7 8 9]]
[5 7]

9 借助 matplotlib 画一些统计图

import numpy as np
import matplotlib.pyplot as plt
filename = '../static/image/2018/10/numpy1.png'

# Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
mu, sigma = 2, 0.5
v = np.random.normal(mu,sigma,10000)
# Plot a normalized histogram with 50 bins
plt.hist(v, bins=50, density=1)       # matplotlib version (plot)
plt.savefig(filename)
return filename

numpy1.png

import numpy as np
import matplotlib.pyplot as plt
filename = '../static/image/2018/10/numpy2.png'

# Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
mu, sigma = 2, 0.5
v = np.random.normal(mu,sigma,10000)
# Compute the histogram with numpy and then plot it
(n, bins) = np.histogram(v, bins=50, density=True)  # NumPy version (no plot)
plt.plot(.5*(bins[1:]+bins[:-1]), n)
plt.savefig(filename)
return filename

numpy2.png

10 安装 Numpy 和测试

使用 pip 安装

pip install numpy

运行 Numpy 自带的测试

python -c "import numpy; numpy.test()"

11 参考链接

Last Updated 2020-10-12 Mon 20:20. Created by Jinghui Hu at 2018-10-26 Fri 11:04.