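Important

Understanding how TensorFlow handles matrices is essential to understanding how data flows through a computational graph.

Many algorithms depend on matrix operations, and TensorFlow provides simple operations for performing them. For all of the examples below, we first create a graph session by running the following commands: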

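>>> import numpy as np
>>> import tensorflow as tf
>>> from tensorflow.python.framework import ops
>>> tf.compat.v1.disable_eager_execution()  # must run before any graph, op, or tensor is created
>>> ops.reset_default_graph()
>>> sess = tf.compat.v1.Session()  # created last so it is bound to the fresh default graph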

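Creating a matrix

We can create a two-dimensional matrix from a numpy array or a nested list, as described in the section on tensors ( convert_to_tensor ). We can also use the tensor creation functions ( zeros(), ones(), truncated_normal(), and so on) with a two-dimensional shape, since a matrix is simply a 2-D tensor. TensorFlow also lets us create a diagonal matrix from a one-dimensional array or list with diag(). For example:

Diagonal matrix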

>>> identiy_matrix = tf.compat.v1.diag([1.0, 1.0, 1.0])
>>> print(sess.run(identiy_matrix))
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Random matrix (truncated normal)

This creates a 2-D random tensor drawn from a truncated normal distribution.

>>> A = tf.compat.v1.truncated_normal([2,3])
>>> print(sess.run(A))
[[ 0.19759183 -1.436814   -1.107715  ]
 [-0.6905967  -0.19711868  0.6596967 ]]

Constant matrix

Create a 2-D tensor filled with a constant, that is, a constant matrix.

>>> B = tf.fill([2,3],5.0)
>>> print(sess.run(B))
[[5. 5. 5.]
 [5. 5. 5.]]

Random matrix (uniform)

Create a 2-D tensor of uniformly distributed random values, that is, a random matrix.

>>> C = tf.compat.v1.random_uniform([3,2])
>>> print(sess.run(C))
[[0.3477279  0.39023817]
 [0.38307    0.8967395 ]
 [0.8217212  0.32184577]]

convert_to_tensor

Use the built-in function convert_to_tensor to convert an array into a tensor.

>>> D = tf.compat.v1.convert_to_tensor(np.array([[1.,2.,3.],[-3.,-7.,-1.],[0.,5.,-2.]]))
>>> print(sess.run(D))
[[ 1.  2.  3.]
 [-3. -7. -1.]
 [ 0.  5. -2.]]

A 3-D tensor (not a matrix in the traditional sense)

>>> E = tf.zeros([2,3,3])
>>> print(sess.run(E))
[[[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]]

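Matrix addition and subtraction

Addition

# A is random and is re-sampled on every sess.run, so the sum does not match the values of A printed earlier
>>> print(sess.run(A+B))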
[[4.2034802 5.6497774 6.104109 ]
 [3.8710573 5.6505775 4.063135 ]]

Subtraction

>>> print(sess.run(B-B))
[[0. 0. 0.]
 [0. 0. 0.]]

Multiplication

>>> print(sess.run(tf.matmul(B, identiy_matrix)))
[[5. 5. 5.]
 [5. 5. 5.]]

# Matrix multiplication requires compatible dimensions; here A and B are both 2x3, so the call fails
>>> print(sess.run(tf.matmul(A, B)))
Traceback (most recent call last):
...
ValueError: Dimensions must be equal
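The rule behind this error can be sketched in plain Python, independent of TensorFlow. The helper names `matmul_shape` and `transposed` are hypothetical and exist only for this illustration; they mimic the shape check for 2-D inputs, not TensorFlow's actual implementation. Transposing B (which `tf.matmul` can do through its `transpose_b` flag) makes the inner dimensions agree.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise ValueError."""
    if len(a_shape) != 2 or len(b_shape) != 2:
        raise ValueError("this sketch only handles 2-D shapes")
    rows_a, cols_a = a_shape
    rows_b, cols_b = b_shape
    if cols_a != rows_b:
        raise ValueError(
            f"Dimensions must be equal, but are {cols_a} and {rows_b}")
    return [rows_a, cols_b]


def transposed(shape):
    """Shape of the transpose of a 2-D matrix."""
    return [shape[1], shape[0]]


a_shape = [2, 3]  # shape of A above
b_shape = [2, 3]  # shape of B above

# A (2x3) times B (2x3) is invalid: inner dimensions 3 and 2 differ.
try:
    matmul_shape(a_shape, b_shape)
    product_ok = True
except ValueError:
    product_ok = False

# Transposing B gives a 3x2 matrix, so A times B^T is a valid 2x2 product.
fixed_shape = matmul_shape(a_shape, transposed(b_shape))
print(product_ok, fixed_shape)  # False [2, 2]
```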

# If a function is unclear, call help() on it
>>> help(tf.matmul)
Help on function matmul in module tensorflow.python.ops.math_ops:
...
...

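Matrix transpose

# C is random and is re-sampled on every run, so these values differ from the earlier printout of C
>>> print(sess.run(tf.transpose(C)))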
[[0.11786842 0.32758367 0.54398596]
 [0.35542393 0.546188   0.6743456 ]]

# For the determinant, use matrix_determinant()
>>> print(sess.run(tf.compat.v1.matrix_determinant(D)))
-37.99999999999999
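The printed value is the floating-point rendering of an exact determinant of -38. As a quick check, here is a plain-Python cofactor expansion along the first row. The helper `det3` is a hypothetical name for this sketch and is not how TensorFlow computes determinants.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


D_rows = [[1, 2, 3], [-3, -7, -1], [0, 5, -2]]  # same entries as D above
det_D = det3(D_rows)
print(det_D)  # -38
```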

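Matrix inverse

# Note: matrix_inverse() uses LU decomposition with partial pivoting. The Cholesky decomposition of a symmetric positive-definite matrix is computed separately with cholesky().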
>>> print(sess.run(tf.compat.v1.matrix_inverse(D)))
[[-0.5        -0.5        -0.5       ]
 [ 0.15789474  0.05263158  0.21052632]
 [ 0.39473684  0.13157895  0.02631579]]
>>> print(sess.run(tf.compat.v1.cholesky(identiy_matrix)))
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

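Eigenvalues and eigenvectors

# For the eigenvalues and eigenvectors of a matrix, use the code below.
# Only the lower triangular part of the input is read, so for the non-symmetric D the results describe the symmetric matrix built from D's lower triangle.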
>>> print(sess.run(tf.compat.v1.self_adjoint_eigvals(D)))
[-10.65907521  -0.22750691   2.88658212]
# self_adjoint_eig() returns the eigenvalues as its first output and the eigenvectors as its second; mathematically this is called an eigendecomposition
>>> print(sess.run(tf.compat.v1.self_adjoint_eig(D)[0]))
[-10.65907521  -0.22750691   2.88658212]
>>> print(sess.run(tf.compat.v1.self_adjoint_eig(D)[1]))
[[ 0.21749542  0.63250104 -0.74339638]
 [ 0.84526515  0.2587998   0.46749277]
 [-0.4880805   0.73004459  0.47834331]]
>>> eigenvalues, eigenvectors = sess.run(tf.compat.v1.self_adjoint_eig(D))
>>> eigenvalues
array([-10.65907521,  -0.22750691,   2.88658212])
>>> eigenvectors
array([[ 0.21749542,  0.63250104, -0.74339638],
       [ 0.84526515,  0.2587998 ,  0.46749277],
       [-0.4880805 ,  0.73004459,  0.47834331]])
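Because D is not symmetric, it helps to see which matrix these eigenvalues actually belong to. The NumPy sketch below rests on one assumption: that the op reads only the lower triangle of its input, as the self-adjoint eigen documentation states. It builds that symmetric matrix explicitly and decomposes it with `np.linalg.eigh`. The names `D_np`, `S`, and `residual` are introduced only for this illustration.

```python
import numpy as np

D_np = np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]])

# Symmetric matrix built from the lower triangle of D (the part the op reads).
lower = np.tril(D_np)
S = lower + lower.T - np.diag(np.diag(D_np))

# np.linalg.eigh returns eigenvalues in ascending order and eigenvectors
# as columns, the same layout described for self_adjoint_eig.
eigenvalues, eigenvectors = np.linalg.eigh(S)
print(eigenvalues)

# Check the defining relation S @ v_i = e_i * v_i for every column v_i.
residual = S @ eigenvectors - eigenvectors * eigenvalues
print(np.abs(residual).max())
```

The eigenvalues should agree with the values printed above up to rounding. Individual eigenvectors may differ by a sign, since an eigenvector is only defined up to a scalar factor.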

Python and Colab basics

To access the Colab Notebook, sign in to your Google account and click the following link:

Modules covered in this chapter

Note

Introduction to tf.compat.v1.diag

Returns a diagonal tensor with the given diagonal values.

Given a diagonal, this operation returns a tensor with the diagonal and everything else padded with zeros. The diagonal is computed as follows:

Assume diagonal has dimensions [D1,…, Dk], then the output is a tensor of rank 2k with dimensions [D1,…, Dk, D1,…, Dk] where:

output[i1,…, ik, i1,…, ik] = diagonal[i1, …, ik] and 0 everywhere else.

For example:

```
# 'diagonal' is [1, 2, 3, 4]
tf.diag(diagonal) ==> [[1, 0, 0, 0]
                       [0, 2, 0, 0]
                       [0, 0, 3, 0]
                       [0, 0, 0, 4]]
```
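For the common rank-1 case, the rule output[i, i] = diagonal[i] (zero elsewhere) can be sketched in plain Python. `diag_from_vector` is a hypothetical helper used only to illustrate the definition and does not cover the higher-rank case.

```python
def diag_from_vector(diagonal):
    """Build a square nested list with `diagonal` on the main diagonal."""
    n = len(diagonal)
    return [[diagonal[i] if i == j else 0 for j in range(n)] for i in range(n)]


result = diag_from_vector([1, 2, 3, 4])
for row in result:
    print(row)
```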

param diagonal: A Tensor. Must be one of the following types: bfloat16, half, float32, float64, int32, int64, complex64, complex128. Rank k tensor where k is at most 1.
param name: A name for the operation (optional).
returns: A Tensor. Has the same type as diagonal.

Note

Introduction to tf.compat.v1.convert_to_tensor

Converts the given value to a Tensor.

This function converts Python objects of various types to Tensor objects. It accepts Tensor objects, numpy arrays, Python lists, and Python scalars. For example:

```python
import numpy as np
import tensorflow as tf

def my_func(arg):
  arg = tf.convert_to_tensor(arg, dtype=tf.float32)
  return tf.matmul(arg, arg) + arg

# The following calls are equivalent.
value_1 = my_func(tf.constant([[1.0, 2.0], [3.0, 4.0]]))
value_2 = my_func([[1.0, 2.0], [3.0, 4.0]])
value_3 = my_func(np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32))
```

This function can be useful when composing a new operation in Python (such as my_func in the example above). All standard Python op constructors apply this function to each of their Tensor-valued inputs, which allows those ops to accept numpy arrays, Python lists, and scalars in addition to Tensor objects.

Note: This function diverges from default Numpy behavior for float and string types when None is present in a Python list or scalar. Rather than silently converting None values, an error will be thrown.

param value: An object whose type has a registered Tensor conversion function.
param dtype: Optional element type for the returned tensor. If missing, the type is inferred from the type of value.
param name: Optional name to use if a new Tensor is created.
param preferred_dtype: Optional element type for the returned tensor, used when dtype is None. In some cases, a caller may not have a dtype in mind when converting to a tensor, so preferred_dtype can be used as a soft preference. If the conversion to preferred_dtype is not possible, this argument has no effect.
param dtype_hint: same meaning as preferred_dtype, and overrides it.
returns: A Tensor based on value.

raises:
  • TypeError – If no conversion function is registered for value to dtype.
  • RuntimeError – If a registered conversion function returns an invalid value.
  • ValueError – If the value is a tensor not of given dtype in graph mode.

Note

Introduction to tf.matmul

Multiplies matrix a by matrix b, producing a * b.

The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size.

Both matrices must be of the same type. The supported types are: bfloat16, float16, float32, float64, int32, int64, complex64, complex128.

Either matrix can be transposed or adjointed (conjugated and transposed) on the fly by setting one of the corresponding flag to True. These are False by default.

If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding a_is_sparse or b_is_sparse flag to True. These are False by default. This optimization is only available for plain matrices (rank-2 tensors) with datatypes bfloat16 or float32.

A simple 2-D tensor matrix multiplication:

>>> a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])
>>> a  # 2-D tensor
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>
>>> b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2])
>>> b  # 2-D tensor
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 7,  8],
       [ 9, 10],
       [11, 12]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 58,  64],
       [139, 154]], dtype=int32)>

A batch matrix multiplication with batch shape [2]:

>>> a = tf.constant(np.arange(1, 13, dtype=np.int32), shape=[2, 2, 3])
>>> a  # 3-D tensor
<tf.Tensor: shape=(2, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]], dtype=int32)>
>>> b = tf.constant(np.arange(13, 25, dtype=np.int32), shape=[2, 3, 2])
>>> b  # 3-D tensor
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[13, 14],
        [15, 16],
        [17, 18]],
       [[19, 20],
        [21, 22],
        [23, 24]]], dtype=int32)>
>>> c = tf.matmul(a, b)
>>> c  # `a` * `b`
<tf.Tensor: shape=(2, 2, 2), dtype=int32, numpy=
array([[[ 94, 100],
        [229, 244]],
       [[508, 532],
        [697, 730]]], dtype=int32)>

Since python >= 3.5 the @ operator is supported (see [PEP 465](https://www.python.org/dev/peps/pep-0465/)). In TensorFlow, it simply calls the tf.matmul() function, so the following lines are equivalent:

>>> d = a @ b @ [[10], [11]]
>>> d = tf.matmul(tf.matmul(a, b), [[10], [11]])

param a: tf.Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
param b: tf.Tensor with same type and rank as a.
param transpose_a: If True, a is transposed before multiplication.
param transpose_b: If True, b is transposed before multiplication.
param adjoint_a: If True, a is conjugated and transposed before multiplication.
param adjoint_b: If True, b is conjugated and transposed before multiplication.
param a_is_sparse: If True, a is treated as a sparse matrix. Notice, this does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in a are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
param b_is_sparse: If True, b is treated as a sparse matrix. Notice, this does not support `tf.sparse.SparseTensor`, it just makes optimizations that assume most values in b are zero. See tf.sparse.sparse_dense_matmul for some support for tf.sparse.SparseTensor multiplication.
param output_type: The output datatype if needed. Defaults to None in which case the output_type is the same as input type. Currently only works when input tensors are type (u)int8 and output_type can be int32.
param name: Name for the operation (optional).
returns: A tf.Tensor of the same type as a and b where each inner-most matrix is the product of the corresponding matrices in a and b, e.g. if all transpose or adjoint attributes are False:

output[…, i, j] = sum_k (a[…, i, k] * b[…, k, j]), for all indices i, j.

Note: This is matrix product, not element-wise product.

raises:
  • ValueError – If transpose_a and adjoint_a, or transpose_b and adjoint_b are both set to True.
  • TypeError – If output_type is specified but the types of a, b and output_type are not (u)int8, (u)int8 and int32.
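The summation formula in the return description can be written out directly for the 2-D case. This is a plain-Python sketch of the definition, using a hypothetical helper `matmul_2d` and the same input values as the first tf.matmul example. It is not how TensorFlow implements the op.

```python
def matmul_2d(a, b):
    """Matrix product of nested lists: out[i][j] = sum_k a[i][k] * b[k][j]."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [[sum(a_row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for a_row in a]


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
product = matmul_2d(a, b)
print(product)  # [[58, 64], [139, 154]]

# Element-wise multiplication is a different operation and needs equal shapes.
elementwise = [[x * y for x, y in zip(r1, r2)] for r1, r2 in zip(a, a)]
print(elementwise)  # [[1, 4, 9], [16, 25, 36]]
```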

Note

Introduction to tf.transpose

Transposes a, where a is a Tensor.

Permutes the dimensions according to the value of perm.

The returned tensor’s dimension i will correspond to the input dimension perm[i]. If perm is not given, it is set to (n-1…0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.

If conjugate is True and a.dtype is either complex64 or complex128 then the values of a are conjugated and transposed.

NumPy compatibility: In NumPy, transposes are memory-efficient constant-time operations, as they simply return a new view of the same data with adjusted strides.

TensorFlow does not support strides, so transpose returns a new tensor with the items permuted.

For example:

>>> x = tf.constant([[1, 2, 3], [4, 5, 6]])
>>> tf.transpose(x)
<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 4],
       [2, 5],
       [3, 6]], dtype=int32)>

Equivalently, you could call tf.transpose(x, perm=[1, 0]).

If x is complex, setting conjugate=True gives the conjugate transpose:

>>> x = tf.constant([[1 + 1j, 2 + 2j, 3 + 3j],
...                  [4 + 4j, 5 + 5j, 6 + 6j]])
>>> tf.transpose(x, conjugate=True)
<tf.Tensor: shape=(3, 2), dtype=complex128, numpy=
array([[1.-1.j, 4.-4.j],
       [2.-2.j, 5.-5.j],
       [3.-3.j, 6.-6.j]])>

‘perm’ is more useful for n-dimensional tensors where n > 2:

>>> x = tf.constant([[[ 1,  2,  3],
...                   [ 4,  5,  6]],
...                  [[ 7,  8,  9],
...                   [10, 11, 12]]])

As above, simply calling tf.transpose will default to perm=[2,1,0].

To take the transpose of the matrices in dimension-0 (such as when you are transposing matrices where 0 is the batch dimension), you would set perm=[0,2,1].

>>> tf.transpose(x, perm=[0, 2, 1])
<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[ 1,  4],
        [ 2,  5],
        [ 3,  6]],
       [[ 7, 10],
        [ 8, 11],
        [ 9, 12]]], dtype=int32)>
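The meaning of perm can be checked with NumPy, whose `np.transpose(x, axes=...)` follows the same convention: output dimension i corresponds to input dimension perm[i]. This sketch reuses the values of the 3-D example above; the variable names are chosen only for illustration.

```python
import numpy as np

x = np.arange(1, 13).reshape(2, 2, 3)  # same values as the 3-D example above
perm = [0, 2, 1]

y = np.transpose(x, axes=perm)

# Output dimension i has the size of input dimension perm[i].
expected_shape = tuple(x.shape[p] for p in perm)
print(y.shape, expected_shape)  # (2, 3, 2) (2, 3, 2)

# With perm = [0, 2, 1], y[b, i, j] equals x[b, j, i] for every index.
print(y[1, 2, 0], x[1, 0, 2])  # 9 9
```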

Note: This has a shorthand, linalg.matrix_transpose.

param a: A Tensor.
param perm: A permutation of the dimensions of a. This should be a vector.
param conjugate: Optional bool. Setting it to True is mathematically equivalent to tf.math.conj(tf.transpose(input)).
param name: A name for the operation (optional).
returns: A transposed Tensor.

Note

Introduction to tf.compat.v1.matrix_determinant

Computes the determinant of one or more square matrices.

The input is a tensor of shape […, M, M] whose inner-most 2 dimensions form square matrices. The output is a tensor containing the determinants for all input submatrices […, :, :].

param input: A Tensor. Must be one of the following types: half, float32, float64, complex64, complex128. Shape is […, M, M].
param name: A name for the operation (optional).
returns: A Tensor. Has the same type as input.

Note

Introduction to tf.compat.v1.matrix_inverse

Computes the inverse of one or more square invertible matrices or their adjoints (conjugate transposes).

The input is a tensor of shape […, M, M] whose inner-most 2 dimensions form square matrices. The output is a tensor of the same shape as the input containing the inverse for all input submatrices […, :, :].

The op uses LU decomposition with partial pivoting to compute the inverses.

If a matrix is not invertible there is no guarantee what the op does. It may detect the condition and raise an exception or it may simply return a garbage result.
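To give a feel for what partial pivoting does, here is a plain-Python sketch that inverts a matrix by Gauss-Jordan elimination with partial pivoting. This is a close relative of the LU approach named above, not the op's actual algorithm. The helper `invert` and its choice to raise on a near-zero pivot are assumptions of this sketch; as noted, the real op makes no such guarantee.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes 1.
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


D_rows = [[1, 2, 3], [-3, -7, -1], [0, 5, -2]]  # same entries as D in the tutorial
for row in invert(D_rows):
    print([round(v, 4) for v in row])
```

Up to rounding, the rows should match the matrix_inverse output shown earlier for D.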

param input: A Tensor. Must be one of the following types: float64, float32, half, complex64, complex128. Shape is […, M, M].
param adjoint: An optional bool. Defaults to False.
param name: A name for the operation (optional).
returns: A Tensor. Has the same type as input.

Note

Introduction to tf.compat.v1.cholesky

Computes the Cholesky decomposition of one or more square matrices.

The input is a tensor of shape […, M, M] whose inner-most 2 dimensions form square matrices.

The input has to be symmetric and positive definite. Only the lower-triangular part of the input will be used for this operation. The upper-triangular part will not be read.

The output is a tensor of the same shape as the input containing the Cholesky decompositions for all input submatrices […, :, :].
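As an illustration of why only the lower triangle is needed, here is a textbook Cholesky factorization for a real symmetric positive-definite matrix in plain Python. `cholesky_lower` is a hypothetical helper written for this sketch; it is not TensorFlow's implementation and does not handle batches or complex inputs.

```python
import math


def cholesky_lower(matrix):
    """Return lower-triangular L with L * L^T == matrix (real symmetric PD input)."""
    n = len(matrix)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Only matrix[i][j] with j <= i (the lower triangle) is ever read.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                value = matrix[i][i] - s
                if value <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(value)
            else:
                L[i][j] = (matrix[i][j] - s) / L[j][j]
    return L


M = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky_lower(M)
print(L)
```

For the identity matrix, the factor is the identity itself, which agrees with the cholesky(identiy_matrix) output in the tutorial.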

Note: The gradient computation on GPU is faster for large matrices but not for large batch dimensions when the submatrices are small. In this case it might be faster to use the CPU.

param input: A Tensor. Must be one of the following types: float64, float32, half, complex64, complex128. Shape is […, M, M].
param name: A name for the operation (optional).
returns: A Tensor. Has the same type as input.

Note

Introduction to tf.compat.v1.self_adjoint_eig

Computes the eigen decomposition of a batch of self-adjoint matrices.

Computes the eigenvalues and eigenvectors of the innermost N-by-N matrices in tensor such that tensor[…,:,:] * v[…, :,i] = e[…, i] * v[…,:,i], for i=0…N-1.

param tensor: Tensor of shape […, N, N]. Only the lower triangular part of each inner matrix is referenced.
param name: string, optional name of the operation.
returns:
e: Eigenvalues. Shape is […, N]. Sorted in non-decreasing order.
v: Eigenvectors. Shape is […, N, N]. The columns of the innermost matrices contain eigenvectors of the corresponding matrices in tensor.