When we are uncertain about the output size of a tensor after it passes through some layer, we can simply experiment:
import math
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D

x = tf.constant([[1., 1., 1., 2., 3.],
                 [1., 1., 4., 5., 6.],
                 [1., 1., 7., 8., 9.],
                 [1., 1., 7., 8., 9.],
                 [1., 1., 7., 8., 9.]])
x = tf.reshape(x, [1, 5, 5, 1])  # NHWC: batch, height, width, channels
print(MaxPool2D((5, 5), strides=(2, 2), padding="same")(x))
print(math.ceil(5 / 2))  # the expected output width
which yields
tf.Tensor(
[[[[7.]
   [9.]
   [9.]]

  [[7.]
   [9.]
   [9.]]

  [[7.]
   [9.]
   [9.]]]], shape=(1, 3, 3, 1), dtype=float32)
3
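To see where the corner entry comes from: with pool size $5$, stride $2$ and a $5\times 5$ input, "same" padding adds $(3-1)\cdot 2+5-5=4$ rows and columns in total, two on each side, so the window for output $(0,0)$ only covers rows and columns $0$ to $2$ of the original input. A minimal sketch of this window arithmetic in NumPy (not how TensorFlow implements it internally):

import numpy as np

a = np.array([[1., 1., 1., 2., 3.],
              [1., 1., 4., 5., 6.],
              [1., 1., 7., 8., 9.],
              [1., 1., 7., 8., 9.],
              [1., 1., 7., 8., 9.]])
p = np.pad(a, 2, constant_values=-np.inf)  # 2 cells of -inf per side
print(p[0:5, 0:5].max())  # 7.0, the top-left entry of the pooled tensor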
For a layer that carries trainable weights, we can pin the weights down with a constant initializer and test it the same way:
from tensorflow.keras.layers import Conv2D

model = Conv2D(3, (3, 3), strides=(2, 2), padding="same",
               kernel_initializer=tf.constant_initializer(1.))
x = tf.constant([[1., 2., 3., 4., 5.],
                 [1., 2., 3., 4., 5.],
                 [1., 2., 3., 4., 5.],
                 [1., 2., 3., 4., 5.],
                 [1., 2., 3., 4., 5.]])
x = tf.reshape(x, (1, 5, 5, 1))
print(model(x))
which yields
tf.Tensor(
[[[[ 6.  6.  6.]
   [18. 18. 18.]
   [18. 18. 18.]]

  [[ 9.  9.  9.]
   [27. 27. 27.]
   [27. 27. 27.]]

  [[ 6.  6.  6.]
   [18. 18. 18.]
   [18. 18. 18.]]]], shape=(1, 3, 3, 3), dtype=float32)
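Since the kernel is all ones and the bias defaults to zero, every output entry is simply the sum of the input values inside its zero-padded $3\times 3$ window. Here "same" padding adds $(3-1)\cdot 2+3-5=2$ rows and columns in total, one on each side, so the corner window covers rows and columns $0$ to $1$; a quick hand check in the same NumPy style as above:

import numpy as np

b = np.array([[1., 2., 3., 4., 5.]] * 5)
q = np.pad(b, 1)  # one row/column of zeros per side
print(q[0:3, 0:3].sum())  # 6.0, the corner entry of every output channel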
In fact, one can prove for both MaxPool2D and Conv2D that if stride $=s$ and padding $=$ same, then
\[\text{output\_width} = \left\lfloor\frac{\text{input\_width}-1}{s}\right\rfloor + 1 = \left\lceil\frac{\text{input\_width}}{s}\right\rceil\]
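Before proving the last equality, we can brute-force check the formula itself; the pool size and the ranges of widths and strides below are arbitrary choices for the test:

import math
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D

for w in range(1, 20):      # input width
    for s in range(1, 6):   # stride
        x = tf.zeros((1, w, w, 1))
        out = MaxPool2D((3, 3), strides=(s, s), padding="same")(x)
        assert out.shape[1] == out.shape[2] == math.ceil(w / s)
print("output width equals ceil(input width / s) in every tested case")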
The last equality deserves a proof, as it is not entirely trivial:
Fact. For any positive integers $w,s$, we have \[
\left\lfloor \frac{w-1}{s}\right\rfloor + 1 = \left\lceil \frac{w}{s}\right\rceil.
\]
Proof. We argue case by case. If $w=ks$ for some positive integer $k$, then, since $0<\frac{1}{s}\le 1$,
\[\text{LHS} = \left\lfloor k - \frac{1}{s}\right\rfloor +1 = (k-1)+1=k = \lceil k\rceil = \text{RHS}. \]
Otherwise $w=ks+j$ for some $k\in\mathbb{N}$ and some integer $j$ with $0<j<s$. Then $0\le \frac{j-1}{s}<1$ and $0<\frac{j}{s}<1$, so \[
\text{LHS} = \left\lfloor k+\frac{j-1}{s}\right\rfloor + 1 = k+1 = \left\lceil k+\frac{j}{s}\right\rceil = \left\lceil \frac{ks+j}{s}\right\rceil = \left\lceil\frac{w}{s}\right\rceil=\text{RHS}.\qed
\]
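The identity can also be verified numerically, over whatever range one likes:

import math

assert all((w - 1) // s + 1 == math.ceil(w / s)
           for w in range(1, 500)
           for s in range(1, 500))
print("identity verified for 1 <= w, s < 500")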