title: 'DeepLearning.ai Homework (4-2): Deep convolutional models: case studies'
id: dl-ai-4-2h
This week's homework has two parts: the first covers basic Keras usage, and the second builds a ResNet. Keras is a high-level wrapper around TensorFlow that makes building neural networks much more efficient.
First, import the libraries:
```python
import numpy as np
from keras import layers
from keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D
from keras.layers import AveragePooling2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input
import pydot
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model
from kt_utils import *

import keras.backend as K
K.set_image_data_format('channels_last')

import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
%matplotlib inline
```
Build the model:
```python
def HappyModel(input_shape):
    """
    Implementation of the HappyModel.

    Arguments:
    input_shape -- shape of the images of the dataset

    Returns:
    model -- a Model() instance in Keras
    """

    ### START CODE HERE ###
    # Feel free to use the suggested outline in the text above to get started, and run through the whole
    # exercise (including the later portions of this notebook) once. Then come back and also try out
    # other network architectures.
    X_input = Input(input_shape)

    X = ZeroPadding2D((3, 3))(X_input)

    X = Conv2D(32, (7, 7), strides=(1, 1), name="Conv0")(X)
    X = BatchNormalization(axis=3, name='bn0')(X)
    X = Activation('relu')(X)

    X = MaxPooling2D((2, 2), name='max_pool')(X)

    X = Flatten()(X)
    X = Dense(1, activation='sigmoid', name='fc')(X)

    model = Model(inputs=X_input, outputs=X, name='HappyModel')
    ### END CODE HERE ###

    return model
```
Then instantiate the model:
```python
### START CODE HERE ### (1 line)
happyModel = HappyModel(X_train.shape[1:])
### END CODE HERE ###
```
Choose the optimizer and the loss function:
```python
### START CODE HERE ### (1 line)
happyModel.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
### END CODE HERE ###
```
Train:
```python
### START CODE HERE ### (1 line)
happyModel.fit(x=X_train, y=Y_train, epochs=10, batch_size=32)
### END CODE HERE ###
```
Evaluate on the test set:
```python
### START CODE HERE ### (1 line)
preds = happyModel.evaluate(X_test, Y_test)
### END CODE HERE ###
print()
print("Loss = " + str(preds[0]))
print("Test Accuracy = " + str(preds[1]))
```
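You can also run the model on a single picture of your own. A minimal sketch, assuming a hypothetical file `my_image.jpg` and the same 1/255 pixel scaling that was applied to the training data:

```python
# Hedged sketch: predict on one image. 'my_image.jpg' is a hypothetical path;
# the model expects 64x64 RGB input scaled the same way as the training data.
import numpy as np
from keras.preprocessing import image

img = image.load_img('my_image.jpg', target_size=(64, 64))
x = image.img_to_array(img)        # shape (64, 64, 3)
x = np.expand_dims(x, axis=0)      # shape (1, 64, 64, 3): add a batch dimension
x = x / 255.                       # scale pixels as during training
print(happyModel.predict(x))       # probability that the face is "happy"
```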
You can use summary() to see the details:
```python
happyModel.summary()
```
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 64, 64, 3)         0
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 70, 70, 3)         0
_________________________________________________________________
Conv0 (Conv2D)               (None, 64, 64, 32)        4736
_________________________________________________________________
bn0 (BatchNormalization)     (None, 64, 64, 32)        128
_________________________________________________________________
activation_1 (Activation)    (None, 64, 64, 32)        0
_________________________________________________________________
max_pool (MaxPooling2D)      (None, 32, 32, 32)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 32768)             0
_________________________________________________________________
fc (Dense)                   (None, 1)                 32769
=================================================================
Total params: 37,633
Trainable params: 37,569
Non-trainable params: 64
_________________________________________________________________
```
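As a quick sanity check on these parameter counts:

```python
# Conv0: 32 filters of shape 7x7x3, plus one bias per filter
print(32 * (7 * 7 * 3) + 32)   # 4736
# bn0: gamma, beta, moving mean, moving variance -- one each per channel;
# only gamma and beta (2 * 32 = 64) are trainable, the rest are not
print(4 * 32)                  # 128
# fc: one weight per unit of the flattened 32*32*32 volume, plus one bias
print(32 * 32 * 32 + 1)        # 32769
```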
Use plot_model() to get a detailed graph:
```python
plot_model(happyModel, to_file='HappyModel.png')
SVG(model_to_dot(happyModel).create(prog='dot', format='svg'))
```
The second part, building a ResNet, has two main steps: first implement the residual blocks, then assemble them into the full network.
This part deals with a problem of very deep neural networks: during training, gradients can vanish or explode, so convergence becomes very slow. Residual networks effectively mitigate this problem.
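In the lecture's notation, a residual block feeds the earlier activation $a^{[l]}$ forward and adds it back just before the nonlinearity, so the block can easily learn the identity mapping even when its weights shrink toward zero:

```latex
a^{[l+2]} = g\left(z^{[l+2]} + a^{[l]}\right)
```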
Depending on whether the input and output dimensions of a block match, there are two kinds of blocks:
1. Identity block
As you can see, the input and output of the identity block have the same dimensions, so they can be added directly.
Here we implement a block whose shortcut skips over three layers.
Its basic structure is:
First component of main path:

- A CONV2D layer named `conv_name_base + '2a'`. Use 0 as the seed for the random initialization.
- A BatchNorm layer named `bn_name_base + '2a'`.
- A ReLU activation.

Second component of main path:

- A CONV2D layer named `conv_name_base + '2b'`. Use 0 as the seed for the random initialization.
- A BatchNorm layer named `bn_name_base + '2b'`.
- A ReLU activation.

Third component of main path:

- A CONV2D layer named `conv_name_base + '2c'`. Use 0 as the seed for the random initialization.
- A BatchNorm layer named `bn_name_base + '2c'`. Note that there is no ReLU activation function in this component.

Final step:

- Add the shortcut to the main path, then apply a ReLU activation.
Note that the skip-connection addition must use Keras's Add() layer rather than the plain `+` operator, otherwise you will get an error.
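A minimal sketch of the difference (assuming the Keras version used in this course):

```python
# Merging tensors with the Add() layer keeps the Keras graph metadata intact.
from keras.layers import Input, Dense, Add
from keras.models import Model

X_input = Input(shape=(4,))
X = Dense(4)(X_input)

X = Add()([X, X_input])        # correct: Add() is a Keras layer
model = Model(inputs=X_input, outputs=X)

# X = X + X_input              # plain tensor addition: the result is not a
# model = Model(X_input, X)    # Keras tensor, so building the Model fails
```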
Here `f` is the size of the middle convolution kernel, `filters` is a list giving the depths of the three conv layers, `stage` is an integer identifying the major stage of the network, and `block` identifies the block within that stage, labeled a, b, c, d, and so on; `stage` and `block` are only used to generate layer names, which matter later.
```python
# GRADED FUNCTION: identity_block

from keras.layers import Add                     # needed for the skip connection
from keras.initializers import glorot_uniform    # needed for kernel_initializer

def identity_block(X, f, filters, stage, block):
    """
    Implementation of the identity block as defined in Figure 3

    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network

    Returns:
    X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
    """

    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value. You'll need this later to add back to the main path.
    X_shortcut = X

    # First component of main path
    X = Conv2D(filters=F1, kernel_size=(1, 1), strides=(1, 1), padding='valid',
               name=conv_name_base + '2a', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    ### START CODE HERE ###

    # Second component of main path (≈3 lines)
    X = Conv2D(filters=F2, kernel_size=(f, f), strides=(1, 1), padding='same',
               name=conv_name_base + '2b', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path (≈2 lines)
    X = Conv2D(filters=F3, kernel_size=(1, 1), strides=(1, 1), padding='valid',
               name=conv_name_base + '2c', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)

    ### END CODE HERE ###

    return X
```
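A hedged sanity check, not part of the graded notebook: the identity block must preserve the input shape, since its output is added back to the input.

```python
# F3 must equal the number of input channels (256 here) for the Add to work.
from keras.layers import Input
from keras.models import Model

X_in = Input(shape=(8, 8, 256))
X_out = identity_block(X_in, f=3, filters=[64, 64, 256], stage=1, block='a')
print(Model(X_in, X_out).output_shape)   # expected: (None, 8, 8, 256)
```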
2. Convolutional block
When the dimensions at the two ends do not match, a convolution layer is added on the shortcut path to convert the dimensions; this shortcut convolution is not followed by an activation function.
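A minimal sketch of what that shortcut convolution does: a 1x1 CONV with stride 2 and 'valid' padding halves the spatial size and resets the channel count, so the shortcut ends up with the same shape as the main path.

```python
from keras.layers import Input, Conv2D
from keras.models import Model

X_in = Input(shape=(8, 8, 64))
X_short = Conv2D(256, (1, 1), strides=(2, 2), padding='valid')(X_in)
print(Model(X_in, X_short).output_shape)   # (None, 4, 4, 256)
```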
First component of main path:

- A CONV2D layer named `conv_name_base + '2a'`.
- A BatchNorm layer named `bn_name_base + '2a'`.
- A ReLU activation.

Second component of main path:

- A CONV2D layer named `conv_name_base + '2b'`.
- A BatchNorm layer named `bn_name_base + '2b'`.
- A ReLU activation.

Third component of main path:

- A CONV2D layer named `conv_name_base + '2c'`.
- A BatchNorm layer named `bn_name_base + '2c'`. Note that there is no ReLU activation function in this component.

Shortcut path:

- A CONV2D layer named `conv_name_base + '1'`.
- A BatchNorm layer named `bn_name_base + '1'`.

Final step:

- Add the shortcut to the main path, then apply a ReLU activation.
Compared with the identity block, there is one new parameter, `s`, which specifies the stride.
```python
def convolutional_block(X, f, filters, stage, block, s=2):
    """
    Implementation of the convolutional block as defined in Figure 4

    Arguments:
    X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    f -- integer, specifying the shape of the middle CONV's window for the main path
    filters -- python list of integers, defining the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    s -- Integer, specifying the stride to be used

    Returns:
    X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
    """

    # defining name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value
    X_shortcut = X

    ##### MAIN PATH #####
    # First component of main path
    X = Conv2D(F1, (1, 1), strides=(s, s), name=conv_name_base + '2a',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)

    ### START CODE HERE ###

    # Second component of main path (≈3 lines)
    X = Conv2D(F2, (f, f), strides=(1, 1), name=conv_name_base + '2b', padding='same',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)

    # Third component of main path (≈2 lines)
    X = Conv2D(F3, (1, 1), strides=(1, 1), name=conv_name_base + '2c', padding='valid',
               kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)

    ##### SHORTCUT PATH #### (≈2 lines)
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), name=conv_name_base + '1', padding='valid',
                        kernel_initializer=glorot_uniform(seed=0))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    # Final step: Add shortcut value to main path, and pass it through a RELU activation (≈2 lines)
    X = Add()([X, X_shortcut])
    X = Activation('relu')(X)

    ### END CODE HERE ###

    return X
```
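A hedged sanity check, not part of the graded notebook: with `s=2` the block halves the spatial dimensions and expands the channels to `F3`.

```python
from keras.layers import Input
from keras.models import Model

X_in = Input(shape=(32, 32, 64))
X_out = convolutional_block(X_in, f=3, filters=[64, 64, 256], stage=1, block='a', s=2)
print(Model(X_in, X_out).output_shape)   # expected: (None, 16, 16, 256)
```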
Now build a 50-layer network, divided into 5 stages, with the following structure:
The details of this ResNet-50 model follow the architecture in the figure above; the output layer is a fully connected layer named `'fc' + str(classes)`.

Exercise: Implement the ResNet with 50 layers described in the figure above. We have implemented Stages 1 and 2. Please implement the rest. (The syntax for implementing Stages 3-5 should be quite similar to that of Stage 2.) Make sure you follow the naming convention in the text above.

You'll also need the AveragePooling2D() layer for the final pooling step.
```python
# GRADED FUNCTION: ResNet50

def ResNet50(input_shape=(64, 64, 3), classes=6):
    """
    Implementation of the popular ResNet50 with the following architecture:
    CONV2D -> BATCHNORM -> RELU -> MAXPOOL -> CONVBLOCK -> IDBLOCK*2 -> CONVBLOCK -> IDBLOCK*3
    -> CONVBLOCK -> IDBLOCK*5 -> CONVBLOCK -> IDBLOCK*2 -> AVGPOOL -> TOPLAYER

    Arguments:
    input_shape -- shape of the images of the dataset
    classes -- integer, number of classes

    Returns:
    model -- a Model() instance in Keras
    """

    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Zero-Padding
    X = ZeroPadding2D((3, 3))(X_input)

    # Stage 1
    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1', kernel_initializer=glorot_uniform(seed=0))(X)
    X = BatchNormalization(axis=3, name='bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1)
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block(X, 3, [64, 64, 256], stage=2, block='c')

    ### START CODE HERE ###

    # Stage 3 (≈4 lines)
    X = convolutional_block(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='b')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='c')
    X = identity_block(X, 3, [128, 128, 512], stage=3, block='d')

    # Stage 4 (≈6 lines)
    X = convolutional_block(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='b')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='c')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='d')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='e')
    X = identity_block(X, 3, [256, 256, 1024], stage=4, block='f')

    # Stage 5 (≈3 lines)
    X = convolutional_block(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='b')
    X = identity_block(X, 3, [512, 512, 2048], stage=5, block='c')

    # AVGPOOL (≈1 line). Use "X = AveragePooling2D(...)(X)"
    X = AveragePooling2D(pool_size=(2, 2), strides=(1, 1), padding='valid')(X)

    ### END CODE HERE ###

    # output layer
    X = Flatten()(X)
    X = Dense(classes, activation='softmax', name='fc' + str(classes),
              kernel_initializer=glorot_uniform(seed=0))(X)

    # Create model
    model = Model(inputs=X_input, outputs=X, name='ResNet50')

    return model
```
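The notebook then compiles and trains the model; a minimal usage sketch, assuming `X_train`/`Y_train` and `X_test`/`Y_test` come from the SIGNS dataset as in the assignment (images scaled to [0, 1], labels one-hot encoded):

```python
model = ResNet50(input_shape=(64, 64, 3), classes=6)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(X_train, Y_train, epochs=2, batch_size=32)

preds = model.evaluate(X_test, Y_test)
print("Loss = " + str(preds[0]))
print("Test Accuracy = " + str(preds[1]))
```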