The ConvNeXt class is a Python class that implements the ConvNeXt model, a convolutional neural network that uses grouped convolutions and layer scaling to improve performance and efficiency. The ConvNeXt class has the following attributes and methods:
ConvNeXt(model_type='base', drop_path_rate=0.0, layer_scale_init_value=1e-6, classes=1000, include_top=True, pooling=None)
- model_type: A string that indicates the type of the model, which can be one of 'tiny', 'small', 'base', 'large' or 'xlarge'. Different model types have different depths and projection dimensions.
- drop_path_rate: A float that indicates the probability of stochastic depth, i.e., the probability of each convolutional block being dropped. Stochastic depth can improve the generalization and robustness of the model.
- layer_scale_init_value: A float that indicates the initial value of layer scaling, i.e., the coefficient that each convolutional block's output is multiplied by. Layer scaling can stabilize the training and convergence of the model.
- classes: An integer that indicates the number of classes for the classification task. If include_top is True, the last layer of the model is a fully connected layer that outputs neurons equal to the number of classes.
- include_top: A boolean that indicates whether to include the top layer. If True, the last layer of the model is a fully connected layer that outputs neurons equal to the number of classes. If False, the last layer of the model is a global average pooling layer or a global max pooling layer, depending on the pooling parameter.
- pooling: A string or None that indicates the pooling method. If include_top is False, the last layer of the model is a pooling layer, depending on the pooling parameter. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- loss_object: A TensorFlow object that indicates the loss function. The default loss function is categorical cross-entropy.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list that stores all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method that builds the structure of the model. It accepts one parameter dtype, which indicates the data type, defaulting to 'float32'. This method creates all the convolutional blocks, downsampling blocks, normalization layers, pooling layers and fully connected layers required by the model and stores them in their respective attributes.
- fp(data, p): A method that performs forward propagation. It accepts two parameters data and p, which indicate the input data and process number respectively. This method passes the input data through all the layers of the model in turn and returns the output data.
- loss(output, labels, p): A method that calculates the loss value. It accepts three parameters output, labels and p, which indicate the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between the output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method that calculates the gradient value. It accepts three parameters data, labels and p, which indicate the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. This method returns the tape, output data and loss value.
- opt(gradient, p): A method that performs the optimization update. It accepts two parameters gradient and p, which indicate the gradient value and process number respectively. This method uses the optimizer to update all parameters of the model according to the gradient value and returns the updated parameters.
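A minimal usage sketch of this interface is shown below; the import path is an assumption and should be adjusted to wherever the class lives in your installation of Note.

```python
import tensorflow as tf
from Note.neuralnetwork.ConvNeXt import ConvNeXt  # assumed import path

# Build the tiny variant and run a forward pass on a random batch.
model = ConvNeXt(model_type='tiny', classes=1000)
model.build(dtype='float32')  # creates all blocks and parameters

images = tf.random.normal([4, 224, 224, 3])
logits = model.fp(images, 0)  # p=0: process number in single-process use
print(logits.shape)           # expected: (4, 1000)
```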
The ConvNeXtV2 class is a Python class that implements the ConvNeXtV2 model, which is a type of convolutional neural network that uses global response normalization and depthwise separable convolutions to improve the efficiency and accuracy of image classification tasks. The ConvNeXtV2 model has several variants that differ in their number of layers and feature dimensions. The ConvNeXtV2 class has the following attributes and methods:
ConvNeXtV2(
model_type='tiny', classes=1000,
classifier_activation="softmax",
drop_path_rate=0., head_init_scale=1., include_top=True,
pooling=None, device='GPU'
)
- model_type: A string, indicating the variant of the model. The default value is 'tiny', which corresponds to the smallest model.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- classifier_activation: A string or None, indicating the activation function for the classifier layer. If include_top is True, this argument specifies the activation function for the last layer. The default value is "softmax", which corresponds to the softmax activation function. Other possible values are "sigmoid", "tanh", etc.
- drop_path_rate: A float, indicating the stochastic depth rate. The default value is 0., which means no stochastic depth is applied. If greater than 0., some blocks are randomly dropped during training to reduce overfitting and improve generalization.
- head_init_scale: A float, indicating the initialization scaling value for classifier weights and biases. The default value is 1., which means no scaling is applied. If different from 1., the weights and biases are multiplied by this value after initialization.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- downsample_layers: A list of Layers objects, storing the stem and three intermediate downsampling convolution layers of the model.
- stages: A list of Layers objects, storing four feature resolution stages of the model, each consisting of multiple residual blocks.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A TensorFlow object, indicating the optimizer. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the convolutional layers, depthwise separable convolutional layers, inverted residual blocks, batch normalization layers, activation layers, dropout layers, drop connect layers, global pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p=None): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
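To illustrate how GradientTape and opt fit together, here is a hedged sketch of one training step; the import path and the use of tape.gradient over model.param to extract gradients are assumptions.

```python
import tensorflow as tf
from Note.neuralnetwork.ConvNeXtV2 import ConvNeXtV2  # assumed import path

model = ConvNeXtV2(model_type='tiny', classes=10)
model.build(dtype='float32')

data = tf.random.normal([8, 224, 224, 3])
labels = tf.one_hot(tf.random.uniform([8], 0, 10, tf.int32), 10)

p = 0  # process number
tape, output, loss = model.GradientTape(data, labels, p)
gradient = tape.gradient(loss, model.param)  # assumed: gradients w.r.t. model.param
model.opt(gradient, p)                       # update all parameters
```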
The DenseNet121 class is a Python class that implements the DenseNet-121 model, which is a type of convolutional neural network that uses dense blocks and transition layers to improve feature extraction and efficiency. The DenseNet121 class has the following attributes and methods:
DenseNet121(growth_rate=32, compression_factor=0.5, num_classes=1000, include_top=True, pooling=None, dtype='float32')
- growth_rate: An int, indicating the number of filters added by each dense layer. A larger growth rate increases the model size and complexity.
- compression_factor: A float, indicating the compression factor for the transition layers. A smaller compression factor reduces the number of filters and the model size.
- num_classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs num_classes neurons.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs num_classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. This method creates all the convolutional layers, dense layers, batch normalization layers, average pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
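The sketch below uses DenseNet121 as a headless feature extractor; DenseNet169 and DenseNet201 (documented next) expose the identical interface. The import path is an assumption.

```python
import tensorflow as tf
from Note.neuralnetwork.DenseNet121 import DenseNet121  # assumed import path

# include_top=False with pooling='avg' ends the model with global average
# pooling, yielding one feature vector per image instead of class logits.
backbone = DenseNet121(include_top=False, pooling='avg')
backbone.build()

images = tf.random.normal([2, 224, 224, 3])
features = backbone.fp(images, 0)
```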
The DenseNet169 class is a Python class that implements the DenseNet-169 model, which is a type of convolutional neural network that uses dense blocks and transition layers to improve feature extraction and efficiency. The DenseNet169 class has the following attributes and methods:
DenseNet169(growth_rate=32, compression_factor=0.5, num_classes=1000, include_top=True, pooling=None, dtype='float32')
- growth_rate: An int, indicating the number of filters added by each dense layer. A larger growth rate increases the model size and complexity.
- compression_factor: A float, indicating the compression factor for the transition layers. A smaller compression factor reduces the number of filters and the model size.
- num_classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs num_classes neurons.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs num_classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. This method creates all the convolutional layers, dense layers, batch normalization layers, average pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
The DenseNet201 class is a Python class that implements the DenseNet-201 model, which is a type of convolutional neural network that uses dense blocks and transition layers to improve feature extraction and efficiency. The DenseNet201 class has the following attributes and methods:
DenseNet201(growth_rate=32, compression_factor=0.5, num_classes=1000, include_top=True, pooling=None, dtype='float32')
- growth_rate: An int, indicating the number of filters added by each dense layer. A larger growth rate increases the model size and complexity.
- compression_factor: A float, indicating the compression factor for the transition layers. A smaller compression factor reduces the number of filters and the model size.
- num_classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs num_classes neurons.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs num_classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. This method creates all the convolutional layers, dense layers, batch normalization layers, average pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
The EfficientNet class is a Python class that implements the EfficientNet model, which is a type of convolutional neural network that uses a compound scaling method to balance the network depth, width and resolution. It also uses inverted residual blocks with depthwise separable convolutions to reduce the computational cost and parameter count. The model has several variants, such as B0 to B7, that differ in their scaling coefficients and input sizes. The EfficientNet class has the following attributes and methods:
EfficientNet(
input_shape,
model_name='B0',
drop_connect_rate=0.2,
depth_divisor=8,
activation="swish",
blocks_args="default",
include_top=True,
weights="imagenet",
input_tensor=None,
pooling=None,
classes=1000,
classifier_activation="softmax",
device='GPU',
dtype='float32'
)
- input_shape: A tuple of three integers, indicating the shape of the input data, excluding the batch dimension.
- model_name: A string, indicating the variant of the model. The default value is 'B0', which corresponds to the base model. Other possible values are 'B1' to 'B7', which correspond to different scaling coefficients and input sizes.
- drop_connect_rate: A float, indicating the dropout rate for the drop connect layer. The default value is 0.2, which means 20% of the connections are randomly dropped.
- depth_divisor: An integer, indicating the divisor for rounding the filters and repeats. The default value is 8, which means the filters and repeats are rounded to the nearest multiple of 8.
- activation: A string, indicating the activation function for the model. The default value is "swish", which corresponds to the swish activation function. Other possible values are "relu", "sigmoid", "tanh", etc.
- blocks_args: A list of dictionaries or "default", indicating the arguments for each block of the model. Each dictionary contains the keys "kernel_size", "repeats", "filters_in", "filters_out", "expand_ratio", "id_skip", "strides", "se_ratio" and "conv_type", which correspond to different parameters for each block. If blocks_args is "default", the default arguments from the original paper are used.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- weights: A string or None, indicating the initial weights for the model. If weights is "imagenet", the model is initialized with pre-trained weights on ImageNet dataset. If weights is None, the model is initialized with random weights. If weights is a path to a file, the model is initialized with weights from that file.
- input_tensor: A TensorFlow tensor or None, indicating the input tensor for the model. If input_tensor is None, a new input tensor is created with shape (None,) + input_shape.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- classifier_activation: A string or None, indicating the activation function for the classifier layer. If include_top is True, this argument specifies the activation function for the last layer. The default value is "softmax", which corresponds to the softmax activation function. Other possible values are "sigmoid", "tanh", etc.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. It accepts no arguments. This method creates all the convolutional layers, depthwise separable convolutional layers, inverted residual blocks, batch normalization layers, activation layers, dropout layers, drop connect layers, global pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
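As a concrete example, the sketch below instantiates the base variant with random weights; the import path is an assumption, and weights=None skips any pre-trained loading as described above.

```python
import tensorflow as tf
from Note.neuralnetwork.EfficientNet import EfficientNet  # assumed import path

# input_shape excludes the batch dimension; B0 is commonly paired with
# 224x224 inputs.
model = EfficientNet(input_shape=(224, 224, 3), model_name='B0',
                     weights=None, classes=1000)
model.build()
logits = model.fp(tf.random.normal([1, 224, 224, 3]), 0)
```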
The EfficientNetV2 class is a Python class that implements the EfficientNetV2 model, which is a type of convolutional neural network that uses a compound scaling method to balance the network depth, width and resolution. It also uses inverted residual blocks with depthwise separable convolutions to reduce the computational cost and parameter count. The EfficientNetV2 model introduces new features such as fused MBConv blocks and progressive learning of feature maps. The model has several variants, such as B0 to B3, S, M and L, that differ in their scaling coefficients and input sizes. The EfficientNetV2 class has the following attributes and methods:
EfficientNetV2(
input_shape,
model_name="efficientnetv2-b0",
dropout_rate=0.2,
drop_connect_rate=0.2,
depth_divisor=8,
min_depth=8,
bn_momentum=0.9,
activation="swish",
blocks_args="default",
include_top=True,
weights="imagenet",
pooling=None,
classes=1000,
classifier_activation="softmax",
include_preprocessing=True,
device='GPU',
dtype='float32'
)
- input_shape: A tuple of three integers, indicating the shape of the input data, excluding the batch dimension.
- model_name: A string, indicating the variant of the model. The default value is 'efficientnetv2-b0', which corresponds to the smallest model. Other possible values are 'efficientnetv2-b1' to 'efficientnetv2-b3', 'efficientnetv2-s', 'efficientnetv2-m' and 'efficientnetv2-l', which correspond to different scaling coefficients and input sizes.
- dropout_rate: A float, indicating the dropout rate for the dropout layer. The default value is 0.2, which means 20% of the neurons are randomly dropped.
- drop_connect_rate: A float, indicating the dropout rate for the drop connect layer. The default value is 0.2, which means 20% of the connections are randomly dropped.
- depth_divisor: An integer, indicating the divisor for rounding the filters and repeats. The default value is 8, which means the filters and repeats are rounded to the nearest multiple of 8.
- min_depth: An integer or None, indicating the minimum depth for rounding the filters. The default value is 8. If None, no minimum depth is applied.
- bn_momentum: A float, indicating the momentum for the batch normalization layer. The default value is 0.9, which means 90% of the previous moving average is retained.
- activation: A string, indicating the activation function for the model. The default value is "swish", which corresponds to the swish activation function. Other possible values are "relu", "sigmoid", "tanh", etc.
- blocks_args: A list of dictionaries or "default", indicating the arguments for each block of the model. Each dictionary contains the keys "kernel_size", "num_repeat", "input_filters", "output_filters", "expand_ratio", "id_skip", "strides", "se_ratio" and "conv_type", which correspond to different parameters for each block. If blocks_args is "default", the default arguments from the original paper are used.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- weights: A string or None, indicating the initial weights for the model. If weights is "imagenet", the model is initialized with pre-trained weights on ImageNet dataset. If weights is None, the model is initialized with random weights. If weights is a path to a file, the model is initialized with weights from that file.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- classifier_activation: A string or None, indicating the activation function for the classifier layer. If include_top is True, this argument specifies the activation function for the last layer. The default value is "softmax", which corresponds to the softmax activation function. Other possible values are "sigmoid", "tanh", etc.
- include_preprocessing: A bool or None, indicating whether to include preprocessing layers for rescaling and normalization. If True, the model's first layer is a rescaling layer that scales the input data to a certain range, and the second layer is a normalization layer that normalizes the input data to have zero mean and unit variance. If False, no preprocessing layers are added. If None, the default preprocessing layers are used according to the model variant.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. It accepts no arguments. This method creates all the convolutional layers, depthwise separable convolutional layers, inverted residual blocks, batch normalization layers, activation layers, dropout layers, drop connect layers, global pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
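The following sketch pairs fp with loss, which wraps the default categorical crossentropy; the import path is an assumption.

```python
import tensorflow as tf
from Note.neuralnetwork.EfficientNetV2 import EfficientNetV2  # assumed import path

model = EfficientNetV2(input_shape=(224, 224, 3),
                       model_name='efficientnetv2-b0',
                       weights=None, classes=10)
model.build()

data = tf.random.normal([4, 224, 224, 3])
labels = tf.one_hot(tf.random.uniform([4], 0, 10, tf.int32), 10)

p = 0
output = model.fp(data, p)
loss = model.loss(output, labels, p)  # scalar loss between output and labels
```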
The GPT2 class is a Python class that implements the GPT-2 model, which is a type of transformer-based language model that can generate coherent and diverse texts on various topics and tasks. The GPT-2 model has several variants, such as small, medium, large and XL, that differ in their number of parameters and layers. The GPT2 class has the following attributes and methods:
GPT2(one_hot=True)
- one_hot: A bool, indicating whether to use one-hot encoding for the labels. If True, the labels are converted to one-hot vectors before computing the loss. If False, the labels are used as indices for the logits.
- norm: A norm object, indicating the layer normalization layer for the model output.
- block: A dictionary, storing the block objects for each layer of the model. Each block object contains the attention layer, the feed-forward layer and the residual connections for that layer.
- opt: A TensorFlow object, indicating the optimizer. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- flag: An integer that indicates whether the model has been built or not.
- fp(X, past=None): A method, used to perform forward propagation. It accepts two arguments X and past, indicating the input data and previous hidden states respectively. This method passes the input data through all the layers of the model and returns a dictionary with the keys 'present' and 'logits'. The 'present' value is a tensor that contains the current hidden states of the model, which can be used as past for the next iteration. The 'logits' value is a tensor that contains the output logits of the model for each token in the input data.
- loss(output, labels): A method, used to calculate the loss value. It accepts two arguments output and labels, indicating the output data and true labels respectively. This method uses the categorical crossentropy loss function to calculate the difference between output logits and true labels and returns the loss value. If one_hot is True, this method converts the labels to one-hot vectors before computing the loss.
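A hedged sketch of autoregressive decoding with the past/present cache is shown below; the token ids are illustrative, the import path is an assumption, and it is assumed the model's parameters are created on the first call to fp.

```python
import tensorflow as tf
from Note.neuralnetwork.GPT2 import GPT2  # assumed import path

model = GPT2(one_hot=True)

prompt = tf.constant([[464, 3290]])  # illustrative token ids, shape [1, 2]
out = model.fp(prompt, past=None)    # dict with 'logits' and 'present'
next_id = tf.argmax(out['logits'][:, -1], axis=-1)

# Feed only the new token and reuse the cached hidden states.
out = model.fp(next_id[:, None], past=out['present'])
```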
The MobileNet class is a Python class that implements the MobileNet model, which is a type of convolutional neural network that uses depthwise separable convolutions and ReLU6 activation to reduce the computational cost and model size. The MobileNet class has the following attributes and methods:
MobileNet(alpha=1.0, depth_multiplier=1, dropout=1e-3, include_top=True, pooling=None, classes=1000)
- alpha: A float, indicating the width multiplier that controls the number of filters in each layer. A smaller alpha reduces the number of filters and the model size.
- depth_multiplier: A float, indicating the depth multiplier that controls the number of depthwise convolution output channels. A smaller depth_multiplier reduces the number of channels and the model size.
- dropout: A float, indicating the dropout rate that is applied to the last layer before the classification layer. Dropout can improve the model's generalization and robustness.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the convolution blocks, depthwise convolution blocks, normalization layers, pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
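For example, the width multiplier can be used to trade accuracy for size, as in the sketch below; the import path is an assumption.

```python
import tensorflow as tf
from Note.neuralnetwork.MobileNet import MobileNet  # assumed import path

# alpha=0.5 halves the number of filters in every layer.
small = MobileNet(alpha=0.5, depth_multiplier=1, classes=1000)
small.build(dtype='float32')
logits = small.fp(tf.random.normal([1, 224, 224, 3]), 0)
```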
The MobileNetV2 class is a Python class that implements the MobileNetV2 model, which is a type of convolutional neural network that uses inverted residual blocks and linear bottlenecks to improve performance and efficiency. The MobileNetV2 class has the following attributes and methods:
MobileNetV2(alpha=1.0, classes=1000, include_top=True, pooling=None)
- alpha: A float, indicating the width multiplier that controls the number of filters in each layer. A smaller alpha reduces the number of filters and the model size.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the convolution blocks, depthwise convolution blocks, normalization layers, pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
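The sketch below wires the methods into a small training loop; the import path and the tape.gradient call over model.param are assumptions.

```python
import tensorflow as tf
from Note.neuralnetwork.MobileNetV2 import MobileNetV2  # assumed import path

model = MobileNetV2(alpha=1.0, classes=10)
model.build(dtype='float32')

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([32, 224, 224, 3]),
     tf.one_hot(tf.random.uniform([32], 0, 10, tf.int32), 10))).batch(8)

p = 0
for data, labels in dataset:
    tape, output, loss = model.GradientTape(data, labels, p)
    model.opt(tape.gradient(loss, model.param), p)  # one optimizer step
```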
The MobileNetV3 class is a Python class that implements the MobileNetV3 model, which is a type of convolutional neural network that uses inverted residual blocks and attention mechanisms to achieve high performance on image classification and object detection tasks. The MobileNetV3 class has the following attributes and methods:
MobileNetV3(alpha=1.0, model_type="large", minimalistic=False, include_top=True, classes=1000, pooling=None, dropout_rate=0.2, classifier_activation="softmax", include_preprocessing=True, device='GPU')
- alpha: A float, indicating the width multiplier for the network. A larger alpha increases the number of filters in each layer and the model size and complexity.
- model_type: A string, indicating the type of the MobileNetV3 model. It can be either "large" or "small", corresponding to different configurations of inverted residual blocks and attention mechanisms.
- minimalistic: A bool, indicating whether to use a minimalistic version of the MobileNetV3 model. If True, the model uses smaller kernel size, ReLU activation and no squeeze-and-excitation modules. If False, the model uses larger kernel size, hard swish activation and squeeze-and-excitation modules.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- dropout_rate: A float, indicating the dropout rate for the top layer. Dropout is a regularization technique that randomly drops out some units during training to prevent overfitting.
- classifier_activation: A string or TensorFlow object, indicating the activation function for the top layer. The default activation function is softmax.
- include_preprocessing: A bool, indicating whether to include preprocessing for the input data. If True, the input data will be rescaled by 1/127.5 and offset by -1.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. This method creates all the convolutional layers, inverted residual blocks, attention mechanisms and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. This method returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
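A sketch of the small, minimalistic variant follows; with include_preprocessing=True the model rescales inputs by 1/127.5 and offsets by -1 itself, so raw [0, 255] pixels can be fed directly. The import path is an assumption.

```python
import tensorflow as tf
from Note.neuralnetwork.MobileNetV3 import MobileNetV3  # assumed import path

model = MobileNetV3(model_type='small', minimalistic=True,
                    include_preprocessing=True, classes=1000)
model.build()
pixels = tf.random.uniform([1, 224, 224, 3], 0, 255)  # raw pixel range
probs = model.fp(pixels, 0)  # softmax probabilities by default
```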
The ResNetRS class is a Python class that implements the ResNet-RS model, which is a type of convolutional neural network that uses residual blocks and stochastic depth to achieve state-of-the-art performance on image classification tasks. The ResNetRS class has the following attributes and methods:
ResNetRS(
bn_momentum=0.0,
bn_epsilon=1e-5,
activation: str = "relu",
se_ratio=0.25,
dropout_rate=0.25,
drop_connect_rate=0.2,
include_top=True,
block_args: List[Dict[str, int]] = None,
model_name="resnet-rs-50",
pooling=None,
classes=1000,
include_preprocessing=True,
)
- bn_momentum: A float, indicating the momentum for the batch normalization layers.
- bn_epsilon: A float, indicating the epsilon for the batch normalization layers.
- activation: A string, indicating the activation function for the convolutional layers. The default activation function is ReLU.
- se_ratio: A float, indicating the ratio for the Squeeze and Excitation blocks. The default ratio is 0.25.
- dropout_rate: A float, indicating the dropout rate for the last layer before the classification layer. Dropout can improve the model's generalization and robustness.
- drop_connect_rate: A float, indicating the initial rate for the stochastic depth. Stochastic depth can improve the model's generalization and robustness by randomly dropping out blocks during training.
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- block_args: A list of dictionaries, indicating the arguments for each block group. Each dictionary contains the input filters and the number of repeats for each block group. The default block arguments are based on the model depth.
- model_name: A string, indicating the name of the ResNet-RS variant. The name determines the model depth and block arguments.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- include_preprocessing: A bool, indicating whether to include preprocessing for the input data. If True, the input data will be rescaled and normalized according to ImageNet statistics.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the stem blocks, block groups and head blocks that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. It returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform the optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters.
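Since model_name fixes the depth and default block arguments, block_args can usually stay None, as in the sketch below; the import path is an assumption.

```python
import tensorflow as tf
from Note.neuralnetwork.ResNetRS import ResNetRS  # assumed import path

model = ResNetRS(model_name='resnet-rs-50', classes=1000,
                 include_preprocessing=True)
model.build(dtype='float32')
logits = model.fp(tf.random.normal([1, 224, 224, 3]), 0)
```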
The SwiftFormer class is a Python class that implements the SwiftFormer model, which is a type of convolutional neural network that uses convolution and self-attention to achieve high performance on image classification and segmentation tasks. The SwiftFormer class has the following attributes and methods:
SwiftFormer(model_type,
mlp_ratios=4, downsamples=[True, True, True, True],
act_layer=activation_dict['gelu'],
num_classes=1000,
down_patch_size=3, down_stride=2, down_pad=1,
drop_rate=0., drop_path_rate=0.,
use_layer_scale=True, layer_scale_init_value=1e-5,
fork_feat=False,
init_cfg=None,
pretrained=None,
vit_num=1,
distillation=True,
include_top=True,
pooling=None,
device='GPU',
dtype='float32')
- model_type: A string, indicating the type of the SwiftFormer model. It can be one of 'swiftformer_tiny', 'swiftformer_small', 'swiftformer_base', 'swiftformer_large', or 'swiftformer_xlarge'.
- mlp_ratios: An int or a list of ints, indicating the ratio of hidden dimension to input dimension for the MLP layers. A larger ratio increases the model size and complexity.
- downsamples: A list of bools, indicating whether to perform downsampling between each stage. Downsampling reduces the spatial resolution and increases the number of channels.
- act_layer: A TensorFlow object, indicating the activation function for the MLP layers. The default activation function is GELU.
- num_classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs num_classes neurons.
- down_patch_size: An int or a tuple of two ints, indicating the patch size for the downsampling layers. A larger patch size reduces the spatial resolution more.
- down_stride: An int or a tuple of two ints, indicating the stride for the downsampling layers. A larger stride reduces the spatial resolution more.
- down_pad: An int or a tuple of two ints, indicating the padding for the downsampling layers. A larger padding preserves more information at the edges.
- drop_rate: A float, indicating the dropout rate for the MLP layers. Dropout is a regularization technique that randomly drops out some units during training to prevent overfitting.
- drop_path_rate: A float, indicating the drop path rate for the residual connections. Drop path is a regularization technique that randomly drops out some paths during training to prevent overfitting.
- use_layer_scale: A bool, indicating whether to use layer scale for the residual connections. Layer scale is a technique that scales up the residual connections by a learnable factor to improve gradient flow and stability.
- layer_scale_init_value: A float, indicating the initial value for the layer scale factors. A smaller value reduces the impact of layer scale at the beginning of training.
- fork_feat: A bool, indicating whether to fork features from different stages for dense prediction. If True, the model's output is a list of tensors with different resolutions and channels. If False, the model's output is a single tensor with the final resolution and channels.
- init_cfg: A dict or None, indicating the initialization configuration for the model parameters. If None, use default initialization methods.
- pretrained: A string or None, indicating the path to a pretrained checkpoint file for loading weights. If None, use random initialization.
- vit_num: An int, indicating the number of self-attention layers in each stage. Self-attention is a technique that allows each unit to attend to all other units in its receptive field and learn long-range dependencies.
- distillation: A bool, indicating whether to use distillation for knowledge transfer from teacher model to student model. Distillation is a technique that improves the performance of a smaller model (student) by learning from a larger model (teacher).
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs num_classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other possible value is 'CPU'.
- dtype: A string or TensorFlow dtype object, indicating the data type for computation. The default value is 'float32', which corresponds to 32-bit floating point numbers.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is AdamW.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(): A method, used to build the model's structure. This method creates all the convolutional layers, self-attention layers, MLP layers, batch normalization layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. This method returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters. A sketch of how these methods compose into one training step follows this list.
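To make the interplay of these methods concrete, here is a minimal single-step training sketch. It is an illustration only: net stands for an instance of the class documented above (the import is omitted because the module path depends on your installation), and the data, labels and process number are placeholders.

```python
# `net` is assumed to be an instance of the class documented above;
# the import is omitted because the module path depends on the installation.
def train_step(net, data, labels, p=0):
    # Forward pass and loss recorded under a persistent tape,
    # as returned by the GradientTape() method.
    tape, output, loss = net.GradientTape(data, labels, p)
    # Differentiate the loss against every parameter stored in net.param.
    gradient = tape.gradient(loss, net.param)
    # Apply the parallel optimizer's update for process p.
    net.opt(gradient, p)
    return output, loss
```

Calling train_step(net, batch, one_hot_labels) once per batch would reproduce the forward-backward-update cycle that Note's kernel normally drives in parallel across processes.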
The VGG16 class is a Python class that implements the VGG-16 model, a convolutional neural network with 16 weight layers (13 convolutional and 3 fully connected, interleaved with max pooling) that achieves high performance on image classification tasks. The VGG16 class has the following attributes and methods:
VGG16(include_top=True,pooling=None,classes=1000)
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other accepted value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the convolutional layers, max pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. This method returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters. A feature-extraction sketch using this class follows this list.
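As a concrete illustration of the include_top and pooling options, here is a minimal sketch that uses VGG16 as a feature extractor. The import path is a placeholder for wherever the class lives in your installation, and the 224x224 input size and batch size are assumed values:

```python
import tensorflow as tf
# Placeholder import: adjust the path to match your installation of Note.
from Note.neuralnetwork.VGG16 import VGG16

net = VGG16(include_top=False, pooling='avg')  # drop the classifier head, pool globally
net.build(dtype='float32')                     # create the layers and populate net.param

images = tf.random.normal([4, 224, 224, 3])    # dummy batch; 224x224 RGB inputs assumed
features = net.fp(images, 0)                   # forward pass for process 0
# With global average pooling, each image yields one vector over the
# final convolutional channels (512 for VGG-16).
print(features.shape)
```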
The VGG19 class is a Python class that implements the VGG-19 model, a convolutional neural network with 19 weight layers (16 convolutional and 3 fully connected, interleaved with max pooling) that achieves high performance on image classification tasks. The VGG19 class has the following attributes and methods:
VGG19(include_top=True,pooling=None,classes=1000)
- include_top: A bool, indicating whether to include the top layer for classification. If True, the model's last layer is a fully connected layer that outputs classes neurons. If False, the model's last layer is a global average pooling layer or a global max pooling layer, depending on the pooling argument.
- pooling: A string or None, indicating the pooling method. If include_top is False, the model's last layer is a pooling layer, depending on the pooling argument. If pooling is 'avg', global average pooling is used; if pooling is 'max', global max pooling is used; if pooling is None, no pooling is used.
- classes: An int, indicating the number of classes for the classification task. If include_top is True, the model's last layer is a fully connected layer that outputs classes neurons.
- device: A string, indicating the device to use for computation. The default value is 'GPU', which means the GPU is used if available; the other accepted value is 'CPU'.
- loss_object: A TensorFlow object, indicating the loss function. The default loss function is categorical crossentropy loss.
- optimizer: A parallel optimizer for Note. The default optimizer is Adam.
- param: A list, storing all the parameters (weights and biases) of the model.
- km: An integer that indicates the kernel mode.
- build(dtype='float32'): A method, used to build the model's structure. It accepts one argument dtype, indicating the data type, defaulting to 'float32'. This method creates all the convolutional layers, max pooling layers and fully connected layers that are needed for the model, and stores them in corresponding attributes.
- fp(data, p): A method, used to perform forward propagation. It accepts two arguments data and p, indicating the input data and process number respectively. This method passes the input data through all the layers of the model and returns the output data.
- loss(output, labels, p): A method, used to calculate the loss value. It accepts three arguments output, labels and p, indicating the output data, true labels and process number respectively. This method uses the loss function to calculate the difference between output data and true labels and returns the loss value.
- GradientTape(data, labels, p): A method, used to calculate the gradient value. It accepts three arguments data, labels and p, indicating the input data, true labels and process number respectively. This method uses a persistent gradient tape to record the operations and compute the gradient of the loss with respect to the parameters. This method returns the tape, output data and loss value.
- opt(gradient, p): A method, used to perform optimization update. It accepts two arguments gradient and p, indicating the gradient value and process number respectively. This method uses the optimizer to update all the parameters of the model according to the gradient value and returns the updated parameters. A minimal training-loop sketch using this class follows this list.
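Finally, here is a minimal end-to-end sketch of training VGG19 on dummy data, tying build, GradientTape and opt together. The import path, the 10-class head, the input size and the step count are all illustrative assumptions:

```python
import tensorflow as tf
# Placeholder import: adjust the path to match your installation of Note.
from Note.neuralnetwork.VGG19 import VGG19

net = VGG19(include_top=True, classes=10)  # small 10-class head for the example
net.build()                                # default dtype 'float32'

# Dummy dataset: random images with one-hot labels, illustrative only.
data = tf.random.normal([32, 224, 224, 3])
labels = tf.one_hot(tf.random.uniform([32], maxval=10, dtype=tf.int32), depth=10)

p = 0  # a single process in this sketch
for step in range(5):
    tape, output, loss = net.GradientTape(data, labels, p)  # forward pass + loss under the tape
    gradient = tape.gradient(loss, net.param)               # gradients w.r.t. all parameters
    net.opt(gradient, p)                                    # optimizer update for process p
    print(f'step {step}: loss = {float(loss):.4f}')
```

In a real run, Note's kernel would invoke these same methods from multiple processes, with p distinguishing the per-process state.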