王佳欣 edited this page Jun 27, 2018 · 2 revisions

network structure

  • backbone network: resnet, vgg, mobilenet
  • transform network: psp
  • upsample network: duc, bilinear
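
As a rough illustration of the two upsample options listed above, the sketch below assumes that duc means dense upsampling convolution (a convolution producing class_number * r^2 channels, followed by a pixel shuffle) and that bilinear means a 1x1 classifier followed by bilinear interpolation. The class names `DucHead` and `BilinearHead` and the parameters `class_number` and `upsample_ratio` are hypothetical and not taken from the repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DucHead(nn.Module):
    """Hypothetical DUC head: conv to class_number * r^2 channels, then pixel shuffle."""

    def __init__(self, in_channels, class_number, upsample_ratio):
        super().__init__()
        self.conv = nn.Conv2d(in_channels,
                              class_number * upsample_ratio ** 2,
                              kernel_size=3,
                              padding=1)
        # PixelShuffle rearranges (C*r^2, H, W) into (C, H*r, W*r)
        self.shuffle = nn.PixelShuffle(upsample_ratio)

    def forward(self, x):
        return self.shuffle(self.conv(x))


class BilinearHead(nn.Module):
    """Hypothetical bilinear head: 1x1 classifier, then bilinear interpolation."""

    def __init__(self, in_channels, class_number, upsample_ratio):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, class_number, kernel_size=1)
        self.upsample_ratio = upsample_ratio

    def forward(self, x):
        x = self.conv(x)
        return F.interpolate(x, scale_factor=self.upsample_ratio,
                             mode='bilinear', align_corners=False)


# Example: a 64-channel 8x8 feature map upsampled 8x into 19 class maps.
x = torch.randn(1, 64, 8, 8)
duc_out = DucHead(64, class_number=19, upsample_ratio=8)(x)
bil_out = BilinearHead(64, class_number=19, upsample_ratio=8)(x)
print(duc_out.shape, bil_out.shape)
```

Both heads return a tensor with `class_number` channels at the upsampled resolution; the DUC variant learns the upsampling, while the bilinear variant uses fixed interpolation.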

transform network

  • input image H x W --(backbone network)--> feature map (H/2^k) x (W/2^k), with k=3 by default
  • feature map x --(psp pool paths/psp transform network)--> feature map list --(concat)--> psp feature map
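output_slices = [x]
# psp_x is the tensor fed to every pool path; it is defined outside this excerpt
for module in self.pool_paths:
    path_out = module(psp_x)
    output_slices.append(path_out)
# concatenate the input and all path outputs along the channel dimension
return torch.cat(output_slices, dim=1)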
  • psp feature map --(upsample)--> classification result
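
The shape bookkeeping in the steps above can be traced with plain arithmetic. This is an illustrative sketch only: the function name `trace_shapes`, the example `pool_sizes=(1, 2, 3, 6)`, `scale=1`, the 480 x 480 input, and the channel counts are assumptions, not values taken from the repository.

```python
def trace_shapes(h, w, in_channels, path_out_c_list, pool_sizes, scale=1, k=3):
    """Trace (channels, height, width) through backbone -> psp paths -> concat."""
    # The backbone downsamples by 2^k (k=3 by default, i.e. a stride of 8).
    fh, fw = h // 2 ** k, w // 2 ** k
    shapes = {"backbone": (in_channels, fh, fw)}

    # Each path average-pools with kernel = stride = pool_size * scale and no
    # padding, so the pooled size is size // kernel (when size >= kernel).
    pooled = []
    for pool_size, out_c in zip(pool_sizes, path_out_c_list):
        kernel = pool_size * scale
        pooled.append((out_c, fh // kernel, fw // kernel))
    shapes["pooled"] = pooled

    # Every path is resized back to (fh, fw) and concatenated with the input
    # along the channel dimension (dim=1).
    shapes["psp"] = (in_channels + sum(path_out_c_list), fh, fw)
    return shapes


shapes = trace_shapes(480, 480, in_channels=512,
                      path_out_c_list=[128, 128, 128, 128],
                      pool_sizes=(1, 2, 3, 6))
print(shapes["backbone"])  # (512, 60, 60)
print(shapes["pooled"])    # [(128, 60, 60), (128, 30, 30), (128, 20, 20), (128, 10, 10)]
print(shapes["psp"])       # (1024, 60, 60)
```

With these assumed values the psp feature map has twice the channels of the backbone feature map and the same spatial size.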

detail

  • the basic transform network needs a minimum feature map size of max(pool_sizes) * scale, so a bilinear upsample is sometimes applied before the psp transform network
  • to keep output channels = 2 * input channels, when input channels % len(pool_sizes) != 0 the last pool path gets more output channels than the other paths
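# TN is assumed to be torch.nn (import torch.nn as TN)
pool_paths = []
for pool_size, out_c in zip(pool_sizes, path_out_c_list):
    pool_path = TN.Sequential(
        # average-pool with non-overlapping windows of size pool_size*scale
        TN.AvgPool2d(kernel_size=pool_size * scale,
                     stride=pool_size * scale,
                     padding=0),
        # 1x1 convolution sets this path's output channels
        TN.Conv2d(in_channels=in_channels,
                  out_channels=out_c,
                  kernel_size=1,
                  stride=1,
                  padding=0),
        TN.BatchNorm2d(num_features=out_c),
        # resize back to out_size so all paths can be concatenated
        TN.Upsample(size=out_size, mode='bilinear', align_corners=False))
    pool_paths.append(pool_path)

self.pool_paths = TN.ModuleList(pool_paths)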
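
How `path_out_c_list` and the minimum feature map size are computed is not shown on this page. A minimal sketch, assuming the remainder of the channel division goes to the last path and the minimum size is max(pool_sizes) * scale as described in the detail list; the helper names `split_out_channels`, `min_feature_size`, and `needs_upsample` are hypothetical:

```python
def split_out_channels(in_channels, pool_sizes):
    """Split in_channels across the pool paths; the last path takes the remainder."""
    n = len(pool_sizes)
    base = in_channels // n
    out_c_list = [base] * n
    out_c_list[-1] += in_channels - base * n
    return out_c_list


def min_feature_size(pool_sizes, scale):
    """Smallest feature map side that the largest pooling window still fits into."""
    return max(pool_sizes) * scale


def needs_upsample(feature_h, feature_w, pool_sizes, scale):
    """True if the feature map is too small and should be upsampled first."""
    return min(feature_h, feature_w) < min_feature_size(pool_sizes, scale)


even = split_out_channels(512, (1, 2, 3, 6))
uneven = split_out_channels(320, (1, 2, 3))
print(even)    # [128, 128, 128, 128]
print(uneven)  # [106, 106, 108]
print(min_feature_size((1, 2, 3, 6), scale=5))    # 30
print(needs_upsample(28, 28, (1, 2, 3, 6), 5))    # True
```

In both cases the path outputs sum to `in_channels`, so concatenating them with the input yields 2 * `in_channels` channels.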