
The Most Detailed libfacedetection Face Detection Walkthrough Ever: Network Details (Part 2)

What follows is my personal walkthrough of libfacedetection (face detection in PyTorch). If you find any errors, please point them out in the comments and I will correct them as soon as possible. The detector reportedly reaches 10,000 FPS; let's take a look.

This part analyzes how the YuNet network structure is put together. As the figure below shows, the network has four components: the backbone network, the head network, the anchor computation, and the losses (the 5-point landmark loss and the bbox loss).

[Figure: the four components of the YuNet network]

Backbone Network

From the architecture point of view, the base model is a stack of conv + batchnorm + relu units, with downsampling in between to extract feature maps.
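To make that unit concrete, here is a minimal sketch of one conv + batchnorm + relu block (ConvBNReLU is a hypothetical name used only for illustration; the repo's actual building blocks are Conv_head and Conv4layerBlock, which stack several such units and are not reproduced here):

import torch.nn as nn

# Illustrative conv-bn-relu unit; the repo's Conv_head / Conv4layerBlock
# stack several of these with their own channel and stride choices.
class ConvBNReLU(nn.Sequential):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__(
            nn.Conv2d(in_channels, out_channels, kernel_size=3,
                      stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

The backbone assembles blocks of this kind and max-pools between stages: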

import torch.nn as nn
import torch.nn.functional as F

class Yunet(nn.Module):
    def __init__(self, cfg_layers, activation_type='relu'):
        super().__init__()

        self.model0 = Conv_head(*cfg_layers[0], activation_type=activation_type)
        # Build model1..modelN from the config instead of hard-coding each block:
        for i in range(1, len(cfg_layers)):
            self.add_module(f'model{i}',
                            Conv4layerBlock(*cfg_layers[i], activation_type=activation_type))
        # Equivalent hard-coded form:
        # self.model1 = Conv4layerBlock(16, 64, activation_type=activation_type)
        # self.model2 = Conv4layerBlock(64, 64, activation_type=activation_type)
        # self.model3 = Conv4layerBlock(64, 64, activation_type=activation_type)
        # self.model4 = Conv4layerBlock(64, 64, activation_type=activation_type)
        # self.model5 = Conv4layerBlock(64, 64, activation_type=activation_type)
        # self.model6 = Conv4layerBlock(64, 64, activation_type=activation_type)
        self.init_weights()

    def init_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                if m.bias is not None:
                    nn.init.xavier_normal_(m.weight.data)
                    m.bias.data.fill_(0.02)
                else:
                    m.weight.data.normal_(0, 0.01)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def forward(self, x):
        x = self.model0(x)
        x = F.max_pool2d(x, 2)
        x = self.model1(x)
        x = self.model2(x)
        x = F.max_pool2d(x, 2)
        p1 = self.model3(x)
        x = F.max_pool2d(p1, 2)
        p2 = self.model4(x)
        x = F.max_pool2d(p2, 2)
        p3 = self.model5(x)
        x = F.max_pool2d(p3, 2)
        p4 = self.model6(x)

        # Four multi-scale feature maps for the detection head
        return [p1, p2, p3, p4]

After this series of convolutions, the deepest feature map (p4) is 10×10 with 64 channels.
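A quick sketch of where that 10×10 comes from: assume a 640×640 input and a stride-2 convolution inside Conv_head (model0). Both assumptions come from the upstream repo's configuration rather than from the code shown above, so treat the numbers as illustrative.

# Trace the spatial size through forward(), assuming a 640x640 input and a
# stride-2 conv in Conv_head (both assumptions, not visible in the code above).
size = 640
size //= 2                # Conv_head (model0), stride-2 conv -> 320
size //= 2                # max_pool2d after model0           -> 160
size //= 2                # max_pool2d after model2           -> 80
p1 = size                 # model3 output: 80x80  (stride 8)
p2 = p1 // 2              # model4 output: 40x40  (stride 16)
p3 = p2 // 2              # model5 output: 20x20  (stride 32)
p4 = p3 // 2              # model6 output: 10x10  (stride 64)
print([p1, p2, p3, p4])   # [80, 40, 20, 10]

The resulting strides (8, 16, 32, 64) line up with the steps used by the anchor generator later in this post.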

Head Network

For the head, the original author uses Yuhead, but here I use Yuhead_PAN. PAN makes the features of small targets more distinct. The difference between PAN and FPN is that PAN adds an extra bottom-up, downsampling path after the FPN pass, propagating information from the low levels back to the high levels, which makes small targets clearer.

The PAN structure:

Here we can see the four levels of semantic information.
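The repo's Yuhead_PAN is not reproduced here, so below is a minimal sketch of the idea just described: an FPN top-down pass followed by PAN's extra bottom-up pass. TinyPAN is a hypothetical name, and the layer types and channel counts are assumptions, not the real head.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyPAN(nn.Module):
    def __init__(self, channels=64, num_levels=4):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=1) for _ in range(num_levels))
        self.downsample = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)
            for _ in range(num_levels - 1))

    def forward(self, feats):  # feats = [p1, p2, p3, p4], high to low resolution
        # FPN: top-down pathway, upsample deeper maps and add them in
        fpn = [self.lateral[i](f) for i, f in enumerate(feats)]
        for i in range(len(fpn) - 2, -1, -1):
            fpn[i] = fpn[i] + F.interpolate(fpn[i + 1], size=fpn[i].shape[-2:])
        # PAN: extra bottom-up pathway, downsample shallow maps back up
        pan = [fpn[0]]
        for i in range(1, len(fpn)):
            pan.append(fpn[i] + self.downsample[i - 1](pan[i - 1]))
        return pan

The full model's forward pass then runs the backbone, feeds the four feature maps through the head, and reshapes the outputs: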

    def forward(self, x):
        self.img_size = x.shape[-2:]
        feats = self.backbone(x)
        outs = self.head(feats)
        # Move channels last, flatten each level, then concatenate all levels
        head_data = [o.permute(0, 2, 3, 1).contiguous() for o in outs]
        head_data = torch.cat([o.view(o.size(0), -1) for o in head_data], dim=1)
        head_data = head_data.view(head_data.size(0), -1, self.out_factor)

        # Per-anchor layout: bbox(4) + landmarks(num_landmarks*2) + classes + iou(1)
        loc_data = head_data[:, :, 0 : 4 + self.num_landmarks * 2]
        conf_data = head_data[:, :, -self.num_classes - 1 : -1]
        iou_data = head_data[:, :, -1:]
        output = (loc_data, conf_data, iou_data)
        return output

The head produces three tensors: loc_data (20, 23500, 14), conf_data (20, 23500, 2), and iou_data (20, 23500, 1).

conf shape: torch.Size([batch_size, num_priors, num_classes])
loc shape: torch.Size([batch_size, num_priors, 14])

The 14 in loc: face bbox (4 values) + face landmarks (10 values) = 14.
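As a sanity check, the slicing in forward() can be replayed on a dummy tensor with exactly these shapes (num_landmarks = 5 and num_classes = 2 follow from the shapes quoted above):

import torch

num_landmarks, num_classes = 5, 2
out_factor = 4 + num_landmarks * 2 + num_classes + 1   # 14 + 2 + 1 = 17
head_data = torch.randn(20, 23500, out_factor)         # (batch, num_priors, 17)

loc_data = head_data[:, :, 0 : 4 + num_landmarks * 2]  # bbox(4) + landmarks(10)
conf_data = head_data[:, :, -num_classes - 1 : -1]     # class scores
iou_data = head_data[:, :, -1:]                        # predicted IoU

print(loc_data.shape, conf_data.shape, iou_data.shape)
# torch.Size([20, 23500, 14]) torch.Size([20, 23500, 2]) torch.Size([20, 23500, 1])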

Anchor Box Computation

The main code is as follows:

from itertools import product
import torch

self.anchor_generator = PriorBox(
    min_sizes=cfg['model']['anchor']['min_sizes'],
    steps=cfg['model']['anchor']['steps'],
    clip=cfg['model']['anchor']['clip'],
    ratio=cfg['model']['anchor']['ratio']
)

class PriorBox(object):
    def __init__(self, min_sizes, steps, clip, ratio):
        super(PriorBox, self).__init__()
        self.min_sizes = min_sizes
        self.steps = steps
        self.clip = clip
        self.ratio = ratio

    def __call__(self, image_size):
        # Feature map sizes at strides 4, 8, 16, 32 and 64 of the input image
        feature_map_2th = [int(int((image_size[0] + 1) / 2) / 2),
                           int(int((image_size[1] + 1) / 2) / 2)]
        feature_map_3th = [int(feature_map_2th[0] / 2),
                           int(feature_map_2th[1] / 2)]
        feature_map_4th = [int(feature_map_3th[0] / 2),
                           int(feature_map_3th[1] / 2)]
        feature_map_5th = [int(feature_map_4th[0] / 2),
                           int(feature_map_4th[1] / 2)]
        feature_map_6th = [int(feature_map_5th[0] / 2),
                           int(feature_map_5th[1] / 2)]

        # Anchors are only placed on the stride-8 to stride-64 maps
        feature_maps = [feature_map_3th, feature_map_4th,
                        feature_map_5th, feature_map_6th]
        anchors = []
        for k, f in enumerate(feature_maps):
            min_sizes = self.min_sizes[k]
            for i, j in product(range(f[0]), range(f[1])):
                for min_size in min_sizes:
                    cx = (j + 0.5) * self.steps[k] / image_size[1]
                    cy = (i + 0.5) * self.steps[k] / image_size[0]
                    for r in self.ratio:
                        s_ky = min_size / image_size[0]
                        s_kx = r * min_size / image_size[1]
                        anchors += [cx, cy, s_kx, s_ky]
        # back to torch land
        output = torch.Tensor(anchors).view(-1, 4)
        if self.clip:
            output.clamp_(max=1, min=0)
        return output

In the config file, the anchor parameter min_sizes runs from small to large: the earlier (higher-resolution) feature maps are used to detect small targets, and the later feature maps to detect large targets.
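As a check on the numbers above, calling PriorBox with plausible anchor settings reproduces the 23,500 priors seen in the head output shapes. The min_sizes, steps, and ratio values below are assumptions for illustration; verify them against your own config file.

# Assumed anchor settings for a 640x640 input; check cfg['model']['anchor'].
prior_box = PriorBox(
    min_sizes=[[10, 16, 24], [32, 48], [64, 96], [128, 192, 256]],
    steps=[8, 16, 32, 64],
    clip=False,
    ratio=[1.0],
)
priors = prior_box((640, 640))
print(priors.shape)  # torch.Size([23500, 4])
# 80*80*3 + 40*40*2 + 20*20*2 + 10*10*3 = 23500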

Loss Computation

This part covers the face bbox loss, the landmark loss, and the classification loss over positive and negative samples.

bbox loss: the face bbox uses the EIoU loss shown below (note the smooth_point parameter, which gives it a smooth L1 flavor). The loss is computed between the boxes obtained from the priors and the ground-truth boxes.

loss_bbox_eiou = eiou_loss(loc_p[:, 0:4], loc_t[:, 0:4], variance=self.variance, smooth_point=self.smooth_point, reduction='sum')

Landmark loss: computed over the same matched priors, but with smooth L1.

loss_lm_smoothl1 = F.smooth_l1_loss(loc_p[:, 4:loc_dim], loc_t[:, 4:loc_dim], reduction='sum')
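For reference, F.smooth_l1_loss with its default beta = 1.0 computes 0.5·x² when |x| < 1 and |x| − 0.5 otherwise, where x is the element-wise difference between prediction and target, so large landmark errors contribute bounded gradients.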

Classification cross-entropy loss: softmax cross-entropy, but the samples are filtered first so that positives and negatives are kept at a 1:3 ratio (a sketch of this selection follows the code line below).

loss_cls_ce = F.cross_entropy(conf_p, targets_weighted, reduction='sum')
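The sampling code itself is not shown in this post, so here is a generic SSD-style hard negative mining sketch of the 1:3 selection described above (a hypothetical helper, not the repo's exact implementation):

import torch

def hard_negative_mining(cls_loss, pos_mask, neg_pos_ratio=3):
    # cls_loss: (batch, num_priors) per-anchor classification loss
    # pos_mask: (batch, num_priors) bool, True where an anchor matched a face
    num_pos = pos_mask.long().sum(dim=1, keepdim=True)
    num_neg = torch.clamp(neg_pos_ratio * num_pos, max=pos_mask.size(1) - 1)

    loss = cls_loss.clone()
    loss[pos_mask] = 0.0                 # rank only the negatives
    _, loss_idx = loss.sort(dim=1, descending=True)
    _, idx_rank = loss_idx.sort(dim=1)   # rank of each anchor by its loss
    neg_mask = idx_rank < num_neg        # keep the hardest negatives per image
    return pos_mask | neg_mask           # anchors that enter the cross-entropy loss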

Closing remarks: that wraps up this part. Thanks for reading; if you found it useful, a like is the best support you can give~
