Background
Progress
PointNet++
Key Problems
PointNet++ Structure
Baseline
Hierarchical point set feature learning (Encoder)
Decoder
Loss
Experiments
Self-Test Results
Reference
Background
PointNet was pioneering work in point-cloud processing, but practical use exposed some of its problems: because PointNet does not capture local spatial features, its fine-grained recognition ability is limited. PointNet++ was therefore proposed to remedy these shortcomings.
For details on PointNet, see: PointNet (Analysis & Coding)
Progress
- Uses PointNet recursively on local point-cloud partitions defined by metric-space distances, which helps learn local features at increasingly larger contextual scales;
- To handle non-uniform point density, proposes new set-learning layers that adaptively combine features from multiple scales, letting PointNet++ learn point features more robustly and efficiently;
- For the segmentation task, proposes inverse-distance interpolation and skip connections to learn per-point features.
PointNet++
Key Problems
- Addresses PointNet's inability to capture local structure induced by the metric space; PointNet++ captures point-cloud features at multiple scales;
- Because PointNet relies on a single max pooling to obtain the global feature, a lot of information is lost. Borrowing the hierarchical features of multi-layer CNNs, PointNet++ adopts multi-level feature learning;
- How to partition the point set so that finer-grained local features can be extracted;
- Introduces the encoder-decoder idea for the segmentation task.
PointNet++ Structure
Baseline
Hierarchical point set feature learning (Encoder)
Hierarchical point set feature learning consists of multiple stacked Set Abstraction layers. The goal is to apply PointNet recursively to achieve multi-level feature learning, while preserving rotation invariance and permutation invariance. The baseline is shown in the figure below:
A Set Abstraction layer consists of three parts: Sampling, Grouping, and PointNet; that is, Set Abstraction = Sampling + Grouping + PointNet. The three parts are introduced one by one below.
For Sampling, the authors considered two options: uniform sampling and farthest point sampling. Compared with uniform sampling, farthest point sampling covers the whole sampling space better. FPS (Farthest Point Sampling) is best explained with the following figure.
Consider the five-pointed star. Suppose point 1 is selected first. Scanning the whole star point cloud, point 2 is found to be the farthest from point 1, so point 2 becomes the second FPS sample. Then, treating points 1 and 2 as a set, the point farthest from both is found to be point 3, so point 3 is the third FPS sample, and so on. In the right figure, points 1-5 are finally obtained. These five points represent the star's shape well and cover the sampling space as much as possible.
Here is the FPS operation as implemented in code. (Note: all code examples here come from the PyTorch reimplementation of PointNet++. I have also read the TensorFlow version, but I find it more obscure and less clear than the PyTorch one, so the PyTorch code is used throughout.)
# FPS
def farthest_point_sample(xyz, npoint):
    """
    Input:
        xyz: pointcloud data, [B, N, 3]
        npoint: number of samples
    Return:
        centroids: sampled pointcloud index, [B, npoint]
    """
    device = xyz.device
    B, N, C = xyz.shape
    centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)  # indices of the selected points, [B, npoint]
    distance = torch.ones(B, N).to(device) * 1e10  # running min distance to the selected set, [B, N]
    farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)  # random starting point per batch, [B]
    batch_indices = torch.arange(B, dtype=torch.long).to(device)  # batch index helper, [B]
    for i in range(npoint):
        centroids[:, i] = farthest  # record the current farthest point (random on the first pass)
        centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)  # coordinates of that point
        dist = torch.sum((xyz - centroid) ** 2, -1)  # squared distance from every point to it
        mask = dist < distance  # True where the new distance is smaller than the stored one
        distance[mask] = dist[mask]  # update the running min distance to the selected set
        farthest = torch.max(distance, -1)[1]  # next point = argmax of the min distances
    return centroids  # indices of the sampled points
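As a quick sanity check, here is a hypothetical snippet (not from the repo; it assumes torch is imported as in the code above) calling FPS on a random cloud:

# Hypothetical usage: sample 4 well-spread points from a random cloud of 100.
import torch
xyz = torch.rand(2, 100, 3)                        # [B, N, 3]
idx = farthest_point_sample(xyz, 4)                # [B, 4] indices
sampled = xyz[torch.arange(2).unsqueeze(1), idx]   # gather the sampled coordinates, [B, 4, 3]
print(sampled.shape)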
In the Grouping step, the author considers two options: KNN (K Nearest Neighbors) and Ball Query. Compared with KNN, Ball Query guarantees a fixed region scale for each local neighborhood, which makes the local-region features more generalizable across the feature space (the two ideas are otherwise quite similar). The author uses Ball Query here, with the FPS-selected points as ball centers. The supplementary material compares KNN and Ball Query on classification.
In the code, point-to-point distances are needed first, so the pairwise (squared Euclidean) distances between the coordinates of the two input point sets are computed as follows:
# Pairwise squared Euclidean distance between two point sets
# src has N points, dst has M points; the output is an N x M distance matrix
def square_distance(src, dst):
"""
Calculate Euclid distance between each two points.
src^T * dst = xn * xm + yn * ym + zn * zm;
sum(src^2, dim=-1) = xn*xn + yn*yn + zn*zn;
sum(dst^2, dim=-1) = xm*xm + ym*ym + zm*zm;
dist = (xn - xm)^2 + (yn - ym)^2 + (zn - zm)^2
         = sum(src**2, dim=-1) + sum(dst**2, dim=-1) - 2 * src^T * dst
    (per coordinate: (x - y)^2 = x^2 + y^2 - 2xy, for two point clouds x, y)
Input:
src: source points, [B, N, C]
dst: target points, [B, M, C]
Output:
dist: per-point square distance, [B, N, M]
"""
B, N, _ = src.shape
_, M, _ = dst.shape
    dist = -2 * torch.matmul(src, dst.permute(0, 2, 1))  # -2 * src · dst^T (permute the last two dims for the matmul)
    dist += torch.sum(src ** 2, -1).view(B, N, 1)  # add the ||src||^2 term
    dist += torch.sum(dst ** 2, -1).view(B, 1, M)  # add the ||dst||^2 term
return dist
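A small hypothetical check (not from the repo) of square_distance against PyTorch's built-in torch.cdist, which returns unsquared distances:

# Hypothetical sanity check: square_distance should match torch.cdist squared.
src = torch.rand(1, 5, 3)
dst = torch.rand(1, 7, 3)
d1 = square_distance(src, dst)      # [1, 5, 7]
d2 = torch.cdist(src, dst) ** 2     # same values up to floating-point error
print(torch.allclose(d1, d2, atol=1e-5))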
With the pairwise distances available, Ball Query is run around the center points selected by FPS:
# Return the points within the given ball: radius = ball radius, nsample = number of points to keep,
# xyz = all input points, new_xyz = the query (center) points
def query_ball_point(radius, nsample, xyz, new_xyz):
"""
Input:
radius: local region radius
nsample: max sample number in local region
xyz: all points, [B, N, 3]
new_xyz: query points, [B, S, 3]
Return:
group_idx: grouped points index, [B, S, nsample]
"""
device = xyz.device
B, N, C = xyz.shape
_, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])  # index grid 0..N-1 per query point, [B, S, N]
    sqrdists = square_distance(new_xyz, xyz)  # squared distance from each query point to every point
    group_idx[sqrdists > radius ** 2] = N  # points outside the ball get the sentinel index N (we want the ones inside)
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]  # sort ascending so real indices come first, keep the first nsample
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])  # closest point of each ball, repeated, [B, S, nsample]
    mask = group_idx == N  # slots still holding the sentinel (the ball had fewer than nsample points)
    group_idx[mask] = group_first[mask]  # fill them with the first (closest) point
return group_idx
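The grouping code below also calls an index_points helper from the same repo, which simply gathers points by index. For completeness, here is a sketch consistent with the shapes used in this walkthrough:

def index_points(points, idx):
    """
    Gather points by index.
    Input:
        points: input points data, [B, N, C]
        idx: sample index data, [B, S] or [B, S, K]
    Return:
        new_points: indexed points data, [B, S, C] or [B, S, K, C]
    """
    device = points.device
    B = points.shape[0]
    view_shape = list(idx.shape)
    view_shape[1:] = [1] * (len(view_shape) - 1)  # [B, 1] or [B, 1, 1]
    repeat_shape = list(idx.shape)
    repeat_shape[0] = 1
    batch_indices = torch.arange(B, dtype=torch.long).to(device).view(view_shape).repeat(repeat_shape)  # batch index per gathered point
    new_points = points[batch_indices, idx, :]
    return new_points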
In the code, the author merges the Sampling and Grouping steps into one, i.e., FPS + Ball Query, as shown in the figure below:
The code provides two variants: sample_and_group(), which partitions the input points into multiple groups, and sample_and_group_all(), which treats all input points as a single group. The implementations are as follows:
# sample + group (multiple groups)
def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=False):
"""
Input:
        npoint: number of centroids sampled by FPS
        radius: ball query radius
        nsample: max number of points in each local region
        xyz: input points position data, [B, N, 3]
        points: input points data, [B, N, D]
    Return:
        new_xyz: sampled points position data, [B, npoint, 3]
        new_points: sampled points data, [B, npoint, nsample, 3+D]
"""
B, N, C = xyz.shape
S = npoint
    fps_idx = farthest_point_sample(xyz, npoint)  # FPS sample indices, [B, npoint]
    new_xyz = index_points(xyz, fps_idx)  # coordinates of the sampled centroids
    idx = query_ball_point(radius, nsample, xyz, new_xyz)  # point indices within each ball
    grouped_xyz = index_points(xyz, idx)  # gather the grouped coordinates, [B, npoint, nsample, C]
    grouped_xyz_norm = grouped_xyz - new_xyz.view(B, S, 1, C)  # convert to coordinates relative to each centroid
    if points is not None:
        grouped_points = index_points(points, idx)  # gather the extra per-point features
        new_points = torch.cat([grouped_xyz_norm, grouped_points], dim=-1)  # concatenate coords and features, [B, npoint, nsample, C+D]
    else:
        new_points = grouped_xyz_norm  # no extra features: relative coordinates only, [B, npoint, nsample, C]
    if returnfps:  # optionally return the FPS details as well
return new_xyz, new_points, grouped_xyz, fps_idx
else:
return new_xyz, new_points
# sample + group, treating all points as a single group
def sample_and_group_all(xyz, points):
"""
Input:
xyz: input points position data, [B, N, 3]
points: input points data, [B, N, D]
Return:
new_xyz: sampled points position data, [B, 1, 3]
new_points: sampled points data, [B, 1, N, 3+D]
"""
device = xyz.device
B, N, C = xyz.shape
new_xyz = torch.zeros(B, 1, C).to(device)
grouped_xyz = xyz.view(B, 1, N, C)
if points is not None:
new_points = torch.cat([grouped_xyz, points.view(B, 1, N, -1)], dim=-1)
else:
new_points = grouped_xyz
return new_xyz, new_points
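A hypothetical example (not from the repo) showing the output shapes of the two grouping modes, matching the docstrings above:

# Hypothetical shape check for both grouping modes.
xyz = torch.rand(2, 1024, 3)  # [B, N, 3]
new_xyz, new_points = sample_and_group(npoint=128, radius=0.2, nsample=32, xyz=xyz, points=None)
print(new_xyz.shape, new_points.shape)   # [2, 128, 3] [2, 128, 32, 3]
all_xyz, all_points = sample_and_group_all(xyz, None)
print(all_xyz.shape, all_points.shape)   # [2, 1, 3] [2, 1, 1024, 3]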
Ideally, a point cloud would be uniformly distributed regardless of distance. In practice, point clouds captured by real sensors are mostly non-uniform: the sensor's resolution and scan rate are fixed, so point density drops as range (R) grows. In the scan shown below, the point density around points 1, 2, 3 is higher than around points 4, 5, 6 (same scanner, different R).
To address this, the author proposes combining features from different scales, with two concrete schemes: MSG (multi-scale grouping) and MRG (multi-resolution grouping). MSG samples the same data at several different scales and concatenates the resulting features. MRG instead combines features extracted from the previous Set Abstraction level with features extracted directly from the raw points of each local region, then concatenates them. See the figure below.
The paper compares the multi-scale methods (MSG and MRG) against single-scale grouping (SSG) and shows that multi-scale does perform better (though since DP, random input point dropout, is added at the same time, it is hard to say exactly which change contributes what). The results are shown below:
The author applies MSG-style feature concatenation inside PointNet++ to strengthen local feature extraction. The code contains both an MSG variant and a plain single-scale variant, shown below (the released code has no MRG implementation, for reasons unknown):
# Set Abstraction = sample + group + PointNet
class PointNetSetAbstraction(nn.Module):
    def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all):  # npoint: number of centroids; radius: ball radius; nsample: points per region; in_channel: input channels; mlp: MLP layer widths; group_all: single-group mode
super(PointNetSetAbstraction, self).__init__()
self.npoint = npoint
self.radius = radius
self.nsample = nsample
self.mlp_convs = nn.ModuleList()
self.mlp_bns = nn.ModuleList()
last_channel = in_channel
        for out_channel in mlp:  # one 1x1 conv + BN per MLP layer width
self.mlp_convs.append(nn.Conv2d(last_channel, out_channel, 1))
self.mlp_bns.append(nn.BatchNorm2d(out_channel))
last_channel = out_channel
self.group_all = group_all
def forward(self, xyz, points):
"""
Input:
xyz: input points position data, [B, C, N]
points: input points data, [B, D, N]
Return:
new_xyz: sampled points position data, [B, C, S]
new_points_concat: sample points feature data, [B, D', S]
"""
xyz = xyz.permute(0, 2, 1)
        if points is not None:  # extra per-point features exist
            points = points.permute(0, 2, 1)
        if self.group_all:  # form a single group
            new_xyz, new_points = sample_and_group_all(xyz, points)
        else:  # form multiple groups
            new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
        # new_xyz: sampled points position data, [B, npoint, C]
        # new_points: sampled points data, [B, npoint, nsample, C+D]
        new_points = new_points.permute(0, 3, 2, 1)  # [B, C+D, nsample, npoint] reorder for Conv2d
        for i, conv in enumerate(self.mlp_convs):  # shared MLP (PointNet) over every point in each group
            bn = self.mlp_bns[i]
            new_points = F.relu(bn(conv(new_points)))
        new_points = torch.max(new_points, 2)[0]  # max pool within each group: the region's feature
new_xyz = new_xyz.permute(0, 2, 1)
return new_xyz, new_points
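A hypothetical smoke test (not from the repo) for a single set-abstraction layer; note that in_channel counts the extra feature channels plus the 3 relative coordinates:

# Hypothetical smoke test for one set-abstraction layer.
sa = PointNetSetAbstraction(npoint=512, radius=0.2, nsample=32,
                            in_channel=3 + 3, mlp=[64, 64, 128], group_all=False)
xyz = torch.rand(2, 3, 1024)     # coordinates, [B, C, N]
points = torch.rand(2, 3, 1024)  # extra per-point features (e.g. normals), [B, D, N]
new_xyz, new_points = sa(xyz, points)
print(new_xyz.shape, new_points.shape)   # [2, 3, 512] [2, 128, 512]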
# Set Abstraction with MSG (multi-scale grouping)
class PointNetSetAbstractionMsg(nn.Module):
def __init__(self, npoint, radius_list, nsample_list, in_channel, mlp_list):
super(PointNetSetAbstractionMsg, self).__init__()
self.npoint = npoint
        self.radius_list = radius_list  # several radii can be given to extract features at different scales
self.nsample_list = nsample_list
self.conv_blocks = nn.ModuleList()
self.bn_blocks = nn.ModuleList()
for i in range(len(mlp_list)):
convs = nn.ModuleList()
bns = nn.ModuleList()
last_channel = in_channel + 3
            for out_channel in mlp_list[i]:  # build the MLP for this scale
convs.append(nn.Conv2d(last_channel, out_channel, 1))
bns.append(nn.BatchNorm2d(out_channel))
last_channel = out_channel
self.conv_blocks.append(convs)
self.bn_blocks.append(bns)
def forward(self, xyz, points):
"""
Input:
xyz: input points position data, [B, C, N]
points: input points data, [B, D, N]
Return:
new_xyz: sampled points position data, [B, C, S]
new_points_concat: sample points feature data, [B, D', S]
"""
xyz = xyz.permute(0, 2, 1)
if points is not None:
points = points.permute(0, 2, 1)
B, N, C = xyz.shape
S = self.npoint
        new_xyz = index_points(xyz, farthest_point_sample(xyz, S))  # centroids selected by FPS
        new_points_list = []
        for i, radius in enumerate(self.radius_list):  # loop over the radii: one feature scale per radius
            K = self.nsample_list[i]
            group_idx = query_ball_point(radius, K, xyz, new_xyz)  # ball query around each centroid
            grouped_xyz = index_points(xyz, group_idx)  # gather the grouped coordinates
            grouped_xyz -= new_xyz.view(B, S, 1, C)
            if points is not None:  # extra per-point features exist
                grouped_points = index_points(points, group_idx)
                grouped_points = torch.cat([grouped_points, grouped_xyz], dim=-1)  # concatenate features and relative coords
            else:
                grouped_points = grouped_xyz
            grouped_points = grouped_points.permute(0, 3, 2, 1)  # [B, D, K, S]
            for j in range(len(self.conv_blocks[i])):  # shared MLP (PointNet) over each local region at this scale
                conv = self.conv_blocks[i][j]
                bn = self.bn_blocks[i][j]
                grouped_points = F.relu(bn(conv(grouped_points)))
            new_points = torch.max(grouped_points, 2)[0]  # [B, D', S] max pool within each region
            new_points_list.append(new_points)
        new_xyz = new_xyz.permute(0, 2, 1)  # sampled points position data, [B, C, S]
        new_points_concat = torch.cat(new_points_list, dim=1)  # concatenate the features from all radii, [B, D', S]
return new_xyz, new_points_concat
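Similarly, a hypothetical two-radius MSG layer (not from the repo); the output channels are the concatenation of the per-scale MLP outputs:

# Hypothetical smoke test for the MSG layer.
sa_msg = PointNetSetAbstractionMsg(512, [0.1, 0.2], [16, 32],
                                   in_channel=0, mlp_list=[[32, 64], [64, 128]])
xyz = torch.rand(2, 3, 1024)   # [B, 3, N]
new_xyz, new_points = sa_msg(xyz, None)
print(new_xyz.shape, new_points.shape)   # [2, 3, 512] [2, 192, 512]  (64 + 128 concatenated)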
Decoder
In the classification task, the encoder's output is fed into the decoder, which here is simply a stack of fully connected layers. Note that Dropout(0.4) is used in this part to reduce overfitting. The implementation is as follows:
class get_model(nn.Module):
def __init__(self,num_class,normal_channel=True):
super(get_model, self).__init__()
in_channel = 3 if normal_channel else 0
self.normal_channel = normal_channel
        self.sa1 = PointNetSetAbstractionMsg(512,
                                             [0.1, 0.2, 0.4],  # multiple scale radii
                                             [16, 32, 128], in_channel, [[32, 32, 64], [64, 64, 128], [64, 96, 128]])
self.sa2 = PointNetSetAbstractionMsg(128, [0.2, 0.4, 0.8], [32, 64, 128], 320,[[64, 64, 128], [128, 128, 256], [128, 128, 256]])
self.sa3 = PointNetSetAbstraction(None, None, None, 640 + 3, [256, 512, 1024], True)
        self.fc1 = nn.Linear(1024, 512)  # fully connected
        self.bn1 = nn.BatchNorm1d(512)  # batch normalization
        self.drop1 = nn.Dropout(0.4)  # drop 40% of the units
self.fc2 = nn.Linear(512, 256)
self.bn2 = nn.BatchNorm1d(256)
self.drop2 = nn.Dropout(0.5)
self.fc3 = nn.Linear(256, num_class)
def forward(self, xyz):
B, _, _ = xyz.shape
if self.normal_channel:
norm = xyz[:, 3:, :]
xyz = xyz[:, :3, :]
else:
norm = None
l1_xyz, l1_points = self.sa1(xyz, norm)
l2_xyz, l2_points = self.sa2(l1_xyz, l1_points)
l3_xyz, l3_points = self.sa3(l2_xyz, l2_points)
x = l3_points.view(B, 1024)
x = self.drop1(F.relu(self.bn1(self.fc1(x))))
x = self.drop2(F.relu(self.bn2(self.fc2(x))))
x = self.fc3(x)
x = F.log_softmax(x, -1)
return x,l3_points
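A hypothetical forward pass (not from the repo) through the classification network, with coordinates only, so normal_channel=False:

# Hypothetical smoke test for the classification model.
model = get_model(num_class=40, normal_channel=False)
xyz = torch.rand(2, 3, 1024)   # [B, 3, N], xyz coordinates only
logits, _ = model(xyz)
print(logits.shape)            # [2, 40] log-probabilities over the 40 ModelNet classes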
Segmentation
In the segmentation part, the encoder produces features on a heavily subsampled set of points (plus a global feature), but segmentation needs per-point features. PointNet++ therefore uses inverse-distance interpolation together with skip connections to recover per-point features.
Concretely, the interpolation is a weighted average over the k nearest neighbors with inverse-distance weights (the farther the neighbor, the smaller its weight); the interpolated features are then concatenated with the corresponding Set Abstraction features, somewhat like pointwise convolution in CNNs. The scheme is given by the formula below.
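For reference, the inverse-distance weighting from the paper (with defaults p = 2, k = 3):

f^{(j)}(x) = \frac{\sum_{i=1}^{k} w_i(x)\, f_i^{(j)}}{\sum_{i=1}^{k} w_i(x)},
\qquad w_i(x) = \frac{1}{d(x, x_i)^p}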
The interpolation in code:
# For segmentation: distance-weighted interpolation plus a shared MLP
class PointNetFeaturePropagation(nn.Module):
def __init__(self, in_channel, mlp):
super(PointNetFeaturePropagation, self).__init__()
self.mlp_convs = nn.ModuleList()
self.mlp_bns = nn.ModuleList()
last_channel = in_channel
for out_channel in mlp:
self.mlp_convs.append(nn.Conv1d(last_channel, out_channel, 1))
            self.mlp_bns.append(nn.BatchNorm1d(out_channel))  # 1-D batch normalization
last_channel = out_channel
def forward(self, xyz1, xyz2, points1, points2):
"""
Input:
xyz1: input points position data, [B, C, N]
xyz2: sampled input points position data, [B, C, S]
points1: input points data, [B, D, N]
points2: input points data, [B, D, S]
Return:
new_points: upsampled points data, [B, D', N]
"""
xyz1 = xyz1.permute(0, 2, 1)
xyz2 = xyz2.permute(0, 2, 1)
points2 = points2.permute(0, 2, 1)
B, N, C = xyz1.shape
_, S, _ = xyz2.shape
        if S == 1:  # only one source point
            interpolated_points = points2.repeat(1, N, 1)  # just copy it
        else:  # more than one: interpolate
            dists = square_distance(xyz1, xyz2)  # pairwise distances
            dists, idx = dists.sort(dim=-1)  # sort by distance
            dists, idx = dists[:, :, :3], idx[:, :, :3]  # [B, N, 3] keep the three nearest points
            dist_recip = 1.0 / (dists + 1e-8)  # inverse distance: farther points get smaller weights
            norm = torch.sum(dist_recip, dim=2, keepdim=True)  # normalizer
            weight = dist_recip / norm  # normalize the weights
            interpolated_points = torch.sum(index_points(points2, idx) * weight.view(B, N, 3, 1), dim=2)  # weighted sum of the neighbors' features
        if points1 is not None:  # skip-linked features exist
            points1 = points1.permute(0, 2, 1)
            new_points = torch.cat([points1, interpolated_points], dim=-1)  # concatenate skip features with the interpolation
        else:
            new_points = interpolated_points
        new_points = new_points.permute(0, 2, 1)
        for i, conv in enumerate(self.mlp_convs):  # shared MLP over the concatenated points
            bn = self.mlp_bns[i]
            new_points = F.relu(bn(conv(new_points)))
return new_points
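A hypothetical shape check (not from the repo) for one feature-propagation step, upsampling features from 128 back to 1024 points; in_channel is the skip-link channels plus the interpolated channels:

# Hypothetical smoke test for feature propagation.
fp = PointNetFeaturePropagation(in_channel=64 + 128, mlp=[128, 128])
xyz1 = torch.rand(2, 3, 1024)      # dense positions, [B, C, N]
xyz2 = torch.rand(2, 3, 128)       # sparse positions, [B, C, S]
points1 = torch.rand(2, 64, 1024)  # dense skip-link features, [B, D1, N]
points2 = torch.rand(2, 128, 128)  # sparse features to interpolate, [B, D2, S]
out = fp(xyz1, xyz2, points1, points2)
print(out.shape)                   # [2, 128, 1024]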
As for the skip connections: the C2 features in the figure come from the encoder, i.e., the features of the corresponding encoder level are concatenated over (see the Skip link connection in the baseline figure above). This appears in the code as well. In the segmentation code below, after the three Set Abstraction encoder steps, PointNetFeaturePropagation() performs the interpolation followed by the skip link connection. During segmentation the author again uses Dropout(0.5) against overfitting. The implementation is as follows:
class get_model(nn.Module):
def __init__(self, num_classes, normal_channel=False):
super(get_model, self).__init__()
if normal_channel:
additional_channel = 3
else:
additional_channel = 0
self.normal_channel = normal_channel
        self.sa1 = PointNetSetAbstractionMsg(512,  # number of sampled centroids
                                             [0.1, 0.2, 0.4],  # three scale radii
                                             [32, 64, 128],  # points per scale
                                             3 + additional_channel,  # input feature channels (additional_channel = 0 here)
                                             [[32, 32, 64],
                                              [64, 64, 128],
                                              [64, 96, 128]])
self.sa2 = PointNetSetAbstractionMsg(128, [0.4,0.8], [64, 128], 128+128+64, [[128, 128, 256], [128, 196, 256]])
self.sa3 = PointNetSetAbstraction(npoint=None, radius=None, nsample=None, in_channel=512 + 3, mlp=[256, 512, 1024], group_all=True)
self.fp3 = PointNetFeaturePropagation(in_channel=1536, mlp=[256, 256])
self.fp2 = PointNetFeaturePropagation(in_channel=576, mlp=[256, 128])
self.fp1 = PointNetFeaturePropagation(in_channel=150+additional_channel, mlp=[128, 128])
self.conv1 = nn.Conv1d(128, 128, 1)
self.bn1 = nn.BatchNorm1d(128)
self.drop1 = nn.Dropout(0.5)
self.conv2 = nn.Conv1d(128, num_classes, 1)
def forward(self, xyz, cls_label):
# Set Abstraction layers
B,C,N = xyz.shape
if self.normal_channel:
l0_points = xyz
l0_xyz = xyz[:,:3,:]
else:
l0_points = xyz
l0_xyz = xyz
#SA
l1_xyz, l1_points = self.sa1(l0_xyz, l0_points)
l2_xyz, l2_points = self.sa2(l1_xyz, l1_points)
l3_xyz, l3_points = self.sa3(l2_xyz, l2_points)
# Feature Propagation layers
l2_points = self.fp3(l2_xyz, l3_xyz, l2_points, l3_points)
l1_points = self.fp2(l1_xyz, l2_xyz, l1_points, l2_points)
        cls_label_one_hot = cls_label.view(B, 16, 1).repeat(1, 1, N)  # one-hot object category broadcast over all N points; concatenated below with the skip-linked raw coordinates and features
l0_points = self.fp1(l0_xyz, l1_xyz, torch.cat([cls_label_one_hot,l0_xyz,l0_points],1), l1_points)
# FC layers
feat = F.relu(self.bn1(self.conv1(l0_points)))
x = self.drop1(feat)
x = self.conv2(x)
x = F.log_softmax(x, dim=1)
x = x.permute(0, 2, 1)
return x, l3_points
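A hypothetical forward pass (not from the repo) through the part-segmentation network; cls_label is the one-hot object category that gets broadcast in forward():

# Hypothetical smoke test for the part-segmentation model.
model = get_model(num_classes=50, normal_channel=False)
xyz = torch.rand(2, 3, 2048)     # [B, 3, N]
cls_label = torch.zeros(2, 16)   # one-hot object category, [B, 16]
cls_label[:, 0] = 1
pred, _ = model(xyz, cls_label)
print(pred.shape)                # [2, 2048, 50] per-point log-probabilities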
Loss
In PointNet++, both classification and segmentation use the cross-entropy loss.
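Since both networks end with log_softmax, the cross-entropy is implemented as an NLL loss on the log-probabilities. A minimal sketch in the style of the PyTorch repo (details may differ from the actual file):

import torch.nn as nn
import torch.nn.functional as F

class get_loss(nn.Module):
    def __init__(self):
        super(get_loss, self).__init__()

    def forward(self, pred, target):
        # pred: log-probabilities from log_softmax; target: integer class labels
        return F.nll_loss(pred, target)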
Experiments
PointNet++ uses different network architectures for classification and segmentation, and both are evaluated thoroughly. Classification is evaluated on ModelNet40, with the results shown in the figure below. Part segmentation is evaluated on ShapeNet Part, with the results shown in the figure below.
Self-Test Results
I also ran PointNet++ classification on ModelNet40 myself: 25 epochs on a Tesla T4. The results are shown in the figure below.
Reference
PointNet++ TensorFlow implementation: https://github.com/charlesq34/pointnet2
PointNet++ PyTorch implementation: https://github.com/yanx27/Pointnet_Pointnet2_pytorch
Qi, C., Su, H., Mo, K., & Guibas, L. (2017, April 10). PointNet: Deep learning on point sets for 3D classification and segmentation. Retrieved May 24, 2022, from https://arxiv.org/abs/1612.00593
Qi, C., Yi, L., Su, H., & Guibas, L. (2017, June 07). PointNet++: Deep hierarchical feature learning on point sets in a metric space. Retrieved May 24, 2022, from https://arxiv.org/abs/1706.02413