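PointNet2 (Part 1): Classification
PVN3D builds on networks such as pointnet2 and DenseFusion, so to understand PVN3D I first need to understand pointnet2. The version that ships .cpp and .cu files would not compile for me right away, so I switched to Pointnet_Pointnet2_pytorch-master, which contains both the pointnet and pointnet2 networks, and I study pointnet2 from that code.
First, look at the Pointnet2_utils.py file.
1. pc_normalize function
This function normalizes a point cloud. It finds the centroid of the point set and subtracts it from every point. It then computes sqrt(x*x + y*y + z*z) for each centered point to get its distance from the centroid, takes the maximum of these distances with max, and divides every centered point by that maximum. The result is the normalized coordinates, which all lie inside the unit sphere.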
import numpy as np

def pc_normalize(pc):
    l = pc.shape[0]  # number of points
    centroid = np.mean(pc, axis=0)  # centroid of the points
    pc = pc - centroid  # center the points on the centroid
    m = np.max(np.sqrt(np.sum(pc**2, axis=1)))  # largest distance from any point to the centroid
    pc = pc / m  # divide by the largest distance
    return pc
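As a quick sanity check of the description above, here is a minimal pure-Python sketch of the same normalization (no NumPy, so it runs anywhere). The name `pc_normalize_py` and the sample points are illustrative assumptions, not part of the repository.

```python
import math

def pc_normalize_py(points):
    """Center a list of (x, y, z) points and scale so the farthest point has norm 1."""
    n = len(points)
    # Centroid: per-axis mean of all points.
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    # Subtract the centroid from every point.
    centered = [[p[k] - centroid[k] for k in range(3)] for p in points]
    # Largest Euclidean distance from the new origin.
    m = max(math.sqrt(sum(c * c for c in p)) for p in centered)
    return [[c / m for c in p] for p in centered]

# Hypothetical toy cloud with four points.
cloud = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (2.0, 4.0, 6.0)]
normalized = pc_normalize_py(cloud)
max_norm = max(math.sqrt(sum(c * c for c in p)) for p in normalized)
mean = [sum(p[k] for p in normalized) / len(normalized) for k in range(3)]
```

After normalization the cloud should have a mean close to zero and a largest norm close to 1, which is what `mean` and `max_norm` let you inspect.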
2. square_distance
The distances between a set of N points and a set of M points form an N*M matrix. Element (i, j) stores the squared distance between the i-th point of the N-point set and the j-th point of the M-point set.
import torch

# Compute the distance matrix. If src is B batches of N points and dst is B batches of M points,
# the resulting distance matrix is B batches of N rows by M columns.
def square_distance(src, dst):
    """
    Calculate Euclid distance between each two points.
    src^T * dst = xn * xm + yn * ym + zn * zm;
    sum(src^2, dim=-1) = xn*xn + yn*yn + zn*zn;
    sum(dst^2, dim=-1) = xm*xm + ym*ym + zm*zm;
    dist = (xn - xm)^2 + (yn - ym)^2 + (zn - zm)^2
         = sum(src**2,dim=-1)+sum(dst**2,dim=-1)-2*src^T*dst
    Input:
        src: source points, [B, N, C]
        dst: target points, [B, M, C]
    Output:
        dist: per-point square distance, [B, N, M]
    """
    B, N, _ = src.shape  # B batches of N points
    _, M, _ = dst.shape  # B batches of M points
    dist = -2 * torch.matmul(src, dst.permute(0, 2, 1))  # [B, N, M]; element (i, j) is -2 * dot(src_i, dst_j)
    dist += torch.sum(src ** 2, -1).view(B, N, 1)  # squared norm of each src point, broadcast over columns
    dist += torch.sum(dst ** 2, -1).view(B, 1, M)  # squared norm of each dst point, broadcast over rows
    return dist
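The identity in the docstring, |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, can be checked on a tiny example. The sketch below is pure Python for a single batch; `square_distance_py` and the sample points are illustrative assumptions, not repository code.

```python
def square_distance_py(src, dst):
    """Return an N x M matrix of squared distances using |a|^2 + |b|^2 - 2 a.b."""
    src_sq = [sum(c * c for c in a) for a in src]  # |a|^2 for each source point
    dst_sq = [sum(c * c for c in b) for b in dst]  # |b|^2 for each target point
    return [
        [src_sq[i] + dst_sq[j] - 2 * sum(x * y for x, y in zip(a, b))
         for j, b in enumerate(dst)]
        for i, a in enumerate(src)
    ]

src = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]                    # N = 2
dst = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]   # M = 3
dist = square_distance_py(src, dst)
# Direct definition for comparison: sum of squared coordinate differences.
direct = [[sum((x - y) ** 2 for x, y in zip(a, b)) for b in dst] for a in src]
```

The expanded form matters in the tensor version because it turns the pairwise computation into one matrix multiplication plus two broadcast additions.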
3. index_points
Extracts point coordinates or per-point feature vectors according to point indices.
# Gather point positions or attributes according to idx
def index_points(points, idx):
    """
    Input:
        points: input points data, [B, N, C]
        idx: sample index data, [B, S]
    Return:
        new_points:, indexed points data, [B, S, C]
    """
    device = points.device  # device the points live on
    B = points.shape[0]  # number of batches
    view_shape = list(idx.shape)  # start from the shape of idx
    view_shape[1:] = [1] * (len(view_shape) - 1)  # becomes [B, 1, ...]
    repeat_shape = list(idx.shape)
    repeat_shape[0] = 1  # becomes [1, S, ...]
    # batch_indices has the same shape as idx; each entry is the batch number of that position
    batch_indices = torch.arange(B, dtype=torch.long).to(device).view(view_shape).repeat(repeat_shape)
    new_points = points[batch_indices, idx, :]
    return new_points
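The indexing above pairs every entry of idx with its own batch number, so each batch gathers only from its own points. A minimal pure-Python sketch of that behaviour for the [B, S] case follows; `index_points_py` and the data are hypothetical, chosen only for illustration.

```python
def index_points_py(points, idx):
    """points: B lists of N feature vectors; idx: B lists of S indices.

    Returns B lists of S feature vectors, gathered per batch.
    """
    return [[points[b][i] for i in idx[b]] for b in range(len(points))]

points = [
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],  # batch 0
    [[5, 5, 5], [6, 6, 6], [7, 7, 7]],  # batch 1
]
idx = [[2, 0], [1, 1]]  # S = 2 indices per batch; repeats are allowed
picked = index_points_py(points, idx)
```

Repeated indices are legal, which is what lets query_ball_point pad a group with a duplicate of its first point.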
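4. Farthest point sampling
The algorithm is fairly simple. First, farthest is randomly initialized with B indices (one per batch) and serves as the first selected point index in centroids. Then xyz[batch_indices, farthest, :].view(B, 1, 3) extracts the indexed points, every point in each batch has its batch's selected point subtracted from it, and the squared distances are computed. In iteration 0, every computed distance is smaller than 1e10, so all of distance is updated.
The point with the largest distance is then the second sample. Its index is used in iteration 1 (counting from iteration 0): distances from it to every point are computed again, and wherever the new distance is smaller than the stored value in distance, the smaller value replaces it. Finally, the index of the maximum value in distance for this round becomes the new farthest.
In effect each batch keeps one distance array that is always updated with the minimum. This is the same as choosing the point whose minimum distance to the already selected set is largest. That sounds convoluted, so in detail there are three steps:
1) Take one point from the remaining set, compute its distance to every point in the selected set, and keep the minimum.
2) Repeat 1) for every remaining point, giving each remaining point's minimum distance to the selected set.
3) Among all these minimum distances, pick the point with the largest one.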
def farthest_point_sample(xyz, npoint):
    """
    Input:
        xyz: pointcloud data, [B, N, 3]
        npoint: number of samples
    Return:
        centroids: sampled pointcloud index, [B, npoint]
    """
    device = xyz.device  # device the points live on
    B, N, C = xyz.shape  # B batches, N points, C channels
    centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)
    distance = torch.ones(B, N).to(device) * 1e10  # [B, N], min squared distance to the selected set
    farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)  # B random indices in [0, N)
    batch_indices = torch.arange(B, dtype=torch.long).to(device)  # B batch indices
    for i in range(npoint):
        centroids[:, i] = farthest  # iteration 0 stores the random index
        centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)  # the currently selected point of each batch
        dist = torch.sum((xyz - centroid) ** 2, -1)  # squared distance from every point to that point
        mask = dist < distance  # where the new distance is smaller
        distance[mask] = dist[mask]  # keep the smaller distance in distance
        farthest = torch.max(distance, -1)[1]  # index of the largest remaining distance
    return centroids
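The three steps can be traced by hand on a small cloud. Below is a minimal pure-Python sketch for a single batch. It is not the repository code: `farthest_point_sample_py` is a hypothetical name, and the `start` parameter replaces the random first index so the result is reproducible.

```python
def farthest_point_sample_py(xyz, npoint, start=0):
    """Return indices of npoint samples from one cloud (a list of (x, y, z))."""
    n = len(xyz)
    distance = [float("inf")] * n  # min squared distance to the selected set so far
    farthest = start               # the original picks this first index at random
    selected = []
    for _ in range(npoint):
        selected.append(farthest)
        cx, cy, cz = xyz[farthest]
        for i, (x, y, z) in enumerate(xyz):
            d = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
            if d < distance[i]:    # keep the smaller distance, like the mask update
                distance[i] = d
        # Next sample: the point whose min distance to the selected set is largest.
        farthest = max(range(n), key=lambda i: distance[i])
    return selected

cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (10, 1, 0), (5, 0, 0)]
samples = farthest_point_sample_py(cloud, 3, start=0)
```

Starting from index 0, the farthest point is index 3; after that, index 4 (the point in the middle) has the largest minimum distance to the set {0, 3}, so `samples` is `[0, 3, 4]`.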
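5. query_ball_point
What it does: positions in group_idx that hold the value N are filled with the first element of the same row, so when fewer than nsample points are found, the index of the first point found is used as padding.
How to read it: each farthest-point sample is a center, and the function collects the indices of the points that lie within radius of that center. If there are not enough, it pads with the index of the first point found. In the end every center in every batch gets the same number (nsample) of points.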
def query_ball_point(radius, nsample, xyz, new_xyz):
    """
    Input:
        radius: local region radius
        nsample: max sample number in local region
        xyz: all points, [B, N, 3]
        new_xyz: query points, [B, S, 3]
    Return:
        group_idx: grouped points index, [B, S, nsample]
    """
    device = xyz.device
    B, N, C = xyz.shape
    _, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])  # group_idx = [B, S, N]
    sqrdists = square_distance(new_xyz, xyz)  # squared distance matrix between new_xyz and xyz
    group_idx[sqrdists > radius ** 2] = N  # points outside the query radius get the sentinel index N
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]  # sort each row and keep only the first nsample entries
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])  # first element of each row, repeated nsample times
    mask = group_idx == N  # positions still holding the sentinel
    group_idx[mask] = group_first[mask]  # pad with the first found index when fewer than nsample points are inside
    return group_idx  # return the indices
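The padding behaviour is easier to see on a toy case. The following is a minimal pure-Python sketch for a single batch; `query_ball_point_py` is a hypothetical name, and the sketch assumes every query point is itself in xyz (as it is when queries come from farthest point sampling), so each ball contains at least one point.

```python
def query_ball_point_py(radius, nsample, xyz, new_xyz):
    """For each query point return exactly nsample indices into xyz (single cloud)."""
    groups = []
    for q in new_xyz:
        # Indices of points within the radius, in ascending index order.
        inside = [i for i, p in enumerate(xyz)
                  if sum((a - b) ** 2 for a, b in zip(p, q)) <= radius ** 2]
        inside = inside[:nsample]  # keep at most nsample
        # Pad with the first found index when fewer than nsample are inside.
        # Assumes `inside` is non-empty (the query point itself is in xyz).
        inside += [inside[0]] * (nsample - len(inside))
        groups.append(inside)
    return groups

xyz = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (1.0, 0.0, 0.0), (1.05, 0.0, 0.0), (1.08, 0.0, 0.0)]
queries = [xyz[0], xyz[2]]
groups = query_ball_point_py(0.1, 3, xyz, queries)
```

The first query finds only two neighbours and is padded with its first index; the second finds three and needs no padding.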
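6. sample_and_group
Apart from the two functions for farthest point sampling and grouping, the only remaining work in this function is to gather attributes by index, then concatenate the point positions (relative to their group center) with the point attributes to obtain the new features.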
def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=False):
    """
    Input:
        npoint:
        radius:
        nsample:
        xyz: input points position data, [B, N, 3]
        points: input points data, [B, N, D]
    Return:
        new_xyz: sampled points position data, [B, npoint, 3]
        new_points: sampled points data, [B, npoint, nsample, 3+D]
    """
    B, N, C = xyz.shape
    S = npoint
    fps_idx = farthest_point_sample(xyz, npoint)  # [B, npoint]
    new_xyz = index_points(xyz, fps_idx)  # the farthest-point samples, [B, npoint, C]
    idx = query_ball_point(radius, nsample, xyz, new_xyz)
    grouped_xyz = index_points(xyz, idx)  # [B, npoint, nsample, C], the ball-query groups
    grouped_xyz_norm = grouped_xyz - new_xyz.view(B, S, 1, C)  # center each group on its farthest-point sample
    if points is not None:
        grouped_points = index_points(points, idx)  # gather the attributes/features for each group
        new_points = torch.cat([grouped_xyz_norm, grouped_points], dim=-1)  # [B, npoint, nsample, C+D], positions (3) plus features
    else:
        new_points = grouped_xyz_norm  # positions only, no attributes
    if returnfps:
        return new_xyz, new_points, grouped_xyz, fps_idx
    else:
        return new_xyz, new_points  # the sampled centers and the generated features
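To make the centering and concatenation step concrete, here is a minimal pure-Python sketch for one center of one cloud. The helper `group_features_py`, its parameters, and the toy data are assumptions introduced for illustration only.

```python
def group_features_py(xyz, feats, center_idx, group_idx):
    """Build the [nsample, 3 + D] feature rows for one center of one cloud."""
    center = xyz[center_idx]
    rows = []
    for i in group_idx:
        local = [a - c for a, c in zip(xyz[i], center)]  # offset relative to the center
        rows.append(local + list(feats[i]))              # position (3) + features (D)
    return rows

xyz = [(1.0, 1.0, 1.0), (1.5, 1.0, 1.0), (1.0, 2.0, 1.0)]
feats = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]  # D = 2 hypothetical features per point
rows = group_features_py(xyz, feats, center_idx=0, group_idx=[0, 1, 2])
```

Each row has 3 + D = 5 entries, and the center itself maps to the local position (0, 0, 0), so the features that follow describe the neighbourhood independently of where it sits in space.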
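7. sample_and_group_all
All incoming points and attributes are used, with no sampling. If attributes are present, positions and attributes are concatenated into the new features; otherwise only positions are used.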
def sample_and_group_all(xyz, points):
    """
    Input:
        xyz: input points position data, [B, N, 3]
        points: input points data, [B, N, D]
    Return:
        new_xyz: sampled points position data, [B, 1, 3]
        new_points: sampled points data, [B, 1, N, 3+D]
    """
    device = xyz.device
    B, N, C = xyz.shape
    new_xyz = torch.zeros(B, 1, C).to(device)
    grouped_xyz = xyz.view(B, 1, N, C)
    if points is not None:
        new_points = torch.cat([grouped_xyz, points.view(B, 1, N, -1)], dim=-1)
    else:
        new_points = grouped_xyz
    return new_xyz, new_points
8. PointNetSetAbstractionMsg
# PointNetSetAbstractionMsg(512, [0.1, 0.2, 0.4], [16, 32, 128], in_channel,[[32, 32, 64], [64, 64, 128], [64, 96, 128]])
These parameters mean: farthest point sampling picks 512 points as centers, then grouping is done with ball radii 0.1, 0.2, and 0.4, selecting 16, 32, and 128 points within the respective radius. For the first layer the grouped input has 3 channels (positions only) or 6 (with normals). With 3 channels, grouping produces:
B*512*16*3    3 channels; 512 farthest-point samples, each grouped with 16 points, each point 3-dimensional.
B*512*32*3
B*512*128*3
That is three tensors.
First SA:
Take B*512*16*3 with 3 channels and the MLP [32, 32, 64], where each layer is the relu(bn(conv())) trio. The first layer produces B*32*512*16, the second (on top of the first) produces B*32*512*16, and the third (on top of the second) produces B*64*512*16. A max over the grouped points then removes the group dimension, giving B*64*512.
new_xyz holds the 512 points.
new_points takes the last value of each list in [[32, 32, 64], [64, 64, 128], [64, 96, 128]], so the result is B*(64+128+128)*512, that is B*320*512.
So after the first SA we have new_xyz (512 points) and new_points (B*320*512).
Second SA:
self.sa2 = PointNetSetAbstractionMsg(128, [0.2, 0.4, 0.8], [32, 64, 128], 320,[[64, 64, 128], [128, 128, 256], [128, 128, 256]])
Likewise, 128 points are sampled from the 512 points. This time the feature vector has 320 dimensions plus 3 for position, 323 in total, and the final result is
B*(128+256+256)*128, that is B*640*128.
Third SA:
PointNetSetAbstraction(None, None, None, 640 + 3, [256, 512, 1024], True)
After three conv-bn-relu trios this gives B*1024*128, where 128 is the number of farthest-point samples from the second SA; a max then reduces it to a B*1024 vector.
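The channel arithmetic above can be reproduced in a few lines. This is a minimal sketch: the helper `msg_out_channels` is a hypothetical name, and the only facts it uses are the MLP lists and point counts quoted in this section.

```python
def msg_out_channels(mlp_list):
    """Concatenated feature width of a multi-scale layer: last width of each branch."""
    return sum(mlp[-1] for mlp in mlp_list)

sa1_out = msg_out_channels([[32, 32, 64], [64, 64, 128], [64, 96, 128]])
sa2_out = msg_out_channels([[64, 64, 128], [128, 128, 256], [128, 128, 256]])
sa3_in = sa2_out + 3            # 640 features + 3 coordinates, matching `640 + 3`
sa3_out = [256, 512, 1024][-1]  # width of the last layer of the third SA

# Symbolic output shapes ("B" stands for the batch size).
shapes = {
    "sa1": ("B", sa1_out, 512),  # 512 sampled centers
    "sa2": ("B", sa2_out, 128),  # 128 sampled centers
    "sa3": ("B", sa3_out),       # one global vector per cloud
}
```

This gives 320 and 640 channels for the two multi-scale layers and a 1024-dimensional global feature, which is why the next layer in each case declares 320 and 640 + 3 input channels.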
import torch.nn as nn
import torch.nn.functional as F

class PointNetSetAbstraction(nn.Module):
    def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all):
        super(PointNetSetAbstraction, self).__init__()
        self.npoint = npoint
        self.radius = radius
        self.nsample = nsample
        self.mlp_convs = nn.ModuleList()
        self.mlp_bns = nn.ModuleList()
        last_channel = in_channel
        for out_channel in mlp:
            self.mlp_convs.append(nn.Conv2d(last_channel, out_channel, 1))
            self.mlp_bns.append(nn.BatchNorm2d(out_channel))
            last_channel = out_channel
        self.group_all = group_all

    def forward(self, xyz, points):
        """
        Input:
            xyz: input points position data, [B, C, N]
            points: input points data, [B, D, N]
        Return:
            new_xyz: sampled points position data, [B, C, S]
            new_points_concat: sample points feature data, [B, D', S]
        """
        xyz = xyz.permute(0, 2, 1)
        if points is not None:
            points = points.permute(0, 2, 1)

        if self.group_all:
            new_xyz, new_points = sample_and_group_all(xyz, points)
        else:
            new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
        # new_xyz: sampled points position data, [B, npoint, C]
        # new_points: sampled points data, [B, npoint, nsample, C+D]
        new_points = new_points.permute(0, 3, 2, 1)  # [B, C+D, nsample, npoint]
        for i, conv in enumerate(self.mlp_convs):
            bn = self.mlp_bns[i]
            new_points = F.relu(bn(conv(new_points)))

        new_points = torch.max(new_points, 2)[0]  # max over the nsample dimension
        new_xyz = new_xyz.permute(0, 2, 1)
        return new_xyz, new_points
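9. get_model
The last PointNetSetAbstraction produces a single feature vector representing the object. That vector is fed to fully connected layers, the last of which has as many outputs as there are classes (40 or 10).
The B*1024 vector passes through 1024 -> 512 -> 256 -> num_class (10 or 40) to give the class output. log_softmax is applied to it, and the error is computed against the ground-truth labels with a softmax-based loss.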
class get_model(nn.Module):
    def __init__(self,num_class,normal_channel=True):
        super(get_model, self).__init__()
        in_channel = 3 if normal_channel else 0
        self.normal_channel = normal_channel
        self.sa1 = PointNetSetAbstractionMsg(512, [0.1, 0.2, 0.4], [16, 32, 128], in_channel,[[32, 32, 64], [64, 64, 128], [64, 96, 128]])
        self.sa2 = PointNetSetAbstractionMsg(128, [0.2, 0.4, 0.8], [32, 64, 128], 320,[[64, 64, 128], [128, 128, 256], [128, 128, 256]])
        self.sa3 = PointNetSetAbstraction(None, None, None, 640 + 3, [256, 512, 1024], True)
        self.fc1 = nn.Linear(1024, 512)
        self.bn1 = nn.BatchNorm1d(512)
        self.drop1 = nn.Dropout(0.4)
        self.fc2 = nn.Linear(512, 256)
        self.bn2 = nn.BatchNorm1d(256)
        self.drop2 = nn.Dropout(0.5)
        self.fc3 = nn.Linear(256, num_class)

    def forward(self, xyz):
        B, _, _ = xyz.shape
        if self.normal_channel:
            norm = xyz[:, 3:, :]
            xyz = xyz[:, :3, :]
        else:
            norm = None
        l1_xyz, l1_points = self.sa1(xyz, norm)
        l2_xyz, l2_points = self.sa2(l1_xyz, l1_points)
        l3_xyz, l3_points = self.sa3(l2_xyz, l2_points)
        x = l3_points.view(B, 1024)
        x = self.drop1(F.relu(self.bn1(self.fc1(x))))
        x = self.drop2(F.relu(self.bn2(self.fc2(x))))
        x = self.fc3(x)
        x = F.log_softmax(x, -1)

        return x,l3_points
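Because the model returns log_softmax values, a natural loss to pair with it is the negative log-likelihood, which together amounts to softmax cross-entropy. The sketch below shows that pairing in pure Python for a single sample. It assumes the training script uses such a loss, which the excerpt above does not show; the function names and the toy logits are hypothetical.

```python
import math

def log_softmax_py(logits):
    """Numerically stable log-softmax of one row of class scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]

def nll_loss_py(log_probs, target):
    """Negative log-likelihood of the true class."""
    return -log_probs[target]

logits = [2.0, 0.5, -1.0]  # hypothetical fc3 output for a 3-class toy case
log_probs = log_softmax_py(logits)
loss = nll_loss_py(log_probs, target=0)
prob_sum = sum(math.exp(v) for v in log_probs)  # probabilities should sum to 1
```

The predicted class is the index of the largest log-probability, and the loss shrinks toward zero as the probability of the true class approaches 1.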