MVX-Net 3D Algorithm Notes
These are personal study notes, kept to organize my thinking and for later reference. If you spot any errors, corrections are welcome. References: paper: code:
Abstract:
PointFusion and VoxelFusion are used to fuse camera and point-cloud data at an early stage. On the KITTI dataset, the method ranks in the top 2 in five of the bird's-eye-view and 3D detection categories.
I. INTRODUCTION
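There are currently two common approaches to 3D detection:
(1) Convert the 3D point cloud into hand-crafted features such as a BEV map, then detect and classify with a 2D CNN. This approach suffers from quantization: when an object is small and has few points on it, performance drops sharply.
(2) Process the 3D point cloud directly with a 3D CNN. This approach needs too much memory and hits a computational bottleneck.
VoxelNet greatly improved the efficiency of point-cloud processing.
This paper extends VoxelNet to multiple modalities by fusing the point cloud with semantic image features at an early stage. Two fusion methods are proposed:
(1) PointFusion: a 2D feature extractor computes image features, the raw point cloud is projected onto the image, and the image features at the locations hit by points are gathered. After their dimensions are matched, they are fused with the point features (by element-wise addition in the code analyzed later in these notes), and the result is passed to VoxelNet.
(2) VoxelFusion: VoxelNet generates 3D voxels, which are projected onto the image; features for each projected voxel are then extracted with a pre-trained CNN. Compared with PointFusion, VoxelFusion is a relatively later fusion technique.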
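The projection step behind PointFusion can be sketched in a few lines. This is a minimal illustration, not the paper's or mmdetection3d's code: the 3x4 matrix `PROJ`, the helper names, and the nearest-neighbour lookup are all assumptions made for the example (mmdetection3d samples the feature maps differently), and the points are assumed to be in the camera frame already.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def project_point(point: Point,
                  proj: Sequence[Sequence[float]]) -> Optional[Tuple[float, float]]:
    """Project a 3D point with a 3x4 matrix; return None if it is behind the camera."""
    homo = (point[0], point[1], point[2], 1.0)
    u, v, w = (sum(p * h for p, h in zip(row, homo)) for row in proj)
    if w <= 0:
        return None
    return u / w, v / w


def sample_image_feature(feat_map: List[List[List[float]]],
                         uv: Tuple[float, float],
                         stride: int) -> List[float]:
    """Nearest-neighbour lookup in a feature map indexed as feat_map[row][col]."""
    col = int(uv[0] // stride)
    row = int(uv[1] // stride)
    if 0 <= row < len(feat_map) and 0 <= col < len(feat_map[0]):
        return feat_map[row][col]
    # Points that fall outside the image get a zero feature.
    return [0.0] * len(feat_map[0][0])


# Hypothetical pinhole camera: focal length 100, principal point (50, 50).
PROJ = [[100.0, 0.0, 50.0, 0.0],
        [0.0, 100.0, 50.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
# Toy 10x10 feature map whose feature at (row, col) is [row, col].
FEAT_MAP = [[[float(r), float(c)] for c in range(10)] for r in range(10)]

uv = project_point((1.0, 0.5, 10.0), PROJ)
feature = sample_image_feature(FEAT_MAP, uv, stride=10)
```

Each LiDAR point ends up with one image feature vector, which is what makes the fusion point-wise.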
II. RELATED WORK
III. PROPOSED METHOD
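Only one of PointFusion or VoxelFusion is used at a time.
PointFusion: the raw point cloud is projected onto the image, and the image goes through a pre-trained 2D feature extractor whose features are gathered at the projected locations.
VoxelFusion: after voxelization, the non-empty voxels are projected onto the image, and features for them are extracted with the 2D feature extractor.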
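A. 2D Detection Network
Features are extracted with the Faster R-CNN framework, using a VGG16 backbone.
B. VoxelNet
Consists of VFE layers, convolutional middle layers, and a 3D RPN.
The VFE encodes the raw points at the level of individual voxels, using fully connected layers. See the separate notes on point-cloud processing for details.
C. Multimodal Fusion
PointFusion: see the code analysis below.
VoxelFusion: non-empty voxels are projected onto the image to produce 2D ROIs, which are then ROI-pooled. Compared with PointFusion, it needs less memory and runs faster. It also extends easily by projecting all voxels, which makes fuller use of the image features and helps avoid missing objects that the point cloud does not cover. (Unfortunately, there is no code implementation of this method yet, possibly because it scores lower in the paper.)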
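D. Training Details
In VoxelFusion, projecting all voxels onto the image handles distant objects better.
The experiments also show that using the raw image directly at the projected locations works worse than using features extracted by the CNN.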
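Since VoxelFusion has no reference code here, the voxel-to-ROI step can only be sketched. The version below is an assumption-laden illustration: it projects the eight corners of one voxel and takes their bounding box as the 2D ROI. The names `voxel_corners`, `voxel_to_roi`, and `PROJ` are hypothetical, and the voxel grid is assumed to be expressed in the camera frame already.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple


def voxel_corners(index: Sequence[int],
                  voxel_size: Sequence[float],
                  range_min: Sequence[float]) -> List[Tuple[float, ...]]:
    """Return the 8 corners of the voxel at integer index (ix, iy, iz)."""
    corners = []
    for offsets in product((0, 1), repeat=3):
        corners.append(tuple(
            lo + (i + d) * s
            for lo, i, d, s in zip(range_min, index, offsets, voxel_size)))
    return corners


def voxel_to_roi(index: Sequence[int],
                 voxel_size: Sequence[float],
                 range_min: Sequence[float],
                 proj: Sequence[Sequence[float]]
                 ) -> Optional[Tuple[float, float, float, float]]:
    """Project a voxel and return its 2D box (u_min, v_min, u_max, v_max)."""
    us, vs = [], []
    for x, y, z in voxel_corners(index, voxel_size, range_min):
        homo = (x, y, z, 1.0)
        u, v, w = (sum(p * h for p, h in zip(row, homo)) for row in proj)
        if w > 0:  # keep only corners in front of the camera
            us.append(u / w)
            vs.append(v / w)
    if not us:
        return None
    return min(us), min(vs), max(us), max(vs)


# Hypothetical pinhole camera: focal length 100, principal point (50, 50).
PROJ = [[100.0, 0.0, 50.0, 0.0],
        [0.0, 100.0, 50.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]

roi = voxel_to_roi((0, 0, 0), (1.0, 1.0, 1.0), (0.0, 0.0, 10.0), PROJ)
```

The resulting box is what an ROI-pooling layer would consume. Because the ROI depends only on the voxel grid, empty voxels can be projected the same way, which is the extension the notes mention.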
Code analysis: based on the mmdetection3d framework, PointFusion method.
The overall structure of the model code is as follows:
def forward(self,
            inputs: Union[dict, List[dict]],
            data_samples: OptSampleList = None,
            mode: str = 'tensor',
            **kwargs) -> ForwardResults:
    """The unified entry for a forward process in both training and test.

    The method should accept three modes: "tensor", "predict" and "loss":

    - "tensor": Forward the whole network and return tensor or tuple of
      tensor without any post-processing, same as a common nn.Module.
    - "predict": Forward and return the predictions, which are fully
      processed to a list of :obj:`Det3DDataSample`.
    - "loss": Forward and return a dict of losses according to the given
      inputs and data samples.

    Note that this method handles neither back propagation nor
    optimizer updating, which are done in the :meth:`train_step`.

    Args:
        inputs (dict | list[dict]): When it is a list[dict], the
            outer list indicate the test time augmentation. Each
            dict contains batch inputs
            which include 'points' and 'imgs' keys.

            - points (list[torch.Tensor]): Point cloud of each sample.
            - imgs (torch.Tensor): Image tensor has shape (B, C, H, W).
        data_samples (list[:obj:`Det3DDataSample`],
            list[list[:obj:`Det3DDataSample`]], optional): The
            annotation data of every samples. When it is a list[list], the
            outer list indicate the test time augmentation, and the
            inter list indicate the batch. Otherwise, the list simply
            indicate the batch. Defaults to None.
        mode (str): Return what kind of value. Defaults to 'tensor'.

    Returns:
        The return type depends on ``mode``.

        - If ``mode="tensor"``, return a tensor or a tuple of tensor.
        - If ``mode="predict"``, return a list of :obj:`Det3DDataSample`.
        - If ``mode="loss"``, return a dict of tensor.
    """
    if mode == 'loss':
        return self.loss(inputs, data_samples, **kwargs)
    elif mode == 'predict':
        if isinstance(data_samples[0], list):
            # aug test
            assert len(data_samples[0]) == 1, (
                'Only support batch_size 1 in mmdet3d when doing '
                'test-time augmentation.')
            return self.aug_test(inputs, data_samples, **kwargs)
        else:
            return self.predict(inputs, data_samples, **kwargs)
    elif mode == 'tensor':
        return self._forward(inputs, data_samples, **kwargs)
    else:
        raise RuntimeError(f'Invalid mode "{mode}". '
                           'Only supports loss, predict and tensor mode')
There are two modes, training and inference. The first step is the same in both: feature extraction, which covers image features and point-cloud features. Taking inference as the example:
def predict(self, batch_inputs_dict: Dict[str, Optional[Tensor]],
            batch_data_samples: List[Det3DDataSample],
            **kwargs) -> List[Det3DDataSample]:
    """Forward of testing.

    Args:
        batch_inputs_dict (dict): The model input dict which include
            'points' keys.

            - points (list[torch.Tensor]): Point cloud of each sample.
        batch_data_samples (List[:obj:`Det3DDataSample`]): The Data
            Samples. It usually includes information such as
            `gt_instance_3d`.

    Returns:
        list[:obj:`Det3DDataSample`]: Detection results of the
        input sample. Each Det3DDataSample usually contain
        'pred_instances_3d'. And the ``pred_instances_3d`` usually
        contains following keys.

        - scores_3d (Tensor): Classification scores, has a shape
          (num_instances, )
        - labels_3d (Tensor): Labels of bboxes, has a shape
          (num_instances, ).
        - bbox_3d (:obj:`BaseInstance3DBoxes`): Prediction of bboxes,
          contains a tensor with shape (num_instances, 7).
    """
    batch_input_metas = [item.metainfo for item in batch_data_samples]
    img_feats, pts_feats = self.extract_feat(batch_inputs_dict,
                                             batch_input_metas)
    if pts_feats and self.with_pts_bbox:
        results_list_3d = self.pts_bbox_head.predict(
            pts_feats, batch_data_samples, **kwargs)
    else:
        results_list_3d = None
    if img_feats and self.with_img_bbox:
        # TODO check this for camera modality
        results_list_2d = self.predict_imgs(img_feats, batch_data_samples,
                                            **kwargs)
    else:
        results_list_2d = None
    detsamples = self.add_pred_to_datasample(batch_data_samples,
                                             results_list_3d,
                                             results_list_2d)
    return detsamples
The features come from the call img_feats, pts_feats = self.extract_feat(batch_inputs_dict, batch_input_metas). Its two branches are described in turn below.
First, the image feature extraction module. It follows a Faster R-CNN structure: ResNet-50 extracts the features and an FPN serves as the neck. It is called as img_feats = self.extract_img_feat(imgs, batch_input_metas)
def extract_img_feat(self, img: Tensor, input_metas: List[dict]) -> dict:
    """Extract features of images."""
    if self.with_img_backbone and img is not None:
        input_shape = img.shape[-2:]
        # update real input shape of each single img
        for img_meta in input_metas:
            img_meta.update(input_shape=input_shape)
        if img.dim() == 5 and img.size(0) == 1:
            img.squeeze_()
        elif img.dim() == 5 and img.size(0) > 1:
            B, N, C, H, W = img.size()
            img = img.view(B * N, C, H, W)
        img_feats = self.img_backbone(img)  # the backbone is ResNet-50
    else:
        return None
    if self.with_img_neck:
        img_feats = self.img_neck(img_feats)  # the neck is an FPN
    return img_feats

# Observed output shapes:
# img_feats[0].shape: ([1, 256, 176, 232])
# img_feats[1].shape: ([1, 256, 88, 116])
# img_feats[2].shape: ([1, 256, 44, 58])
# img_feats[3].shape: ([1, 256, 22, 29])
# img_feats[4].shape: ([1, 256, 11, 15])
Next, the module that extracts point-cloud features and fuses them with the image features. It is called as pts_feats = self.extract_pts_feat(voxel_dict, points=points, img_feats=img_feats, batch_input_metas=batch_input_metas)
def extract_pts_feat(
        self,
        voxel_dict: Dict[str, Tensor],
        points: Optional[List[Tensor]] = None,
        img_feats: Optional[Sequence[Tensor]] = None,
        batch_input_metas: Optional[List[dict]] = None
) -> Sequence[Tensor]:
    """Extract features of points.

    Args:
        voxel_dict(Dict[str, Tensor]): Dict of voxelization infos.
        points (List[tensor], optional): Point cloud of multiple inputs.
        img_feats (list[Tensor], tuple[tensor], optional): Features from
            image backbone.
        batch_input_metas (list[dict], optional): The meta information
            of multiple samples. Defaults to None.

    Returns:
        Sequence[tensor]: points features of multiple inputs
        from backbone or neck.
    """
    if not self.with_pts_bbox:
        return None
    # See class DynamicVFE: point feature processing and fusion.
    # Output shapes: torch.Size([11986, 128]) and torch.Size([11986, 4])
    voxel_features, feature_coors = self.pts_voxel_encoder(
        voxel_dict['voxels'], voxel_dict['coors'], points, img_feats,
        batch_input_metas)
    batch_size = voxel_dict['coors'][-1, 0] + 1
    # torch.Size([1, 256, 200, 150])
    x = self.pts_middle_encoder(voxel_features, feature_coors, batch_size)
    # 2x5 2D conv layers; outputs two feature maps,
    # torch.Size([1, 128, 200, 150]) and torch.Size([1, 256, 100, 75])
    x = self.pts_backbone(x)
    if self.with_pts_neck:
        # Deconvolutions align the two feature maps to
        # torch.Size([1, 256, 200, 150]); the final concat gives
        # torch.Size([1, 512, 200, 150])
        x = self.pts_neck(x)
    return x
The point feature processing and fusion module is called as self.pts_voxel_encoder(voxel_dict['voxels'], voxel_dict['coors'], points, img_feats, batch_input_metas)
See class DynamicVFE:
def forward(self,
            features: Tensor,
            coors: Tensor,
            points: Optional[Sequence[Tensor]] = None,
            img_feats: Optional[Sequence[Tensor]] = None,
            img_metas: Optional[dict] = None,
            *args,
            **kwargs) -> tuple:
    """Forward functions.

    Called as:
        self.pts_voxel_encoder(
            voxel_dict['voxels'], voxel_dict['coors'], points, img_feats,
            batch_input_metas)

    Args:
        features (torch.Tensor): Features of voxels, shape is NxC.
        coors (torch.Tensor): Coordinates of voxels, shape is Nx(1+NDim).
        points (list[torch.Tensor], optional): Raw points used to guide the
            multi-modality fusion. Defaults to None.
        img_feats (list[torch.Tensor], optional): Image features used for
            multi-modality fusion. Defaults to None.
        img_metas (dict, optional): [description]. Defaults to None.

    Returns:
        tuple: If `return_point_feats` is False, returns voxel features and
            its coordinates. If `return_point_feats` is True, returns
            feature of each points inside voxels.
    """
    features_ls = [features]  # features is just points
    # Find distance of x, y, and z from cluster center
    if self._with_cluster_center:  # True
        # voxel_mean: torch.Size([11986, 4])
        voxel_mean, mean_coors = self.cluster_scatter(features, coors)
        points_mean = self.map_voxel_center_to_point(
            coors, voxel_mean, mean_coors)
        # TODO: maybe also do cluster for reflectivity
        f_cluster = features[:, :3] - points_mean[:, :3]
        # append the offsets from the cluster (mean) center
        features_ls.append(f_cluster)
    # Find distance of x, y, and z from pillar center
    if self._with_voxel_center:
        f_center = features.new_zeros(size=(features.size(0), 3))
        f_center[:, 0] = features[:, 0] - (
            coors[:, 3].type_as(features) * self.vx + self.x_offset)
        f_center[:, 1] = features[:, 1] - (
            coors[:, 2].type_as(features) * self.vy + self.y_offset)
        f_center[:, 2] = features[:, 2] - (
            coors[:, 1].type_as(features) * self.vz + self.z_offset)
        # append the offsets from the pillar (voxel) center
        features_ls.append(f_center)
    if self._with_distance:
        points_dist = torch.norm(features[:, :3], 2, 1, keepdim=True)
        features_ls.append(points_dist)
    # Combine together feature decorations
    features = torch.cat(features_ls, dim=-1)  # torch.Size([23878, 10])
    for i, vfe in enumerate(self.vfe_layers):
        # fully connected + ReLU; the input to the fusion layer is
        # torch.Size([23878, 64])
        point_feats = vfe(features)
        if (i == len(self.vfe_layers) - 1 and self.fusion_layer is not None
                and img_feats is not None):
            # fusion: torch.Size([23878, 128])
            point_feats = self.fusion_layer(img_feats, points, point_feats,
                                            img_metas)
        # scatter the point features into voxels
        voxel_feats, voxel_coors = self.vfe_scatter(point_feats, coors)
        if i != len(self.vfe_layers) - 1:
            # need to concat voxel feats if it is not the last vfe
            feat_per_point = self.map_voxel_center_to_point(
                coors, voxel_feats, voxel_coors)
            features = torch.cat([point_feats, feat_per_point], dim=1)
    if self.return_point_feats:
        return point_feats
    return voxel_feats, voxel_coors
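The feature decoration at the start of this forward pass (raw point, offset from the cluster mean, offset from the voxel center) can be reproduced without PyTorch. The sketch below is a plain-Python illustration, not the mmdetection3d implementation. The function name `decorate_points` is hypothetical, and the voxel center is assumed to be `range_min + (index + 0.5) * voxel_size`, mirroring the `coors * vx + x_offset` expression in the listing.

```python
from typing import Dict, List, Sequence, Tuple


def decorate_points(points: Sequence[Sequence[float]],
                    voxel_size: Sequence[float],
                    range_min: Sequence[float]) -> List[List[float]]:
    """Turn (x, y, z, reflectance) points into 10-dim decorated features."""
    # Group point indices by the voxel they fall into.
    groups: Dict[Tuple[int, ...], List[int]] = {}
    for idx, p in enumerate(points):
        key = tuple(int((p[a] - range_min[a]) // voxel_size[a])
                    for a in range(3))
        groups.setdefault(key, []).append(idx)

    out: List[List[float]] = [[] for _ in points]
    for key, members in groups.items():
        # Mean of the points in this voxel (the "cluster center").
        mean = [sum(points[i][a] for i in members) / len(members)
                for a in range(3)]
        # Geometric center of the voxel.
        center = [range_min[a] + (key[a] + 0.5) * voxel_size[a]
                  for a in range(3)]
        for i in members:
            p = points[i]
            f_cluster = [p[a] - mean[a] for a in range(3)]
            f_center = [p[a] - center[a] for a in range(3)]
            out[i] = list(p) + f_cluster + f_center
    return out


points = [(0.2, 0.2, 0.2, 0.5), (0.6, 0.2, 0.2, 0.1), (1.5, 0.5, 0.5, 0.9)]
feats = decorate_points(points, (1.0, 1.0, 1.0), (0.0, 0.0, 0.0))
```

With 4 raw values, 3 cluster offsets, and 3 center offsets, each point carries 10 features, which matches the `[23878, 10]` shape noted in the listing.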
The fusion layer is called as point_feats = self.fusion_layer(img_feats, points, point_feats, img_metas), and fusion starts only at the last VFE layer.
See class PointFusion:
def forward(self, img_feats: List[Tensor], pts: List[Tensor],
            pts_feats: Tensor, img_metas: List[dict]) -> Tensor:
    """Forward function.

    Args:
        img_feats (List[Tensor]): Image features.
        pts: (List[Tensor]): A batch of points with shape N x 3.
        pts_feats (Tensor): A tensor consist of point features of the
            total batch.
        img_metas (List[dict]): Meta information of images.

    Returns:
        Tensor: Fused features of each point.
    """
    # pts_feats.shape = torch.Size([23878, 64])
    # Use each point's projected image coordinates to sample the feature
    # map of every level, giving one image feature per point (N in total).
    # This happens at the point level.
    img_pts = self.obtain_mlvl_feats(img_feats, pts, img_metas)  # torch.Size([23878, 640])
    img_pre_fuse = self.img_transform(img_pts)  # FC + BN, torch.Size([23878, 128])
    if self.training and self.dropout_ratio > 0:
        img_pre_fuse = F.dropout(img_pre_fuse, self.dropout_ratio)
    pts_pre_fuse = self.pts_transform(pts_feats)  # FC + BN, torch.Size([23878, 128])
    fuse_out = img_pre_fuse + pts_pre_fuse  # fuse by element-wise addition
    if self.activate_out:
        fuse_out = F.relu(fuse_out)
    if self.fuse_out:  # False in this configuration
        fuse_out = self.fuse_conv(fuse_out)
    return fuse_out  # torch.Size([23878, 128])
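The core of the fusion is small: project both modalities to a common width, add, and activate. The following is a minimal per-point sketch in plain Python. It leaves out batch normalization and dropout, and the helper names and weights are made up for illustration rather than taken from mmdetection3d.

```python
from typing import List, Sequence


def linear(x: Sequence[float],
           weight: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def fuse_point(img_feat: Sequence[float], pts_feat: Sequence[float],
               img_w: Sequence[Sequence[float]], img_b: Sequence[float],
               pts_w: Sequence[Sequence[float]], pts_b: Sequence[float],
               activate: bool = True) -> List[float]:
    """Map both features to a common width, add them, optionally apply ReLU."""
    img_pre = linear(img_feat, img_w, img_b)
    pts_pre = linear(pts_feat, pts_w, pts_b)
    fused = [a + b for a, b in zip(img_pre, pts_pre)]
    if activate:
        fused = [max(0.0, v) for v in fused]
    return fused


# Toy example: a 3-dim image feature and a 2-dim point feature, both mapped to 2 dims.
fused = fuse_point(
    img_feat=[1.0, 2.0, 3.0], pts_feat=[4.0, -5.0],
    img_w=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], img_b=[0.0, 0.0],
    pts_w=[[1.0, 0.0], [0.0, 1.0]], pts_b=[0.0, 0.0])
```

Addition requires both branches to share the same output width, which is why the listing maps the 640-dim image feature and the 64-dim point feature to 128 dims each before fusing.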
The fused features are then fed to the sparse convolution, called as x = self.pts_middle_encoder(voxel_features, feature_coors, batch_size)
See class SparseEncoder:
def forward(self, voxel_features: Tensor, coors: Tensor,
            batch_size: int) -> Union[Tensor, Tuple[Tensor, list]]:
    """Forward of SparseEncoder.

    Args:
        voxel_features (torch.Tensor): Voxel features in shape (N, C).
        coors (torch.Tensor): Coordinates in shape (N, 4),
            the columns in the order of (batch_idx, z_idx, y_idx, x_idx).
        batch_size (int): Batch size.

    Returns:
        torch.Tensor | tuple[torch.Tensor, list]: Return spatial features
            include:

        - spatial_features (torch.Tensor): Spatial features are out from
          the last layer.
        - encode_features (List[SparseConvTensor], optional): Middle layer
          output features. When self.return_middle_feats is True, the
          module returns middle features.
    """
    # voxel_features.shape torch.Size([11986, 128])
    # coors.shape torch.Size([11986, 4])
    coors = coors.int()
    # Build the sparse tensor from the voxel features, voxel coordinates,
    # spatial shape, and batch size.
    input_sp_tensor = SparseConvTensor(voxel_features, coors,
                                       self.sparse_shape, batch_size)
    x = self.conv_input(input_sp_tensor)  # submanifold sparse conv + BN + ReLU
    encode_features = []
    for encoder_layer in self.encoder_layers:
        x = encoder_layer(x)
        encode_features.append(x)
    # for detection head
    # [200, 176, 5] -> [200, 176, 2]
    out = self.conv_out(encode_features[-1])
    spatial_features = out.dense()  # torch.Size([1, 128, 2, 200, 150])
    N, C, D, H, W = spatial_features.shape
    spatial_features = spatial_features.view(N, C * D, H, W)  # torch.Size([1, 256, 200, 150])
    if self.return_middle_feats:
        return spatial_features, encode_features
    else:
        return spatial_features  # torch.Size([1, 256, 200, 150])
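The last two steps of this encoder, densifying the sparse tensor and folding the depth axis into the channel axis, are what turn a 3D volume into a BEV feature map. The sketch below shows only that step in plain Python (the sparse convolutions themselves are skipped). The function name `scatter_to_bev` is hypothetical, and the indexing `c * D + z` is what a contiguous `view(N, C * D, H, W)` of an `(N, C, D, H, W)` tensor implies.

```python
from typing import List, Sequence, Tuple


def scatter_to_bev(voxel_features: Sequence[Sequence[float]],
                   coors: Sequence[Tuple[int, int, int, int]],
                   grid_shape: Tuple[int, int, int],
                   batch_size: int) -> List[List[List[List[float]]]]:
    """Scatter sparse voxels into a dense grid laid out as [B][C*D][H][W].

    Each row of ``coors`` is (batch_idx, z_idx, y_idx, x_idx).
    """
    depth, height, width = grid_shape
    channels = len(voxel_features[0])
    bev = [[[[0.0] * width for _ in range(height)]
            for _ in range(channels * depth)]
           for _ in range(batch_size)]
    for feat, (b, z, y, x) in zip(voxel_features, coors):
        for c, value in enumerate(feat):
            # Channel c at depth z lands at index c * depth + z.
            bev[b][c * depth + z][y][x] = value
    return bev


# One voxel with a 2-channel feature in a (D=2, H=4, W=5) grid.
bev = scatter_to_bev([[1.0, 2.0]], [(0, 1, 2, 3)], (2, 4, 5), batch_size=1)
```

In the listing, 128 channels at depth 2 give the 256-channel map of size 200 x 150 that the 2D backbone consumes.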
The fused features produced by the sparse convolution are then fed to the SECOND network, called as x = self.pts_backbone(x)
See class SECOND:
def forward(self, x: Tensor) -> Tuple[Tensor, ...]:
    """Forward function.

    Args:
        x (torch.Tensor): Input with shape (N, C, H, W).

    Returns:
        tuple[torch.Tensor]: Multi-scale features.
    """
    outs = []
    for i in range(len(self.blocks)):
        x = self.blocks[i](x)
        outs.append(x)
    # 2x5 2D conv layers; outputs two feature maps,
    # torch.Size([1, 128, 200, 150]) and torch.Size([1, 256, 100, 75])
    return tuple(outs)
The result then goes to the SECONDFPN network, called as if self.with_pts_neck: x = self.pts_neck(x)
See class SECONDFPN:
def forward(self, x):
    """Forward function.

    Args:
        x (List[torch.Tensor]): Multi-level features with 4D Tensor in
            (N, C, H, W) shape.

    Returns:
        list[torch.Tensor]: Multi-level feature maps.
    """
    assert len(x) == len(self.in_channels)
    # Deconvolutions bring both feature maps to the same resolution,
    # torch.Size([1, 256, 200, 150])
    ups = [deblock(x[i]) for i, deblock in enumerate(self.deblocks)]
    if len(ups) > 1:
        out = torch.cat(ups, dim=1)
    else:
        out = ups[0]
    return [out]  # torch.Size([1, 512, 200, 150])
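The shape bookkeeping of this neck is easy to check by hand. The sketch below only tracks shapes and runs no convolutions. The function name, the upsample strides `(1, 2)`, and the output channels `(256, 256)` are assumptions chosen to reproduce the shapes recorded in these notes, not values read from a config.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, int, int, int]  # (N, C, H, W)


def secondfpn_output_shape(input_shapes: Sequence[Shape],
                           upsample_strides: Sequence[int],
                           out_channels: Sequence[int]) -> Shape:
    """Upsample each level by its stride, then concatenate along channels."""
    ups = [(n, oc, h * stride, w * stride)
           for (n, _c, h, w), stride, oc
           in zip(input_shapes, upsample_strides, out_channels)]
    spatial = {(n, h, w) for n, _c, h, w in ups}
    if len(spatial) != 1:
        raise ValueError('levels must share one resolution before concat')
    n, h, w = spatial.pop()
    return n, sum(c for _n, c, _h, _w in ups), h, w


shape = secondfpn_output_shape(
    [(1, 128, 200, 150), (1, 256, 100, 75)],
    upsample_strides=(1, 2),
    out_channels=(256, 256))
```

Two 256-channel maps at 200 x 150 concatenate to 512 channels, which matches the final shape in the listing.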
At this point, image feature extraction, point-cloud feature extraction, and the fusion of the two are complete, yielding the two outputs img_feats and pts_feats. Their shapes are as follows:
img_feats, pts_feats = self.extract_feat(batch_inputs_dict, batch_input_metas)
# img_feats[0].shape: ([1, 256, 176, 232])
# img_feats[1].shape: ([1, 256, 88, 116])
# img_feats[2].shape: ([1, 256, 44, 58])
# img_feats[3].shape: ([1, 256, 22, 29])
# img_feats[4].shape: ([1, 256, 11, 15])
# pts_feats[0].shape: torch.Size([1, 512, 200, 150])
During forward inference, the predict method listed earlier is the one being executed. Once extract_feat has returned img_feats and pts_feats, predict passes the features on to the detection heads.
The point-cloud features enter the pts_bbox head through: if pts_feats and self.with_pts_bbox: results_list_3d = self.pts_bbox_head.predict(pts_feats, batch_data_samples, **kwargs)
See class Anchor3DHead:
def predict(self,
            x: Tuple[Tensor],
            batch_data_samples: SampleList,
            rescale: bool = False) -> InstanceList:
    """Perform forward propagation of the 3D detection head and predict
    detection results on the features of the upstream network.

    Args:
        x (tuple[Tensor]): Multi-level features from the
            upstream network, each is a 4D-tensor.
        batch_data_samples (List[:obj:`Det3DDataSample`]): The Data
            Samples. It usually includes information such as
            `gt_instance_3d`, `gt_pts_panoptic_seg` and
            `gt_pts_sem_seg`.
        rescale (bool, optional): Whether to rescale the results.
            Defaults to False.

    Returns:
        list[:obj:`InstanceData`]: Detection results of each sample
        after the post process.
        Each item usually contains following keys.

        - scores_3d (Tensor): Classification scores, has a shape
          (num_instances, )
        - labels_3d (Tensor): Labels of bboxes, has a shape
          (num_instances, ).
        - bboxes_3d (BaseInstance3DBoxes): Prediction of bboxes,
          contains a tensor with shape (num_instances, C), where
          C >= 7.
    """
    batch_input_metas = [
        data_samples.metainfo for data_samples in batch_data_samples
    ]
    # self(x) returns multi_apply(self.forward_single, x), which in turn
    # returns tuple(map(list, zip(*map_results))).
    # The result is ([cls_score], [bbox_pred], [dir_cls_pred]).
    outs = self(x)
    # rescale = False. A lot of post-processing happens here, including
    # anchor generation; it needs a closer look later.
    predictions = self.predict_by_feat(
        *outs, batch_input_metas=batch_input_metas, rescale=rescale)
    return predictions
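The `multi_apply` pattern mentioned in the comment is worth seeing in isolation: it maps a per-level function over the feature levels and then transposes the results, so that each output type becomes its own list. The sketch below follows the expression quoted in the notes. `forward_single` here is a hypothetical stand-in that returns strings instead of tensors.

```python
from functools import partial
from typing import Any, Callable, List, Tuple


def multi_apply(func: Callable[..., tuple], *args: Any,
                **kwargs: Any) -> Tuple[List[Any], ...]:
    """Apply ``func`` to each set of inputs and regroup the outputs by kind."""
    pfunc = partial(func, **kwargs) if kwargs else func
    map_results = map(pfunc, *args)
    # zip(*...) transposes [(a0, b0), (a1, b1)] into [(a0, a1), (b0, b1)].
    return tuple(map(list, zip(*map_results)))


def forward_single(feat: str) -> Tuple[str, str, str]:
    """Hypothetical per-level head returning three kinds of predictions."""
    return f'cls({feat})', f'bbox({feat})', f'dir({feat})'


single_level = multi_apply(forward_single, ['p0'])
two_levels = multi_apply(forward_single, ['p0', 'p1'])
```

With the single 512-channel feature map produced by the neck, each of the three lists holds exactly one entry, which is the `([cls_score], [bbox_pred], [dir_cls_pred])` structure noted in the listing.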
Image features entering an image head: the source code has no image head.
Finally, the results are assembled by the call detsamples = self.add_pred_to_datasample(batch_data_samples, results_list_3d, results_list_2d)
def add_pred_to_datasample(
    self,
    data_samples: SampleList,
    data_instances_3d: OptInstanceList = None,
    data_instances_2d: OptInstanceList = None,
) -> SampleList:
    """Convert results list to `Det3DDataSample`.

    Subclasses could override it to be compatible for some multi-modality
    3D detectors.

    Args:
        data_samples (list[:obj:`Det3DDataSample`]): The input data.
        data_instances_3d (list[:obj:`InstanceData`], optional): 3D
            Detection results of each sample.
        data_instances_2d (list[:obj:`InstanceData`], optional): 2D
            Detection results of each sample.

    Returns:
        list[:obj:`Det3DDataSample`]: Detection results of the
        input. Each Det3DDataSample usually contains
        'pred_instances_3d'. And the ``pred_instances_3d`` normally
        contains following keys.

        - scores_3d (Tensor): Classification scores, has a shape
          (num_instance, )
        - labels_3d (Tensor): Labels of 3D bboxes, has a shape
          (num_instances, ).
        - bboxes_3d (Tensor): Contains a tensor with shape
          (num_instances, C) where C >=7.

        When there are image prediction in some models, it should
        contains `pred_instances`, And the ``pred_instances`` normally
        contains following keys.

        - scores (Tensor): Classification scores of image, has a shape
          (num_instance, )
        - labels (Tensor): Predict Labels of 2D bboxes, has a shape
          (num_instances, ).
        - bboxes (Tensor): Contains a tensor with shape
          (num_instances, 4).
    """
    assert (data_instances_2d is not None) or \
           (data_instances_3d is not None), \
           'please pass at least one type of data_samples'
    if data_instances_2d is None:  # filled with empty InstanceData objects
        data_instances_2d = [
            InstanceData() for _ in range(len(data_instances_3d))
        ]
    if data_instances_3d is None:
        data_instances_3d = [
            InstanceData() for _ in range(len(data_instances_2d))
        ]
    for i, data_sample in enumerate(data_samples):
        data_sample.pred_instances_3d = data_instances_3d[i]
        data_sample.pred_instances = data_instances_2d[i]
    return data_samples