First, in train.py, pass the log-related parameters into the dataset config:

```python
cfg.data.train['log_level'] = cfg.log_level
cfg.data.train['log_file'] = log_file
```

Then import the logging utilities:

```python
from mmcv.utils import print_log
from mmdet.utils import get_root_logger
```

There are two ways to log, as sketched below.
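A minimal sketch of the two approaches, using the standard mmcv/mmdet logging utilities (the log file path and message are made up for illustration):

```python
import logging

from mmcv.utils import print_log
from mmdet.utils import get_root_logger

# Method 1: fetch the root logger once and call it directly.
logger = get_root_logger(log_file='work_dirs/train.log', log_level=logging.INFO)
logger.info('Distributed training: False')

# Method 2: print_log routes the message through a logger
# (with logger=None it falls back to a plain print).
print_log('Distributed training: False', logger=logger)
```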
These parameters are consumed in mmdet/datasets/builder.py:

```python
if dist:
    # When model is :obj:`DistributedDataParallel`,
    # `batch_size` of :obj:`dataloader` is the
    # number of training samples on each GPU.
    batch_size = samples_per_gpu
    num_workers = workers_per_gpu
else:
    # When model is :obj:`DataParallel`
    # the batch size is samples on all the GPUs
    batch_size = num_gpus * samples_per_gpu
    num_workers = num_gpus * workers_per_gpu
```
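For example, with `samples_per_gpu=2` and `workers_per_gpu=2` on 8 GPUs, distributed training gives each process a dataloader with `batch_size=2` (an effective global batch of 16), while non-distributed `DataParallel` builds a single dataloader with `batch_size=16` and `num_workers=16`.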
This check comes from `MultiImageMixDataset` in mmdet/datasets/dataset_wrappers.py:

```python
        if results is None:
            raise RuntimeError(
                'The training pipeline of the dataset wrapper'
                ' always return None. Please check the correctness '
                'of the dataset and its pipeline.')

        if 'mix_results' in results:
            results.pop('mix_results')

        return results

    def update_skip_type_keys(self, skip_type_keys):
        """Update skip_type_keys. ...
```
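For context, a trimmed config sketch in the style of the YOLOX configs showing how the wrapper is used; the dataset paths are placeholders:

```python
train_pipeline = [
    # Mix-type transforms ask the wrapper for extra images, which it
    # stores in results['mix_results'] and pops again after the pipeline.
    dict(type='Mosaic', img_scale=(640, 640), pad_val=114.0),
    dict(type='MixUp', img_scale=(640, 640), ratio_range=(0.8, 1.6), pad_val=114.0),
]
train_dataset = dict(
    type='MultiImageMixDataset',
    dataset=dict(
        type='CocoDataset',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
        ]),
    pipeline=train_pipeline)
```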
`ClassBalancedDataset` is suitable for training on class-imbalanced datasets like LVIS. Following the sampling strategy in the `paper <https://arxiv.org/abs/1908.03195>`_, in each epoch an image may appear multiple times based on its "repeat factor". The repeat factor for an image is a function of the frequency of the rarest category labeled in that image.
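A small self-contained sketch of that rule; the per-image category lists and the threshold below are made up:

```python
import math
from collections import defaultdict

def repeat_factors(dataset_cat_ids, oversample_thr=1e-3):
    """Repeat-factor rule from the LVIS paper (arXiv:1908.03195):
      f(c) = fraction of images containing category c
      r(c) = max(1, sqrt(oversample_thr / f(c)))
      r(I) = max over categories c in image I of r(c)
    Assumes every image is labeled with at least one category.
    """
    num_images = len(dataset_cat_ids)
    # f(c): fraction of images that contain category c.
    counts = defaultdict(int)
    for cat_ids in dataset_cat_ids:
        for c in set(cat_ids):
            counts[c] += 1
    freq = {c: n / num_images for c, n in counts.items()}
    # r(c): category-level repeat factor; rare categories get r(c) > 1.
    cat_repeat = {
        c: max(1.0, math.sqrt(oversample_thr / f)) for c, f in freq.items()
    }
    # r(I): image-level repeat factor, driven by the rarest category.
    return [max(cat_repeat[c] for c in set(ids)) for ids in dataset_cat_ids]

# Three images; category 2 appears only once, so image 2 gets r > 1.
print(repeat_factors([[1], [1, 2], [1]], oversample_thr=0.5))
# -> [1.0, 1.224..., 1.0]
```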
Non-distributed training
Please refer to tools/train.py for non-distributed training, which is not recommended and left for debugging. Even on a single machine, distributed training is preferred.

Train on custom datasets
We define a simple annotation format: the annotation of a dataset is a list of dicts, each corresponding to one image.
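A minimal sketch of that format, with made-up values:

```python
import numpy as np

# One dict per image; 'ann' holds ground-truth boxes in (x1, y1, x2, y2)
# order plus their integer class labels.
annotations = [
    dict(
        filename='a.jpg',
        width=1280,
        height=720,
        ann=dict(
            bboxes=np.array([[10., 20., 200., 240.]], dtype=np.float32),  # (n, 4)
            labels=np.array([0], dtype=np.int64),                          # (n, )
        )),
]
```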
```
Distributed training: False
GPU number: 1
05/04 16:32:53 - mmengine - INFO - Config:
voxel_size = [0.16, 0.16, 4]
model = dict(
    type='VoxelNet',
    data_preprocessor=dict(
        type='Det3DDataPreprocessor',
        voxel=True,
        voxel_layer=dict(
            ...
```
(The log then dumps a long list of imported mmcv submodules: 'mmcv.device.*', 'mmcv.runner', 'mmcv.cnn.*', 'mmcv.ops.*', ...)
multi-gpu training (single machine, multiple GPUs):

```shell
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 \
    ./tools/train.py ./configs/solov2/solov2_r50_fpn_1x_coco.py --launcher pytorch
```

Note that `--nproc_per_node` must match the number of visible GPUs (2 here, matching `CUDA_VISIBLE_DEVICES=0,1`).

3. Local test results
(1) Optional arguments
--out: filename for the output results, saved in pickle format
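For example, dumping results with --out looks like this (the checkpoint path is hypothetical):

```shell
python ./tools/test.py ./configs/solov2/solov2_r50_fpn_1x_coco.py \
    work_dirs/solov2_r50_fpn_1x_coco/latest.pth \
    --out results.pkl
```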
```
    return _run_code(code, main_globals, None,
  File "/home/zltjohn/anaconda3/envs/open-mmlab/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/zltjohn/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in ma...
```
```python
runner.register_training_hooks(cfg.lr_config, optimizer_config,
                               cfg.checkpoint_config, cfg.log_config,
                               cfg.get('momentum_config', None))

# For distributed training with an EpochBasedRunner, a DistSamplerSeedHook
# must also be registered; it sets the random seed of the data sampler
# in distributed training.
if distributed:
    if isinstance(runner, EpochBasedRunner):
        runner.register_hook(DistSamplerSeedHook())
```
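The hook itself is tiny; mmcv's implementation boils down to roughly this sketch:

```python
from mmcv.runner import Hook

class DistSamplerSeedHook(Hook):
    """Sets the epoch on the distributed sampler before every epoch, so
    each process reshuffles with the same epoch-dependent seed."""

    def before_epoch(self, runner):
        if hasattr(runner.data_loader.sampler, 'set_epoch'):
            # DistributedSampler derives its shuffle order from the epoch.
            runner.data_loader.sampler.set_epoch(runner.epoch)
        elif hasattr(runner.data_loader.batch_sampler.sampler, 'set_epoch'):
            # In batch-sampler setups the sampler sits one level deeper.
            runner.data_loader.batch_sampler.sampler.set_epoch(runner.epoch)
```

Without this, every epoch would reuse the same shuffle order, since `DistributedSampler` seeds its permutation from the epoch number.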