Describe the bug: When working on a distributed setup, wandb seems to store some things in the /tmp directory in addition to the directory specified in the dir argument. Here is how I initialize the wandb run: if global_rank == 0: import wand...
Describe the bug: If I call run = wandb.init(entity="bart_tadev", project='GPT-4 in Python', name="test") in a Jupyter notebook, an error is raised: TypeError: _WandbInit._pause_backend() takes 1 positional argument but 2 were given run = wandb.ini...
Instead of calling loss.backward(), use self.manual_backward(loss, opt); this makes half-precision training work. Ignore the optimizer_idx parameter. def training_step(self, batch, batch_idx, opt_idx): # get the optimizers returned by configure_optimizers() (opt_d, opt_g) = self.optimizers() loss_g = self.acquire_loss_g() # Note...
Search before asking I have searched the YOLOv8 issues and discussions and found no similar questions. Question I want to choose hyperparameters for my metric (so far I have chosen the metric of the model itself). I use wandb sweep for t...
To upgrade, please run: wandb: $ pip install wandb --upgrade wandb: Network error (ReadTimeout), entering retry loop. wandb: Network error (ConnectTimeout), entering retry loop. wandb: ERROR Error communicating with wandb process wandb: ERROR try: wandb.init(settings=wandb.Settings(start_...
Hi, when I run the fine-tuning script ./scripts/run_finetune.sh, some errors occur: wandb: ERROR api_key not configured (no-tty). call wandb.login(key=[your_api_key]) Traceback (most recent call last): File "/home/yan/Documents/LMFlow/examples/finetune.py", line 70, in <module>...
Whenever I need to resume a run, I have to make sure it has been synced to the cloud before I try to resume it. It would be nice to be able to resume offline runs. The following code demonstrates the problem. If you are running on "dryru...
Describe the bug I run my experiments in the offline mode. That results in multiple wandb folders such as: ./wandb/offline-run-20230701_162008-giyrbe5p ./wandb/offline-run-20230705_154815-giyrbe5p which is expected. However, the second p...
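The two folder names in the report above share the same trailing run ID (giyrbe5p) but differ in timestamp. A minimal sketch of how one might spot such repeated IDs is given here; it assumes the folder name layout is `offline-run-<YYYYMMDD_HHMMSS>-<run_id>`, which is inferred only from the two names quoted above, and `group_runs_by_id` is a hypothetical helper, not part of wandb.

```python
import os
import re
from collections import defaultdict

# Assumed layout, inferred from the folder names quoted in the report:
#   offline-run-<YYYYMMDD_HHMMSS>-<run_id>
_RUN_DIR = re.compile(r"^offline-run-(\d{8}_\d{6})-(\w+)$")


def group_runs_by_id(dir_paths):
    """Map each run ID to the sorted list of timestamps found for it.

    Hypothetical helper: paths that do not match the assumed layout
    are silently skipped.
    """
    groups = defaultdict(list)
    for path in dir_paths:
        match = _RUN_DIR.match(os.path.basename(path))
        if match:
            timestamp, run_id = match.groups()
            groups[run_id].append(timestamp)
    return {run_id: sorted(stamps) for run_id, stamps in groups.items()}
```

Any run ID that maps to more than one timestamp corresponds to a run that produced several offline folders, as in the report.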
elif weights.endswith('.pt') and os.path.isfile(weights) and opt.resume:
    # resuming training, using the same run_id
    run_id = torch.load(weights).get('wandb_id')
else:
    # transfer learning using homebrew pt file.
    run_id = None
...
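The branch above reuses a stored run ID only when an existing .pt checkpoint is being resumed. The same decision can be sketched as a standalone function; this is an illustration under assumptions, not the original script. `resolve_run_id` and its `load_checkpoint` parameter are hypothetical names, and the loader is passed in so the sketch does not depend on torch (in the original snippet that role is played by `torch.load`).

```python
import os
from typing import Any, Callable, Mapping, Optional


def resolve_run_id(
    weights: str,
    resume: bool,
    load_checkpoint: Callable[[str], Mapping[str, Any]],
) -> Optional[str]:
    """Return the stored 'wandb_id' when resuming from a .pt file, else None.

    Hypothetical helper mirroring the two branches shown above.
    `load_checkpoint` stands in for torch.load and must return a mapping.
    """
    if weights.endswith('.pt') and os.path.isfile(weights) and resume:
        # Resuming training: reuse the run ID saved in the checkpoint.
        return load_checkpoint(weights).get('wandb_id')
    # Transfer learning or a fresh start: no run ID to reuse.
    return None
```

Injecting the loader keeps the branch logic easy to check in isolation; a caller in a real training script would presumably pass `torch.load` here.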