Hello, can someone please help me get this set up?
I am running this in Google Colab.
I am unable to get any data from my Tacotron model into wandb. I was able to log in and create multiple wandb runs, and usage is tracked (the log is connected and Utilization updates), but no data has been logged. The way it trains, it sits on one line forever, so I don't even know where to begin.
This is the script that currently runs the training:
print('FP16 Run:', hparams.fp16_run)
print('Dynamic Loss Scaling:', hparams.dynamic_loss_scaling)
print('Distributed Run:', hparams.distributed_run)
print('cuDNN Enabled:', hparams.cudnn_enabled)
print('cuDNN Benchmark:', hparams.cudnn_benchmark)
from IPython.display import Javascript
display(Javascript('''google.colab.output.setIframeHeight(0, true, {maxHeight: 200})'''))
#for i in range(200):
# print(i)
train(output_directory, log_directory, checkpoint_path,
      warm_start, n_gpus, rank, group_name, hparams, log_directory2)
I tried doing this, but it didn't work either. Here is my code to start the training with wandb:
Install and login
#@markdown Login and start a new run
print('Installing wandb')
!pip -q install wandb
import wandb
print('Log in to wandb!\n')
!wandb login
Capture a dictionary of hyperparameters
#@markdown Capture a dictionary of hyperparameters
wandb.config.p_attention_dropout=hparams.p_attention_dropout
wandb.config.p_decoder_dropout=hparams.p_decoder_dropout
wandb.config.decay_start=hparams.decay_start
wandb.config.A_=hparams.A_
wandb.config.B_=hparams.B_
wandb.config.C_=hparams.C_
wandb.config.min_learning_rate=hparams.min_learning_rate
wandb.config.batch_size=hparams.batch_size
wandb.config.epochs=hparams.epochs
wandb.config.generate_mels=generate_mels
wandb.config.show_alignments=hparams.show_alignments
wandb.config.alignment_graph_height=alignment_graph_height
wandb.config.alignment_graph_width=alignment_graph_width
wandb.config.load_mel_from_disk=hparams.load_mel_from_disk
wandb.config.ignore_layers=hparams.ignore_layers
wandb.config.checkpoint_path=checkpoint_path
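One thing I noticed while writing this up: all the per-key `wandb.config` assignments above run before `wandb.init()`, and I believe recent wandb versions expect the config to be set during or after init. Collecting everything into one dictionary and passing it to `wandb.init(config=...)` might be the cleaner pattern. A sketch (the values below are placeholders, not my real `hparams`):

```python
# Placeholder values standing in for the real hparams object.
run_config = {
    "p_attention_dropout": 0.1,
    "p_decoder_dropout": 0.1,
    "batch_size": 32,
    "epochs": 500,
    "min_learning_rate": 1e-5,
}

# In the notebook this would become one call instead of many assignments:
# wandb.init(project="tacotron", entity="gmirsky2", config=run_config)
```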
Start wandb and get the run ID
wandb.init(project="tacotron", entity="gmirsky2")
Start the wandb run, then train
#Run
api = wandb.Api()
run = api.run("gmirsky2/tacotron/" + wandb.run.id)
#train
train(output_directory, log_directory, checkpoint_path,
      warm_start, n_gpus, rank, group_name, hparams, log_directory2)
# save the metrics for the run to a csv file
metrics_dataframe = run.history()
metrics_dataframe.to_csv("metrics.csv")
When the training runs, it just goes to the train line and never finishes.
I am looking for help with how to incorporate wandb into the Tacotron train script:
train(output_directory, log_directory, checkpoint_path,
      warm_start, n_gpus, rank, group_name, hparams, log_directory2)
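From reading around, I suspect `wandb.config` only records hyperparameters, and the charts stay empty unless `wandb.log()` is actually called inside the training loop, which this `train()` never does. Something like this sketch is what I imagine is needed (the loop and loss here are toy stand-ins, not the real Tacotron `train()`):

```python
import math

def train_with_logging(num_steps, log_fn):
    """Toy stand-in for a training loop; log_fn would be wandb.log."""
    losses = []
    for step in range(num_steps):
        loss = math.exp(-step / 100)           # fake decreasing loss
        log_fn({"step": step, "loss": loss})   # wandb.log({...}) in practice
        losses.append(loss)
    return losses

# In the notebook this would be: train_with_logging(total_steps, wandb.log)
losses = train_with_logging(3, lambda metrics: None)
```

Is patching `wandb.log()` calls into the Tacotron training loop the right approach, or is there a way to hook it in from outside?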
I thought that was what the hyperparameters were for, but I guess I'm wrong.
Any help would be welcome. Thanks a bunch!