Best weight check between training sessions and exported set_batch_network #3465

Open: wants to merge 9 commits into base: master


@keko950 keko950 commented Jun 21, 2019

I modified detector.c to check the mAP of the best weights every time a training session starts, as described in #3452.

If a *best.weights file exists in the backup path, the detector computes its mAP before training starts, saves the result, and initializes the best_map value with it, so you can now keep your best weights across different training sessions :)
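
A minimal sketch of the startup check described above, not the actual detector.c change: it looks for `<base>_best.weights` in the backup directory and, if the file is there, evaluates it to seed the best mAP. The names `best_weights_path`, `initial_best_map`, `map_fn`, and `fake_map` are hypothetical, the evaluation is a stub standing in for darknet's real mAP routine, and starting from 0 when no file exists is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical callback type standing in for darknet's mAP evaluation. */
typedef float (*map_fn)(const char *weights_path);

/* Stub evaluator used only for illustration; a real one would run validation. */
static float fake_map(const char *weights_path)
{
    (void)weights_path;
    return 0.5f;
}

/* Build "<backup_dir>/<base>_best.weights" into out.
   Returns 0 on success, -1 if the path did not fit. */
static int best_weights_path(char *out, size_t cap,
                             const char *backup_dir, const char *base)
{
    int n = snprintf(out, cap, "%s/%s_best.weights", backup_dir, base);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Return the mAP to start training with: the evaluated mAP of an existing
   best-weights file, or 0 (an assumption) when no such file is found. */
static float initial_best_map(const char *backup_dir, const char *base,
                              map_fn evaluate)
{
    char path[1024];
    FILE *fp;

    assert(backup_dir && base && evaluate);
    if (best_weights_path(path, sizeof path, backup_dir, base) != 0) return 0.0f;

    fp = fopen(path, "rb");
    if (!fp) return 0.0f;   /* no previous best: nothing to compare against */
    fclose(fp);
    return evaluate(path);  /* seed best_map from the previous session */
}
```

For example, `initial_best_map("backup", "dnn_train", fake_map)` returns the stub's value only when `backup/dnn_train_best.weights` exists; otherwise it returns 0.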

Edit:
I also updated network.h to add set_batch_network to the exported API, so you can use dynamic batching.

@skaldesh

skaldesh commented Jul 8, 2019

@keko950 could you please tell me, is the best file after training then always named _final.weights?

@keko950
Author

keko950 commented Jul 8, 2019

> @keko950 could you please tell me, is the best file after training then always named _final.weights?

Nope, final and best weights are different.

@skaldesh

skaldesh commented Jul 8, 2019

Ok, I tested your PR and trained a model for 1000 batches. When the training was done, I got these 3 files: dnn_train_1000.weights, dnn_train_final.weights, dnn_train_last.weights
What happened to the _best.weights?

@skaldesh

skaldesh commented Jul 8, 2019

Could it be that you only save a _best.weights once the first comparison has happened? Then I should have used 2000 for max_batches.

@keko950
Author

keko950 commented Jul 8, 2019

> Could it be that you only save a _best.weights once the first comparison has happened? Then I should have used 2000 for max_batches.

A _best.weights file is created only after a mAP calculation has run, so yes, you should probably increase your max_batches.
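
The rule behind that answer can be sketched as follows, assuming the best-weights file is only written when a mAP evaluation beats the best value seen so far. `best_tracker` and `record_map` are hypothetical names, and the `saves` counter stands in for actually writing `<name>_best.weights`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for the best mAP seen so far. */
typedef struct {
    float best_map;  /* best mAP recorded so far */
    int saves;       /* how many times a "best" file would have been written */
} best_tracker;

/* Record one mAP evaluation. Returns true when the new value beats the
   previous best, which is the moment a best-weights file would be saved. */
static bool record_map(best_tracker *t, float current_map)
{
    assert(t);
    if (current_map > t->best_map) {
        t->best_map = current_map;
        t->saves++;  /* stand-in for saving the weights to disk */
        return true;
    }
    return false;
}
```

If no mAP evaluation ever runs, `record_map` is never called and no best file appears, however many iterations are trained.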

@skaldesh

skaldesh commented Jul 9, 2019

So I tried again, and I now have the following files:

  • darknet53.conv.74
  • dnn_train_1000.weights
  • dnn_train_2000.weights
  • dnn_train_3000.weights
  • dnn_train_final.weights
  • dnn_train_last.weights

Still no _best.weights? My training is still running. Does it need to be stopped to generate it?

@keko950
Author

keko950 commented Jul 9, 2019

> So I tried again, and I now have the following files:
>
>   • darknet53.conv.74
>   • dnn_train_1000.weights
>   • dnn_train_2000.weights
>   • dnn_train_3000.weights
>   • dnn_train_final.weights
>   • dnn_train_last.weights
>
> Still no _best.weights? My training is still running. Does it need to be stopped to generate it?

Just tested it again, working fine, trained 2000 iterations:
cam-lat_1000.weights
cam-lat_2000.weights
cam-lat_best.weights
cam-lat_final.weights
cam-lat_last.weights

What command are you running? Show me your ./darknet command line.
Also make sure you are running the master version.

@skaldesh

skaldesh commented Jul 9, 2019

> What command are you running?

darknet detector train darknet.data dnn.cfg weights/darknet53.conv.74

> Also make sure you are running the master version.

What do you mean? Of course I am running your PR, since I want to test it.

@keko950
Author

keko950 commented Jul 9, 2019

> > What command are you running?
>
> darknet detector train darknet.data dnn.cfg weights/darknet53.conv.74
>
> > Also make sure you are running the master version.
>
> What do you mean? Of course I am running your PR, since I want to test it.

Add the -map option to generate the best weights.
Example:
./darknet detector train x.data y.cfg z.weights -map

@skaldesh

skaldesh commented Jul 9, 2019

Got it! Thanks!

@skaldesh

@AlexeyAB could you merge this please?

@skaldesh

ping

@r0l1

r0l1 commented Aug 15, 2019

@AlexeyAB would appreciate this too. Useful feature!

@keko950 keko950 mentioned this pull request Sep 16, 2019
@keko950 keko950 changed the title Best weight check between training sessions Best weight check between training sessions and exported set_batch_network Sep 16, 2019
@skaldesh

@keko950 the AppVeyor build failed, can you look into this?
@AlexeyAB is there interest in merging this? I personally do not want to constantly check my training output to find the best model.

4 participants