What I am trying to do :-
I am trying to run a Bayesian Hyperband sweep.
As mentioned in the docs, under early_terminate we (generally) have to specify 4 parameters: min_iter, s, eta, and max_iter. It would look something like the following.
#______________________________________________________________________________________
My doubts summarized:-
In summary, given all 4 parameters (min_iter, s, eta, and max_iter), what I want to know is:
- At which epochs will the Hyperband algorithm check for improvement?
- Considering that I am trying to do Bayesian Hyperband, how many runs will be evaluated in the first bracket, and how many in each subsequent bracket?
- Are there any rules of thumb for choosing good values for these 4 parameters (min_iter, s, eta, and max_iter)?
- Please explain the parameters s and eta (especially eta) in a bit more detail, i.e. with a bit of the underlying maths (please keep it simple if possible).
#______________________________________________________________________________________
What is my doubt about? (explained in a bit more detail/context):-
The docs only partly explain at which epochs their implementation of the Hyperband algorithm checks for improvement and decides whether to terminate a run.
- When only the minimum number of iterations for each run is our concern:
early_terminate:
  type: hyperband
  min_iter: 3
The brackets for this example are: [3, 3*eta, 3*eta*eta, 3*eta*eta*eta], which equals [3, 9, 27, 81].
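The bracket arithmetic in the min_iter example above can be checked with a small Python sketch. It assumes eta = 3 (the value the quoted numbers imply) and takes the four-element list at face value. The function name min_iter_brackets and the num_brackets parameter are hypothetical, introduced only for illustration; this is not the library's implementation.

```python
def min_iter_brackets(min_iter, eta=3, num_brackets=4):
    """Return [min_iter, min_iter*eta, min_iter*eta**2, ...].

    Hypothetical helper: it only reproduces the arithmetic quoted from
    the docs. eta=3 and num_brackets=4 are assumptions taken from the
    example, not values read from any library.
    """
    return [min_iter * eta ** k for k in range(num_brackets)]


print(min_iter_brackets(3))  # [3, 9, 27, 81]
```

Each bracket is the previous one multiplied by eta, so a larger eta spaces the checkpoints further apart.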
- When only the maximum number of iterations for each run is our concern:
early_terminate:
  type: hyperband
  max_iter: 27
  s: 2
The brackets for this example are [27/eta, 27/eta/eta], which equals [9, 3].
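The max_iter example above can be reproduced the same way. This sketch assumes eta = 3 (implied by 27/eta = 9) and that s is the number of times max_iter is divided by eta, which is what the quoted list suggests. The function name max_iter_brackets is hypothetical and this is not the library's implementation.

```python
def max_iter_brackets(max_iter, s, eta=3):
    """Return [max_iter/eta, max_iter/eta**2, ..., max_iter/eta**s].

    Hypothetical helper: it only reproduces the arithmetic quoted from
    the docs. eta=3 is an assumption implied by the example's numbers.
    """
    return [max_iter / eta ** k for k in range(1, s + 1)]


print(max_iter_brackets(27, s=2))  # [9.0, 3.0]
```

Here the list is built downward from max_iter, so s controls how many brackets there are and eta controls how quickly they shrink.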
But what about the case where both the minimum and the maximum number of iterations for each run are our concern?
For example:
early_terminate:
  type: hyperband
  min_iter: 10
  s: 3
  eta: 4
  max_iter: 50