athena.utils.learning_rate¶
Base classes for learning rate schedules.
Module Contents¶
Classes¶
WarmUpLearningSchedule | WarmUp learning rate schedule for Adam
WarmUpAdam | WarmUpAdam implementation
WarmUpLearningSchedule1 | WarmUp learning rate schedule for Adam that can also initialize a learning rate
WarmUpAdam1 | WarmUpAdam implementation
ExponentialDecayLearningRateSchedule | Exponential decay learning rate schedule
- class athena.utils.learning_rate.WarmUpLearningSchedule(model_dim=512, warmup_steps=4000, k=1.0, decay_steps=99999999, decay_rate=1.0)¶
Bases:
tensorflow.keras.optimizers.schedules.LearningRateSchedule
WarmUp Learning rate schedule for Adam
Example
>>> optimizer = tf.keras.optimizers.Adam(learning_rate=WarmUpLearningSchedule(512),
...     beta_1=0.9, beta_2=0.98, epsilon=1e-9)
Idea from the paper: Attention Is All You Need
- __call__(step)¶
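The schedule's shape follows the Transformer recipe cited above: the learning rate grows linearly for warmup_steps steps, then decays as the inverse square root of the step count. Below is a minimal sketch of what __call__ plausibly computes, assuming the paper's formula scaled by k and decayed by decay_rate every decay_steps; the exact decay interaction is an assumption, not confirmed from the athena source.

    import tensorflow as tf

    class WarmUpSketch(tf.keras.optimizers.schedules.LearningRateSchedule):
        """Sketch only, not the athena source."""

        def __init__(self, model_dim=512, warmup_steps=4000, k=1.0,
                     decay_steps=99999999, decay_rate=1.0):
            super().__init__()
            self.model_dim = tf.cast(model_dim, tf.float32)
            self.warmup_steps = warmup_steps
            self.k = k
            self.decay_steps = tf.cast(decay_steps, tf.float32)
            self.decay_rate = tf.cast(decay_rate, tf.float32)

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            # "Attention Is All You Need": linear warm-up, then step**-0.5 decay.
            arg1 = tf.math.rsqrt(step)
            arg2 = step * (self.warmup_steps ** -1.5)
            # Assumed extras: a global scale k and a stepwise exponential decay.
            scale = self.k * tf.pow(self.decay_rate, step // self.decay_steps)
            return scale * tf.math.rsqrt(self.model_dim) * tf.minimum(arg1, arg2)

With the defaults (decay_rate=1.0, decay_steps large), the assumed decay term is a no-op and the sketch reduces to the plain Transformer schedule.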
- class athena.utils.learning_rate.WarmUpAdam(config=None, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name='WarmUpAdam', **kwargs)¶
Bases:
tensorflow.keras.optimizers.Adam
WarmUpAdam implementation: Adam driven by the WarmUp learning rate schedule
- default_config¶
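A hedged sketch of the pattern the class name and default_config suggest: subclass Adam and build the warm-up schedule from a config dict merged over default_config. The config key names below simply mirror WarmUpLearningSchedule's documented parameters and are an assumption:

    import tensorflow as tf
    from athena.utils.learning_rate import WarmUpLearningSchedule

    class WarmUpAdamSketch(tf.keras.optimizers.Adam):
        """Sketch only: Adam whose learning_rate is a warm-up schedule."""

        # Assumed keys; inspect the real default_config for the actual ones.
        default_config = {"model_dim": 512, "warmup_steps": 4000, "k": 1.0,
                          "decay_steps": 99999999, "decay_rate": 1.0}

        def __init__(self, config=None, beta_1=0.9, beta_2=0.999, epsilon=1e-07,
                     amsgrad=False, name="WarmUpAdam", **kwargs):
            hparams = dict(self.default_config, **(config or {}))
            super().__init__(learning_rate=WarmUpLearningSchedule(**hparams),
                             beta_1=beta_1, beta_2=beta_2, epsilon=epsilon,
                             amsgrad=amsgrad, name=name, **kwargs)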
- class athena.utils.learning_rate.WarmUpLearningSchedule1(model_dim=512, warmup_steps=4000, k=1.0, decay_steps=99999999, decay_rate=1.0, lr=None)¶
Bases:
tensorflow.keras.optimizers.schedules.LearningRateSchedule
WarmUp learning rate schedule for Adam that can also be initialized with an explicit learning rate (lr)
Example
>>> optimizer = tf.keras.optimizers.Adam(learning_rate=WarmUpLearningSchedule1(512),
...     beta_1=0.9, beta_2=0.98, epsilon=1e-9)
Idea from the paper: Attention Is All You Need
- __call__(step)¶
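The extra lr argument suggests the schedule can be pinned to an explicit learning rate instead of deriving the scale from model_dim; that reading is an assumption. A usage sketch:

    import tensorflow as tf
    from athena.utils.learning_rate import WarmUpLearningSchedule1

    # Assumption: lr, when given, overrides the model_dim-derived scale.
    schedule = WarmUpLearningSchedule1(model_dim=512, warmup_steps=4000, lr=0.001)
    optimizer = tf.keras.optimizers.Adam(learning_rate=schedule,
                                         beta_1=0.9, beta_2=0.98, epsilon=1e-9)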
- class athena.utils.learning_rate.WarmUpAdam1(config=None, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name='WarmUpAdam', **kwargs)¶
Bases:
tensorflow.keras.optimizers.Adam
WarmUpAdam implementation, the counterpart of WarmUpLearningSchedule1
- default_config¶
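Usage mirrors WarmUpAdam: pass a config dict whose entries override default_config. A sketch with assumed key names:

    from athena.utils.learning_rate import WarmUpAdam1

    # Assumed config keys; consult default_config for the real ones.
    optimizer = WarmUpAdam1(config={"warmup_steps": 8000, "k": 0.5})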
- class athena.utils.learning_rate.ExponentialDecayLearningRateSchedule(initial_lr=0.005, decay_steps=10000, decay_rate=0.5, start_decay_steps=30000, final_lr=1e-05)¶
Bases:
tensorflow.keras.optimizers.schedules.LearningRateSchedule
Exponential decay learning rate schedule with a delayed decay start (start_decay_steps) and a final floor (final_lr)
Example
>>> optimizer = tf.keras.optimizers.Adam(learning_rate=ExponentialDecayLearningRateSchedule(0.01, 100))
- Parameters
initial_lr – learning rate before any decay is applied
decay_steps – number of steps between successive decays
- Returns
the learning rate at step: initial_lr * (0.5 ** (step // decay_steps)), i.e. the rate is halved every decay_steps steps (0.5 is the default decay_rate)
- __call__(step)¶
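A minimal sketch consistent with the Returns formula above; treating start_decay_steps as a hold period and final_lr as a lower bound is an assumption read off the parameter names:

    import tensorflow as tf

    class ExpDecaySketch(tf.keras.optimizers.schedules.LearningRateSchedule):
        """Sketch only: stepwise exponential decay with delayed start and floor."""

        def __init__(self, initial_lr=0.005, decay_steps=10000, decay_rate=0.5,
                     start_decay_steps=30000, final_lr=1e-05):
            super().__init__()
            self.initial_lr = initial_lr
            self.decay_steps = decay_steps
            self.decay_rate = decay_rate
            self.start_decay_steps = start_decay_steps
            self.final_lr = final_lr

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            # Assumption: no decay before start_decay_steps.
            decayed = tf.maximum(step - self.start_decay_steps, 0.0)
            lr = self.initial_lr * (self.decay_rate ** (decayed // self.decay_steps))
            # Assumption: final_lr acts as a lower bound on the decayed rate.
            return tf.maximum(lr, self.final_lr)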