MuonConfig

Struct MuonConfig 

pub struct MuonConfig { /* private fields */ }

Muon configuration.

Muon is an optimizer specifically designed for 2D parameters of neural network hidden layers (weight matrices). Other parameters such as biases and embeddings should be optimized using a standard method such as AdamW.

§Learning Rate Adjustment

Muon adjusts the learning rate based on parameter shape to maintain consistent RMS across rectangular matrices. Two methods are available:

  • Original: Scales the learning rate by sqrt(max(1, A/B)), where A and B are the parameter's first two dimensions. This is Keller Jordan’s method and is the default.

  • MatchRmsAdamW: Scales the learning rate by 0.2 * sqrt(max(A, B)). This is Moonshot’s method, designed to match AdamW’s update RMS so that AdamW hyperparameters can be reused directly.
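
The two factors above can be sketched in plain Rust. This is an illustration only: `AdjustLr`, `lr_scale`, and `adjusted_lr` are hypothetical names, not part of the crate, and the sketch assumes the factor simply multiplies the base learning rate.

```rust
/// Hypothetical stand-in for the crate's `AdjustLrFn`; only the two formulas
/// quoted in the documentation are reproduced here.
#[derive(Clone, Copy, Debug)]
enum AdjustLr {
    Original,
    MatchRmsAdamW,
}

/// Shape-dependent factor for a parameter whose first two dimensions are
/// `a` and `b`.
fn lr_scale(method: AdjustLr, a: usize, b: usize) -> f64 {
    let (a, b) = (a as f64, b as f64);
    match method {
        // sqrt(max(1, A/B))
        AdjustLr::Original => (a / b).max(1.0).sqrt(),
        // 0.2 * sqrt(max(A, B))
        AdjustLr::MatchRmsAdamW => 0.2 * a.max(b).sqrt(),
    }
}

/// Assumption: the factor is applied multiplicatively to the base rate.
fn adjusted_lr(base_lr: f64, method: AdjustLr, a: usize, b: usize) -> f64 {
    base_lr * lr_scale(method, a, b)
}
```

For a 1024 x 256 matrix, Original gives sqrt(4) = 2 and MatchRmsAdamW gives 0.2 * sqrt(1024) = 6.4. For a 256 x 1024 matrix, Original gives 1, so wide matrices keep the base learning rate.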

§Example

use burn_optim::{MuonConfig, AdjustLrFn};

// Using default (Original) method
let optimizer = MuonConfig::new().init();

// Using MatchRmsAdamW for AdamW-compatible hyperparameters
let optimizer = MuonConfig::new()
    .with_adjust_lr_fn(AdjustLrFn::MatchRmsAdamW)
    .init();

Implementations§

impl MuonConfig

pub fn new() -> Self

Create a new instance of the config.

impl MuonConfig

pub fn with_momentum(self, momentum: MomentumConfig) -> Self

Momentum config.
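
As a rough illustration of what the momentum settings control, the sketch below applies the usual SGD-style momentum rule to a flat gradient slice. `MomentumParams` and `momentum_step` are hypothetical names that mirror the three fields used in the examples on this page. The sketch assumes a zero-initialized velocity; an actual implementation may initialize the first step differently.

```rust
/// Hypothetical mirror of the `momentum`, `dampening`, and `nesterov` fields.
struct MomentumParams {
    momentum: f32,
    dampening: f32,
    nesterov: bool,
}

/// One momentum step. `velocity` is updated in place and the returned vector
/// is the direction handed to the rest of the optimizer.
fn momentum_step(p: &MomentumParams, velocity: &mut [f32], grad: &[f32]) -> Vec<f32> {
    assert_eq!(velocity.len(), grad.len());
    velocity
        .iter_mut()
        .zip(grad)
        .map(|(v, &g)| {
            // v <- momentum * v + (1 - dampening) * g
            *v = p.momentum * *v + (1.0 - p.dampening) * g;
            if p.nesterov {
                // Nesterov looks one step ahead along the new velocity.
                g + p.momentum * *v
            } else {
                *v
            }
        })
        .collect()
}
```

With `momentum = 0.9` and `dampening = 0.1`, a unit gradient produces a velocity of 0.9 after the first step and 1.71 after the second.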

pub fn with_ns_coefficients(self, ns_coefficients: (f32, f32, f32)) -> Self

Newton-Schulz iteration coefficients (a, b, c).

pub fn with_epsilon(self, epsilon: f32) -> Self

Epsilon for numerical stability.

pub fn with_ns_steps(self, ns_steps: usize) -> Self

Number of Newton-Schulz iteration steps.
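
To show how the coefficients, epsilon, and step count interact, here is a minimal sketch of the quintic Newton-Schulz iteration on small dense matrices. It is not this crate's implementation: `Matrix`, `matmul`, `transpose`, and `newton_schulz` are hypothetical helpers, and the transposition of tall matrices used by reference implementations is omitted for brevity.

```rust
type Matrix = Vec<Vec<f32>>;

fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0f32; m]; n];
    for i in 0..n {
        for j in 0..m {
            for t in 0..k {
                out[i][j] += a[i][t] * b[t][j];
            }
        }
    }
    out
}

fn transpose(a: &Matrix) -> Matrix {
    let (n, m) = (a.len(), a[0].len());
    (0..m)
        .map(|j| (0..n).map(|i| a[i][j]).collect::<Vec<f32>>())
        .collect()
}

/// Approximately orthogonalizes `g` by iterating
/// X <- a*X + (b*A + c*A^2) * X, with A = X * X^T.
fn newton_schulz(g: &Matrix, coeffs: (f32, f32, f32), steps: usize, eps: f32) -> Matrix {
    let (a, b, c) = coeffs;
    // Divide by the Frobenius norm so every singular value is at most 1;
    // `eps` guards against division by zero for an all-zero input.
    let norm = g.iter().flatten().map(|v| v * v).sum::<f32>().sqrt();
    let mut x: Matrix = g
        .iter()
        .map(|row| row.iter().map(|v| v / (norm + eps)).collect())
        .collect();
    for _ in 0..steps {
        let gram = matmul(&x, &transpose(&x)); // A = X X^T
        let gram2 = matmul(&gram, &gram); // A^2
        let poly: Matrix = gram
            .iter()
            .zip(&gram2)
            .map(|(r1, r2)| r1.iter().zip(r2).map(|(p, q)| b * p + c * q).collect())
            .collect();
        let correction = matmul(&poly, &x); // (b*A + c*A^2) X
        x = x
            .iter()
            .zip(&correction)
            .map(|(rx, rc)| rx.iter().zip(rc).map(|(p, q)| a * p + q).collect())
            .collect();
    }
    x
}
```

Calling `newton_schulz(&g, (3.4445, -4.7750, 2.0315), 5, 1e-7)` uses the coefficients and step count from Keller Jordan's reference implementation; whether these match this crate's defaults is an assumption. The iteration pushes each singular value toward a band around 1 rather than exactly to 1, which is why a handful of steps is enough.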

pub fn with_adjust_lr_fn(self, adjust_lr_fn: AdjustLrFn) -> Self

Learning rate adjustment method.

pub fn with_weight_decay(self, weight_decay: Option<WeightDecayConfig>) -> Self

Sets the weight decay config (optional).
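
For intuition about the penalty passed to `WeightDecayConfig::new`, the sketch below shows decoupled (AdamW-style) weight decay on a flat parameter slice. Whether this crate applies decay in exactly this form is an assumption, and `decay_weights` is a hypothetical helper.

```rust
/// Shrinks every parameter toward zero by the factor (1 - lr * penalty).
/// Assumption: decay is decoupled from the gradient update, as in AdamW.
fn decay_weights(param: &mut [f32], lr: f32, penalty: f32) {
    let factor = 1.0 - lr * penalty;
    for p in param.iter_mut() {
        *p *= factor;
    }
}
```

With `lr = 0.1` and `penalty = 0.01`, each parameter is multiplied by 0.999 per step, independently of the gradient.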

impl MuonConfig

pub fn init<B: AutodiffBackend, M: AutodiffModule<B>>(&self) -> OptimizerAdaptor<Muon<B::InnerBackend>, M, B>

Initialize Muon optimizer.

§Returns

Returns an optimizer adaptor that can be used to optimize a module.

§Example
use burn_optim::{MuonConfig, AdjustLrFn, decay::WeightDecayConfig, momentum::MomentumConfig};

// Basic configuration with default (Original) LR adjustment
let optimizer = MuonConfig::new()
    .with_weight_decay(Some(WeightDecayConfig::new(0.01)))
    .init();

// With AdamW-compatible settings using MatchRmsAdamW
let optimizer = MuonConfig::new()
    .with_adjust_lr_fn(AdjustLrFn::MatchRmsAdamW)
    .with_weight_decay(Some(WeightDecayConfig::new(0.1)))
    .init();

// Custom momentum and NS settings
let optimizer = MuonConfig::new()
    .with_momentum(MomentumConfig {
        momentum: 0.9,
        dampening: 0.1,
        nesterov: false,
    })
    .with_ns_steps(7)
    .init();

Trait Implementations§

impl Clone for MuonConfig

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Config for MuonConfig

fn save<P>(&self, file: P) -> Result<(), Error>
where P: AsRef<Path>,

Available on crate feature std only.
Saves the configuration to a file.

fn load<P>(file: P) -> Result<Self, ConfigError>
where P: AsRef<Path>,

Available on crate feature std only.
Loads the configuration from a file.

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>

Loads the configuration from a binary buffer.

impl Debug for MuonConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for MuonConfig

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Display for MuonConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Serialize for MuonConfig

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,