fsrl.utils
BaseLogger
- class fsrl.utils.BaseLogger(log_dir=None, log_txt=True, name=None)
Bases: ABC
The base class for any logger that is compatible with the trainer. All loggers create four panels by default: train, test, loss, and update. Override the write() method to customize your own logger.
- Parameters:
log_dir (str) – the log directory. Defaults to None.
log_txt (bool) – whether to log data in log_dir with the name progress.txt. Defaults to True.
name (str) – the experiment name. If None, the current time is used as the name. Defaults to None.
- setup_checkpoint_fn(checkpoint_fn: Callable | None = None) -> None
Set up the function that returns the model checkpoint. It is called when using logger.save_checkpoint().
- Parameters:
checkpoint_fn (Optional[Callable]) – the hook function that returns the checkpoint dictionary. Defaults to None.
- store(tab: str | None = None, **kwargs) -> None
Store any values in the current epoch buffer under the prefix tab/.
Example use:
    logger = BaseLogger(**logger_kwargs)
    logger.store(tab="train", reward=1.0)
- Parameters:
tab (str) – the prefix of the logged data. Defaults to None.
- write(step: int, display: bool = False, display_keys: Iterable[str] | None = None) -> None
Write the stored data to the logger's outputs and reset the stored data.
- Parameters:
step (int) – the current training step or epoch.
display (bool) – whether to print the logged data in the terminal. Defaults to False.
display_keys (Iterable[str]) – the keys to be printed. If None, all stored keys are printed. Defaults to None.
- write_without_reset(*args, **kwarg) -> None
Write the data without resetting the currently stored statistics. Used by the TensorBoard and W&B loggers.
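The difference between write() and write_without_reset() can be illustrated with a minimal, self-contained sketch. BufferSketch below is a hypothetical stand-in, not the fsrl implementation. It only mimics the documented behavior: store() prefixes keys with tab/, write() reports and clears the buffer, and write_without_reset() reports but keeps it. Averaging each series is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import fmean


class BufferSketch:
    """Hypothetical stand-in that mimics the documented buffer semantics."""

    def __init__(self):
        self.log_data = defaultdict(list)

    def store(self, tab=None, **kwargs):
        # Keys are stored as "tab/key" when a tab is given.
        for key, value in kwargs.items():
            name = f"{tab}/{key}" if tab is not None else key
            self.log_data[name].append(value)

    def summary(self):
        # Average every stored series (assumed aggregation for this sketch).
        return {key: fmean(values) for key, values in self.log_data.items()}

    def write_without_reset(self, step):
        # Report the current statistics but keep the buffer intact.
        return {"step": step, **self.summary()}

    def write(self, step):
        # Report the current statistics, then clear the buffer.
        row = self.write_without_reset(step)
        self.log_data.clear()
        return row


buf = BufferSketch()
buf.store(tab="train", reward=1.0)
buf.store(tab="train", reward=3.0)
kept = buf.write_without_reset(step=1)  # buffer still holds both rewards
flushed = buf.write(step=1)             # buffer is empty afterwards
```

Both calls report the same row here; only the state of the buffer afterwards differs.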
- save_checkpoint(suffix: int | str | None = None) -> None
Use the writer to log metadata when calling save_checkpoint_fn in the trainer.
- Parameters:
suffix (Optional[Union[int, str]]) – the suffix to be added to the stored checkpoint name. Defaults to None.
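The checkpoint hook described for setup_checkpoint_fn() and save_checkpoint() follows a common callback pattern. The sketch below is illustrative only: CheckpointSketch, the checkpoint directory, and the model_<suffix>.pkl naming are assumptions made for this example, and pickle stands in for whatever serializer the library actually uses.

```python
import os
import pickle
import tempfile


class CheckpointSketch:
    """Hypothetical logger fragment showing the checkpoint-hook pattern."""

    def __init__(self, log_dir):
        self.log_dir = log_dir
        self.checkpoint_fn = None

    def setup_checkpoint_fn(self, checkpoint_fn=None):
        # Remember the hook; it is only invoked when a checkpoint is saved.
        self.checkpoint_fn = checkpoint_fn

    def save_checkpoint(self, suffix=None):
        if self.checkpoint_fn is None:
            return None  # nothing to save without a hook
        # Assumed layout: <log_dir>/checkpoint/model[_<suffix>].pkl
        folder = os.path.join(self.log_dir, "checkpoint")
        os.makedirs(folder, exist_ok=True)
        name = "model.pkl" if suffix is None else f"model_{suffix}.pkl"
        path = os.path.join(folder, name)
        with open(path, "wb") as f:
            pickle.dump(self.checkpoint_fn(), f)
        return path


weights = {"w": [0.1, 0.2]}
logger = CheckpointSketch(tempfile.mkdtemp())
logger.setup_checkpoint_fn(lambda: {"model": weights})
saved_path = logger.save_checkpoint(suffix=10)
with open(saved_path, "rb") as f:
    restored = pickle.load(f)
```

Keeping the hook as a plain callable means the logger never needs to know how the model is structured; it only stores whatever dictionary the hook returns.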
- save_config(config: dict, verbose=True) -> None
Log an experiment configuration.
Call this once at the top of your experiment, passing in all important config variables as a dict. The config is serialized to JSON; anything that cannot be serialized is handled gracefully by writing as informative a string as possible.
Example use:
    logger = BaseLogger(**logger_kwargs)
    logger.save_config(locals())
- Parameters:
config (dict) – the configs to be stored.
verbose (bool) – whether to print the saved configs. Defaults to True.
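One common way to serialize a config that contains non-JSON values is to pass a fallback to json.dumps. The sketch below assumes that approach; config_to_json and the Env class are hypothetical, and the fallback strings actually produced by save_config() may differ.

```python
import json


def config_to_json(config: dict) -> str:
    # `default` is called for every object json cannot encode natively;
    # repr() keeps as much information about the object as possible.
    return json.dumps(config, default=repr, indent=4, sort_keys=True)


class Env:
    """Hypothetical non-serializable object stored in a config."""

    def __repr__(self):
        return "Env(task='demo')"


text = config_to_json({"lr": 1e-3, "env": Env(), "hidden": (64, 64)})
loaded = json.loads(text)
```

Note that tuples come back as lists after a JSON round trip, so a reloaded config is not always identical to the original dict.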
- get_std(key: str) -> float
Get the standard deviation of the queried data in storage.
- Parameters:
key (str) – the key of the queried data.
- Returns:
the standard deviation.
- get_mean(key: str) -> float
Get the mean of the queried data in storage.
- Parameters:
key (str) – the key of the queried data.
- Returns:
the mean.
- get_mean_list(keys: Iterable[str]) -> list
Get the mean values of the queried data in storage as a list.
- Parameters:
keys (Iterable[str]) – the keys of the queried data.
- Returns:
the list of mean values.
- get_mean_dict(keys: Iterable[str]) -> dict
Get the mean values of the queried data in storage as a dict.
- Parameters:
keys (Iterable[str]) – the keys of the queried data.
- Returns:
the dict of mean values.
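The four getters relate to one another in a simple way, sketched below over a plain dict of lists. The helper functions are hypothetical and only mirror the documented names; the use of the population standard deviation is an assumption of this sketch.

```python
from statistics import fmean, pstdev

# Stand-in for the logger's storage: one list of values per key.
storage = {"train/reward": [1.0, 2.0, 3.0], "train/cost": [4.0, 4.0]}


def get_mean(key):
    return fmean(storage[key])


def get_std(key):
    # Population standard deviation (assumed for this sketch).
    return pstdev(storage[key])


def get_mean_list(keys):
    # Same order as the requested keys.
    return [get_mean(k) for k in keys]


def get_mean_dict(keys):
    # Means indexed by the requested keys.
    return {k: get_mean(k) for k in keys}


keys = ["train/reward", "train/cost"]
means = get_mean_list(keys)
mean_by_key = get_mean_dict(keys)
```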
- property stats_mean: dict
- property logger_keys: Iterable
TensorboardLogger
- class fsrl.utils.TensorboardLogger(log_dir: str | None = None, log_txt: bool = True, name: str | None = None)
Bases: BaseLogger
A logger that uses the TensorBoard SummaryWriter to visualize and log statistics.
- Parameters:
log_dir (str) – the log directory. Defaults to None.
log_txt (bool) – whether to log data in log_dir with the name progress.txt. Defaults to True.
name (str) – the experiment name. If None, the current time is used as the name. Defaults to None.
- write(step: int, display: bool = True, display_keys: Iterable[str] | None = None) -> None
Write the stored data to the logger's outputs and reset the stored data.
- Parameters:
step (int) – the current training step or epoch.
display (bool) – whether to print the logged data in the terminal. Defaults to True.
display_keys (Iterable[str]) – the keys to be printed. If None, all stored keys are printed. Defaults to None.
WandbLogger
- class fsrl.utils.WandbLogger(config: dict = {}, project: str = 'fsrl', group: str = 'test', name: str | None = None, log_dir: str = 'log', log_txt: bool = True)
Bases: BaseLogger
Weights and Biases logger that sends data to https://wandb.ai/.
A typical usage example:
    config = {...}
    project = "test_cvpo"
    group = "SafetyCarCircle-v0"
    name = "default_param"
    log_dir = "logs"
    logger = WandbLogger(config, project, group, name, log_dir)
    logger.save_config(config)
    agent = CVPOAgent(env, logger=logger)
    agent.learn(train_envs)
- Parameters:
config (dict) – experiment configurations. Defaults to an empty dict.
project (str) – W&B project name. Defaults to “fsrl”.
group (str) – W&B group name. Defaults to “test”.
name (str) – W&B experiment run name. If None, the current time is used as the name. Defaults to None.
log_dir (str) – the log directory. Defaults to “log”.
log_txt (bool) – whether to log data in log_dir with the name progress.txt. Defaults to True.
- write(step: int, display: bool = True, display_keys: Iterable[str] | None = None) -> None
Write the stored data to the logger's outputs and reset the stored data.
- Parameters:
step (int) – the current training step or epoch.
display (bool) – whether to print the logged data in the terminal. Defaults to True.
display_keys (Iterable[str]) – the keys to be printed. If None, all stored keys are printed. Defaults to None.
DummyLogger
- class fsrl.utils.DummyLogger(*args, **kwarg)
Bases: BaseLogger
A logger that inherits from BaseLogger but does nothing. Used as a placeholder in the trainer.
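A do-nothing logger lets trainer code call logging methods unconditionally instead of checking whether a logger was supplied. The sketch below illustrates that null-object pattern with hypothetical names (NullLoggerSketch, run_steps); it does not reproduce the fsrl source.

```python
class NullLoggerSketch:
    """Hypothetical logger whose methods accept anything and do nothing."""

    def store(self, tab=None, **kwargs):
        pass

    def write(self, step, display=False, display_keys=None):
        pass


def run_steps(num_steps, logger=None):
    # Fall back to the do-nothing logger so the loop needs no `if logger` checks.
    logger = logger if logger is not None else NullLoggerSketch()
    total = 0
    for step in range(num_steps):
        total += step
        logger.store(tab="train", total=total)
        logger.write(step)
    return total


result = run_steps(4)
```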
Net
- class fsrl.utils.net.common.ActorCritic(actor: Module, critics: List | Module)
Bases: Module
An actor-critic network for parsing parameters.
- Parameters:
actor (nn.Module) – the actor network.
critics (Union[List, nn.Module]) – the critic network, or a list of critic networks.
- class fsrl.utils.net.continuous.DoubleCritic(preprocess_net1: Module, preprocess_net2: Module, hidden_sizes: Sequence[int] = (), device: str | int | torch.device = 'cpu', preprocess_net_output_dim: int | None = None, linear_layer: Type[Linear] = torch.nn.Linear, flatten_input: bool = True)
Bases: Module
Double critic network. Creates two critics operating in a continuous action space, each with the structure preprocess_net -> 1 (Q-value).
- Parameters:
preprocess_net1 – a self-defined preprocess_net that outputs a flattened hidden state.
preprocess_net2 – a self-defined preprocess_net that outputs a flattened hidden state.
hidden_sizes – a sequence of ints for constructing the MLP after preprocess_net. Defaults to an empty sequence, in which case the MLP contains only a single linear layer.
preprocess_net_output_dim (int) – the output dimension of preprocess_net.
linear_layer – the module to use as the linear layer. Defaults to nn.Linear.
flatten_input (bool) – whether to flatten the input data for the last layer. Defaults to True.
For advanced usage (how to customize the network), please refer to tianshou’s build_the_network tutorial.
See also
Please refer to tianshou’s Net class for an example of how preprocess_net is suggested to be defined.
- class fsrl.utils.net.continuous.SingleCritic(preprocess_net: Module, hidden_sizes: Sequence[int] = (), device: str | int | torch.device = 'cpu', preprocess_net_output_dim: int | None = None, linear_layer: Type[Linear] = torch.nn.Linear, flatten_input: bool = True)
Bases: Critic
Simple critic network. Creates a critic operating in a continuous action space with the structure preprocess_net -> 1 (Q-value). It differs from tianshou’s original Critic in that the output is a list, which keeps the API consistent with DoubleCritic.
- Parameters:
preprocess_net – a self-defined preprocess_net that outputs a flattened hidden state.
hidden_sizes – a sequence of ints for constructing the MLP after preprocess_net. Defaults to an empty sequence, in which case the MLP contains only a single linear layer.
preprocess_net_output_dim (int) – the output dimension of preprocess_net.
linear_layer – the module to use as the linear layer. Defaults to nn.Linear.
flatten_input (bool) – whether to flatten the input data for the last layer. Defaults to True.
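Returning a list from both critic types lets downstream code treat them uniformly. The sketch below shows the idea with plain Python callables instead of torch modules. The classes and the toy Q-functions are hypothetical, the list output of the double critic is assumed from the description above, and taking the minimum over critics is just one common way to consume such a list.

```python
class SingleCriticSketch:
    def __init__(self, q_fn):
        self.q_fn = q_fn

    def __call__(self, obs, act):
        # One Q-value, wrapped in a list for a uniform interface.
        return [self.q_fn(obs, act)]


class DoubleCriticSketch:
    def __init__(self, q_fn1, q_fn2):
        self.q_fns = (q_fn1, q_fn2)

    def __call__(self, obs, act):
        # Two independent Q-value estimates.
        return [q_fn(obs, act) for q_fn in self.q_fns]


def conservative_q(critic, obs, act):
    # Works for either critic type because both return a list.
    return min(critic(obs, act))


single = SingleCriticSketch(lambda o, a: o + a)
double = DoubleCriticSketch(lambda o, a: o + a, lambda o, a: o * a - 2.0)
q_single = conservative_q(single, 2.0, 3.0)
q_double = conservative_q(double, 2.0, 3.0)
```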
LagrangianOptimizer
- class fsrl.utils.LagrangianOptimizer(pid: tuple = (0.05, 0.0005, 0.1))
Bases: object
Lagrangian multiplier optimizer based on a PID controller, following https://proceedings.mlr.press/v119/stooke20a.html.
- Parameters:
pid (tuple) – the coefficients of the PID controller: kp, ki, kd.
Note
If kp and kd are 0, it reduces to a standard SGD-based Lagrangian optimizer.
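The PID update from the cited paper can be sketched in a few lines. PIDLagrangianSketch is a hypothetical, simplified class written for illustration (it omits details such as smoothing of the cost signal) and is not the fsrl implementation; step() and cost_limit are assumed names.

```python
class PIDLagrangianSketch:
    """Simplified PID controller for a Lagrange multiplier (illustrative)."""

    def __init__(self, pid=(0.05, 0.0005, 0.1)):
        self.kp, self.ki, self.kd = pid
        self.integral = 0.0
        self.prev_cost = 0.0
        self.multiplier = 0.0

    def step(self, cost, cost_limit):
        error = cost - cost_limit
        # Integral term: accumulated constraint violation, kept non-negative.
        self.integral = max(0.0, self.integral + error)
        # Derivative term: only reacts when the cost is increasing.
        derivative = max(0.0, cost - self.prev_cost)
        self.prev_cost = cost
        # The multiplier itself is projected onto the non-negative reals.
        self.multiplier = max(
            0.0,
            self.kp * error + self.ki * self.integral + self.kd * derivative,
        )
        return self.multiplier


# With kp = kd = 0 only the integral term remains, which is projected
# gradient ascent on the multiplier with step size ki.
integral_only = PIDLagrangianSketch(pid=(0.0, 0.1, 0.0))
trace = [integral_only.step(cost, cost_limit=10.0) for cost in (12.0, 14.0, 9.0)]
```

In the integral-only case the multiplier grows while the cost exceeds the limit and shrinks once the cost drops below it, which is the behavior the note above refers to.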
ExperimentUtils
- fsrl.utils.exp_util.seed_all(seed=1029, others: list | None = None) -> None
Fix the seeds of random, numpy, torch, and the objects passed in others.
- Parameters:
seed (int) – the random seed. Defaults to 1029.
others (Optional[list]) – other objects to be seeded. Defaults to None.
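A seeding helper of this kind usually looks like the sketch below. seed_all_sketch is a hypothetical re-implementation: numpy and torch are seeded only if they are installed, and each entry of others is assumed to expose a seed(seed) method.

```python
import random


def seed_all_sketch(seed=1029, others=None):
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # numpy is optional in this sketch
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # torch is optional in this sketch
    # Assumption: extra objects are seeded through a `seed` method.
    for obj in others or []:
        if hasattr(obj, "seed"):
            obj.seed(seed)


extra_rng = random.Random()
seed_all_sketch(7, others=[extra_rng])
first = (random.random(), extra_rng.random())
seed_all_sketch(7, others=[extra_rng])
second = (random.random(), extra_rng.random())
```

Calling the helper twice with the same seed reproduces the same random draws, which is the property an experiment script relies on.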
- fsrl.utils.exp_util.load_config_and_model(path: str, best: bool = False)
Load the configuration and trained model from a specified directory.
- Parameters:
path – the directory path where the configuration and trained model are stored.
best – whether to load the best-performing model or the most recent one. Defaults to False.
- Returns:
a tuple containing the configuration dictionary and the trained model.
- Raises:
ValueError – if the specified directory does not exist.
- fsrl.utils.exp_util.to_string(values)
Recursively convert a sequence or dictionary of values to a string representation.
- Parameters:
values – the sequence or dictionary of values to be converted to a string.
- Returns:
a string representation of the input values.
- fsrl.utils.exp_util.auto_name(default_cfg: dict, current_cfg: dict, prefix: str = '', suffix: str = '', skip_keys: list = ['task', 'reward_threshold', 'logdir', 'worker', 'project', 'group', 'name', 'prefix', 'suffix', 'save_interval', 'render', 'verbose', 'save_ckpt', 'training_num', 'testing_num', 'epoch', 'device', 'thread'], key_abbre: dict = {'cost_limit': 'cost', 'estep_dual_lr': 'elr', 'estep_iter_num': 'enum', 'estep_kl': 'ekl', 'mstep_dual_lr': 'mlr', 'mstep_iter_num': 'mnum', 'mstep_kl_mu': 'kl_mu', 'mstep_kl_std': 'kl_std', 'update_per_step': 'update'}) -> str
Automatically generate the experiment name by comparing the current config with the default one.
- Parameters:
default_cfg (dict) – a dictionary containing the default configuration values.
current_cfg (dict) – a dictionary containing the current configuration values.
prefix (str) – (optional) a string to be added at the beginning of the generated name.
suffix (str) – (optional) a string to be added at the end of the generated name.
skip_keys (list) – (optional) a list of keys to be skipped when generating the name.
key_abbre (dict) – (optional) a dictionary containing abbreviations for keys in the generated name.
- Returns:
a string representing the generated experiment name.
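The naming scheme can be sketched as follows. Both helpers are hypothetical re-implementations for illustration: the "_" separator, the concatenation of key and value, the sorted key order, the shortened skip_keys default, and the "default" fallback name are all assumptions of this sketch, and the real functions may format names differently.

```python
def to_string_sketch(values):
    # Recursively flatten sequences and dicts into one "_"-joined string.
    if isinstance(values, dict):
        return "_".join(to_string_sketch(values[k]) for k in sorted(values))
    if isinstance(values, (list, tuple)):
        return "_".join(to_string_sketch(v) for v in values)
    return str(values)


def auto_name_sketch(default_cfg, current_cfg, prefix="", suffix="",
                     skip_keys=("task", "seed"), key_abbre=None):
    key_abbre = key_abbre or {}
    parts = [prefix] if prefix else []
    for key in sorted(default_cfg):
        if key in skip_keys or key not in current_cfg:
            continue
        # Only settings that differ from the default show up in the name.
        if current_cfg[key] != default_cfg[key]:
            label = key_abbre.get(key, key)
            parts.append(label + to_string_sketch(current_cfg[key]))
    if suffix:
        parts.append(suffix)
    return "_".join(parts) if parts else "default"


default_cfg = {"task": "A", "cost_limit": 10, "lr": 0.001, "hidden": [64, 64]}
current_cfg = {"task": "B", "cost_limit": 20, "lr": 0.001, "hidden": [128, 128]}
name = auto_name_sketch(default_cfg, current_cfg, prefix="cvpo",
                        key_abbre={"cost_limit": "cost"})
```

Skipped keys such as the task never appear in the name, and abbreviations keep names short when many settings change at once.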