Conversation
clessig
left a comment
There was a problem hiding this comment.
Thanks for the work. We probably need some discussion of how to write it and to make sure we don't need to modify too much code related to reading/writing config.
| overrides={}, | ||
| ): | ||
| model_creation_device = "meta" if with_ddp and with_fsdp else "cuda" | ||
| model_creation_device = "meta" if mcf.with_ddp and with_fsdp else "cuda" |
There was a problem hiding this comment.
This is inconsistent since with_ddp and with_fsdp are specified in the config. I think we should remove with_ddp in the config--DDP will always be used when possible and this is standard and correct behaviour.
We need to remove it in all configs then.
| halflife_steps=target_and_aux_calc_params.get("ema_halflife_in_thousands", 1e-3), | ||
| rampup_ratio=target_and_aux_calc_params.get("ema_ramp_up_ratio", 0.09), | ||
| is_model_sharded=(cf.with_ddp and cf.with_fsdp), | ||
| is_model_sharded=(with_ddp and cf.with_fsdp), |
There was a problem hiding this comment.
We re-determine this multiple times. I think we should introduce a flag in the mcf--this is more robust.
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
| class MutableConfig(): |
There was a problem hiding this comment.
I don't think this is the right name. It's not mutable during the run (except for istep) but it's determined at run initialization time. If I start from istep then this is more the run state.
|
|
||
| class MutableConfig(): | ||
| """ | ||
| lalala |
| self.with_ddp = None | ||
| self.run_history = [] | ||
|
|
||
| print("Trolololo 1") |
| lalala | ||
| """ | ||
|
|
||
| def __init__(self): |
There was a problem hiding this comment.
Shouldn't this class have serialization and deserialization code? Also: a lot of code around the regular config is to ensure it is loaded and stored correctly. This has not been touched?
There was a problem hiding this comment.
cf is just a DictConfig from OmegaConf as far as I can tell. Is there any built in code in DictConfig to handle it?
Wondering if it would be easier to just make RunState (former MutableConfig) also a DictConfig? @tjhunter
There was a problem hiding this comment.
OmegaConf.DictConfig makes a lot of sense to me: it has some useful functionality (like merge()) that we want to have.
|
|
||
| # adjust len to split loading across all workers and ensure it is multiple of batch_size | ||
| len_chunk = ((self.len // cf.world_size) // self.batch_size) * self.batch_size | ||
| len_chunk = ((self.len // mcf.world_size) // self.batch_size) * self.batch_size |
| mcf.with_ddp = world_size > 1 | ||
|
|
||
| return cf | ||
| return cf, mcf |
There was a problem hiding this comment.
We don't need to return cf
There was a problem hiding this comment.
general.run_id and data_loader_rng_seed are also computed in this routine. Should they also be moved to runstate?
There was a problem hiding this comment.
There is the decision if we consider run_id as a mutable object or not. It's (usually) determined at init time but then always constant. I hence would prefer if it's part of the immutable config.
Description
Beginning of a mutable config class for values that change during a run.
Issue Number
Closes #1528
Is this PR a draft? Mark it as draft.
Checklist before asking for review
./scripts/actions.sh lint./scripts/actions.sh unit-test./scripts/actions.sh integration-testlaunch-slurm.py --time 60