Trainer¶
A convenient extension of the HuggingFace `Trainer`, plus utility helpers for training and evaluation. It streamlines device placement, metrics computation, and the saving of generation results.
Class¶
Bases: Trainer
Extension of the HuggingFace `transformers.Trainer`.
The trainer adds task-specific helpers that simplify training generative Transformer models. It accepts all the usual HuggingFace `Trainer` keyword arguments and introduces no new parameters; the constructor therefore forwards its arguments verbatim.
Source code in src/calt/trainer/trainer.py
evaluate_and_save_generation ¶
evaluate_and_save_generation(max_length: int = 512)
Run greedy/beam-search generation on the evaluation set.
The helper decodes the model outputs into strings, stores the results in `eval_results.json` inside the trainer's output directory, and finally computes exact-match accuracy between the generated and reference sequences.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_length` | `int` | Maximum generation length. | `512` |
Returns:

| Type | Description |
|---|---|
| `float` | Exact-match accuracy in the `[0, 1]` interval. |
Source code in src/calt/trainer/trainer.py
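The saving and scoring step described above can be sketched as follows. This is an illustrative sketch, not the library's actual implementation; the function name `save_and_score` and its signature are assumptions, while the `eval_results.json` filename and the exact-match metric come from the documentation.

```python
import json
from pathlib import Path

def save_and_score(generated: list[str], references: list[str], output_dir: str) -> float:
    """Store decoded generations next to their references and score them (sketch)."""
    results = [{"generated": g, "reference": r} for g, r in zip(generated, references)]
    # Mirror the documented behaviour: results land in eval_results.json
    # inside the trainer's output directory.
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "eval_results.json", "w") as f:
        json.dump(results, f, indent=2)
    # Exact match: fraction of generations identical to their reference string.
    matches = sum(g == r for g, r in zip(generated, references))
    return matches / len(references) if references else 0.0
```

Because the score is a plain ratio of string equalities, it lands in `[0, 1]` as the Returns table states.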
Utilities¶
Count the number of CUDA devices visible to the current process.
The function first inspects the environment variable `CUDA_VISIBLE_DEVICES`. When set, only the GPU indices listed there are considered visible and contribute to the count. When unset, the function falls back to `torch.cuda.device_count()` and returns the total number of devices detected by the NVIDIA runtime.
Returns:

| Type | Description |
|---|---|
| `int` | Number of GPUs that the current process is allowed to use. Returns `0` when no GPU is available or PyTorch was compiled without CUDA support. |
Source code in src/calt/trainer/utils.py
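The documented behaviour can be sketched as below. The function name `count_cuda_devices` is an assumption (the source does not show it), and the lazy `torch` import is an illustrative choice so the sketch runs even without PyTorch installed; only the `CUDA_VISIBLE_DEVICES` / `torch.cuda.device_count()` logic comes from the documentation.

```python
import os

def count_cuda_devices() -> int:
    """Count the CUDA devices visible to the current process (illustrative sketch)."""
    visible = os.environ.get("CUDA_VISIBLE_DEVICES")
    if visible is not None:
        # Only the GPU indices listed in the variable are visible;
        # an empty string means every GPU is hidden.
        return len([d for d in visible.split(",") if d.strip()])
    try:
        import torch  # imported lazily so the sketch works without PyTorch
        return torch.cuda.device_count()
    except ImportError:
        return 0
```

For example, a process launched with `CUDA_VISIBLE_DEVICES=0,1` reports two devices regardless of how many GPUs the machine actually has.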
Initialise a Weights & Biases tracking session.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `project` | `str` | Project name under which runs will appear in the WandB dashboard. | `'transformer-algebra'` |
| `entity` | `str \| None` | WandB entity (user or team) that owns the project. When `None`, WandB's default entity is used. | `None` |
| `**extra_config` | | Additional key-value pairs inserted into the run configuration. Useful for hyper-parameter sweeps or ad-hoc experiments. | `{}` |
Source code in src/calt/trainer/utils.py