MyCaffe
1.12.2.41
Deep learning software for Windows C# programmers.
MyCaffe.param.gpt.TransformerBlockParameter Class Reference
Specifies the parameters for the TransformerBlockLayer.
Public Types | |
enum | BLOCK_TYPE { CAUSAL_SELF_ATTENTION = 0 , ENCODER , DECODER } |
Defines the type of transformer block. | |
enum | ACTIVATION { RELU = 0 , GELU = 1 , GELU_BERT = 2 } |
Defines the various activations supported by the TransformerBlock. | |
Public Types inherited from MyCaffe.param.LayerParameterBase | |
enum | LABEL_TYPE { NONE , SINGLE , MULTIPLE , ONLY_ONE } |
Defines the label type. | |
Public Member Functions | |
TransformerBlockParameter () | |
Constructor for the parameter. | |
override object | Load (System.IO.BinaryReader br, bool bNewInstance=true) |
Load the parameter from a binary reader. | |
override void | Copy (LayerParameterBase src) |
Copy one parameter to another. | |
override LayerParameterBase | Clone () |
Creates a new copy of this instance of the parameter. | |
override RawProto | ToProto (string strName) |
Convert the parameter into a RawProto. | |
Public Member Functions inherited from MyCaffe.param.LayerParameterBase | |
LayerParameterBase () | |
Constructor for the parameter. | |
virtual string | PrepareRunModelInputs () |
This method gives derived classes a chance to specify the model inputs required by the run model. | |
virtual void | PrepareRunModel (LayerParameter p) |
This method gives derived classes a chance to prepare the layer for a run model. | |
void | Save (BinaryWriter bw) |
Save this parameter to a binary writer. | |
abstract object | Load (BinaryReader br, bool bNewInstance=true) |
Load the parameter from a binary reader. | |
Public Member Functions inherited from MyCaffe.basecode.BaseParameter | |
BaseParameter () | |
Constructor for the parameter. | |
virtual bool | Compare (BaseParameter p) |
Compare this parameter to another parameter. | |
Static Public Member Functions | |
static TransformerBlockParameter | FromProto (RawProto rp) |
Parses the parameter from a RawProto. | |
Static Public Member Functions inherited from MyCaffe.basecode.BaseParameter | |
static double | ParseDouble (string strVal) |
Parse double values, first using the US culture if the decimal separator is '.', then using the native culture, and lastly trying the US culture to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator. | |
static bool | TryParse (string strVal, out double df) |
Try to parse double values, first using the US culture if the decimal separator is '.', then using the native culture, and lastly trying the US culture to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator. | |
static float | ParseFloat (string strVal) |
Parse float values, first using the US culture if the decimal separator is '.', then using the native culture, and lastly trying the US culture to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator. | |
static bool | TryParse (string strVal, out float f) |
Try to parse float values, first using the US culture if the decimal separator is '.', then using the native culture, and lastly trying the US culture to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator. | |
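The culture fallback described for ParseDouble and TryParse can be sketched as follows. This is a minimal illustration based only on the description above, not the MyCaffe source: the class name CultureTolerantParser, the method name TryParseDouble, and the explicit native-culture argument are assumptions made for this example.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of MyCaffe) sketching the fallback order
// described for BaseParameter.ParseDouble / TryParse.
public static class CultureTolerantParser
{
    private static readonly CultureInfo US = CultureInfo.GetCultureInfo("en-US");

    public static bool TryParseDouble(string strVal, CultureInfo native, out double df)
    {
        // NumberStyles.Float does not allow group separators, so "1.5" is
        // rejected (rather than read as 15) in cultures where '.' groups digits.
        const NumberStyles style = NumberStyles.Float;

        // 1. When the native decimal separator is '.', the US culture applies directly.
        if (native.NumberFormat.NumberDecimalSeparator == ".")
            return double.TryParse(strVal, style, US, out df);

        // 2. Otherwise try the native culture first (e.g. "1,5" in de-DE).
        if (double.TryParse(strVal, style, native, out df))
            return true;

        // 3. Lastly fall back to the US culture, which handles prototype text
        //    written with '.' but parsed on a machine that uses another separator.
        return double.TryParse(strVal, style, US, out df);
    }
}
```

In application code the second argument would typically be CultureInfo.CurrentCulture; it is passed explicitly here only so the behavior can be exercised for a specific culture.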
Properties | |
bool | enable_layernorm_cuda_impl [get, set] |
Specifies to use the low-level full cuda implementation of LayerNorm (default = false). | |
ACTIVATION | activation [get, set] |
Specifies the activation type to use (default = RELU). | |
BLOCK_TYPE | block_type [get, set] |
Specifies the type of transformer block to configure. | |
uint | layers [get, set] |
The number of layers (transformer blocks) used. | |
uint | heads [get, set] |
The number of heads used. | |
uint | embed [get, set] |
Specifies the size of the embed. | |
uint | block_size [get, set] |
Specifies the size of the block. | |
double | attn_dropout [get, set] |
Specifies the dropout probability used on the attention weights. | |
double | resid_dropout [get, set] |
Specifies the dropout probability used on the residual weights. | |
Specifies the parameters for the TransformerBlockLayer.
Definition at line 15 of file TransformerBlockParameter.cs.
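The numeric properties listed above are usually related to one another in multi-head attention. The sketch below checks those relationships before a block is configured. It is a standalone illustration, not MyCaffe code: the class TransformerBlockChecks and its methods are hypothetical, and the rules it enforces (embed divisible by heads, dropout probabilities in [0, 1)) are common transformer conventions assumed here rather than constraints documented on this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of MyCaffe). Argument names mirror the
// documented properties: layers, heads, embed, block_size, attn_dropout,
// resid_dropout.
public static class TransformerBlockChecks
{
    // Per-head width under the usual convention that embed is split evenly
    // across heads (an assumption, not confirmed by this page).
    public static uint HeadSize(uint embed, uint heads)
    {
        if (heads == 0)
            throw new ArgumentException("heads must be greater than zero.");
        if (embed % heads != 0)
            throw new ArgumentException("embed must be divisible by heads.");
        return embed / heads;
    }

    // Returns a list of human-readable problems; an empty list means the
    // values passed every check in this sketch.
    public static List<string> Validate(uint layers, uint heads, uint embed,
        uint blockSize, double attnDropout, double residDropout)
    {
        var problems = new List<string>();
        if (layers == 0) problems.Add("layers must be greater than zero.");
        if (heads == 0) problems.Add("heads must be greater than zero.");
        else if (embed % heads != 0) problems.Add("embed must be divisible by heads.");
        if (blockSize == 0) problems.Add("block_size must be greater than zero.");
        if (attnDropout < 0 || attnDropout >= 1) problems.Add("attn_dropout must be in [0, 1).");
        if (residDropout < 0 || residDropout >= 1) problems.Add("resid_dropout must be in [0, 1).");
        return problems;
    }
}
```

For example, under these assumptions embed = 192 with heads = 6 gives a per-head width of 32, while heads = 5 would be reported as a problem because 192 is not divisible by 5.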
enum MyCaffe.param.gpt.TransformerBlockParameter.ACTIVATION
Defines the various activations supported by the TransformerBlock.
Enumerator | |
---|---|
RELU | Specifies to use the RELU activation (default). |
GELU | Specifies to use the GELU activation. |
GELU_BERT | Specifies to use the special GELU activation used in BERT models. |
Definition at line 49 of file TransformerBlockParameter.cs.
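For reference, the standard textbook definitions of these activations are given below. This page does not state which formula each enumerator uses internally, so the mapping of GELU and GELU_BERT to the exact form or to the tanh approximation is left open here; the formulas are general background, not a description of the MyCaffe implementation.

```latex
\begin{align*}
\operatorname{ReLU}(x) &= \max(0, x) \\
\operatorname{GELU}(x) &= x\,\Phi(x) = \frac{x}{2}\left(1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right) \\
\operatorname{GELU}_{\tanh}(x) &\approx \frac{x}{2}\left(1 + \tanh\!\left(\sqrt{\frac{2}{\pi}}\,\bigl(x + 0.044715\,x^{3}\bigr)\right)\right)
\end{align*}
```

Here $\Phi$ is the cumulative distribution function of the standard normal distribution, and the last line is the widely used tanh approximation of GELU.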
enum MyCaffe.param.gpt.TransformerBlockParameter.BLOCK_TYPE
Defines the type of transformer block.
Definition at line 30 of file TransformerBlockParameter.cs.
MyCaffe.param.gpt.TransformerBlockParameter.TransformerBlockParameter ( )
Constructor for the parameter.
Definition at line 66 of file TransformerBlockParameter.cs.
override LayerParameterBase MyCaffe.param.gpt.TransformerBlockParameter.Clone ( ) [virtual]
Creates a new copy of this instance of the parameter.
Implements MyCaffe.param.LayerParameterBase.
Definition at line 186 of file TransformerBlockParameter.cs.
override void MyCaffe.param.gpt.TransformerBlockParameter.Copy ( LayerParameterBase src ) [virtual]
Copy one parameter to another.
src | Specifies the parameter to copy. |
Implements MyCaffe.param.LayerParameterBase.
Definition at line 170 of file TransformerBlockParameter.cs.
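static TransformerBlockParameter MyCaffe.param.gpt.TransformerBlockParameter.FromProto ( RawProto rp ) [static]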
Parses the parameter from a RawProto.
rp | Specifies the RawProto to parse. |
Definition at line 220 of file TransformerBlockParameter.cs.
override object MyCaffe.param.gpt.TransformerBlockParameter.Load ( System.IO.BinaryReader br, bool bNewInstance = true )
Load the parameter from a binary reader.
br | Specifies the binary reader. |
bNewInstance | When true a new instance is created (the default), otherwise the existing instance is loaded from the binary reader. |
Definition at line 158 of file TransformerBlockParameter.cs.
override RawProto MyCaffe.param.gpt.TransformerBlockParameter.ToProto ( string strName ) [virtual]
Convert the parameter into a RawProto.
strName | Specifies the name to associate with the RawProto. |
Implements MyCaffe.basecode.BaseParameter.
Definition at line 198 of file TransformerBlockParameter.cs.
ACTIVATION MyCaffe.param.gpt.TransformerBlockParameter.activation [get, set]
Specifies the activation type to use (default = RELU).
Definition at line 86 of file TransformerBlockParameter.cs.
double MyCaffe.param.gpt.TransformerBlockParameter.attn_dropout [get, set]
Specifies the dropout probability used on the attention weights.
Definition at line 142 of file TransformerBlockParameter.cs.
uint MyCaffe.param.gpt.TransformerBlockParameter.block_size [get, set]
Specifies the size of the block.
Definition at line 133 of file TransformerBlockParameter.cs.
BLOCK_TYPE MyCaffe.param.gpt.TransformerBlockParameter.block_type [get, set]
Specifies the type of transformer block to configure.
Definition at line 95 of file TransformerBlockParameter.cs.
uint MyCaffe.param.gpt.TransformerBlockParameter.embed [get, set]
Specifies the size of the embed.
Definition at line 124 of file TransformerBlockParameter.cs.
bool MyCaffe.param.gpt.TransformerBlockParameter.enable_layernorm_cuda_impl [get, set]
Specifies to use the low-level full cuda implementation of LayerNorm (default = false).
The cuda implementation runs around 30% faster when using float base types.
Definition at line 77 of file TransformerBlockParameter.cs.
uint MyCaffe.param.gpt.TransformerBlockParameter.heads [get, set]
The number of heads used.
Definition at line 115 of file TransformerBlockParameter.cs.
uint MyCaffe.param.gpt.TransformerBlockParameter.layers [get, set]
The number of layers (transformer blocks) used.
Definition at line 105 of file TransformerBlockParameter.cs.
double MyCaffe.param.gpt.TransformerBlockParameter.resid_dropout [get, set]
Specifies the dropout probability used on the residual weights.
Definition at line 151 of file TransformerBlockParameter.cs.