MyCaffe  1.12.2.41
Deep learning software for Windows C# programmers.
MyCaffe.param.gpt.TransformerBlockParameter Class Reference

Specifies the parameters for the TransformerBlockLayer. More...

Inheritance diagram for MyCaffe.param.gpt.TransformerBlockParameter:
MyCaffe.param.LayerParameterBase MyCaffe.basecode.BaseParameter MyCaffe.basecode.IBinaryPersist

Public Types

enum  BLOCK_TYPE { CAUSAL_SELF_ATTENTION = 0 , ENCODER , DECODER }
 Defines the type of transformer block. More...
 
enum  ACTIVATION { RELU = 0 , GELU = 1 , GELU_BERT = 2 }
 Defines the various activations supported by the TransformerBlock. More...
 
- Public Types inherited from MyCaffe.param.LayerParameterBase
enum  LABEL_TYPE { NONE , SINGLE , MULTIPLE , ONLY_ONE }
 Defines the label type. More...
 

Public Member Functions

 TransformerBlockParameter ()
 Constructor for the parameter. More...
 
override object Load (System.IO.BinaryReader br, bool bNewInstance=true)
 Load the parameter from a binary reader. More...
 
override void Copy (LayerParameterBase src)
 Copy one parameter to another. More...
 
override LayerParameterBase Clone ()
 Creates a new copy of this instance of the parameter. More...
 
override RawProto ToProto (string strName)
 Convert the parameter into a RawProto. More...
 
- Public Member Functions inherited from MyCaffe.param.LayerParameterBase
 LayerParameterBase ()
 Constructor for the parameter. More...
 
virtual string PrepareRunModelInputs ()
 This method gives derived classes a chance to specify the model inputs required by the run model. More...
 
virtual void PrepareRunModel (LayerParameter p)
 This method gives derived classes a chance to prepare the layer for a run model. More...
 
void Save (BinaryWriter bw)
 Save this parameter to a binary writer. More...
 
abstract object Load (BinaryReader br, bool bNewInstance=true)
 Load the parameter from a binary reader. More...
 
- Public Member Functions inherited from MyCaffe.basecode.BaseParameter
 BaseParameter ()
 Constructor for the parameter. More...
 
virtual bool Compare (BaseParameter p)
 Compare this parameter to another parameter. More...
 

Static Public Member Functions

static TransformerBlockParameter FromProto (RawProto rp)
 Parses the parameter from a RawProto. More...
 
- Static Public Member Functions inherited from MyCaffe.basecode.BaseParameter
static double ParseDouble (string strVal)
 Parse double values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator yet are parsed in a culture that does not use '.' as the decimal separator. More...
 
static bool TryParse (string strVal, out double df)
 Parse double values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator yet are parsed in a culture that does not use '.' as the decimal separator. More...
 
static float ParseFloat (string strVal)
 Parse float values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator yet are parsed in a culture that does not use '.' as the decimal separator. More...
 
static bool TryParse (string strVal, out float f)
 Parse float values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator yet are parsed in a culture that does not use '.' as the decimal separator. More...
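
The fallback order described for ParseDouble, ParseFloat and TryParse can be sketched in plain .NET as follows. This is only an illustration of the documented order (US culture when the native separator is '.', then the native culture, then the US culture again); the class and method names are hypothetical, and the actual MyCaffe implementation may differ in its details.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only; not part of MyCaffe.
public static class CultureParseSketch
{
    private static readonly CultureInfo UsCulture = CultureInfo.GetCultureInfo("en-US");

    public static bool TryParseDouble(string strVal, out double df)
    {
        CultureInfo native = CultureInfo.CurrentCulture;

        // NumberStyles.Float excludes thousands separators, so a '.' is not
        // read as a group separator by cultures such as de-DE.
        const NumberStyles style = NumberStyles.Float;

        // 1. If the native culture already uses '.', parse with the US culture.
        if (native.NumberFormat.NumberDecimalSeparator == "." &&
            double.TryParse(strVal, style, UsCulture, out df))
            return true;

        // 2. Next, try the native culture (e.g. "0,5" under de-DE).
        if (double.TryParse(strVal, style, native, out df))
            return true;

        // 3. Lastly, fall back to the US culture (e.g. "0.5" under de-DE),
        //    which covers prototypes written with '.' on a ',' system.
        return double.TryParse(strVal, style, UsCulture, out df);
    }
}
```

The choice of NumberStyles.Float is an assumption made for this sketch: it keeps step 2 from accepting "0.5" as a grouped integer on cultures where '.' is the group separator, so step 3 can handle it.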
 

Properties

bool enable_layernorm_cuda_impl [get, set]
 Specifies to use the low-level full CUDA implementation of LayerNorm (default = false). More...
 
ACTIVATION activation [get, set]
 Specifies the activation type to use (default = RELU). More...
 
BLOCK_TYPE block_type [get, set]
 Specifies the type of transformer block to configure. More...
 
uint layers [get, set]
 The number of layers (transformer blocks) used. More...
 
uint heads [get, set]
 The number of heads used. More...
 
uint embed [get, set]
 Specifies the size of the embed. More...
 
uint block_size [get, set]
 Specifies the size of the block. More...
 
double attn_dropout [get, set]
 Specifies the dropout probability used on the attention weights. More...
 
double resid_dropout [get, set]
 Specifies the dropout probability used on the residual weights. More...
 

Detailed Description

Specifies the parameters for the TransformerBlockLayer.

Definition at line 15 of file TransformerBlockParameter.cs.
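
As a minimal sketch of how the properties listed on this page fit together, the following configures a parameter instance. It assumes a project that references the MyCaffe assemblies; the wrapper class name and all numeric values are hypothetical and chosen only for illustration, not recommended settings.

```csharp
using MyCaffe.param.gpt;

// Hypothetical wrapper class; only the TransformerBlockParameter members
// used below come from this reference page.
public static class TransformerBlockConfigSketch
{
    public static TransformerBlockParameter Create()
    {
        TransformerBlockParameter p = new TransformerBlockParameter();

        // Enum values as listed under Public Types.
        p.block_type = TransformerBlockParameter.BLOCK_TYPE.CAUSAL_SELF_ATTENTION;
        p.activation = TransformerBlockParameter.ACTIVATION.GELU;

        // Illustrative sizes only (assumed values).
        p.layers = 6;
        p.heads = 6;
        p.embed = 192;
        p.block_size = 128;

        // Dropout probabilities for the attention and residual paths.
        p.attn_dropout = 0.1;
        p.resid_dropout = 0.1;

        // Documented default is false; set explicitly here for clarity.
        p.enable_layernorm_cuda_impl = false;

        return p;
    }
}
```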

Member Enumeration Documentation

◆ ACTIVATION

Defines the various activations supported by the TransformerBlock.

Enumerator
RELU 

Specifies to use the RELU activation (default).

GELU 

Specifies to use the GELU activation.

GELU_BERT 

Specifies to use the special GELU activation used in BERT models.

Definition at line 49 of file TransformerBlockParameter.cs.

◆ BLOCK_TYPE

Defines the type of transformer block.

Enumerator
CAUSAL_SELF_ATTENTION 

Specifies to configure a causal self-attention block.

ENCODER 

Specifies to configure an encoder transformer block.

DECODER 

Specifies to configure a decoder transformer block.

Definition at line 30 of file TransformerBlockParameter.cs.

Constructor & Destructor Documentation

◆ TransformerBlockParameter()

MyCaffe.param.gpt.TransformerBlockParameter.TransformerBlockParameter ( )

Constructor for the parameter.

Definition at line 66 of file TransformerBlockParameter.cs.

Member Function Documentation

◆ Clone()

override LayerParameterBase MyCaffe.param.gpt.TransformerBlockParameter.Clone ( )
virtual

Creates a new copy of this instance of the parameter.

Returns
A new instance of this parameter is returned.

Implements MyCaffe.param.LayerParameterBase.

Definition at line 186 of file TransformerBlockParameter.cs.

◆ Copy()

override void MyCaffe.param.gpt.TransformerBlockParameter.Copy ( LayerParameterBase  src)
virtual

Copy one parameter to another.

Parameters
src  Specifies the parameter to copy.

Implements MyCaffe.param.LayerParameterBase.

Definition at line 170 of file TransformerBlockParameter.cs.

◆ FromProto()

static TransformerBlockParameter MyCaffe.param.gpt.TransformerBlockParameter.FromProto ( RawProto  rp)
static

Parses the parameter from a RawProto.

Parameters
rp  Specifies the RawProto to parse.
Returns
A new instance of the parameter is returned.

Definition at line 220 of file TransformerBlockParameter.cs.

◆ Load()

override object MyCaffe.param.gpt.TransformerBlockParameter.Load ( System.IO.BinaryReader  br,
bool  bNewInstance = true 
)

Load the parameter from a binary reader.

Parameters
br  Specifies the binary reader.
bNewInstance  When true, a new instance is created (the default); otherwise the existing instance is loaded from the binary reader.
Returns
Returns an instance of the parameter.

Definition at line 158 of file TransformerBlockParameter.cs.

◆ ToProto()

override RawProto MyCaffe.param.gpt.TransformerBlockParameter.ToProto ( string  strName)
virtual

Convert the parameter into a RawProto.

Parameters
strName  Specifies the name to associate with the RawProto.
Returns
The new RawProto is returned.

Implements MyCaffe.basecode.BaseParameter.

Definition at line 198 of file TransformerBlockParameter.cs.
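
The member functions documented above (Clone, ToProto and FromProto) can be combined into a simple round trip. The sketch below assumes a project that references the MyCaffe assemblies and that RawProto lives in the MyCaffe.basecode namespace; the helper class name and the name passed to ToProto are hypothetical. Whether every field survives the round trip unchanged is not stated on this page, so the method reports the comparison rather than assuming its result.

```csharp
using MyCaffe.basecode;   // assumed namespace of RawProto
using MyCaffe.param.gpt;

// Hypothetical helper class for illustration only.
public static class TransformerBlockRoundTripSketch
{
    public static bool RoundTrip(TransformerBlockParameter p)
    {
        // Clone returns the base type LayerParameterBase, so cast back.
        TransformerBlockParameter copy = (TransformerBlockParameter)p.Clone();

        // The name given to ToProto is an assumption made for this sketch.
        RawProto proto = copy.ToProto("transformer_block_param");

        // FromProto is static and returns a new parameter instance.
        TransformerBlockParameter parsed = TransformerBlockParameter.FromProto(proto);

        // Compare a few of the documented properties against the original.
        return parsed.block_type == p.block_type
            && parsed.activation == p.activation
            && parsed.layers == p.layers
            && parsed.heads == p.heads
            && parsed.embed == p.embed
            && parsed.block_size == p.block_size;
    }
}
```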

Property Documentation

◆ activation

ACTIVATION MyCaffe.param.gpt.TransformerBlockParameter.activation
get, set

Specifies the activation type to use (default = RELU).

Definition at line 86 of file TransformerBlockParameter.cs.

◆ attn_dropout

double MyCaffe.param.gpt.TransformerBlockParameter.attn_dropout
get, set

Specifies the dropout probability used on the attention weights.

Definition at line 142 of file TransformerBlockParameter.cs.

◆ block_size

uint MyCaffe.param.gpt.TransformerBlockParameter.block_size
get, set

Specifies the size of the block.

Definition at line 133 of file TransformerBlockParameter.cs.

◆ block_type

BLOCK_TYPE MyCaffe.param.gpt.TransformerBlockParameter.block_type
get, set

Specifies the type of transformer block to configure.

Definition at line 95 of file TransformerBlockParameter.cs.

◆ embed

uint MyCaffe.param.gpt.TransformerBlockParameter.embed
get, set

Specifies the size of the embed.

Definition at line 124 of file TransformerBlockParameter.cs.

◆ enable_layernorm_cuda_impl

bool MyCaffe.param.gpt.TransformerBlockParameter.enable_layernorm_cuda_impl
get, set

Specifies to use the low-level full CUDA implementation of LayerNorm (default = false).

The CUDA implementation runs around 30% faster when using float base types.

Definition at line 77 of file TransformerBlockParameter.cs.

◆ heads

uint MyCaffe.param.gpt.TransformerBlockParameter.heads
get, set

The number of heads used.

Definition at line 115 of file TransformerBlockParameter.cs.

◆ layers

uint MyCaffe.param.gpt.TransformerBlockParameter.layers
get, set

The number of layers (transformer blocks) used.

Definition at line 105 of file TransformerBlockParameter.cs.

◆ resid_dropout

double MyCaffe.param.gpt.TransformerBlockParameter.resid_dropout
get, set

Specifies the dropout probability used on the residual weights.

Definition at line 151 of file TransformerBlockParameter.cs.
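
The properties above carry no documented constraints, but multi-head attention conventionally splits the embedding evenly across the heads, and dropout probabilities are conventionally kept in the range [0, 1). The following self-contained sketch checks those two conventions before values are assigned to heads, embed, attn_dropout and resid_dropout. Both the helper and the constraints themselves are assumptions for illustration; this page does not state that MyCaffe enforces them.

```csharp
using System;

// Hypothetical helper; not part of MyCaffe. The constraints checked here are
// common transformer conventions, assumed rather than taken from this page.
public static class TransformerBlockChecks
{
    // Returns the per-head size when embed splits evenly across heads.
    public static uint HeadSize(uint embed, uint heads)
    {
        if (heads == 0)
            throw new ArgumentException("heads must be greater than zero.", nameof(heads));
        if (embed % heads != 0)
            throw new ArgumentException("embed must be evenly divisible by heads.", nameof(embed));
        return embed / heads;
    }

    // Throws when a dropout probability falls outside [0, 1).
    public static void CheckDropout(double p, string name)
    {
        if (double.IsNaN(p) || p < 0.0 || p >= 1.0)
            throw new ArgumentOutOfRangeException(name, p, "dropout probability must be in [0, 1).");
    }
}
```

For example, HeadSize(192, 6) yields a per-head size of 32, while HeadSize(192, 5) throws because 192 is not divisible by 5.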


The documentation for this class was generated from the following file: