Specifies the parameters for the MultiheadAttentionLayer.

Inherits from MyCaffe.param.LayerParameterBase and MyCaffe.basecode.BaseParameter (inheritance diagram not shown).

Public Types
enum	WEIGHT_INIT { GPT , ENCODER_DECODER }
	Defines the weight initialization strategy.

Public Types inherited from MyCaffe.param.LayerParameterBase
enum	LABEL_TYPE { NONE , SINGLE , MULTIPLE , ONLY_ONE }
	Defines the label type.

Public Member Functions
	MultiheadAttentionParameter ()
	Constructor for the parameter.

override object	Load (System.IO.BinaryReader br, bool bNewInstance=true)
	Load the parameter from a binary reader.

override void	Copy (LayerParameterBase src)
	Copy one parameter to another.

override LayerParameterBase	Clone ()
	Creates a new copy of this instance of the parameter.

override RawProto	ToProto (string strName)
	Convert the parameter into a RawProto.

Public Member Functions inherited from MyCaffe.param.LayerParameterBase
	LayerParameterBase ()
	Constructor for the parameter.

virtual string	PrepareRunModelInputs ()
	This method gives derived classes a chance to specify the model inputs required by the run model.

virtual void	PrepareRunModel (LayerParameter p)
	This method gives derived classes a chance to prepare the layer for a run model.

void	Save (BinaryWriter bw)
	Save this parameter to a binary writer.

abstract object	Load (BinaryReader br, bool bNewInstance=true)
	Load the parameter from a binary reader.

Public Member Functions inherited from MyCaffe.basecode.BaseParameter
	BaseParameter ()
	Constructor for the parameter.

virtual bool	Compare (BaseParameter p)
	Compare this parameter to another parameter.

Static Public Member Functions
static MultiheadAttentionParameter	FromProto (RawProto rp)
	Parses the parameter from a RawProto.

Static Public Member Functions inherited from MyCaffe.basecode.BaseParameter
static double	ParseDouble (string strVal)
	Parse double values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator.

static bool	TryParse (string strVal, out double df)
	Parse double values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator.

static float	ParseFloat (string strVal)
	Parse float values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator.

static bool	TryParse (string strVal, out float f)
	Parse float values using the US culture if the decimal separator = '.', then using the native culture, and lastly trying the US culture again to handle prototypes that contain '.' as the separator but are parsed in a culture that does not use '.' as the decimal separator.
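
The fallback order described for ParseDouble and TryParse can be sketched roughly as follows. This is a minimal illustration, not MyCaffe's implementation: the class name CultureTolerantParser and the method TryParseDouble are hypothetical, CultureInfo.InvariantCulture stands in for the US culture (both use '.' as the decimal separator), and the reading that "the decimal separator" means the native culture's separator is an assumption.

```csharp
using System.Globalization;

// Hypothetical helper for illustration only; not part of MyCaffe.
public static class CultureTolerantParser
{
    public static bool TryParseDouble(string strVal, out double df)
    {
        // NumberStyles.Float deliberately excludes thousands separators, so a
        // culture that groups digits with '.' cannot misread "0.5" as 5.
        const NumberStyles style = NumberStyles.Float;
        CultureInfo us = CultureInfo.InvariantCulture;   // stand-in for the US culture
        CultureInfo native = CultureInfo.CurrentCulture;

        // 1. When the native decimal separator is '.', parse with the '.' culture first.
        if (native.NumberFormat.NumberDecimalSeparator == "." &&
            double.TryParse(strVal, style, us, out df))
            return true;

        // 2. Otherwise try the native culture.
        if (double.TryParse(strVal, style, native, out df))
            return true;

        // 3. Last resort: the '.' culture again, for prototype text written with '.'
        //    but read on a machine whose culture uses a different separator.
        return double.TryParse(strVal, style, us, out df);
    }
}
```

A call such as CultureTolerantParser.TryParseDouble("0.5", out double d) is intended to succeed regardless of the current culture; a float variant would follow the same three steps with float.TryParse.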

Properties
uint	layers `[get, set]`
	The number of layers (transformer blocks) used.

uint	heads `[get, set]`
	The number of heads used.

uint	embed `[get, set]`
	Specifies the size of the embedding.

uint	block_size `[get, set]`
	Specifies the size of the block.

double	attn_dropout `[get, set]`
	Specifies the dropout probability used on the attention weights.

double	resid_dropout `[get, set]`
	Specifies the dropout probability used on the residual weights.

WEIGHT_INIT	weight_init `[get, set]`
	Specifies the weight initialization strategy (default = ENCODER_DECODER).

Detailed Description

Specifies the parameters for the MultiheadAttentionLayer.

Definition at line 15 of file MultiheadAttentionParameter.cs.

Member Enumeration Documentation

◆ WEIGHT_INIT

enum MyCaffe.param.gpt.MultiheadAttentionParameter.WEIGHT_INIT

Defines the weight initialization strategy.

Enumerator
GPT	Specifies to use the GPT-style weight initialization strategy.
ENCODER_DECODER	Specifies to use Xavier initialization on both the weights and the bias.

Definition at line 28 of file MultiheadAttentionParameter.cs.
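
To make the difference between the two strategies concrete, here is a rough, self-contained sketch. It does not reproduce MyCaffe's code. It assumes that "GPT style" means drawing weights from a normal distribution with standard deviation 0.02, as is common in GPT-2 style implementations, and that Xavier means the uniform range with bound sqrt(3 / fan_in) used by Caffe's default xavier filler. The names WeightInit, WeightInitSketch and InitWeights are hypothetical.

```csharp
using System;

// Hypothetical mirror of WEIGHT_INIT, for illustration only.
public enum WeightInit { GPT, ENCODER_DECODER }

public static class WeightInitSketch
{
    // Returns fanIn * fanOut weights initialized with the chosen strategy.
    public static double[] InitWeights(WeightInit strategy, int fanIn, int fanOut, Random rng)
    {
        double[] w = new double[fanIn * fanOut];
        // Assumed Xavier bound: uniform in [-sqrt(3 / fanIn), +sqrt(3 / fanIn)].
        double bound = Math.Sqrt(3.0 / fanIn);

        for (int i = 0; i < w.Length; i++)
        {
            if (strategy == WeightInit.GPT)
                w[i] = 0.02 * NextGaussian(rng);                 // assumed N(0, 0.02^2)
            else
                w[i] = (rng.NextDouble() * 2.0 - 1.0) * bound;   // Xavier uniform
        }
        return w;
    }

    // Standard normal sample via the Box-Muller transform.
    private static double NextGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble();   // in (0, 1], avoids Log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}
```

Under these assumptions, the Xavier values always stay within the bound, which shrinks as fan_in grows, while the GPT-style values have a fixed spread that does not depend on the layer size.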

Constructor & Destructor Documentation

◆ MultiheadAttentionParameter()

MyCaffe.param.gpt.MultiheadAttentionParameter.MultiheadAttentionParameter ( )

Constructor for the parameter.

Definition at line 41 of file MultiheadAttentionParameter.cs.

Member Function Documentation

◆ Clone()

override LayerParameterBase MyCaffe.param.gpt.MultiheadAttentionParameter.Clone ( )

virtual

Creates a new copy of this instance of the parameter.

Returns: A new instance of this parameter is returned.

Implements MyCaffe.param.LayerParameterBase.

Definition at line 137 of file MultiheadAttentionParameter.cs.

◆ Copy()

override void MyCaffe.param.gpt.MultiheadAttentionParameter.Copy ( LayerParameterBase src )

virtual

Copy one parameter to another.

Parameters

src	Specifies the parameter to copy.

Implements MyCaffe.param.LayerParameterBase.

Definition at line 123 of file MultiheadAttentionParameter.cs.

◆ FromProto()

static MultiheadAttentionParameter MyCaffe.param.gpt.MultiheadAttentionParameter.FromProto ( RawProto rp )

static

Parses the parameter from a RawProto.

Parameters

rp	Specifies the RawProto to parse.

Returns: A new instance of the parameter is returned.

Definition at line 169 of file MultiheadAttentionParameter.cs.

◆ Load()

override object MyCaffe.param.gpt.MultiheadAttentionParameter.Load ( System.IO.BinaryReader br, bool bNewInstance = true )

Load the parameter from a binary reader.

Parameters

br	Specifies the binary reader.
bNewInstance	When true a new instance is created (the default), otherwise the existing instance is loaded from the binary reader.

Returns: Returns an instance of the parameter.

Definition at line 111 of file MultiheadAttentionParameter.cs.

◆ ToProto()

override RawProto MyCaffe.param.gpt.MultiheadAttentionParameter.ToProto ( string strName )

virtual

Convert the parameter into a RawProto.

Parameters

strName	Specifies the name to associate with the RawProto.

Returns: The new RawProto is returned.

Implements MyCaffe.basecode.BaseParameter.

Definition at line 149 of file MultiheadAttentionParameter.cs.

Property Documentation

◆ attn_dropout

double MyCaffe.param.gpt.MultiheadAttentionParameter.attn_dropout

get, set

Specifies the dropout probability used on the attention weights.

Definition at line 86 of file MultiheadAttentionParameter.cs.

◆ block_size

uint MyCaffe.param.gpt.MultiheadAttentionParameter.block_size

get, set

Specifies the size of the block.

Definition at line 77 of file MultiheadAttentionParameter.cs.

◆ embed

uint MyCaffe.param.gpt.MultiheadAttentionParameter.embed

get, set

Specifies the size of the embedding.

Definition at line 68 of file MultiheadAttentionParameter.cs.

◆ heads

uint MyCaffe.param.gpt.MultiheadAttentionParameter.heads

get, set

The number of heads used.

Definition at line 59 of file MultiheadAttentionParameter.cs.

◆ layers

uint MyCaffe.param.gpt.MultiheadAttentionParameter.layers

get, set

The number of layers (transformer blocks) used.

Definition at line 49 of file MultiheadAttentionParameter.cs.

◆ resid_dropout

double MyCaffe.param.gpt.MultiheadAttentionParameter.resid_dropout

get, set

Specifies the dropout probability used on the residual weights.

Definition at line 95 of file MultiheadAttentionParameter.cs.

◆ weight_init

WEIGHT_INIT MyCaffe.param.gpt.MultiheadAttentionParameter.weight_init

get, set

Specifies the weight initialization strategy (default = ENCODER_DECODER).

Definition at line 104 of file MultiheadAttentionParameter.cs.
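
The properties above interact in ways this page does not spell out. As a hedged sketch, the following checks reflect the usual multi-head attention convention that the embedding is split evenly across the heads, and that a dropout probability lies in [0, 1). Whether MyCaffe enforces either condition is not stated here, and the names AttentionConfigSketch, HeadSize and IsValidDropout are hypothetical.

```csharp
using System;

// Hypothetical sanity checks for illustration only; not part of MyCaffe.
public static class AttentionConfigSketch
{
    // Per-head width when 'embed' is divided evenly across 'heads'.
    public static uint HeadSize(uint embed, uint heads)
    {
        if (heads == 0)
            throw new ArgumentException("heads must be greater than zero.", nameof(heads));
        if (embed % heads != 0)
            throw new ArgumentException("embed must be divisible by heads.", nameof(embed));
        return embed / heads;
    }

    // A dropout probability of 0 disables dropout; 1 would drop everything.
    public static bool IsValidDropout(double p)
    {
        return p >= 0.0 && p < 1.0;
    }
}
```

For example, an embed of 768 with 12 heads gives a per-head width of 64, while an embed of 100 with 12 heads would be rejected by this sketch.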

The documentation for this class was generated from the following file:

C:/Data/Data/SS_Projects/Intelligence/GitHub/MyCaffe/MyCaffe/param.gpt/MultiheadAttentionParameter.cs
