MyCaffe
1.12.2.41
Deep learning software for Windows C# programmers.
MyCaffe.layers.gpt.CustomListData Class Reference
The CustomListData class supports external data input via an external assembly DLL that implements the ICustomTokenInput interface.
Public Member Functions
CustomListData (CancelEvent evtCancel, Log log, string strCustomDllFile, string strVocabInfo, int nBlockSizeSrc, int? nRandomSeed=null, Phase phase=Phase.NONE)
The constructor.
override bool | GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg)
Returns true if data is available at the given index.
override Tuple< float[], float[]> | GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx)
Retrieves random blocks from the source data, where the target is the same sequence as the data but offset by +1 element.
override Tuple< float[], float[]> | GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx)
Fill a batch of data from a specified array of indexes.
override List< int > | Tokenize (string str, bool bAddBos, bool bAddEos)
Tokenize an input string using the internal vocabulary.
override string | Detokenize (float[] rgfTokIdx, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos)
Detokenize an array into a string.
override string | Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos)
Detokenize a single token.
Public Member Functions inherited from MyCaffe.layers.gpt.InputData
InputData (int? nRandomSeed=null)
The constructor.
Properties
override List< string > | RawData [get]
Returns the raw data.
override uint | TokenSize [get]
Returns the token size.
override uint | VocabularySize [get]
Returns the vocabulary size.
override char | BOS [get]
Return the special begin of sequence character.
override char | EOS [get]
Return the special end of sequence character.
Properties inherited from MyCaffe.layers.gpt.InputData
abstract List< string > | RawData [get]
Returns the raw data.
abstract uint | TokenSize [get]
Returns the size of a single token (e.g. 1 for character data).
abstract uint | VocabularySize [get]
Returns the size of the vocabulary.
abstract char | BOS [get]
Return the special begin of sequence character.
abstract char | EOS [get]
Return the special end of sequence character.
Additional Inherited Members
Protected Attributes inherited from MyCaffe.layers.gpt.InputData
Random | m_random
Specifies the random object made available to the derived classes.
Detailed Description
The CustomListData class supports external data input via an external assembly DLL that implements the ICustomTokenInput interface.
Definition at line 937 of file TokenizedDataPairsLayer.cs.
Constructor & Destructor Documentation
MyCaffe.layers.gpt.CustomListData.CustomListData (CancelEvent evtCancel, Log log, string strCustomDllFile, string strVocabInfo, int nBlockSizeSrc, int? nRandomSeed = null, Phase phase = Phase.NONE)
The constructor.
evtCancel | Specifies the cancel event. |
log | Specifies the output log. |
strCustomDllFile | Specifies the path to the custom assembly DLL. |
strVocabInfo | Specifies the vocabulary info, which should be set to "ENC" or "DEC". |
nBlockSizeSrc | Specifies the block size. |
nRandomSeed | Specifies a random seed. |
phase | Specifies the running phase. |
Exception | An exception is thrown on error. |
Note the source and target token sets must have matching DateTime[] arrays.
Definition at line 963 of file TokenizedDataPairsLayer.cs.
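Member Function Documentation
override string MyCaffe.layers.gpt.CustomListData.Detokenize (float[] rgfTokIdx, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos) | virtual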
Detokenize an array into a string.
rgfTokIdx | Specifies the array of tokens to detokenize. |
nStartIdx | Specifies the starting index where detokenizing begins. |
nCount | Specifies the number of tokens to detokenize. |
bIgnoreBos | Specifies to ignore the BOS token. |
bIgnoreEos | Specifies to ignore the EOS token. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1200 of file TokenizedDataPairsLayer.cs.
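override string MyCaffe.layers.gpt.CustomListData.Detokenize (int nTokIdx, bool bIgnoreBos, bool bIgnoreEos) | virtual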
Detokenize a single token.
nTokIdx | Specifies an index to the token to be detokenized. |
bIgnoreBos | Specifies to ignore the BOS token. |
bIgnoreEos | Specifies to ignore the EOS token. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1212 of file TokenizedDataPairsLayer.cs.
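override Tuple< float[], float[]> MyCaffe.layers.gpt.CustomListData.GetData (int nBatchSize, int nBlockSize, InputData trgData, out int[] rgnIdx) | virtual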
Retrieves random blocks from the source data, where the target is the same sequence as the data but offset by +1 element.
nBatchSize | Specifies the batch size. |
nBlockSize | Specifies the block size. |
trgData | Specifies the matching target data used to verify that both source and target have data at each chosen index. |
rgnIdx | Returns an array of the indexes of the data returned. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1064 of file TokenizedDataPairsLayer.cs.
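The offset-by-one relationship between data and target described for GetData can be illustrated with a minimal, self-contained sketch. The class OffsetBlockSampler and its Sample method are hypothetical and are not part of MyCaffe; the sketch assumes a flat array of tokens and ignores the trgData availability check, since this reference does not show how CustomListData selects its indexes.

```csharp
using System;

// Hypothetical stand-in (not a MyCaffe type): shows only the +1 target offset.
public static class OffsetBlockSampler
{
    // Returns (data, target), each of length nBatchSize * nBlockSize.
    // For a block starting at index s, data holds rgTokens[s .. s+nBlockSize-1]
    // and target holds rgTokens[s+1 .. s+nBlockSize].
    public static Tuple<float[], float[]> Sample(float[] rgTokens, int nBatchSize, int nBlockSize, Random random, out int[] rgnIdx)
    {
        if (rgTokens.Length < nBlockSize + 1)
            throw new ArgumentException("At least nBlockSize + 1 tokens are required.");

        float[] rgData = new float[nBatchSize * nBlockSize];
        float[] rgTarget = new float[nBatchSize * nBlockSize];
        rgnIdx = new int[nBatchSize];

        for (int i = 0; i < nBatchSize; i++)
        {
            // The upper bound leaves room for the extra target element.
            int nStart = random.Next(rgTokens.Length - nBlockSize);
            rgnIdx[i] = nStart;

            Array.Copy(rgTokens, nStart, rgData, i * nBlockSize, nBlockSize);
            Array.Copy(rgTokens, nStart + 1, rgTarget, i * nBlockSize, nBlockSize);
        }

        return Tuple.Create(rgData, rgTarget);
    }
}
```

Each entry of rgnIdx records the start index of the corresponding block, which mirrors the role of the rgnIdx output parameter documented above.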
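override Tuple< float[], float[]> MyCaffe.layers.gpt.CustomListData.GetDataAt (int nBatchSize, int nBlockSize, int[] rgnIdx) | virtual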
Fill a batch of data from a specified array of indexes.
nBatchSize | Specifies the number of blocks in the batch. |
nBlockSize | Specifies the size of each block. |
rgnIdx | Specifies the array of indexes to the data to be retrieved. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1133 of file TokenizedDataPairsLayer.cs.
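override bool MyCaffe.layers.gpt.CustomListData.GetDataAvailabilityAt (int nIdx, bool bIncludeSrc, bool bIncludeTrg) | virtual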
Returns true if data is available at the given index.
nIdx | Specifies the index to check. |
bIncludeSrc | Specifies to include the source in the check. |
bIncludeTrg | Specifies to include the target in the check. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1044 of file TokenizedDataPairsLayer.cs.
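override List< int > MyCaffe.layers.gpt.CustomListData.Tokenize (string str, bool bAddBos, bool bAddEos) | virtual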
Tokenize an input string using the internal vocabulary.
str | Specifies the string to tokenize. |
bAddBos | Specifies whether to add the begin of sequence (BOS) token. |
bAddEos | Specifies whether to add the end of sequence (EOS) token. |
Implements MyCaffe.layers.gpt.InputData.
Definition at line 1186 of file TokenizedDataPairsLayer.cs.
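How the bAddBos/bAddEos flags of Tokenize and the bIgnoreBos/bIgnoreEos flags of Detokenize interact can be shown with a small character-level sketch. Everything here is an assumption made for illustration: CharVocabSketch is a hypothetical class, the BOS and EOS character values are placeholders, and the real CustomListData may delegate tokenization to the external ICustomTokenInput assembly rather than use a character vocabulary.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical character-level vocabulary (not a MyCaffe type).
public class CharVocabSketch
{
    private readonly Dictionary<char, int> m_rgCharToIdx = new Dictionary<char, int>();
    private readonly List<char> m_rgIdxToChar = new List<char>();

    // Placeholder special characters, assumed for this sketch only.
    public char BOS { get { return (char)1; } }
    public char EOS { get { return (char)2; } }

    public CharVocabSketch(string strCorpus)
    {
        Add(BOS);
        Add(EOS);
        foreach (char ch in strCorpus)
            Add(ch);
    }

    private void Add(char ch)
    {
        if (!m_rgCharToIdx.ContainsKey(ch))
        {
            m_rgCharToIdx.Add(ch, m_rgIdxToChar.Count);
            m_rgIdxToChar.Add(ch);
        }
    }

    // Maps each character to its vocabulary index, optionally framed by BOS/EOS.
    // Throws KeyNotFoundException for characters outside the vocabulary.
    public List<int> Tokenize(string str, bool bAddBos, bool bAddEos)
    {
        List<int> rgTokens = new List<int>();
        if (bAddBos)
            rgTokens.Add(m_rgCharToIdx[BOS]);
        foreach (char ch in str)
            rgTokens.Add(m_rgCharToIdx[ch]);
        if (bAddEos)
            rgTokens.Add(m_rgCharToIdx[EOS]);
        return rgTokens;
    }

    // Converts nCount tokens starting at nStartIdx back to text, skipping BOS/EOS on request.
    public string Detokenize(float[] rgfTokIdx, int nStartIdx, int nCount, bool bIgnoreBos, bool bIgnoreEos)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = nStartIdx; i < nStartIdx + nCount; i++)
        {
            char ch = m_rgIdxToChar[(int)rgfTokIdx[i]];
            if (bIgnoreBos && ch == BOS)
                continue;
            if (bIgnoreEos && ch == EOS)
                continue;
            sb.Append(ch);
        }
        return sb.ToString();
    }
}
```

With both add flags set, the token list is two entries longer than the input string; detokenizing that list with both ignore flags set recovers the original string.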
Property Documentation
override char MyCaffe.layers.gpt.CustomListData.BOS | get
Return the special begin of sequence character.
Definition at line 1220 of file TokenizedDataPairsLayer.cs.
override char MyCaffe.layers.gpt.CustomListData.EOS | get
Return the special end of sequence character.
Definition at line 1228 of file TokenizedDataPairsLayer.cs.
override List< string > MyCaffe.layers.gpt.CustomListData.RawData | get
Returns the raw data.
Definition at line 1016 of file TokenizedDataPairsLayer.cs.
override uint MyCaffe.layers.gpt.CustomListData.TokenSize | get
Returns the token size.
Definition at line 1024 of file TokenizedDataPairsLayer.cs.
override uint MyCaffe.layers.gpt.CustomListData.VocabularySize | get
Returns the vocabulary size.
Definition at line 1032 of file TokenizedDataPairsLayer.cs.