MyCaffe
1.12.2.41
Deep learning software for Windows C# programmers.
The VocabularyWord class manages the data vocabulary of words.
Public Member Functions | |
VocabularyWord (Random random, bool bAddBos, bool bAddEos) | |
The constructor. | |
void | Add (string str) |
Adds a new sentence or word to the vocabulary. | |
int | Build () |
Builds the vocabulary from all words added. | |
int | BuildFromString (string strData) |
Builds the vocabulary from a string. | |
int[] | CreateTarget (int[] rgSrc) |
Creates a target that is offset from the source by one and ends with an EOS. | |
List< int > | Tokenize (string strWord, bool bMustExist=true) |
Tokenizes a single word into its corresponding index token(s). | |
int[] | Tokenize (string str, bool bAddBos, bool bAddEos) |
Tokenizes a string of data. | |
string | Detokenize (int nIdxToken, bool bIgnoreBos, bool bIgnoreEos) |
Detokenizes an index token into its corresponding word. | |
string | Detokenize (float[] rgf, bool bIgnoreBos, bool bIgnoreEos) |
Detokenizes an array of tokens into a string. | |
Properties | |
int | Count [get] |
Returns the size of the vocabulary. | |
char | BOS [get] |
Returns the special BOS character. | |
char | EOS [get] |
Returns the special EOS character. | |
Properties inherited from MyCaffe.layers.gpt.IVocabulary: Count, BOS, EOS (listed under Properties above).
The VocabularyWord class manages the data vocabulary of words.
Definition at line 13 of file VocabularyWord.cs.
MyCaffe.layers.gpt.VocabularyWord.VocabularyWord (Random random, bool bAddBos, bool bAddEos)
The constructor.
random | Specifies the random number generator used. |
bAddBos | Specifies whether to include the special BOS character in the vocabulary. |
bAddEos | Specifies whether to include the special EOS character in the vocabulary. |
Definition at line 27 of file VocabularyWord.cs.
void MyCaffe.layers.gpt.VocabularyWord.Add (string str)
Adds a new sentence or word to the vocabulary.
str | Specifies the sentence or word to add. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 87 of file VocabularyWord.cs.
int MyCaffe.layers.gpt.VocabularyWord.Build ()
Builds the vocabulary from all words added.
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 135 of file VocabularyWord.cs.
int MyCaffe.layers.gpt.VocabularyWord.BuildFromString (string strData)
Build the vocabulary from a string.
strData | Specifies the data to build the vocabulary from. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 157 of file VocabularyWord.cs.
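To illustrate what building a word vocabulary from a string can look like, here is a minimal standalone sketch. It is not the MyCaffe implementation: the `WordVocabSketch` class, the `<BOS>`/`<EOS>` placeholder strings, the whitespace splitting, and the sorted index assignment are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for illustration only; not the MyCaffe VocabularyWord class.
public static class WordVocabSketch
{
    // Assumed placeholder markers; the real BOS and EOS values are not shown on this page.
    public const string Bos = "<BOS>";
    public const string Eos = "<EOS>";

    // Splits the text on whitespace and assigns each distinct word an index.
    // Special markers, when requested, take the first indices.
    public static Dictionary<string, int> BuildFromString(string text, bool addBos, bool addEos)
    {
        var map = new Dictionary<string, int>();
        if (addBos) map.Add(Bos, map.Count);
        if (addEos) map.Add(Eos, map.Count);

        // A null separator array makes Split break on any whitespace.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Sorting (an assumption) keeps the index assignment deterministic.
        foreach (string word in words.Distinct().OrderBy(x => x, StringComparer.Ordinal))
        {
            if (!map.ContainsKey(word))
                map.Add(word, map.Count);
        }

        return map;
    }
}
```

Under these assumptions, `WordVocabSketch.BuildFromString("the cat the dog", true, true)` yields five entries: the two markers followed by `cat`, `dog`, and `the`.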
int[] MyCaffe.layers.gpt.VocabularyWord.CreateTarget (int[] rgSrc)
Creates a target that is offset from the source by one and ends with an EOS.
rgSrc | Specifies the source to create the target from. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 189 of file VocabularyWord.cs.
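The offset-by-one target described above can be sketched as follows. This is a standalone illustration, not the library's code: the `TargetSketch` class and the `eosToken` parameter are hypothetical, and the sketch assumes the target has the same length as the source.

```csharp
using System;

// Hypothetical helper for illustration only; not the MyCaffe implementation.
public static class TargetSketch
{
    // Returns a target where target[i] = src[i + 1] and the last slot holds the EOS token.
    // Assumption: the target has the same length as the source.
    public static int[] CreateTarget(int[] src, int eosToken)
    {
        if (src == null) throw new ArgumentNullException(nameof(src));
        if (src.Length == 0) return new int[0];

        int[] target = new int[src.Length];

        // Shift every token one position to the left.
        for (int i = 0; i < src.Length - 1; i++)
            target[i] = src[i + 1];

        // The final position has no successor in the source, so it receives EOS.
        target[target.Length - 1] = eosToken;
        return target;
    }
}
```

With a source of `{ 5, 6, 7 }` and an assumed EOS token index of `1`, the sketch returns `{ 6, 7, 1 }`, so each position's target is the token that follows it.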
string MyCaffe.layers.gpt.VocabularyWord.Detokenize (float[] rgf, bool bIgnoreBos, bool bIgnoreEos)
Detokenize an array into a string.
rgf | Specifies the array of tokens to detokenize. |
bIgnoreBos | Specifies whether to ignore the BOS token. |
bIgnoreEos | Specifies whether to ignore the EOS token. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 320 of file VocabularyWord.cs.
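As a rough illustration of detokenizing an array while skipping the special tokens, consider the standalone sketch below. It is not the MyCaffe code: the `DetokenizeSketch` class, the explicit `bosToken`/`eosToken` parameters, the cast from `float` to `int`, and the space used to join words are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the MyCaffe implementation.
public static class DetokenizeSketch
{
    // Maps each token index back to its word and joins the words with spaces.
    // words[i] is assumed to hold the vocabulary entry for token index i.
    public static string Detokenize(float[] tokens, IList<string> words,
        int bosToken, int eosToken, bool ignoreBos, bool ignoreEos)
    {
        var parts = new List<string>();

        foreach (float f in tokens)
        {
            // Assumption: the floats carry whole-number token indices.
            int idx = (int)f;

            if (ignoreBos && idx == bosToken) continue;
            if (ignoreEos && idx == eosToken) continue;

            parts.Add(words[idx]);
        }

        return string.Join(" ", parts);
    }
}
```

For a vocabulary list of `<BOS>`, `<EOS>`, `cat`, `dog` and the tokens `{ 0, 3, 2, 1 }`, ignoring both markers gives `dog cat`.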
string MyCaffe.layers.gpt.VocabularyWord.Detokenize (int nIdxToken, bool bIgnoreBos, bool bIgnoreEos)
Detokenizes an index token into its corresponding word.
nIdxToken | Specifies the token to detokenize. |
bIgnoreBos | Specifies whether to ignore the BOS token. |
bIgnoreEos | Specifies whether to ignore the EOS token. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 281 of file VocabularyWord.cs.
int[] MyCaffe.layers.gpt.VocabularyWord.Tokenize (string str, bool bAddBos, bool bAddEos)
Tokenize a string of data.
str | Specifies the string to tokenize. |
bAddBos | Specifies whether to add the BOS token at the start of the tokenized data. |
bAddEos | Specifies whether to add the EOS token at the end of the tokenized data. |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 255 of file VocabularyWord.cs.
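Tokenizing a string with optional BOS and EOS markers might look like the standalone sketch below. This is not the MyCaffe implementation: the `TokenizeSketch` class, the dictionary-based vocabulary, the explicit `bosToken`/`eosToken` parameters, and the whitespace splitting are assumptions chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the MyCaffe implementation.
public static class TokenizeSketch
{
    // Converts whitespace-separated words into token indices, optionally
    // wrapping the sequence in BOS and EOS tokens.
    public static int[] Tokenize(string text, IDictionary<string, int> vocab,
        bool addBos, bool addEos, int bosToken, int eosToken)
    {
        var tokens = new List<int>();

        if (addBos) tokens.Add(bosToken);

        // A null separator array makes Split break on any whitespace.
        foreach (string word in text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            int idx;
            // Assumption: unknown words are an error in this sketch.
            if (!vocab.TryGetValue(word, out idx))
                throw new KeyNotFoundException("Word not in vocabulary: " + word);
            tokens.Add(idx);
        }

        if (addEos) tokens.Add(eosToken);

        return tokens.ToArray();
    }
}
```

Given a vocabulary that maps `cat` to 2 and `dog` to 3, with assumed BOS and EOS indices of 0 and 1, tokenizing `"dog cat"` with both flags set produces `{ 0, 3, 2, 1 }`.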
List< int > MyCaffe.layers.gpt.VocabularyWord.Tokenize (string strWord, bool bMustExist = true)
Tokenizes a single word into its corresponding index token(s).
strWord | Specifies a single word to tokenize. |
bMustExist | Optionally, specifies whether to throw an error if the word is not in the vocabulary (default = true). |
Implements MyCaffe.layers.gpt.IVocabulary.
Definition at line 205 of file VocabularyWord.cs.
char MyCaffe.layers.gpt.VocabularyWord.BOS [get]
Returns the special BOS character.
Definition at line 171 of file VocabularyWord.cs.
int MyCaffe.layers.gpt.VocabularyWord.Count [get]
Returns the size of the vocabulary.
Definition at line 43 of file VocabularyWord.cs.
char MyCaffe.layers.gpt.VocabularyWord.EOS [get]
Returns the special EOS character.
Definition at line 179 of file VocabularyWord.cs.