MyCaffe  1.12.2.41
Deep learning software for Windows C# programmers.
MyCaffe.layers.gpt.VocabularyCharacter Class Reference

The VocabularyCharacter class manages the data vocabulary of characters.

Inheritance diagram for MyCaffe.layers.gpt.VocabularyCharacter:
MyCaffe.layers.gpt.IVocabulary

Public Member Functions

 VocabularyCharacter (Random random, bool bAddBos, bool bAddEos, bool bEnablePad)
 The constructor.

void Add (char ch)
 Adds a new character to the vocabulary.

void Add (string str)
 Adds a string of characters to the vocabulary.

int Build ()
 Builds the vocabulary from all characters added.

int BuildFromString (string strData)
 Builds the vocabulary from a string.

int[] CreateTarget (int[] rgSrc)
 Creates a target that is offset from the source by one and ends with an EOS.

List< int > Tokenize (string str1, bool bMustExist=true)
 Tokenizes a character into its corresponding index token.

int[] Tokenize (string str, bool bAddBos, bool bAddEos)
 Tokenizes a string of data.

string Detokenize (int nIdxToken, bool bIgnoreBos, bool bIgnoreEos)
 Detokenizes an index token into its corresponding character.

string Detokenize (float[] rgf, bool bIgnoreBos, bool bIgnoreEos)
 Detokenizes an array into a string.

Properties

int? Count [get]
 Returns the size of the vocabulary.

char BOS [get]
 Returns the special BOS character.

char EOS [get]
 Returns the special EOS character.

Properties inherited from MyCaffe.layers.gpt.IVocabulary
int Count [get]
 Returns the size of the vocabulary.

char BOS [get]
 Returns the special BOS character.

char EOS [get]
 Returns the special EOS character.

Detailed Description

The VocabularyCharacter class manages the data vocabulary of characters.

Definition at line 12 of file VocabularyCharacter.cs.

Constructor & Destructor Documentation

◆ VocabularyCharacter()

MyCaffe.layers.gpt.VocabularyCharacter.VocabularyCharacter ( Random  random,
bool  bAddBos,
bool  bAddEos,
bool  bEnablePad 
)

The constructor.

Parameters
random: Specifies the random number generator used.
bAddBos: Specifies whether to include the special BOS character in the vocabulary.
bAddEos: Specifies whether to include the special EOS character in the vocabulary.
bEnablePad: Specifies whether to enable 0-based padding by adding the 0 pad key to the vocabulary.

Definition at line 28 of file VocabularyCharacter.cs.

Member Function Documentation

◆ Add() [1/2]

void MyCaffe.layers.gpt.VocabularyCharacter.Add ( char  ch)

Adds a new character to the vocabulary.

Parameters
ch: Specifies the character to add.

Definition at line 54 of file VocabularyCharacter.cs.

◆ Add() [2/2]

void MyCaffe.layers.gpt.VocabularyCharacter.Add ( string  str)

Add a string of characters to the vocabulary.

Parameters
str: Specifies the string to add.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 64 of file VocabularyCharacter.cs.

◆ Build()

int MyCaffe.layers.gpt.VocabularyCharacter.Build ( )

Builds the vocabulary from all characters added.

Returns
The vocabulary size is returned.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 76 of file VocabularyCharacter.cs.

◆ BuildFromString()

int MyCaffe.layers.gpt.VocabularyCharacter.BuildFromString ( string  strData)

Build the vocabulary from a string.

Parameters
strData: Specifies the data to build the vocabulary from.
Returns
The vocabulary size is returned.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 100 of file VocabularyCharacter.cs.
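
As a rough illustration of what building a character vocabulary from a string involves, the sketch below maps each distinct character to an index. It is not MyCaffe's implementation: the class name CharVocabSketch, the sorted index order, and the use of '\0' at index 0 as the pad key are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of MyCaffe.
public static class CharVocabSketch
{
    // Maps each distinct character in strData to an index and returns the map.
    // Assumptions: characters are indexed in sorted (ordinal) order, and when
    // bEnablePad is true, index 0 is reserved for the pad key '\0'.
    public static Dictionary<char, int> BuildFromString(string strData, bool bEnablePad)
    {
        var rgKeyToIdx = new Dictionary<char, int>();

        if (bEnablePad)
            rgKeyToIdx['\0'] = 0;

        foreach (char ch in strData.Distinct().OrderBy(c => c))
        {
            // Skip characters already present (for example a literal '\0').
            if (!rgKeyToIdx.ContainsKey(ch))
                rgKeyToIdx[ch] = rgKeyToIdx.Count;
        }

        return rgKeyToIdx;
    }
}
```

Under these assumptions, the vocabulary size is the Count of the returned dictionary, which corresponds to the value that BuildFromString is documented to return.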

◆ CreateTarget()

int[] MyCaffe.layers.gpt.VocabularyCharacter.CreateTarget ( int[]  rgSrc)

Create a target that is offset from the source by one and ends with an EOS.

Parameters
rgSrc: Specifies the source to create the target from.
Returns
The tokenized target is returned.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 131 of file VocabularyCharacter.cs.
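
The offset-by-one target described above can be sketched as follows. This is a minimal example under stated assumptions, not the library's code: the class TargetSketch is hypothetical, the target is assumed to have the same length as the source, and the EOS token index is passed in as the hypothetical parameter nEosToken (the documented method takes only rgSrc).

```csharp
using System;

// Hypothetical helper for illustration; not part of MyCaffe.
public static class TargetSketch
{
    // Returns an array of the same length as rgSrc in which every token is
    // shifted left by one position and the final slot holds nEosToken.
    public static int[] CreateTarget(int[] rgSrc, int nEosToken)
    {
        int[] rgTrg = new int[rgSrc.Length];

        if (rgSrc.Length == 0)
            return rgTrg;

        // Target position i holds the token that follows source position i.
        for (int i = 0; i < rgSrc.Length - 1; i++)
        {
            rgTrg[i] = rgSrc[i + 1];
        }

        rgTrg[rgTrg.Length - 1] = nEosToken;
        return rgTrg;
    }
}
```

With a source of { 5, 6, 7 } and an assumed EOS index of 2, the sketch produces { 6, 7, 2 }, so each target position holds the token that follows the corresponding source position.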

◆ Detokenize() [1/2]

string MyCaffe.layers.gpt.VocabularyCharacter.Detokenize ( float[]  rgf,
bool  bIgnoreBos,
bool  bIgnoreEos 
)

Detokenize an array into a string.

Parameters
rgf: Specifies the array of tokens to detokenize.
bIgnoreBos: Specifies whether to ignore the BOS token.
bIgnoreEos: Specifies whether to ignore the EOS token.
Returns
The detokenized string is returned.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 235 of file VocabularyCharacter.cs.

◆ Detokenize() [2/2]

string MyCaffe.layers.gpt.VocabularyCharacter.Detokenize ( int  nIdxToken,
bool  bIgnoreBos,
bool  bIgnoreEos 
)

Detokenize an index token into its corresponding character.

Parameters
nIdxToken: Specifies the token to detokenize.
bIgnoreBos: Specifies whether to ignore the BOS token.
bIgnoreEos: Specifies whether to ignore the EOS token.
Returns
The detokenized string is returned (which may just be a character).

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 199 of file VocabularyCharacter.cs.

◆ Tokenize() [1/2]

int[] MyCaffe.layers.gpt.VocabularyCharacter.Tokenize ( string  str,
bool  bAddBos,
bool  bAddEos 
)

Tokenize a string of data.

Parameters
str: Specifies the string to tokenize.
bAddBos: Specifies whether to add the BOS at the start of the tokenized data.
bAddEos: Specifies whether to add the EOS at the end of the tokenized data.
Returns
The array of tokens is returned.

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 174 of file VocabularyCharacter.cs.
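
To show how the bAddBos/bAddEos flags of this overload relate to the bIgnoreBos/bIgnoreEos flags of Detokenize, here is a small self-contained round-trip sketch. Everything in it is an assumption for illustration: the class TokenizeSketch, the fixed five-entry table, and the BOS/EOS values (char 1 and char 2) are hypothetical and do not reflect the values MyCaffe actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of MyCaffe.
public static class TokenizeSketch
{
    // Assumed special characters; the real BOS/EOS values are not stated here.
    const char BOS = (char)1;
    const char EOS = (char)2;

    // A tiny fixed vocabulary: the token index is the position in IdxToKey.
    static readonly char[] IdxToKey = { BOS, EOS, 'a', 'b', 'c' };
    static readonly Dictionary<char, int> KeyToIdx = new Dictionary<char, int>
    {
        { BOS, 0 }, { EOS, 1 }, { 'a', 2 }, { 'b', 3 }, { 'c', 4 }
    };

    // Converts each character to its index, optionally wrapping with BOS/EOS.
    public static int[] Tokenize(string str, bool bAddBos, bool bAddEos)
    {
        var rgTokens = new List<int>();

        if (bAddBos)
            rgTokens.Add(KeyToIdx[BOS]);

        foreach (char ch in str)
            rgTokens.Add(KeyToIdx[ch]); // throws KeyNotFoundException if ch is unknown

        if (bAddEos)
            rgTokens.Add(KeyToIdx[EOS]);

        return rgTokens.ToArray();
    }

    // Converts token indices (stored as floats) back to a string,
    // optionally skipping the BOS and EOS tokens.
    public static string Detokenize(float[] rgf, bool bIgnoreBos, bool bIgnoreEos)
    {
        var sb = new StringBuilder();

        foreach (float f in rgf)
        {
            char ch = IdxToKey[(int)f];
            if (bIgnoreBos && ch == BOS) continue;
            if (bIgnoreEos && ch == EOS) continue;
            sb.Append(ch);
        }

        return sb.ToString();
    }
}
```

In this sketch, tokenizing "cab" with both flags set yields { 0, 4, 2, 3, 1 }, and detokenizing those values with both ignore flags set returns "cab" again.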

◆ Tokenize() [2/2]

List< int > MyCaffe.layers.gpt.VocabularyCharacter.Tokenize ( string  str1,
bool  bMustExist = true 
)

Tokenize a character into its corresponding index token.

Parameters
str1: Specifies a single element (character or word) to tokenize.
bMustExist: Optionally, specifies to throw an error if the item is not in the vocabulary (default = true).
Returns
A list of tokens corresponding to the character is returned (typically just a single token).

Implements MyCaffe.layers.gpt.IVocabulary.

Definition at line 147 of file VocabularyCharacter.cs.
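
The effect of bMustExist can be sketched with a single-character lookup. This is an illustrative example only: the class SingleTokenSketch and its two-entry table are hypothetical, and returning an empty list for a missing character when bMustExist is false is an assumption, since the documentation above does not state what is returned in that case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of MyCaffe.
public static class SingleTokenSketch
{
    static readonly Dictionary<char, int> KeyToIdx = new Dictionary<char, int>
    {
        { 'a', 0 }, { 'b', 1 }
    };

    // Looks up a single character and returns its token in a list.
    // Assumption: a missing character yields an empty list when
    // bMustExist is false, and an exception when it is true.
    public static List<int> Tokenize(string str1, bool bMustExist = true)
    {
        if (str1.Length != 1)
            throw new ArgumentException("Expected a single character.", nameof(str1));

        var rgTokens = new List<int>();

        if (KeyToIdx.TryGetValue(str1[0], out int nIdx))
            rgTokens.Add(nIdx);
        else if (bMustExist)
            throw new KeyNotFoundException("The character '" + str1 + "' is not in the vocabulary.");

        return rgTokens;
    }
}
```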

Property Documentation

◆ BOS

char MyCaffe.layers.gpt.VocabularyCharacter.BOS
get

Returns the special BOS character.

Definition at line 113 of file VocabularyCharacter.cs.

◆ Count

int? MyCaffe.layers.gpt.VocabularyCharacter.Count
get

Returns the size of the vocabulary.

Definition at line 45 of file VocabularyCharacter.cs.

◆ EOS

char MyCaffe.layers.gpt.VocabularyCharacter.EOS
get

Returns the special EOS character.

Definition at line 121 of file VocabularyCharacter.cs.


The documentation for this class was generated from the following file: