When training any deep learning model, knowing your data is critical. This is especially true when training a transformer-based encoder/decoder model, where data ordering matters. In this post, we analyze the open-source Python language-translation encoder/decoder transformer model by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need‘ …
Continue reading “Data Flow when Training an Encoder/Decoder Model for Language Translation”