Blog – Page 3

February 22, 2023

ChatGPT architecture now supported with Encoder/Decoder Transformer Models using CUDA 11.8 and cuDNN 8.8!

In our latest release, version 1.12.0.60, we now support ChatGPT-type architectures with Encoder/Decoder Transformer Models based on the open-source transformer-translator-pytorch GitHub project by Song [1]. ChatGPT uses encoder/decoder transformer models to learn the context of the input query, the context of the likely responses, and a mapping between the two via attention layers. The …

February 9, 2023

Visually Walking Through a Transformer Model

With GPT and ChatGPT, transformer models have proven to be very powerful AI models. But how do they work on the inside? In this post, we use the SignalPop AI Designer to visually walk through the forward pass of a transformer model used for language translation. Before showing a visual walk-through, we wanted to …

February 5, 2023

Debugging Difficult AI Models

While completing the MyCaffe implementation of the transformer encoder/decoder model for language translation, we ran into a bug that was very difficult to fix. In fact, it was the kind of bug feared most when developing a model, because with this bug everything appeared to work as expected when training. Yet, the model would train up …

December 17, 2022 (updated December 18, 2022)

Data Flow when Training an Encoder/Decoder Model for Language Translation

When training any deep learning AI model, knowing your data is critical. This is especially true when training a transformer-based encoder/decoder model, because data ordering is important. In this post, we analyze the Python open-source language translation encoder/decoder transformer model by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need’ …

December 6, 2022

Converting a GPT Model into a full Encoder/Decoder Transformer Model

GPT is a great transformer model used to solve many natural language problems; however, GPT implements only the encoder side of a full encoder/decoder transformer model as described by Vaswani et al. [1]. Only a few changes are needed to implement a full encoder/decoder transformer model as shown below (GPT portion inspired by the minGPT …

November 23, 2022 (updated February 22, 2023)

GPT now supported with Transformer Models using CUDA 11.8 and cuDNN 8.6!

In our latest release, version 1.11.8.27, we now support GPT and Transformer Models based on the open-source minGPT GitHub project by Karpathy [1]. GPT uses transformer models to learn the context of the input data via attention layers. A stack of transformer blocks tends to learn context at several different levels from …

August 8, 2022

New Release with New Samples

In our latest release, version 1.11.7.7, we showcase several new loss samples that demonstrate binary classification, multi-class classification, multi-label classification and regression with the new MSE and MAE layers – all using the latest NVIDIA CUDA 11.7.1 / cuDNN 8.4.1 release. Binary Classification: The binary classification sample solves a simple 2-class classification problem, where the …

July 6, 2022

Using MyCaffe AI Platform in real-time inferencing

The SignalPop Trading Studio is a Windows Store app that provides short-term option traders with real-time analytics geared to help the trader better understand what the market is doing during each intra-day trading session. Part of the analytics provided by the SignalPop Trading Studio includes real-time, AI-driven price directional predictions, which, when taken together, give …

June 16, 2022

minGPT – How It Works

minGPT, created by Andrej Karpathy, is a simplified implementation of the original OpenAI GPT-2 open-source project. GPT has proven very useful in solving many Natural Language Processing (NLP) problems and, as shown by Karpathy and others, can also be used to solve tasks outside of the NLP domain such as generative image processing and classification. One of …

June 10, 2022

Three Big Version 1.0 Releases!

The MyCaffe AI Platform, SignalPop AI Designer and new SignalPop Trading Studio all release as 1.+ versions! All of our products use the MyCaffe AI Platform to provide fast AI inferencing solutions on low-cost NVIDIA GPUs; some of these GPUs can be purchased for under $250 yet still run AI inferencing loads very quickly! For …
