While completing the MyCaffe implementation of the transformer encoder/decoder model for language translation, we ran into a bug that was very difficult to fix. In fact, it was the kind of bug most feared when developing a model, because everything appeared to work as expected during training. Yet, the model would train up …
Data Flow when Training an Encoder/Decoder Model for Language Translation
When training any deep learning AI model, knowing your data is critical. This is especially true when training a transformer-based encoder/decoder model, where data ordering is important. In this post, we analyze the open-source Python encoder/decoder transformer model for language translation by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need’ …
Converting a GPT Model into a full Encoder/Decoder Transformer Model
GPT is a great transformer model used to solve many natural language problems; however, GPT only implements the encoder side of a full encoder/decoder transformer model as described by Vaswani et al. [1]. Only a few changes are needed to implement a full encoder/decoder transformer model, as shown below (GPT portion inspired by the minGPT …
GPT now supported with Transformer Models using CUDA 11.8 and cuDNN 8.6!
In our latest release, version 1.11.8.27, we now support GPT and Transformer Models based on the open-source minGPT GitHub project by Karpathy [1]. GPT uses transformer models to learn the context of the input data via attention layers. A stack of transformer blocks tends to learn context at several different levels from …
New Release with New Samples
In our latest release, version 1.11.7.7, we showcase several new loss samples that demonstrate binary classification, multi-class classification, multi-label classification and regression with the new MSE and MAE layers – all using the latest NVIDIA CUDA 11.7.1 / cuDNN 8.4.1 release. Binary Classification The binary classification sample solves a simple 2-class classification problem, where the …
Using MyCaffe AI Platform in real-time inferencing
The SignalPop Trading Studio is a Windows Store app that provides short-term option traders with real-time analytics geared to help the trader better understand what the market is doing during each intra-day trading session. Part of the analytics provided by the SignalPop Trading Studio includes real-time, AI-driven price directional predictions, which when taken together give …
minGPT – How It Works
minGPT, created by Andrej Karpathy, is a simplified implementation of the original OpenAI GPT-2 open-source project. GPT has proven very useful in solving many Natural Language Processing (NLP) problems and, as shown by Karpathy and others, can also be used to solve tasks outside of the NLP domain, such as generative image processing and classification. One of …
Three Big Version 1.0 Releases!
The MyCaffe AI Platform, SignalPop AI Designer and new SignalPop Trading Studio have all been released as 1.+ versions! All of our products use the MyCaffe AI Platform to provide fast AI inferencing solutions on low-cost NVIDIA GPUs. Some of these GPUs can be purchased for under $250 yet still run AI inferencing loads very quickly! For …
Lots of Upgrades! Visual Studio 2022, .NET 4.8, and CUDA 11.6 with cuDNN 8.3.2!
In our latest release, version 0.11.6.86, we have made many upgrades, including support for both Windows 10 and Windows 11 with Visual Studio 2022 and the latest CUDA 11.6 and cuDNN 8.3.2 from NVIDIA. New Features The following new features have been added to this release. CUDA 11.6.0.511/cuDNN 8.3.2.44/nvapi 510/driver 511.65 Windows 11 …
Using MyCaffe to mine the EDGAR Database
The US Securities and Exchange Commission’s EDGAR database contains the public filings of public US companies, including quarterly (10Q) and annual (10K) filings as well as 13F filings that list the positions held by investment-based companies at the time of each filing. Using the MyCaffe AI Platform to analyze each of these filings, we were …