MyCaffe now supports Liquid Neural Networks!

In our latest release of the MyCaffe AI Platform, version 1.12.2.41, we now support Liquid Neural Networks as described in [1], [2], [3] and [4]. Liquid neural networks, first introduced by [1], are dynamic networks constructed “of linear first-order dynamical systems modulated via nonlinear interlinked gates,” resulting in models that “represent dynamical systems with varying …
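For reference, one common way to write the underlying liquid time-constant (LTC) dynamics described above is the following ordinary differential equation; this is the general form from the LTC literature rather than MyCaffe-specific notation:

\frac{dx(t)}{dt} = -\left[ \frac{1}{\tau} + f(x(t), I(t), t, \theta) \right] \odot x(t) + f(x(t), I(t), t, \theta) \odot A

Here x(t) is the hidden state, I(t) the input, \tau a fixed time-constant vector, A a learned bias vector, and f a small neural network. Because f also multiplies x(t) inside the bracket, the effective time constant changes with the state and input, which is what makes the time constant "liquid."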

Closed-form Continuous-time Liquid Neural Net Models – A Programmer’s Perspective

Liquid neural networks, first introduced by [1], are networks constructed “of linear first-order dynamical systems modulated via nonlinear interlinked gates,” resulting in models that “represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers.” The Closed-form Continuous-time Models (CfC) ‘are powerful sequential liquid …
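To sketch what "closed-form" means in practice, the CfC formulation replaces the ODE solve with a direct expression of roughly the following shape (our paraphrase of the closed-form approximation in the CfC literature, not MyCaffe's internal notation):

x(t) = \sigma\left( -f(x, I; \theta_f)\, t \right) \odot g(x, I; \theta_g) + \left[ 1 - \sigma\left( -f(x, I; \theta_f)\, t \right) \right] \odot h(x, I; \theta_h)

where f, g and h are small neural network heads, \sigma is the sigmoid and \odot is an element-wise product. The sigmoid of the learned decay f gates between the two branches over time, so the hidden state can be evaluated directly instead of running a numerical ODE solver at each step.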

MyCaffe now supports Temporal Fusion Transformer Models!

In our latest release of the MyCaffe AI Platform, version 1.12.1.82, we now support Temporal Fusion Transformer (TFT) Models as described in [1] and [2]. These powerful models provide multi-horizon time-series predictions and, according to [1], outperform DeepAR from Amazon, Deep State Space Models, MQRNN, TRMF, and traditional models such as ARIMA and ETS. The …

Temporal Fusion Transformers – Model Data Flow

In our last post, we looked at the organization of the data used by Temporal Fusion Transformer models as described in the Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting article by Lim et al. [1]. In this post, we take a deeper dive into the architecture of the Temporal Fusion Transformer model and how …
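As a preview of the building block that appears throughout that architecture, the sketch below implements a gated residual network (GRN), roughly following the equations in Lim et al. [1]; the layer sizes, names, and optional context input are illustrative and are not taken from MyCaffe's internal implementation.

import torch
import torch.nn as nn

class GatedResidualNetwork(nn.Module):
    # GRN(a, c) = LayerNorm(a + GLU(W1 * ELU(W2 a + W3 c + b2) + b1)), per Lim et al. [1].
    def __init__(self, d_model, d_hidden=None):
        super().__init__()
        d_hidden = d_hidden or d_model
        self.fc_input = nn.Linear(d_model, d_hidden)                # W2
        self.fc_context = nn.Linear(d_model, d_hidden, bias=False)  # W3, optional context c
        self.fc_output = nn.Linear(d_hidden, d_model)               # W1
        self.gate = nn.Linear(d_model, 2 * d_model)                 # GLU values and gates
        self.norm = nn.LayerNorm(d_model)
        self.elu = nn.ELU()

    def forward(self, a, c=None):
        eta = self.fc_input(a)
        if c is not None:
            eta = eta + self.fc_context(c)
        eta = self.fc_output(self.elu(eta))
        value, gate = self.gate(eta).chunk(2, dim=-1)               # gated linear unit
        return self.norm(a + value * torch.sigmoid(gate))           # gated residual + layer norm

grn = GatedResidualNetwork(d_model=16)
print(grn(torch.randn(8, 16)).shape)                                # torch.Size([8, 16])

The gated skip connection lets the model pass the input through nearly unchanged when a given dataset does not need the extra non-linear processing, which is one reason this block is reused so heavily throughout the TFT architecture.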

Temporal Fusion Transformers – Data Organization

Temporal Fusion Transformers, as described in the Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting article by Lim et al. [1], use a complex mix of inputs to provide multi-horizon forecasting for time-series data. The first step in understanding these models is to understand the data inputs fed into the model and the predicted outputs …
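To make that mix of inputs concrete, a typical training batch for this kind of model might be organized along the following lines; the shapes and field names here are illustrative and do not reflect the exact MyCaffe data layer schema.

import numpy as np

batch_size, past_steps, future_steps = 32, 90, 30        # illustrative sizes

batch = {
    # Static covariates: one entry per series (e.g. a store or item id).
    "static":        np.zeros((batch_size, 1), dtype=np.int64),
    # Observed inputs: only known up to the prediction time (e.g. past sales).
    "observed_past": np.zeros((batch_size, past_steps, 3), dtype=np.float32),
    # Known inputs: available for both past and future steps (e.g. day of week, holidays).
    "known_past":    np.zeros((batch_size, past_steps, 2), dtype=np.float32),
    "known_future":  np.zeros((batch_size, future_steps, 2), dtype=np.float32),
    # Targets: the multi-horizon values the model learns to predict.
    "target":        np.zeros((batch_size, future_steps, 1), dtype=np.float32),
}

for name, arr in batch.items():
    print(name, arr.shape)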

ChatGPT architecture now supported with Encoder/Decoder Transformer Models using CUDA 11.8 and cuDNN 8.8!

In our latest release, version 1.12.0.60, we now support ChatGPT-type architectures with Encoder/Decoder Transformer Models based on the open-source transformer-translator-pytorch GitHub project by Song [1]. ChatGPT uses encoder/decoder transformer models to learn the context of the input query, the context of the likely responses, and a mapping between the two via attention layers. The …
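At a high level, the encoder/decoder wiring looks like the sketch below, written against PyTorch's built-in nn.Transformer; it is a minimal illustration of the encode-then-decode-with-cross-attention pattern, not code from the transformer-translator-pytorch project or from MyCaffe.

import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64                 # illustrative sizes
embed = nn.Embedding(vocab_size, d_model)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)
out_proj = nn.Linear(d_model, vocab_size)

src = torch.randint(0, vocab_size, (8, 20))    # source tokens (the input query)
tgt = torch.randint(0, vocab_size, (8, 15))    # target tokens generated so far (the response)

# Causal mask: each target position may only attend to earlier target positions,
# while the decoder's cross-attention attends to the full encoded source.
tgt_mask = model.generate_square_subsequent_mask(tgt.size(1))
hidden = model(embed(src), embed(tgt), tgt_mask=tgt_mask)
logits = out_proj(hidden)                      # next-token scores per target position
print(logits.shape)                            # torch.Size([8, 15, 1000])

The decoder's cross-attention layers are what map the encoded context of the query onto the growing response, which is the mapping between the two described above.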

Data Flow when Training an Encoder/Decoder Model for Language Translation

When training any deep learning AI model, knowing your data is critical. This is especially true when training a transformer-based encoder/decoder model, where data ordering is important. In this post, we analyze the Python open-source language translation encoder/decoder transformer model by Jaewoo (Kyle) Song [1], which is based on the ‘Attention Is All You Need’ …
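One part of that data ordering is how the target sentence is split between the decoder's input and its training labels. The snippet below shows the usual shifted, teacher-forcing layout with made-up token ids; it illustrates the general convention rather than the exact preprocessing in Song's project.

# Illustrative teacher-forcing layout for one translation pair; token ids are hypothetical.
SOS, EOS, PAD = 1, 2, 0                        # PAD would be used to pad batches to a fixed length

src_tokens = [5, 17, 42, 9, EOS]               # encoder input: the full source sentence
tgt_tokens = [7, 23, 11, EOS]                  # reference translation

decoder_input  = [SOS] + tgt_tokens[:-1]       # shifted right: starts with <sos>, drops <eos>
decoder_labels = tgt_tokens                    # what the decoder must predict at each step

# At step t the decoder sees decoder_input[:t+1] (enforced by a causal mask) plus the
# encoded source sentence, and is trained to output decoder_labels[t].
for t, (inp, lbl) in enumerate(zip(decoder_input, decoder_labels)):
    print(f"step {t}: last input token {inp} -> predict {lbl}")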

Converting a GPT Model into a full Encoder/Decoder Transformer Model

GPT is a great transformer model used to solve many natural language problems; however, GPT only implements the encoder side of a full encoder/decoder transformer model as described by Vaswani et al. [1]. Only a few changes are needed to implement a full encoder/decoder transformer model as shown below (GPT portion inspired by the minGPT …
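As a rough illustration of the kind of change involved, the sketch below extends a GPT-style block with a cross-attention step over encoder outputs; the module and parameter names are simplified and hypothetical, and do not correspond to the MyCaffe layer names used in the post.

import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    # A GPT-style block extended with cross-attention over encoder outputs.
    def __init__(self, d_model=64, n_head=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)  # the new step vs. GPT
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2, self.ln3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, enc_out, causal_mask):
        # 1) Masked self-attention, as in GPT.
        h = self.ln1(x)
        x = x + self.self_attn(h, h, h, attn_mask=causal_mask, need_weights=False)[0]
        # 2) Cross-attention over the encoder outputs -- the piece GPT lacks.
        h = self.ln2(x)
        x = x + self.cross_attn(h, enc_out, enc_out, need_weights=False)[0]
        # 3) Position-wise feed-forward, as in GPT.
        return x + self.mlp(self.ln3(x))

block = DecoderBlock()
tgt = torch.randn(2, 10, 64)                       # decoder hidden states
enc = torch.randn(2, 12, 64)                       # encoder outputs
mask = torch.triu(torch.full((10, 10), float("-inf")), diagonal=1)
print(block(tgt, enc, mask).shape)                 # torch.Size([2, 10, 64])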