Understanding Instruct Llama2 and Fine-Tuning with LoRA – A Visual Design Walkthrough

In this post we explore the Instruct Llama2 created by Sridykhan [1], who altered Karpathy’s Baby Llama [2] model to “follow instructions and write tiny stories accordingly.” This new Instruct Llama2 model uses the same model and dataset as Karpathy’s original. However, with Instruct Llama2, the model is trained with different inputs …

Understanding Baby Llama2 Training – A Visual Design Walkthrough

In this post we explore how to train the Llama2 model using the Baby Llama2 created by Andrej Karpathy [1], which is based on his original minGPT model [2] and has the same basic transformer architecture used by other generative AI models, such as ChatGPT [3]. Generative transformer models employ a stack of transformer blocks …

Understanding Llama2.c And ChatGPT Inferencing – A Visual Design Walkthrough

ChatGPT is an amazing technology that has taken the world by storm. Under the hood, a highly trained large language model (LLM) creates the response to each query sent to the service. In July 2023 Meta released the open-source Llama2 LLM and on September 29, 2023, Meta released an open-source Llama2-Long LLM [1] which appears …

Understanding LLM Fine Tuning with Low-Rank Adaptation (LoRA)

In this blog post we discuss the design of a MyCaffe implementation of the paper “LoRA: Low-Rank Adaptation of Large Language Models” by Hu et al. [1] and describe how LoRA leverages the knowledge of a trained LLM to solve new, specific problems efficiently through fine-tuning. LLMs are immensely powerful but are created at …

Using Synthetic Data (change points) to enhance the Momentum Transformer for High(er) Sharpe Ratios

In this post we describe a method of calculating change points using Gaussian processes as described in the paper “Slow Momentum with Fast Reversion: A Trading Strategy Using Deep Learning and Changepoint Detection” by Wood et al. [1], published in 2021. In addition, we show how the change point synthetic data enhances the Momentum Transformer …

Understanding TFT Momentum Rebalancing for High Sharpe Ratios

In this post we describe the Temporal Fusion Transformer-based Momentum Rebalancing Transformer described in the paper “Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture” by Wood et al. [1], published in 2022. The original code analyzed can be found on GitHub at [2]. Time-series momentum (TSMOM) strategies such as ‘buying the winners and …

Understanding Adaptive LSTM-Autoencoder Change Point Detection

As discussed in our previous post, Change Point Detection (CPD) is an important part of time-series analysis used in numerous fields such as medicine, aerospace, finance, business, metrology and entertainment. In this post, we expand our analysis of an adaptive, online algorithm for Change Point Detection based on the paper “Memory-free Online Change-point Detection: A …

Understanding Contrastive Change Point Detection

Change Point Detection (CPD) is an important field of time-series analysis that provides methods of detecting changes in mean, variance, and distribution structure within time-series data. It has many applications in different fields, such as: Medicine: Change point detection can help monitor the health condition of patients, detect anomalies in vital signs, diagnose diseases, and …

Understanding the PatchTST Model for Time Series Prediction

In this blog post, we evaluate, from a programmer’s perspective, the PatchTST model described in “A Time Series is Worth 64 Words: Long-term Forecasting with Transformers” by Nie et al., 2022. The PatchTST is a transformer-based model for multivariate time-series prediction that separates the input data into ‘patches’ that are then fed into a standard …

Understanding FSNets Learning Fast and Slow for Online Time Series Forecasting

In this blog post, we evaluate, from a programmer’s perspective, the FSNet described in “Learning Fast and Slow for Online Time Series Forecasting” by Pham et al., 2022 [1]. The authors of FSNet describe the model as inspired by “Complementary Learning Systems (CLS) theory” to provide “a novel framework to address the challenges of online forecasting” …