| Title: | Implementation of Transformer Deep Neural Network with Vignettes |
|---|---|
| Description: | Transformer is a deep neural network architecture based, among other things, on the attention mechanism (Vaswani et al. (2017) <doi:10.48550/arXiv.1706.03762>). |
| Authors: | Bastiaan Quast [aut, cre] (ORCID: <https://orcid.org/0000-0002-2951-3577>) |
| Maintainer: | Bastiaan Quast <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-28 10:17:39 UTC |
| Source: | https://github.com/bquast/transformer |
Feed Forward Layer
feed_forward(x, dff, d_model)

| Argument | Description |
|---|---|
| x | inputs |
| dff | dimensions of feed-forward model |
| d_model | dimensions of the model |

output of the feed-forward layer
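
The arguments above can be illustrated with a minimal base-R sketch of a position-wise feed-forward layer (two linear maps with a ReLU in between). The name `feed_forward_sketch`, the random weight initialisation, and the zero biases are assumptions made for illustration; this is not the package's own code.

```r
# Hypothetical sketch, not the package implementation.
feed_forward_sketch <- function(x, dff, d_model) {
  # Assumption: weights are drawn at random and biases start at zero.
  W1 <- matrix(rnorm(d_model * dff), d_model, dff)
  b1 <- rep(0, dff)
  W2 <- matrix(rnorm(dff * d_model), dff, d_model)
  b2 <- rep(0, d_model)

  hidden <- sweep(x %*% W1, 2, b1, "+")   # expand d_model -> dff, add bias per column
  hidden <- pmax(hidden, 0)               # ReLU; pmax keeps the matrix dimensions
  sweep(hidden %*% W2, 2, b2, "+")        # project dff -> d_model, add bias per column
}

set.seed(1)
x <- matrix(rnorm(4 * 8), 4, 8)           # 4 positions, d_model = 8
ff_out <- feed_forward_sketch(x, dff = 16, d_model = 8)
dim(ff_out)                               # same shape as the input
```

The output has one row per input position and `d_model` columns, so it can be added back to the input as a residual.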
Layer Normalization
layer_norm(x, epsilon = 1e-06)

| Argument | Description |
|---|---|
| x | inputs |
| epsilon | small constant for numerical stability |

outputs of layer normalization
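
As a rough illustration of what layer normalization computes, the sketch below centres each row and divides by its standard deviation, with `epsilon` guarding against division by zero. The name `layer_norm_sketch` is hypothetical, and the use of the population variance (dividing by n rather than n - 1) is an assumption; the package may differ.

```r
# Hypothetical sketch, not the package implementation.
layer_norm_sketch <- function(x, epsilon = 1e-6) {
  mu <- rowMeans(x)                      # one mean per row
  sigma2 <- rowMeans((x - mu)^2)         # assumption: population variance per row
  (x - mu) / sqrt(sigma2 + epsilon)      # recycling applies mu[i] to every entry of row i
}

x <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 40), nrow = 2, byrow = TRUE)
normed <- layer_norm_sketch(x)
rowMeans(normed)                         # approximately 0 for each row
```

Both rows end up with the same normalized values even though the second row is ten times larger, which is the point of normalizing per row.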
Multi-Headed Attention
multi_head(Q, K, V, d_model, num_heads, mask = NULL)

| Argument | Description |
|---|---|
| Q | queries |
| K | keys |
| V | values |
| d_model | dimensions of the model |
| num_heads | number of heads |
| mask | optional mask |

multi-headed attention outputs
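
The following base-R sketch shows one way multi-headed attention can be put together: the `d_model` columns are split into `num_heads` equal slices, scaled dot-product attention runs on each slice, and the results are bound back together. All function names are hypothetical. The learned projection matrices of Vaswani et al. are omitted for brevity, and the mask is assumed to be additive (0 to keep a position, a large negative number to hide it); the package's own mask convention is not stated in this document.

```r
# Hypothetical sketch, not the package implementation.
softmax_rows <- function(z) {
  e <- exp(z - apply(z, 1, max))         # subtract the row maximum for stability
  e / rowSums(e)
}

attention_sketch <- function(Q, K, V, mask = NULL) {
  scores <- Q %*% t(K) / sqrt(ncol(K))   # scaled dot products
  if (!is.null(mask)) scores <- scores + mask  # assumption: additive mask
  softmax_rows(scores) %*% V
}

multi_head_sketch <- function(Q, K, V, d_model, num_heads, mask = NULL) {
  stopifnot(d_model %% num_heads == 0)
  depth <- d_model / num_heads
  heads <- lapply(seq_len(num_heads), function(h) {
    cols <- ((h - 1) * depth + 1):(h * depth)  # column slice for head h
    attention_sketch(Q[, cols, drop = FALSE],
                     K[, cols, drop = FALSE],
                     V[, cols, drop = FALSE], mask)
  })
  do.call(cbind, heads)                  # concatenate heads column-wise
}

set.seed(1)
x <- matrix(rnorm(5 * 8), 5, 8)          # 5 positions, d_model = 8
mask <- matrix(0, 5, 5)
mask[upper.tri(mask)] <- -1e9            # causal mask: hide later positions
mh_out <- multi_head_sketch(x, x, x, d_model = 8, num_heads = 2, mask = mask)
dim(mh_out)
```

With the causal mask, the first position can only attend to itself, so the first output row equals the first row of `V`.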
Row Means
row_means(x)

| Argument | Description |
|---|---|
| x | matrix |

vector with the mean of each row of the input matrix
row_means(t(matrix(1:5)))
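
For comparison, the same quantity can be computed with base R alone. The sketch below uses only `apply()` and `rowMeans()`; it does not call the package, and whether `row_means()` is implemented this way is an assumption.

```r
# Base-R equivalents of a per-row mean (not the package code).
m <- t(matrix(1:5))                # 1 x 5 matrix holding 1, 2, 3, 4, 5
means_apply <- apply(m, 1, mean)   # mean over each row
means_builtin <- rowMeans(m)       # vectorised built-in

m2 <- rbind(1:5, 6:10)             # two rows
means_two_rows <- rowMeans(m2)     # one mean per row
```

For the single-row input both approaches give 3, and the two-row matrix gives 3 and 8.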
Row Variances
row_vars(x)

| Argument | Description |
|---|---|
| x | matrix |

vector with the variance of each row of the input matrix
row_vars(t(matrix(1:5)))
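
A per-row variance can be defined in two common ways, and this document does not say which one `row_vars()` uses. The base-R sketch below computes both, so the difference is visible; it does not call the package.

```r
# Two conventions for a per-row variance (not the package code).
m <- rbind(1:5, c(2, 4, 6, 8, 10))

# Sample variance: divides by n - 1, as stats::var() does.
sample_var <- apply(m, 1, var)

# Population variance: divides by n.
pop_var <- rowMeans((m - rowMeans(m))^2)
```

Here the sample variances are 2.5 and 10, while the population variances are 2 and 8.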
Transformer
transformer(x, d_model, num_heads, dff, mask = NULL)

| Argument | Description |
|---|---|
| x | inputs |
| d_model | dimensions of the model |
| num_heads | number of heads |
| dff | dimensions of feed-forward model |
| mask | optional mask |

output of the transformer layer
x <- matrix(rnorm(50 * 512), 50, 512)
d_model <- 512
num_heads <- 8
dff <- 2048
output <- transformer(x, d_model, num_heads, dff)
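
To show how the pieces of a transformer layer can fit together, here is a compact base-R sketch that chains self-attention, a residual connection with normalization, a feed-forward step, and a second residual with normalization, following the layout described by Vaswani et al. (2017). Every name in it is hypothetical. It uses a single attention head, no learned projections, and random feed-forward weights, and whether the package orders its steps this way is an assumption.

```r
# Hypothetical sketch of one encoder-style block, not the package implementation.
softmax_rows <- function(z) {
  e <- exp(z - apply(z, 1, max))     # subtract the row maximum for stability
  e / rowSums(e)
}

norm_rows <- function(x, eps = 1e-6) {
  mu <- rowMeans(x)
  (x - mu) / sqrt(rowMeans((x - mu)^2) + eps)
}

encoder_block_sketch <- function(x, d_model, dff) {
  # Single-head self-attention with Q = K = V = x (simplification).
  attn <- softmax_rows(x %*% t(x) / sqrt(d_model)) %*% x
  h <- norm_rows(x + attn)           # residual connection, then normalization

  # Feed-forward step with randomly drawn weights (assumption).
  W1 <- matrix(rnorm(d_model * dff, sd = 0.1), d_model, dff)
  W2 <- matrix(rnorm(dff * d_model, sd = 0.1), dff, d_model)
  ff <- pmax(h %*% W1, 0) %*% W2     # ReLU between the two linear maps

  norm_rows(h + ff)                  # second residual connection and normalization
}

set.seed(1)
x_small <- matrix(rnorm(6 * 16), 6, 16)   # 6 positions, d_model = 16
block_out <- encoder_block_sketch(x_small, d_model = 16, dff = 32)
dim(block_out)
```

The output keeps the input's shape, which is what allows such blocks to be stacked.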