Title: | Self-Attention Algorithm |
---|---|
Description: | Helper functions for the Self-Attention algorithm, with demonstration vignettes of increasing depth that show how to construct it. Based on Vaswani et al. (2017) <doi:10.48550/arXiv.1706.03762>; Dan Jurafsky and James H. Martin (2022, ISBN:978-0131873216) "Speech and Language Processing (3rd ed.)" <https://web.stanford.edu/~jurafsky/slp3/>; and Alex Graves (2020) "Attention and Memory in Deep Learning" <https://www.youtube.com/watch?v=AIiwuClvH6k>. |
Authors: | Bastiaan Quast [aut, cre] |
Maintainer: | Bastiaan Quast <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.4.0 |
Built: | 2025-01-07 03:59:25 UTC |
Source: | https://github.com/bquast/attention |
Attention mechanism
attention(Q, K, V, mask = NULL)
Q | queries |
K | keys |
V | values |
mask | optional mask |
attention values
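As a rough illustration of what an attention function computes, the sketch below follows the scaled dot-product formulation in Vaswani et al. (2017): softmax(Q K^T / sqrt(d_k)) V. The name `attention_sketch` is hypothetical and this is not the package's implementation. The `mask` argument is left out because its semantics are not described on this page.

```r
# Hypothetical helper for illustration; not the package's implementation.
attention_sketch <- function(Q, K, V) {
  d_k <- ncol(K)                        # dimension of the keys
  scores <- Q %*% t(K) / sqrt(d_k)      # scaled dot products, one row per query

  # Row-wise softmax; subtracting each row's maximum avoids overflow in exp()
  shifted <- scores - apply(scores, 1, max)
  weights <- exp(shifted) / rowSums(exp(shifted))

  weights %*% V                         # weighted average of the rows of V
}

# Small example: 3 queries attending over 4 key/value pairs
set.seed(1)
Q <- matrix(rnorm(6),  nrow = 3, ncol = 2)
K <- matrix(rnorm(8),  nrow = 4, ncol = 2)
V <- matrix(rnorm(12), nrow = 4, ncol = 3)
out <- attention_sketch(Q, K, V)
dim(out)  # one output row per query, one column per value dimension
```

Each output row is a convex combination of the rows of V, so a V made of identical rows is returned unchanged for every query.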
Compute weights from a scores matrix
ComputeWeights(scores)
scores | input value (numeric) |
output value (numeric)
# Set up a scores matrix
scores <- matrix(c( 6,  4, 10,  5,
                    4,  6, 10,  6,
                   10, 10, 20, 11,
                    3,  1,  4,  2),
                 nrow = 4, ncol = 4, byrow = TRUE)

# Compute the weights based on the scores matrix
ComputeWeights(scores)

# this outputs
#            [,1]       [,2]      [,3]       [,4]
# [1,] 0.10679806 0.03928881 0.7891368 0.06477630
# [2,] 0.03770440 0.10249120 0.7573132 0.10249120
# [3,] 0.00657627 0.00657627 0.9760050 0.01084244
# [4,] 0.27600434 0.10153632 0.4550542 0.16740510
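The printed weights in the example are consistent with dividing the scores by the square root of the number of columns and then applying a softmax to each row. The base R sketch below makes that reading explicit. The name `compute_weights_sketch` is hypothetical, and the code is inferred from the example output rather than taken from the package source.

```r
# Hypothetical re-derivation of the example output; an assumption about
# what ComputeWeights() does, not the package's code.
compute_weights_sketch <- function(scores) {
  scaled <- scores / sqrt(ncol(scores))  # scale by sqrt of the column count
  e <- exp(scaled)
  e / rowSums(e)                         # normalise each row to sum to 1
}

scores <- matrix(c( 6,  4, 10,  5,
                    4,  6, 10,  6,
                   10, 10, 20, 11,
                    3,  1,  4,  2),
                 nrow = 4, ncol = 4, byrow = TRUE)

w <- compute_weights_sketch(scores)
print(w)
rowSums(w)  # every row of weights sums to 1
```

For the first row, the scaled scores are 3, 2, 5 and 2.5, so the first weight is exp(3) / (exp(3) + exp(2) + exp(5) + exp(2.5)), which is approximately 0.1068 and matches the example.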
Maximum of Matrix Rows
RowMax(x)
x | input value (numeric) |
output value (numeric)
# generate a matrix of integers (also works for floats)
set.seed(0)
M = matrix(floor(runif(9, min=0, max=3)), nrow=3, ncol=3)
print(M)

# this outputs
#      [,1] [,2] [,3]
# [1,]    2    1    2
# [2,]    0    2    2
# [3,]    1    0    1

# apply RowMax() to the matrix M, reformat output as matrix again
# to keep the maxs on their corresponding rows
RowMax(M)

# this outputs
#      [,1]
# [1,]    2
# [2,]    2
# [3,]    1
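For comparison, the same row maxima can be obtained in base R with `apply()`. The sketch below uses a hypothetical name, `row_max_sketch`, and is not the package's implementation. It returns a one-column matrix so that each maximum stays on its own row, and it enters M explicitly so that it does not depend on the random number generator.

```r
# Hypothetical base R equivalent for illustration only.
row_max_sketch <- function(x) {
  # apply(x, 1, max) gives one maximum per row; wrap it as a column matrix
  matrix(apply(x, 1, max), ncol = 1)
}

# Same matrix as in the example above, entered column by column
M <- matrix(c(2, 0, 1,
              1, 2, 0,
              2, 2, 1), nrow = 3, ncol = 3)

m <- row_max_sketch(M)
print(m)

# A common use: subtract each row's maximum before exponentiating
shifted <- M - as.vector(m)
print(shifted)
```

After the subtraction, the largest entry in every row of `shifted` is 0, which keeps a later call to `exp()` from overflowing.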
SoftMax sigmoid function
SoftMax(x)
x | input value (numeric) |
output value (numeric)
# create a vector of integers (also works for non-integers)
set.seed(0)
V = c(floor(runif(9, min=-3, max=3)))
print(V)

# this outputs
# [1]  2 -2 -1  0  2 -2  2  2  0

# apply the SoftMax() function to V
sV <- SoftMax(V)
print(sV)

# this outputs
# [1] 0.229511038 0.004203641 0.011426682 0.031060941
#     0.229511038 0.004203641 0.229511038 0.229511038 0.031060941
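The example output matches the usual softmax definition, exp(x_i) / sum_j exp(x_j), taken over the whole vector. The sketch below writes that definition out and compares it with a variant that subtracts the maximum first. The two agree mathematically, but the shifted version avoids overflow for large inputs. Both function names are hypothetical, and neither is claimed to be the package's implementation.

```r
# Hypothetical helpers for illustration only.
softmax_naive <- function(x) exp(x) / sum(exp(x))

softmax_stable <- function(x) {
  z <- x - max(x)        # largest entry becomes 0, so exp(z) <= 1
  exp(z) / sum(exp(z))
}

# Same vector as in the example above, entered explicitly
V <- c(2, -2, -1, 0, 2, -2, 2, 2, 0)
all.equal(softmax_naive(V), softmax_stable(V))
print(softmax_stable(V))

# With large inputs, exp() overflows to Inf in the naive version
big <- c(1000, 1001)
softmax_naive(big)    # Inf / Inf gives NaN
softmax_stable(big)   # finite values that sum to 1
```

The four entries equal to 2 each receive exp(2) / (4 exp(2) + 2 exp(-2) + exp(-1) + 2), approximately 0.2295, as in the printed result.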