Metaprogramming, also known as "computing on the language", means writing code that writes or manipulates code. Since everything entered as valid code in R is an "expression", R has strong metaprogramming capabilities.
Metaprogramming can get complicated and fragile. If you choose to use it in your code, be sure that you have a valid reason.
Four main types describe these capabilities:
- Constants are `NULL` or length-1 atomic vectors, e.g. `"a"` or `1L`.
- Symbols are also called names, for instance `var` in `var <- 1`. Test for one with `is.name()` or `is.symbol()`; the latter is better for consistency.
- Calls represent function calls in a special form where the first element is usually the symbol naming the function. Test for one with `is.call()`.
- Pairlists mostly appear as the formal arguments of functions.
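As a minimal sketch in base R, each of the four types above can be produced and checked as follows (the object names are only illustrative):

```r
# Constants quote to themselves
const <- quote(1L)
is_const <- identical(const, 1L)

# A symbol is the bare name of an object
sym <- quote(var)
is_sym <- is.symbol(sym)

# A call: its first element is the function symbol, here `<-`
cl <- quote(var <- 1)
is_cl <- is.call(cl)
first_el <- cl[[1]]

# The formal arguments of a function are stored as a pairlist
pl <- formals(function(x, y = 2) NULL)
is_pl <- is.pairlist(pl)
```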
We look at the available functions next.
Expression
- Expressions are statements that form the R language.
- Create expressions with `expression()` or `vector("expression")`.
expression()
is.expression()
as.expression()
- An expression object is also a list under the hood. Therefore, it can be subsetted with the standard indexing operators `[`, `[[` and `$`. The assignment forms of these operators, `[[<-` and `$<-`, replace or remove elements.
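A short sketch of list-style indexing on an expression vector; the element names `first` and `second` are made up for illustration:

```r
# An expression vector with two named elements
e <- expression(first = a <- 1, second = b + 2)

n_before <- length(e)   # number of stored expressions
second   <- e$second    # extract one element by name

# Replace the second element, then remove the first one
e[[2]] <- quote(b * 2)
e[[1]] <- NULL
```

Assigning `NULL` drops the element, just as it does for an ordinary list.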
Substitute
`substitute()` replaces variables with values in an expression, so it can be used to template expressions.
substitute(x * y, list(x = 2, y = 5))
`deparse(substitute(x))` is a common trick to get an argument's name as a character string within a function.
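Both uses can be sketched together. The helper `arg_name()` is hypothetical and exists only to illustrate the `deparse(substitute(x))` idiom:

```r
# Templating: fill in x and y, then evaluate the resulting call
tmpl   <- substitute(x * y, list(x = 2, y = 5))
result <- eval(tmpl)

# Capture the argument as written by the caller, without evaluating it
arg_name <- function(x) deparse(substitute(x))
captured <- arg_name(mtcars)
```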
Quote
`quote()` and `expression()` behave much the same when you evaluate their results with `eval()`. The difference is that `expression()` wraps the statements in an expression object, and therefore returns a vector of unevaluated expressions, whereas `quote()` returns a single unevaluated expression.
as.list(quote(x <- 2 + 3))
as.list(expression(y <- 5 * 8))
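The difference shows up in the object types, while evaluation behaves alike. This sketch evaluates both in a scratch environment (`env` is an arbitrary name) so the global workspace stays untouched:

```r
q <- quote(x <- 2 + 3)
e <- expression(y <- 5 * 8)

type_q <- typeof(q)   # a single call
type_e <- typeof(e)   # a vector of expressions
len_e  <- length(e)

# Evaluate both in a scratch environment
env <- new.env()
eval(q, env)
eval(e, env)
x_val <- get("x", envir = env)
y_val <- get("y", envir = env)
```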
`bquote()` is just like `quote()` but allows partial substitution in expressions. Only the expressions wrapped in `.()` are evaluated. `bquote()` is the only form of "quasiquotation" available in base R (Wickham, 2019). Using `bquote()` can sometimes be more flexible than using `substitute()`. For example:
n <- 5
substitute(p + x, list(x = n))
bquote(p + .(n))
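As a sketch of the comparison, both forms should build the same call here. The extra flexibility comes from `.()` accepting any expression to evaluate, not just a variable name:

```r
n <- 5

via_substitute <- substitute(p + x, list(x = n))
via_bquote     <- bquote(p + .(n))
same <- identical(via_substitute, via_bquote)

# .() can hold a computed value, inlined into the quoted call
computed <- bquote(p + .(n * 2))
```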
`enquote()` evaluates its argument and wraps the result in a call to `quote()`:
z <- 5
enquote(z == 1)
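A small sketch of what this means in practice: `enquote()` is an ordinary function, so its argument is evaluated first and the value is then protected by a `quote()` call.

```r
z <- 5

# z == 1 is evaluated to FALSE, then wrapped: quote(FALSE)
eq <- enquote(z == 1)
eq_value <- eval(eq)

# Wrapping an already quoted call protects it from one round of evaluation
cl <- enquote(quote(a + b))
inner <- eval(cl)   # the unevaluated call a + b
```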
If you want to return the `quote()` call itself, wrap it inside `substitute()`.
substitute(quote(a = 2))
Symbols
`name` and `symbol` mean the same thing: both refer to the name of an R object.
as.symbol()
is.symbol()
as.name()
is.name()
While `class()` and `mode()` say `name`, the rest say `symbol`.
e <- expression(fun <- function(x) x)
e[[1]]
# fun <- function(x) x
e[[1]][[1]]
# `<-`
e[[1]][[2]]
# fun
mmy::object_types(e[[1]][[2]])
# __type__ __value__
# 1 class name
# 2 typeof symbol
# 3 mode name
# 4 storage.mode symbol
# 5 sexp.type SYMSXP
Unfortunately, the R interface is full of legacy: at some point in time, symbols were called names. Although that is technically correct, I find it creates confusion with the actual `names()` function.
Symbols have a "name" mode, a "symbol" storage mode and a "symbol" type.
There's a note in the documentation at `?name`:
The term ‘symbol’ is from the LISP background of R, whereas ‘name’ has been the standard S term for this.
I prefer to stick to "symbol" as it also seems to be more common among other programming languages.
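The name/symbol split can be checked with base R alone. This is a minimal sketch; the symbol `fun` is arbitrary:

```r
s <- as.symbol("fun")

# as.name() and as.symbol() create the same object
same <- identical(s, as.name("fun"))

# The four descriptors disagree on the terminology
info <- c(
  class        = class(s),
  typeof       = typeof(s),
  mode         = mode(s),
  storage.mode = storage.mode(s)
)
```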
Call
`call()` constructs a call object.
call("convolve")
call("convolve", x = 3, y = 5)
- Wrapping an assignment in parentheses prints the assigned value, because `(` returns its argument visibly (see `?Paren`). This does not evaluate the call itself; use `eval()` for that.
(cconv <- call("convolve", x = 3, y = 5))
as.list(cconv)
eval(cconv)
N.B. `do.call()` calls a function, given by name or as a function object, on a given list of arguments.
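A brief sketch comparing `do.call()` with building a call and evaluating it; `sum` is used only because it is a convenient base function:

```r
args <- list(1, 2, 3)

# do.call() accepts the function by name or as a function object
by_name <- do.call("sum", args)
by_fun  <- do.call(sum, args)

# The equivalent two-step route: construct the call, then evaluate it
cl      <- call("sum", 1, 2, 3)
by_call <- eval(cl)
```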
N.B. There are several functions to access and manipulate the call stack. See the `?sys.parent` documentation for more information.
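As one small illustration of the call-stack functions, `sys.call()` with a negative argument looks back up the stack. The function names `inner_fn` and `outer_fn` are hypothetical:

```r
inner_fn <- function() {
  # -1 means "one frame back": the call that invoked our caller
  caller <- sys.call(-1)
  deparse(caller)
}

outer_fn <- function() inner_fn()

res <- outer_fn()
```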
Function
Functions (closures, in R's terminology; primitive functions are the exception) have three components:
- Formals (or arguments). An argument can be a symbol or the special dot-dot-dot (`...`).
- Body
- Environment
square <- function(x) {
  x ^ 2
}
formals(square)
# $x
body(square)
# {
# x^2
# }
environment(square)
# <environment: R_GlobalEnv>
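The same accessors have assignment forms, so a function's components can be modified after the fact. A minimal sketch, where `cube` and the extra `verbose` argument are invented for illustration:

```r
square <- function(x) x ^ 2

# Copy the function, then swap out its body and formals
cube <- square
body(cube)    <- quote(x ^ 3)
formals(cube) <- alist(x = , verbose = FALSE)

cube_val   <- cube(2)
square_val <- square(2)   # the original is unchanged
arg_names  <- names(formals(cube))
```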
Language
R considers calls, expressions and symbols to be `language` objects.
e <- expression(x <- 1)
is.language(e)
# [1] TRUE
mmy::object_types(e)
# __type__ __value__
# 1 class expression
# 2 typeof expression
# 3 mode expression
# 4 storage.mode expression
# 5 sexp.type EXPRSXP
e[[1]][[1]]
# `<-`
mmy::object_types(e[[1]][[1]])
# __type__ __value__
# 1 class name
# 2 typeof symbol
# 3 mode name
# 4 storage.mode symbol
# 5 sexp.type SYMSXP
Note that quoting a constant returns the constant itself, which is “not” considered a language object.
is.language(quote(1))
# [1] FALSE
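Whether `is.language()` returns `TRUE` depends on what was quoted, as this short sketch shows:

```r
res <- c(
  constant   = is.language(quote(1)),        # a plain number
  symbol     = is.language(quote(x)),        # a symbol
  call       = is.language(quote(x + 1)),    # a call
  expression = is.language(expression(1))    # an expression vector
)

const_type <- typeof(quote(1))
```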
Parsing
`utils::getParseData()` returns the parser's token-level data for parsed R code, which lets you inspect the code at a low level.
e <- expression({
  x <- 10
  y <- "char"
  z <<- 2
  # some comment here..
  lapply(mtcars, function(i) {
    pnorm(mtcars[i, i], log.p = TRUE)
  }) -> res
  paste(y, res, sep = ":")
})
prs <- parse(text = e)
parsed <- getParseData(prs)
head(parsed)
# line1 col1 line2 col2 id parent token terminal text
# 127 1 1 9 1 127 0 expr FALSE
# 1 1 1 1 1 1 127 '{' TRUE {
# 9 2 5 2 11 9 127 expr FALSE
# 3 2 5 2 5 3 5 SYMBOL TRUE x
# 5 2 5 2 5 5 9 expr FALSE
# 4 2 7 2 8 4 9 LEFT_ASSIGN TRUE <-
Token | Example | Note |
---|---|---|
COMMENT | `#` | |
LEFT_ASSIGN | `<-`, `<<-` | right assign `->` is turned into left assign |
SYMBOL | `mtcars`, `x`, ... | |
FUNCTION | `function` | |
SYMBOL_FORMALS | `i` | |
SYMBOL_FUNCTION_CALL | `lapply`, `pnorm`, ... | |
SYMBOL_SUB | `log.p` | specified argument names in function calls |
EQ_ASSIGN | `=` | assignment with the equal sign, e.g. `x = 2` |
EQ_SUB | `=` | function argument with a value, e.g. in `square(x = 4)` |
STR_CONST | `"char"` | |
NUM_CONST | `10` | |
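To see several of these tokens at once, one option is to parse a string directly with `keep.source = TRUE`, which preserves comments. This is a sketch with a made-up two-line snippet, not the example used earlier:

```r
code <- 'x = 2 # a comment
res <- lapply(1:3, function(i) round(i / 3, digits = 1))'

# keep.source = TRUE is required for the parse data to be attached
prs <- parse(text = code, keep.source = TRUE)
pd  <- utils::getParseData(prs)

# Terminal rows carry the actual source text of each token
terminals <- pd[pd$terminal, c("token", "text")]
tokens    <- unique(pd$token)
```

Printing `terminals` lists each token next to the text it came from.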
There are also some tokens such as `'{'`, `'('` and `','`. The right assign operator `->` is turned into the commonly used left assign operator `<-` when R parses expressions.
expression(lapply(mtcars, mean) -> res)
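The conversion can be confirmed by comparing the two quoted forms directly, as in this small sketch:

```r
right <- quote(lapply(mtcars, mean) -> res)
left  <- quote(res <- lapply(mtcars, mean))

# Both spellings produce the same call object
same <- identical(right, left)

# The function position holds `<-`, not `->`
op <- right[[1]]
```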
Resources
- The R language definition documents the whole language.
- Chambers, J. (2008). Software for data analysis: programming with R. Springer Science & Business Media.
- Wickham, H. (2019). Advanced R (second edition). CRC Press.
- Mailund, T. (2017). Metaprogramming in R. DOI 10.1007/978-1-4842-2881-4_1
- Kalibera, T., Maj, P., Morandat, F., & Vitek, J. (2014, March). A fast abstract syntax tree interpreter for R. In ACM SIGPLAN Notices (Vol. 49, No. 7, pp. 89-102). ACM.