[ ... ] represents an optional entry. Here,
- <forall declarator> is the forall declarator, which declares that the function is polymorphic. This is optional.
- <return type> is the return type of the function.
- <function name> is the function name.
- <comma separated function args> is a comma-separated list of function arguments, each argument consisting of a type and the argument's name.
- <specifiers> are specifiers that instruct the compiler on how to process the function.
- <function body> is the actual function body, which can be of three kinds: an empty body, an assembler body, or a standard body.
Return type
The return type can be any atomic or composite type, as described in the Types section. The return type can also be written as _, in which case the compiler infers it.
For example, a function divAndMod declared with _ as its return type has the inferred type (int, int) -> (int, int). The function computes the division and modulo of the parameters m and n by using the division and modulo operator /%, which always returns a two-element tensor (int, int).
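As a minimal sketch of both cases, the first group below spells out the return type and the last function leaves it to the type-checker. Every name except divAndMod is hypothetical and chosen only for illustration.

```func
;; Explicit return types: atomic, tensor, tuple, and unit.
int get_answer() { return 42; }
(int, int) get_pair() { return (1, 2); }
[int, int] get_tuple() { return [1, 2]; }
() do_nothing() { }

;; Inferred return type: `_` lets the type-checker derive (int, int).
_ divAndMod(int m, int n) {
  return m /% n;
}
```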
Function name
A function name can be any valid identifier. Additionally, it may start with the symbol . or ~. The special function call notation section explains what these prefixes mean and how they affect the function name.
For example, udict_add_builder?, dict_set, and ~dict_set are all valid function names, and each is distinct. These functions are defined in stdlib.fc.
FunC reserves several function names. See the reserved functions article for more details.
Function arguments
A function can receive zero or more argument declarations, separated by commas. The following kinds of argument declarations are allowed:
- Ordinary declaration: an argument is declared using its type followed by its name. For example, writing int x in the argument list of a function foo declares an argument named x of type int. A function can declare several arguments this way, or none at all.
- Unused argument declaration: only its type needs to be specified. For example, a function can declare its first argument as int x and its second argument as just int. This is a valid function of type (int, int) -> int, but the function does not use its second argument.
- Argument with inferred type declaration: if an argument's type is not explicitly declared, it is inferred by the type-checker. For example, a function inc can declare its argument as just x and still have the inferred type int -> int, meaning x is automatically recognized as an int.
Even though a function may appear to take multiple arguments, it takes a single argument of tensor type. For more details on this distinction, refer to the function call section. However, for convenience, the individual components of this tensor are conventionally referred to as function arguments.
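The three kinds of argument declaration can be sketched as follows. The function names sum, forty_two, and first, as well as all bodies, are assumptions made for illustration; only foo and inc are named in the text above.

```func
;; Ordinary declaration: type followed by name.
int foo(int x) {
  return x + 2;
}

;; Several ordinary declarations, and none at all.
int sum(int a, int b) { return a + b; }
int forty_two() { return 42; }

;; Unused argument: the second argument has a type but no name.
int first(int x, int) {
  return x;
}

;; Inferred type: x is used in integer arithmetic, so it is an int.
int inc(x) {
  return x + 1;
}
```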
Specifiers
In FunC, function specifiers modify the behavior of functions. There are three types:
- impure
- Either inline or inline_ref, but not both
- method_id
When a function uses several specifiers, they must appear in the order listed above: impure must come before inline and method_id, inline_ref must come before method_id, etc.
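A small sketch of the ordering rule, using hypothetical function names and exception codes. Each function places impure first and the inlining specifier after it; method_id is shown on its own.

```func
;; impure comes before inline.
() check_positive(int x) impure inline {
  throw_unless(100, x > 0);
}

;; impure comes before inline_ref.
() check_small(int x) impure inline_ref {
  throw_unless(101, x < 1000);
}

;; method_id on its own.
int get_limit() method_id {
  return 1000;
}
```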
impure specifier
The impure specifier indicates that a function has side effects, such as modifying contract storage, sending messages, or throwing exceptions. If a function is not marked as impure and its result is unused, the FunC compiler may delete the function call for optimization.
For example, the stdlib.fc function random changes the internal state of the random number generator. The impure keyword prevents the compiler from removing calls to this function, even when the result of the call is unused.
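A sketch of how this plays out, assuming the declaration below matches the one in stdlib.fc (omit it when stdlib.fc is already included). The caller use_random is hypothetical.

```func
;; Assumed to mirror the stdlib.fc declaration of random.
int random() impure asm "RANDU256";

int use_random() impure {
  var n = 0;
  random();   ;; result is unused; the call survives only because random is impure
  return n;
}
```

Without the impure specifier on random, the compiler would be free to drop the call, and the generator state would not advance.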
Inline specifier
A function marked as inline is directly substituted into the code wherever it is called, eliminating the function call overhead. Recursive calls are not allowed for inline functions.
For example, if a function add that returns a + b is marked with the inline specifier, the compiler substitutes each call add(a, b) with a + b directly in the code.
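The substitution can be sketched like this. The function sum_three and its hand-inlined counterpart are hypothetical; the second version only illustrates the effect of inlining and is not actual compiler output.

```func
int add(int a, int b) inline {
  return a + b;
}

;; What the programmer writes:
int sum_three(int a, int b, int c) {
  return add(add(a, b), c);
}

;; Roughly what the compiler generates code for after inlining:
int sum_three_inlined(int a, int b, int c) {
  return (a + b) + c;
}
```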
inline_ref specifier
When a function is marked with the inline_ref specifier, its code is stored in a separate cell. Each time the function is called, the TVM executes a CALLREF command, which loads the code stored in the referenced cell and executes the function code.
To visualize this at a very high level, consider how programs are stored in the blockchain. Everything in the blockchain is a cell. A program is a directed acyclic graph (DAG) of cells. Each cell stores TVM instructions and can have up to 4 references to other cells. Each of those references represents code that the TVM can jump to.
So, you can picture a program as a cell that contains instructions together with references, such as Reference to cell A and Reference to cell B, which point to other cells containing further code of the program. When the TVM executes the instruction call reference A, the TVM loads the cell referenced by Reference to cell A and executes the cell.
When a function is marked as inline_ref, its code is placed in a separate cell, call it C. Every call to the function in the original program is then replaced with a call reference C, and the reference to C is added to the original program as a cell reference.
More concretely, imagine a program in which a function foo is marked as inline_ref and the main function calls foo twice. The compiler creates a cell storing the code of the main function, call it cell M, and another cell storing the code of the foo function, because it is marked as inline_ref, call it cell F. The two calls to foo inside main are replaced by reference calls to F, and the reference to F is added as a reference in cell M. When a call reference to F executes, the TVM loads the cell for F and executes it.
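A program of the shape described above might look like the following sketch. The bodies are assumptions chosen to keep the example short; the comments relate each part to cells M and F.

```func
;; Stored in its own cell (cell F) because of inline_ref.
int foo() inline_ref {
  return 1;
}

;; Stored in cell M. Each call to foo becomes a reference call to F,
;; and M holds a single reference to F.
int main() {
  return (foo() + foo());
}
```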
As the example suggests, unlike with the inline specifier, the code for foo is not duplicated, because both calls to foo load the same cell. As such, inline_ref is generally more efficient in terms of code size.
The only case where inline might be preferable is if the function is called just once, because loading cell references costs gas.
However, recursive calls to inline_ref functions remain impossible, as TVM cells do not support cyclic references.
method_id specifier
In a TVM program, every function has an internal integer ID that identifies it uniquely. These IDs are necessary because of the way the TVM calls functions within a program: it uses a dictionary where each key is a function ID that maps to the corresponding function code. When the TVM needs to invoke a particular function, the TVM looks up the ID in the dictionary and executes the corresponding code.
By default, functions are assigned sequential numbers starting from 1. If a function has the method_id specifier, the compiler instead computes an ID using the formula (crc16(<function_name>) & 0xffff) | 0x10000. Additionally, such a function becomes a get-method (or getter method), that is, a function that can be invoked by its name in a lite client or TON explorer.
The method_id specifier has the variant method_id(<some_number>), which allows you to set a function’s ID to a specific number manually.
For example, a function declared with the plain method_id specifier has its ID computed by the compiler and is available as a get-method in TON blockchain explorers. Declaring the function with method_id(65536) instead sets the specific ID 65536. Again, the function is available as a get-method in TON explorers.
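Both variants can be sketched as follows. The function names and the returned value are hypothetical; a real getter would typically read contract storage.

```func
;; ID computed by the compiler from the function name.
int get_counter() method_id {
  return 7;
}

;; ID set manually to 65536.
int get_counter_fixed() method_id(65536) {
  return 7;
}
```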
Important limitations and recommendations

19-bit limitation: Method IDs are limited to signed 19-bit integers, meaning the valid range is from -2^18 (inclusive) to (2^18 - 1) (inclusive), i.e., from -262,144 to 262,143.

Reserved ranges:
- -4 to 0 for special functions: main or recv_internal (id = 0), recv_external (id = -1), run_ticktock (id = -2), split_prepare (id = -3), split_install (id = -4)
- 1 to 999 for additional functions (approximate range).
- 65536 and above: default range for user functions when using automatic generation: (crc16(function_name) & 0xffff) | 0x10000
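The bounds above can be checked with a little arithmetic, written here as comments next to a hypothetical helper, is_valid_method_id, that is not part of stdlib.fc.

```func
;; Automatically generated IDs set bit 16 on a 16-bit checksum:
;;   smallest: (0x0000 & 0xffff) | 0x10000 = 0x10000 = 65536
;;   largest:  (0xffff & 0xffff) | 0x10000 = 0x1ffff = 131071
;; Both lie inside the signed 19-bit range [-262144, 262143].

;; Hypothetical helper: returns -1 (true) when id fits in 19 signed bits.
int is_valid_method_id(int id) inline {
  return (id >= -262144) & (id <= 262143);
}
```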
Function body
Empty body
An empty body, marked with a single semicolon ;, indicates that the function is declared but not yet defined. Its definition must appear later in the same file or in a different file processed before the current one by the FunC compiler. A function with an empty body is also called a function declaration.
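For example, writing the signature of a function add that takes two int arguments and returns an int, followed by a semicolon, declares a function named add with type (int, int) -> int but does not define it.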
In FunC, all functions must be defined or declared before using them in other functions, which explains the need for function declarations.
For example, if the code calls function foo inside the main function, but foo is defined after main, the compiler rejects the code. There are two ways to fix the error: declare foo before main, or move the definition of foo before main.
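The rejected program and the two fixes can be sketched as follows. The bodies are assumptions; the rejected version and the second fix are kept in comments so that the block as a whole still compiles.

```func
;; Rejected: foo is used before it is declared or defined.
;;   int main() { return foo(); }
;;   int foo() { return 1; }

;; Fix 1: declare foo before main, define it later.
int foo();

int main() {
  return foo();
}

int foo() {
  return 1;
}

;; Fix 2 (alternative): move the whole definition of foo above main.
;;   int foo() { return 1; }
;;   int main() { return foo(); }
```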
Assembler body
An assembler body defines the function using low-level TVM primitives for use in a FunC program. The body consists of the keyword asm, followed by a list of TVM instructions, and ends with the symbol ;.
For example, an assembler body can define a function add of type (int, int) -> int, using the TVM instruction ADD.
Refer to the assembler functions article for more details.
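A minimal sketch of such a definition, followed by a hypothetical caller add_three that uses it like any other function:

```func
;; add is implemented directly by the TVM instruction ADD.
int add(int a, int b) asm "ADD";

;; An assembler function is called like any other function.
int add_three(int a, int b, int c) {
  return add(add(a, b), c);
}
```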
Standard body
A standard body uses a block statement, i.e., the body of the function is defined inside curly braces { }.
For example:
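A minimal sketch of a function with a standard body. The function add and its body are chosen only for illustration.

```func
;; The statements between the braces form the function body.
int add(int a, int b) {
  return a + b;
}
```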
forall declarator
The forall declarator starts with the forall keyword, continues with a comma-separated list, and finishes with the symbol ->. Each element in the comma-separated list must be a type variable name. A type variable name can be any identifier, but capital letters are commonly used.
The forall declarator makes the function a polymorphic function, meaning that when the function is called, the type variables get replaced with actual types.
For example, consider a function pair_swap that declares two type variables X and Y. The function uses these two type variables to declare an argument pair of type [X, Y], i.e., a tuple where the first component is of type X and the second component of type Y. The function then swaps the components of the tuple and returns a tuple of type [Y, X].
Because pair_swap is polymorphic, it can be called with tuples of type [int, int], [int, cell], [cell, slice], [[int, int], cell], etc.
For instance:
- pair_swap([2, 3]) returns [3, 2]. In this case, both type variables X and Y get substituted with int.
- pair_swap([1, [2, 3, 4]]) returns [[2, 3, 4], 1]. In this case, type variable X gets substituted with int, and Y with [int, int, int].
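A sketch of pair_swap consistent with the description above, together with a hypothetical caller demo that exercises the two calls just listed. The local names p1 and p2 are assumptions.

```func
;; Declares type variables X and Y, takes [X, Y], returns [Y, X].
forall X, Y -> [Y, X] pair_swap([X, Y] pair) {
  [X p1, Y p2] = pair;
  return [p2, p1];
}

;; Hypothetical caller: the return type is left to the type-checker.
_ demo() {
  var swapped = pair_swap([2, 3]);          ;; X = int, Y = int
  var nested = pair_swap([1, [2, 3, 4]]);   ;; X = int, Y = [int, int, int]
  return (swapped, nested);
}
```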
Other forms of polymorphism, such as ad-hoc polymorphism with type classes, are not supported.
At the moment, type variables in polymorphic functions cannot be instantiated with tensor types. There is only one exception: the tensor type (a), where a is not a tensor type itself, since the compiler treats (a) as if it were a. This means you can't use pair_swap on a tuple of type [(int, int), int], because (int, int) is a tensor type.