DSL test suite reference
In TESTed, a test suite specifies which test cases are executed against a submission. TESTed differs from other test frameworks in that its test suites are independent of any programming language. As a result, a single test suite is sufficient to check submissions for the same exercise in different programming languages.
While TESTed has an advanced format for test suites, we have also developed a small domain-specific language (DSL) that makes creating common exercises much easier. This document is the reference for the DSL test suite format, and contains all options and possibilities.
DSL test suites are written in YAML. A JSON Schema of the format is available in the TESTed repository, which can enable checks and autocompletion in your editor.
Structure
The structure of a DSL test suite follows the general Dodona structure, and consists of three levels:
- Tabs
- Contexts
- Test cases
Below we describe objects of each level. Mandatory attributes are indicated with a star (*). At the end of this document, there is a full example of a test suite.
Root of the test suite
The test suite starts with either a root object or a list of tabs. The root object contains four attributes:
- tabs*: a list of tab objects
- namespace: the "namespace" for the code of the submission, such as the class name in Java
- config: the global configuration options
- language: the language of the expressions and statements. If this attribute is not set to "tested", all expressions and statements (except for return values) will be programming-language-specific expressions or statements.
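As a rough illustration of the root object, the sketch below combines the attributes listed above. The tab name, the namespace value and the function name are hypothetical placeholders, not names taken from TESTed itself.

```yaml
# Root object of a test suite (all names are illustrative assumptions).
namespace: "Solution"
# Expressions and statements use the language-independent TESTed syntax.
language: "tested"
tabs:
  - tab: "Basics"
    testcases:
      - expression: 'double_it(2)'
        return: 4
```

Since a test suite may also start with a list of tabs, the namespace and language lines can be dropped and the list under tabs written at the top level when no global settings are needed.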
Tabs
A tab object maps onto a tab in the output on Dodona. It has four possible attributes:
- tab*: the name of the tab to be displayed in Dodona
- contexts*: a list of contexts (if this is given, you cannot use the attribute testcases)
- testcases*: a list of test cases (if this is given, you cannot use the attribute contexts)
- config: the configuration options for this tab and all children
In a lot of exercises, you have precisely one test case per context. The testcases attribute covers exactly this situation: behind the scenes, each test case will be placed in its own context.
Hint
While there are four possible attributes, each tab object can have at most three, since contexts and testcases are mutually exclusive.
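The difference between the two forms can be sketched as follows. The tab names and the function square are hypothetical; only the attribute names come from the list above.

```yaml
# Tab using testcases: each test case gets its own context behind the scenes.
- tab: "Independent checks"
  testcases:
    - expression: 'square(3)'
      return: 9
    - expression: 'square(4)'
      return: 16
# Tab using contexts: test cases within one context can build on each other.
- tab: "Dependent checks"
  contexts:
    - testcases:
        - statement: 'value = square(3)'
        - expression: 'square(value)'
          return: 81
```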
Contexts
A context is a group of test cases that depend on each other. The context object has four attributes:
- testcases*: a list of test cases
- config: the configuration options for this context and all children
- context: an optional description of the context
- files: optional list of files
In most cases, it is fine to leave the description empty.
Each context must have at least one test case. Since each context is executed separately, the following two constraints apply:
- Only the first test case may have a "main call", i.e. command line arguments or stdin.
- Only the last test case may have a test for the program's exit code.
Do note that the first and last test case may be the same one: if you only have one test case, it may be a main call and have a check for the exit code.
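A minimal sketch of these two constraints, assuming a hypothetical program that greets the name read from stdin and exposes a hypothetical function greeting_count. It also assumes that an exit code check can be combined with a function call in the last test case.

```yaml
- tab: "Main call"
  contexts:
    # Two test cases: the main call comes first, the exit code check last.
    - context: "Greeting followed by a function call"
      testcases:
        - stdin: "Alice"
          stdout: "Hello Alice"
        - expression: 'greeting_count()'
          return: 1
          exit_code: 0
    # One test case: it is both the first and the last one.
    - testcases:
        - stdin: "Bob"
          stdout: "Hello Bob"
          exit_code: 0
```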
Test cases
Test cases are the building blocks of a test suite, and contain some input and the expected outputs (the tests). Within each context, the following constraints apply to test cases:
- Only the first test case may have a "main call", i.e. command line arguments or stdin.
- Only the last test case may have a test for the program's exit code.
Do note that the first and last test case may be the same one: if you only have one test case, it may be a main call and have a check for the exit code.
A test case can have the following attributes:
- config: the configuration options for this test case and all children
- files: optional list of files
Additionally, a test case can have all attributes described below, but do note:
- A test case can only have one "input", meaning the arguments/stdin, expression and statement attributes are mutually exclusive.
- The attribute return requires the attribute expression.
stdin
The data to provide to the standard input.
If this attribute is used, you cannot specify expression or statement as input, nor can you use return as a test.
arguments
A list of strings to pass to the program as the command line arguments.
If this attribute is used, you cannot specify expression or statement as input, nor can you use return as a test.
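As an illustration, a main call that combines stdin and arguments could look like the sketch below. The program behaviour and the --sum flag are assumptions made up for this example.

```yaml
- tab: "Input"
  testcases:
    # Main call with two lines on stdin and one command line argument.
    - stdin: |
        3
        4
      arguments: ["--sum"]
      stdout: "7"
```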
expression / statement
This attribute can take two kinds of values: a string or an object.
A string contains the expression to evaluate or the statement to execute during this test case. For a statement, in contrast to an expression, any return value is ignored.
Expressions and statements use the Python syntax, with some restrictions, which are detailed in the section "Expressions and statements" below.
If the value is an object, it must be a mapping from programming language to a language-specific expression or statement.
expression:
  python: "submission.the_function()"
  java: "Submission.theFunction()"
stdout / stderr
Specifies the expected output on standard output and standard error respectively.
The attribute is either a string (in which case the string is the expected value), or an object for more advanced cases. The object has the following attributes:
- data: the expected data, same as using a string
- config: the configuration options
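The two forms can be compared in a short sketch. The function print_area is hypothetical; the options under config are the test options described later in this document.

```yaml
- tab: "Output"
  testcases:
    # Short form: the string itself is the expected output.
    - statement: 'print_area(2)'
      stdout: "12.57"
    # Object form: the same expected data, with options for this test only.
    - statement: 'print_area(2)'
      stdout:
        data: "12.57"
        config:
          tryFloatingPoint: true
          applyRounding: true
          roundTo: 2
```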
exception
Specifies the expected message of an expected exception. Note that TESTed currently does not allow checking the exception type or class. For example, you cannot check that an assertion error or exception happened.
return
Specifies the expected return value.
By default, this attribute is interpreted as a YAML value. For example, a YAML string will result in a literal string value.
If you need more advanced return values, there are two options:
- strings tagged as !expression use the same Python syntax as expressions and statements
- objects tagged as !oracle denote the return value oracle (a custom check function) (see below)
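A short sketch of plain and tagged return values. The function names are hypothetical; the !expression tag is the one described above.

```yaml
- tab: "Return values"
  testcases:
    # Plain YAML string: interpreted as a literal string value.
    - expression: 'to_text(2)'
      return: "2"
    # Plain YAML list of integers.
    - expression: 'pair()'
      return: [0, 0]
    # Tagged string: parsed with the same syntax as expressions.
    - expression: 'echo(uint8(5))'
      return: !expression "uint8(5)"
```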
exit_code
Specifies the expected exit code of the program.
Note that only the last test case of a context can have this attribute, although the last test case can also be the first test case if needed.
Custom check function (oracles)
The following attributes can have a custom check function: return, stdout and stderr.
An object for a custom check function has the following attributes:
- oracle: the type of check function. Currently, this can be custom_check or builtin. builtin uses the built-in oracle. For custom_check, the parent object (the return, stdout, or stderr above it) should be tagged with !oracle.
- value (with return) or data (with stdout/stderr): the expected value (for advanced values see !expression above)
- file: the name of the file containing the custom check function (relative to the evaluation folder)
- name: the name of the check function (in snake case)
- arguments: a list of values that are arguments to the check function
For a return value:
return: !oracle
  value: "27-08-2023"
  oracle: "custom_check"
  file: "test.py"
  name: "evaluate_test"
  arguments: [5, 6]
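For stdout or stderr, the attribute list above indicates the same shape with data in place of value. A sketch, reusing the file and function names from the return example purely as placeholders:

```yaml
# Custom check function applied to standard output instead of a return value.
stdout: !oracle
  data: "27-08-2023"
  oracle: "custom_check"
  file: "test.py"
  name: "evaluate_test"
  arguments: [5, 6]
```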
The check function must have the following signature:
from evaluation_utils import EvaluationResult, ConvertedOracleContext
def check_function(context: ConvertedOracleContext, *args) -> EvaluationResult:
    # *args receives the values of the arguments attribute, in order.
    ...
The first argument of the check function is always a ConvertedOracleContext. This object has a few attributes:
- expected: the expected value of the oracle as defined by the key value in the test suite
- actual: the value that was actually generated by the submission
- execution_directory: path to the folder where the submission was judged
- evaluation_directory: path to the evaluation folder of the exercise (that contains the test suite)
- programming_language: the programming language of the submission
- natural_language: the natural language of the user that submitted this submission
The other arguments are the same as the arguments attribute from the test suite. In this example, the check function would have three arguments: the context and the two numbers from the test suite.
The return value is an instance of the class EvaluationResult from the module evaluation_utils. The constructor of this class has the following parameters:
- result: A boolean indicating if the generated value is correct or not.
- readable_expected, optional: The expected value to show on Dodona.
- readable_actual, optional: The generated value to show on Dodona.
- messages, optional: A list of messages (Messages or strings).
- dsl_expected, optional: The expected value as a string value. TESTed will convert this to the programming language of the submission before showing it on Dodona.
- dsl_actual, optional: The generated value as a string value. TESTed will convert this to the programming language of the submission before showing it on Dodona.
In most cases, and especially when preparing programming-language-independent exercises, it is better to use dsl_expected and dsl_actual: otherwise the check function itself is responsible for displaying the expected and actual value in the correct programming language.
Each entry in the list of messages must be a string or a Message. Message is a class from the evaluation_utils module and has the following attributes:
- description: the message to show.
- format: the format of the message, like text, code or html.
- permission: who can see the message: staff, student or zeus.
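Putting this together, a check function for the example above could look like this:
from evaluation_utils import EvaluationResult, Message
def evaluate_test(context, first, second):
    # first and second receive the two values of the arguments attribute.
    return EvaluationResult(
        result=True,
        dsl_expected=repr("hallo"),
        dsl_actual=repr("hallo"),
        messages=[Message(
            description="Hallo",
            format="html",
            permission="staff"
        )]
    )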
Files
Some parameters or other strings are the name of a file. If you want such a name to link to the actual file, the file needs to be added to the list of files. Each object in this list has two attributes:
- name: the name of the file as it appears in the input
- url: the location where the link should point to, relative to the exercise folder
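A minimal sketch of a context that declares a file used in one of its test cases. The file name, its location and the function count_lines are hypothetical.

```yaml
- tab: "Files"
  contexts:
    - files:
        # The name must match the string used in the test case below.
        - name: "data.txt"
          url: "media/workdir/data.txt"
      testcases:
        - expression: 'count_lines("data.txt")'
          return: 3
```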
Configuration options
The configuration object can be specified on any level and applies to all levels below it. For example, specifying the config on the tab level means it will apply to all contexts and, in turn, all test cases within that tab.
The configuration object has three attributes:
- stdout: the configuration options for standard output
- stderr: the configuration options for standard error
- return: the configuration options for the expected return value
Test options
This object contains a set of configuration options that influence how the test results are checked by TESTed. The following options are available:
- applyRounding: apply rounding when comparing values as floating-point numbers
- roundTo: the number of decimals to round to, if applyRounding is true
- caseInsensitive: ignore the case of text when comparing strings
- ignoreWhitespace: ignore leading and trailing whitespace
- tryFloatingPoint: try comparing text as floating-point numbers
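The sketch below shows one way these options might be set globally on the root object and then extended for a single test. The tab name and the expected outputs are made up for the illustration.

```yaml
# Global configuration: applies to every tab, context and test case.
config:
  stdout:
    ignoreWhitespace: true
tabs:
  - tab: "Greetings"
    testcases:
      - stdin: "Alice"
        stdout: "Hello Alice"
      - stdin: "Bob"
        stdout:
          data: "HELLO BOB"
          # Extra option for this test only.
          config:
            caseInsensitive: true
```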
Expressions and statements
In the test suite, expressions and statements are written as YAML strings, using the Python syntax. For example, a function call with one argument "hello":
expression: "a_function_name('hello')"
Since the Python syntax does not have a separate syntax for all features supported by TESTed, there are some conventions:
- Function calls whose name begins with a capital are considered constructors, e.g. Constructor(56).
- Identifiers that are in all caps are considered global constants, e.g. VERY_LONG_NAME.
- Casts are written as ordinary Python calls. For example, to cast a number to int64: int64(56).
Additionally, most of the Python syntax is not supported, since TESTed only has support for limited expressions and statements. The following is supported:
- Simple values, such as 5, -9.3 or "Hello world".
- Complex values, such as [5, 6, 7], {5, "Hello"} or {"key": "value"}.
- Function calls, including named arguments, such as the_function(5, named=6). Do note that named arguments are converted to positional arguments in programming languages that do not support named arguments.
- Constructors (using our convention).
- Assignments, such as some_variable = 5.
- Referencing variables, such as the_function(some_variable).
Notably absent are all function and class definitions and all operators.
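A sketch that combines several of the supported constructs in one context. Counter and summarise are hypothetical names chosen to show the constructor and function call conventions.

```yaml
- tab: "Expressions"
  contexts:
    - testcases:
        # Assignment of a complex value to a variable.
        - statement: 'numbers = [5, 6, 7]'
        # Constructor call (the name starts with a capital).
        - statement: 'counter = Counter(numbers)'
        # Function call with a variable, a cast and a named argument.
        - expression: 'summarise(counter, int64(56), named=6)'
          return: "ok"
```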
Language-specific expressions and statements
If language-specific expressions or statements are used (either by setting the language globally or by giving an object as the value of the attribute expression or statement), the string will be used literally in the test code.
This has the advantage that all language features of the programming language can be used. On the other hand, this causes exercises to no longer be programming-language-independent, and it will not work for functions with return type void.
Since TESTed cannot analyse these strings, you have to use the correct namespace yourself. This is the name of the submitted solution or class (configurable with the attribute namespace). This name is programming-language-dependent:
- tab: "My tab"
  testcases:
    - expression:
        c: "to_string(1+1)"
        haskell: "Submission.toString (1+1)"
        runhaskell: "Submission.toString (1+1)"
        java: "Submission.toString(1+1)"
        javascript: "submission.toString(1+1)"
        kotlin: "toString(1+1)"
        python: "submission.to_string(1+1)"
        csharp: "Submission.toString(1+1)"
      return: "2"
Supported tags
TESTed supports the following standard YAML types:
- !!set to denote a set.
Finally, all TESTed types can also be used as tags, for example !int64 or !double. Note that custom types use one exclamation mark, while standard types use two.
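As an illustration, tags could be applied to return values as in the sketch below. The function names are hypothetical, and the exact values accepted for each tag are an assumption based on the description above.

```yaml
- tab: "Typed values"
  testcases:
    - expression: 'unique([1, 2, 2, 3])'
      # Standard YAML tag (two exclamation marks): a set.
      return: !!set [1, 2, 3]
    - expression: 'big_number()'
      # TESTed type used as a tag (one exclamation mark).
      return: !int64 56
    - expression: 'half(5)'
      return: !double 2.5
```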
Newlines for textual results
For the result of stdout and stderr, TESTed follows this convention: either the text should be empty, or the text should end with a newline. TESTed will enforce this convention: if the text in the test suite does not end with a newline, TESTed will add a newline.
This is the same convention as in POSIX, and is also applied in many programming languages. For example, print in Python will add a newline by default.
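Under this convention, the three test cases in the sketch below would be expected to check for the same output. The program that greets the name read from stdin is hypothetical.

```yaml
- tab: "Newlines"
  testcases:
    # No trailing newline: TESTed adds one.
    - stdin: "Alice"
      stdout: "Hello Alice"
    # Explicit trailing newline.
    - stdin: "Alice"
      stdout: "Hello Alice\n"
    # A YAML block string, which ends with a newline.
    - stdin: "Alice"
      stdout: |
        Hello Alice
```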
YAML cheat sheet
This section contains a very brief overview of the YAML features used in the DSL.
Objects
Objects in YAML are key-value pairs, where the key (the attribute) and value are separated by a colon:
key: value
Nested objects are created using indentation:
root:
  child0:
    subchild0: "leaf"
    subchild1: "leaf"
  child1:
    subchild0: "leaf"
Lists
Lists in YAML can be written either on one line (using the JSON syntax) or with one value per line. For example, a list on one line:
["Item 0", "Item 1", "Item 2", "Item 3"]
When using one value per line, each value must be prefixed with a dash (-) and space:
- "Item 0"
- "Item 1"
- "Item 2"
- "Item 3"
You can also combine lists and objects:
list:
  - name: "Item 0"
    items: 5
  - name: "Item 1"
  - name: "Item 2"
    items: 3
  - name: "Item 3"
Strings
Ordinary strings in YAML are written using double quotes:
description: "Hello"
However, writing multi-line strings this way is rather ugly:
description: "Hello\nWorld"
YAML supports special syntax for multi-line strings. Writing the same string as the last example, we get:
description: |
  Hello
  World
The reverse is also possible, using so-called "folded strings". With this syntax, YAML will replace the newlines between lines with spaces:
description: >
  Hello
  World
This is equivalent to writing:
description: "Hello World"
Tags
YAML supports tags to give values another type:
!!set [1, 2, 3]
Full example
# A tab on Dodona.
- tab: "Name of the tab"
  contexts:
    # The files used in this context.
    - files:
        - name: "file.txt"
          url: "media/workdir/file.txt"
      testcases:
        # An assignment of the variable data.
        - statement: 'data = ["list\nline", "file.txt"]'
        # Function call that uses the variable.
        - expression: 'function(data, 0.5)'
          # Expected return value of the function.
          return: [ 0, 0 ]
    - testcases:
        # A function call where the value is cast to "uint8".
        - expression: 'echo(uint8(5))'
          # The expected return value is also cast to "uint8".
          return: !expression "uint8(5)"
# A second tab in the same test suite.
- tab: "Exception"
  contexts:
    - testcases:
        # Another function call.
        - statement: 'function_error()'
          # The expected text on stdout.
          stdout: "Invalid"
          # The expected text on stderr.
          stderr: "Error"
          # We expect an error or exception with the message "Unknown".
          exception: "Unknown"
# A third tab.
- tab: "Arguments"
  testcases:
    # This program gets input via stdin.
    - stdin: "Alice"
      # There are also command line arguments.
      arguments: [ "stdin" ]
      # The expected text on stdout.
      stdout: "Hello Alice"
# A fourth tab.
- tab: "Config"
  # We configure everything on the tab level.
  config:
    stdout:
      # First try to compare text on stdout as float.
      tryFloatingPoint: true
      # When comparing floats, round to 2 decimals.
      applyRounding: true
      roundTo: 2
    # On stderr we ignore white space and make it case insensitive.
    stderr:
      ignoreWhitespace: true
      caseInsensitive: true
  contexts:
    - config:
        stdout:
          # We override the tab configuration for this context.
          roundTo: 0
      testcases:
        - statement: 'diff(5, 2)'
          stdout: "2"
        - statement: 'diff(5, 2)'
          stdout:
            data: "2.5"
            # We override the context configuration in this test.
            config:
              roundTo: 4