This guide covers writing VDF implementations and running regression tests for VillageSQL extensions. It is the companion to Creating Extensions, which covers the end-to-end build steps.
If you are contributing to the VillageSQL server itself (not building an extension), see Build from Source, which covers the full server developer workflow including running tests with mysql-test-run.pl directly.
Setting Up Your Environment
To develop and test extensions, you need a built VillageSQL server. Follow the Clone and Build from Source guide to compile the server binaries. Once you have a build, use the villagesql CLI to manage a local dev server instance. Run all commands from the directory where VillageSQL was installed.
Starting a Local Dev Server
Initialize and start a server instance:
Pass --dir <path> before any command to manage multiple independent instances, or use --here to create a server directory in the current working directory:
Managing Extension Files
Before installing an extension via SQL, its .veb file must be present on the server. The CLI manages the server’s lib/veb/ directory:
.veb files placed in lib/veb/ before init are seeded automatically. After adding a file, install the extension via SQL:
Writing Extension Functions
Extension functions are written in C++ and registered with VEF. Include a single header, <villagesql/vsql.h>, to access the full SDK.
Typed Wrappers (Recommended)
Typed wrappers provide a type-safe interface for VDF parameters and results. The framework detects wrapper types in your function signature and adapts automatically; the make_func registration syntax is unchanged.
Input wrappers: IntArg, RealArg, StringArg, CustomArg — each
provides is_null() and value(). For parameterized custom types,
CustomArgWith<P> adds a params() accessor that returns the cached
parsed params struct (see Parameterized Types).
Result wrappers: IntResult, RealResult, StringResult, CustomResult
— each provides set_null(), warning(msg), and error(msg). Scalar
results also provide set(value). Buffer results provide buffer() and
set_length(len). StringResult additionally provides
set(std::string_view), which copies up to buffer().size() bytes from the
view and sets the length in one call. For parameterized custom types,
CustomResultWith<P> adds a params() accessor.
Span type: value() and buffer() on the byte-oriented wrappers
return a vsql::Span<T> — a non-owning view over a contiguous run of T
with data(), size(), empty(), begin()/end(), and operator[].
Under C++20 it is an alias for std::span<T>; under C++17 the SDK provides
a minimal compatible implementation, so the same code compiles in either
standard. It is available through <villagesql/vsql.h>.
warning(msg) returns SQL NULL for the row and appends a SQL warning. In strict mode (STRICT_TRANS_TABLES), MySQL promotes it to a statement error on INSERT/UPDATE, so it behaves like error(msg) in strict contexts. Use it for recoverable bad input, such as an unparseable string in an encode function. Use error(msg) for corrupt stored data or any condition where continuing is unsafe. The message for both is truncated to fit the server’s internal error buffer if necessary.
Scalar example — add two integers:
For buffer results (StringResult and CustomResult), write into buffer(), then call
set_length() with the number of bytes written. buffer().size() is the
maximum capacity.
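The two result patterns above can be sketched without the SDK. The types in namespace `sketch` below are hypothetical stand-ins that only mimic the documented wrapper methods (is_null(), value(), set(), set_null(), buffer(), set_length()); they are not the real types from <villagesql/vsql.h>, and the function names add_ints and copy_upper are invented for illustration. The stand-in results are passed by reference so the caller can inspect them, which may differ from how the SDK passes its wrappers.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-ins that mimic the wrapper interface described above.
// They are NOT the SDK types.
namespace sketch {

struct IntArg {
  std::optional<long long> v;  // empty optional models SQL NULL
  bool is_null() const { return !v.has_value(); }
  long long value() const { return *v; }
};

struct IntResult {
  std::optional<long long> out;  // empty optional models SQL NULL
  void set(long long x) { out = x; }
  void set_null() { out.reset(); }
};

struct StringArg {
  std::optional<std::string> v;
  bool is_null() const { return !v.has_value(); }
  std::string_view value() const { return *v; }
};

// Fixed-capacity buffer result: write into buffer(), then report the length.
struct StringResult {
  std::vector<char> storage;
  std::size_t length = 0;
  bool null = false;
  explicit StringResult(std::size_t capacity) : storage(capacity) {}
  std::vector<char>& buffer() { return storage; }
  void set_length(std::size_t n) { length = n; }
  void set_null() { null = true; }
  std::string str() const { return std::string(storage.data(), length); }
};

}  // namespace sketch

// Scalar pattern: propagate NULL, otherwise set the value.
void add_ints(sketch::IntArg a, sketch::IntArg b, sketch::IntResult& out) {
  if (a.is_null() || b.is_null()) {
    out.set_null();
    return;
  }
  out.set(a.value() + b.value());
}

// Buffer pattern: write at most buffer().size() bytes, then call set_length().
void copy_upper(sketch::StringArg in, sketch::StringResult& out) {
  if (in.is_null()) {
    out.set_null();
    return;
  }
  std::string_view src = in.value();
  std::size_t cap = out.buffer().size();
  std::size_t n = src.size() < cap ? src.size() : cap;
  for (std::size_t i = 0; i < n; ++i) {
    char c = src[i];
    out.buffer()[i] = (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
  }
  out.set_length(n);
}
```

In this sketch, input longer than the buffer capacity is truncated silently; a real VDF might prefer to report a warning or error in that case.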
For VDFs that return a custom type (returns(CUSTOM(MYTYPE))), the server
sizes the result buffer to the resolved return type’s persisted_length
automatically — extension authors do not need to declare .buffer_size(...)
on the function builder for this case. If prerun grows the buffer further,
that larger size is preserved. This is what lets, for example,
SVECTOR::from_string('[…1024 floats…]') encode a wide vector without the
result wrapper running out of room.
You can use different styles across functions in the same extension — each
function’s style is determined by its own signature.
Aggregate VDFs
Aggregate VDFs accumulate state across rows within each GROUP BY group and
return a single result per group, like SQL SUM or COUNT. Use
make_aggregate_func<State, &result_fn>("name") to register one. The State
type is the per-group accumulation buffer; prerun and postrun are
auto-generated to allocate and delete it.
The result function must have the signature void(const State&, ResultWrapper)
where ResultWrapper is one of IntResult, RealResult, StringResult,
CustomResult, or CustomResultWith<P>. Call out.set(value) to return a
value or out.set_null() to return SQL NULL.
Both .clear<>() and .accumulate<>() are required. The builder enforces
this at compile time (via build()), and the server validates it again at
INSTALL EXTENSION time — clear resets state, accumulate folds rows, and
the result function reads the final state.
- make_aggregate_func<State, &result_fn>() auto-generates prerun and postrun (value-initializes and deletes State).
- .clear<&fn>() wraps void(State&) → vef_vdf_clear_func_t.
- .accumulate<&fn>() wraps void(State&, TypedArgs...) → vef_vdf_accumulate_func_t. TypedArgs are deduced from the function signature (IntArg, StringArg, etc.).
- The ResultWrapper type (IntResult, RealResult, etc.) is deduced from the result function signature.
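The clear, accumulate, and result roles can be illustrated with a SUM-like aggregate. This is a sketch under assumptions: the `sketch` wrappers are hypothetical stand-ins rather than SDK types, the names SumState, sum_clear, sum_accumulate, sum_result, and run_group are invented, and run_group only imitates the order in which a server would plausibly call the three functions for one group. The stand-in result is taken by reference so its value can be read back.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-ins for the SDK wrappers; not the real types.
namespace sketch {
struct IntArg {
  std::optional<long long> v;  // empty optional models SQL NULL
  bool is_null() const { return !v.has_value(); }
  long long value() const { return *v; }
};
struct IntResult {
  std::optional<long long> out;
  void set(long long x) { out = x; }
  void set_null() { out.reset(); }
};
}  // namespace sketch

// Per-group accumulation state.
struct SumState {
  long long total = 0;
  bool saw_value = false;
};

// clear: reset the state at the start of each group.
void sum_clear(SumState& s) { s = SumState{}; }

// accumulate: fold one row into the state, skipping SQL NULL inputs.
void sum_accumulate(SumState& s, sketch::IntArg x) {
  if (x.is_null()) return;
  s.total += x.value();
  s.saw_value = true;
}

// result: read the final state; a group with no non-NULL rows yields NULL.
void sum_result(const SumState& s, sketch::IntResult& out) {
  if (!s.saw_value) {
    out.set_null();
    return;
  }
  out.set(s.total);
}

// Imitates the per-group call order: clear, accumulate per row, then result.
std::optional<long long> run_group(SumState& state,
                                   const std::vector<sketch::IntArg>& rows) {
  sum_clear(state);
  for (const sketch::IntArg& row : rows) sum_accumulate(state, row);
  sketch::IntResult out;
  sum_result(state, out);
  return out.out;
}
```

Reusing one SumState across calls to run_group shows why clear matters: without it, totals from the previous group would leak into the next.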
Per-Statement State (Prerun and Postrun)
Some VDFs need state that spans every row a single query touches, such as a call counter, a cached result, or an open resource. Allocate it in a prerun hook, access it from the VDF body, and release it in a postrun hook. Both hooks run once per statement; the VDF body runs once per row. Register them with .prerun<&Hook>() and .postrun<&Hook>(). The required
signatures are:
| Hook | Required signature |
|---|---|
| Prerun | void(vsql::PrerunArgs, vsql::PrerunResult) |
| Postrun | void(vsql::PostrunArgs) |
Use PrerunResult::set_user_data(void*) to stash state and PostrunArgs::delete_state<T>() to release it. If prerun calls set_user_data(new T{}), postrun must call delete_state<T>(); the SDK does not free the state automatically.
PrerunArgs::type_at(i) exposes the declared SQL type of each argument before any rows are read; the predicates is_int(), is_real(), is_str(), is_custom() on the returned PrerunArgType mirror the column types. Use this in prerun to validate argument types or call PrerunResult::request_buffer_size(n) to size the result buffer.
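The allocate, use, and release lifecycle can be sketched in isolation. Everything in namespace `sketch` is a hypothetical stand-in: StatementSlot models wherever the server keeps the user-data pointer between hooks, and the stand-in PrerunArgs, PrerunResult, and PostrunArgs mimic only the methods named above. How a real VDF body reaches the stored pointer is not covered in this section, so the sketch simply passes it in. CallCounter, counter_prerun, counter_row, counter_postrun, and run_statement are invented names.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the SDK types.
namespace sketch {

// Models the per-statement slot that outlives individual row calls.
struct StatementSlot {
  void* user_data = nullptr;
};

struct PrerunArgs {};  // argument type inspection omitted in this sketch

struct PrerunResult {
  StatementSlot* slot;
  void set_user_data(void* p) { slot->user_data = p; }
};

struct PostrunArgs {
  StatementSlot* slot;
  // Deletes the stored state as a T and clears the slot.
  template <typename T>
  void delete_state() {
    delete static_cast<T*>(slot->user_data);
    slot->user_data = nullptr;
  }
};

}  // namespace sketch

// Per-statement state: counts how many rows the VDF body has seen.
struct CallCounter {
  long long calls = 0;
};

// Prerun: runs once per statement and allocates the state.
void counter_prerun(sketch::PrerunArgs, sketch::PrerunResult result) {
  result.set_user_data(new CallCounter{});
}

// Body: runs once per row and updates the shared state.
long long counter_row(void* user_data) {
  CallCounter* counter = static_cast<CallCounter*>(user_data);
  return ++counter->calls;
}

// Postrun: runs once per statement and releases what prerun allocated.
void counter_postrun(sketch::PostrunArgs args) {
  args.delete_state<CallCounter>();
}

// Imitates one statement touching `rows` rows; returns the last counter value.
long long run_statement(int rows) {
  sketch::StatementSlot slot;
  counter_prerun(sketch::PrerunArgs{}, sketch::PrerunResult{&slot});
  long long last = 0;
  for (int i = 0; i < rows; ++i) last = counter_row(slot.user_data);
  counter_postrun(sketch::PostrunArgs{&slot});
  return last;
}
```

The pairing is the point of the sketch: every `new` in prerun has exactly one matching delete_state<T>() in postrun, with the same T.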
Varargs VDFs
A varargs VDF accepts any number of arguments of any SQL type. Declare one with .varargs() on the func builder, which is mutually exclusive with
.no_params() and .param(TYPE). The body receives a vsql::VarArgs argument
instead of the usual fixed-arity wrappers.
The framework cannot validate argument count or types for varargs VDFs. Pair
every varargs registration with a prerun hook that calls PrerunResult::error()
on invalid input or PrerunResult::request_buffer_size(n) to size the result
buffer.
Iterate over arguments with range-for. Each AnyArg element requires a type
check before reading its value:
| Predicate | Accessor | Return type |
|---|---|---|
| is_int() | as_int() | long long |
| is_real() | as_real() | double |
| is_str() | as_str() | std::string_view |
| is_custom() | as_custom() | vsql::Span<const unsigned char> |
Check is_null() before calling any accessor; all four are undefined on a null argument.
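The check-then-read discipline looks roughly like the following. This is a sketch, not SDK code: `sketch::AnyArg` is a hypothetical stand-in built on std::variant that mimics the predicates and accessors in the table (is_custom()/as_custom() are left out for brevity), `sketch::VarArgs` is modeled as a plain vector, and sum_numeric is an invented example function.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins; not the SDK types.
namespace sketch {

class AnyArg {
 public:
  // std::monostate models SQL NULL.
  using Value = std::variant<std::monostate, long long, double, std::string>;
  AnyArg(Value v) : v_(std::move(v)) {}

  bool is_null() const { return std::holds_alternative<std::monostate>(v_); }
  bool is_int() const { return std::holds_alternative<long long>(v_); }
  bool is_real() const { return std::holds_alternative<double>(v_); }
  bool is_str() const { return std::holds_alternative<std::string>(v_); }

  long long as_int() const { return std::get<long long>(v_); }
  double as_real() const { return std::get<double>(v_); }
  std::string_view as_str() const { return std::get<std::string>(v_); }

 private:
  Value v_;
};

using VarArgs = std::vector<AnyArg>;

}  // namespace sketch

// Sums all numeric arguments into `total`, skipping NULLs.
// Returns false if any non-NULL argument is not numeric.
bool sum_numeric(const sketch::VarArgs& args, double& total) {
  total = 0.0;
  for (const sketch::AnyArg& arg : args) {
    if (arg.is_null()) continue;  // null check comes before any accessor
    if (arg.is_int()) {
      total += static_cast<double>(arg.as_int());
    } else if (arg.is_real()) {
      total += arg.as_real();
    } else {
      return false;  // strings (and, in the SDK, custom values) are rejected
    }
  }
  return true;
}
```

In a real extension, the type rejection shown here would more naturally live in the prerun hook, so that a bad call fails once per statement rather than once per row.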
VEF_GENERATE_REGISTRATION
VEF_GENERATE_REGISTRATION creates an internal _vef_do_register() helper
that performs extension registration but does not define the extern "C" entry
points. Use it when you need to customize vef_register behavior — for
example, to patch descriptors after registration in a test build. For normal
extensions, use VEF_GENERATE_ENTRY_POINTS instead.
Type Operation Builders
Only needed if your extension defines a custom column type. If you’re writing functions only, skip ahead to Running Regression Tests. Custom types require three operations the engine calls internally: encode (string to binary), decode (binary to string), and compare. Hash is optional. Implement them against these C++ signatures (all available via <villagesql/vsql.h>):
Fixed-Length Types
Define a fixed-length type with vsql::make_type<kTypeName>(). The type name
is passed as a non-type template parameter (NTTP): a static constexpr const char[]
array. The builder auto-generates VDF names in the TYPE::method format
(e.g., "MYTYPE::from_string") from this NTTP, so no manual string matching
is required. Pass the built type object to .type() on the extension builder;
separate .func() calls for type operations are not needed.
build() fails at compile time if from_string, to_string, or compare
is missing. Each template method checks the function pointer signature via
static_assert.
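To make the three required operations concrete, here is a sketch of encode, decode, and compare for an imagined 4-byte unsigned integer type. The function names, the constant kPersistedLength, and above all the signatures are assumptions made for illustration; the SDK's actual function pointer types are not reproduced here. Storing the value big-endian is a design choice of the sketch: it makes byte-wise comparison agree with numeric order.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <system_error>

// Assumed fixed storage size for the imagined type.
constexpr std::size_t kPersistedLength = 4;

// Encode (string to binary): parses a decimal string and writes 4 big-endian
// bytes. Returns false on unparseable or out-of-range input, including "".
bool u32_from_string(std::string_view text, unsigned char* out) {
  std::uint32_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc{} || ptr != last) return false;
  out[0] = static_cast<unsigned char>(value >> 24);
  out[1] = static_cast<unsigned char>(value >> 16);
  out[2] = static_cast<unsigned char>(value >> 8);
  out[3] = static_cast<unsigned char>(value);
  return true;
}

// Decode (binary to string): rebuilds the integer and formats it as decimal.
std::string u32_to_string(const unsigned char* in) {
  std::uint32_t value = (static_cast<std::uint32_t>(in[0]) << 24) |
                        (static_cast<std::uint32_t>(in[1]) << 16) |
                        (static_cast<std::uint32_t>(in[2]) << 8) |
                        static_cast<std::uint32_t>(in[3]);
  return std::to_string(value);
}

// Compare: returns -1, 0, or 1. Byte order equals numeric order because the
// encoding is big-endian.
int u32_compare(const unsigned char* a, const unsigned char* b) {
  int c = std::memcmp(a, b, kPersistedLength);
  return (c > 0) - (c < 0);
}
```

Note that this encoder rejects the empty string, which is exactly the situation where an explicit intrinsic default is needed.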
Intrinsic Default
When a NOT NULL custom-type column receives NULL under IGNORE mode
(e.g., INSERT IGNORE or UPDATE IGNORE), the server calls the intrinsic
default to produce a fallback value rather than raising an error. The
intrinsic default provides a string representation; the server converts it
to binary using the type’s from_string function.
If you omit both .intrinsic_default_str() and .intrinsic_default_vdf(),
the server calls from_string("") as a fallback. This happens when the type
is first used (at table creation), not at INSTALL EXTENSION. If your
encode function rejects empty string, or encodes it to the wrong number of
bytes, type initialization fails with an error visible in the SQL client.
For fixed-length types, the default string must encode to exactly
persisted_length bytes. Set an explicit default for any type where
empty string is not a valid input.
.intrinsic_default_str()
For a constant default, pass the string directly on the type builder (as
shown in the fixed-length example above with .intrinsic_default_str("0")).
VDF-based: .intrinsic_default_vdf() + make_intrinsic_default
When the default value depends on type parameters, implement a function
against one of these signatures (available via <villagesql/vsql.h>):
The function returns a std::string representation of the default value. On error,
write a message to error_msg and return any value (the SDK checks
error_msg[0] != '\0' to detect errors). Register with
make_intrinsic_default<&fn>("vdf_name") (one argument: the VDF name) and
reference that name on the type builder with .intrinsic_default_vdf().
The parameterized types example below shows the full registration pattern.
Parameterized Types
Variable-length types need the column’s declared parameters at encode, decode, compare, and hash time to determine allocation sizes and layout. Define a params struct with a parse function and an inverse to_strings
function, register both on the type builder with
.params<P, &ParseFunc, &ToStringsFunc>(), and use const P& as the first
argument of your type operation functions. The SDK caches the parse result
per unique parameter combination, so the parse function runs at most once
per type instantiation. The to_strings function is the inverse of parse:
it writes a typed P back into the canonical key/value string form so the
server can publish inferred params in the same shape parse consumes.
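The parse and to_strings pairing can be sketched as a round trip. The to_strings shape below follows the ParamsToStringsFunc<P> form quoted in this guide, void fn(const P&, std::map<std::string,std::string>&). Everything else is an assumption for illustration: the VecParams struct, the "dim" key, the function names, and in particular the parse signature, which this section does not spell out.

```cpp
#include <cassert>
#include <charconv>
#include <map>
#include <string>
#include <system_error>

// Hypothetical params struct for a vector-like type with a declared dimension.
struct VecParams {
  int dim = 0;
};

// parse (assumed signature): canonical key/value strings -> typed params.
// Returns false if "dim" is missing, not a whole number, or not positive.
bool vec_parse(const std::map<std::string, std::string>& kv, VecParams& out) {
  auto it = kv.find("dim");
  if (it == kv.end()) return false;
  const std::string& text = it->second;
  const char* first = text.data();
  const char* last = first + text.size();
  int value = 0;
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc{} || ptr != last || value <= 0) return false;
  out.dim = value;
  return true;
}

// to_strings: typed params -> the same canonical key/value form parse reads.
void vec_to_strings(const VecParams& p, std::map<std::string, std::string>& kv) {
  kv["dim"] = std::to_string(p.dim);
}
```

A useful property to test in a real extension is the one exercised here: parsing a canonical map and writing it back should reproduce the same map.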
Register the params struct with .params<>() on the type builder. Use .int_to_params<&mytype_int_to_params_fn>()
to handle MYTYPE(N) integer syntax and .resolve_params<&mytype_resolve_params_fn>() to
validate parameters and compute storage sizes. Call .max_persisted_length(N) with an
upper bound on the persisted byte size across all valid parameterizations; the server
uses this only on the type parameter inference path, where it has not yet inferred
the params and so cannot consult resolve_params to size the encode buffer.
For a VDF-based intrinsic default, use .intrinsic_default_vdf() with the VDF name and
register the VDF separately via make_intrinsic_default<&mytype_default>().
The function pointer types TypeEncodeWithParamsFunc<P>,
TypeDecodeWithParamsFunc<P>, TypeCompareWithParamsFunc<P>, and
TypeHashWithParamsFunc<P>, together with ParamsToStringsFunc<P>
(void fn(const P&, std::map<std::string,std::string>&)), are available via
<villagesql/vsql.h>.
The vsql::make_type template methods detect the params argument and route
through the params cache automatically. Encode functions take
vsql::MaybeParams<P> & as the first argument; is_known() is always true
at runtime, and value() returns const P&. Decode, compare, and hash
variants take vsql::CustomArgWith<P>, whose params() accessor returns
const P&.
Custom Types in Stored Procedures
Custom extension types can be used as stored procedure parameter types and in DECLARE variable declarations. The server resolves the custom type at
routine execution time using the installed extension’s type metadata.
Extension System Variables
Extension system variables are a preview capability. See Preview Capabilities for the full API reference, factory functions, SQL access, and a complete example.
Extension Status Variables
Extension status variables are a preview capability. See Preview Capabilities for the full API reference, factory functions, SQL access, and a complete example.
Keyring Access
Keyring access is a preview capability. See Preview Capabilities for the full API reference, result codes, and a complete example.
Column Storage
Column storage is a preview capability. See Preview Capabilities for the full API reference and a complete example.
Inspecting Extension Registration Metadata
INFORMATION_SCHEMA.EXTENSION_REGISTRATION exposes the in-memory VEF
registration struct for each loaded extension as a JSON document. Use it to
verify that the server parsed your extension’s functions, types, and system
variables correctly after INSTALL EXTENSION.
| Column | Type | Description |
|---|---|---|
| EXTENSION_NAME | VARCHAR(64) | Name of the installed extension. |
| NEGOTIATED_PROTOCOL | BIGINT UNSIGNED | VEF protocol version negotiated between the extension and the server. |
| REGISTRATION_JSON | TEXT | JSON serialization of the vef_registration_t struct, including funcs and types arrays. |
Running Regression Tests
Run extension regression tests using the MySQL Test Runner from your VillageSQL build directory.
Running the Full Suite
To run all tests for your extension:
Running Individual Tests
To run a single test case, specify the suite path and test name:
Creating New Tests
When adding new features or fixing bugs, you should add corresponding regression tests.
Test Location
Extension tests live in the extension’s own repository under a test/ directory, not in the VillageSQL server’s mysql-test/suite/ tree.
- Test files end with .test and go in test/t/.
- Expected result files end with .result and go in test/r/.
For example, in an extension named my_extension:
- test/t/my_new_test.test
- test/r/my_new_test.result
Test File Conventions
A typical extension test installs the extension, runs SQL, and uninstalls:
When test output contains filesystem paths, add a replacement directive to the .test file to normalize them; without it, recorded results contain absolute paths that break on other machines:
Steps to Add a Test
- Create the .test file in your extension’s test/t/ directory.
- Create an empty .result file in your extension’s test/r/ directory.
- Run the test with --record to generate the expected output:
- Verify the output in the generated .result file to ensure it matches your expectations.
Debugging Tests
If a test fails, the test framework provides detailed logs.
- Test output: Check mysql-test/var/log/mysqltest.log (combined) or mysql-test/var/log/<test_name>/ (per-test directory).
- Server error log: Check mysql-test/var/log/mysqld.1.err. VillageSQL-specific log messages (emitted via LogVSQL()) only appear when the server runs with --log-error-verbosity=3.
- Diff: The framework outputs a diff between the actual output and the expected .result file.
See Also
- Creating Extensions — end-to-end build steps, CMake setup, and installation
- Extension API Reference — VDF contracts, null handling, and buffer sizing
- Extension Architecture — lifecycle, Victionary caching, performance patterns, and security model

