Creating Extensions

Overview

VillageSQL’s extension framework allows you to add custom functionality to the database server. Build custom extensions using the VEF SDK and extension template.

What is a VillageSQL Extension?

A VillageSQL extension is packaged as a VEB file (VillageSQL Extension Bundle) containing:

Manifest - Metadata about the extension (name, version, description)
Shared library - Compiled C++ code implementing the functionality
Optional metadata - Additional resources or configuration

Extensions are built using the VEF SDK (VillageSQL Extension Framework), which provides:

C++ API for defining types and functions
Automatic registration without SQL scripts
Type-safe function wrappers
Builder pattern for extension definition

VDFs vs Traditional UDFs: Functions registered through the VEF SDK are called VDFs (VillageSQL Defined Functions). VillageSQL also supports traditional MySQL UDFs registered via CREATE FUNCTION ... SONAME, but the VEF SDK approach is recommended for new extensions.

Calling VDFs in SQL

VDFs can be called with or without the extension prefix:

-- Unqualified (preferred for cleaner code)
SELECT complex_abs(impedance) FROM signals;

-- Qualified with extension name (explicit)
SELECT vsql_complex.complex_abs(impedance) FROM signals;

Function Resolution Order: When you call a function without qualification, VillageSQL resolves it in this order:

System functions (built-in MySQL functions like NOW(), CONCAT())
UDFs (traditional MySQL user-defined functions)
VDFs (extension functions) - only if there’s exactly one function with that name
Stored functions (created with CREATE FUNCTION)

When to Use Qualified Names:

Use extension.function_name when multiple extensions provide functions with the same name
Use unqualified names for cleaner code when there’s no ambiguity
Qualification is never required if only one extension provides that function name

Extensions can add:

Custom functions (VDFs) - SQL functions with automatic type checking and validation
Custom data types - New column types like COMPLEX, UUID, or VECTOR that work with ORDER BY and indexes
Type operations - Encode, decode, compare, and hash functions for custom types

Prerequisites

Before you begin, build VillageSQL from source — extensions link against the server’s SDK headers and build tree. Follow the Clone and Build from Source guide first. You also need:

Git - For cloning and version control
CMake 3.16 or higher - Build system
C++ Compiler - GCC 8+, Clang 8+, or MSVC 2019+ with C++11 support
Basic C++ knowledge - Understanding of C++ and function pointers

Step 1: Get the Extension Template

You can start with the extension template in two ways:

Option A: Use Template from VillageSQL Source

If you have VillageSQL source code, the template is included:

cd /path/to/villagesql-source
cp -r villagesql/sdk/template my-extension
cd my-extension

Option B: Fork from GitHub

Start by forking the VillageSQL extension template repository:

Visit the template repository on GitHub:

https://github.com/villagesql/vsql-extension-template

Click the “Fork” button to create your own copy

Clone your fork locally:

git clone https://github.com/YOUR_USERNAME/vsql-extension-template.git
cd vsql-extension-template

Alternatively, use the “Use this template” button on GitHub to create a new repository based on the template without forking history.

Step 2: Update the Manifest

Edit manifest.json to define your extension’s metadata:

{
  "$schema": "https://raw.githubusercontent.com/villagesql/villagesql-docs/main/mysql-8.4/0.0.3/manifest.json.schema.json",
  "name": "my_awesome_extension",
  "version": "1.0.0",
  "description": "My custom VillageSQL extension that does amazing things",
  "author": "Your Name",
  "license": "GPL-2.0"
}

The $schema field is optional but enables IDE autocomplete and inline validation for all manifest fields.

manifest.json Schema

Field	Required	Format	Description
`name`	✅ Yes	letters, digits, `_`, `-`	Unique identifier. Must match `INSTALL EXTENSION` name.
`version`	✅ Yes	MAJOR.MINOR.PATCH	Semantic version string
`description`	No	String	Brief explanation of functionality
`author`	No	String	Author name or organization
`license`	No	String	License identifier (GPL-2.0 recommended)

Validation rules:

name: Must start with a letter and end with a letter or digit. May contain lowercase letters, digits, underscores, and hyphens. Max 64 characters. Use underscores — hyphens require backtick quoting in SQL.
version: Must follow semantic versioning (e.g., 1.0.0, 0.2.1)
Invalid manifest will cause INSTALL EXTENSION to fail

Example:

-- manifest.json has "name": "my_awesome_extension"
INSTALL EXTENSION my_awesome_extension;  -- ✅ Correct: no quoting needed
INSTALL EXTENSION `my-awesome-extension`;  -- ⚠️ Works, but requires backtick quoting

For the full naming convention across SQL, filenames, and repository names, see Extension Naming Conventions.

Step 3: Implement Your Extension with VEF SDK

The VEF SDK provides a C++ API for defining extensions using a fluent builder pattern:

Type-safe function definitions with compile-time checking
Automatic argument validation and type conversion
Support for custom types with compare/hash functions (enables ORDER BY and indexes)

Include VillageSQL Headers

Create your main extension file (e.g., src/extension.cc) and include the VEF SDK:

#include <villagesql/extension.h>
#include <villagesql/func_builder.h>
#include <villagesql/type_builder.h>

// Your implementation code here

Define Your Extension

Use the VEF_GENERATE_ENTRY_POINTS() macro to define your extension:

VEF_GENERATE_ENTRY_POINTS(
  make_extension("my_extension", "1.0.0")  // name, version
    .func(make_func<&my_reverse_impl>("my_reverse")
      .returns(STRING)
      .param(STRING)
      .build())
    .func(make_func<&count_vowels_impl>("count_vowels")
      .returns(INT)
      .param(STRING)
      .build())
);

Function Builder Methods:

make_func<&impl>("name") - Create function with implementation pointer
.returns(type) - Set return type (STRING, INT, REAL, or custom type name)
.param(type) - Add parameter (maximum 8 parameters)
.buffer_size(size_t) - Request specific output buffer size for STRING/CUSTOM returns
.deterministic(bool = true) - Declare that this function always returns the same output for the same inputs and has no side effects. Default is non-deterministic.
.prerun<func>() - Set per-statement setup function (optional)
.postrun<func>() - Set per-statement cleanup function (optional)
.build() - Finalize function registration

Parameter Limit: Functions support a maximum of 8 parameters (defined by kMaxParams). If you need more, consider using structured types or multiple functions.

VDFs with Custom Type Arguments and Return Values

A VDF can take and return custom type values using .param(TYPE_NAME) and .returns(TYPE_NAME) in the builder. The implementation uses CustomArg for input and CustomResult for output — the same wrappers used for type operations:

void complex_conjugate_impl(CustomArg in, CustomResult out) {
    if (in.is_null()) { out.set_null(); return; }
    auto src = in.value();    // Span<const unsigned char> — raw binary
    auto dst = out.buffer();  // Span<unsigned char>
    // ... read src, write result to dst ...
    out.set_length(src.size());
}

.func(make_func<&complex_conjugate_impl>("complex_conjugate")
          .returns(COMPLEX)
          .param(COMPLEX)
          .build())

See the development guide for the full CustomArg/CustomResult API, including CustomArgWith<P> and CustomResultWith<P> for parameterized types.

Deterministic Functions

By default, VDFs are registered as non-deterministic. A non-deterministic function is blocked from three SQL contexts: generated columns, CHECK constraints, and expression default values (DEFAULT (expr) on a column) — using any of these features with a non-deterministic VDF returns an error. If your function always produces the same output for the same inputs and has no side effects, you can declare it deterministic by adding .deterministic() to the builder chain. The optimizer may use this information to evaluate the function once per statement and reuse the value across rows, rather than calling it per row. Incorrectly marking a non-deterministic function as deterministic can therefore cause the server to return the same result for inputs that should produce different outputs. Only add .deterministic() when your function truly has no dependency on external state, randomness, or time. See the MySQL UDF Characteristics documentation for more details on why and when determinism is required. Builder signature: .deterministic(bool d = true) — the zero-argument form defaults to true. Example:

.func(make_func<&complex_add_impl>("complex_add")
          .returns(COMPLEX)
          .param(COMPLEX)
          .param(COMPLEX)
          .deterministic()
          .build())

Because complex_add is declared deterministic, it can be used in a generated column definition:

-- Deterministic VDFs can be used in generated columns
CREATE TABLE t (
    a COMPLEX,
    b COMPLEX,
    result COMPLEX GENERATED ALWAYS AS (complex_add(a, b)) STORED
);
-- STORED vs VIRTUAL follows standard MySQL generated column rules

Custom Buffer Sizes

For functions returning variable-length data, request a specific buffer size:

make_func<&large_result_impl>("large_result")
  .returns(STRING)
  .param(INT)
  .buffer_size(65536)  // Request 64KB buffer
  .build()

Check available buffer space before writing:

void large_result_impl(vef_context_t* ctx, vef_invalue_t* input,
                       vef_vdf_result_t* result) {
    size_t needed = calculate_output_size(input);

    if (needed > result->max_str_len) {
        result->type = VEF_RESULT_ERROR;
        strcpy(result->error_msg, "Output exceeds buffer size");
        return;
    }

    // Write output to result->str_buf
    result->type = VEF_RESULT_VALUE;
    result->actual_len = actual_output_length;
}

Request sufficient buffer size via .buffer_size() based on your function’s maximum output size.

Critical API Contracts

These contracts govern how VDF implementation functions interact with the VEF runtime. Every function registered via make_func<> must follow them. The types referenced below are defined in #include <villagesql/extension.h>.

Part A: VDF Function Contracts

1. VDF implementation functions are void — they never return a value. The server calls your function through this typedef:

typedef void (*vef_vdf_func_t)(vef_context_t *ctx, vef_vdf_args_t *args,
                                vef_vdf_result_t *result);

Extension authors write the per-argument form that the SDK wrapper bridges to this ABI:

void my_func_impl(vef_context_t* ctx, vef_invalue_t* input,
                  vef_vdf_result_t* result) {
    // ... compute result ...
    return;  // always void -- no return value
}

Communicate success, NULL, or error exclusively through result->type. 2. Set result->type to exactly one of the four result constants. Four constants exist in vef_return_value_type_t:

Constant	Value	Meaning
`VEF_RESULT_VALUE`	0	Success — the output is in the appropriate union field
`VEF_RESULT_NULL`	1	The result is SQL NULL
`VEF_RESULT_WARNING`	2	Row-level warning — execution continues, NULL is returned for this row, and a SQL warning is added. In strict mode, MySQL promotes this to an error on INSERT/UPDATE.
`VEF_RESULT_ERROR`	3	Fatal error — statement execution is aborted; message is in `result->error_msg`

There are no type-specific variants. VEF_RESULT_VALUE is the single success constant for strings, integers, reals, and custom types alike. The type of output is determined by which union field you write to (str_buf, int_value, real_value, bin_buf). 3. Check input->is_null before reading any other field on an input argument. If is_null is true, every other field on that vef_invalue_t is undefined. Accessing str_value, int_value, or any other field without checking is_null first is undefined behavior.

void my_func_impl(vef_context_t* ctx, vef_invalue_t* input,
                  vef_vdf_result_t* result) {
    if (input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }
    // Safe to read input->str_value, input->int_value, etc.
}

4. For string results, write to result->str_buf and set result->actual_len. Check result->max_str_len before writing.

result->str_buf is a server-managed buffer. Write your output here.
result->actual_len must be set to the number of bytes written. There is no field named str_len on the result struct.
result->max_str_len is the size of str_buf in bytes. Always check it before writing. Do not use sizeof(result->str_buf).

void upper_impl(vef_context_t* ctx, vef_invalue_t* input,
                vef_vdf_result_t* result) {
    if (input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }

    size_t len = input->str_len;
    if (len > result->max_str_len) {
        result->type = VEF_RESULT_ERROR;
        snprintf(result->error_msg, VEF_MAX_ERROR_LEN,
                 "Input length %zu exceeds buffer size %zu",
                 len, result->max_str_len);
        return;
    }

    for (size_t i = 0; i < len; i++) {
        result->str_buf[i] = toupper(input->str_value[i]);
    }
    result->type = VEF_RESULT_VALUE;
    result->actual_len = len;
}

5. Write error messages to result->error_msg, not to result->str_buf. Use VEF_MAX_ERROR_LEN (512 bytes) as the snprintf size limit. error_msg is a separate, caller-provided buffer dedicated to error text. It is independent of str_buf. The maximum size including the null terminator is VEF_MAX_ERROR_LEN (512).

// Correct -- error goes to error_msg
result->type = VEF_RESULT_ERROR;
snprintf(result->error_msg, VEF_MAX_ERROR_LEN,
         "Invalid input: expected positive integer, got %lld",
         input->int_value);
return;

Part B: Encode/Decode Return Convention

The bool return convention applies only to encode and decode function pointers used by custom types. It does not apply to VDF-name type operations (which use the standard VDF void convention) or to VDF implementation functions.

typedef bool (*vef_encode_func_t)(unsigned char *buffer, size_t buffer_size,
                                  const char *from, size_t from_len,
                                  size_t *length);

typedef bool (*vef_decode_func_t)(const unsigned char *buffer,
                                  size_t buffer_size, char *to, size_t to_size,
                                  size_t *to_length);

Both functions return bool where:

false = success (the operation completed without error)
true = error (the conversion failed)

The encode/decode return convention is inverted from what most developers expect. false means success, true means error. This follows the general convention of MySQL internal C/C++ APIs, which frequently return 0 (false) for success. Double-check every return statement in your encode and decode implementations.

Implement Wrapper Functions

Implementation functions use the VEF API signature:

#include <algorithm>
#include <string>
#include <cstring>

// String reverse implementation
void my_reverse_impl(vef_context_t* ctx, vef_invalue_t* input,
                     vef_vdf_result_t* result) {
    if (input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }

    // Reverse the string
    size_t len = input->str_len;
    for (size_t i = 0; i < len; i++) {
        result->str_buf[i] = input->str_value[len - 1 - i];
    }
    result->type = VEF_RESULT_VALUE;
    result->actual_len = len;
}

// Count vowels implementation
void count_vowels_impl(vef_context_t* ctx, vef_invalue_t* input,
                       vef_vdf_result_t* result) {
    if (input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }

    int64_t count = 0;
    for (size_t i = 0; i < input->str_len; i++) {
        char lower = std::tolower(input->str_value[i]);
        if (lower == 'a' || lower == 'e' || lower == 'i' ||
            lower == 'o' || lower == 'u') {
            count++;
        }
    }
    result->int_value = count;
    result->type = VEF_RESULT_VALUE;
}

Handling NULL Values

Check for NULL via the is_null flag and return NULL by setting the result type:

void my_func_impl(vef_context_t* ctx, vef_invalue_t* input,
                  vef_vdf_result_t* result) {
    // Check if input is NULL
    if (input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }

    // Access input data
    const char* str = input->str_value;
    size_t len = input->str_len;

    // Process and write result to result->str_buf
    // ...

    result->type = VEF_RESULT_VALUE;
    result->actual_len = output_length;
}

NULL handling options:

Input NULL check: if (input->is_null)
Return NULL: result->type = VEF_RESULT_NULL
Return value: result->type = VEF_RESULT_VALUE + write to type-specific buffer
Return warning: result->type = VEF_RESULT_WARNING + warning message in result->error_msg (returns NULL for this row, adds SQL warning, continues execution; in strict mode, MySQL promotes this to an error on INSERT/UPDATE)
Return error: result->type = VEF_RESULT_ERROR + error message in result->error_msg

Error Handling

Return errors with custom messages for validation failures or invalid input:

void validate_age_impl(vef_context_t* ctx, vef_invalue_t* age_input,
                       vef_vdf_result_t* result) {
    if (age_input->is_null) {
        result->type = VEF_RESULT_NULL;
        return;
    }

    long long age = age_input->int_value;

    // Validate input range
    if (age < 0 || age > 150) {
        result->type = VEF_RESULT_ERROR;
        // Write error message
        const char* msg = "Age must be between 0 and 150";
        strcpy(result->error_msg, msg);
        return;
    }

    // Return valid result
    result->type = VEF_RESULT_VALUE;
    result->int_value = age;
}

Result types:

VEF_RESULT_VALUE - Success
VEF_RESULT_NULL - NULL value
VEF_RESULT_WARNING - Row-level warning (returns NULL, adds SQL warning, continues execution; strict mode promotes to error on INSERT/UPDATE)
VEF_RESULT_ERROR - Fatal error (aborts statement execution)

Step 4: Creating Custom Types

Custom types allow you to define new column types that work seamlessly with SQL operations like ORDER BY, GROUP BY, and indexes.

Define Type Operations

Every custom type needs encode, decode, and compare operations, and optionally a hash operation. There are two ways to provide these: VDF names (VEF_PROTOCOL_2 or later, recommended) or function pointers. For each operation, provide one or the other — not both. VDF names — reference VDFs from the same extension by name (see VDF-Name Type Operations):

// Encode: string -> binary. false = success, true = error.
// Set *length = SIZE_MAX to produce SQL NULL.
bool mytype_from_string(std::string_view from,
                        villagesql::Span<unsigned char> buf,
                        size_t *length) { /* ... */ }

// Decode: binary -> string. false = success, true = error.
bool mytype_to_string(villagesql::Span<const unsigned char> data,
                      villagesql::Span<char> out,
                      size_t *out_len) { /* ... */ }

// Compare: returns <0, 0, or >0.
int mytype_compare(villagesql::Span<const unsigned char> a,
                   villagesql::Span<const unsigned char> b) { /* ... */ }

// Hash: returns hash code (optional).
size_t mytype_hash(villagesql::Span<const unsigned char> data) { /* ... */ }

Function pointers (false = success, true = error — see Part B):

// Encode: Convert string representation to binary
bool encode_mytype(unsigned char* buffer, size_t buffer_size,
                   const char* from, size_t from_len, size_t* length) {
    // Parse 'from' string and write binary to 'buffer'
    // Write binary to 'buffer' and set *length to bytes written
    // Return false on success, true on error (e.g., set *length = 0)
}

// Decode: Convert binary to string representation
bool decode_mytype(const unsigned char* buffer, size_t buffer_size,
                   char* to, size_t to_size, size_t* to_length) {
    // Read binary from 'buffer' and write string to 'to'
    // Set *to_length to string length
    // Return false on success, true on error
}

// Compare: enables ORDER BY and indexing (required)
int compare_mytype(const unsigned char* data1, size_t len1,
                   const unsigned char* data2, size_t len2) {
    // Return: negative if data1<data2, 0 if equal, positive if data1>data2
}

// Hash: custom hash (optional, uses default binary hash if omitted)
size_t hash_mytype(const unsigned char* data, size_t len) {
    // Return hash value for the binary data
}

Register the Type

Using VDF names (VEF_PROTOCOL_2 required; see VDF-Name Type Operations):

VEF_GENERATE_ENTRY_POINTS(
  make_extension("my_extension", "1.0.0")
    .type(make_type(MYTYPE)
      .persisted_length(16)
      .max_decode_buffer_length(64)
      .encode("mytype_from_string")      // VDF name instead of &func
      .decode("mytype_to_string")
      .compare("mytype_compare")         // Enables ORDER BY and indexes
      .hash("mytype_hash")              // Optional
      .build())
    .func(make_type_encode<&mytype_from_string>("mytype_from_string", MYTYPE))
    .func(make_type_decode<&mytype_to_string>("mytype_to_string", MYTYPE))
    .func(make_type_compare<&mytype_compare>("mytype_compare", MYTYPE))
    .func(make_type_hash<&mytype_hash>("mytype_hash", MYTYPE))
);

The named VDFs must also be registered with .func() in the same extension. Using function pointers:

VEF_GENERATE_ENTRY_POINTS(
  make_extension("my_extension", "1.0.0")
    .type(make_type(MYTYPE)
      .persisted_length(16)              // Fixed storage size in bytes
      .max_decode_buffer_length(64)      // Max string representation size
      .encode(&encode_mytype)
      .decode(&decode_mytype)
      .compare(&compare_mytype)          // Enables ORDER BY and indexes
      .hash(&hash_mytype)                // Optional custom hash
      .build())
    .func(make_func<&mytype_constructor>("MYTYPE")  // Constructor function
      .returns(MYTYPE)
      .param(REAL)
      .param(REAL)
      .build())
);

VDF-Name Type Operations

Each VDF named via .encode(), .decode(), .compare(), or .hash() must be registered with the matching make_type_* builder in the same extension:

Operation	Builder	VDF SQL Signature
`encode`	`make_type_encode<&f>("name", TYPE)`	`(STRING) -> CUSTOM(this type)`
`decode`	`make_type_decode<&f>("name", TYPE)`	`(CUSTOM(this type)) -> STRING`
`compare`	`make_type_compare<&f>("name", TYPE)`	`(CUSTOM(this type), CUSTOM(this type)) -> INT`
`hash`	`make_type_hash<&f>("name", TYPE)`	`(CUSTOM(this type)) -> INT`

VDF-name operations use the bool return convention (false = success, true = error), same as function-pointer encode/decode. See development.mdx — Type Operation Builders for the full C++ signatures.

ALTER TABLE and Custom Types

ALTER TABLE ... MODIFY COLUMN and CHANGE COLUMN enforce these rules when custom types are involved:

From	To	Result
Non-custom	Custom	Error: `Cannot convert column 'col' to custom type 'MYTYPE'`
Custom	String type	Allowed
Custom	Non-string type	Error: `Cannot convert custom type column 'col' to non-string type`
Custom	Different custom type	Error if incompatible: `Cannot convert between incompatible custom types 'A' and 'B'`

Type Conversion Functions

Expose your encode/decode functions as SQL functions for explicit type conversion:

VEF_GENERATE_ENTRY_POINTS(
  make_extension("my_extension", "1.0.0")
    .type(make_type(MYTYPE)
      .persisted_length(16)
      .max_decode_buffer_length(64)
      .encode(&encode_mytype)
      .decode(&decode_mytype)
      .compare(&compare_mytype)
      .build())
    // String → Custom Type conversion
    .func(make_func("mytype_from_string")
      .from_string<&encode_mytype>(MYTYPE))
    // Custom Type → String conversion
    .func(make_func("mytype_to_string")
      .to_string<&decode_mytype>(MYTYPE))
    .build()
);

Usage in SQL:

-- Convert string to custom type
SELECT mytype_from_string('(1.0,2.0)');

-- Convert custom type to string
SELECT mytype_to_string(my_column) FROM my_table;

-- Explicit conversion in INSERT
INSERT INTO my_table (id, value)
VALUES (1, mytype_from_string('(3.0,4.0)'));

The .from_string<>() and .to_string<>() builders automatically create SQL-callable conversion functions using your encode/decode implementations.

When explicit conversion is required. VillageSQL implicitly converts a string literal to a custom type on direct column assignment, so INSERT INTO t (val) VALUES ('(1.0,2.0)') works without an explicit call. But expressions that resolve to STRING type — CASE expressions, CONCAT, and similar — are not implicitly coerced. Wrap them with the from_string VDF:

UPDATE my_table
SET val = mytype_from_string(
  CASE (pk MOD 2)
    WHEN 0 THEN '(1.0,2.0)'
    ELSE '(0.0,0.0)'
  END
);

Example: COMPLEX Type

Here’s a complete example implementing a COMPLEX number type:

#include <cstdio>
#include <cstring>

// Encode complex number from "(real,imag)" to 16 bytes
bool encode_complex(std::string_view from,
                    villagesql::Span<unsigned char> buf,
                    size_t *length) {
    if (buf.size() < 16) return true;

    double real, imag;
    if (sscanf(from.data(), "(%lf,%lf)", &real, &imag) != 2) return true;

    memcpy(buf.data(), &real, 8);
    memcpy(buf.data() + 8, &imag, 8);
    *length = 16;
    return false;
}

// Decode 16 bytes to "(real,imag)" string
bool decode_complex(villagesql::Span<const unsigned char> data,
                    villagesql::Span<char> out,
                    size_t *out_len) {
    if (data.size() < 16) return true;

    double real, imag;
    memcpy(&real, data.data(), 8);
    memcpy(&imag, data.data() + 8, 8);

    int len = snprintf(out.data(), out.size(), "(%.6f,%.6f)", real, imag);
    if (len < 0 || len >= (int)out.size()) return true;

    *out_len = len;
    return false;
}

// Compare for ORDER BY: compare real part first, then imaginary
int compare_complex(villagesql::Span<const unsigned char> a,
                    villagesql::Span<const unsigned char> b) {
    if (a.size() < 16 || b.size() < 16) return 0;

    double a_real, a_imag, b_real, b_imag;
    memcpy(&a_real, a.data(), 8);
    memcpy(&a_imag, a.data() + 8, 8);
    memcpy(&b_real, b.data(), 8);
    memcpy(&b_imag, b.data() + 8, 8);

    if (a_real < b_real) return -1;
    if (a_real > b_real) return 1;
    if (a_imag < b_imag) return -1;
    if (a_imag > b_imag) return 1;
    return 0;
}

After defining these operations, users can create tables with your custom type:

CREATE TABLE signals (
    id INT PRIMARY KEY,
    impedance COMPLEX,
    frequency_response COMPLEX
);

INSERT INTO signals VALUES (1, '(50.0,10.0)', '(0.95,0.31)');

-- ORDER BY works because we provided compare_complex!
SELECT * FROM signals ORDER BY impedance;

-- Prepared statements work with custom types
PREPARE stmt FROM 'SELECT * FROM signals WHERE impedance = ?';
SET @val = '(50.0,10.0)';
EXECUTE stmt USING @val;

-- Aggregate operations work with custom types
SELECT COUNT(DISTINCT impedance), MIN(impedance), MAX(impedance),
       GROUP_CONCAT(impedance ORDER BY impedance) FROM signals;

VDFs in Generated Columns

VDFs can be used in generated column expressions. The VDF must be declared .deterministic() in the extension builder — the server blocks non-deterministic functions in this context.

CREATE TABLE signals (
    id INT PRIMARY KEY,
    impedance COMPLEX,
    -- Generated column computed by a VDF
    magnitude DOUBLE GENERATED ALWAYS AS (complex_abs(impedance)) STORED
);

complex_abs must be registered with .deterministic(). Traditional MySQL UDFs are not permitted in generated columns.

See vsql_complex Example for the complete implementation.

VDFs in Functional Indexes

VDFs can be used in functional index expressions. The same .deterministic() requirement from generated columns applies here because MySQL implements functional indexes as hidden generated columns.

CREATE TABLE signals (
    id INT PRIMARY KEY,
    sig COMPLEX,
    INDEX idx_magnitude ((COMPLEX_ABS(sig)))
);

The optimizer uses the index when the same VDF expression appears in WHERE, ORDER BY, or GROUP BY. Cast the comparison value to the VDF’s return type so the optimizer matches the expression:

SELECT id FROM signals WHERE COMPLEX_ABS(sig) > CAST(20.0 AS DOUBLE);

Step 5: Update Build Configuration

Edit CMakeLists.txt to build your extension as a VEB file:

cmake_minimum_required(VERSION 3.16)
project(my_extension)

# Find VillageSQL Extension Framework
find_package(VillageSQLExtensionFramework QUIET
    PATHS "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake"
)

# The framework detects build flags via 4 methods (in order):
# 1. Explicit MYSQL_INCLUDE_FLAGS/MYSQL_CXXFLAGS
# 2. VillageSQL_BUILD_DIR (reads CMakeCache.txt from VillageSQL build)
# 3. VSQL_BASE_DIR (uses mysql_config)
# 4. Default - mysql_config from PATH

# Build shared library with your source files
add_library(extension SHARED
    src/extension.cc
    src/my_functions.cc
)

# Create VEB archive
VEF_CREATE_VEB(
    NAME my_extension
    LIBRARY_TARGET extension
    MANIFEST ${CMAKE_CURRENT_SOURCE_DIR}/manifest.json
)

# Install VEB to VillageSQL extensions directory
install(FILES ${VEB_OUTPUT} DESTINATION ${INSTALL_DIR})

Configuration notes:

VillageSQLExtensionFramework provides CMake helpers for building extensions
VEF_CREATE_VEB() packages your library, manifest, and metadata into a .veb archive
The framework automatically detects MySQL/VillageSQL build flags
Library target name is typically extension (can be anything)
VEB name must match your manifest.json name
By default, extensions build against the stable ABI headers. Set -DVSQL_USE_DEV_ABI=ON to build against the unstable dev headers instead

Step 6: Create a Build Directory

Create a separate build directory:

mkdir build
cd build

Step 7: Build with CMake and Make

Configure and build your extension:

# Configure the build
cmake ..

# Or, if building against VillageSQL source:
cmake .. -DVillageSQL_BUILD_DIR=/path/to/villagesql/build

# Or, to build against the unstable dev ABI headers:
cmake .. -DVSQL_USE_DEV_ABI=ON

# Build the extension
make

This creates:

Compiled shared library (.so file)
VEB package (.veb file) - a tar archive containing manifest and library

Verify the Build

Check the contents of your VEB file:

make show_veb

You should see:

manifest.json
lib/myext.so

Step 8: Install and Test

Option A: Install to VillageSQL Extensions Directory

Use the install target to copy the VEB to your VillageSQL installation:

make install

This copies the .veb file to the directory configured via VillageSQL_VEB_INSTALL_DIR.

Option B: Manual Installation

Copy the VEB file manually:

# Find the VEF directory
mysql -u root -p -e "SHOW VARIABLES LIKE 'veb_dir';"

# Copy the VEB file
sudo cp my-awesome-extension.veb /path/to/veb_dir/

Test Your Extension

Connect to VillageSQL:
```
mysql -u root -p
```

Install the extension:

INSTALL EXTENSION my_awesome_extension;

Verify installation:

SELECT * FROM INFORMATION_SCHEMA.EXTENSIONS;

Test your functions:

SELECT my_reverse('Hello, World!');
-- Output: !dlroW ,olleH

SELECT count_vowels('VillageSQL');
-- Output: 3

Creating Tests

Add test files to validate your extension works correctly:

Create a test file in test/t/:

-- test/t/my_basic.test
SELECT my_reverse('abc');
SELECT my_reverse('');
SELECT my_reverse(NULL);

Generate expected results:

cd /path/to/villagesql/build/mysql-test
perl mysql-test-run.pl --suite=/path/to/your/extension/test --record

Run tests:

perl mysql-test-run.pl --suite=/path/to/your/extension/test

Advanced Topics

Per-Statement State with Prerun/Postrun

For VEF SDK functions that need setup/cleanup per SQL statement (not per row):

// Called once before the first row
void my_prerun(vef_context_t* ctx, vef_prerun_args_t* args,
               vef_prerun_result_t* result) {
    // Allocate state, open connections, etc.
    void* state = malloc(sizeof(MyState));
    result->user_data = state;
}

// Called once after all rows processed
void my_postrun(vef_context_t* ctx, vef_postrun_args_t* args,
                vef_postrun_result_t* result) {
    // Clean up state allocated in prerun
    if (args->user_data) {
        free(args->user_data);
    }
}

// Register with extension
VEF_GENERATE_ENTRY_POINTS(
  make_extension("my_ext", "1.0.0")
    .func(make_func<&my_func_impl>("my_func")
      .returns(STRING)
      .param(STRING)
      .prerun(&my_prerun)
      .postrun(&my_postrun)
      .build())
);

Most extensions don’t need prerun/postrun hooks. The VEF SDK automatically handles common cases like type checking and buffer allocation. Use prerun/postrun only when you need expensive per-statement setup (like opening connections) that shouldn’t happen per-row.If you find you need prerun/postrun for your use case, please share your scenario on the VillageSQL Discord - the team may be able to add SDK support to handle it automatically.

Aggregate Functions

Only COUNT(DISTINCT), MIN, MAX, and GROUP_CONCAT are supported for custom types. SUM, AVG, and other aggregate functions are not.

Supported aggregate operations with custom types:

-- COUNT(DISTINCT) works with custom types
SELECT COUNT(DISTINCT impedance) FROM signals;

-- MIN and MAX work with custom types (requires compare function)
SELECT MIN(impedance), MAX(impedance) FROM signals;

-- GROUP_CONCAT works with custom types
SELECT GROUP_CONCAT(impedance ORDER BY impedance SEPARATOR ', ') FROM signals;

Extension functions are called in a per-row execution model:

Each function call processes one row with its own result buffer (thread-safe)
prerun/postrun provide per-statement setup/teardown
Avoid global state - use function parameters and return values instead
If you must use global state, protect it with mutexes/locks

Best practice: Design functions to be stateless for simplicity and safety.

Window Functions

The following window functions work with custom types:

SELECT
    id,
    impedance,
    LAG(impedance)  OVER (ORDER BY id) AS prev_impedance,
    LEAD(impedance) OVER (ORDER BY id) AS next_impedance
FROM signals;

SELECT
    id,
    impedance,
    FIRST_VALUE(impedance) OVER w AS first_impedance,
    LAST_VALUE(impedance)  OVER w AS last_impedance,
    NTH_VALUE(impedance, 2) OVER w AS second_impedance
FROM signals
WINDOW w AS (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

Temporary Tables

Custom types work in temporary tables. CREATE TEMPORARY TABLE, INSERT, and ALTER TABLE behave the same as with permanent tables.

CREATE TEMPORARY TABLE tmp_signals (
    id        INT PRIMARY KEY,
    impedance COMPLEX
);

INSERT INTO tmp_signals VALUES (1, '(10,5)'), (2, '(20,0)');
SELECT id, impedance FROM tmp_signals;

Triggers

Triggers fire on tables with custom type columns. The trigger body can reference non-custom-type columns from NEW and OLD. Accessing custom type column values inside a trigger body is not yet supported.

CREATE TABLE signals (
    id        INT PRIMARY KEY,
    impedance COMPLEX,
    label     VARCHAR(50)
);
CREATE TABLE signal_log (
    id        INT,
    label     VARCHAR(50),
    logged_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER signals_after_insert
AFTER INSERT ON signals
FOR EACH ROW
    INSERT INTO signal_log (id, label) VALUES (NEW.id, NEW.label);

INSERT INTO signals VALUES (1, '(10,5)', 'sensor_a');
SELECT id, label FROM signal_log;

Troubleshooting

Extension Won’t Load

Check the error log and verify the VEB contents:

make show_veb
tail -f /var/log/mysql/error.log

Function Not Found

Verify installation and registration:

SELECT * FROM INFORMATION_SCHEMA.EXTENSIONS;

Build Errors

# Verify mysql_config is available
which mysql_config
mysql_config --version

# Check compiler version
gcc --version  # or clang --version

# Verify CMake version (3.16+ required)
cmake --version

Example Extensions

Learn from existing VillageSQL extensions:

vsql_complex

Complex number data type implementation

vsql_extension_template

Minimal template for creating extensions

Next Steps

Using Extensions

Learn how to install and manage extensions

Clone and Build from Source

Build VillageSQL from source code

GitHub

Contribute to VillageSQL Server

MySQL UDF API

MySQL UDF API reference documentation

Getting Started

Extensions

Using Extensions

Documentation Index

​Overview

​What is a VillageSQL Extension?

​Calling VDFs in SQL

​Prerequisites

​Step 1: Get the Extension Template

​Option A: Use Template from VillageSQL Source

​Option B: Fork from GitHub

​Step 2: Update the Manifest

​manifest.json Schema

​Step 3: Implement Your Extension with VEF SDK

​Include VillageSQL Headers

​Define Your Extension

​VDFs with Custom Type Arguments and Return Values

​Deterministic Functions

​Custom Buffer Sizes

​Critical API Contracts

​Part A: VDF Function Contracts

​Part B: Encode/Decode Return Convention

​Implement Wrapper Functions

​Handling NULL Values

​Error Handling

​Step 4: Creating Custom Types

​Define Type Operations

​Register the Type

​VDF-Name Type Operations

​ALTER TABLE and Custom Types

​Type Conversion Functions

​Example: COMPLEX Type

​VDFs in Generated Columns

​VDFs in Functional Indexes

​Step 5: Update Build Configuration

​Step 6: Create a Build Directory

​Step 7: Build with CMake and Make

​Verify the Build

​Step 8: Install and Test

​Option A: Install to VillageSQL Extensions Directory

​Option B: Manual Installation

​Test Your Extension

​Creating Tests

​Advanced Topics

​Per-Statement State with Prerun/Postrun

​Aggregate Functions

​Window Functions

​Temporary Tables

​Triggers

​Troubleshooting

​Extension Won’t Load

​Function Not Found

​Build Errors

​Example Extensions

vsql_complex

vsql_extension_template

​Next Steps

Using Extensions

Clone and Build from Source

GitHub

MySQL UDF API

​Resources

Overview

What is a VillageSQL Extension?

Calling VDFs in SQL

Prerequisites

Step 1: Get the Extension Template

Option A: Use Template from VillageSQL Source

Option B: Fork from GitHub

Step 2: Update the Manifest

manifest.json Schema

Step 3: Implement Your Extension with VEF SDK

Include VillageSQL Headers

Define Your Extension

VDFs with Custom Type Arguments and Return Values

Deterministic Functions

Custom Buffer Sizes

Critical API Contracts

Part A: VDF Function Contracts

Part B: Encode/Decode Return Convention

Implement Wrapper Functions

Handling NULL Values

Error Handling

Step 4: Creating Custom Types

Define Type Operations

Register the Type

VDF-Name Type Operations

ALTER TABLE and Custom Types

Type Conversion Functions

Example: COMPLEX Type

VDFs in Generated Columns

VDFs in Functional Indexes

Step 5: Update Build Configuration

Step 6: Create a Build Directory

Step 7: Build with CMake and Make

Verify the Build

Step 8: Install and Test

Option A: Install to VillageSQL Extensions Directory

Option B: Manual Installation

Test Your Extension

Creating Tests

Advanced Topics

Per-Statement State with Prerun/Postrun

Aggregate Functions

Window Functions

Temporary Tables

Triggers

Troubleshooting

Extension Won’t Load

Function Not Found

Build Errors

Example Extensions

Next Steps

Resources