Overview
VillageSQL’s extension framework allows you to add custom functionality to the database server. Build custom extensions using the VEF SDK and extension template.
What is a VillageSQL Extension?
A VillageSQL extension is packaged as a VEB file (VillageSQL Extension Bundle) containing:
- Manifest - Metadata about the extension (name, version, description)
- Shared library - Compiled C++ code implementing the functionality
- Optional metadata - Additional resources or configuration
Extensions are built using the VEF SDK (VillageSQL Extension Framework), which provides:
- C++ API for defining types and functions
- Automatic registration without SQL scripts
- Type-safe function wrappers
- Builder pattern for extension definition
VDFs vs Traditional UDFs: Functions registered through the VEF SDK are called VDFs (VillageSQL Defined Functions). VillageSQL also supports traditional MySQL UDFs registered via CREATE FUNCTION ... SONAME, but the VEF SDK approach is recommended for new extensions.
Calling VDFs in SQL
VDFs can be called with or without the extension prefix:
-- Unqualified (preferred for cleaner code)
SELECT complex_abs(impedance) FROM signals;
-- Qualified with extension name (explicit)
SELECT vsql_complex.complex_abs(impedance) FROM signals;
Function Resolution Order:
When you call a function without qualification, VillageSQL resolves it in this order:
- System functions (built-in MySQL functions like
NOW(), CONCAT())
- UDFs (traditional MySQL user-defined functions)
- VDFs (extension functions) - only if there’s exactly one function with that name
- Stored functions (created with
CREATE FUNCTION)
When to Use Qualified Names:
- Use
extension.function_name when multiple extensions provide functions with the same name
- Use unqualified names for cleaner code when there’s no ambiguity
- Qualification is never required if only one extension provides that function name
Extensions can add:
- Custom functions (VDFs) - SQL functions with automatic type checking and validation
- Custom data types - New column types like COMPLEX, UUID, or VECTOR that work with ORDER BY and indexes
- Type operations - Encode, decode, compare, and hash functions for custom types
Prerequisites
Before you begin, ensure you have:
- Git - For cloning and version control
- CMake 3.16 or higher - Build system
- C++ Compiler - GCC 8+, Clang 8+, or MSVC 2019+ with C++11 support
- VillageSQL - Either:
- Source build with
VillageSQL_BUILD_DIR set to build directory, or
- Binary installation with
mysql_config in PATH
- VillageSQL SDK headers - Included in VillageSQL source at
villagesql/sdk/include/
- Basic C++ knowledge - Understanding of C++ and function pointers
Step 1: Get the Extension Template
You can start with the extension template in two ways:
Option A: Use Template from VillageSQL Source
If you have VillageSQL source code, the template is included:
cd /path/to/villagesql-source
cp -r villagesql/sdk/template my-extension
cd my-extension
Option B: Fork from GitHub
Start by forking the VillageSQL extension template repository:
-
Visit the template repository on GitHub:
https://github.com/villagesql/vsql-extension-template
-
Click the “Fork” button to create your own copy
-
Clone your fork locally:
git clone https://github.com/YOUR_USERNAME/vsql-extension-template.git
cd vsql-extension-template
Alternatively, use the “Use this template” button on GitHub to create a new repository based on the template without forking history.
Step 2: Update the Manifest
Edit manifest.json to define your extension’s metadata:
{
"name": "my_awesome_extension",
"version": "1.0.0",
"description": "My custom VillageSQL extension that does amazing things",
"author": "Your Name",
"license": "GPL-2.0"
}
manifest.json Schema
| Field | Required | Format | Description |
|---|
name | ✅ Yes | lowercase_with_underscores | Unique identifier. Must match INSTALL EXTENSION name. |
version | ✅ Yes | MAJOR.MINOR.PATCH | Semantic version string |
description | No | String | Brief explanation of functionality |
author | No | String | Author name or organization |
license | No | String | License identifier (GPL-2.0 recommended) |
Validation rules:
name: Must be lowercase letters, numbers, and underscores only. No hyphens in manifest!
version: Must follow semantic versioning (e.g., 1.0.0, 0.2.1)
- Invalid manifest will cause
INSTALL EXTENSION to fail
Example:
-- manifest.json has "name": "my_awesome_extension"
INSTALL EXTENSION my_awesome_extension; -- ✅ Correct
INSTALL EXTENSION my-awesome-extension; -- ❌ Wrong (hyphens)
Step 3: Implement Your Extension with VEF SDK
The VEF SDK provides a C++ API for defining extensions using a fluent builder pattern:
- Type-safe function definitions with compile-time checking
- Automatic argument validation and type conversion
- Support for custom types with compare/hash functions (enables ORDER BY and indexes)
Create your main extension file (e.g., src/extension.cc) and include the VEF SDK:
#include <villagesql/extension.h>
#include <villagesql/func_builder.h>
#include <villagesql/type_builder.h>
// Your implementation code here
Define Your Extension
Use the VEF_GENERATE_ENTRY_POINTS() macro to define your extension:
VEF_GENERATE_ENTRY_POINTS(
make_extension("my_extension", "1.0.0") // name, version
.func(make_func<&my_reverse_impl>("my_reverse")
.returns(STRING)
.param(STRING)
.build())
.func(make_func<&count_vowels_impl>("count_vowels")
.returns(INT)
.param(STRING)
.build())
);
Function Builder Methods:
make_func<&impl>("name") - Create function with implementation pointer
.returns(type) - Set return type (STRING, INT, REAL, or custom type name)
.param(type) - Add parameter (maximum 8 parameters)
.buffer_size(size_t) - Request specific output buffer size for STRING/CUSTOM returns
.prerun<func>() - Set per-statement setup function (optional)
.postrun<func>() - Set per-statement cleanup function (optional)
.build() - Finalize function registration
Parameter Limit: Functions support a maximum of 8 parameters (defined by kMaxParams). If you need more, consider using structured types or multiple functions.
Custom Buffer Sizes
For functions returning variable-length data, request a specific buffer size:
make_func<&large_result_impl>("large_result")
.returns(STRING)
.param(INT)
.buffer_size(65536) // Request 64KB buffer
.build()
Check available buffer space before writing:
void large_result_impl(vef_context_t* ctx, vef_invalue_t* input,
vef_vdf_result_t* result) {
size_t needed = calculate_output_size(input);
if (needed > result->max_str_len) {
result->type = VEF_RESULT_ERROR;
strcpy(result->error_msg, "Output exceeds buffer size");
return;
}
// Write output to result->str_buf
result->type = VEF_RESULT_VALUE;
result->actual_len = actual_output_length;
}
Request sufficient buffer size via .buffer_size() based on your function’s maximum output size.
Implement Wrapper Functions
Implementation functions use the VEF API signature:
#include <algorithm>
#include <string>
#include <cstring>
// String reverse implementation
void my_reverse_impl(vef_context_t* ctx, vef_invalue_t* input,
vef_vdf_result_t* result) {
if (input->is_null) {
result->type = VEF_RESULT_NULL;
return;
}
// Reverse the string
size_t len = input->str_len;
for (size_t i = 0; i < len; i++) {
result->str_buf[i] = input->str_value[len - 1 - i];
}
result->type = VEF_RESULT_VALUE;
result->actual_len = len;
}
// Count vowels implementation
void count_vowels_impl(vef_context_t* ctx, vef_invalue_t* input,
vef_vdf_result_t* result) {
if (input->is_null) {
result->type = VEF_RESULT_NULL;
return;
}
int64_t count = 0;
for (size_t i = 0; i < input->str_len; i++) {
char lower = std::tolower(input->str_value[i]);
if (lower == 'a' || lower == 'e' || lower == 'i' ||
lower == 'o' || lower == 'u') {
count++;
}
}
result->int_value = count;
result->type = VEF_RESULT_VALUE;
}
Handling NULL Values
Check for NULL via the is_null flag and return NULL by setting the result type:
void my_func_impl(vef_context_t* ctx, vef_invalue_t* input,
vef_vdf_result_t* result) {
// Check if input is NULL
if (input->is_null) {
result->type = VEF_RESULT_NULL;
return;
}
// Access input data
const char* str = input->str_value;
size_t len = input->str_len;
// Process and write result to result->str_buf
// ...
result->type = VEF_RESULT_VALUE;
result->actual_len = output_length;
}
NULL handling options:
- Input NULL check:
if (input->is_null)
- Return NULL:
result->type = VEF_RESULT_NULL
- Return value:
result->type = VEF_RESULT_VALUE + write to type-specific buffer
- Return error:
result->type = VEF_RESULT_ERROR + error message in result->error_msg
Error Handling
Return errors with custom messages for validation failures or invalid input:
void validate_age_impl(vef_context_t* ctx, vef_invalue_t* age_input,
vef_vdf_result_t* result) {
if (age_input->is_null) {
result->type = VEF_RESULT_NULL;
return;
}
long long age = age_input->int_value;
// Validate input range
if (age < 0 || age > 150) {
result->type = VEF_RESULT_ERROR;
// Write error message
const char* msg = "Age must be between 0 and 150";
strcpy(result->error_msg, msg);
return;
}
// Return valid result
result->type = VEF_RESULT_VALUE;
result->int_value = age;
}
Result types:
VEF_RESULT_ERROR - Validation failure, invalid input, or runtime error
VEF_RESULT_NULL - NULL value
VEF_RESULT_VALUE - Success
Step 4: Creating Custom Types
Custom types allow you to define new column types that work seamlessly with SQL operations like ORDER BY, GROUP BY, and indexes.
Define Type Operations
Every custom type needs encode and decode functions, and optionally compare/hash functions:
// Encode: Convert string representation to binary
bool encode_mytype(unsigned char* buffer, size_t buffer_size,
const char* from, size_t from_len, size_t* length) {
// Parse 'from' string and write binary to 'buffer'
// Set *length to bytes written
// Return false on success, true on error
}
// Decode: Convert binary to string representation
bool decode_mytype(const unsigned char* buffer, size_t buffer_size,
char* to, size_t to_buffer_size, size_t* to_length) {
// Read binary from 'buffer' and write string to 'to'
// Set *to_length to string length
// Return false on success, true on error
}
// Compare: Enable ORDER BY and indexing (optional but recommended)
int compare_mytype(const unsigned char* a, size_t a_len,
const unsigned char* b, size_t b_len) {
// Return: negative if a<b, 0 if a==b, positive if a>b
}
// Hash: Custom hash function (optional, uses default if not provided)
size_t hash_mytype(const unsigned char* buffer, size_t buffer_size) {
// Return hash value for the binary data
}
Register the Type
VEF_GENERATE_ENTRY_POINTS(
make_extension("my_extension", "1.0.0")
.type(make_type(MYTYPE)
.persisted_length(16) // Fixed storage size in bytes
.max_decode_buffer_length(64) // Max string representation size
.encode(&encode_mytype)
.decode(&decode_mytype)
.compare(&compare_mytype) // Enables ORDER BY and indexes!
.hash(&hash_mytype) // Optional custom hash
.build())
.func(make_func<&mytype_constructor>("MYTYPE") // Constructor function
.returns(MYTYPE)
.param(REAL)
.param(REAL)
.build())
);
Type Conversion Functions
Expose your encode/decode functions as SQL functions for explicit type conversion:
VEF_GENERATE_ENTRY_POINTS(
make_extension("my_extension", "1.0.0")
.type(make_type(MYTYPE)
.persisted_length(16)
.max_decode_buffer_length(64)
.encode(&encode_mytype)
.decode(&decode_mytype)
.compare(&compare_mytype)
.build())
// String → Custom Type conversion
.func(make_func("mytype_from_string")
.from_string<&encode_mytype>(MYTYPE))
// Custom Type → String conversion
.func(make_func("mytype_to_string")
.to_string<&decode_mytype>(MYTYPE))
.build()
);
Usage in SQL:
-- Convert string to custom type
SELECT mytype_from_string('(1.0,2.0)');
-- Convert custom type to string
SELECT mytype_to_string(my_column) FROM my_table;
-- Explicit conversion in INSERT
INSERT INTO my_table (id, value)
VALUES (1, mytype_from_string('(3.0,4.0)'));
The .from_string<>() and .to_string<>() builders automatically create SQL-callable conversion functions using your encode/decode implementations.
Example: COMPLEX Type
Here’s a complete example implementing a COMPLEX number type:
#include <cstring>
#include <cmath>
// Encode complex number from "(real,imag)" to 16 bytes
bool encode_complex(unsigned char* buffer, size_t buffer_size,
const char* from, size_t from_len, size_t* length) {
if (buffer_size < 16) return true;
// Parse "(real,imag)" format
double real, imag;
if (sscanf(from, "(%lf,%lf)", &real, &imag) != 2) {
return true;
}
// Write 16 bytes: 8 bytes real + 8 bytes imag
memcpy(buffer, &real, 8);
memcpy(buffer + 8, &imag, 8);
*length = 16;
return false;
}
// Decode 16 bytes to "(real,imag)" string
bool decode_complex(const unsigned char* buffer, size_t buffer_size,
char* to, size_t to_buffer_size, size_t* to_length) {
if (buffer_size < 16) return true;
double real, imag;
memcpy(&real, buffer, 8);
memcpy(&imag, buffer + 8, 8);
int len = snprintf(to, to_buffer_size, "(%.6f,%.6f)", real, imag);
if (len < 0 || len >= (int)to_buffer_size) return true;
*to_length = len;
return false;
}
// Compare for ORDER BY: compare real part first, then imaginary
int compare_complex(const unsigned char* a, size_t a_len,
const unsigned char* b, size_t b_len) {
if (a_len < 16 || b_len < 16) return 0;
double a_real, a_imag, b_real, b_imag;
memcpy(&a_real, a, 8);
memcpy(&a_imag, a + 8, 8);
memcpy(&b_real, b, 8);
memcpy(&b_imag, b + 8, 8);
if (a_real < b_real) return -1;
if (a_real > b_real) return 1;
if (a_imag < b_imag) return -1;
if (a_imag > b_imag) return 1;
return 0;
}
After defining these operations, users can create tables with your custom type:
CREATE TABLE signals (
id INT PRIMARY KEY,
impedance COMPLEX,
frequency_response COMPLEX
);
INSERT INTO signals VALUES (1, '(50.0,10.0)', '(0.95,0.31)');
-- ORDER BY works because we provided compare_complex!
SELECT * FROM signals ORDER BY impedance;
-- Prepared statements work with custom types
PREPARE stmt FROM 'SELECT * FROM signals WHERE impedance = ?';
SET @val = '(50.0,10.0)';
EXECUTE stmt USING @val;
-- Aggregate operations work with custom types
SELECT COUNT(DISTINCT impedance), MIN(impedance), MAX(impedance) FROM signals;
See vsql_complex Example for the complete implementation.
Step 5: Update Build Configuration
Edit CMakeLists.txt to build your extension as a VEB file:
cmake_minimum_required(VERSION 3.16)
project(my_extension)
# Find VillageSQL Extension Framework
find_package(VillageSQLExtensionFramework QUIET
PATHS "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake"
)
# The framework detects build flags via 4 methods (in order):
# 1. Explicit MYSQL_INCLUDE_FLAGS/MYSQL_CXXFLAGS
# 2. VillageSQL_BUILD_DIR (reads CMakeCache.txt from VillageSQL build)
# 3. VSQL_BASE_DIR (uses mysql_config)
# 4. Default - mysql_config from PATH
# Build shared library with your source files
add_library(extension SHARED
src/extension.cc
src/my_functions.cc
)
# Create VEB archive
VEF_CREATE_VEB(
NAME my_extension
LIBRARY_TARGET extension
MANIFEST ${CMAKE_CURRENT_SOURCE_DIR}/manifest.json
)
# Install VEB to VillageSQL extensions directory
install(FILES ${VEB_OUTPUT} DESTINATION ${INSTALL_DIR})
Configuration notes:
VillageSQLExtensionFramework provides CMake helpers for building extensions
VEF_CREATE_VEB() packages your library, manifest, and metadata into a .veb archive
- The framework automatically detects MySQL/VillageSQL build flags
- Library target name is typically
extension (can be anything)
- VEB name must match your manifest.json name
Step 6: Create a Build Directory
Create a separate build directory:
Step 7: Build with CMake and Make
Configure and build your extension:
# Configure the build
cmake ..
# Or, if building against VillageSQL source:
cmake .. -DVillageSQL_BUILD_DIR=/path/to/villagesql/build
# Build the extension
make
This creates:
- Compiled shared library (
.so file)
- VEB package (
.veb file) - a tar archive containing manifest and library
Verify the Build
Check the contents of your VEB file:
You should see:
manifest.json
lib/myext.so
Step 8: Install and Test
Option A: Install to VillageSQL Extensions Directory
Use the install target to copy the VEB to your VillageSQL installation:
This copies the .veb file to the directory configured via VillageSQL_VEB_INSTALL_DIR.
Option B: Manual Installation
Copy the VEB file manually:
# Find the VEF directory
mysql -u root -p -e "SHOW VARIABLES LIKE 'veb_dir';"
# Copy the VEB file
sudo cp my-awesome-extension.veb /path/to/veb_dir/
Test Your Extension
-
Connect to VillageSQL:
-
Install the extension:
INSTALL EXTENSION my_awesome_extension;
-
Verify installation:
SELECT * FROM INFORMATION_SCHEMA.EXTENSIONS;
-
Test your functions:
SELECT my_reverse('Hello, World!');
-- Output: !dlroW ,olleH
SELECT count_vowels('VillageSQL');
-- Output: 3
Creating Tests
Add test files to validate your extension works correctly:
-
Create a test file in
test/t/:
-- test/t/my_basic.test
SELECT my_reverse('abc');
SELECT my_reverse('');
SELECT my_reverse(NULL);
-
Generate expected results:
cd /path/to/villagesql/build/mysql-test
perl mysql-test-run.pl --suite=/path/to/your/extension/test --record
-
Run tests:
perl mysql-test-run.pl --suite=/path/to/your/extension/test
Advanced Topics
Per-Statement State with Prerun/Postrun
For VEF SDK functions that need setup/cleanup per SQL statement (not per row):
// Called once before the first row
void my_prerun(vef_context_t* ctx, vef_prerun_args_t* args,
vef_prerun_result_t* result) {
// Allocate state, open connections, etc.
void* state = malloc(sizeof(MyState));
result->user_data = state;
}
// Called once after all rows processed
void my_postrun(vef_context_t* ctx, vef_postrun_args_t* args,
vef_postrun_result_t* result) {
// Clean up state allocated in prerun
if (args->user_data) {
free(args->user_data);
}
}
// Register with extension
VEF_GENERATE_ENTRY_POINTS(
make_extension("my_ext", "1.0.0")
.func(make_func<&my_func_impl>("my_func")
.returns(STRING)
.param(STRING)
.prerun(&my_prerun)
.postrun(&my_postrun)
.build())
);
Most extensions don’t need prerun/postrun hooks. The VEF SDK automatically handles common cases like type checking and buffer allocation. Use prerun/postrun only when you need expensive per-statement setup (like opening connections) that shouldn’t happen per-row.If you find you need prerun/postrun for your use case, please share your scenario on the VillageSQL Discord - the team may be able to add SDK support to handle it automatically.
Aggregate Functions
COUNT(DISTINCT), MIN, and MAX operations work with custom types. We are investigating support for additional aggregate functions (SUM, AVG, etc.) for inclusion in a future release.
Supported aggregate operations with custom types:
-- COUNT(DISTINCT) works with custom types
SELECT COUNT(DISTINCT impedance) FROM signals;
-- MIN and MAX work with custom types (requires compare function)
SELECT MIN(impedance), MAX(impedance) FROM signals;
Extension functions are called in a per-row execution model:
- Each function call processes one row with its own result buffer (thread-safe)
prerun/postrun provide per-statement setup/teardown
- Avoid global state - use function parameters and return values instead
- If you must use global state, protect it with mutexes/locks
Best practice: Design functions to be stateless for simplicity and safety.
Troubleshooting
Extension Won’t Load
Check the error log and verify the VEB contents:
make show_veb
tail -f /var/log/mysql/error.log
Function Not Found
Verify installation and registration:
SELECT * FROM INFORMATION_SCHEMA.EXTENSIONS;
Build Errors
# Verify mysql_config is available
which mysql_config
mysql_config --version
# Check compiler version
gcc --version # or clang --version
# Verify CMake version (3.16+ required)
cmake --version
Example Extensions
Learn from existing VillageSQL extensions:
Next Steps
Resources