VillageSQL’s extension framework allows you to add custom functionality to the database server. Build custom extensions using the VEF SDK and extension template.
A VillageSQL extension is packaged as a VEB file (VillageSQL Extension Bundle) containing:
Manifest - Metadata about the extension (name, version, description)
Shared library - Compiled C++ code implementing the functionality
Optional metadata - Additional resources or configuration
Extensions are built using the VEF SDK (VillageSQL Extension Framework), which provides:
C++ API for defining types and functions
Automatic registration without SQL scripts
Type-safe function wrappers
Builder pattern for extension definition
VDFs vs Traditional UDFs: Functions registered through the VEF SDK are called VDFs (VillageSQL Defined Functions). VillageSQL also supports traditional MySQL UDFs registered via CREATE FUNCTION ... SONAME, but the VEF SDK approach is recommended for new extensions.
VDFs can be called with or without the extension prefix:
Copy
Ask AI
-- Unqualified (preferred for cleaner code)SELECT complex_abs(impedance) FROM signals;-- Qualified with extension name (explicit)SELECT vsql_complex.complex_abs(impedance) FROM signals;
Function Resolution Order:When you call a function without qualification, VillageSQL resolves it in this order:
System functions (built-in MySQL functions like NOW(), CONCAT())
UDFs (traditional MySQL user-defined functions)
VDFs (extension functions) - only if there’s exactly one function with that name
Stored functions (created with CREATE FUNCTION)
When to Use Qualified Names:
Use extension.function_name when multiple extensions provide functions with the same name
Use unqualified names for cleaner code when there’s no ambiguity
Qualification is never required if only one extension provides that function name
Extensions can add:
Custom functions (VDFs) - SQL functions with automatic type checking and validation
Custom data types - New column types like COMPLEX, UUID, or VECTOR that work with ORDER BY and indexes
Type operations - Encode, decode, compare, and hash functions for custom types
.buffer_size(size_t) - Request specific output buffer size for STRING/CUSTOM returns
.prerun<func>() - Set per-statement setup function (optional)
.postrun<func>() - Set per-statement cleanup function (optional)
.build() - Finalize function registration
Parameter Limit: Functions support a maximum of 8 parameters (defined by kMaxParams). If you need more, consider using structured types or multiple functions.
Every custom type needs encode and decode functions, and optionally compare/hash functions:
Copy
Ask AI
// Encode: Convert string representation to binarybool encode_mytype(unsigned char* buffer, size_t buffer_size, const char* from, size_t from_len, size_t* length) { // Parse 'from' string and write binary to 'buffer' // Set *length to bytes written // Return false on success, true on error}// Decode: Convert binary to string representationbool decode_mytype(const unsigned char* buffer, size_t buffer_size, char* to, size_t to_buffer_size, size_t* to_length) { // Read binary from 'buffer' and write string to 'to' // Set *to_length to string length // Return false on success, true on error}// Compare: Enable ORDER BY and indexing (optional but recommended)int compare_mytype(const unsigned char* a, size_t a_len, const unsigned char* b, size_t b_len) { // Return: negative if a<b, 0 if a==b, positive if a>b}// Hash: Custom hash function (optional, uses default if not provided)size_t hash_mytype(const unsigned char* buffer, size_t buffer_size) { // Return hash value for the binary data}
-- Convert string to custom typeSELECT mytype_from_string('(1.0,2.0)');-- Convert custom type to stringSELECT mytype_to_string(my_column) FROM my_table;-- Explicit conversion in INSERTINSERT INTO my_table (id, value)VALUES (1, mytype_from_string('(3.0,4.0)'));
The .from_string<>() and .to_string<>() builders automatically create SQL-callable conversion functions using your encode/decode implementations.
Here’s a complete example implementing a COMPLEX number type:
Copy
Ask AI
#include <cstring>#include <cmath>// Encode complex number from "(real,imag)" to 16 bytesbool encode_complex(unsigned char* buffer, size_t buffer_size, const char* from, size_t from_len, size_t* length) { if (buffer_size < 16) return true; // Parse "(real,imag)" format double real, imag; if (sscanf(from, "(%lf,%lf)", &real, &imag) != 2) { return true; } // Write 16 bytes: 8 bytes real + 8 bytes imag memcpy(buffer, &real, 8); memcpy(buffer + 8, &imag, 8); *length = 16; return false;}// Decode 16 bytes to "(real,imag)" stringbool decode_complex(const unsigned char* buffer, size_t buffer_size, char* to, size_t to_buffer_size, size_t* to_length) { if (buffer_size < 16) return true; double real, imag; memcpy(&real, buffer, 8); memcpy(&imag, buffer + 8, 8); int len = snprintf(to, to_buffer_size, "(%.6f,%.6f)", real, imag); if (len < 0 || len >= (int)to_buffer_size) return true; *to_length = len; return false;}// Compare for ORDER BY: compare real part first, then imaginaryint compare_complex(const unsigned char* a, size_t a_len, const unsigned char* b, size_t b_len) { if (a_len < 16 || b_len < 16) return 0; double a_real, a_imag, b_real, b_imag; memcpy(&a_real, a, 8); memcpy(&a_imag, a + 8, 8); memcpy(&b_real, b, 8); memcpy(&b_imag, b + 8, 8); if (a_real < b_real) return -1; if (a_real > b_real) return 1; if (a_imag < b_imag) return -1; if (a_imag > b_imag) return 1; return 0;}
After defining these operations, users can create tables with your custom type:
Copy
Ask AI
CREATE TABLE signals ( id INT PRIMARY KEY, impedance COMPLEX, frequency_response COMPLEX);INSERT INTO signals VALUES (1, '(50.0,10.0)', '(0.95,0.31)');-- ORDER BY works because we provided compare_complex!SELECT * FROM signals ORDER BY impedance;-- Prepared statements work with custom typesPREPARE stmt FROM 'SELECT * FROM signals WHERE impedance = ?';SET @val = '(50.0,10.0)';EXECUTE stmt USING @val;-- Aggregate operations work with custom typesSELECT COUNT(DISTINCT impedance), MIN(impedance), MAX(impedance) FROM signals;
# Configure the buildcmake ..# Or, if building against VillageSQL source:cmake .. -DVillageSQL_BUILD_DIR=/path/to/villagesql/build# Build the extensionmake
This creates:
Compiled shared library (.so file)
VEB package (.veb file) - a tar archive containing manifest and library
For VEF SDK functions that need setup/cleanup per SQL statement (not per row):
Copy
Ask AI
// Called once before the first rowvoid my_prerun(vef_context_t* ctx, vef_prerun_args_t* args, vef_prerun_result_t* result) { // Allocate state, open connections, etc. void* state = malloc(sizeof(MyState)); result->user_data = state;}// Called once after all rows processedvoid my_postrun(vef_context_t* ctx, vef_postrun_args_t* args, vef_postrun_result_t* result) { // Clean up state allocated in prerun if (args->user_data) { free(args->user_data); }}// Register with extensionVEF_GENERATE_ENTRY_POINTS( make_extension("my_ext", "1.0.0") .func(make_func<&my_func_impl>("my_func") .returns(STRING) .param(STRING) .prerun(&my_prerun) .postrun(&my_postrun) .build()));
Most extensions don’t need prerun/postrun hooks. The VEF SDK automatically handles common cases like type checking and buffer allocation. Use prerun/postrun only when you need expensive per-statement setup (like opening connections) that shouldn’t happen per-row.If you find you need prerun/postrun for your use case, please share your scenario on the VillageSQL Discord - the team may be able to add SDK support to handle it automatically.
COUNT(DISTINCT), MIN, and MAX operations work with custom types. We are investigating support for additional aggregate functions (SUM, AVG, etc.) for inclusion in a future release.
Supported aggregate operations with custom types:
Copy
Ask AI
-- COUNT(DISTINCT) works with custom typesSELECT COUNT(DISTINCT impedance) FROM signals;-- MIN and MAX work with custom types (requires compare function)SELECT MIN(impedance), MAX(impedance) FROM signals;
Extension functions are called in a per-row execution model:
Each function call processes one row with its own result buffer (thread-safe)
prerun/postrun provide per-statement setup/teardown
Avoid global state - use function parameters and return values instead
If you must use global state, protect it with mutexes/locks
Best practice: Design functions to be stateless for simplicity and safety.