> ## Documentation Index
> Fetch the complete documentation index at: https://villagesql.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# MySQL String Functions

> Reference guide for MySQL string functions: CONCAT, SUBSTRING, TRIM, REPLACE, UPPER, LOWER, LENGTH, LIKE, REGEXP, FORMAT, and common string manipulation patterns.

<Card title="VillageSQL is a drop-in replacement for MySQL with extensions." icon="database" href="/mysql-8.4/0.0.4/quickstart">
  All examples in this guide work on VillageSQL. Install Now →
</Card>

MySQL's string functions cover splitting, joining, searching, formatting, and comparing text. Most follow the same conventions: 1-based indexing, NULL propagation (any NULL argument returns NULL), and case-insensitive comparisons by default.

## Combining and Splitting

**CONCAT** — join strings together:

```sql theme={null}
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM users;

-- CONCAT_WS: join with a separator, skipping NULLs
SELECT CONCAT_WS(', ', city, state, country) AS address FROM locations;
```

`CONCAT` with a NULL argument returns NULL. `CONCAT_WS` skips NULL values but keeps the separator for non-NULL ones.

**SUBSTRING / SUBSTR / MID** — extract part of a string:

```sql theme={null}
-- SUBSTRING(string, start, length)  — 1-based indexing
SELECT SUBSTRING('Hello, World!', 1, 5);   -- 'Hello'
SELECT SUBSTRING('Hello, World!', 8);      -- 'World!'  (to end of string)
SELECT SUBSTRING('Hello, World!', -6);     -- 'World!'  (from the right)
```

**SUBSTRING\_INDEX** — split on a delimiter:

```sql theme={null}
-- Get everything before the second comma
SELECT SUBSTRING_INDEX('a,b,c,d', ',', 2);   -- 'a,b'

-- Get everything after the last dot (domain from email)
SELECT SUBSTRING_INDEX('user@example.com', '@', -1);  -- 'example.com'
```

## Searching Within Strings

**LOCATE / POSITION / INSTR** — find a substring:

```sql theme={null}
SELECT LOCATE('World', 'Hello, World!');       -- 8  (1-based position)
SELECT LOCATE('world', 'Hello, World!');       -- 8  (case-insensitive by default)
SELECT LOCATE('xyz', 'Hello, World!');         -- 0  (not found)

-- LOCATE with start position
SELECT LOCATE('o', 'Hello, World!', 6);        -- 9  (skips first 'o')
```

**LIKE** — pattern matching with wildcards:

```sql theme={null}
-- % matches zero or more characters, _ matches exactly one
SELECT * FROM products WHERE name LIKE 'Widget%';       -- starts with Widget
SELECT * FROM products WHERE name LIKE '%Pro%';         -- contains Pro
SELECT * FROM products WHERE code LIKE 'SKU-___-2024';  -- three chars in middle
```

**REGEXP / RLIKE** — regular expression matching:

```sql theme={null}
SELECT * FROM users WHERE email REGEXP '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$';
SELECT * FROM products WHERE name REGEXP 'Pro|Plus|Max';
```

`REGEXP` uses POSIX regex syntax. Use `REGEXP_LIKE()` (MySQL 8.0+) for the function form.

## Modifying Strings

**REPLACE** — replace all occurrences of a substring:

```sql theme={null}
SELECT REPLACE('Hello, World!', 'World', 'MySQL');  -- 'Hello, MySQL!'
UPDATE articles SET content = REPLACE(content, 'old_term', 'new_term');
```

**TRIM / LTRIM / RTRIM** — remove whitespace or specific characters:

```sql theme={null}
SELECT TRIM('  hello  ');                    -- 'hello'
SELECT TRIM(LEADING '0' FROM '007');         -- '7'
SELECT TRIM(BOTH 'x' FROM 'xxhelloxx');      -- 'hello'
```

**UPPER / LOWER** — change case:

```sql theme={null}
SELECT UPPER('hello');   -- 'HELLO'
SELECT LOWER('HELLO');   -- 'hello'
```

**LPAD / RPAD** — pad to a fixed length:

```sql theme={null}
SELECT LPAD('42', 6, '0');    -- '000042'
SELECT RPAD('hello', 10, '.');  -- 'hello.....'
```

**REPEAT** — repeat a string N times:

```sql theme={null}
SELECT REPEAT('ab', 3);   -- 'ababab'
```

**REVERSE** — reverse a string:

```sql theme={null}
SELECT REVERSE('hello');  -- 'olleh'
```

## Length and Comparison

MySQL has two length functions that produce different results for multibyte characters:

| Function                                 | What it measures |
| :--------------------------------------- | :--------------- |
| `LENGTH(s)`                              | Byte length      |
| `CHAR_LENGTH(s)` / `CHARACTER_LENGTH(s)` | Character length |

```sql theme={null}
SELECT LENGTH('café');      -- 5  (the é is 2 bytes in UTF-8)
SELECT CHAR_LENGTH('café'); -- 4  (4 characters)
```

Always use `CHAR_LENGTH` when you care about the number of characters, not bytes.

**STRCMP** — compare two strings, returns -1, 0, or 1:

```sql theme={null}
SELECT STRCMP('abc', 'abc');   -- 0
SELECT STRCMP('abc', 'abd');   -- -1
SELECT STRCMP('abd', 'abc');   -- 1
```

## Formatting

**FORMAT** — format a number with thousands separators:

```sql theme={null}
SELECT FORMAT(1234567.891, 2);    -- '1,234,567.89'
SELECT FORMAT(1234567.891, 2, 'de_DE');  -- '1.234.567,89'  (German locale)
```

**LEFT / RIGHT** — take N characters from start or end:

```sql theme={null}
SELECT LEFT('Hello, World!', 5);   -- 'Hello'
SELECT RIGHT('Hello, World!', 6);  -- 'World!'
```

## Common Patterns

Extract the domain from an email address:

```sql theme={null}
SELECT SUBSTRING_INDEX(email, '@', -1) AS domain FROM users;
```

Truncate long strings for display:

```sql theme={null}
SELECT CASE
    WHEN CHAR_LENGTH(description) > 100
    THEN CONCAT(LEFT(description, 97), '...')
    ELSE description
END AS short_description
FROM products;
```

Normalize whitespace:

```sql theme={null}
UPDATE users SET name = TRIM(REGEXP_REPLACE(name, '\\s+', ' '));
```

## Frequently Asked Questions

#### Are MySQL string comparisons case-sensitive?

By default, no — string comparisons use the column's collation, and most common collations are case-insensitive (e.g., `utf8mb4_general_ci`, where `ci` = case-insensitive). For case-sensitive comparisons, use a `_cs` collation or the `BINARY` operator:

```sql theme={null}
SELECT * FROM users WHERE BINARY username = 'Alice';
-- or: WHERE username COLLATE utf8mb4_bin = 'Alice'
```

#### What's the difference between CHAR\_LENGTH and LENGTH?

`LENGTH` returns byte count; `CHAR_LENGTH` returns character count. For ASCII strings they're the same. For UTF-8 strings with multibyte characters (accented letters, emoji, CJK), they differ. Use `CHAR_LENGTH` for user-visible string length limits.

## Troubleshooting

| Problem                                      | Solution                                                                                                 |
| :------------------------------------------- | :------------------------------------------------------------------------------------------------------- |
| `CONCAT` returns NULL                        | One argument is NULL — use `COALESCE(col, '')` to replace NULLs before concatenating, or use `CONCAT_WS` |
| `LIKE '%pattern%'` is slow                   | Leading wildcard prevents index use — consider FULLTEXT index for text search at scale                   |
| String comparison not matching expected rows | Case sensitivity or collation mismatch — check the column's collation with `SHOW CREATE TABLE`           |
| `REPLACE` updates more rows than expected    | `REPLACE` is case-insensitive by default — it replaces all case variants of the target string            |
| `SUBSTRING` returns empty string, not NULL   | SUBSTRING of an out-of-range position returns an empty string, not NULL                                  |

## See also

* [MySQL Character Sets and Collations](/guides/character-sets) — string functions behave differently across collations
* [Full-Text Search in MySQL](/guides/full-text-search) — searching strings at scale beyond LIKE
