SQL ascii and chr: Code Points and Characters

The pair ascii and chr converts a character into its numeric code and back. It looks trivial, yet these two functions power letter-label generation, the insertion of control characters such as newlines, and careful Unicode work. Let us walk through both on PostgreSQL and flag where MySQL and ClickHouse behave differently.

The key fact to fix in your head up front: in a UTF-8 database ascii returns the full Unicode code point of the first character, not the value of a byte, so it is the exact inverse of chr. That is what sets PostgreSQL apart from MySQL, whose same-named ASCII() looks only at the first byte. Working with character codes shines when you diagnose invisible data: tabs, line feeds, control characters, and suspicious imports where the eye cannot tell a space from a non-breaking space. A query with ascii(col) shows the actual code of a character right in the expression, not the way a terminal happens to render it.

ascii: the code of the first character

ascii(text) returns the code point of the first character in a string and ignores the rest. For an empty string the result is zero. In a UTF-8 database this is a real Unicode code point, not a byte number.

SELECT ascii('A');        -- 65
SELECT ascii('a');        -- 97
SELECT ascii('Apple');    -- 65, only the first character matters
SELECT ascii('');         -- 0

Because only the first character counts, the function is handy for first-letter classification. Group employees by the uppercase code of their initial:

SELECT ascii(upper(left(name, 1))) AS first_code,
       count(*)                    AS people
FROM employees
GROUP BY first_code
ORDER BY first_code;

The key practical move is to compare codes instead of letters. For example, pick users whose name starts with a Latin letter in the A through M range:

SELECT id, name
FROM users
WHERE ascii(upper(left(name, 1))) BETWEEN ascii('A') AND ascii('M');

chr: the character for a code

chr(integer) does the reverse: it returns the character for a given code point. In PostgreSQL this is a full Unicode code point, not only ASCII: chr(233) yields an accented e, chr(8364) the euro sign, and any code above 127 is encoded as UTF-8. The argument 0 is not allowed, and a value that is too large raises an error rather than returning a junk character.

SELECT chr(65);     -- A
SELECT chr(97);     -- a
SELECT chr(8364);   -- the euro sign

The idiom chr(ascii(x) + n) shifts a letter along the alphabet by adding an offset to its code. Generate labels A, B, C, ... to number rows in a report:

SELECT chr(ascii('A') + (n - 1)) AS label
FROM generate_series(1, 5) AS g(n);
-- A, B, C, D, E

The same approach builds letter codes for departments or short identifiers without a separate lookup table. Just keep the arithmetic inside the intended range: past ascii('Z') come not letters but the punctuation [, \, ], so an A-to-Z counter has to wrap or switch to two-letter labels by hand.

Control characters: newlines and tabs

chr is the standard way to produce invisible characters that are awkward or impossible to type into a string literal inline. The most common are the newline chr(10) and the tab chr(9); glue them onto your data with || and you assemble a multi-line value right in the query.

SELECT u.name || chr(10) || u.email AS contact_card
FROM users AS u
WHERE u.id = 1;

Assemble a CSV-like export of orders where rows are newline-separated and fields are tab-separated:

SELECT string_agg(
         o.id || chr(9) || o.amount || chr(9) || o.status,
         chr(10) ORDER BY o.id
       ) AS report
FROM orders AS o
WHERE o.status = 'paid';

Useful codes worth memorizing:

chr(9) is a horizontal tab;
chr(10) is a line feed (LF);
chr(13) is a carriage return (CR);
chr(13) || chr(10) is the CRLF pair for Windows compatibility.

Gotcha: the pair is symmetric in PostgreSQL, but not in MySQL

Contrary to a common myth, PostgreSQL ascii does not truncate a character down to a byte. In a UTF-8 database both functions operate on full Unicode code points and are mutual inverses: for an accented e, ascii returns 233 and chr(233) reconstructs the same letter. The round trip chr(ascii(x)) is correct not only for ASCII but for any Unicode character, as long as the database encoding is UTF-8. The trap appears when you carry the code to another engine, where a function with the same name means something quite different.

-- round trip works for ASCII
SELECT chr(ascii('A'));   -- A

-- and for non-ASCII too: ascii returns the full code point
SELECT ascii(chr(233));   -- 233, not a byte value
SELECT chr(233);          -- e with acute accent

The real trap hides exactly here, in the differences between engines:

In MySQL the function is ASCII() and returns the value of the first byte, not the code point — the very truncation PostgreSQL does not do. For the full Unicode point use ORD(). The inverse is CHAR(N USING utf8mb4), and without a charset CHAR() returns a binary string.
In ClickHouse there is char() (it accepts several codes at once) and code-point helpers such as ord over UTF-8; the names and behavior differ from Postgres.

The gotcha with ascii and chr is almost never the PostgreSQL function itself but the edge cases and the move to another engine: first character versus the whole string, multi-byte Unicode, and differing expectations about encoding. Before migrating between PostgreSQL, MySQL, and ClickHouse, run a small table with NULL, an empty string, and a few non-ASCII characters — say Cyrillic and an emoji. Engines often agree on Latin letters and diverge exactly where the test data is too clean: code like ascii(name) will quietly return different numbers in PostgreSQL and MySQL for the very same letter é.

One more practical point is indexes. The expression ascii(upper(left(name, 1))) in a WHERE clause is computed per row, and a plain B-tree on name does not help. If such a filter sits on a hot path, build an expression index for it or move the first code into a generated column. On a small sample the difference is invisible, but on large tables a sequential scan over ascii(...) is expensive.

The practical takeaway: inside PostgreSQL, ascii and chr are symmetric and work with full Unicode, so for generating Latin labels and control characters they are reliable. The danger is not the pair itself but the name: ASCII means "first byte" in MySQL and "Unicode code point" in PostgreSQL, and you have to keep that difference in mind when you port queries.

SQL ascii and chr: Code Points and Characters

ascii: the code of the first character

chr: the character for a code

Control characters: newlines and tabs

Gotcha: the pair is symmetric in PostgreSQL, but not in MySQL

Related articles

Practice on real tasks