Universal Text Codec – Base16 to Base91, URL, HTML Entity, Unicode Escape, ROT13 Online
What is the Universal Text Codec?
The Universal Text Codec encodes or decodes any text or binary data across 37 formats simultaneously. Paste a string and instantly see results in every supported encoding — all computed locally in your browser with no data sent to any server.
Whether you are debugging a JWT token, encoding a Bitcoin wallet address, preparing a QR code payload, escaping C firmware strings, inspecting UTF-16 byte order, reading Morse code, or normalizing Unicode — this tool eliminates the need to open multiple single-purpose converters.
Base64 and Base64url — JWT, Email, Data URIs
Base64 (RFC 4648 §4) encodes every 3 bytes as 4 characters from A–Z, a–z, 0–9, +, / with = padding — 33% overhead. Base64url replaces + with - and / with _ for safe use in URLs and HTTP headers.
Example — 'Hello' in Base64:
- Input bytes: 48 65 6C 6C 6F
- Base64: SGVsbG8=
- Base64url: SGVsbG8= (same when no + or / appear)
- JWT tokens use Base64url for header and payload sections, with the trailing = padding omitted (SGVsbG8)
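The difference between the two alphabets only shows up when the 6-bit values 62 or 63 occur. A minimal Python sketch using the standard `base64` module; the two-byte `tricky` input is an illustrative value chosen to force both characters to appear, not something taken from the tool:

```python
import base64

data = b"Hello"
std = base64.b64encode(data).decode("ascii")          # standard alphabet (+ and /)
url = base64.urlsafe_b64encode(data).decode("ascii")  # URL-safe alphabet (- and _)

# Illustrative input whose 6-bit groups are 62, 63, 60: the alphabets now differ.
tricky = bytes([0xFB, 0xFF])
tricky_std = base64.b64encode(tricky).decode("ascii")
tricky_url = base64.urlsafe_b64encode(tricky).decode("ascii")

print(std, url)                # SGVsbG8= SGVsbG8=
print(tricky_std, tricky_url)  # +/8= -_8=
```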
Base58, Base62 and Crockford Base32
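These three human-readable encodings avoid punctuation, and Base58 and Crockford Base32 additionally remove visually ambiguous characters for safe manual transcription. Base62 keeps the full alphanumeric set (0-9, A-Z, a-z).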
Base58 (Bitcoin) alphabet — 58 characters, no 0 O I l + /:
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
Crockford Base32 alphabet — 0-9 A-Z without I L O U:
0123456789ABCDEFGHJKMNPQRSTVWXYZ
Crockford Base32 is used in ULID (Universally Unique Lexicographically Sortable Identifier) and produces output that sorts correctly as a string.
All three use BigInteger arithmetic (O(n²)) and are limited to 4 KB input. For large binary data, use Base64 or Base85.
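The big-integer approach can be sketched as follows: the whole input is treated as one large number and repeatedly divided by 58, which is why the cost grows quadratically with input length. This is a minimal illustration using Python's built-in arbitrary-precision `int`, not the tool's own implementation; the function name `base58_encode` is hypothetical, and the leading-zero handling follows the Bitcoin convention.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58 by repeated division of one big integer."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)          # each division touches the whole number
        digits.append(B58_ALPHABET[rem])
    # Bitcoin convention: every leading zero byte becomes a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(digits))

print(base58_encode(b"Hello"))  # 9Ajdvzr
```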
Binary String and C Escape Sequences
Embedded firmware developers frequently need to represent binary data as C source literals or binary digit patterns.
'Hello' in binary and C escape forms:
- Binary String: 01001000 01100101 01101100 01101100 01101111
- C Hex Escape: \x48\x65\x6C\x6C\x6F (paste into C/C++ string literals)
- C Octal Escape: \110\145\154\154\157 (POSIX / ANSI terminal sequences)
Use Hex Bytes input mode to feed raw binary data (e.g. DE AD BE EF) and see it in all formats at once.
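All three forms are per-byte formatting of the same data. A short Python sketch that reproduces the 'Hello' values above (variable names are illustrative):

```python
data = "Hello".encode("utf-8")

binary = " ".join(f"{b:08b}" for b in data)   # 8 binary digits per byte
c_hex = "".join(f"\\x{b:02X}" for b in data)  # \xHH, two hex digits per byte
c_oct = "".join(f"\\{b:03o}" for b in data)   # \ooo, three octal digits per byte

print(binary)
print(c_hex)
print(c_oct)
```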
UTF-16 and UUencode
UTF-16 Big Endian and Little Endian
'Hello' in UTF-16:
- UTF-16 BE: 00 48 00 65 00 6C 00 6C 00 6F (macOS, Java, network byte order)
- UTF-16 LE: FF FE 48 00 65 00 6C 00 6C 00 6F 00, BOM included (Windows, .NET, NTFS)
Supplementary plane characters (U+10000 and above) are encoded as surrogate pairs: two 16-bit code units, so 4 bytes per character.
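The byte orders and the surrogate-pair rule can be checked with Python's built-in codecs. This is a small sketch; the `hexdump` helper is a hypothetical name, and U+1F600 is just an example of a supplementary plane character:

```python
def hexdump(raw: bytes) -> str:
    """Format bytes as space-separated uppercase hex."""
    return raw.hex(" ").upper()

text = "Hello"
be = text.encode("utf-16-be")                    # big endian, no BOM
le_bom = b"\xff\xfe" + text.encode("utf-16-le")  # little endian, BOM prepended

# U+1F600 lies above U+FFFF, so it becomes a high/low surrogate pair.
pair = "\U0001F600".encode("utf-16-be")

print(hexdump(be))      # 00 48 00 65 00 6C 00 6C 00 6F
print(hexdump(le_bom))  # FF FE 48 00 65 00 6C 00 6C 00 6F 00
print(hexdump(pair))    # D8 3D DE 00
```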
UUencode — Legacy Unix Email Encoding
UUencode encodes 3 bytes as 4 characters in ASCII range 32–95 and wraps output in begin 644 data / end headers. Historically used to share binary files on Usenet before MIME Base64 became standard.
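A single uuencoded line can be produced with Python's `binascii.b2a_uu`, which handles up to 45 input bytes per call and prefixes the line with a length character. The framing below is a sketch that assumes the common backtick form of the terminating zero-length line; the input `b"Cat"` is an illustrative value.

```python
import binascii

payload = b"Cat"
line = binascii.b2a_uu(payload)  # length char + 4 chars per 3 bytes + newline

# Assumed framing: header, body line(s), zero-length line, trailer.
framed = b"begin 644 data\n" + line + b"`\nend\n"

print(line)                       # b'#0V%T\n'
print(binascii.a2b_uu(line))      # b'Cat'
```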
Text Transforms — JSON, QP, ROT47, Atbash, Morse, Unicode
JSON String Escape
- " → \" · \ → \\ · newline → \n
- Control chars below U+0020 → \uXXXX numeric escape
- Use case: embedding multi-line strings or user content in JSON payloads
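These rules match what a standard JSON serializer emits for a string value. A brief Python check with `json.dumps`; the sample string is made up to contain a quote, a backslash, a newline, and the control character U+0001:

```python
import json

raw = 'a"b\\c\nd\x01'       # quote, backslash, newline, U+0001
escaped = json.dumps(raw)   # JSON string literal, including the outer quotes

print(escaped)              # "a\"b\\c\nd\u0001"
assert json.loads(escaped) == raw  # escaping is lossless
```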
Quoted-Printable (RFC 2045)
Encodes non-ASCII bytes (and the = sign itself) as =XX while leaving other printable ASCII unchanged. Used for MIME email bodies where the text is mostly readable ASCII. Lines are soft-wrapped at 76 characters with an =\r\n continuation.
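As a rough illustration, Python's standard `quopri` module applies the same =XX rule. The sample text is an assumption chosen to include a non-ASCII character and a literal equals sign:

```python
import quopri

original = "café = coffee".encode("utf-8")
encoded = quopri.encodestring(original)  # non-ASCII bytes and '=' become =XX

print(encoded)                           # b'caf=C3=A9 =3D coffee'
assert quopri.decodestring(encoded) == original
```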
ROT47 vs ROT13
'Hello, World! 123' through both ciphers:
- ROT13: Uryyb, Jbeyq! 123 (only letters rotate)
- ROT47: w6==@[ (@C=5P `ab (all printable ASCII 33–126 rotates)
Both are self-inverse: applying the same cipher twice returns the original.
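Both ciphers are fixed rotations: ROT13 shifts by 13 within a 26-letter alphabet, ROT47 by 47 within the 94 printable characters, so two applications complete a full cycle. A minimal Python sketch; `rot47` is a hand-written helper, while ROT13 uses the codec that ships with Python:

```python
import codecs

def rot47(text: str) -> str:
    """Rotate printable ASCII (33-126) by 47 positions; leave the rest alone."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:
            out.append(chr(33 + (code - 33 + 47) % 94))
        else:
            out.append(ch)
    return "".join(out)

sample = "Hello, World! 123"
r13 = codecs.encode(sample, "rot13")
r47 = rot47(sample)

# Applying either cipher twice returns the original text.
assert codecs.encode(r13, "rot13") == sample
assert rot47(r47) == sample
```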
Atbash Cipher
Mirror substitution: A↔Z, B↔Y, C↔X … a↔z, b↔y. Non-alphabetic characters pass through unchanged. Originally a Hebrew cipher, it is self-inverse like ROT13.
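The mirror substitution can be expressed as a translation table that maps each alphabet onto its reverse. A short Python sketch (the `atbash` function name is illustrative):

```python
import string

# Map A-Z onto Z-A and a-z onto z-a; everything else is left untouched.
_ATBASH_TABLE = str.maketrans(
    string.ascii_uppercase + string.ascii_lowercase,
    string.ascii_uppercase[::-1] + string.ascii_lowercase[::-1],
)

def atbash(text: str) -> str:
    return text.translate(_ATBASH_TABLE)

print(atbash("Hello, World!"))          # Svool, Dliow!
assert atbash(atbash("Hello")) == "Hello"  # self-inverse
```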
Morse Code (ITU-R M.1677)
'SOS' in Morse: ... --- ...
Letters within a word are separated by a space; words are separated by a / character. Supports A-Z, 0-9, and common punctuation.
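The separator convention can be sketched with a lookup table. This Python example covers only A-Z and 0-9 (punctuation is omitted for brevity), and the `morse_encode` name is hypothetical:

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}

def morse_encode(text: str) -> str:
    """Letters are joined by spaces, words by ' / '. Unknown chars raise KeyError."""
    words = text.upper().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)

print(morse_encode("SOS"))     # ... --- ...
print(morse_encode("hi you"))  # .... .. / -.-- --- ..-
```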
Unicode NFD and NFC Normalization
- NFD: é (U+00E9) → e (U+0065) + combining acute accent (U+0301)
- NFC: e + combining acute → é (precomposed form, W3C recommended)
- Uses the browser's native String.prototype.normalize(), which handles full Unicode
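The same round trip can be observed outside the browser. This sketch uses Python's `unicodedata.normalize`, which implements the same Unicode normalization forms as the JavaScript method named above:

```python
import unicodedata

composed = "\u00e9"                                    # é as one code point
decomposed = unicodedata.normalize("NFD", composed)    # e + U+0301
recomposed = unicodedata.normalize("NFC", decomposed)  # back to U+00E9

print([f"U+{ord(ch):04X}" for ch in decomposed])  # ['U+0065', 'U+0301']
assert recomposed == composed
```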
Emoji Encoding
Each byte value (0–255) maps to a unique emoji in the range U+1F300–U+1F3FF, the first 256 code points of the Unicode Miscellaneous Symbols and Pictographs block. The mapping is fully reversible and gives a distinctive visual representation of binary data.
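One way such a mapping could work is a fixed offset from U+1F300. The sketch below assumes that byte value b maps to code point U+1F300 + b; the tool's actual ordering is not stated in the text, so treat the offset scheme and the function names as assumptions.

```python
EMOJI_BASE = 0x1F300  # assumed start of the 256-symbol range

def emoji_encode(data: bytes) -> str:
    """Map each byte to one code point in U+1F300..U+1F3FF (assumed ordering)."""
    return "".join(chr(EMOJI_BASE + b) for b in data)

def emoji_decode(text: str) -> bytes:
    """Invert emoji_encode by subtracting the same offset."""
    return bytes(ord(ch) - EMOJI_BASE for ch in text)

encoded = emoji_encode(b"Hello")
assert emoji_decode(encoded) == b"Hello"  # reversible
```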
Frequently Asked Questions
Which encoding should I use for a JWT?
JWTs use Base64url (RFC 4648 §5), not standard Base64. The - and _ characters replace + and / making the token safe in URL paths and Authorization headers without further escaping.
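JWT segments are also written without = padding, so a decoder has to restore it first. A minimal Python sketch; the helper names are hypothetical, and the sample segment is the widely used example header for an HS256 token:

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """Base64url without trailing '=' padding, as JWT segments are written."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

header_segment = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
header = json.loads(b64url_decode(header_segment))
print(header)  # {'alg': 'HS256', 'typ': 'JWT'}
```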
How do I use Hex Bytes input mode?
Switch Input Mode to Hex Bytes and paste space-separated or continuous hex: DE AD BE EF or DEADBEEF. All 37 encodings recalculate from the raw binary payload — useful for firmware register dumps and network captures.
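Accepting both spaced and continuous hex amounts to dropping whitespace before parsing. A small Python sketch of that idea (the `parse_hex_bytes` name is illustrative, not the tool's code):

```python
def parse_hex_bytes(text: str) -> bytes:
    """Accept 'DE AD BE EF' or 'DEADBEEF' and return the raw bytes."""
    compact = "".join(text.split())  # remove all whitespace
    return bytes.fromhex(compact)    # raises ValueError on invalid hex

assert parse_hex_bytes("DE AD BE EF") == parse_hex_bytes("DEADBEEF")
print(parse_hex_bytes("DE AD BE EF"))  # b'\xde\xad\xbe\xef'
```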
Why are some encodings limited to 4 KB?
Base36, Base58, and Base62 use BigInteger arithmetic that scales as O(n²). These encodings are designed for small payloads (Bitcoin addresses are 25 bytes, IPFS CIDs are 34 bytes). For large binary data, use Base64 or Base85, which are block-based O(n) algorithms.