Escape hex like \\u… in Kotlin strings

Issue

I have a string "\ufffd\ufffd hello\n".

I have code like this:

    fun main() {
      val bs = "\ufffd\ufffd hello\n"
      println(bs) // �� hello
    }

and I want to see "\ufffd\ufffd hello" printed instead. How can I escape \u for every hex value?

UPD:

val s = """\uffcd"""
val unicodeRegex = """(?<!\\\\)(\\\\\\\\)*(\\u)([A-Fa-f\\d]{4})""".toRegex()
return s.replace(unicodeRegex, """$1\\\\u$3""")

Solution

(I’m interpreting the question as asking how to clearly display a string that contains non-printable characters.  The Kotlin compiler converts sequences of a \u followed by 4 hex digits in string literals into single characters, so the question is effectively asking how to convert them back again.)
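To illustrate that point, here is a minimal sketch (the variable names are only for illustration) comparing a normal literal, a literal with a doubled backslash, and a raw string. Only the first one is turned into a single character by the compiler:

```kotlin
val compiled = "\ufffd"    // the compiler turns the escape into one character
val escaped = "\\ufffd"    // a doubled backslash keeps the six literal characters
val raw = """\ufffd"""     // raw strings leave backslashes untouched

fun main() {
    println(compiled.length) // 1
    println(escaped.length)  // 6
    println(escaped == raw)  // true
}
```

So once the string has been compiled, the original \u text is gone, and the only way to show it again is to rebuild it from the character's code.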

Unfortunately, there’s no built-in way of doing this.  It’s fairly easy to write one, but it’s a bit subjective, as there’s no single definition of what’s ‘printable’…

Here’s an extension function that probably does roughly what you want:

// Replaces characters in non-printable categories with \uXXXX escapes.
fun String.printable() = map {
    when (Character.getType(it).toByte()) {
        Character.CONTROL, Character.FORMAT, Character.PRIVATE_USE,
        Character.SURROGATE, Character.UNASSIGNED, Character.OTHER_SYMBOL
            -> "\\u%04x".format(it.code) // it.code needs Kotlin 1.5+; older versions use it.toInt()
        else -> it.toString()
    }
}.joinToString("")

println("\ufffd\ufffd hello\n".printable()) // prints ‘\ufffd\ufffd hello\u000a’

The sample string in the question is a bad example, because \uFFFD is the replacement character — a black diamond with a question mark, usually shown in place of any non-displayable characters.  So the replacement character itself is displayable!

The code above treats it as non-displayable by including Character.OTHER_SYMBOL in the list of escaped types, but that will also escape many other symbols.  So you’ll probably want to remove it, leaving just the other 5 types.  (I got those from this answer.)

Because the trailing newline is non-displayable, that gets converted to a hex code too.  You could extend the code to handle the escape codes \t, \b, \n, \r and maybe \\ too if needed.  (You could also make it more efficient… this was done for brevity!)
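As a rough sketch of those suggestions, the variant below handles the short escape codes first, drops Character.OTHER_SYMBOL so that symbols such as \uFFFD are left alone, and appends to a single StringBuilder instead of building intermediate strings. The name printableWithEscapes is made up for this example, and it assumes Kotlin 1.5+ on the JVM (for Char.code and java.lang.Character):

```kotlin
// Hypothetical helper; the name is not part of the Kotlin standard library.
fun String.printableWithEscapes(): String = buildString {
    for (c in this@printableWithEscapes) {
        when (c) {
            // Short escape codes are emitted in their familiar form.
            '\t' -> append("\\t")
            '\b' -> append("\\b")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\\' -> append("\\\\")
            else -> when (Character.getType(c).toByte()) {
                // The remaining five categories are written as \uXXXX.
                Character.CONTROL, Character.FORMAT, Character.PRIVATE_USE,
                Character.SURROGATE, Character.UNASSIGNED -> append("\\u%04x".format(c.code))
                else -> append(c)
            }
        }
    }
}

fun main() {
    println("a\tb\\c\u0007\n".printableWithEscapes()) // prints a\tb\\c\u0007\n
}
```

Because it still works one Char at a time, characters outside the Basic Multilingual Plane (stored as surrogate pairs) come out as two \u escapes; whether that is what you want is again a matter of taste.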

Answered By – gidds
