I have a very simple question since I couldn't find an answer in the documentation.
How do we convert a UTF8 string into uppercase? string.upper("абв") doesn't work for Cyrillic and non-Latin scripts.
I know it can be done using 3rd party libs but is there something in the API that I'm missing?
UTF8 uppercase
Re: UTF8 uppercase
You're not missing anything that I know of. The conversion function is fairly simple though, given the appropriate table:
Here are the links in clickable form:
https://github.com/starwing/luautf8/blo ... ta.h#L2448 - table
http://www.unicode.org/copyright.html - Unicode® Inc. terms of use. The UCD in particular has this license:
http://www.unicode.org/copyright.html#License
local utf8 = require'utf8'
local toupper_table = {}
for k, v in ipairs{
  -- copy here lines 2448 to 2615 inclusive from
  -- https://github.com/starwing/luautf8/blob/e953d23/unidata.h
  -- (the file is generated from the UCD database, so it's
  -- copyright Unicode® Inc; the terms of use are here:
  -- http://www.unicode.org/terms_of_use.html
  -- but I guess starwing would appreciate being credited too
  -- as the author of the conversion program)
} do
  for j = v[1], v[2], v[3] do
    toupper_table[utf8.char(j)] = utf8.char(j + v[4])
  end
end

local function toupper(s)
  return (string.gsub(s, utf8.charpattern, toupper_table))
end

print(toupper("абв"))
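For anyone who wants to try the approach before pasting in the full table, here is a minimal sketch with a handful of rows filled in. The rows in `sample_ranges` are hand-written for illustration, assuming the same `{ first, last, step, offset }` shape the loop above expects. They are not copied from unidata.h and cover only a few ranges, so anything outside them passes through unchanged. The name `sample_ranges` is mine, not part of any library.

```lua
-- Lua 5.3+ exposes utf8 as a global; LÖVE ships it as a module.
local utf8 = utf8 or require("utf8")

-- Hypothetical sample rows: { first, last, step, offset }.
-- Assumption: same layout as the rows in the linked table.
local sample_ranges = {
  { 0x61,  0x7A,  1, -32 },  -- a..z -> A..Z
  { 0x101, 0x12F, 2, -1  },  -- ā, ă, ... (every second code point)
  { 0x430, 0x44F, 1, -32 },  -- а..я -> А..Я
  { 0x450, 0x45F, 1, -80 },  -- ѐ..џ -> Ѐ..Џ
}

-- Expand the ranges into a lookup keyed by the UTF-8 encoded character.
local toupper_table = {}
for _, v in ipairs(sample_ranges) do
  for cp = v[1], v[2], v[3] do
    toupper_table[utf8.char(cp)] = utf8.char(cp + v[4])
  end
end

-- gsub keeps the original match when the table has no entry for it.
local function toupper(s)
  return (string.gsub(s, utf8.charpattern, toupper_table))
end

print(toupper("абв"))    --> АБВ
print(toupper("ābc"))    --> ĀBC
print(toupper("héllo"))  --> HéLLO (é is not in the sample rows)
```

The step field matters for ranges like the second row, where lowercase and uppercase letters alternate and only every other code point should be mapped.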
Re: UTF8 uppercase
Was really hoping for something more elegant but thanks, it works!