Question about UTF-8 and file:getFilename

easy82
Party member
Posts: 184
Joined: Thu Apr 18, 2013 10:46 pm
Location: Hungary

Re: Question about UTF-8 and file:getFilename

Post by easy82 »

Never mind, Google is my friend. :)

main.lua

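Code:

local filename = ""

-- Decode a URL-escaped string (adapted from Programming in Lua, section 20.3).
local function unescape(s)
  -- "+" means a space in form encoding; a file name may contain a literal "+".
  s = string.gsub(s, "%+", " ")
  -- Turn each %XX escape into the byte with that hexadecimal value.
  s = string.gsub(s, "%%(%x%x)", function (h)
    return string.char(tonumber(h, 16))
  end)
  return s
end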

function love.filedropped(file)
  filename = file:getFilename()
end

function love.draw()
  love.graphics.print("Original: " .. filename, 10, 10)
  love.graphics.print("Unescaped: " .. unescape(filename), 10, 30)
end
Source: https://www.lua.org/pil/20.3.html
Positive07
Party member
Posts: 1014
Joined: Sun Aug 12, 2012 4:34 pm
Location: Argentina

Re: Question about UTF-8 and file:getFilename

Post by Positive07 »

Well, your solution is good, but this is a problem with your file manager, and you should be careful with this solution: I could have a file literally named Let%C3%B6lt%C3%A9sek, and your solution would turn it into Letöltések (I think), which is not what I'd expect. Basically the inverse of your problem.

By covering one edge case you create another, so the best solution would be either to make this a configuration option or not to depend on it... Reporting the "error" (it may be intended behaviour, though) to the developers of the file manager may also be a good idea, I don't know.
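
To make the ambiguity concrete, here is a minimal sketch in plain Lua. The `unescape` helper follows the same idea as the one posted earlier (without the "+" handling), and the `assume_escaped` flag is a hypothetical configuration switch for illustration, not something LÖVE provides.

```lua
-- Percent-decoding helper, same idea as the one posted earlier in the thread.
local function unescape(s)
  -- The parentheses keep only the string and drop gsub's match count.
  return (s:gsub("%%(%x%x)", function(h)
    return string.char(tonumber(h, 16))
  end))
end

-- Hypothetical option: only decode when the caller says names are escaped.
local function display_name(name, assume_escaped)
  if assume_escaped then
    return unescape(name)
  end
  return name
end

-- A file whose real name contains literal percent signs.
local literal = "Let%C3%B6lt%C3%A9sek"

print(display_name(literal, false)) -- the name is left untouched
print(display_name(literal, true))  -- decoded, so no longer the real name
```

Nothing in the string itself says whether it was escaped, which is why the decision has to come from outside, for example from a setting.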
zorg
Party member
Posts: 3444
Joined: Thu Dec 13, 2012 2:55 pm
Location: Absurdistan, Hungary
Contact:

Re: Question about UTF-8 and file:getFilename

Post by zorg »

easy82 wrote:
raidho36 wrote:Does it escape any Unicode strings or just file paths?
I don't know, I was testing this through love.filedropped() and file:getFilename().
Test it then, just to be sure.
easy82 wrote:
zorg wrote:To me, what's interesting is that the text is not only encoded as UTF-8, but that it isn't plain UTF-8 either: it's escaped per character with % signs, like in HTML; very strange.
Do you know any good way to unescape this string in Lua?
It can be done, but I don't know whether that's a solution, and not just a workaround...

Code:

string.gsub(filename, "%%(%x%x)", function (h) return string.char(tonumber(h, 16)) end) -- untested, but should work.
easy82
Party member
Posts: 184
Joined: Thu Apr 18, 2013 10:46 pm
Location: Hungary

Re: Question about UTF-8 and file:getFilename

Post by easy82 »

Positive07 wrote:Well, your solution is good, but this is a problem with your file manager, and you should be careful with this solution: I could have a file literally named Let%C3%B6lt%C3%A9sek, and your solution would turn it into Letöltések (I think), which is not what I'd expect. Basically the inverse of your problem.

By covering one edge case you create another, so the best solution would be either to make this a configuration option or not to depend on it... Reporting the "error" (it may be intended behaviour, though) to the developers of the file manager may also be a good idea, I don't know.
I see your point...!
zorg wrote:It can be done, but I don't know whether that's a solution, and not just a workaround...
Yes it's just a workaround. :)

Do you guys think this is a bug that I should report?
pgimeno
Party member
Posts: 3550
Joined: Sun Oct 18, 2015 2:58 pm

Re: Question about UTF-8 and file:getFilename

Post by pgimeno »

The URL encoding is probably not LÖVE's fault. I'm unable to reproduce it. [Edit: see below] So the question is: who should you report it to?

Well, there's the possibility that there's some kind of flag indicating whether the input name is URL-encoded and LÖVE is not honouring it.

Edit: I could not reproduce it by dragging from xfe, but I could by dragging from the GTK2 file dialog. Now I have a basis to investigate further.
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: Question about UTF-8 and file:getFilename

Post by raidho36 »

So some file managers escape Unicode and some don't. I assume that has to do with the file:// URI scheme, which allows escaping non-ASCII characters, or in fact any characters. I think the proper solution is for LÖVE to unescape file paths internally before returning them.
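
As a rough illustration of what such unescaping could look like, the sketch below strips a `file://` prefix and percent-decodes the rest. The function name `uri_to_path` and the example path are hypothetical, and this is not a description of how LÖVE or SDL actually handle dropped files.

```lua
-- Hypothetical helper: turn a file:// URI into a plain path.
local function uri_to_path(uri)
  -- Drop the scheme and an optional host part ("file://" or "file://localhost").
  -- Strings that are not file:// URIs are passed on as they are.
  local path = uri:match("^file://[^/]*(/.*)$") or uri
  -- Decode every %XX escape into the byte it stands for.
  return (path:gsub("%%(%x%x)", function(h)
    return string.char(tonumber(h, 16))
  end))
end

print(uri_to_path("file:///home/user/Let%C3%B6lt%C3%A9sek/test.txt"))
```

Doing this in one place, before the name ever reaches user code, would avoid every game having to guess whether a given name is escaped.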
pgimeno
Party member
Posts: 3550
Joined: Sun Oct 18, 2015 2:58 pm

Re: Question about UTF-8 and file:getFilename

Post by pgimeno »

I've traced the problem to the SDL library. It seems fixed in a newer version.
easy82
Party member
Posts: 184
Joined: Thu Apr 18, 2013 10:46 pm
Location: Hungary

Re: Question about UTF-8 and file:getFilename

Post by easy82 »

pgimeno wrote:I've traced the problem to the SDL library. It seems fixed in a newer version.
Cool, thanks for the investigation!

The only problem with this is that the following code will not read the file with special characters in its full path:
main.lua

Code:

function love.filedropped(file)
  local ok, err = file:open("r")
  local data = nil
  
  if not ok then
    print(err)
  else
    data = file:read()
    file:close()
  end
end
So you have to make a new file object or use the Lua io.* functions instead.
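
A minimal sketch of the io.* route, assuming the dropped file's name has already been unescaped into a real path on disk. `read_whole_file` is a hypothetical helper name, not part of LÖVE or Lua.

```lua
-- Hypothetical helper: read a file's full contents with the standard io library.
-- Returns the data on success, or nil plus an error message on failure.
local function read_whole_file(path)
  -- "rb" avoids newline translation on platforms that do it.
  local f, err = io.open(path, "rb")
  if not f then
    return nil, err
  end
  local data = f:read("*a") -- read the entire file
  f:close()
  return data
end
```

Inside love.filedropped this could be called as read_whole_file(unescape(file:getFilename())), using the unescape function from the earlier post.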
pgimeno
Party member
Posts: 3550
Joined: Sun Oct 18, 2015 2:58 pm

Re: Question about UTF-8 and file:getFilename

Post by pgimeno »

You can always upgrade your SDL ;)
zorg
Party member
Posts: 3444
Joined: Thu Dec 13, 2012 2:55 pm
Location: Absurdistan, Hungary
Contact:

Re: Question about UTF-8 and file:getFilename

Post by zorg »

True, but did something like this work?

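Code:

function love.filedropped(file)
  -- The outer parentheses keep only gsub's first return value (the string),
  -- so the match count is not passed to newFile as a second argument.
  local path = (file:getFilename():gsub("%+", " "):gsub("%%(%x%x)", function (h)
    return string.char(tonumber(h, 16))
  end))
  local unescaped = love.filesystem.newFile(path)
  local ok, err = unescaped:open("r")
  if not ok then
    print(err)
    return
  end
  local data = unescaped:read()
  unescaped:close()
end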
Also, earlier you wrote "/home/user/Let%F6lt%E9sek/test.txt", with a single escaped byte per character, meaning the encoding was not UTF-8 but some 8-bit codepage (Latin-2 or equivalent). Later, Positive07 wrote "Let%C3%B6lt%C3%A9sek", which is the escaped UTF-8 encoding. Technically those are two separate issues; it would be nice to know whether the second case also happens.
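
One way to tell the two cases apart after unescaping is to check whether the resulting bytes form well-formed UTF-8. The sketch below is a simplified check written for illustration: `looks_like_utf8` is a hypothetical name, and it only validates lead and continuation byte structure, not overlong forms or surrogates.

```lua
-- Simplified structural UTF-8 check (hypothetical helper).
local function looks_like_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b = s:byte(i)
    local extra -- number of continuation bytes expected after this byte
    if b < 0x80 then extra = 0
    elseif b >= 0xC2 and b <= 0xDF then extra = 1
    elseif b >= 0xE0 and b <= 0xEF then extra = 2
    elseif b >= 0xF0 and b <= 0xF4 then extra = 3
    else return false end
    for j = i + 1, i + extra do
      local c = s:byte(j)
      -- Each continuation byte must exist and lie in 0x80..0xBF.
      if not c or c < 0x80 or c > 0xBF then return false end
    end
    i = i + extra + 1
  end
  return true
end

-- "Letöltések" decoded from the two escaped forms quoted above.
print(looks_like_utf8("Let\195\182lt\195\169sek")) -- from %C3%B6 / %C3%A9
print(looks_like_utf8("Let\246lt\233sek"))         -- from %F6 / %E9
```

A name that fails the check was probably escaped from an 8-bit codepage and would need converting before LÖVE can print it.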