exfat upcase table for code points above U+FFFF (Was: Re: [PATCH] staging: exfat: add exfat filesystem code to staging)

Pali Rohár pali at kernel.org
Tue Apr 28 07:46:53 UTC 2020


On Monday 27 April 2020 11:49:13 Sasha Levin wrote:
> On Tue, Apr 21, 2020 at 11:30:45PM +0200, Pali Rohár wrote:
> > On Thursday 13 February 2020 16:18:47 Sasha Levin wrote:
> > > On Thu, Feb 13, 2020 at 01:06:56AM +0100, Pali Rohár wrote:
> > > > In released exFAT specification is not written how are Unicode code
> > > > points above U+FFFF represented in exFAT upcase table. Normally in
> > > > UTF-16 are Unicode code points above U+FFFF represented by surrogate
> > > > pairs but compression format of exFAT upcase table is not clear how to
> > > > do it there.
> > > >
> > > > Are you able to send question about this problem to relevant MS people?
> > > >
> > > > New Linux implementation of exfat which is waiting on mailing list just
> > > > do not support Unicode code points above U+FFFF in exFAT upcase table.
> > > 
> > > Sure, I'll forward this question on. I'll see if I can get someone from
> > > their team who could be available to answer questions such as these in
> > > the future - Microsoft is interested in maintaining compatiblity between
> > > Linux and Windows exFAT implementations.
> > 
> > Hello Sasha! Have you got any answer from exfat MS team about upcase
> > table for Unicode code points above U+FFFF?
> 
> Sorry for taking so long. This is my understanding from the Windows
> folks: Windows filesystems just don't support variable encoding length,
> and expect UCS-2 strings.

Ok, so should I understand your answer as exFAT upcase table does not
support representing Unicode code points above U+FFFF and therefore
exFAT implementation should expect that toupper(u) = u and tolower(u) = u
for any Unicode code point u in range [U+10000, U+10FFFF]? This is how
current exfat linux driver behave.


More information about the devel mailing list