Dup file output is in "Little-endian UTF-16 Unicode text" format - why?

Hello all...

I saved the "duplicate file" output to a file, and was trying to use Cygwin text-processing tools on it, but it wasn't working.

Looking into it, CCleaner appears to write the file as "Little-endian UTF-16 Unicode text" rather than plain ASCII or UTF-8, which is what most apps I know use, even on Windows.

Can someone tell me why CCleaner uses this text flavor? And how to change it to use UTF-8 instead?

- Tim

As to why, no idea.

As for converting, Notepad++ has a "Convert to UTF-8" option under its Encoding menu.

Yeah, I used the Cygwin utility 'iconv' to convert it. But it would be nice if I didn't have to do that.
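
For anyone without iconv handy, here is a rough Python sketch of the same conversion (roughly what `iconv -f UTF-16 -t UTF-8` does). The function name and the file names in the usage note are made up for illustration, and it assumes the input starts with a byte order mark, as the CCleaner output does.

```python
def utf16_to_utf8(src_path, dst_path):
    """Re-encode a BOM-prefixed UTF-16 text file as UTF-8 (no BOM)."""
    # The "utf-16" codec reads the byte order mark to pick the endianness
    # and drops it from the decoded text. newline="" leaves line endings
    # (e.g. CRLF) exactly as they were in the source file.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    # Write plain UTF-8 without a BOM, again leaving line endings untouched.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

Something like `utf16_to_utf8("dupes.txt", "dupes-utf8.txt")` should then give a file that grep, sort, and friends can handle. It reads the whole file into memory, which is fine for a duplicate list but not ideal for huge files.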

Who would I ask about why, if not here? File a support ticket?

- Tim

You can only try and see if they respond.

Regular old ASCII only supports the basic unaccented Latin letters, and half the world uses some other script. I'm no expert, but to quote Wikipedia: ' UTF-16 is used for text in the OS API in Microsoft Windows 2000 onwards', and ' UTF-16 is the native internal representation of text in the Microsoft Windows NT', which is the same thing I guess. So UTF-16 is what Windows uses.

The duplicate file txt output starts with the byte order mark FF FE, indicating that it is little-endian. Wikipedia again: '... the application is expected to figure out what encoding to use when reading text data.'
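
To illustrate what "figuring out the encoding" from the byte order mark can look like, here is a minimal Python sketch. The function name `sniff_bom` is my own invention, and this is only a guess at the general technique, not how Notepad or any other particular program actually does it.

```python
import codecs

def sniff_bom(data):
    """Return a Python codec name based on a leading BOM, or None."""
    # UTF-32 marks are checked first because the UTF-32 LE mark
    # (FF FE 00 00) begins with the UTF-16 LE mark (FF FE).
    candidates = (
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    )
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    # No BOM found: the caller has to guess or assume a default.
    return None
```

Feeding it the first few bytes of the duplicate file output (FF FE followed by text) should report little-endian UTF-16, while a plain ASCII or BOM-less UTF-8 file yields None.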

The duplicate file txt output opens fine in Notepad, which reads the byte order mark and interprets the file accordingly. I don't know why Cygwin can't do the same. I think that the bug lies with an application with a name beginning with one C.