OK, *someone* on my friends list must have a clue about this. What determines the sort order in Mac OS X for non-alphanumeric characters? Why does _ sort before - before •? Any ideas?
In the Finder? The Finder has some very confusing sort behavior. Among other things, it seems to have a heuristic to sort items numerically (so 10.txt would come after 9.txt, even though the lexical sort order would reverse that).
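For what it's worth, that numeric heuristic is easy to sketch. This is only a minimal Python illustration, not the Finder's actual code: `natural_key` is a made-up helper name, and it assumes digit runs are compared as integers while everything else is compared as case-folded text.

```python
import re

def natural_key(name):
    """Split a name into text and digit runs; digit runs compare as integers."""
    parts = re.split(r"(\d+)", name)
    # The capturing group puts the digit runs at the odd indices.
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]

names = ["10.txt", "9.txt", "1.txt"]
lexical = sorted(names)                   # plain character-by-character order
natural = sorted(names, key=natural_key)  # digit runs ordered by value

print(lexical)
print(natural)
```

The lexical sort puts 10.txt ahead of 9.txt because "1" compares lower than "9"; the keyed sort compares 10 and 9 as numbers instead.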
Yup. I later found the same doc that lionsburg quoted here. The Unix layer uses the old lexical sort by octets, so UTF-8 stuff sorts to the end in ls output.
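A rough way to see the octet-order effect without touching the file system is to sort a few names by their UTF-8 bytes. This is a sketch under the assumption that `ls` compares raw bytes (as it would in the C locale); the file names are invented for the example.

```python
names = ["zebra", "•notes", "_draft", "Apple"]

# Sort by the raw UTF-8 octets of each name.
# "•" encodes as bytes above 0x7F, so it lands after every ASCII name.
by_octets = sorted(names, key=lambda s: s.encode("utf-8"))

print(by_octets)
```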
As was said above, it partly relies on "numeric" values for file names that contain digits. But in any case, it's the ASCII value (that is, the numeric value of each character). Start the Terminal (in the Finder choose Go > Utilities, then double-click Terminal) and type "man ascii", and you can see the same table in three different ways: octal, hex, and decimal. The character with the smaller value sorts first. For example, "_" is 95 (decimal) and "*" is 42, thus "*" sorts before "_". Of course that table only covers plain ASCII, so you won't see characters that are produced with the "Option" key, or anything else from Unicode. Most of the time, the easiest way to test is to create two temporary folders and name them with the two characters you are trying to compare, but if you want an idea of the entire character set, the table is your best bet for standard ASCII. (Sorry I don't have more info.)
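If you'd rather not read the man page, the same lookup can be done in a couple of lines of Python. This is just an illustration of plain code-point order (the rule described above), which is not necessarily what the Finder does; the list of characters is picked from the ones mentioned in this thread.

```python
# Print the numeric value of each character, in decimal and hex.
for ch in ["*", "-", "_", "•"]:
    print(f"{ch!r}: decimal {ord(ch)}, hex {ord(ch):#x}")

# Sorting bare strings in Python compares code points, smallest first.
by_code_point = sorted(["_", "-", "•", "*"])
print(by_code_point)
```

Note that "•" is not in the ASCII table at all; its value here is its Unicode code point.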
See, that sounds so reasonable, except that "_" sorts before "*", at least on my machine! The reason this came up is that I was creating folders to test the sort order, but then I got curious about the logic behind it, and now I can't seem to just drop it. :)
``As you enter the programmers' chamber, an echo is heard saying "That is *not* a bug, it's a feature!!!" [...]'' ;-)
The best explanation I can offer is that an awful lot of files and directories in Unix get named with a first character like "_" or "~", and they wanted those to sort, say, first or last. But here I'm just talking thru my hat, and it really makes very little sense, if any. Actually, let me stop right here and now and use the phrase Dave keeps trying to train me to use: "I don't know." There, I said it. But now you made me very curious too! ;-)
Here's a quote from an Apple Developer reference doc:
The Finder in Mac OS X takes advantage of some sanctioned ways for altering the default sorting behavior defined by the Unicode standard. In particular, the Finder supports the following sorting rules:
Punctuation and symbols are significant for sorting.
Digit sub-strings are sorted by numeric value rather than as characters.
Case is insignificant.
The first list item is the key. The "•" (bullet, U+2022) symbol will take precedence over "_" since "_" is not considered, by Unicode standards for the Latin alphabet, a punctuation mark or a symbol.
no subject
Date: 2005-03-09 07:11 pm (UTC)
no subject
Date: 2005-03-09 07:15 pm (UTC)
no subject
Date: 2005-03-09 07:38 pm (UTC)
no subject
Date: 2005-03-09 09:35 pm (UTC)
no subject
Date: 2005-03-10 01:15 am (UTC)
no subject
Date: 2005-03-09 08:32 pm (UTC)
no subject
Date: 2005-03-09 09:38 pm (UTC)
no subject
Date: 2005-03-09 10:34 pm (UTC)
And the answer is....
Date: 2005-03-09 10:01 pm (UTC)
Re: And the answer is....
Date: 2005-03-09 10:44 pm (UTC)
There's a bit more to it. The Finder is using a Unicode collation chart to do its sorting. Here's the chart for Latin-1:
http://developer.mimer.com/collations/charts/UCA_latin-1.htm
You'll notice "_" comes before "*" in the chart.
Re: And the answer is....
Date: 2005-03-10 12:28 pm (UTC)