Date: 2019-09-07 12:16 am (UTC)
frith: Yellow pony with yellow mane, suspicious look (FIM Applejack)
From: [personal profile] frith
In the third article, I seem to have missed the part where they state what exactly it is that they've converted to binary and subsequently measured the bit rate of. The Unicode characters of the translated texts? A sound recording? What format? ALAC? WAV? MP3? Which quality? The date on the article is not April 1st, so someone somewhere thought 'speech bitrate' made sense.

Date: 2019-09-08 11:01 pm (UTC)
frith: Yellow cartoon pony with yellow mane, sad (FIM Applejack sad)
From: [personal profile] frith
Ah, yes, the syllables, so sound, not unicode characters. I should have paid better attention. My query still stands since, as far as I know, human speech is not binary.

Date: 2019-09-09 02:59 pm (UTC)
frith: Yellow pony with yellow mane, stunned look. (FIM Applejack stunned)
From: [personal profile] frith
Oy vey, so many ways to codify speech, from the abstraction of letters, through phonemes and syllabaries, to ideograms. Since the focus here is on sounded syllables, it's syllabaries we need. Chris Barker ran a routine to find out how many symbols it would take to provide a syllabary for English. Output: 15,831 distinct syllables. This Wikipedia article rounds that off to "over 10,000 different possibilities for individual syllables". The AAAS article cites 6,949 syllables for English (13 bits to cover), 10 times more than Japanese (10 bits to cover).
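
For anyone who wants to check the "bits to cover" figures, here is a minimal sketch. The function name bits_to_cover is my own, and the Japanese count of 695 is an assumption back-figured from "10 times fewer than English", not a number quoted from the article.

```python
def bits_to_cover(n_symbols: int) -> int:
    """Smallest number of bits whose distinct codes can label n_symbols items."""
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, with no float rounding.
    return (n_symbols - 1).bit_length()


english_aaas = bits_to_cover(6949)     # syllable count cited in the AAAS article
english_barker = bits_to_cover(15831)  # Chris Barker's count
japanese_guess = bits_to_cover(695)    # assumed: roughly a tenth of 6,949

print(english_aaas, english_barker, japanese_guess)
```

Note that Barker's larger count would need one more bit than the AAAS figure, so the "bits to cover" number depends on whose syllable inventory you trust.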

I don't know how many distinct syllables exist in the spoken Chinese languages, but apparently, Chinese languages are monosyllabic.

What's missing is how "they calculated the information density of each language" in their 17 translations of their texts. It's not from the number of syllables uttered per second and it's not from how long it took to read the texts.

Example: several Germans read a text in German (5.5 syllables per second), and it takes them an average of 30 seconds to express the text clearly. Information content of the text = (information transmission rate per syllable) × (number of syllables spoken) = [(universal information transmission rate of 39 units per second) ÷ (5.5 syllables/second)] × [(30 seconds) × (5.5 syllables/second)] = 1170 units.

That can't be right: it implies that all translations of the text will take the exact same amount of time to read. After all, they all have the exact same amount of information and information is universally transmitted at "39 bits per second".
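
A rough sketch of why the syllable rate drops out of that arithmetic. This is my own back-of-envelope reconstruction, not the article's method; the 8.0 syllables/second figure for the second language is a made-up value for illustration only.

```python
UNIVERSAL_RATE = 39.0  # units per second, the figure quoted from the article


def information_content(syllables_per_second: float, reading_seconds: float) -> float:
    """Content = (units per syllable) * (syllables spoken), as in the German example."""
    units_per_syllable = UNIVERSAL_RATE / syllables_per_second
    syllables_spoken = reading_seconds * syllables_per_second
    return units_per_syllable * syllables_spoken


def reading_time(content_units: float) -> float:
    """Time to read a text if every language transmits at the same universal rate."""
    return content_units / UNIVERSAL_RATE


german = information_content(5.5, 30.0)       # figures from the example above
hypothetical = information_content(8.0, 30.0)  # invented faster language, same duration

# Both come out at about 1170 units (up to float rounding): the syllable rate cancels,
# so the content is just UNIVERSAL_RATE * reading_seconds.
print(round(german, 6), round(hypothetical, 6), round(reading_time(german), 6))
```

Under this reading, the syllable rate never affects the result, which is exactly why every translation would have to take the same 30 seconds.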

To summarize, Italians chirp for a long time while Germans just let out a long grunt, but in the end, they've transmitted the same amount of information, and about as fast as the brain can handle it. Eureka! Let's publish an article! 9_6

Giving the result in bits/second just annoys me. They made a measurement based on something they can't measure (information density) and camouflaged it in a glib bitrate reference and that had me believing they'd actually measured something inherent in human speech, something other than syllables per second.

That article would have been more interesting and informative if it had stated that the vehicle of speech is the uttered syllable, that the syllabaries of some languages are more information rich than others and that speech rates are adjusted to convey information at roughly the same rate regardless of the language spoken. Then trotted out the methods, without any of this hipster bitrate nonsense.
Edited (Tweaked the math) Date: 2019-09-10 03:29 pm (UTC)
