6 Replies ・ Started by PxyGnomes at 2024-02-24 10:57:47 UTC ・ Last reply by PxyGnomes at 2024-03-04 23:41:28 UTC

牙(きば)radical showing incorrect number of strokes in the Radicals kanji-composing panel at Jisho.org?

Hello! I'm really enjoying this online jisho.

While thoroughly investigating and comparing radicals, I stumbled upon something odd.

The kanji 牙(きば)appears in several references* as a 4-stroke kanji.
That same kanji is also a radical with exactly the same shape and name. However, in the Jisho.org Radicals panel for composing kanji, the 牙(きば)radical shows up as a 5-stroke radical...!

Now I'm confused, because other radical sources** indicate that the radical itself has 4 strokes (not 5), which matches the 4-stroke kanji.

I hope this proves relevant.
Thanks for your attention!
Best regards.

PxyGnomes at 2024-02-24 10:58:53 UTC

I'm sorry, first link is actually this one:

Beelzebubbles at 2024-02-24 11:40:31 UTC

There are two ways to write it, one with four strokes and one with five. You can see animations for both here: https://kakijun.jp/page/kiba05200.html

PxyGnomes at 2024-02-26 18:17:02 UTC

Ohh! I see; thanks for the explanation and reference!

I'm sorry, I also gave wrong information about the first site. There were more appearances of 牙 (and its variations) than I had looked for, so I missed the next one (with 5 strokes). It's still a bit confusing, but I think I'm starting to get the right idea. Sometimes the digital symbols do not match even though they look identical.

For the computer, this ⽛ and this 牙 aren't the same symbol. Getting kanji/radicals from different sources on the internet sometimes leads to this issue. Another problem is that it is sometimes very hard to find the correct variation (I cannot seem to get the symbol for the 5-stroke version, but I'm pretty sure that at least the 4-stroke one is secured).

Beelzebubbles at 2024-02-27 08:54:07 UTC

For some reason the Korean section of the Wiktionary article shows the 5-stroke version, but if I copy and paste it, it turns into the 4-stroke version 牙. I guess it must be the font they use? https://en.wiktionary.org/wiki/%E7%89%99

flayxis at 2024-02-27 12:23:08 UTC

If you're wondering about how a computer represents text, in most cases nowadays it's based on the Unicode standard. Some things about it are quite complicated, but for a basic understanding it's enough to know that one of the main things they do is assign every character a name and a "codepoint". Try it out here: https://unicodeplus.com

For example ⽛ is listed as "Kangxi Radical Fang (U+2F5B)", so the codepoint in Unicode would be 2F5B. And if you look up 牙 then it's "CJK Unified Ideograph-7259 (U+7259)", a completely different codepoint. All the codepoints are organized into different blocks (and further into planes) which you can think of simply as categories. That can help when trying to distinguish between two characters that look similar. Here for example you were asking about 牙 as a character on its own and not just as a radical, so the latter (codepoint 7259 from the "CJK ideograph" block) would be appropriate to use.
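If you'd rather check this locally than on a website, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` is just something made up for this example. The last line also assumes you want to treat the radical and the kanji as equivalent, which is what NFKC compatibility normalization does for the Kangxi radicals; whether that is appropriate depends on what you're doing with the text.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the codepoint and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


radical = "\u2F5B"  # ⽛ from the Kangxi Radicals block
kanji = "\u7259"    # 牙 from the CJK Unified Ideographs block

print(describe(radical))  # U+2F5B KANGXI RADICAL FANG
print(describe(kanji))    # U+7259 CJK UNIFIED IDEOGRAPH-7259

# They look alike, but a plain comparison sees two different characters.
print(radical == kanji)   # False

# NFKC normalization folds the radical into the ordinary kanji.
print(unicodedata.normalize("NFKC", radical) == kanji)  # True
```

This is handy when a search for 牙 doesn't find text that was copied from a radical chart: normalizing both sides first usually makes them match.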

Note however that in this specific case, this isn't really the issue at hand. Just because we have decided on a certain character with one Unicode codepoint doesn't mean there aren't multiple ways of writing it. Just think about the Latin alphabet and how many ways there are to write an "A", or really every character of the alphabet. Unicode does not concern itself much with different ways of writing the same character (often there's actually some kind of grey zone, but I'm not going to go into that here). In any case, as far as the computer and its character encoding are concerned, a 牙 is just a certain codepoint stored as a fixed sequence of bytes; it doesn't care whether it's rendered to you with 4 or with 5 strokes.
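To make the "it's just bytes" point concrete, here is a small Python sketch (standard library only) that shows what 牙 looks like in two common Unicode encodings. Nothing in these bytes says anything about strokes or fonts. The choice of UTF-8 and UTF-16 here is only for illustration.

```python
from urllib.parse import quote

kanji = "\u7259"  # 牙

utf8 = kanji.encode("utf-8")       # three bytes in UTF-8
utf16 = kanji.encode("utf-16-be")  # two bytes in big-endian UTF-16

print(utf8.hex(" "))   # e7 89 99
print(utf16.hex(" "))  # 72 59

# Percent-encoding the UTF-8 bytes gives the last part of the Wiktionary URL.
print(quote(kanji))    # %E7%89%99
```

The same three UTF-8 bytes are sent to every reader; which glyph shape appears is decided later, when a font is picked for rendering.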

So why does it look different sometimes? Well yes, it comes down to the font that is being used. Different fonts show characters in a different way. You probably already noticed this when looking at Simplified Chinese text in comparison to Japanese text. There are many examples of characters that are the same, but look different. Think 直 or 曜 or 骨 etc.
Even though they are encoded in the same way (for example 直 has the codepoint 76F4), when you use a Japanese font it will look very distinct from a Chinese font.

A website can (and should!) declare what language the text is written in inside its HTML markup by using a "lang" attribute on any tag. That way the browser knows which font to choose. That's why the Korean section in Wiktionary looks different from e.g. the Japanese section. They always declare lang="ko" for Korean and lang="ja" for Japanese. It's not so much that Wiktionary itself declares the font; it just declares the linguistic context, and your browser then chooses an appropriate font.
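As a rough illustration of what that markup looks like, here is a small Python sketch that wraps the same character in `span` tags with different `lang` values. The function name `tag_with_lang` is hypothetical, and this is not how Wiktionary actually generates its pages; it only shows the idea. If you save the output to an HTML file and open it, a browser with suitable fonts installed may render the three lines with different glyph shapes.

```python
from html import escape


def tag_with_lang(text: str, lang: str) -> str:
    """Wrap text in a span that declares its language (hypothetical helper)."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# Same codepoint U+7259 every time; only the declared language differs.
for lang in ("ja", "ko", "zh-Hans"):
    print(tag_with_lang("\u7259", lang))
```

The text content is identical in all three spans, so any visible difference comes from the font the browser selects for each declared language.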

Unfortunately not all websites, applications etc. do this the right way. Even some OSes screw this up. That's why some people argue that the Unicode standard should not have "unified" CJK characters by giving them the same codepoint even for different countries. If you want to find out more about this controversy just search "CJK unification" and there's plenty going back aeons. But whatever you think of it, it is here to stay and there's no real alternative to Unicode that doesn't break in a multitude of other ways. The end.

Post scriptum: Yes, there are also sometimes instances where computers don't use Unicode for representing text digitally. Really by 2024 the best thing to do is just switch to some encoding based on Unicode in most cases and if you have a special case then you already know that and are not looking for advice on that on Jisho.org forums.

PxyGnomes at 2024-03-04 23:41:28 UTC

Sorry about the delay. And thank you very much for the great insights into some of the structural aspects of these issues. This is very helpful in understanding what can go wrong and why. Cheers!
