Unicode in 0.4.0 Savefiles? (Was: Unicode Font...)


Pipian
Engineer
Posts: 122
Joined: 10 Jul 2004 02:25

Post by Pipian »

More notes:

A preliminary examination of the string handling code supports the theory that strings could be handled as "uint16 *" rather than "byte *" prior to rendering. Furthermore, because of the way strings are handled, it is quite possible to map extra characters onto the unused positions between the ASCII block and the ISO-8859-1 block. There is only one problem, however:

The language which uses the most accents is Hungarian. It utilizes 16 accented characters, and perhaps more if we add them.

In contrast, the REQUIRED characters from the ISO-8859-1 block number approximately 18 or 19. Together that is MORE than the anticipated 32 characters that can be mapped between ASCII and ISO-8859-1. If the current GRF methods are used for strings, full internationalization is not possible unless a block of currently unused GRF tiles (if any are left) is reserved for a character set, perhaps rounding the number of visible characters up to 256 instead of the current ~224. This would make room for standard ASCII, which should always be usable; for required non-ASCII characters (e.g. transport-type icons, arrows); and for whatever characters are needed to extend ASCII for the languages in use, if any. Say the main language is Icelandic and town text is Russian: we would need room for Icelandic characters as well as Russian characters. Naturally Russian takes up more space, so it would overwrite the 96 characters currently used for extended ISO-8859, while the Icelandic-specific characters would go in a buffer space. A similar character map would be used if the main text was Russian and town generation was Icelandic.
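The slot arithmetic above can be checked with a rough sketch. The figures are only the estimates quoted in this post (32 free positions, 16 Hungarian accented characters, about 18 or 19 required ISO-8859-1 characters); the names and the assumption that the 32 free positions are 0x80 to 0x9F are mine, for illustration.

```python
# Rough slot budget, using the estimates quoted in this post.
FREE_SLOTS = 0xA0 - 0x80      # assumed: the 32 positions between ASCII and ISO-8859-1
HUNGARIAN_ACCENTED = 16       # accented characters Hungarian needs (estimate)
REQUIRED_LATIN1 = (18, 19)    # required ISO-8859-1 characters (low and high estimate)


def shortfall(free, language_chars, required_chars):
    """Return how many characters do not fit into the free slots (0 if all fit)."""
    return max(0, language_chars + required_chars - free)


for required in REQUIRED_LATIN1:
    over = shortfall(FREE_SLOTS, HUNGARIAN_ACCENTED, required)
    print(required, "required characters:", over, "over budget")
```

Either estimate overshoots the 32 free positions by a few characters, which is why an extra reserved block would be needed.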

Essentially, we would have characters allocated generally as in the attached picture.

Regrettably, this is still nothing more than a half-assed attempt at implementing Unicode (as we're still limited to 256 different characters on screen). However, it's a good stop-gap measure until full font support can be coded.

Note that the points I mention on the right are the sprite allocation locations. (The upside-down question mark would not be reallocated, despite what the picture hints at.)
Attachments
Possible Encoding "mapping"
intended-encoding.gif (19.39 KiB)

Post by Pipian »

After the discussion about Unicode, the general consensus seems to be that a resizable text-sprite memory block, separate from the main sprite block, would be best. The trick then is where and how that memory block would be allocated, given that the allocation size would depend on the languages selected for towns, chatting, and display.

So a to-do list for Unicode support roughly would be as follows:

1. First implement a separate resizable memory block for text-sprites that doesn't depend on the main sprite block.
2. (in tandem with 1) Figure out how this block will be reallocated according to languages selected, and before network games start.
3. Implement the ability to load sprites into this block (perhaps from a new sprite format? This would need more discussion, as there are pros and cons to creating a new text-specific sprite format) and mapping of these sprites to Unicode.
4. Finally integrate said sprite work into the game.

Given how long and somewhat complicated this task is, it may be useful to fork the source temporarily to work on this specific feature before landing it back in the main release (hopefully with little difficulty).

In summary, this looks to be about 1/2 to 3/4 of the work of the network rewrite, though I could be overestimating.

Post by Pipian »

Since I can't edit the wiki right now, here's a diagram of what I see in terms of Unicode upgrades. (There are more, I'm sure; this is just a basic outline of the major functions that need editing, from what I can tell.) Note the addition of something like addCharset and delCharset, which would be called during game load as well as when joining or creating a network game; to/fromUTF8; and lastly translateString, which more or less converts UTF-16 characters to sprite numbers.
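To make the roles of addCharset, delCharset and translateString a little more concrete, here is a toy sketch in Python. Everything in it (the class, the method signatures, the placeholder glyph strings) is an assumption for illustration and not code from OpenTTD; it only shows the idea of a text-sprite table that grows with the selected charsets and maps UTF-16 code units to sprite numbers.

```python
class TextSpriteTable:
    """Toy stand-in for a resizable text-sprite block (hypothetical, not OpenTTD code)."""

    def __init__(self):
        self.sprites = []     # one entry per loaded glyph; list index = sprite number
        self.by_unit = {}     # UTF-16 code unit -> sprite number
        self.charsets = {}    # charset name -> code units that charset added

    def add_charset(self, name, first, last):
        """Load glyphs for code units first..last inclusive, skipping loaded ones."""
        added = []
        for unit in range(first, last + 1):
            if unit not in self.by_unit:
                self.by_unit[unit] = len(self.sprites)
                self.sprites.append(f"glyph U+{unit:04X}")  # placeholder for pixel data
                added.append(unit)
        self.charsets[name] = added

    def del_charset(self, name):
        """Forget the glyphs a charset added. This sketch does not compact the block."""
        for unit in self.charsets.pop(name, []):
            self.sprites[self.by_unit.pop(unit)] = None

    def translate_string(self, text):
        """Map each UTF-16 code unit to a sprite number; unknown units become '?'.

        Assumes a charset containing '?' has been added.
        """
        fallback = self.by_unit[ord("?")]
        raw = text.encode("utf-16-le")
        units = [raw[i] | (raw[i + 1] << 8) for i in range(0, len(raw), 2)]
        return [self.by_unit.get(unit, fallback) for unit in units]


table = TextSpriteTable()
table.add_charset("ascii", 0x20, 0x7E)
table.add_charset("cyrillic", 0x0410, 0x044F)
print(table.translate_string("A\u042f"))
table.del_charset("cyrillic")
print(table.translate_string("A\u042f"))
```

In this sketch, removing a charset (say, when leaving a network game that used Russian town names) makes its characters fall back to the '?' sprite instead of failing.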

Image
qinf
Engineer
Posts: 16
Joined: 01 Feb 2005 07:01

Post by qinf »

I've added Japanese localization support to OTTD.
It uses UTF-8 and a TrueType font to display Japanese text via SDL_ttf. It also allows input with an Input Method (Windows only, and iconv is required).

I believe it can be applied to other languages, but I haven't checked.
I submitted the patch to sf.net.
Attachments
Japanese localization sample
ottd.png (37.17 KiB)

Post by Pipian »

Not bad. I'll have a look to see if it can easily be reworked into a more general form.
orudge
Administrator
Posts: 25138
Joined: 26 Jan 2001 20:18
Skype: orudge
Location: Banchory, UK

Post by orudge »

Hmm, this is very interesting... I'll have to have a look at that patch (I don't understand Japanese, but I'm interested in the whole Unicodification of OpenTTD).

Post by Pipian »

The patch relies on specific locations of TrueType fonts. This could be a problem, given that we would have to distribute the fonts with it.

Post by qinf »

If the TrueType font cannot be found, the patch should reset the language setting to English so that the GRF font is used.
That way the user can specify the location of the TrueType font in openttd.cfg, and we can fall back when the font is missing.
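The fallback described here could look roughly like the sketch below. The function name, the renderer labels and the idea that the path comes straight from openttd.cfg are assumptions for illustration; the patch itself may do this differently.

```python
import os


def pick_renderer(language, ttf_path):
    """Return the (language, renderer) pair to use for a configured TrueType path.

    Falls back to English with the built-in GRF font when the file is missing.
    """
    if ttf_path and os.path.isfile(ttf_path):
        return language, "ttf"
    return "english", "grf"


# A path that does not exist triggers the fallback.
print(pick_renderer("japanese", "/no/such/dir/font.ttf"))
```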
Alltaken
Tycoon
Posts: 1285
Joined: 03 Dec 2003 06:24
Location: Christchurch, New Zealand

Post by Alltaken »

I must say that using TTF is a better idea than using GRF :P

Alltaken
dominik81
OpenTTD Developer
Posts: 768
Joined: 16 Aug 2003 12:55
Location: Bonn, Germany

Post by dominik81 »

I'm sure we should be able to find a suitable free TrueType font to distribute with OpenTTD. I haven't looked at the patch yet, but I have one question: does it work for both SDL builds and Windows builds?

Post by Pipian »

It relies on SDL_ttf for both Windows and SDL builds. It may not build properly in environments other than Linux and OpenTTD. PCF fonts may be suitable, at least if we continue going bitmap.

Post by Pipian »

Here's a mock screenshot using pre-existing Asian BDF fonts (and also a pre-existing small BDF font). Using SDL_bdf for rendering with SDL would result in something like this (note that some of the space used where small fonts are would need resizing, and CJK still takes a lot of space in top bars...)

The small font is only a sample, and would likely be replaced with a suitably constructed sans-serif font of similar size. Likewise, the CJK text is just raw (Baekmuk Batang and Gulim), and will likely be cleaned up to be more legible (at least for the characters that are used in translations).
Attachments
BDF fonts
Chendhill Transport, 11th Feb 1950-Unicode.gif (87.81 KiB)

Post by qinf »

This patch has the following features:

1. Rendering with TTF
2. Input with an Input Method
3. Converting between encodings (from UTF-8 to others, etc.)

The TTF part works fine on both Windows and Linux, but unfortunately the others work on Windows only.

I uploaded a Windows executable here:
http://kml.hp.infoseek.co.jp/cgi-bin/th ... mg/620.zip

Please try openttd_eng.exe to see text displayed with Vera.ttf.
Attachments
ottd.png (47.83 KiB)

Post by Pipian »

There are several issues with TTF at the moment:

1. Not as portable (needs Freetype2, which, while present in most OSs, is still an additional dependency. It may also need SDL for Windows, which we don't want.)
2. Problems with font distribution. Many fonts would have to be distributed with OpenTTD, or it will not function as well. BDF can be compressed much better, and many can be distributed free according to Debian. Font consistency is especially important in multilingual games.
3. A need for something like pango/fontconfig. Currently the patch does not depend on anything of the sort, so it is difficult to display other fonts as a fallback.
4. Anti-aliasing is not used. This could be a big bonus of TTF over BDF, but so far it does not help.
5. There seems to be minimal need for scaling. This is one of TTF's other major benefits, but it fails to be useful.
6. Hard-coded TTF references are no good, as not every Unix distribution has the same files or puts them in the same place (e.g. would this break builds on Mac OS X? MorphOS? OS/2?). Putting them in the OpenTTD directory with minor text configuration then brings back problem #2.
7. Problems with clipping? See the Message history in your compile.

Conversely there are problems with BDF as well:

1. Not (easily) scalable.
2. Not (easily) anti-aliasable. This still seems possible however if scaled down...
3. A need for something like pango/fontconfig.
4. Hard-coded BDF references. This is less of a problem here however, as font-distribution is less of a problem.

Regardless of which path is taken, I'm currently working on adding Unicode support and a working font display wrapper. That should render the choice of BDF vs. TTF moot, as it eliminates the pango/fontconfig issue and leaves the default-font-file configuration to the language file makers. While I am programming the initial wrapper to use SDL_bdf (perhaps even ending up shrinking and anti-aliasing), it should be relatively trivial to change it over to SDL_ttf when I am done, as rendering would no longer be dependent on sprites.

In any case, the only conversions we should ever need are from the old coding to UTF-16 (internal), and then back and forth between UTF-16 and UTF-8. These are trivial, and I've already mapped them out.
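For reference, the UTF-16/UTF-8 round trip can be sketched as below. This is a generic illustration of the two encoding forms written in Python, not the mapping mentioned in this post; the function names are made up, and there is no validation of malformed input (an unpaired surrogate is simply encoded as three bytes).

```python
def utf16_to_utf8(units):
    """Encode a sequence of UTF-16 code units as UTF-8 bytes."""
    out = bytearray()
    i = 0
    while i < len(units):
        cp = units[i]
        i += 1
        # Combine a surrogate pair into one code point above U+FFFF.
        if 0xD800 <= cp <= 0xDBFF and i < len(units) and 0xDC00 <= units[i] <= 0xDFFF:
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            out += bytes([0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)])
    return bytes(out)


def utf8_to_utf16(data):
    """Decode well-formed UTF-8 bytes into a list of UTF-16 code units."""
    units = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            cp, extra = lead, 0
        elif lead < 0xE0:
            cp, extra = lead & 0x1F, 1
        elif lead < 0xF0:
            cp, extra = lead & 0x0F, 2
        else:
            cp, extra = lead & 0x07, 3
        # Fold in the six payload bits of each continuation byte.
        for k in range(1, extra + 1):
            cp = (cp << 6) | (data[i + k] & 0x3F)
        i += extra + 1
        if cp >= 0x10000:
            cp -= 0x10000
            units += [0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF)]
        else:
            units.append(cp)
    return units
```

A save/load path would call something like utf16_to_utf8 when writing and utf8_to_utf16 when reading, keeping UTF-16 as the internal form.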

The other thing I noted was that, for whatever reason, the pound sign was still a sprite...

Lastly, IME support in Windows would be extremely helpful if we move into foreign languages, and possible compatibility with similar Linux IMEs would also be beneficial.

Post by Pipian »

Current Progress on Font Wrapper and Unicode Support:
  • Fixing SDL_bdf code for UTF-16: 100%
  • Adding .lng-file based font selection: 50% (Still need to fix strgen. strings.c is fixed.)
  • Font Load and Unload wrapper functions: 80% (Load is mostly done. Needs zlib support to decompress compressed BDF files.)
  • FontConfig-style character selection functions: 90% (Font selection based on character now works. Replacing a character that is not found with "?" is not working yet. I am also thinking about streamlining the process for gfx.c by returning the entire block containing a character, e.g. it searches for a font with 'a' and the function returns the font together with the boundaries of the block that contains 'a'.)
  • Add RTL text support (e.g. Hebrew): 0%
  • Fix gfx.c to use font wrapper functions and SDL_bdf: 0%
  • Fix gfx.c to use UTF-16: 0%
  • Fix control character handling: 70% (Previous patch needs to be merged into current codebase)
  • Fix save/load to use UTF-8: 0%
  • Fix .lng files to use UTF-8: 0%
  • Prepare .bdf files: 100% (CJK fonts selected, though they may need cleanup. Small, Medium, and News fonts are prepared (Latin, Greek, Cyrillic, Hebrew, and flavor characters), and could do with some character pruning.)
  • IME support: 0% (This comes dead last, after everything else is done.)
  • Ligature support: 0% (This comes even after IME support, and would allow for Arabic and Indian Script support)
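The character-based font selection in the list above, including the idea of returning the whole block that contains a character, might be sketched like this. The font names, block ranges and function names are invented for illustration, and the '?' substitution shown here is the behaviour the list describes as not yet working.

```python
# Hypothetical font table: (font name, list of (first, last) code point blocks).
FONTS = [
    ("medium-latin", [(0x0020, 0x007E), (0x00A0, 0x00FF)]),
    ("medium-greek", [(0x0370, 0x03FF)]),
    ("cjk", [(0x3040, 0x30FF), (0x4E00, 0x9FFF)]),
]


def find_font(code_point):
    """Return (font, first, last) for the first block containing code_point, else None."""
    for name, blocks in FONTS:
        for first, last in blocks:
            if first <= code_point <= last:
                return name, first, last
    return None


def fonts_for_text(text):
    """Return (character, font) pairs, substituting '?' when no font has the character.

    The last block found is cached, so a run of characters from the same block
    needs only one lookup. Assumes some font in FONTS contains '?'.
    """
    result = []
    cached = None
    for ch in text:
        cp = ord(ch)
        if cached is None or not (cached[1] <= cp <= cached[2]):
            hit = find_font(cp)
            if hit is None:
                ch, hit = "?", find_font(ord("?"))
            cached = hit
        result.append((ch, cached[0]))
    return result


print(fonts_for_text("a\u03a9\u3042\u0416"))
```

Returning the block boundaries along with the font is what lets the caller skip the search for the following characters of the same script.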
Last edited by Pipian on 10 Feb 2005 02:50, edited 6 times in total.

Post by qinf »

OK, I'm going to work out Linux IM support.

Post by Pipian »

Note the update on fontwrapper support. Some progress has been made lately, primarily in font-config-based character selection.

Post by orudge »

Keep it up, sounds good! :D

Post by Pipian »

Updated. (Primary progress was mostly finishing off the load/unload wrapper functions and the medium font.) Also added two new targets (RTL support and ligature support). RTL support may be postponed to a later version of the "fontinfo" file specification. Best not to get too confused.

Next targets: Finishing load-unload (by adding zlib support) and starting to integrate with gfx.c.

Out of curiosity, should that thumbtack in the upper right of windows and the three lines in the lower right be considered a separate icon or a glyph? I'm treating the X and Checkmarks and triangles as part of the font, due to the location of allocation, and ability to generate with ALT codes.

Post by Pipian »

Today was spent on the Newspaper font, which is about 40% complete (Latin and Greek are pretty much complete. Cyrillic will be finished tomorrow with the rest of the font, including Hebrew.)