GameMonkey Script

GameMonkey Script Forums
All times are UTC




 Post subject: About Unicode support
PostPosted: Sat Mar 22, 2008 2:54 pm 

Joined: Sat Mar 22, 2008 2:46 pm
Posts: 1
Hi, I'm using GM to make an MMORPG.
I have some questions about Unicode support.
I need to use some Unicode strings in GM script, and I've seen your "Option for unicode strings" in the ToDoList.
May I ask how long it will take to add Unicode support?


 Post subject: Re: About Unicode support
PostPosted: Sat Mar 22, 2008 3:12 pm 

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 580
Welcome to the forum :)

It is unlikely Unicode support will appear in GM unless someone adds it. It will be supported in GM2, but the release date of that is completely unknown as development is on hold.

I would suggest encoding in UTF-8 and putting the data in quotes. Other than that, don't use GM for your localization tables; just use a text file (UTF-8 or UTF-16) containing key/value pairs, plus perhaps C++ style comments. That is simple to parse and use. Don't worry about matching external and internal encodings: you will likely need to re-encode to support different UTF types inside your game anyway, e.g. UTF-16 when dealing with most MS interfaces, UTF-8 when dealing with others, and UTF-32 when handling input or other single-character interfaces.


 Post subject: Re: About Unicode support
PostPosted: Sat Mar 22, 2008 4:43 pm 

Joined: Fri Feb 15, 2008 1:51 pm
Posts: 13
Someone beat me to it :D

Actually I was about to ask about Unicode support, because my little game might need to display exotic characters at some point.

I looked at the code briefly and it seems 'char' is used predominantly in the GM code and gmStringObject, so adding Unicode support is probably a non-trivial task/hack.

Edit: oops, I was thinking about something else; GM uses 'char'.


 Post subject: Re: About Unicode support
PostPosted: Sat Mar 22, 2008 11:25 pm 

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 580
Little bit of a rant... I use UTF-8 in files and in memory, and convert to UTF-16 and UTF-32 at runtime as needed to interact with other interfaces. I have found this solution works well. Microsoft is one of the few companies that uses UTF-16, and I suppose, in typical MS fashion, they thought 65,536 (64k) code points would be enough for anyone. Well, it isn't, and now we have UTF-16 with surrogates to try to get around the problem. UTF-8 is elegant because you can quickly and stably work with 1-4 byte characters, and the strings are single-byte zero-terminated, making them compatible with classic string functions and ASCII.
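
As a rough illustration of the runtime conversions described above, here is a C++ sketch that decodes UTF-8 to UTF-32 and then encodes UTF-16 with surrogate pairs. The function names are hypothetical, and the decoder assumes well-formed input (it does not reject overlong or truncated sequences):

```cpp
#include <string>

// Decodes UTF-8 bytes into UTF-32 code points.
// Assumption: the input is well-formed UTF-8; no validation is done.
std::u32string Utf8ToUtf32(const std::string& utf8)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < utf8.size())
    {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        int extra; // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        for (int k = 0; k < extra && i < utf8.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Encodes UTF-32 code points as UTF-16, using a surrogate pair above U+FFFF.
std::u16string Utf32ToUtf16(const std::u32string& utf32)
{
    std::u16string out;
    for (char32_t cp : utf32)
    {
        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
        }
    }
    return out;
}
```

Going through UTF-32 as the intermediate form keeps each step simple; a production converter would also validate its input.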

When I use localized text in game, my script looks like this:
Code:
SomeDisplay( LangText("#str_health", player.health) );

Here #str_health is the key used to look up a localized string such as "Health %d".
LangText() is a script function that looks up that localized string and applies formatting.
Code:
global LangText = function(a_langString, a_param1, a_param2, a_param3, a_param4, a_param5, a_param6, a_param7, a_param8)
{
  return format( LoadLanguageString(a_langString), a_param1, a_param2, a_param3, a_param4, a_param5, a_param6, a_param7, a_param8);
};

LoadLanguageString() is a native binding that returns a string from a specified Key.
I believe GM strings can contain UTF-8 characters, but outside of that it doesn't handle unicode.


 Post subject: Re: About Unicode support
PostPosted: Sun Mar 23, 2008 2:02 am 

Joined: Fri Jan 14, 2005 2:28 am
Posts: 434
I strongly second that recommendation. Have your scripts use generic tags as shown and handle localization translation from those tags to your localization tables elsewhere. There's really no need to have that degree of memory bloat in the scripting system just to support unicode strings.


 Post subject: Re: About Unicode support
PostPosted: Thu Mar 27, 2008 5:51 am 

Joined: Fri Nov 24, 2006 9:50 am
Posts: 165
We have limited UTF-16 support in our internal version; we can probably release it when we set up SourceForge (I am slowly setting it up).

we have implemented it with a new user type so we have a function NewWString() to create a wstring that can take a regular string and convert it, but we also have functions that translate a "tag" which is a regular string into a wstring (and return the new wstring object). We have also added operators for adding regular strings to wstrings etc.

So you can still do something like this for example:

GfxPrint( x, y, Translate( "SCORE_LABEL" ) + " " + player.score );

(The language database is a simple utxt text file exported from Excel and saved out in UTF-16. If the "SCORE_LABEL" entry doesn't exist, the label is displayed as-is with ###'s in front of it so we can spot it easily.)
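
The missing-entry behaviour could look roughly like this in C++. This is only a sketch: the LanguageTable type and this Translate signature are assumptions for illustration, not the internal implementation described above, and labels are assumed to be plain ASCII:

```cpp
#include <string>
#include <unordered_map>

// Hypothetical in-memory language database: ASCII label -> UTF-16 text.
using LanguageTable = std::unordered_map<std::string, std::u16string>;

// Returns the localized text for a label, or "###" followed by the label
// itself when no entry exists, so missing strings stand out on screen.
std::u16string Translate(const LanguageTable& table, const std::string& label)
{
    const auto it = table.find(label);
    if (it != table.end()) return it->second;

    std::u16string missing = u"###";
    // Labels are assumed to be ASCII, so each byte widens directly.
    for (unsigned char c : label) missing.push_back(static_cast<char16_t>(c));
    return missing;
}
```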


 Post subject: Re: About Unicode support
PostPosted: Thu Mar 27, 2008 7:42 pm 

Joined: Fri Jan 14, 2005 2:28 am
Posts: 434
Why even bother with a user type? To me it's wasting memory to have unicode strings even be exposed to the scripting system at all.

Why not just handle that stuff internal to your print function?

GfxPrint( x, y, "#SCORE_LABEL", " ", player.score );

This avoids a lot of the string concatenation and temporary string creation you incur by adding them all together as you do. Internal to the function you can build the string in a large Unicode buffer, and as you loop through the variable number of parameters, converting them to strings if necessary (from float/int/etc.), you can re-interpret tags such as #SCORE_LABEL into localized strings. No additional memory allocations, no temporaries, and no need to pollute the garbage collector with a wide string type that it has little business bothering with. Presumably your localization tables will be loaded into memory elsewhere anyway, so you pay a double price in memory usage, and potentially in script performance, because the temporaries and multibyte strings trigger more frequent garbage collection, similar to the issues we ran into with a user-typed vector3.
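
A sketch of that idea in C++, kept independent of the GM binding API. PrintArg, LanguageTable and BuildPrintText are hypothetical names, the buffer here is UTF-8 rather than a wide buffer, and only int and string parameters are handled. In a real binding the arguments would come from the script thread's parameter list instead:

```cpp
#include <initializer_list>
#include <string>
#include <unordered_map>

using LanguageTable = std::unordered_map<std::string, std::string>;

// Hypothetical stand-in for one script parameter: either text or an int.
struct PrintArg
{
    bool isText;
    std::string text; // used when isText is true
    int number;       // used when isText is false
    PrintArg(const char* s) : isText(true), text(s), number(0) {}
    PrintArg(const std::string& s) : isText(true), text(s), number(0) {}
    PrintArg(int n) : isText(false), number(n) {}
};

// Builds the final display text into `out`. Text parameters that start with
// '#' are treated as localization tags; unknown tags are left as-is.
void BuildPrintText(const LanguageTable& table,
                    std::initializer_list<PrintArg> args,
                    std::string& out)
{
    out.clear(); // clear() keeps capacity, so a reused buffer rarely reallocates
    for (const PrintArg& arg : args)
    {
        if (!arg.isText)
        {
            out += std::to_string(arg.number);
        }
        else if (!arg.text.empty() && arg.text[0] == '#')
        {
            const auto it = table.find(arg.text.substr(1));
            out += (it != table.end()) ? it->second : arg.text;
        }
        else
        {
            out += arg.text;
        }
    }
}
```

For brevity the sketch still creates small temporaries (substr, to_string). The point it illustrates is that the script side only passes tags and numbers, and the localized string never becomes a script object.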


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 12:27 am 

Joined: Fri Nov 24, 2006 9:50 am
Posts: 165
Because we wanted more generic usability, with the ability to get the width of a string (for centering or alignment in tables), etc.

I think we have one routine that hides it like your print routine, so you can just specify the tag, but for graphical stuff we needed a bit more flexibility.


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 1:00 am 

Joined: Sun Jul 29, 2007 11:52 am
Posts: 31
Location: Newcastle Upon Tyne, UK
Where I work, our strings are set up inside a tool, which exports just the string data plus a list of ID defines and wrapper macros to get the strings from the manager. This makes translations easier and works with UTF-8 internally. The idea is that it could also export a list of defines for a scripting language, so only an int is passed through the system, meaning there is no additional overhead. GM currently doesn't have defines/consts, though, so they would technically just be stored as variables. I do generally prefer the idea of referencing strings by ID in code rather than passing raw string data around, and simply putting tags in strings for inserting extra bits of text, button graphics, etc.


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 1:44 am 

Joined: Fri Nov 24, 2006 9:50 am
Posts: 165
That's similar, except we use labels as the indices. It's not speed critical, so using a table and a hash lookup makes life easier, and it lets people use strings before they have been added to the translation file: they can just make up a label, slap it in, and when it shows up with ###s in the game, the game planners/designers know they need to add it.


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 3:36 am 

Joined: Fri Jan 14, 2005 2:28 am
Posts: 434
Right, but alignment ought to be a parameter as well, like an optional final parameter. It adds a lot of sloppiness to the scripts to do such low level stuff in there, which was my main point. Normally a gui library or something establishes the arrangement of gui elements, alignment, etc. And any raw functions needed to print screen text or whatever can take an extra parameter or 2 and handle alignment internally. Exposing all that low level stuff to script where the user has to handle their own alignment, and binding bloated low level multibyte strings to the script seems to defeat much of the purpose of scripting to me. Your call of course.


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 5:27 am 

Joined: Fri Nov 24, 2006 9:50 am
Posts: 165
Yup, it depends on how much of a GUI system you have set up with GM. Ours is still quite basic, but if it got more complicated we would hide the translations. For the amount of text we deal with, the overhead of a wstring type wasn't much compared to the extra flexibility it gave us to treat it as a regular string, with string searching, sub-string concatenation and the like. (It's only the same overhead as for a regular string.)


 Post subject: Re: About Unicode support
PostPosted: Fri Mar 28, 2008 9:46 am 

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 580
I've already mentioned what we used in our last cross-platform, fully localized game, but I'll add these things:

We did use an in-house localization tool that synced strings from a spreadsheet to the game's data files, and to a few other files, like the installer, where localized strings were needed. This was important because our publisher wanted to work this way and the translators just edited Excel files.

We embedded formatting and display commands into our strings, as well as parameters, so a final string might look like:
Code:
"{color: 255,230,111}Big Enemy{color:default} Big bad enemy with health %d, press {icon: GUI_IMAGE_XBUTTON} to destroy."
Text alignment and word wrapping were built into the UI.
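
A small C++ sketch of how such embedded commands might be split out before rendering. The "{name: argument}" grammar is inferred from the example string only, and TextToken and TokenizeMarkup are hypothetical names, not the in-house code:

```cpp
#include <string>
#include <vector>

struct TextToken
{
    bool isCommand;    // true for "{...}" spans, false for literal text
    std::string name;  // command name, e.g. "color" (empty for text)
    std::string value; // command argument, or the literal text itself
};

// Splits a string into literal text runs and "{name: argument}" commands.
// Assumption: braces are never nested or escaped. Printf-style parameters
// such as %d stay in the text and would be formatted in a separate step.
std::vector<TextToken> TokenizeMarkup(const std::string& s)
{
    std::vector<TextToken> tokens;
    std::size_t pos = 0;
    while (pos < s.size())
    {
        const std::size_t open = s.find('{', pos);
        const std::size_t close =
            (open == std::string::npos) ? open : s.find('}', open);
        if (close == std::string::npos)
        {
            // No complete command left: the rest is plain text.
            tokens.push_back({false, "", s.substr(pos)});
            break;
        }
        if (open > pos) tokens.push_back({false, "", s.substr(pos, open - pos)});

        const std::string body = s.substr(open + 1, close - open - 1);
        const std::size_t colon = body.find(':');
        if (colon == std::string::npos)
        {
            tokens.push_back({true, body, ""});
        }
        else
        {
            // Skip spaces after the colon so "color: x" and "color:x" match.
            const std::size_t arg = body.find_first_not_of(' ', colon + 1);
            tokens.push_back({true, body.substr(0, colon),
                              arg == std::string::npos ? "" : body.substr(arg)});
        }
        pos = close + 1;
    }
    return tokens;
}
```

Since '{', '}' and ':' are single ASCII bytes, this works unchanged on UTF-8 text.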

We encountered no problems using UTF-8 internally as we could use it with almost all legacy string code, including GM, and convert to other formats when needed. For example, font character output and edit box input was UTF-32, most (but not all) Microsoft interfaces are UTF-16.

I used PSPad to view and hand-edit Unicode text files when needed, but many other text editors will work with various UTF encodings and display the correct glyphs if the fonts are installed. The data or config files were typically UTF-8 text or UTF-16 XML. We happily interfaced with multiple 3rd party APIs, and our string library had a few helpers like String::ToUTF16() and String::FromUTF16(uint16* a_src), which made the occasional conversion simple. Note that the string class did not internally support other formats; it just allocated appropriately sized buffers and stored the bytes for those conversions, and so could not perform any string operations while in that format.

