Special european characters fix

Posted: Wed Feb 24, 2010 3:26 pm
by John Snow
Hi guys,

Since folks around here helped me a lot with my own problems, I want to contribute. I hope my solution will help someone, so here it is:

Central European languages have a set of special characters with diacritics, such as s with a caron (wedge), "š", or "č". Firefox shows these characters correctly in the address bar, but Internet Explorer (as always) does not: in IE they are URL-escaped (percent-encoded), so the address bar looks like binary madness. There goes SEO and all your effort.

Needless to say, I needed to fix this ugly issue. So I opened lib/general.php and added the following code after line 134:

/*	FIX
	Corrects SEO URLs by converting Central European characters to plain ASCII, e.g.: š => s, č => c, etc. Uses a static substitution table.
*/	

static $tbl = array("\xc3\xa1"=>"a","\xc3\xa4"=>"a","\xc4\x8d"=>"c","\xc4\x8f"=>"d","\xc3\xa9"=>"e","\xc4\x9b"=>"e","\xc3\xad"=>"i","\xc4\xbe"=>"l","\xc4\xba"=>"l","\xc5\x88"=>"n","\xc3\xb3"=>"o","\xc3\xb6"=>"o","\xc5\x91"=>"o","\xc3\xb4"=>"o","\xc5\x99"=>"r","\xc5\x95"=>"r","\xc5\xa1"=>"s","\xc5\xa5"=>"t","\xc3\xba"=>"u","\xc5\xaf"=>"u","\xc3\xbc"=>"u","\xc5\xb1"=>"u","\xc3\xbd"=>"y","\xc5\xbe"=>"z","\xc3\x81"=>"A","\xc3\x84"=>"A","\xc4\x8c"=>"C","\xc4\x8e"=>"D","\xc3\x89"=>"E","\xc4\x9a"=>"E","\xc3\x8d"=>"I","\xc4\xbd"=>"L","\xc4\xb9"=>"L","\xc5\x87"=>"N","\xc3\x93"=>"O","\xc3\x96"=>"O","\xc5\x90"=>"O","\xc3\x94"=>"O","\xc5\x98"=>"R","\xc5\x94"=>"R","\xc5\xa0"=>"S","\xc5\xa4"=>"T","\xc3\x9a"=>"U","\xc5\xae"=>"U","\xc3\x9c"=>"U","\xc5\xb0"=>"U","\xc3\x9d"=>"Y","\xc5\xbd"=>"Z");
$val = strtr($val, $tbl);
$val = strtolower($val);
/*
	END OF FIX
*/
After a quick CTRL-F5 in my browser, everything was OK and the SEO addresses came out fine. However, this has one drawback:
the function MakeURLNormal($val) seems to do this exact thing in reverse. I did not alter it, as I did not notice any error or problem that would require it. If someone thinks MakeURLNormal needs to be adjusted as well, feel free to discuss or post your working code here.
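
For anyone who wants to try the substitution outside the cart first, here is a minimal standalone sketch of the same idea. The function name transliterate_for_url is made up for illustration (the fix above is pasted inline in lib/general.php, not wrapped in a function), and the table is cut down to a few characters:

```php
<?php
// Hypothetical wrapper name; not part of the cart's code.
function transliterate_for_url($val)
{
    // Shortened table: UTF-8 byte sequences mapped to plain ASCII letters.
    static $tbl = array(
        "\xc5\xa1" => "s", // š
        "\xc4\x8d" => "c", // č
        "\xc5\xbe" => "z", // ž
        "\xc5\xa0" => "S", // Š
        "\xc4\x8c" => "C", // Č
        "\xc5\xbd" => "Z", // Ž
    );
    $val = strtr($val, $tbl);   // byte-wise replacement of every table key
    return strtolower($val);    // everything that was mapped is ASCII by now
}

echo transliterate_for_url("\xc5\xa0koda \xc4\x8daj"), "\n"; // prints "skoda caj"
```

Spaces are left alone here on purpose; this sketch assumes the cart's existing SEO code still takes care of them afterwards.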

Spread the word and help others fight such issues. Long live community help ;)

Re: Special european characters fix

Posted: Wed Feb 24, 2010 3:43 pm
by Martin
Always good to see a little pay-it-forward so thanks for that...

I haven't checked it over but it sounds like it could be useful to those with foreign products or customer bases... Cheers.. :)

Re: Special european characters fix

Posted: Wed Feb 24, 2010 3:44 pm
by CharlieFoxtrot
Wow! Nice work!

Re: Special european characters fix

Posted: Wed Feb 24, 2010 3:54 pm
by John Snow
Glad to be of help, guys ;)

Re: Special european characters fix

Posted: Sun Jun 13, 2010 10:51 pm
by meules
Exactly what I need... thx ;)

Re: Special european characters fix

Posted: Mon Jun 14, 2010 8:48 am
by Tony Barnes
Can this be adapted to get rid of other non-letters such as hyphens and apostrophes as well, instead of getting all that %252D crap?

Re: Special european characters fix

Posted: Mon Jun 14, 2010 11:14 am
by Martin
Tony Barnes wrote: Can this be adapted to get rid of other non-letters such as hyphens and apostrophes as well, instead of getting all that %252D crap?
If you want to get rid of "that crap", you need to come up with a different character substitution for dashes, etc... If you don't, you end up with broken URLs, because the SEO code uses dashes as a sub for spaces...

Personally I'd probably go with underscores to sub for spaces, then leave dashes alone but I have absolutely no idea what that does for SEO...

Re: Special european characters fix

Posted: Mon Jun 14, 2010 11:26 am
by Tony Barnes
I'd predominantly just want to get rid of hyphens, as they are the big headache for us - e.g.
Great product - with free product
becomes -
if it simply ignored them, or even replaced them with a hyphen (!!), then it would be:
which makes a lot more sense!!

Re: Special european characters fix

Posted: Mon Jun 14, 2010 11:52 am
by Martin
Tony Barnes wrote: I'd predominantly just want to get rid of hyphens, as they are the big headache for us - e.g.
Great product - with free product
becomes -
if it simply ignored them, or even replaced them with a hyphen (!!), then it would be:
which makes a lot more sense!!
True... might be possible but the problem is how would the system know you weren't listing:
  • Great product - with free product
  • Great product with free product
  • Great product --with free product
  • Great product-- with free product
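
To make the collision concrete, here is a rough sketch. The helper collapse_to_slug is hypothetical (it is not the cart's SEO code); it simply folds every run of non-alphanumeric characters into one hyphen, which is the behaviour being asked for:

```php
<?php
// Hypothetical helper: lowercase the title, then collapse each run of
// spaces, hyphens, or other non-alphanumerics into a single hyphen.
function collapse_to_slug($title)
{
    $slug = strtolower($title);
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-'); // no leading or trailing hyphen
}

$titles = array(
    'Great product - with free product',
    'Great product with free product',
    'Great product --with free product',
    'Great product-- with free product',
);

foreach ($titles as $title) {
    echo collapse_to_slug($title), "\n";
}
// Every line prints: great-product-with-free-product
```

All four titles end up on the same URL, so the URL alone can no longer say which product was meant.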

Thinking about it logically, what they should have in the code is an SEO-translated field in the products table, so the SEO URL isn't translated on the fly but stored with the product record and used directly. That would allow direct translation and the sort of format you're after.

BUT, adding that in now would require a lot more sanity checking to ensure you didn't create duplicates...
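
A rough sketch of what that sanity check might look like, purely as an illustration. Nothing here is the cart's actual code: unique_slug is a made-up name, and the $taken array stands in for whatever slugs are already stored in the products table:

```php
<?php
// Hypothetical: build a slug for a title and append -2, -3, ... until it
// does not clash with a slug that is already stored.
function unique_slug($title, array $taken)
{
    $base = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($title)), '-');
    $slug = $base;
    $n = 2;
    while (in_array($slug, $taken, true)) {
        $slug = $base . '-' . $n;
        $n++;
    }
    return $slug;
}

// $taken stands in for the slugs already saved with other products.
$taken = array('great-product-with-free-product');
echo unique_slug('Great product - with free product', $taken), "\n";
// prints "great-product-with-free-product-2"
```

The slug would be generated once when the product is saved and then looked up directly, instead of being translated back and forth on every request.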

Re: Special european characters fix

Posted: Mon Jun 14, 2010 1:14 pm
by Tony Barnes
Hmmm, I guess - but it's still a PITA!

As a hack, there aren't going to be many products that are "that" similar, so it would be an OK workaround I reckon - or are you saying that it would need to be unique to work "backwards" (i.e. the cart interpreting an entered URL)?