Replace Text On Webpages

By JoeSimmons Last update Apr 13, 2010 — Installed 5,550 times.

Problem, need some help plz

10 posts, 4 voices



Da5h User
Firefox, Windows

Hi there Joe,
I ran into a problem with your script while trying to translate a Japanese site into English.

Your script doesn't seem to replace a string fully if that string contains another string from the "search and replace" list.

Example:

'メッセージ':          'Messages',
'メッセージはありません': 'No messages',
'メッセージを送る':      'Send message', 

But after I run it on a website, I see:

Messages
Messagesはありません
Messagesを送る

In case you can't see the Japanese text above, here is another example of the behavior:

'DOG':   'CAT',
'DOGX1': 'CATY1',
'DOGX2': 'CATY2', 

After using your script:

CAT
CATX1
CATX2

Any help?

 
JoeSimmons Script's Author
Firefox, Windows

It's probably due to how the Japanese text is handled, but I don't really know much about it.

 
Da5h User
Firefox, Windows

No, it's not Unicode-related.
Put these rules into your script and try it on my previous post:

 
'CAT':   'DOG',
'CATX1': 'DOGY1',
'CATX2': 'DOGY2', 

This English text:

 
CAT
CATX1
CATX2

Turns into this:

 
DOG
DOGX1
DOGX2

But it should be:

DOG
DOGY1
DOGY2
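
The result reported above is what you would get from a script that applies each rule in table order across the whole text. Here is a minimal sketch of that, assuming this is roughly what the script does; the names `words` and `replaceSequentially` are made up for illustration and are not taken from the script:

```javascript
// Hypothetical stand-in for the script's rule table (name assumed).
const words = {
  'CAT':   'DOG',
  'CATX1': 'DOGY1',
  'CATX2': 'DOGY2',
};

// Assumed behaviour: apply each rule in table order, replacing every occurrence.
function replaceSequentially(text, rules) {
  let result = text;
  for (const [search, replacement] of Object.entries(rules)) {
    // split/join replaces all literal occurrences without regex escaping.
    result = result.split(search).join(replacement);
  }
  return result;
}

console.log(replaceSequentially('CAT CATX1 CATX2', words));
// prints "DOG DOGX1 DOGX2"
```

Once the 'CAT' rule has run, the text no longer contains 'CATX1' or 'CATX2', so the two longer rules never match.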

 
Da5h User
Firefox, Windows

Are there any plans to fix this issue?

P.S.
I came across this post in the forum:
http://userscripts.org/topics/34144
The Chinese user there seems to have run into the same problem (since most Chinese/Japanese words are built from other words), and he seems to say that an earlier version of the script didn't have it, so maybe that will help.

P.P.S.
the current script with
'ape': 'MONKEY',
'escape': 'RUN AWAY',
replaces "escapement" with "escMONKEYment".

and
'escape': 'RUN AWAY',
'ape': 'MONKEY',
replaces "escapement" with "RUN AWAYment".

 
kimatg User
Firefox, Windows

It's an ordering issue: the script applies the replacements in the order the strings are listed.
So for this:

'メッセージ':          'Messages',
'メッセージはありません': 'No messages',
'メッセージを送る':      'Send message', 

it first replaces メッセージ with Messages, giving Messagesはありません,
and because there is no rule for the string 'Messagesはありません', that string is left unchanged.

Fix: always place the shortest string at the bottom, like this:

'メッセージはありません': 'No messages',
'メッセージを送る':      'Send message', 
'メッセージ':          'Messages',

In this case the script will first search for 'メッセージはありません' and replace it with 'No messages',
then search for the single word 'メッセージ' and replace it with 'Messages'.

...hope you get what I mean :)
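
The manual reordering above could also be done automatically. Here is a minimal sketch, assuming the rules live in a plain object; `escapeRegExp` and `replaceLongestFirst` are hypothetical helper names, not part of the script. It sorts the search strings longest first and replaces them in a single pass, so text that has already been replaced is never scanned again:

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one alternation with the longest search strings first and
// replace everything in a single pass over the text.
function replaceLongestFirst(text, rules) {
  const keys = Object.keys(rules).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return text;
  const pattern = new RegExp(keys.map(escapeRegExp).join('|'), 'g');
  // A function replacement avoids "$" being treated specially.
  return text.replace(pattern, (match) => rules[match]);
}

const rules = {
  'メッセージ': 'Messages',
  'メッセージはありません': 'No messages',
  'メッセージを送る': 'Send message',
};

console.log(replaceLongestFirst('メッセージ / メッセージはありません / メッセージを送る', rules));
// prints "Messages / No messages / Send message"
```

Note that this still replaces parts of words: with the 'ape' and 'escape' rules from the earlier post, "escapement" would become "RUN AWAYment".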

 
JoeSimmons Script's Author
Firefox, Windows

OK, try the new version out. It no longer changes parts of words, just the full word/sentence.

 
kimatg User
Firefox, Windows

Um, sorry, but how exactly does the new version work?
After I updated the code, the replacement rules I had defined before that include symbols, such as 'Users:' or '(10) Masterpiece', no longer get applied.
IMO that wasn't necessarily something that had to be changed... it would have been better to have an option to set specific areas of the page where replacements are not applied. :|

 
Da5h User
Firefox, Windows

@kimatg
The fix was necessary; just look at my post above your first one.
Simply ordering the strings from longest to shortest wouldn't fix that issue.

@JoeSimmons
Thank you. I'm gonna test this now & report back here.

Edit:
Hmmm, the new script now works fine with plain text [A-Za-z0-9], but it has completely stopped working for Unicode and some "special" characters (such as &, (, :, etc.)...

'&': 'AND',
':)': 'HAPPY SMILEY',
':|': 'NEUTRAL SMILEY',
These don't do anything now.

'メッセージ': 'TEST',
Unicode replacement doesn't work now either.

'plaintextメッセージ': 'TEST',
turns plaintextメッセージ into TESTメッセージ, so it looks like all Unicode and special characters in the rules are simply ignored now.

 
JoeSimmons Script's Author
Firefox, Windows

Da5h wrote:
Hmmm, the new script now works fine with plain text [A-Za-z0-9], but it has completely stopped working for Unicode and some "special" characters (such as &, (, :, etc.)...

That's because I put a word boundary (\b) on either side of the regex. I'm trying to come up with a fix, but it's harder than it looks.
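
To illustrate why the word boundary causes this: in JavaScript, `\b` only treats [A-Za-z0-9_] as word characters, so it never matches next to a symbol or a Japanese character in the way the rule needs. The sketch below is not the script's actual code. `wholeWordAscii` is an assumed reconstruction of the approach described above, and `wholeWordUnicode` is one possible alternative that uses lookarounds with Unicode property escapes (this requires an engine with lookbehind support):

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Assumed form of the "word boundary on either side" approach.
function wholeWordAscii(text, search, replacement) {
  const re = new RegExp('\\b' + escapeRegExp(search) + '\\b', 'g');
  return text.replace(re, () => replacement);
}

// Alternative sketch: treat any Unicode letter, digit or underscore as a
// word character, and only demand a boundary on a side where the search
// string itself starts or ends with a word character.
function wholeWordUnicode(text, search, replacement) {
  const wordChar = '[\\p{L}\\p{N}_]';
  const startsWithWord = new RegExp('^' + wordChar, 'u').test(search);
  const endsWithWord = new RegExp(wordChar + '$', 'u').test(search);
  const before = startsWithWord ? '(?<!' + wordChar + ')' : '';
  const after = endsWithWord ? '(?!' + wordChar + ')' : '';
  const re = new RegExp(before + escapeRegExp(search) + after, 'gu');
  return text.replace(re, () => replacement);
}

console.log(wholeWordAscii('Tom & Jerry', '&', 'AND'));      // prints "Tom & Jerry"
console.log(wholeWordUnicode('Tom & Jerry', '&', 'AND'));    // prints "Tom AND Jerry"
console.log(wholeWordUnicode('escapement ape', 'ape', 'MONKEY')); // prints "escapement MONKEY"
console.log(wholeWordUnicode('メッセージ', 'メッセージ', 'TEST'));   // prints "TEST"
```

One trade-off of the Unicode version: Japanese text has no spaces between words, so 'メッセージ' inside 'メッセージはありません' is not replaced. That avoids the half-translated output from the first post, but it means every full phrase needs its own rule.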
 
pgr User
Chrome, Windows

I changed the script a bit to use an array of objects that each specify a replacement, instead of the word association table, like this:
{what:/(swr|sw3)/gi,by:"$1(00)"},
{what:/(3sat)/gi ,by:"$1(06)"},
It is shorter and should be faster this way too.
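
A minimal sketch of how such a rule array might be applied; the post only shows the rule objects, so the `applyRules` function is an assumption about the rest of the change:

```javascript
// Rule objects as shown in the post: "what" is the pattern, "by" the replacement.
const rules = [
  { what: /(swr|sw3)/gi, by: '$1(00)' },
  { what: /(3sat)/gi,    by: '$1(06)' },
];

// Hypothetical helper: run every rule over the text, in array order.
function applyRules(text, ruleList) {
  return ruleList.reduce(
    (result, rule) => result.replace(rule.what, rule.by),
    text
  );
}

console.log(applyRules('SWR and 3sat', rules));
// prints "SWR(00) and 3sat(06)"
```

Because each rule is already a regular expression, whole-word or Unicode-aware matching can be chosen per rule. The rules are still applied in order, so the output of an earlier rule can be matched again by a later one.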
