Freeware glossary lookup tool wanted
Thread poster: Samuel Murray
Jaroslaw Michalak  Identity Verified
Poland
Local time: 10:20
Member (2004)
English to Polish
SITE LOCALIZER
Perl script Aug 4, 2011

Even dirtier Perl script:


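use strict;
use warnings;
use Win32;
use Win32::Clipboard;

my $glossary = "C:\\Users\\user\\Documents\\gloss.txt"; # glossary path
my $clip = Win32::Clipboard("");

for (;;) {
    $clip->WaitForChange();
    my $text = $clip->GetText();
    next unless defined $text && length $text;

    my $output = ""; # reset the results for every lookup

    open(my $in, "<", $glossary) or die "Cannot open $glossary: $!";
    while (my $line = <$in>) {
        my ($term) = split /\t/, $line; # first column = source term
        next unless defined $term;
        $term =~ s/[\r\n]+$//; # drop the line ending on one-column lines
        next unless length $term;
        # \Q...\E makes the term match literally, not as a regex
        if ($text =~ /\Q$term\E/) {
            $output .= $line;
        }
    }
    close($in);

    if ($output) {
        #print "Result: $output\n"; # Uncomment if you want to output the results
        #$clip->Set($output); # Uncomment if you want to copy the results to clipboard
        Win32::MsgBox($output, 0, 'Lookup results'); # Comment if you do not want to have a popup
    }
}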


It gives a false positive for "row". Getting around that would require a bit more coding: the minimal match string (i.e. a word) can be surrounded by whitespace or punctuation on either side, or by nothing at all.
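
One way to express that rule is sketched below, in Python rather than Perl so that it can be tried outside Windows. A term only counts as matched when the characters on either side of it are not word characters, or are absent. The function name `contains_term` and the sample strings are assumptions made for illustration; they are not part of the script above.

```python
import re

def contains_term(text: str, term: str) -> bool:
    """Return True if `term` occurs in `text` as a whole word or phrase."""
    if not term:
        return False
    # re.escape treats the glossary term as literal text; the lookarounds
    # require a non-word character (or the edge of the string) on both sides.
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text) is not None

print(contains_term("The quick brown fox", "row"))    # False
print(contains_term("The quick brown fox", "brown"))  # True
print(contains_term("Look at the dog.", "dog"))       # True
```

The same idea should carry over to the Perl script as `/(?<!\w)\Q$term\E(?!\w)/`, although that variant has not been tested here.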

To avoid the console window (the "DOS box"), run the script with wperl instead of perl (if you use ActiveState, that is).

[Edited at 2011-08-04 14:21 GMT]


 
Michael Grant
Japan
Local time: 18:20
Japanese to English
Sorry, I missed an important aspect of your requirements! Aug 5, 2011


do until objFile.atEndOfStream
strContents = objFile.ReadLine

If InStr(strContents, ClipboardText) Then
MyArray = Split(strContents, vbTab, -1, 1)

sData = sData & MyArray(1) & vbCrLf

End If

loop


I just realized that the code above only searches for matches of the entire text on the clipboard. However, in your example, Samuel, it appears you want the script to look for all individual word matches:

For example, instead of searching the glossary for "The quick brown fox" (which is what the code above does), you want to search for "The" and "quick" and "brown" and "fox", individually! Of course! I am sorry! I was thinking in terms of a TM instead of a glossary! I apologize for missing that!

The script needs to split the clipboard text into individual words, and then search each line of the glossary for each of the words on the clipboard...correct?

So something like this:

do until objFile.atEndOfStream
    strContents = objFile.ReadLine
    strClip = ""

    arrClips = Split(ClipboardText) 'Split clipboard text into individual words according to whitespace.

    MyArray = Split(strContents, vbTab, -1, 1) 'Split the next line of the glossary into its component parts (source word, target word, comment)

    strSource = MyArray(0)

    For Each strClip in arrClips 'Compare each word from the clipboard against the source word in the glossary

        If InStr(strSource, strClip) Then 'If there is a match, then

            sData = sData & MyArray(1) & vbCrLf 'Add the target word to our output data

        End If

    Next
loop


...would be better, I assume.

As it is now, the code simply displays the results in a Windows dialog box, but let me know if you prefer something else.

Please either post or e-mail a copy (or just part) of your glossary to me, and I'll see what I can do!

MGrant


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
@Michael Aug 5, 2011

Michael Grant wrote:
The script needs to split the clipboard text into individual words, and then search each line of the glossary for each of the words on the clipboard...correct?


Yes, that is the minimum it should do.

Your latest script works (I had to change arrClips to ArrClips, though), but it pops an error message whenever the glossary contains words that are not in the clipboard, or if the glossary has trailing lines (presumably any non-three-column line would cause the error too).

Also, "dog." is not matched, but that was to be expected since this is a simple, first version of the script (I suppose one could later use regex to split by word boundaries).

Also, the output does not display all three columns (which would be what I would want), but again I think that is easily solved when the time comes.
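
The three points above (short or blank glossary lines, trailing punctuation, and showing every column) could be handled along the following lines. This is a Python sketch, not a change to Michael's VBScript. The function names, the exact-word and lowercase matching, and the sample glossary are all assumptions made for illustration.

```python
import re

def parse_glossary(raw: str) -> list[list[str]]:
    """Split tab-delimited text into rows, skipping blank lines."""
    rows = []
    for line in raw.splitlines():
        if line.strip():
            rows.append(line.split("\t"))
    return rows

def words_in(text: str) -> set[str]:
    """Split on word boundaries so that "dog." yields "dog"."""
    # Lowercasing is an assumption; drop it for case-sensitive lookups.
    return {w.lower() for w in re.findall(r"\w+", text)}

def lookup(text: str, rows: list[list[str]]) -> list[str]:
    """Return every glossary row whose source term is a word in `text`."""
    words = words_in(text)
    hits = []
    for row in rows:
        if row[0].strip().lower() in words:
            # Join whatever columns the row has (one, two, three or more).
            hits.append("\t".join(row))
    return hits

# Made-up sample: a two-column line and a blank line mixed in with three-column lines.
sample = "dog\thond\tanimal\nfox\tvos\n\nlazy\tlui\tadjective\n"
for hit in lookup("The fox jumps over the lazy dog.", parse_glossary(sample)):
    print(hit)
```

Rows are returned in glossary order, and a row with fewer than three columns is simply shown with the columns it has instead of raising an error.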

I really enjoy testing the code, but unfortunately it takes a little too much of my time, and I have several looming deadlines now. So thanks for your help -- tis a pity we won't be able to complete this.


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
@Jabberwock Aug 5, 2011

Jabberwock wrote:
Even dirtier Perl script:


I tried this script two days ago, and it worked once. I could not get it to work a second time (and it didn't work a few times before it worked the first time). It displayed the matching glossary lines, but with a leading tab, and it included the clipboard text at the end of each line. It was nice that it stayed resident and needed not to be launched each time I wanted to check text.


 
Jaroslaw Michalak  Identity Verified
Poland
Local time: 10:20
Member (2004)
English to Polish
SITE LOCALIZER
End lines or encoding Aug 5, 2011

Samuel Murray wrote:
I tried this script two days ago, and it worked once. I could not get it to work a second time (and it didn't work a few times before it worked the first time). It displayed the matching glossary lines, but with a leading tab, and it included the clipboard text at the end of each line. It was nice that it stayed resident and needed not to be launched each time I wanted to check text.


Typically, most such problems occur when the line endings or the encoding differ from what Perl expects (as you can see, the script does no checking for those; it simply assumes default values). If you can send me a sample glossary (with a few sample lines to test), I could check it out...
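
To illustrate the kind of checking the script skips, here is a Python sketch (not Perl) that decodes a glossary while tolerating a UTF-8 byte order mark and Windows, Unix or old Mac line endings. The function name and the sample bytes are assumptions made for illustration.

```python
def read_glossary_bytes(data: bytes) -> list[list[str]]:
    """Decode glossary bytes and return rows of tab-separated fields."""
    # "utf-8-sig" strips a leading BOM if present and is harmless otherwise.
    text = data.decode("utf-8-sig")
    rows = []
    # splitlines() handles \r\n, \n and \r alike and drops the terminators.
    for line in text.splitlines():
        if line.strip():
            rows.append([field.strip() for field in line.split("\t")])
    return rows

# Made-up samples: the same two entries, once with BOM + CRLF, once plain with LF.
with_bom_crlf = b"\xef\xbb\xbfdog\thond\r\nfox\tvos\r\n"
plain_lf = b"dog\thond\nfox\tvos\n"
print(read_glossary_bytes(with_bom_crlf) == read_glossary_bytes(plain_lf))  # True
```

In Perl, a rough equivalent would be to open the file with an explicit encoding layer such as `<:encoding(UTF-8)` and to strip both `\r` and `\n` from each line, though that has not been tested against the script in this thread.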


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Own script (first, crummy version) Aug 6, 2011

I've spent a day or two writing the script myself (and learning some things in the process), and when I was almost done with it, I realised that I had done it the wrong way, so the script I present here is half commented and half very confusing to anyone looking at the code... but it works, so here it is:

http://wikisend.com/download/460940/glossary%20lookup%20version%201.zip (89 days)

It requires Windows, it doesn't have a virus, it assumes UTF8 with BOM for the glossary, it stays resident and is triggered by one of two shortcuts, and it (hopefully) displays the result in your browser.

The script breaks down the clipboard text by word, and then checks to see if any glossary lemma contains any of those words. I'm now going to write a new script that breaks down the glossary instead, and then checks the clipboard for the lemmas... this will (I think) enable multi-word lemmas.

Edited 5 minutes later:
...and I found a bug, yet again (it didn't check the first word of the clipboard).



[Edited at 2011-08-06 14:45 GMT]


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
My own script -- second, better version (please test) Aug 6, 2011

I've finished my own script (for now), and I invite anyone here to test it. I combined the two lookup methods, ironed out some bugs, and cleaned up the code (so it is easier to read, too).

http://wikisend.com/download/402094/glosslookup%20v2.zip (300 kb) (89 days)

GlossLookup looks up words in a sentence in a glossary, and looks up glossary terms in the same sentence. Essentially, the script does two separate types of searches, one after the other. In the first search, it checks to see if any of the whole source text of any glossary entries occurs in the sentence. In the second search, it checks if any of the words in the sentence occur in the source text of all glossary entries. The results are output to HTML, then opened in a browser.

1. I assume your glossary is named 'glossary.txt' and that it is in the script's own directory
2. I assume that your glossary is UTF8 with so-called BOM (can be changed in the code)

Shortcuts:
Ctrl + C = Copy selection to clipboard (normal Windows function)
WinKey + C = Look up the text on the clipboard
WinKey + V = Attempt to copy all (e.g. whole segment, whole page, etc.) before looking it up
WinKey + Q = Exit the program (or right-click its icon in the systray)

The glossary format is tab-delimited plain text, with at least two columns per line.
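
The two searches described above can be sketched as follows. This is an illustrative Python reconstruction based on the description only, not the code in the zip file. The function name `gloss_lookup`, the lowercase substring matching, the punctuation stripping and the sample glossary are assumptions, and the HTML output step is left out.

```python
def gloss_lookup(sentence: str, rows: list[list[str]]) -> list[list[str]]:
    """Return glossary rows found by either of the two searches, without duplicates."""
    lowered = sentence.lower()
    # Split on whitespace and strip common punctuation from each word.
    words = [w.strip(".,;:!?\"'()") for w in lowered.split()]
    words = [w for w in words if w]
    hits = []
    for row in rows:
        source = row[0].lower()
        if not source:
            continue
        # Search 1: the whole source term (possibly several words) occurs in the sentence.
        whole_term = source in lowered
        # Search 2: some word of the sentence occurs in the source term.
        word_in_term = any(w in source for w in words)
        if (whole_term or word_in_term) and row not in hits:
            hits.append(row)
    return hits

# Made-up glossary: tab-delimited lines already split into columns.
glossary = [
    ["lazy dog", "lui hond"],
    ["fox terrier", "foksterrier"],
    ["cat", "kat"],
]
for row in gloss_lookup("The quick brown fox jumps over the lazy dog.", glossary):
    print("\t".join(row))
```

Here "lazy dog" is found by the first search and "fox terrier" by the second. Because both searches use plain substring matching in this sketch, short words can still produce false positives.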


 
Glossum with OmegaT plugin can now implement basic glossary search Oct 2, 2011

The Glossum service has developed the first version of a plugin that integrates Glossum lookup into OmegaT. Users can now look up values stored on Glossum without needing to export files and juggle formats.

Explanation how it works can be found here:
http://glossum.com/omegat

For the moment the results show up in the Machine Translation pane of OmegaT; in the next version of the plugin the results will move to the Glossary pane.


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Can't get glossum to work Oct 4, 2011

Glossum wrote:
Explanation how it works can be found here:
http://glossum.com/omegat


The plugin is a zip file that (I think) must be renamed to .jar before adding it to the plugins directory. When it is a jar file, OmegaT refuses to launch (no error messages). When it is a zip file or unzipped, OmegaT launches as usual but the Glossum entry is not present in OmegaT's MT menu.

Samuel


 
Getting Glossum to work Oct 6, 2011

Hello Samuel,

The plugin is a jar file (http://www.glossum.com/OmegaT_plugin_Glossum.jar), and it should remain one. Please place it in the plugin directory. You should then get the Glossum option in the Machine Translation menu.

Please let me know if that does not work for you.

Regards,
Pavel


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
On Glossum Oct 6, 2011

Glossum wrote:
The plugin is a jar file (http://www.glossum.com/OmegaT_plugin_Glossum.jar), and it should remain such. Please place in the plugin directory. You should get the Glossum option in the Machine Translation menu. ... Please let me know if that does not work for you.


A developer from OmegaT told me that the problem is that OmegaT is compiled with Java 1.5 and that your plugin is compiled with Java 1.6. Do you think that that would cause the problems that I have had (e.g. OmegaT simply not starting)?
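
One way to check which Java version a plugin was compiled for is to read the class-file header inside the jar: bytes 6 and 7 of every .class file hold the major version, where 49 corresponds to Java 5 (1.5) and 50 to Java 6 (1.6). The Python sketch below does this. The function name `class_versions` is hypothetical, and the demo builds a tiny in-memory jar rather than reading the real Glossum plugin.

```python
import io
import struct
import zipfile

JAVA_NAMES = {49: "Java 1.5", 50: "Java 1.6"}  # class-file major versions

def class_versions(jar_bytes: bytes) -> dict[str, int]:
    """Map each .class entry in a jar to its class-file major version."""
    versions = {}
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            header = jar.read(name)[:8]
            if len(header) < 8:
                continue
            # Header layout: u4 magic, u2 minor version, u2 major version (big-endian).
            magic, _minor, major = struct.unpack(">IHH", header)
            if magic == 0xCAFEBABE:
                versions[name] = major
    return versions

# Demo: a fake jar holding only an 8-byte class header marked as major version 50.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("Demo.class", struct.pack(">IHH", 0xCAFEBABE, 0, 50))
for name, major in class_versions(buf.getvalue()).items():
    print(name, JAVA_NAMES.get(major, f"major version {major}"))  # Demo.class Java 1.6
```

A Java 5 runtime rejects classes with major version 50, so a mismatch of this kind would be consistent with a program that fails at start-up.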


 
The re-compiled plugin Oct 9, 2011

Hello Samuel,

Indeed, the problem was with Java incompatibility, thank you for pointing that out.

We have recompiled the plugin in Java 1.5 and placed the new version on the site.

Please let me know if you have any other problems.

Regards,
Pavel


 
Samuel Murray  Identity Verified
Netherlands
Local time: 10:20
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Glossum still not working Oct 10, 2011

Glossum wrote:
We have recompiled the plugin in Java 1.5 and placed the new version on the site.


Thanks. I have since discovered that the "home" directory is the directory in Program Files, not the directory in Documents and Settings. If the username/password in the glossum.properties file is wrong (or if the file can't be found), it will say "Login Failed". Also, lookup takes 3 seconds (compared to Google Translate which takes half a second).

I came to the point of not getting "Login Failed" and still getting no matches (just "Glossum Lookup" with a blank line above it). I tried saving glossum.properties in various encodings. I have two glossaries on Glossum (unable to delete the one) and even if both glossaries contain a term, it still doesn't show up in OmegaT. I tried making sure the project's languages are the same as the languages of the glossary in Glossum (tried various ways of writing it) -- and eventually it worked, if I used *uppercase* language codes in OmegaT.

Still, it doesn't quite work, and I'll post a message about it on the OmegaT dev list.


 
New version of plugin Oct 16, 2011

Hello Samuel,

Thanks for pointing out the problems. We have now released a new version of the plugin with the fixes (please download and replace the older one):

1. The garbage values are no longer returned; they came from a test glossary that we embarrassingly forgot to remove.
2. If the credentials file is not found, the appropriate error is displayed.
3. The language code can now be in either case.

Please let me know whether you have any other issues with the plugin.

Thanks for the cooperation,
Pavel


 
Update: Glossum and OmegaT 2.5.0_3 Jan 6, 2012

Update: as of the latest stable version of OmegaT (2.5.0_3), the results from Glossum are displayed in the Glossary pane together with the results from standard glossaries. The details are here: http://glossum.com/omegat

(Glossum now also supports a simple integration with SDL Multiterm via the SDL Multiterm Widget; details here: http://glossum.com/sdl).


 