WFC not cleaning up files
Thread poster: neilmac
Local time: 01:25
Spanish to English
Nov 5, 2019

I've recently been sent some Word files to translate, which the client had converted from Excel. Wordfast Classic isn't cleaning them up, and the report file shows this:

segments added: 0
segments updated: 0
bad segments: 0
                   segments   words   char.
SOURCE (0 tags)    0          0       0
TARGET (0 tags)    0          0       0

Jean Lachaud
Local time: 19:25
English to French
You did select... Nov 5, 2019

... the documents to be cleaned up in Tools | Docs, and checked at least one box at the bottom of that tab, didn't you?

neilmac
Local time: 01:25
Spanish to English
If at first you don't succeed… Nov 5, 2019

Jean Lachaud wrote:

... the documents to be cleaned up in Tools | Docs, and checked at least one box at the bottom of that tab, didn't you?

Yes, and I just did it again, with the same result. The funny thing is, this is the third document of roughly 30. The first one I translated didn't clean up, so I extracted the contents as TXT only, then translated and cleaned up the document no problem. I told the client about it, and he sent me the second document in an earlier format, Word 2003, which I managed to translate and clean up no problem. However, the one I've just done is the third, and I don't understand why it wouldn't clean up today, when it did so yesterday, no problem. At the moment, I'm working round the problem by simply going through the document and removing all the tags and source segments, but this will add about 10 minutes on to every job, so I'd like to find out why this is happening. The client wants to be able to paste the translated text back into Excel files, which is why I can't just give him the translated text unformatted.

Seems to be some kind of compatibility issue. I tried on another PC as well and got the same result. Removing the tags is taking forever...

[Edited at 2019-11-05 21:00 GMT]

Jean Lachaud
Local time: 19:25
English to French
Did you try... Nov 5, 2019

... restarting Word? Rebooting your computer? Re-installing WfC?

I know, it sounds corny, and you most likely did all that.

What version of WfC are you using?

Samuel Murray
Local time: 01:25
Member since 2006
English to Afrikaans
@Neilmac Nov 6, 2019

neilmac wrote:
WFC isn't cleaning them up...

It happens now and then that WFC doesn't clean a file.

Cleaning has two functions, namely to add translations to the TM and to remove the source text and hidden tags. When cleaning up doesn't work for me, to get the former, I simply copy everything to Notepad and then copy everything back into a blank Word file, and clean it up. To get the latter, I use find/replace to ensure that all text except that which should remain, is hidden, and then I delete all hidden text using find/replace.

neilmac wrote:
At the moment, I'm working round the problem by simply going through the document and removing all the tags and source segments, but this will add about 10 minutes on to every job, so I'd like to find out why this is happening.

Five to ten minutes sounds about right. Do you know how to use find/replace with wildcards and how to specify formatting in find/replace? You're not removing the tags manually (one by one), surely!

[Edited at 2019-11-06 06:50 GMT]

neilmac
Local time: 01:25
Spanish to English
Didn't work either Nov 6, 2019

Yes, I removed the tags manually. It took me about half an hour, which is too long. I tried your suggestion (copy everything to Notepad and then copy everything back into a blank Word file), which I assume means selecting everything in the translated/uncleaned document and pasting it into a .txt file. That does remove everything, but unfortunately it also removes the vestigial Excel template (the tables/text boxes), which the client needs in order to paste the translated text back into the original Excel files.
What I don't understand is that a similar file, the second one the client sent me, cleaned up no problem, unlike the first and third ones. At the end of the day, with this type of text (verbatim interviews), I can most likely translate them more quickly without using WFC and the TMs, as the only things that get repeated are brief answers like "No/I don't know". However, I'd still like to be able to solve the original issue without workarounds.

As for "find/replace with wildcards and how to specify formatting in find/replace", I know how to use find/replace, but the wildcards bit I'm not so sure about. Or how to specify formatting.

[Edited at 2019-11-06 09:03 GMT]

Samuel Murray
Local time: 01:25
Member since 2006
English to Afrikaans
@Neilmac Nov 6, 2019

neilmac wrote:
I tried your suggestion (copy everything to Notepad and then copy everything back into a blank Word), which I assume means to select everything in the translated/uncleaned and paste it into a .txt file, which does remove everything, but unfortunately it also removes the "table/textboxes" vestigial Excel template which the client needs in order to be able to paste the translated text back into the original Excel files.

The "pasting into Notepad" method is simply for if you need to update the TM. You can't use that method if you need to deliver to the client.

neilmac wrote:
What I don't understand, is that a similar file, the second one the client sent me, cleaned up no problem, unlike the first and third ones.
However, I'd still like to be able to solve the original issue without workarounds.

Yes, that's Word, and that's WFC. There is just something odd about that file, and no-one knows what. Sorry to be so negative, but this is standard WFC wisdom: Word is a very complex tool and many things can go wrong, so it might take an expert at Word to figure out what is wrong with that one particular Word file.

neilmac wrote:
I know how to use find/replace, but the wildcards bit I'm not so sure about. Or how to specify formatting.

It's easy to understand but hard to explain.

But if you're confident that all the tags and all the source text in your file are marked as "hidden text", and no other text is marked as "hidden text", then you can do this simple find/replace:

In the Find/Replace dialog (Ctrl+H), click the "More" button to expand the dialog and check the "Wildcards" option. Put a single question mark in the "Find" field. Then, after making sure your cursor is in the Find field, click the Format button, select Font, and make sure "Hidden" is checked (you may have to click it several times until it is checked). Finally, make sure the Replace field is empty and does not mention any formatting, and press Replace All.

(I use this method often, since I often need to clean up files with tracked changes in them, and as you know, WFC itself refuses to clean up files with tracked changes in them.)

The question mark means "any single character" in a wildcards search. If you specify the Find formatting as "hidden" and you make sure the Replace field is empty *and* has no formatting specified, such a search will delete all characters with the format specified in the Find field. If you specify a formatting in the Replace field but no text, it'll replace all characters' formatting, which isn't what you want. An asterisk means "any string of characters", but it doesn't work on all text (e.g. it doesn't work inside text boxes). Remember, a find/replace operation is performed on all visible text, so make sure hidden text is visible first (Ctrl+comma on my computer).

If later, you have more time, experiment with this:


[Edited at 2019-11-06 11:15 GMT]

neilmac
Local time: 01:25
Spanish to English
Cheers Nov 6, 2019

OK. The client just sent me another file - shorter this time. The same thing happened, so I'm removing the tags by find and replace, but that still leaves the source text segments that need to be removed manually. I'm almost finished now, but I'll try your wildcard suggestion later. Thanks for helping out with this, I really appreciate it.

neilmac
Local time: 01:25
Spanish to English
WTF? Nov 6, 2019

I was halfway through removing the source text and tags manually when I stopped to look at the forum posts. After responding to Samuel, I answered an email and then went back to the semi-cleaned document. However, it seems to have cleaned itself up while I was away somehow. I'm totally baffled, but something I did must have worked. Unless it's WF fairies... Whatever, it's a win for me. I must try to remember exactly what I did...

It's the Control+comma thing. The original text is still there, but hidden. Now all I need to do is delete the hidden text and that should be it....

[Edited at 2019-11-06 16:43 GMT]

Can't seem to delete the hidden text, despite following all the instructions. Need to walk the dog now before nightfall, I'll have another go when I get back....

[Edited at 2019-11-06 16:50 GMT]

Samuel Murray
Local time: 01:25
Member since 2006
English to Afrikaans
Manually deleting WFC source text, part 1 Nov 6, 2019

To manually delete the source text and purple tags (this method only works if the purple tags are still there, although they don't have to be purple).

1. Make sure the source text and purple tags are visible (try pressing Ctrl+comma a few times).

2. Press Ctrl+H to open the Find/Replace dialog.

3. Click the "More" button to expand the Find/Replace dialog.

4. Put a checkmark next to "Wildcards", and type \{0\>*\{\> in the Find field. Leave the Replace field empty.

5. Replace all.

6. Now, remove the checkmark next to "Wildcards", and type <0} in the Find field.

7. Replace all.

Optional: if you want to test it first to see whether it will delete the right stuff, make sure your highlight colour is not white, then in the Find/Replace dialog, click inside the Replace field and go to Format > Highlight. The word "Highlight" should then appear directly underneath the Replace field. Then replace all: it will highlight all the text that would otherwise have been deleted. To remove formatting from either the Find or Replace field, click inside that field and click the "Remove formatting" button.
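
For anyone who prefers to see the logic of the two passes spelled out, here is a rough Python sketch of the same idea, run on plain text rather than inside Word. It assumes the usual WFC segment layout of {0>source<}score{>target<0}; the function name strip_segments and the sample string are made up for illustration, and the regular expression is only an approximation of Word's wildcard search, not what Word or WFC runs internally.

```python
import re

# Assumed segment layout: {0>source<}score{>target<0}
# Pass 1 pattern: from "{0>" up to and including the next "{>".
# ".*?" stands in for Word's "*" wildcard (shortest possible match).
SOURCE_PART = re.compile(r"\{0>.*?\{>")
CLOSING_TAG = "<0}"


def strip_segments(text: str) -> str:
    """Mimic the two Replace All passes on a plain-text string."""
    # Pass 1 (wildcard search): remove opening tag, source text and middle tag.
    text = SOURCE_PART.sub("", text)
    # Pass 2 (ordinary search): remove the closing tag.
    return text.replace(CLOSING_TAG, "")


sample = "{0>Hola mundo.<}100{>Hello world.<0} {0>No.<}0{>No.<0}"
print(strip_segments(sample))  # prints: Hello world. No.
```

Unlike the Word method, this throws away all formatting, tables and text boxes, so it is only useful for understanding what the two searches match, not for producing a file to deliver.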

Samuel Murray
Local time: 01:25
Member since 2006
English to Afrikaans
Manually deleting WFC source text, part 2 Nov 6, 2019

To manually delete the source text (this method only works if the source text is marked as hidden text; note that it deletes all hidden text, which includes the purple tags, if any, but also any other text that the client might have hidden).

1. Make sure the hidden text is visible (try pressing Ctrl+comma a few times).

2. Press Ctrl+H to open the Find/Replace dialog.

3. Click the "More" button to expand the Find/Replace dialog.

4. Put a checkmark next to "Wildcards", type a question mark in the Find field, and then click the Format button and select "Font".

5. In the Font dialog that pops up, make sure there is a checkmark next to "Hidden text". You may have to click it a few times to get the checkmark.

6. Click OK to get back to the Find/Replace dialog. The word "Hidden" should now appear underneath the Find field.

7. Make sure the Replace field is empty and has no formatting specified, then Replace all.
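
To make the caveat about the client's own hidden text concrete, here is a small Python sketch of what this search does. It is a toy model only: the Run class, delete_hidden and visible_text are invented for illustration and are not Word's or WFC's object model. Each run is a piece of text that carries a "hidden" flag, and the find/replace simply drops every run that has it.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical stand-in for a stretch of text with one formatting flag."""
    text: str
    hidden: bool = False


def delete_hidden(runs):
    """Keep only the runs that are not marked as hidden."""
    return [run for run in runs if not run.hidden]


def visible_text(runs):
    """Join the text of all runs into one string."""
    return "".join(run.text for run in runs)


runs = [
    Run("{0>", hidden=True),            # opening tag
    Run("Hola.", hidden=True),          # source text
    Run("<}100{>", hidden=True),        # middle tag
    Run("Hello."),                      # translation, not hidden
    Run("<0}", hidden=True),            # closing tag
    Run(" [client note]", hidden=True), # text the client hid on purpose
]

print(visible_text(delete_hidden(runs)))  # prints: Hello.
```

The client's note disappears along with the tags and the source text, which is why this method is only safe when nothing else in the file is hidden.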

neilmac
Local time: 01:25
Spanish to English
Thanks for the heads-up! Nov 7, 2019

Cheers Samuel, I'll try that out when the next file arrives, probably later this evening...

[Edited at 2019-11-07 16:09 GMT]

neilmac
Local time: 01:25
Spanish to English
FWIW Sep 12, 2022

Three years later, and I just received something similar. The same thing is happening again: after translating the document, it doesn't seem to want to clean up. I'm just back from a funeral and the booze-up afterwards, so I'm a bit woozy, so maybe it's me.

I'll copy the original document into a text-only format, save it in Word format under another name, then run WFC and the clean-up on it and see what happens.