Can you download a whole website in one go? Thread poster: Christopher Schröder
I have been asked to check the English version of a bilingual website against the foreign version. The customer wants me to send my corrections as tracked changes in Word. Is there an easy way to extract the text from all pages of the website into one or more Word files? Or do I have to go to every page individually, save it on my computer as HTML and then open it in Word? Thanks for any ideas!
DZiW (X) Ukraine English to Russian + ...
There are many similar tools, but my favourite is Teleport.
A couple of things... | Oct 26, 2011 |
You could try the HTTrack Website Copier, here: http://www.httrack.com/ Or, if you have the Windows command-line version of "wget", you can use that: http://www.gnu.org/software/wget/ Both ways download the HTML files, which you would then have to open in Word... There are other shareware and freeware applications out there as well; just Google for "website downloader"...
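For the wget route, a minimal sketch of how a text-only recursive download might be put together is shown below. The start URL and the depth of 3 are placeholders, not values from this thread, and the helper name `build_wget_command` is made up for illustration. Note that `--accept=html,htm` matches on file name endings, so pages whose URLs do not end in `.html` or `.htm` may be skipped; treat this as a starting point rather than a recipe.

```python
import shlex


def build_wget_command(start_url, depth=3):
    """Return a wget argument list for a recursive, HTML-only download.

    start_url and depth are assumptions for illustration; adjust both
    to the site being checked.
    """
    return [
        "wget",
        "--recursive",        # follow links starting from start_url
        f"--level={depth}",   # limit how many link levels are followed
        "--no-parent",        # do not climb above the start directory
        "--accept=html,htm",  # keep only files ending in .html or .htm
        start_url,
    ]


if __name__ == "__main__":
    # Placeholder URL; print the command so it can be pasted into a terminal.
    command = build_wget_command("http://www.example.com/en/")
    print(shlex.join(command))
```

The script only prints the command line; it does not run wget itself, so it can be reviewed before anything is downloaded.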
Max Chernov Russian Federation Local time: 23:01 Russian to German + ... There are many possibilities... | Oct 26, 2011 |
That is, there are many programs that let you make a complete copy of a website: Offline Explorer Pro, Teleport Pro, WebCopier...
David Wright Austria Local time: 22:01 German to English + ...
[The customer] should have the files you need, rather than expecting you to download them yourself (as far as I know, though I'm no expert, you have to do it page by page).
You could use this program | Oct 26, 2011 |
http://www.httrack.com/ It will download the entire website; afterwards you can open the HTML files you need in MS Word. Mind that it will download the entire website, not just the files you need. Cheers S
Thanks Stanislav but... | Oct 26, 2011 |
I tried HTTrack but gave up after an hour as it downloads everything, including big PDFs and images, and I only want the text!
Samuel Murray Netherlands Local time: 22:01 Member (2006) English to Afrikaans + ...
Chris S wrote: I tried HTTrack but gave up after an hour as it downloads everything, including big PDFs and images, and I only want the text!
Surely there is an option in the download task to specify which files should (or should not) be downloaded? You should also be able to set the crawl depth (how many subdirectories down) and whether files from other domains should be downloaded or not.
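To make the three kinds of filter mentioned here concrete (file type, crawl depth, other domains), the following is a rough sketch of the decision a downloader makes for each link. It is not HTTrack's actual implementation or option syntax; the function name, the extension list and the default depth are all assumptions chosen for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Assumed list of file types to skip; extend it as needed.
SKIPPED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".gif", ".zip"}


def should_download(url, start_url, link_depth, max_depth=3):
    """Decide whether a crawler should fetch url (illustrative only).

    link_depth is how many links away from start_url the page was found.
    """
    target = urlparse(url)
    start = urlparse(start_url)

    # Filter 1: stay on the same domain as the start page.
    if target.netloc.lower() != start.netloc.lower():
        return False

    # Filter 2: stop once the crawl depth limit is exceeded.
    if link_depth > max_depth:
        return False

    # Filter 3: skip unwanted file types such as PDFs and images.
    extension = posixpath.splitext(target.path)[1].lower()
    return extension not in SKIPPED_EXTENSIONS
```

With filters like these, a page such as `http://www.example.com/en/about.html` would be fetched, while a PDF on the same site or any page on another domain would be skipped.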
Thanks Samuel | Oct 26, 2011 |
You're right, and now I have the files! But I still have to get them into Word, where every file is grey text on a black background and contains the whole menu and side bars and everything. If only people still used frames, eh?
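One possible way around the styling and menu problem is to strip each downloaded page down to plain text before opening it in Word. The sketch below uses only Python's standard library. It assumes the menus and side bars are marked up with tags such as `nav`, `header`, `footer` or `aside`; on sites that use plain `div` elements for these, the skip list would need adapting, so treat the class and function names here as illustrative.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring scripts, styles and navigation blocks."""

    # Assumed set of tags whose content is not wanted in the output.
    SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The resulting plain text carries no colours or layout, so it can be saved as a `.txt` file and opened in Word for tracked changes.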