Extracting Text from a Website

Basically, I'm merging one website into another, existing website.

I wondered whether there is a "magical" tool that could extract the text from the pages of the old site so that I could easily copy and paste it into the new site.
 

webgeek

Are the old pages pure HTML, or are they generated from a database using PHP/MySQL or similar technology?

If the former, you could use FTP or any mirroring software such as WinHTTrack to pull down and archive all the files locally, then upload them to the new site by FTP.

If you could neatly plop that site into a subfolder on the new site, you wouldn't have nearly as big a task updating the links. There are some easy, free tools that will search all files and replace a string.
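
For anyone who would rather script the search-and-replace step than install a tool, here is a minimal Python sketch of the idea. It is only an illustration: the function name `replace_in_files`, the folder name `mirror`, and the `http://old.example/` address are made up for the example, and it assumes the string you are replacing is plain ASCII (such as a URL prefix). Work on a copy of the files, since it overwrites them in place.

```python
from pathlib import Path


def replace_in_files(root, old, new, suffixes=(".html", ".htm")):
    """Replace `old` with `new` in every matching file under `root`.

    Returns the list of files that were changed. Files are handled as
    bytes so that pages in other ASCII-compatible encodings are not damaged.
    """
    old_bytes = old.encode("utf-8")
    new_bytes = new.encode("utf-8")
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Skip folders and anything that is not an HTML file.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        data = path.read_bytes()
        if old_bytes in data:
            path.write_bytes(data.replace(old_bytes, new_bytes))
            changed.append(path)
    return changed


# Hypothetical usage: point old absolute links at a subfolder of the new site.
# replace_in_files("mirror", "http://old.example/", "/old/")
```

The returned list makes it easy to check which pages were touched before uploading anything.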
 

Clinton

    webgeek said:
    "you could use...winhttrack...

    ...There are some easy free tools that will search all files and replace a string."
    Does he need another tool? If I remember correctly, WinHTTrack offers an option to rewrite links as absolute URLs and to make other path changes, but it's been a while since I used the software.
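
To show what "making links absolute" means in practice, here is a rough Python sketch. It is not how WinHTTrack does it, just the same idea done by hand with the standard library; the function name `absolutize_links` and the `http://old.example/` address are invented for the example, and the simple pattern only handles quoted `href` and `src` attributes.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src='...' and captures the prefix, the quote, and the target.
LINK_ATTR = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Rewrite relative href/src targets in `html` as absolute URLs.

    `base_url` is the address the page was originally served from.
    Targets that are already absolute are left unchanged by urljoin.
    """
    def fix(match):
        prefix, quote, target = match.groups()
        return f"{prefix}{quote}{urljoin(base_url, target)}{quote}"

    return LINK_ATTR.sub(fix, html)


# Hypothetical usage:
# absolutize_links('<a href="about.html">About</a>', "http://old.example/site/index.html")
```

A proper HTML parser would be more robust than a regular expression on messy pages, so treat this as a starting point only.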
     

    UKcentric

    How is your existing site built, and how many pages are there? The options for extracting the data depend entirely on that.

    Assuming you have no access to the site files or Content Management System, you would have to scrape the site. Scraping means using a piece of software to grab text from all or parts of each page.
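
As a rough illustration of what a scraper does under the hood, here is a small Python sketch that pulls the visible text out of a page using only the standard library. The names `TextExtractor`, `html_to_text`, and `fetch_text` are made up for this example, and it makes a simplifying assumption: everything outside `<script>` and `<style>` counts as text, including menus and footers, so you would still need to tidy up the result.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text of an HTML string, one text fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_text(url):
    """Download a page and return its text (assumes UTF-8 if no charset is sent)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))
```

`html_to_text` also works on files you have already downloaded, so the same function can be run over a local copy of the site.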

    HTTrack (http://www.httrack.com/) is a piece of software that will download the whole site to your local computer as raw HTML files. The user interface leaves a lot to be desired and can be confusing for beginners.

    There are many browser extensions available for Chrome, Firefox, etc. that will scrape pages. Go to the Google Chrome Web Store and search for "scraper".

    If there are only a few pages then the quickest and easiest way would be to copy and paste the text by hand.
     