Topic: Archiving online resources

CharlesL (Administrator)
Archiving online resources
« on: May 15, 2020, 02:00:09 pm »
Phyllis:  I wasted hours of my time late last night attempting to download software that actually collects web sites.  What is happening is that the design of that software is fighting a change in Internet protocols, and as a result I cannot get some of it to actually download a web site; all sorts of info we wish to save for the library is blocked.  CharlesL, you might help us if you can figure out what process, one we do not know about, we can use to download Internet sites for archive storage and later processing for public use.  There is software that can do this, but where is it and what is its name?  A huge institution is already doing this, filling a warehouse of archive boxes and folders, to do just what we need to do:  preserve an entire web site as a download that allows it to be reproduced for historical research.  We need copies of that software, and fast, to archive hundreds of useful web sites and their data.

A standard format used for archiving websites is WARC (Web ARChive, ISO 28500). WARC is used by the Internet Archive, Library of Congress, and many national libraries. There are many free programs that can crawl a website and generate a WARC file with the contents.
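
As a rough illustration of what "crawl a website and generate a WARC file" can look like in practice, here is a minimal Python sketch that builds a GNU Wget command line. Wget is only one example of such a program and is my assumption here (it must be installed separately, version 1.14 or later for WARC support). The function name, the example.org address, and the output file name are all hypothetical placeholders.

```python
import shlex


def build_wget_warc_command(url, warc_name, wait_seconds=1):
    """Return the argument list for a wget crawl that also writes a WARC file.

    Assumes GNU Wget 1.14+ is installed; nothing is downloaded by this function.
    """
    return [
        "wget",
        "--recursive",               # follow links within the site
        "--level=inf",               # no limit on link depth
        "--no-parent",               # stay at or below the starting directory
        "--page-requisites",         # also fetch images, CSS and scripts pages need
        f"--wait={wait_seconds}",    # pause between requests to go easy on the server
        f"--warc-file={warc_name}",  # write <warc_name>.warc.gz
        "--warc-cdx",                # also write a CDX index of the WARC records
        url,
    ]


# Hypothetical site and file name, for illustration only.
example_cmd = build_wget_warc_command("https://example.org/", "example-org")
print(shlex.join(example_cmd))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run(example_cmd)` to start the crawl from Python. The resulting .warc.gz file should then be readable by WARC-aware tools.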

For one that looks easiest to use, I suggest Web Archiving Integration Layer (WAIL): https://machawk1.github.io/wail/
WAIL is an application you can run on your computer to archive websites in WARC format, and browse the resulting archives. It uses some of the tech used by Internet Archive's Wayback Machine.

There is also a Chrome extension, WARCreate, which lets you archive one page at a time as you browse in Chrome/Chromium. The archive it generates includes the images and other resources used by the page you are viewing. To view the archive you need another program, such as WAIL.

Ron Besser (Administrator)
Re: Archiving online resources
« Reply #1 on: May 15, 2020, 04:34:03 pm »
I am most grateful for this, Charles.  We all thank you.  Ron
« Last Edit: May 15, 2020, 04:46:30 pm by Ron Besser »
Located in Historic York, Pennsylvania