Dear colleagues,
I thought some of us would find this piece on the Wayback Machine,
which just appeared in the New Yorker, interesting. This "machine" is
intended to archive all World Wide Web content. It has already
archived more than four hundred and thirty billion Web pages!
http://www.newyorker.com/magazine/2015/01/26/cobweb
Regards,
Ramzi Mabsout
<<The Wayback Machine is a Web archive, a collection of old Web pages;
it is, in fact, the Web archive. There are others, but the Wayback
Machine is so much bigger than all of them that it’s very nearly true
that if it’s not in the Wayback Machine it doesn’t exist. The Wayback
Machine is a robot. It crawls across the Internet, in the manner of
Eric Carle’s very hungry caterpillar, attempting to make a copy of
every Web page it can find every two months, though that rate varies.
(It first crawled over this magazine’s home page, newyorker.com, in
November, 1998, and since then has crawled the site nearly seven
thousand times, lately at a rate of about six times a day.) The
Internet Archive is also stocked with Web pages that are chosen by
librarians, specialists like Anatol Shmelev, collecting in subject
areas, through a service called Archive It, at archive-it.org, which
also allows individuals and institutions to build their own archives.
(A copy of everything they save goes into the Wayback Machine, too.)
And anyone who wants to can preserve a Web page, at any time, by going
to archive.org/web, typing in a URL, and clicking “Save Page Now.” >>