February 14, 2020
Each Wayback Machine snapshot creates a small directory tree inside a parent folder, and each parent folder is named after the snapshot timestamp, as in the 19970616061813 example tree below.
➜ Blot tree /Users/admin/Desktop/19970616061813
/Users/admin/Desktop/19970616061813
└── dev
    └── techsupport
        └── insidemac
            └── QuickDraw3D
                └── chap18_pointingdevice
                    └── QuickDraw3D-1087.html
The challenge with any download found on this site is that the child directory must be isolated from its parent before a merge can occur: because each parent has a unique name, the snapshots cannot be merged from the parent level. Once the merge occurs, an approachable, readable directory emerges.
At first, ditto was considered, but it appears best suited to a single source and destination folder. Until a diff merge or other merge tool is found, child folders must be isolated from the parent and manually dropped onto an FTP server. A properly configured FTP client does a masterful job of merging, and the result is a directory tree worthy of inspection.
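As a possible stand-in for the manual FTP step, the isolate-and-merge idea can be sketched in shell. This is only a sketch under assumptions: `merge_snapshots` is a hypothetical helper name, the parent folders are assumed to have all-digit timestamp names, and when two snapshots contain the same file the later snapshot silently overwrites the earlier one.

```shell
# merge_snapshots SRC DEST
# Copies the contents of every timestamp-named folder under SRC into DEST,
# so the child trees end up merged under one root.
merge_snapshots() {
    local src="$1" dest="$2" snap
    mkdir -p "$dest"
    # Globs expand in sorted order, so older snapshots are copied first.
    for snap in "$src"/*/; do
        # Skip anything whose name is not purely digits (assumed timestamp form).
        case "$(basename "$snap")" in
            *[!0-9]*) continue ;;
        esac
        [ -d "$snap" ] || continue
        # "$snap" ends in "/", so "$snap." means "the contents of this folder".
        cp -R "$snap". "$dest"/
    done
}
```

A call might look like `merge_snapshots /path/to/parent/dir /path/to/merged`, where both paths are placeholders.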
wayback_machine_downloader -s -d devworld.apple.com_onlybin_IA -t2003 -c6 --only "/\.(smi|hqx|sit|dd|pkg|abs|bin|sea|cpt|dmg)$/i" devworld.apple.com
tree -N -shQDF --charset utf-8 /path/to/parent/dir > devworld.apple.com_onlybin_IA_tree.txt
find /path/to/parent/dir -not -name '.*' -type f -exec basename {} \;
Out of curiosity, this capture deliberately scraped beyond the scope of file types usually considered.
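To check which file types a capture actually brought back, the find listing can be reduced to a count per extension. The following is a rough sketch, not part of the original workflow; `count_extensions` is a hypothetical helper name, and it assumes file names contain no newlines.

```shell
# count_extensions DIR
# Prints "COUNT EXTENSION" lines for the regular, non-hidden files under DIR,
# most frequent extension first. Extensions are compared case-insensitively.
count_extensions() {
    find "$1" -type f -not -name '.*' \
        | sed -n 's/.*\.\([^./]*\)$/\1/p' \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c | sort -rn
}
```

Running `count_extensions /path/to/parent/dir` against the download folder gives a quick view of how far the capture strayed from the intended extension list.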
If you are interested in the Wayback Machine, learn more about it and the Wayback Machine API.
Below is a curious URL that provides a way to filter results (e.g. by ‘txt’):
http://web.archive.org/*/www.yoursite.com/*
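The same kind of filtering can also be attempted from the command line through the Wayback Machine's CDX search endpoint. The sketch below only builds the query URL and makes no request; `cdx_query_url` is a hypothetical helper name, and the parameter choices (`matchType=prefix`, a regex `filter` on the `original` field, `fl=timestamp,original`) are assumptions based on the public CDX server documentation rather than anything tested here.

```shell
# cdx_query_url SITE EXT
# Prints a CDX search URL listing captures under SITE whose original URL
# ends in ".EXT". The URL is only printed, not fetched.
cdx_query_url() {
    local site="$1" ext="$2"
    # Regex filter on the "original" field: anything ending in .EXT
    local filter="original:.*\\.${ext}\$"
    printf '%s?url=%s&matchType=prefix&filter=%s&fl=timestamp,original\n' \
        'http://web.archive.org/cdx/search/cdx' "$site" "$filter"
}
```

The printed URL could then be handed to a client such as `curl -s "$(cdx_query_url www.yoursite.com txt)"`; some clients may need the regex characters percent-encoded.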