Time |
Nickname |
Message |
00:26
π
|
|
ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
00:31
π
|
|
ephemer0l has joined #archiveteam-bs |
01:52
π
|
|
killsushi has joined #archiveteam-bs |
02:32
π
|
Somebody2 |
continuing from #archiveteam, Raccoon |
02:32
π
|
Raccoon |
ok |
02:33
π
|
Somebody2 |
there's no index of stuff that was grabbed by archiveteam and not put up on archive.org -- that's correct. |
02:33
π
|
Raccoon |
that's a very specific request I didn't make though :) |
02:33
π
|
Somebody2 |
The archiveteam wiki might provide some vague hints |
02:33
π
|
Raccoon |
I see what you did there |
02:34
π
|
Somebody2 |
and as for an index of stuff grabbed by archiveteam and put up on archive.org -- for that, you can just use the archive.org search |
02:34
π
|
Raccoon |
so what about the stuff that archiveteam grabbed, while forgetting that archive.org even exists |
02:34
π
|
Somebody2 |
that would only be documented on the archiveteam wiki |
02:35
π
|
Somebody2 |
generally, most efforts by archiveteam are documented there |
02:35
π
|
Raccoon |
ok, so there is a complete file index somewhere on the wiki |
02:35
π
|
Somebody2 |
but not particularly completely or consistently |
02:35
π
|
Somebody2 |
a file index? heck no. It's an index of *projects* -- i.e. "the time when we grabbed this recipe website" |
02:36
π
|
Somebody2 |
or "the time we got yelled at by Apple for trying to grab something they were trying to remove" |
02:36
π
|
Somebody2 |
(these are hypothetical examples) |
02:36
π
|
Raccoon |
most projects are comprised of files with file names, right? |
02:37
π
|
Somebody2 |
most projects are lists of URLs that are grabbed and stored into WARC files. |
02:37
π
|
Somebody2 |
and then posted to archive.org and included in the Wayback Machine |
02:37
π
|
Somebody2 |
that's the usual flow |
02:37
π
|
Raccoon |
hmm |
02:38
π
|
Raccoon |
that's the kicker. you can't really search for a *\filename.here query on archive.or |
02:38
π
|
Raccoon |
.org |
02:38
π
|
Somebody2 |
yes you can |
02:38
π
|
Raccoon |
one needs to know the domain name for which they want to find said file |
02:38
π
|
Somebody2 |
ah, not in the Wayback Machine, I see |
02:38
π
|
Somebody2 |
yeah, the Wayback Machine search is still somewhat limited |
02:39
π
|
Somebody2 |
although I think it allows more than full URL searches (but I think it is still limited to domains) |
02:39
π
|
Raccoon |
and you say that archiveteam doesn't keep a master index of urls that can be grepped for \\filename\.here |
02:45
π
|
Raccoon |
Hopefully if the day comes where we need to find all ten parts of the disarm code to the doomsday machine (cleverly hosted on 10 different free hosting websites between 1998 and 2005, under the filename disarmcode.html), the search engine will be prepared for this :P |
03:03
π
|
Somebody2 |
hopefully! |
03:03
π
|
Somebody2 |
I mean, if someone wanted to throw enough money at one of the cloud services, making such a filename index at least of the stuff ... |
03:04
π
|
Somebody2 |
... grabbed by archiveteam would absolutely be possible. |
03:04
π
|
Somebody2 |
And I don't see any reason it wouldn't be welcomed. |
03:04
π
|
* |
Flashfire throws monopoly money |
03:05
π
|
|
Maylay has quit IRC (Pipe Terminated) |
03:05
π
|
* |
Somebody2 catches the monopoly money, stares at it (it says: "Valid only on AWS; expires in 1995"), and throws it back |
03:07
π
|
|
Maylay has joined #archiveteam-bs |
03:07
π
|
|
Maylay has quit IRC (Remote host closed the connection!) |
03:08
π
|
|
Maylay has joined #archiveteam-bs |
03:10
π
|
kiska |
I assume we know about tinypic shutting down? |
03:11
π
|
kiska |
https://server8.kiska.pw/uploads/397c396a6c83b1bc/unknown.png |
03:12
π
|
Raccoon |
:( |
03:13
π
|
|
wyatt8740 has quit IRC (Remote host closed the connection) |
03:16
π
|
Raccoon |
Somebody2: that's one of the things that's carried me through the years in completing collections. Getting lucky with a Google intitle:"index of" search of known filenames already in the set, finding somebody else hosting them, and hitting the jackpot |
03:16
π
|
Raccoon |
since other people have picked up on that, this has really gone away for the most part. (i discovered the trick before it was hipster) |
03:17
π
|
Raccoon |
would also clue me in on the merits of contributing some of my junk |
03:17
π
|
Somebody2 |
Raccoon: hm? |
03:18
π
|
Raccoon |
hmm? |
03:18
π
|
arkiver |
ah shit |
03:18
π
|
arkiver |
kiska: let's think of a channel |
03:19
π
|
dxrt |
oh shit |
03:20
π
|
Raccoon |
how many dick pics do you suppose they've hosted over the years? |
03:20
π
|
kiska |
I think #ohshit is applicable |
03:20
π
|
dxrt |
#tinydick |
03:20
π
|
Raccoon |
jinxed dxrt |
03:20
π
|
arkiver |
nice |
03:20
π
|
dxrt |
haha |
03:21
π
|
arkiver |
kiska: agree with #tinydick? |
03:21
π
|
kiska |
Yep |
03:21
π
|
arkiver |
awesome |
03:22
π
|
Raccoon |
can't wait for mega.nz to shut down |
03:23
π
|
* |
Raccoon squats on #megadick for the puns |
03:23
π
|
kiska |
Please no mega.nz there is so much js on there |
03:24
π
|
Raccoon |
don't they have an api |
03:24
π
|
Raccoon |
mobile apps |
03:33
π
|
ivan_ |
https://github.com/meganz/MEGAcmd |
03:58
π
|
|
qw3rty119 has joined #archiveteam-bs |
04:04
π
|
|
qw3rty118 has quit IRC (Read error: Operation timed out) |
04:15
π
|
|
Pixi` has quit IRC (Read error: Connection reset by peer) |
04:15
π
|
|
Pixi has joined #archiveteam-bs |
04:17
π
|
|
d5f4a3622 has quit IRC (Read error: Connection reset by peer) |
04:19
π
|
|
d5f4a3622 has joined #archiveteam-bs |
04:46
π
|
|
Pokemonpr has joined #archiveteam-bs |
04:47
π
|
Pokemonpr |
Hey, question. Is there a way to check what the bot has archived? |
04:55
π
|
ivan_ |
Which bot |
04:55
π
|
Pokemonpr |
...uhm, the main one that auto-archives smaller sites? Not sure if it has a proper name |
04:56
π
|
ivan_ |
See #archivebot topic |
04:56
π
|
Pokemonpr |
There's a site I suggested for it to archive a little while ago; but I'm not sure if it got done. The site went from "Sort of dead but not in real danger" to "We're shutting down August 1st" and I wanted to check |
04:56
π
|
Pokemonpr |
Thank you |
04:56
π
|
Pokemonpr |
Haven't been here in a little while, sorry if this was a dumb question |
04:57
π
|
ivan_ |
http://archive.fart.website/archivebot/viewer/ |
04:59
π
|
|
godane1 has joined #archiveteam-bs |
05:00
π
|
|
godane has quit IRC (Read error: Operation timed out) |
05:06
π
|
Pokemonpr |
Can't see it there.. Could I request something then? |
05:58
π
|
dxrt |
Pokemonpr: what site? |
06:03
π
|
|
m007a83 has quit IRC (Read error: Operation timed out) |
06:07
π
|
Pokemonpr |
dxrt it's amesfanclub.com ; it recently announced out of the blue that it was going down August 1st due to reasons out of their control. |
06:08
π
|
|
Dimtree has joined #archiveteam-bs |
06:11
π
|
dxrt |
Pokemonpr: I've added it. |
06:34
π
|
Pokemonpr |
thank you |
07:58
π
|
|
m007a83 has joined #archiveteam-bs |
09:45
π
|
|
VerifiedJ has joined #archiveteam-bs |
10:23
π
|
JAA |
Somebody2: How does one search by filename on IA? I'm not aware of any method to do so. It only searches item metadata (identifier, title, description, etc.), not the names of files inside items. |
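To illustrate the point about metadata-only search, here is a minimal sketch that builds a query URL for archive.org's advanced search endpoint. The helper name `build_metadata_search_url` and the example query are made up for illustration, and the sketch only constructs the URL; it does not fetch anything. The query matches item-level fields such as title, not the names of files inside items.

```python
from urllib.parse import urlencode


def build_metadata_search_url(query, fields=("identifier", "title"), rows=50):
    """Build an archive.org advanced search URL (hypothetical helper).

    The query is matched against item metadata (identifier, title,
    description, ...), not against the names of files inside items.
    """
    params = [("q", query)]
    # Each requested result field is passed as a repeated "fl[]" parameter.
    params += [("fl[]", field) for field in fields]
    params += [("rows", str(rows)), ("output", "json")]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


if __name__ == "__main__":
    # Example query against the title field; the search term is illustrative.
    print(build_metadata_search_url("title:(gamefront)"))
```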
11:00
π
|
|
BlueMax has quit IRC (Quit: Leaving) |
12:07
π
|
|
BIER has joined #archiveteam-bs |
12:07
π
|
|
BIER has quit IRC (Client Quit) |
12:18
π
|
|
killsushi has quit IRC (Quit: Leaving) |
13:04
π
|
|
HashbangI has quit IRC (Remote host closed the connection) |
13:12
π
|
|
HashbangI has joined #archiveteam-bs |
14:34
π
|
Somebody2 |
um, let me look. |
14:35
π
|
Somebody2 |
if nothing else, you can download the (now rather out of date) IA census and use that. |
14:35
π
|
Somebody2 |
that may be the only way, indeed. |
15:10
π
|
|
Hani has quit IRC (Quit: Hani) |
15:17
π
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
15:47
π
|
fallenoak |
All you really need to search by files in the projects that produced megawarcs is the cdx files (which are fortunately fairly small) |
15:47
π
|
fallenoak |
At least, that's what I did when I wanted to comb through the files in the GameFront grab |
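A minimal sketch of the CDX approach fallenoak describes, assuming plain-text, space-separated CDX files whose first line is a header such as ` CDX N b a m s k r M S V g`, where the `a` column holds the original URL. The function names and the sample records are hypothetical, and this is not necessarily how the GameFront search was actually done.

```python
from urllib.parse import urlsplit, unquote


def parse_cdx_header(header):
    """Return the column letters from a CDX header line."""
    fields = header.split()
    if not fields or fields[0] != "CDX":
        raise ValueError("not a CDX header line")
    return fields[1:]


def find_by_filename(lines, filename):
    """Yield original URLs whose last path segment equals filename.

    `lines` is any iterable of CDX lines, header first. The comparison
    is case-insensitive; this is an assumption of the sketch.
    """
    it = iter(lines)
    try:
        header = next(it)
    except StopIteration:
        return
    columns = parse_cdx_header(header)
    url_idx = columns.index("a")  # "a" is the original URL column
    wanted = filename.lower()
    for line in it:
        parts = line.split()
        if len(parts) <= url_idx:
            continue  # skip short or malformed records
        url = parts[url_idx]
        name = unquote(urlsplit(url).path).rsplit("/", 1)[-1]
        if name.lower() == wanted:
            yield url


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        " CDX N b a m s k r M S V g",
        "com,example)/a/disarmcode.html 20050101000000 "
        "http://example.com/a/disarmcode.html text/html 200 AAAA - - 10 20 x.warc.gz",
        "com,example)/index.html 20050101000000 "
        "http://example.com/index.html text/html 200 BBBB - - 10 30 x.warc.gz",
    ]
    for hit in find_by_filename(sample, "disarmcode.html"):
        print(hit)
```

For real use, the lines would come from an opened CDX file (decompressed first if it is gzipped), one project at a time.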
15:49
π
|
|
wyatt8740 has joined #archiveteam-bs |
16:05
π
|
Igloo |
betamax: CD's sorted |
16:06
π
|
|
schbirid has joined #archiveteam-bs |
16:16
π
|
betamax |
Yay! |
16:29
π
|
Somebody2 |
fallenoak: yes, but you need to know which project you care about first :-) |
16:38
π
|
|
Hani has joined #archiveteam-bs |
17:00
π
|
Igloo |
if http_stat.statcode == 500 then |
17:00
π
|
Igloo |
-- try again |
17:00
π
|
Igloo |
woops |
17:16
π
|
|
DogsRNice has joined #archiveteam-bs |
18:15
π
|
|
Joseph_ has joined #archiveteam-bs |
18:15
π
|
|
VerifiedJ has quit IRC (Read error: Connection reset by peer) |
18:26
π
|
|
antomati_ is now known as antomatic |
18:41
π
|
|
Dallas has joined #archiveteam-bs |
18:50
π
|
|
hi has joined #archiveteam-bs |
19:45
π
|
|
VerifiedJ has joined #archiveteam-bs |
19:45
π
|
|
Joseph_ has quit IRC (Read error: Connection reset by peer) |
19:46
π
|
|
VerifiedJ has quit IRC (Read error: Connection reset by peer) |
19:47
π
|
|
VerifiedJ has joined #archiveteam-bs |
19:50
π
|
|
Ravenloft has joined #archiveteam-bs |
19:57
π
|
|
DogsRNice has quit IRC (Ping timeout: 252 seconds) |
19:58
π
|
|
Dj-Wawa has joined #archiveteam-bs |
19:59
π
|
|
DogsRNice has joined #archiveteam-bs |
20:07
π
|
|
hi has quit IRC (Quit: Page closed) |
20:19
π
|
|
DogsRNice has quit IRC (Ping timeout: 252 seconds) |
20:28
π
|
|
Smiley has quit IRC (Ping timeout: 265 seconds) |
20:29
π
|
|
Smiley has joined #archiveteam-bs |
20:38
π
|
|
schbirid has quit IRC (Remote host closed the connection) |
20:39
π
|
|
DogsRNice has joined #archiveteam-bs |
20:59
π
|
|
Ravenloft has quit IRC (Remote host closed the connection) |
21:42
π
|
|
VerifiedJ has quit IRC (Read error: Connection reset by peer) |
21:42
π
|
|
VerifiedJ has joined #archiveteam-bs |
22:28
π
|
|
Dj-Wawa has quit IRC (Quit: Connection closed for inactivity) |
22:29
π
|
|
Dj-Wawa has joined #archiveteam-bs |
22:30
π
|
|
fredgido_ has quit IRC (Read error: Operation timed out) |
22:36
π
|
|
dashcloud has quit IRC (Remote host closed the connection) |
22:37
π
|
|
dashcloud has joined #archiveteam-bs |
23:22
π
|
|
BlueMax has joined #archiveteam-bs |
23:26
π
|
|
fredgido has joined #archiveteam-bs |
23:26
π
|
|
SmileyG has joined #archiveteam-bs |
23:27
π
|
|
Smiley has quit IRC (Read error: Operation timed out) |
23:32
π
|
|
benjinsmi has joined #archiveteam-bs |
23:33
π
|
|
benjins has quit IRC (Ping timeout: 252 seconds) |
23:42
π
|
|
benjins has joined #archiveteam-bs |
23:43
π
|
|
benjinsmi has quit IRC (Ping timeout: 604 seconds) |
23:47
π
|
|
VerifiedJ has quit IRC (Read error: Operation timed out) |