[01:53] *** username1 has joined #archiveteam-bs
[01:55] *** schbirid2 has quit IRC (Read error: Operation timed out)
[01:59] *** pizzaiolo has joined #archiveteam-bs
[02:28] *** j08nY has quit IRC (Remote host closed the connection)
[02:31] *** Mateon1 has quit IRC (Ping timeout: 268 seconds)
[02:32] *** Mateon1 has joined #archiveteam-bs
[03:21] *** Ravenloft has joined #archiveteam-bs
[03:23] *** pizzaiolo has quit IRC (Remote host closed the connection)
[03:26] so, i have my first warc and it is confirmed working and playable with web archive player 1.4.7
[03:26] now what?
[03:26] *** qw3rty111 has joined #archiveteam-bs
[03:28] https://archive.org/upload/
[03:29] i guess the web uploader has settings to allow for a 1GB+ warc that might take half an hour to upload
[03:30] do i throw this cdx file in with it?
[03:30] *** qw3rty119 has quit IRC (Read error: Operation timed out)
[03:32] *** fie has quit IRC (Ping timeout: 250 seconds)
[03:35] eh, we'll try it
[03:42] you don't need to, it will generate one
[03:42] shouldn't hurt anything though
[03:43] we will find out if it hurt anything eventually
[03:43] i suspect nothing will catch fire
[03:52] *** Asparagir has joined #archiveteam-bs
[04:02] *** Stilett0 has quit IRC (Read error: Operation timed out)
[04:17] *** Stilett0 has joined #archiveteam-bs
[04:22] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:22] *** TheLovina has quit IRC (Ping timeout: 370 seconds)
[04:22] *** TheLovina has joined #archiveteam-bs
[04:26] https://archive.org/details/dep.ca does this look right?
[04:29] *** Sk1d has joined #archiveteam-bs
[04:32] seems ok
[04:33] I'd recommend putting a crawl date on it as well
[04:35] would that just be the generic date field?
wasn't sure if that was supposed to be the age of the data or the archive
[04:45] this is how IA's go in https://archive.org/details/WIDE-20170819012251-crawl802
[04:45] I don't think the details matter that much as long as it's basically sensible
[04:46] something so you can tell it apart if the same site gets crawled more than once
[04:46] ah i see, firstfiledate and lastfiledate
[04:47] those are probably internal use fields
[04:47] the exact timestamps are stored in the warc itself
[04:51] yeah I think it generates those fields when they process it for the wayback machine, here's one I uploaded and I never filled in firstfiledate etc https://archive.org/details/archive.pdp11.org.ru-20130504
[04:51] but I put a date in the title and item name for identification when browsing through
[04:53] okay, i think i'm starting to understand the logic/structure of things
[04:53] i need to recategorize this as a "just in time grab" as the company's bankrupt so i just change the collection name to archiveteam-fire yeah?
[04:54] you can't, only IA admins have access to do that
[04:58] for now adding archiveteam to the subject keywords would be good
[05:00] btw for small jobs like this we have #archivebot which does the crawl and upload automatically if you just give it a base URL
[05:00] cool, archivebot was brought up but this was a good learning experience for me as well
[05:22] *** Asparagir has quit IRC (Asparagir)
[05:56] *** Honno has joined #archiveteam-bs
[06:08] *** alfie has quit IRC (Ping timeout: 260 seconds)
[06:55] *** BlueMaxim has joined #archiveteam-bs
[07:08] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[07:12] *** Honno_ has joined #archiveteam-bs
[07:22] *** Honno has quit IRC (Read error: Operation timed out)
[07:26] *** schbirid2 has joined #archiveteam-bs
[07:29] *** username1 has quit IRC (Read error: Operation timed out)
[07:45] *** fie has joined #archiveteam-bs
[09:26] *** pie_ has joined #archiveteam-bs
[09:26] hi guys, is there a way to get archive to start recursively archiving a website?
[09:28] also hm, so archive.is isnt a frontend for archive.org?
[09:31] *** zyphlar has joined #archiveteam-bs
[09:31] it isn't
[09:31] for archiving a website see #archivebot
[09:31] *** schbirid2 has quit IRC (Quit: Leaving)
[09:32] pie_ ^
[09:33] thanks
[09:57] *** schbirid has joined #archiveteam-bs
[10:04] *** j08nY has joined #archiveteam-bs
[10:04] bluesoul: I downloaded the torrent, what do I do with a .warc file ?
[10:05] cherish it
[10:05] archive it
[10:05] love it
[10:05] hexdump it
[10:06] :)
[10:19] I got it to work with webarchiveplayer.exe
[10:19] sweet!
thanks a lot :)
[10:19] Seems like it got everything
[11:07] *** Mateon1 has quit IRC (Remote host closed the connection)
[11:11] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[11:51] *** ld1 has quit IRC (Ping timeout: 260 seconds)
[11:52] *** Mateon1 has joined #archiveteam-bs
[12:00] *** zyphlar has quit IRC (Quit: Connection closed for inactivity)
[13:24] *** RichardG has quit IRC (Read error: Connection reset by peer)
[13:24] *** RichardG has joined #archiveteam-bs
[14:00] SketchCow, replied on reddit
[14:06] Sorry, I never assume I know the people, even though I often do.
[14:07] Just give me the link when I can grab it, I'll pull them down, make a collection, shove them up.
[14:07] I wish someone would write descriptions of them
[14:09] SketchCow, sound, his upload is going steady so I'm not syncing to the dhevel server until it's all done then I'll put it in all the places, post the torrent and nudge you here.
[14:11] At his current speed it should be done in a little over 5 hours.
[14:13] Great.
[14:13] I'll be out on a date, will be back tonight, we'll have a nice collection
[14:16] Ohh sweet, have fun! I'll be having a bbq myself so late tonight works :D
[14:23] what is this about?
[14:51] the fuck https://www.youtube.com/watch?v=CO-NaKJIXPA
[14:51] fuck wrong channel
[15:35] this fella needs some archival directed at them, I think: https://twitter.com/themaddimension
[15:35] one of the 'unite the right' people, apparently deleting tweets
[16:25] *** BnAboyZ66 has quit IRC (Ping timeout: 260 seconds)
[16:27] *** fie has quit IRC (Ping timeout: 250 seconds)
[16:28] *** Meroje has quit IRC (Ping timeout: 260 seconds)
[16:33] *** Meroje has joined #archiveteam-bs
[16:37] *** fie has joined #archiveteam-bs
[16:53] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[17:01] *** RichardG has quit IRC (Read error: Connection reset by peer)
[17:02] *** brayden has quit IRC (Read error: Connection reset by peer)
[17:02] *** brayden_ has joined #archiveteam-bs
[17:02] *** swebb sets mode: +o brayden_
[17:02] *** brayden_ is now known as brayden
[17:10] *** j08nY has quit IRC (Read error: Operation timed out)
[17:27] *** Pudsey has joined #archiveteam-bs
[17:27] *** Asparagir has joined #archiveteam-bs
[17:32] *** Pudsey_ has joined #archiveteam-bs
[17:35] *** pizzaiolo has joined #archiveteam-bs
[17:35] *** Pudsey has quit IRC (Ping timeout: 245 seconds)
[17:42] *** BartoCH has joined #archiveteam-bs
[17:49] *** odemg has quit IRC (Read error: Operation timed out)
[18:12] *** Pudsey_ has quit IRC (Remote host closed the connection)
[18:15] *** RichardG has joined #archiveteam-bs
[18:24] *** RichardG_ has joined #archiveteam-bs
[18:28] *** RichardG has quit IRC (Ping timeout: 370 seconds)
[18:41] *** schbirid2 has joined #archiveteam-bs
[18:44] *** schbirid has quit IRC (Read error: Operation timed out)
[18:51] *** Odd0002 has quit IRC (Remote host closed the connection)
[18:56] joepie91: I already archived him five days ago. :-)
[18:58] *** RichardG_ has quit IRC (Read error: Connection reset by peer)
[19:00] *** RichardG has joined #archiveteam-bs
[19:09] Kenshin: you around?
[19:10] *** RichardG has quit IRC (Read error: No route to host)
[19:11] *** RichardG has joined #archiveteam-bs
[19:18] *** RichardG_ has joined #archiveteam-bs
[19:20] *** RichardG has quit IRC (Ping timeout: 250 seconds)
[19:21] *** RichardG_ has quit IRC (Read error: Connection reset by peer)
[19:22] *** RichardG has joined #archiveteam-bs
[19:35] Here are some posts that ArchiveTeam might find interesting, about Twitter archiving and massive data processing of the tweetstreams:
[19:35] https://inkdroid.org/2017/08/15/utr/
[19:36] and its follow-up post https://inkdroid.org/2017/08/18/delete-forensics/
[19:37] It ran off a collection of 165,314 tweets, of which 16,492 (9.9%) were later deleted. Has interesting stats and musings.
[19:47] *** Odd0002 has joined #archiveteam-bs
[20:00] *** wp494 has quit IRC (Read error: Connection reset by peer)
[20:10] *** Odd0002 has quit IRC (Read error: Operation timed out)
[20:15] *** Odd0002 has joined #archiveteam-bs
[20:16] *** Lotheric has quit IRC (Leaving)
[21:12] *** wp494 has joined #archiveteam-bs
[21:20] *** ZexaronS has joined #archiveteam-bs
[21:36] *** TheLovina has quit IRC (Ping timeout: 370 seconds)
[21:39] *** Famicoman has joined #archiveteam-bs
[21:41] *** Stilett0 has quit IRC (Ping timeout: 250 seconds)
[21:55] so i have uploaded 2527 items so far this month
[22:01] Awesome!
[22:06] *** Odd0002 has quit IRC (Remote host closed the connection)
[22:12] *** Odd0002 has joined #archiveteam-bs
[22:28] *** Odd0002_ has joined #archiveteam-bs
[22:29] *** Odd0002 has quit IRC (Read error: Operation timed out)
[22:32] *** Odd0002_ has quit IRC (Remote host closed the connection)
[22:40] *** Odd0002 has joined #archiveteam-bs
[22:56] *** pie_ has quit IRC (Ping timeout: 246 seconds)
[23:27] *** Asparagir has quit IRC (Read error: Connection reset by peer)
[23:29] *** Asparagir has joined #archiveteam-bs
[23:50] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:55] *** dashcloud has joined #archiveteam-bs
[23:55] *** Ravenloft has joined #archiveteam-bs
[23:55] http://www.blackfalcongames.net/?p=183