[00:08] *** Start has quit IRC (Quit: Disconnected.)
[00:08] *** Start has joined #archiveteam
[00:09] *** Coderjoe has quit IRC (Read error: Operation timed out)
[00:16] serious sql-regret on a private crawl, wondered why that query took minutes. have to recover state from > 500GB of incomplete WARC now :(
[00:27] *** Smiley has quit IRC (Remote host closed the connection)
[00:28] *** Smiley has joined #archiveteam
[00:35] *** Smiley has quit IRC (Remote host closed the connection)
[00:43] *** Smiley has joined #archiveteam
[00:56] *** goldfinch has joined #archiveteam
[01:01] *** goldfinch has quit IRC (Quit: Page closed)
[01:07] *** tomaspark has joined #archiveteam
[01:22] *** AlexLehm has quit IRC (Ping timeout: 260 seconds)
[01:26] *** maelstrom has joined #archiveteam
[01:29] BITSAVERS collection is now automatic.
[01:29] That means that every day, my scripts are going to go in and mirror Bitsavers, then send PDFs to archive.org.
[01:29] So that pipeline's going to jump AND I'm going to ignore it from now on, while it runs.
[01:30] *** dan- has quit IRC (Ping timeout: 260 seconds)
[01:32] *** JesseW has joined #archiveteam
[02:07] SketchCow: can you replace the static bitsavers items with items which are auto-updated from the daily mirror?
[02:09] and is it possible to have the bitsavers original filepath somewhere visible in the metadata when viewing a particular pdf? it took me a while to find a specific bitsavers-mirrored pdf earlier on bitsavers itself, since i needed something else which was stored in the same directory
[02:09] and the path wasn't noted anywhere
[02:45] *** n00b139 has joined #archiveteam
[02:46] hey guys.. im not seeing the "current project" screen in my browser. i tried vmware/virtualbox and several browsers.. how could this possibly happen
[02:46] *** tomaspar1 has joined #archiveteam
[02:48] *** tomaspark has quit IRC (Read error: Operation timed out)
[02:50] n00b139: the warrior doesn't have anything active right now, I think
[02:50] *** ky0ko has joined #archiveteam
[02:52] okay thanks.. that would be funny
[02:52] i never had that in 3 yrs i think
[02:53] well, there's always urlteam -- but even that is kinda short of jobs right now
[02:53] n00b139: also, the virtual machine image is rather elderly now; it *may* be freaking out (although that seems unlikely)
[02:56] okay thanks.. that's unfortunate as i love archiving stuff
[02:56] tomorrow a new day i guess
[02:56] *** vitzli has joined #archiveteam
[02:56] n00b139: well, there's still plenty to archive.
[02:57] if you have the time, please research more url shorteners -- that will give us more to work on with urlteam. Join the #urlteam channel and I'll walk you through it.
[02:57] An interesting set of feature requests.
[02:58] 2:07 < Lord_Nigh> SketchCow: can you replace the static bitsavers items with items which are auto-updated from the daily mirror?
[02:58] No. What situation would that require
[02:58] *** Sneakyimp has quit IRC (Read error: Operation timed out)
[02:58] 22:09 < Lord_Nigh> and is it possible to have the bitsavers original filepath somewhere visible in the metadata when viewing a particuladsfjuslfkjsdflksjdfhsdkljfhskdjfh
[02:58] Probably
[03:01] I can prepare metadata for this, I have bitsavers mirror and could grab metadata from IA
[03:01] *** gibigiana has quit IRC (Read error: Operation timed out)
[03:02] *** alembic has quit IRC (Quit: Leaving)
[03:02] I don't have one tool, but all bits and pieces to do it
[03:02] yaml, csv, xml, something else?
[03:04] I can certainly do it.
[03:06] Hm. I realized that I don't know why IA doesn't make the HTTP request and response headers available for material in the Wayback Machine. Not that it matters much, but I thought it worth asking about.
[03:09] So, I'm trying to set up the nujij grabber, but I noticed it doesn't have the list of tasks in the web interface. I understand it's a new project, but I still wanted to check, is this a known issue? I like having the confirmation in the web interface that it's running
[03:11] ky0ko: I think it's a known issue.
[03:13] Okay. I'll have to take that shot in the dark then. My VMs I'm setting up for this are still a bit finicky. From the tracker though it looks like it's working
[03:14] No. What situation would that require <- when documents get moved around/updated on bitsavers, the paths can change (and sometimes scans are replaced with more complete copies than the ones that IA might have uploaded already)
[03:14] if the mirror auto-regenerates IA items when things move around, then everything stays in sync
[03:18] *** gibigiana has joined #archiveteam
[03:23] *** tomwsmf has quit IRC (Read error: Operation timed out)
[03:54] *** dan- has joined #archiveteam
[04:09] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:15] *** dashcloud has quit IRC (Read error: Operation timed out)
[04:16] *** Jogie has joined #archiveteam
[04:17] *** Sk1d has joined #archiveteam
[04:22] *** nertzy2 has quit IRC (Read error: Operation timed out)
[04:45] *** maelstrom has quit IRC (Remote host closed the connection)
[04:52] *** Meroje has quit IRC (Quit: bye!)
[04:53] *** Meroje has joined #archiveteam
[05:18] *** BlueMaxim has joined #archiveteam
[06:03] *** ravetcofx has joined #archiveteam
[06:07] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:12] *** ndizzle has joined #archiveteam
[06:15] *** yipdw_ is now known as yipdw
[06:19] *** ndiddy has quit IRC (Ping timeout: 633 seconds)
[06:23] *** Honno has joined #archiveteam
[06:42] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[06:46] *** BlueMaxim has joined #archiveteam
[06:47] *** JesseW has joined #archiveteam
[07:04] *** Chorca_ has quit IRC (Ping timeout: 260 seconds)
[08:21] *** yeoldetoa has quit IRC (Read error: Operation timed out)
[08:31] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[09:35] *** WinterFox has joined #archiveteam
[09:35] *** atomotic has joined #archiveteam
[09:40] *** BlueMaxim has quit IRC (Quit: Leaving)
[09:48] *** kristian_ has joined #archiveteam
[09:57] *** notjack has quit IRC (Ping timeout: 268 seconds)
[10:02] *** kristian_ has quit IRC (Leaving)
[10:04] *** AlexLehm has joined #archiveteam
[10:09] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:47] *** Muad-Dib has quit IRC (Ping timeout: 260 seconds)
[10:51] *** ndizzle has quit IRC (Read error: Connection reset by peer)
[11:00] *** Muad-Dib has joined #archiveteam
[11:25] *** n00b139 has quit IRC (Quit: Page closed)
[11:30] *** zenguy_pc has joined #archiveteam
[11:56] *** dserodio has quit IRC (Quit: ZNC - http://znc.in)
[12:03] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[12:03] *** zenguy_pc has joined #archiveteam
[12:15] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[12:26] *** zenguy_pc has joined #archiveteam
[12:38] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[12:42] *** RichardG_ has joined #archiveteam
[12:42] *** RichardG has quit IRC (Read error: Connection reset by peer)
[12:43] *** RichardG_ is now known as RichardG
[12:53] *** zenguy_pc has joined #archiveteam
[13:03] *** tomaspar1 has quit IRC (Read error: Operation timed out)
[13:04] *** ky0ko has quit IRC (Read error: Operation timed out)
[13:17] *** Morbus has quit IRC (Quit: http://www.disobey.com/)
[13:18] *** ky0ko has joined #archiveteam
[13:18] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[13:26] *** schbirid has joined #archiveteam
[13:27] *** zenguy_pc has joined #archiveteam
[13:29] *** Morbus has joined #archiveteam
[13:51] *** Morbus has quit IRC (Quit: http://www.disobey.com/)
[13:52] *** WinterFox has quit IRC (Read error: Operation timed out)
[13:54] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[14:00] *** nwf_ has joined #archiveteam
[14:01] *** Morbus has joined #archiveteam
[14:08] *** nwf has quit IRC (Read error: Operation timed out)
[14:10] *** zenguy_pc has joined #archiveteam
[14:21] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[14:39] *** zenguy_pc has joined #archiveteam
[14:43] *** metalcamp has joined #archiveteam
[14:54] *** metal_cam has joined #archiveteam
[14:58] *** metalcamp has quit IRC (Read error: Operation timed out)
[15:25] *** vitzli has quit IRC (Quit: Leaving)
[16:10] *** cadbury_ has quit IRC (Read error: Operation timed out)
[16:10] *** antomati_ has joined #archiveteam
[16:10] *** swebb sets mode: +o antomati_
[16:10] *** sep332 has joined #archiveteam
[16:11] *** xmc has quit IRC (Read error: Operation timed out)
[16:11] *** Mayonaise has quit IRC (Read error: Operation timed out)
[16:11] *** midas1 has quit IRC (Read error: Operation timed out)
[16:11] *** sivoais has quit IRC (Read error: Operation timed out)
[16:11] *** sivoais has joined #archiveteam
[16:11] *** chfoo has quit IRC (Read error: Operation timed out)
[16:11] *** Famicoma1 has quit IRC (Read error: Operation timed out)
[16:11] *** irl has quit IRC (Read error: Operation timed out)
[16:11] *** edsu has quit IRC (Write error: Broken pipe)
[16:11] *** antomatic has quit IRC (Read error: Operation timed out)
[16:11] *** edsu_ has joined #archiveteam
[16:11] *** swebb sets mode: +o edsu_
[16:11] *** chfoo has joined #archiveteam
[16:11] *** nox_ has quit IRC (Read error: Operation timed out)
[16:11] *** Mayonaise has joined #archiveteam
[16:12] *** midas1 has joined #archiveteam
[16:12] *** swebb sets mode: +o midas1
[16:12] *** redlob has quit IRC (Read error: Operation timed out)
[16:12] *** sep332_ has quit IRC (Read error: Operation timed out)
[16:12] *** acridAxid has quit IRC (Read error: Operation timed out)
[16:12] *** acridAxid has joined #archiveteam
[16:12] *** JW_work has quit IRC (Read error: Operation timed out)
[16:12] *** Kenshin has quit IRC (Read error: Operation timed out)
[16:12] *** yuitimoth has quit IRC (Read error: Operation timed out)
[16:13] *** nox_ has joined #archiveteam
[16:13] *** HCross has quit IRC (Read error: Operation timed out)
[16:13] *** xmc has joined #archiveteam
[16:13] *** swebb sets mode: +o xmc
[16:13] *** HCross has joined #archiveteam
[16:13] *** irl has joined #archiveteam
[16:13] *** yuitimoth has joined #archiveteam
[16:13] *** JW_work has joined #archiveteam
[16:14] *** Kenshin has joined #archiveteam
[16:14] *** TC01_ has joined #archiveteam
[16:15] *** brayden has quit IRC (Read error: Operation timed out)
[16:15] *** JesseW has joined #archiveteam
[16:15] *** cadbury_ has joined #archiveteam
[16:17] *** TC01 has quit IRC (Read error: Operation timed out)
[16:22] *** redlob has joined #archiveteam
[16:36] *** Famicoma1 has joined #archiveteam
[16:45] *** maelstrom has joined #archiveteam
[16:45] *** JesseW has quit IRC (Read error: Operation timed out)
[16:51] *** VADemon has joined #archiveteam
[16:58] *** Simpbrain has joined #archiveteam
[17:07] *** JW_work has quit IRC (Quit: Leaving.)
[17:12] *** schbirid has quit IRC (Quit: Leaving)
[17:18] *** maelstrom has quit IRC (Quit: Leaving)
[17:19] *** schbirid has joined #archiveteam
[17:23] *** maelstrom has joined #archiveteam
[17:23] *** schbirid2 has joined #archiveteam
[17:28] *** schbirid has quit IRC (Read error: Operation timed out)
[17:40] *** edsu_ is now known as edsu
[17:49] *** JW_work has joined #archiveteam
[17:49] *** JW_work has quit IRC (Remote host closed the connection)
[17:50] *** JW_work has joined #archiveteam
[17:54] *** maelstrom has quit IRC (Remote host closed the connection)
[18:13] *** odie5533 has joined #archiveteam
[18:14] Firefall.com has now shut down.
[18:15] odie5533: add it to our Deathwatch list
[18:15] The bot tried archiving it on Aug 12. Not sure if it ever got the whole forum though
[18:16] i got the whole game assets
[18:16] i also sent the original game assets archive to someone who should have produced an archive
[18:16] * luckcolor goes to search who was that person
[18:17] game assets = the client?
[18:17] yeah
[18:17] it was distributed using a 10gb 7z archive
[18:19] mmh
[18:19] it should be inside some warc of newsgrabber
[18:21] http://dl-production.firefall.com/installer/client/production/EU/prod-1962/FirefallInstaller.7z
[18:21] is the url
[18:22] aaand it's not in the wayback machine
[18:22] HCross: any ideas?
[18:22] Was that up to date and have everything?
[18:22] it did not get added to newsgrabber
[18:22] so do you have it then?
[18:23] no
[18:23] sigh didn't you wget the list or something?
[18:24] no
[18:24] someone else mentioned having httrack that they might've dropped it into as we noticed AB didn't like it
[18:25] Can't remember when this all happened, so won't know details until I can check logs
[18:25] also -bs
[18:36] *** kristian_ has joined #archiveteam
[18:45] *** tomaspark has joined #archiveteam
[18:58] *** kristian_ has quit IRC (Leaving)
[19:07] Sooo about firefall
[19:07] Bad News: the original archive + fiddler requests save were deleted
[19:07] Good News: i still have the game installed
[19:09] Bad News: I have only 1 megabit and for pushing 10gb to IA it will take ....... some ...... ENORMOUS .. amount of time
[19:10] (1 megabit of upload speed :( )
[19:10] *** dashcloud has joined #archiveteam
[19:10] Solution: Use some Uber compression algorithm and then upload everything in chunks of 1mb (or 10mb)
[19:11] feedback for better solutions is appreciated
[19:13] *** RichardG has quit IRC (Read error: Operation timed out)
[19:14] compress it, ship a USB drive to someone else
[19:15] well i don't know anybody who has a fiber connection
[19:15] and sending a pendrive to ia doesn't seem a good idea
[19:16] i guess i'll just upload it slowly
[19:16] *** RichardG has joined #archiveteam
[19:24] they'll accept a usb key iirc
[19:24] or a drive
[19:29] *** maelstrom has joined #archiveteam
[19:30] upload with torrent? upload-resume, limitable speed.
[19:36] mmh Torrent can be an option
[19:36] -bs
[19:58] *** dashcloud has quit IRC (Read error: Operation timed out)
[20:17] *** ndiddy has joined #archiveteam
[20:22] *** VADemon has quit IRC (Read error: Operation timed out)
[20:35] *** tomaspark has quit IRC (Ping timeout: 370 seconds)
[20:37] *** maelstrom has quit IRC (Quit: Leaving)
[20:40] let's fix nujij
[20:41] going to get rid of this checking in the beginning
[20:41] should speed up the project
[20:45] arkiver: can you also check out my error about the encoding?
[20:45] wait it was for yahoo
[20:45] not nujij
[20:45] I'll have a look
[20:58] *** atrocity has quit IRC (Ping timeout: 244 seconds)
[21:00] nujij is updated!
[21:02] strange, push problems
[21:07] new nujij version is set in the tracker
[21:09] *** maelstrom has joined #archiveteam
[21:14] *** schbirid2 has quit IRC (Quit: Leaving)
[21:15] *** ky0ko has quit IRC (Read error: Operation timed out)
[21:16] *** dashcloud has joined #archiveteam
[21:25] *** pfallenop has quit IRC (Ping timeout: 244 seconds)
[21:29] *** nox_ has quit IRC (Read error: Operation timed out)
[21:29] *** ndiddy has quit IRC (Quit: Leaving)
[21:31] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:31] *** pfallenop has joined #archiveteam
[21:32] *** nox_ has joined #archiveteam
[21:35] *** ky0ko has joined #archiveteam
[21:35] *** maelstrom has quit IRC (Remote host closed the connection)
[21:47] *** aschmitz has quit IRC (Ping timeout: 255 seconds)
[21:52] *** metal_cam has quit IRC (Ping timeout: 501 seconds)
[21:52] *** filippo__ has quit IRC (Connection closed)
[21:52] *** hook54321 has quit IRC (Connection closed)
[21:52] *** hook12345 has quit IRC (Connection closed)
[21:52] *** antonizoo has quit IRC (Connection closed)
[21:53] *** antonizoo has joined #archiveteam
[21:54] *** hook54321 has joined #archiveteam
[22:00] *** aschmitz has joined #archiveteam
[22:10] *** dashcloud has joined #archiveteam
[22:23] torrents mean hashing all of the data first, which is annoying for very large files.
[22:23] *** xmc has quit IRC (Read error: Operation timed out)
[22:24] *** pfallenop has quit IRC (Read error: Operation timed out)
[22:24] *** pfallenop has joined #archiveteam
[22:25] zout: still a lot easier than trying to push 10gb over 1 Mbit/s link in some other way
[22:25] and 10gb isn't *that* big
[22:26] sure, I guess.
[22:26] small on disk, big to shove around with ADSL.
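For scale, the "10gb over 1 Mbit/s" upload discussed above works out to roughly a day of continuous transfer. The following is a minimal sketch of that arithmetic, not anything posted in the channel. It assumes decimal units ("10gb" read as 10^10 bytes, "1 megabit" read as 10^6 bits per second) and ignores protocol overhead, compression, and retries; the function names `transfer_seconds` and `chunk_count` are illustrative only.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time in seconds, ignoring overhead and retries."""
    return size_bytes * 8 / link_bits_per_second


def chunk_count(size_bytes: int, chunk_bytes: int) -> int:
    """Number of fixed-size chunks needed to cover the file (ceiling division)."""
    return (size_bytes + chunk_bytes - 1) // chunk_bytes


SIZE = 10 * 10**9   # assumption: "10gb" means 10 decimal gigabytes
LINK = 1 * 10**6    # assumption: "1 megabit" means 10^6 bits per second

seconds = transfer_seconds(SIZE, LINK)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
print(chunk_count(SIZE, 10 * 10**6), "chunks of 10 MB")
print(chunk_count(SIZE, 1 * 10**6), "chunks of 1 MB")
```

Under these assumptions the upload takes about 22 hours of uninterrupted transfer, which is why a resumable method (such as the torrent suggestion) matters more than the chunk size chosen.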
[22:27] *** nox_ has quit IRC (Read error: Operation timed out)
[22:27] *** WinterFox has joined #archiveteam
[22:28] *** nox_ has joined #archiveteam
[22:30] *** irl has quit IRC (Read error: Operation timed out)
[22:36] *** HCross has quit IRC (Ping timeout: 633 seconds)
[22:36] *** d_rebel has quit IRC (Ping timeout: 633 seconds)
[22:37] *** d_rebel has joined #archiveteam
[22:40] *** nox_ has quit IRC (Ping timeout: 633 seconds)
[22:41] *** nox_ has joined #archiveteam
[22:43] *** cadbury_ has quit IRC (Ping timeout: 633 seconds)
[22:43] *** Fletcher has quit IRC (Ping timeout: 633 seconds)
[22:43] *** cadbury_ has joined #archiveteam
[22:43] *** SilSte has quit IRC (Read error: Operation timed out)
[22:43] *** SilSte has joined #archiveteam
[22:52] *** dashcloud has quit IRC (Read error: Operation timed out)
[22:53] *** Fletcher has joined #archiveteam
[22:55] *** Whopper has joined #archiveteam
[22:58] *** cadbury_ has quit IRC (Ping timeout: 633 seconds)
[22:58] *** JW_work has quit IRC (Ping timeout: 633 seconds)
[22:58] *** Whopper_ has quit IRC (Ping timeout: 633 seconds)
[22:59] *** SilSte has quit IRC (Ping timeout: 633 seconds)
[22:59] *** ploop has joined #archiveteam
[23:01] *** JW_work has joined #archiveteam
[23:02] *** ploop_ has quit IRC (Ping timeout: 633 seconds)
[23:04] *** cadbury_ has joined #archiveteam
[23:04] *** HCross has joined #archiveteam
[23:06] *** SilSte has joined #archiveteam
[23:10] *** xmc has joined #archiveteam
[23:10] *** swebb sets mode: +o xmc
[23:13] *** maelstrom has joined #archiveteam
[23:17] *** dashcloud has joined #archiveteam
[23:25] *** fie has joined #archiveteam
[23:28] *** fie__ has quit IRC (Read error: Operation timed out)
[23:33] *** maelstrom has quit IRC (Remote host closed the connection)
[23:56] *** kristian_ has joined #archiveteam
[23:57] *** mutoso has quit IRC (Ping timeout: 260 seconds)