[00:02] *** mkram has quit IRC (Read error: No route to host)
[00:07] *** GE has quit IRC (Remote host closed the connection)
[00:22] *** pikhq has joined #archiveteam-bs
[00:26] *** dxrt- has joined #archiveteam-bs
[01:01] *** LastNinja has quit IRC (Read error: Connection reset by peer)
[01:07] *** LastNinja has joined #archiveteam-bs
[01:42] *** icedice has quit IRC (Quit: Leaving)
[01:51] *** BlueMaxim has joined #archiveteam-bs
[02:35] *** j08nY has joined #archiveteam-bs
[02:38] *** pizzaiolo has quit IRC (Remote host closed the connection)
[02:40] *** SilSte has quit IRC (Read error: Operation timed out)
[02:49] *** username1 has joined #archiveteam-bs
[02:52] *** schbirid2 has quit IRC (Read error: Operation timed out)
[03:10] *** SilSte has joined #archiveteam-bs
[03:18] *** j08nY has quit IRC (Quit: Leaving)
[04:21] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[05:05] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[05:11] *** Sk1d has joined #archiveteam-bs
[06:10] *** odemg has quit IRC (Remote host closed the connection)
[07:02] *** ItsYoda has quit IRC (Ping timeout: 260 seconds)
[07:06] *** dxrt- has quit IRC (Read error: Operation timed out)
[07:15] *** Igloo has quit IRC (Ping timeout: 260 seconds)
[07:19] *** ItsYoda has joined #archiveteam-bs
[07:47] *** VADemon has quit IRC (Quit: left4dead)
[08:24] *** Stilett0 has joined #archiveteam-bs
[08:40] *** gui7 has joined #archiveteam-bs
[09:02] *** odemg has joined #archiveteam-bs
[09:08] *** GE has joined #archiveteam-bs
[09:23] *** Igloo has joined #archiveteam-bs
[09:48] *** odemg has quit IRC (Remote host closed the connection)
[10:05] *** odemg has joined #archiveteam-bs
[10:46] *** Silvan has joined #archiveteam-bs
[10:47] *** SilSte has quit IRC (Ping timeout: 194 seconds)
[10:49] does archivebot crawl plaintext urls?
[11:03] *** odemg has quit IRC (Remote host closed the connection)
[11:04] *** ravetcofx has quit IRC (Read error: Operation timed out)
[11:09] *** fie has quit IRC (Read error: Operation timed out)
[11:10] *** odemg has joined #archiveteam-bs
[11:25] *** GE has quit IRC (Remote host closed the connection)
[11:31] *** SmileyG has joined #archiveteam-bs
[11:33] *** Smiley has quit IRC (Read error: Operation timed out)
[12:00] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:41] *** GE has joined #archiveteam-bs
[12:51] *** VeganMars has quit IRC (Quit: ZNC - http://znc.in)
[12:52] *** VeganMars has joined #archiveteam-bs
[12:54] *** pizzaiolo has joined #archiveteam-bs
[13:26] *** odemg has quit IRC (Remote host closed the connection)
[14:52] *** username1 is now known as schbiri
[14:52] *** schbiri is now known as schbirid
[14:59] *** odemg has joined #archiveteam-bs
[15:38] *** odemg has quit IRC (Remote host closed the connection)
[15:43] *** VADemon has joined #archiveteam-bs
[16:08] *** odemg has joined #archiveteam-bs
[16:34] *** spiko has joined #archiveteam-bs
[17:18] Sanqui, I think it does? I know it has grabbed gist .txt links before.
[17:19] Also we have finally finished up our Asheron's Call grabs.
[17:28] *** odemg has quit IRC (Remote host closed the connection)
[17:30] *** odemg has joined #archiveteam-bs
[17:38] *** odemg has quit IRC (Remote host closed the connection)
[17:39] *** odemg has joined #archiveteam-bs
[17:46] *** odemg has quit IRC (Remote host closed the connection)
[17:48] *** odemg has joined #archiveteam-bs
[17:56] rocode: gist txt links are imported as !a < though
[19:18] *** ravetcofx has joined #archiveteam-bs
[19:34] someone should write/spread a virus that stealth runs warrior projects..
[19:44] *** tobbez has quit IRC (Ping timeout: 240 seconds)
[19:54] *** masterX24 has joined #archiveteam-bs
[19:56] noticed some bugs on the DMOZ.org crawl on archivebot, it crawls abuse/flag links, too
[20:10] masterX24: those look good. Can you join #archivebot and ask someone with permission to add them?
[20:11] i still got a independently running crawl as fallback (monitoring at nplusc.de:29000 )
[20:14] i told my crawl to skip off-domain links since those aren't affected by closure, estimating 25% progress on mine atm
[20:15] *** yeoldetoa has quit IRC (Read error: Operation timed out)
[20:34] *** odemg has quit IRC (Remote host closed the connection)
[20:34] *** odemg has joined #archiveteam-bs
[20:47] *** Honno has joined #archiveteam-bs
[20:49] *** godane has quit IRC (Quit: Leaving.)
[20:56] *** RichardG_ has joined #archiveteam-bs
[20:57] *** RichardG has quit IRC (Ping timeout: 245 seconds)
[21:00] *** godane has joined #archiveteam-bs
[21:01] so i got a new 4tb external hard drive
[21:01] i'm formatting it to ext2
[21:12] *** ndiddy has joined #archiveteam-bs
[21:18] godane: Is it 1996 and nobody told me?
[21:19] *** masterX24 has quit IRC (Quit: Page closed)
[21:29] 4tb?
[21:29] That's not 1996
[21:29] Also, don't be a "Mine's Bigger" nerd, it's unseemly.
[21:34] I think he meant the filesystem
[21:37] thats what i was thinking
[21:37] i only use ext2 cause ext4 broke a filesystem on a hard drive/usb stick once
[21:41] *** spiko has quit IRC (Read error: Operation timed out)
[21:43] *** j08nY has joined #archiveteam-bs
[21:43] That is even nerdier
[21:43] ext2 is fine, I do everything with FAT32
[21:45] didn't ext2 have some size limits?
[21:45] maybe at least 16tb or something
[21:45] only size limit i could think that would be problem currently
[21:46] https://www.cyberciti.biz/howto/question/static/maximum-partition-size.php
[21:47] looks like maximum partition size is 4tb
[21:47] hmm yeah, nvm
[21:51] anyways this drive will be mostly used as my media drive ;-)
[21:51] and backup for my dad's media drive
[21:51] depending on how you do that backup I guess the 2TB limit is fine
[21:52] its 4TB limit
[21:52] 2TB for max file size
[21:52] oh
[21:53] i have nothing close to even 5gb file size
[21:53] sounds good :)
[21:53] its mostly SD videos
[21:54] it'll take a few more years for videos to reach TB sizes
[21:54] some day maybe
[21:54] this will offload the Johnny Carson collection i have on my main drive also
[22:05] *** BlueMaxim has joined #archiveteam-bs
[22:07] *** GE has quit IRC (Remote host closed the connection)
[22:15] *** tobbez has joined #archiveteam-bs
[22:26] *** Honno has quit IRC (Ping timeout: 370 seconds)
[22:30] So, interesting problem. I am running a long grab on all fanfiction archives. Some of them are hosted on freeservers.com, which does inquisition levels of checking of stuff like referrer, etc to prevent direct grabs. Is there an easy solution around this?
[22:31] Example: http://animalfiles.freeservers.com/stories/poorscully.txt
[22:32] rocode: do you know any good way to get all stories for a certain user?
[22:34] Going through their index I guess. I think as long as the referrer is http://animalfiles.freeservers.com it allows it.
[22:36] Ah right. Author I want has 23 stories. Shouldn't be too bad.. but some of them are 60k worsa
[22:36] Words
[22:37] Oh, like getting all the stories from a single author across multiple archives?
[22:37] Yea
[22:38] Probably a manual thing
[22:38] I guess I could make a GCSE from my master list, search by author name, compile responses, and feed into grab-site.
[22:39] it would only search archives indexed by Fanlore though.
[22:40] But if it is only 23 stories I would just compile the links and grab-site the list.
[22:40] I think he meant the filesystem <-- This. ext2 seems like a poor choice for storage.
[22:41] Also, my NAS is 2TB, so I'm not getting into any epeen contests here.
[22:41] I have 1TB of storage. I have to hide my shame when commenting in #DataHoarder.
[22:42] My workflow is: Grab 500gb, break into chunks, upload to IA, delete, start next set.
[22:43] I remember buying an 8GB drive around 1997. It cost around 450€.
[22:45] *** RichardG_ is now known as RichardG
[23:01] ext2 is not a good choice for anything since ext4 is strictly better, downgradeable to ext3 and available since 2008.
[23:03] (The downgradable part means you can read it with any kernel released after 2000)
[23:09] turns out have the other drive has ext4
[23:09] and the drive i'm using as ext4
[23:10] so i'm converting it to ext4 now
[23:27] *** odemg has quit IRC (Remote host closed the connection)
[23:30] *** odemg has joined #archiveteam-bs
[23:36] >Fanfiction "Archive", created after previous iteration went down without backup. >Robots.txt User-Agent: * Disallow: /
[23:36] Why do people do this.
[23:41] *** GE has joined #archiveteam-bs
[23:58] *** passerby has quit IRC ()
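Note on the [22:30]-[22:34] exchange: the suggestion there is that freeservers.com serves a file as long as the referrer is the site's own address. A minimal Python sketch of that idea, assuming the check looks only at the Referer header. The helper name request_with_site_referer is hypothetical, and this has not been tested against freeservers.com.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def request_with_site_referer(url: str) -> Request:
    """Build a request whose Referer is the root of the URL's own site.

    Hypothetical helper: it assumes the host only checks that the
    Referer header matches its own scheme and hostname.
    """
    parts = urlsplit(url)
    referer = f"{parts.scheme}://{parts.netloc}"
    # Request stores the header under the key "Referer".
    return Request(url, headers={"Referer": referer})


req = request_with_site_referer(
    "http://animalfiles.freeservers.com/stories/poorscully.txt"
)
print(req.get_header("Referer"))
```

The request object can then be passed to urllib.request.urlopen(req) to fetch the file. If the host also checks cookies or the user agent, this sketch will not be enough.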