#archiveteam 2015-04-26,Sun


Time Nickname Message
00:04 🔗 wp494 anyone that has tracker access: please check #yolohalo for a note re. a user named "dotZIP"
00:04 🔗 wp494 I'll repost here for convenience:
00:04 🔗 wp494 [19:03:20] <wp494> HEY heads up: "dotZIP" has been returning quite a bunch of 0.3 MB items
00:05 🔗 wp494 [19:03:34] <wp494> that definitely doesn't seem right, probably want to check the stuff out
00:05 🔗 wp494 [19:03:45] <wp494> he isn't using the warrior either
00:05 🔗 wp494 (this is for the Halo project)
00:18 🔗 chfoo person was using tor
00:23 🔗 kyan has quit IRC (Quit: Leaving)
00:32 🔗 aaaaaaaaa maybe checkip should error out if facebookcorewwwi.onion resolves
00:35 🔗 wp494 ^^^
00:35 🔗 wp494 that's what I was thinking, a check to maybe check.torproject.org
00:36 🔗 wp494 but a check for whether or not a specific .onion resolves would do the job just as well
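[Editor's sketch] The .onion idea suggested above could look roughly like this in Python. The function name `onion_resolves` and its injectable `resolver` parameter are hypothetical and not taken from any existing checkip script. It also assumes the client's name lookups actually pass through Tor (for example via torsocks or transparent proxying); a client that only uses Tor as a SOCKS proxy for HTTP would not be caught this way.

```python
import socket


# Hypothetical helper for illustration; not part of any Archive Team tool.
def onion_resolves(hostname="facebookcorewwwi.onion", resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves.

    An ordinary resolver should fail on a .onion name, so a successful
    lookup hints that traffic is being routed through Tor.
    """
    try:
        resolver(hostname, 80)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

A check script could call `onion_resolves()` at startup and refuse to run when it returns True. The `resolver` argument exists only so the logic can be exercised without touching the network.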
00:41 🔗 mistym has quit IRC (Remote host closed the connection)
01:15 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
01:35 🔗 schbirid has quit IRC (Read error: Operation timed out)
01:35 🔗 schbirid has joined #archiveteam
01:37 🔗 mistym has joined #archiveteam
01:54 🔗 Kazzy_ has joined #archiveteam
01:55 🔗 jk[[SVP]] has joined #archiveteam
01:55 🔗 fx_ has joined #archiveteam
01:56 🔗 w0rp_ has joined #archiveteam
01:56 🔗 Sk2d has joined #archiveteam
01:56 🔗 raccoon__ has joined #archiveteam
01:58 🔗 RedType_ has joined #archiveteam
01:59 🔗 bsmith093 has joined #archiveteam
01:59 🔗 NovaKing_ has joined #archiveteam
02:00 🔗 mutoso has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 lytv has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 Kazzy has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 fx__ has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 RedType has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 ben__ has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 nico_32 has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 NovaKing has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 Deewiant has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 jk[SVP] has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 raccoon_ has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 w0rp has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 filippo_ has quit IRC (hub.se efnet.portlane.se)
02:00 🔗 Sk1d has quit IRC (hub.se efnet.portlane.se)
02:01 🔗 nico_32_ has joined #archiveteam
02:03 🔗 filippo__ has joined #archiveteam
02:15 🔗 w0rp_ is now known as w0rp
02:15 🔗 Sk2d is now known as Sk1d
02:15 🔗 jk[[SVP]] is now known as jk[SVP]
02:15 🔗 lytv has joined #archiveteam
02:15 🔗 Kazzy_ is now known as Kazzy
02:34 🔗 nico_32_ is now known as nico_32
02:38 🔗 kyan has joined #archiveteam
02:38 🔗 nico_32 has quit IRC (Quit: Reconnecting)
02:38 🔗 nico_32 has joined #archiveteam
02:53 🔗 tux has joined #archiveteam
02:53 🔗 tux Ping-a-ling
02:54 🔗 xmc pong-a-long
02:54 🔗 tux Guess what I have
02:54 🔗 tux is now known as Compresse
02:54 🔗 Compresse I have a copy of TropicalWikis from February 2013.
02:55 🔗 Compresse Well, I guess it's better than having no copy.
02:55 🔗 xmc sweet!
02:55 🔗 xmc in mediawiki xml dump format?
02:55 🔗 Compresse Only SQL and images
02:55 🔗 xmc how big is it?
02:56 🔗 Compresse It's roughly around 1GB
02:56 🔗 xmc https://archive.org/create/
02:56 🔗 Compresse I wouldn't want to upload raw SQL obv
02:56 🔗 xmc why not?
02:57 🔗 Compresse passwords
02:57 🔗 Compresse need to scrub those
02:57 🔗 xmc ah, that
03:00 🔗 Compresse let me start doing XML dumps
03:02 🔗 Sellyme has quit IRC (Remote host closed the connection)
03:03 🔗 primus104 has quit IRC (Leaving.)
03:10 🔗 Infreq has joined #archiveteam
03:14 🔗 Ymgve has quit IRC ()
03:20 🔗 kyan has quit IRC (Quit: Leaving)
03:31 🔗 gui7 has joined #archiveteam
03:51 🔗 Sellyme has joined #archiveteam
03:51 🔗 Sellyme has quit IRC (Remote host closed the connection)
03:57 🔗 kyan has joined #archiveteam
04:07 🔗 VADemon has quit IRC (Quit: left4dead)
04:17 🔗 khaoohs has quit IRC (Read error: Connection reset by peer)
04:18 🔗 khaoohs has joined #archiveteam
04:22 🔗 Compresse xmc, I'm uploading it now
04:22 🔗 Compresse It will take a while as the images are ~1.1GB
04:23 🔗 Compresse also, I'm doing this over a residential connection, where upload isn't really good
04:33 🔗 aaaaaaaaa has quit IRC (Leaving)
04:33 🔗 espes___ http://www.eurogamer.net/articles/2015-04-26-p-t-is-being-pulled-from-psn-on-wednesday
04:36 🔗 SketchCow has quit IRC (Read error: Connection reset by peer)
04:36 🔗 SketchCow has joined #archiveteam
04:36 🔗 swebb sets mode: +o SketchCow
04:41 🔗 SketchCow Enjoying my night before going to Sweden with some high-test CD Ripping
04:48 🔗 Compresse nearly 80% done...
04:56 🔗 SketchCow Slicing through CD-ROMs like nothing, really.
05:00 🔗 Compresse yay, done
05:23 🔗 yotta has joined #archiveteam
06:13 🔗 mistym has quit IRC (Remote host closed the connection)
06:39 🔗 xmc Compresse: sweet
06:39 🔗 xmc Compresse: link?
06:39 🔗 Compresse https://archive.org/details/tropicalwikis-feb-2013
06:40 🔗 xmc \o/
06:41 🔗 Compresse I actually imported one of my older wikis into my current one
06:41 🔗 xmc it'd be helpful if you were to upload a plain .tar of the dumps, because then you can browse it online without unzipping
06:41 🔗 xmc but thanks!
06:42 🔗 Compresse xmc, that would have taken longer on my painful connection
06:48 🔗 xmc fair enough
06:51 🔗 mistym has joined #archiveteam
07:01 🔗 Compresse has quit IRC (Quit: Leaving)
07:12 🔗 za3k has joined #archiveteam
07:14 🔗 za3k Hey. I've never worked on anything involved with the archive team. I want to use Warrior to archive Github. Would this be cool?
07:17 🔗 za3k I'm familiar with all of the 5-6 major software tools to archive github already, and understand how to archive it (less reliably and completely than needed for the final version) on a personal scale already, I just don't have adequate resources.
07:17 🔗 za3k I asked Github through official channels if they'd just hand over data for public repos and they said no, although I haven't tried personal contacts yet (I'm in the same city)
07:27 🔗 scyther has joined #archiveteam
07:38 🔗 kyan has quit IRC (Quit: Leaving)
07:40 🔗 BlueMaxim has joined #archiveteam
07:40 🔗 yipdw za3k: the Warrior's main purpose at the moment is grabbing websites, and grabbing what Github sends is an untenable mess due to all the history links etc
07:41 🔗 yipdw something like github-backup is more appropriate and I guess you could wrap that up in a pipeline
07:41 🔗 yipdw that said, where would you shove the backup
07:46 🔗 dPhoenix has quit IRC (Read error: Connection reset by peer)
07:53 🔗 za3k yipdw: Yeah, I was talking about github-backup type things when I said existing tools. Wasn't sure if Warrior could deal with those. I thought archive team might have space; that's my limiting factor, as I can probably manage bandwidth.
07:53 🔗 yipdw it can
07:54 🔗 yipdw we don't really have space on our own, we use IA's facilities
07:54 🔗 yipdw I don't know if they'd be wild about taking on terabytes of data for a site that is (A) alive and (B) rapidly changing
07:54 🔗 yipdw you'd have to ask
07:55 🔗 za3k Okay, I'll ask. I'll also get a more accurate size estimate. Do you know of any way to grab the complete list of public repos and gists? That would save me some time and let me do a meaningful statistical sample to get size estimates.
07:56 🔗 za3k Also, every existing archive tool is unmaintained, and only one of them even works, so that's my first step--I'll double-check and then update the wiki with a list of tools.
07:56 🔗 yipdw no, but I do know the latter list changes about every second
07:57 🔗 yipdw if you run into problems with github-backup, its maintainer is in this channel
07:58 🔗 za3k Useful to know. Like I said, re-checking the state of tooling is probably one of my first steps.
07:58 🔗 yipdw although I'm wondering what's wrong with /gists/public from the github API
07:59 🔗 za3k Nothing, necessarily; I'm familiar with third-party tools, not the github API. I'll check up these leads and get back in a couple days.
07:59 🔗 yipdw yeah, you should really look through the github API
07:59 🔗 za3k Thanks for the help.
07:59 🔗 za3k I assumed I'd have to anyway--most of the tools I saw stopped working because they were on old API versions and unmaintained.
08:00 🔗 za3k I remember there being a few projects called github-backup so I might have skipped the linked one thinking it was a duplicate.
08:00 🔗 za3k Like I said, I need to do some more research.
08:00 🔗 za3k Just scoping out feasibility.
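[Editor's sketch] For the `/gists/public` endpoint mentioned above, a first pass at sampling could page through the listing by following the `Link` response header. The helper names `next_page_url` and `iter_public_gists`, the `max_pages` cap, and the User-Agent string are all hypothetical; the sketch makes unauthenticated requests, which GitHub rate-limits, so it is only suited to small samples.

```python
import json
import re
import urllib.request

API_ROOT = "https://api.github.com"


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    if not link_header:
        return None
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="next"', part)
        if match:
            return match.group(1)
    return None


def iter_public_gists(max_pages=3):
    """Yield gist objects from /gists/public, stopping after max_pages pages."""
    url = f"{API_ROOT}/gists/public?per_page=100"
    for _ in range(max_pages):
        if url is None:
            break
        request = urllib.request.Request(
            url, headers={"User-Agent": "gist-size-sample-sketch"}  # hypothetical UA
        )
        with urllib.request.urlopen(request) as response:
            yield from json.load(response)
            url = next_page_url(response.headers.get("Link"))
```

Iterating `iter_public_gists()` and inspecting each returned object would give a small sample to base size estimates on; it does not address the point that the list changes constantly.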
08:00 🔗 yipdw I also wonder what the goal here is; public github is massive and rapidly changing
08:01 🔗 yipdw Archive Team projects go for massive but the rapidly changing thing is a bit tough
08:01 🔗 za3k 1) There are a lot of research projects you can do with representative code-samples 2) Updating git repos isn't too painful relative to updating other changing sites 3) If github goes down it's really bad
08:01 🔗 za3k Yeah, I'm not sure how bad a snapshot is for archive team's goals
08:01 🔗 yipdw I take it (2) is not really the big deal though
08:02 🔗 yipdw I mean github is way more than its repos
08:02 🔗 za3k Agreed, although I think if you look at amounts of raw data it's mostly repos.
08:03 🔗 yipdw as far as (3), well, we all have to get burned once in a while
08:03 🔗 yipdw besides it'd be cool to watch Silicon Valley stop dead for a day
08:03 🔗 db48x heh
08:03 🔗 yipdw maybe they might learn something
08:03 🔗 yipdw and then make the same mistake again
08:05 🔗 yipdw a slightly more serious answer to (3) is yeah it definitely would be
08:08 🔗 za3k So, another serious answer might be that I could write the snapshot-style script, but let it sit stagnant unless something scary-sounding happens with Github. I don't really trust sites to give advance notice of going down, but I don't really have a feasible answer to space storage yet either.
08:09 🔗 yipdw that'd be useful, yeah
08:09 🔗 za3k I think that's a poor use of time on second thought, actually; directing effort at github-backup or something might be more useful.
08:10 🔗 yipdw also if you can get in touch with IA folk and get them on board with mirroring github, another option is to have github staff work with IA directly
08:10 🔗 za3k Anyway, again thanks for help, and I'll get back after talking to people about storage, size estimates, and with a version that runs well on a single repo.
08:10 🔗 yipdw SketchCow has IA contacts, you can ask him for an inroad
08:12 🔗 za3k Sure, I'd like to have a snapshot size [and if feasible daily delta] to quote at IA first but will do.
08:13 🔗 za3k SketchCow: feel free to reply, I'm offline but can grab logs
08:13 🔗 za3k has quit IRC (Quit: Page closed)
08:18 🔗 bugfiend has joined #archiveteam
08:19 🔗 db48x has quit IRC (Read error: Connection reset by peer)
08:23 🔗 bugfiend does anyone here know of cboyardee
08:25 🔗 primus104 has joined #archiveteam
08:33 🔗 Infreq has quit IRC (Remote host closed the connection)
08:36 🔗 espes___ re github: https://www.githubarchive.org/
08:38 🔗 espes___ wonder if you can reliable put together repos based on that
08:38 🔗 espes___ reliably*
08:54 🔗 mistym has quit IRC (Remote host closed the connection)
08:56 🔗 augusztin well the git part is easy, just clone. the rest is the issue :D
09:03 🔗 bugfiend is someone able to help me with my stupid newbie question?
09:04 🔗 bugfiend i'm trying to, uh... "wget" an entire ftp directory, but whenever I try, it gives me an error:
09:04 🔗 bugfiend "Error in server response, closing control connection."
09:04 🔗 bugfiend I have no idea what I'm doing
09:05 🔗 bugfiend what i'm trying to back up is ftp://ftp.agdg.me
09:05 🔗 bugfiend it has heaps of useful game development resources in there but it hasnt been updated for months
09:05 🔗 bugfiend user:obama pass:voteforme
09:06 🔗 bugfiend i've been afraid about it tanking all of a sudden due to its inactivity and am trying to gather a working backup
09:07 🔗 bugfiend if somebody can help, that would be very appreciated
09:12 🔗 DFJustin if wget doesn't work you could try wpull or lftp
09:14 🔗 bugfiend alright, i'll look into it
09:16 🔗 xmc i've also had some success with fuse-mounting and then doing rsync from one filesystem to another
09:17 🔗 xmc that got me a 10x speedup over a naive sftp copy (had to use sftp, don't ask)
09:19 🔗 SimpBrain ok got a 1 concurrent wpull job on that
09:20 🔗 bugfiend speaking of wpull, it requires python 3
09:20 🔗 bugfiend can i install it beside python 2?
09:21 🔗 SimpBrain needs python 3
09:29 🔗 BlueMaxim has quit IRC (Quit: Leaving)
09:34 🔗 Infreq has joined #archiveteam
09:47 🔗 bugfiend jesus christ im lost
09:47 🔗 bugfiend maybe i'll save this for another day............
09:47 🔗 bugfiend has quit IRC (Quit: till next time)
09:54 🔗 scyther has quit IRC (Leaving)
09:55 🔗 mistym has joined #archiveteam
10:01 🔗 mistym has quit IRC (Read error: Operation timed out)
10:10 🔗 ivan` has joined #archiveteam
10:23 🔗 schbirid shit https://dolphin-emu.org/blog/2015/04/25/commemoration-rachel-bryk/ :(
10:25 🔗 schbirid http://ask.fm/RachelB_
10:27 🔗 Infreq damn...
10:31 🔗 SimpBrain another trans-hate suicide?
10:33 🔗 SimpBrain ftp.agdg.me archive, will upload to ia
10:34 🔗 SimpBrain 16gb
10:36 🔗 Ymgve has joined #archiveteam
10:37 🔗 Infreq her git https://github.com/RachelBryk
10:44 🔗 xmc fuck
11:22 🔗 app has quit IRC (Ping timeout: 258 seconds)
11:45 🔗 app has joined #archiveteam
11:46 🔗 lysobit has quit IRC (quit)
11:51 🔗 lysobit has joined #archiveteam
12:02 🔗 Ravenloft has quit IRC (Read error: Connection reset by peer)
12:04 🔗 ivan` does anyone have a dedicated server that they want to donate to the archivebot cause
12:05 🔗 ivan` archivebot pipeline nodes mostly eat CPU
12:18 🔗 trs80 how much cpu/etc do you need? I could spin up a vm
12:20 🔗 ivan` the main constraint is that it has to stay running for months
12:20 🔗 ivan` the equivalent of a 2-core i3 is fine
12:21 🔗 ivan` 6GB RAM
12:22 🔗 ivan` maybe even less
12:22 🔗 ivan` ~100GB disk minimum
12:41 🔗 trs80 OS?
12:41 🔗 trs80 I prefer debian
12:43 🔗 ivan` need something with the newest libxml2
12:44 🔗 ivan` libxml2 2.9.2, I know Ubuntu 15.04 has it
12:44 🔗 trs80 ugh, jessie is 2.9.1. Ubuntu it is then
12:44 🔗 * trs80 grabs an ios
12:44 🔗 * trs80 *iso
12:45 🔗 ivan` thanks
13:01 🔗 augusztin ivan`: i wish i could guarantee you few months of CPU uptime/network connectivity :D
13:01 🔗 augusztin but that is next to impossible
13:03 🔗 ivan` network connectivity can go down if it eventually comes back up, heh
13:04 🔗 augusztin http://lolsnaps.com/upload_pic/FutureArcheology-17614.png <-- we are helping these guys :D
13:04 🔗 trs80 ivan`: do you just need a user on there?
13:04 🔗 ivan` trs80: yeah, and I guess I'll give you a list of packages I need
13:04 🔗 ivan` I'll PM you my key in a sec
13:47 🔗 Infreq ivan`: wish i could help but all my servers are small 1gb guys with minimal disk space
13:48 🔗 ivan` trs80 has saved the day
13:51 🔗 augusztin i have a not very used ESXi system with 2TB drive and 24GB RAM, but i cannot guarantee uptime, network connectivity and i would have to set up DMZ for it :D
14:26 🔗 Infreq has quit IRC (Remote host closed the connection)
14:27 🔗 Infreq has joined #archiveteam
14:28 🔗 Infreq has quit IRC (Client Quit)
14:31 🔗 Infreq has joined #archiveteam
15:17 🔗 nwf has quit IRC (Read error: Operation timed out)
15:18 🔗 nwf has joined #archiveteam
15:35 🔗 SketchCow Hi.
15:35 🔗 SketchCow If Github Backup Guy comes back, the answer is: No
15:35 🔗 SketchCow I mean, certainly not alone, with someone just trying it out on their own
15:35 🔗 SketchCow It's too big, too important and too involved for a single person to hope for the best.
15:45 🔗 habi has joined #archiveteam
15:45 🔗 habi has left
15:52 🔗 app has quit IRC (Ping timeout: 258 seconds)
15:53 🔗 app has joined #archiveteam
15:57 🔗 SketchCow I'm heading out today to Sweden. I'll be back Wednesday and no doubt online during.
16:00 🔗 mistym has joined #archiveteam
16:02 🔗 signius has quit IRC (Read error: Operation timed out)
16:05 🔗 SketchCow One other piece
16:05 🔗 SketchCow FOS is getting HAMMERED right now with updates and uploads and the rest.
16:05 🔗 SketchCow It got down to a gig of free space at one point, I slammed that back to something like a terabyte.
16:06 🔗 SketchCow I've got a bunch of processes running on a ton of jobs to deal with it. With luck, it should clear out pretty well.
16:06 🔗 SketchCow But if someone sees bad behavior, that's what it is.
16:09 🔗 mistym has quit IRC (Read error: Operation timed out)
16:15 🔗 signius has joined #archiveteam
16:36 🔗 xtr-107 has joined #archiveteam
16:39 🔗 xtr-201 has quit IRC (Ping timeout: 370 seconds)
16:41 🔗 mistym has joined #archiveteam
16:45 🔗 app103 has joined #archiveteam
16:45 🔗 app has quit IRC (Ping timeout: 258 seconds)
17:06 🔗 RichardG has quit IRC (Keyboard not found, press F1 to continue)
17:06 🔗 RichardG has joined #archiveteam
17:32 🔗 mistym has quit IRC (Remote host closed the connection)
17:37 🔗 aaaaaaaaa has joined #archiveteam
17:45 🔗 mistym has joined #archiveteam
17:50 🔗 Deewiant has joined #archiveteam
17:50 🔗 mutoso has joined #archiveteam
18:22 🔗 lytv has quit IRC (Ping timeout: 260 seconds)
18:32 🔗 dashcloud has quit IRC (Ping timeout: 260 seconds)
18:33 🔗 dashcloud has joined #archiveteam
18:37 🔗 arkiver SketchCow: achip and I are creating and testing a new newsletter project: http://mail3.newsletter.nerds.io/newspoc/
18:45 🔗 mistym has quit IRC (Remote host closed the connection)
18:51 🔗 lytv has joined #archiveteam
18:54 🔗 mistym has joined #archiveteam
18:58 🔗 app has joined #archiveteam
18:59 🔗 app103 has quit IRC (Ping timeout: 258 seconds)
19:01 🔗 habi has joined #archiveteam
19:07 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:10 🔗 dashcloud has joined #archiveteam
19:12 🔗 habi has quit IRC (Quit: Leaving.)
19:39 🔗 habi has joined #archiveteam
19:41 🔗 habi has left
19:52 🔗 SN4T14 has joined #archiveteam
19:58 🔗 SN4T14_ has quit IRC (Ping timeout: 512 seconds)
20:01 🔗 kniffy i've got a couple of sites that i run that will be dying soon if anyone wants to take a shot at archiving - http://not99chan.org and http://kingofthemem.es
20:02 🔗 kniffy i'll have to dump an sql db for kingofthemem.es
20:05 🔗 kyan has joined #archiveteam
20:08 🔗 nwf has quit IRC (Read error: Operation timed out)
20:09 🔗 nwf has joined #archiveteam
20:14 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:27 🔗 dashcloud has joined #archiveteam
20:49 🔗 app103 has joined #archiveteam
20:50 🔗 app has quit IRC (Read error: Operation timed out)
20:52 🔗 midas added them to the archivebot, not sure if it can do anything with the meme page tho
21:01 🔗 Stilett0 has joined #archiveteam
21:01 🔗 SimpBrain ftp.agdg.me upload to ia
21:02 🔗 SimpBrain https://archive.org/details/ftp.agdg.me.warc
21:03 🔗 dashcloud has quit IRC (Read error: Operation timed out)
21:05 🔗 RichardG has quit IRC (Ping timeout: 606 seconds)
21:06 🔗 kniffy midas: yeah, it's a weird one
21:10 🔗 dashcloud has joined #archiveteam
21:22 🔗 app103 has quit IRC (Ping timeout: 258 seconds)
21:23 🔗 app has joined #archiveteam
21:24 🔗 RichardG has joined #archiveteam
21:39 🔗 app has quit IRC (Ping timeout: 258 seconds)
21:40 🔗 app has joined #archiveteam
21:50 🔗 Dark_Star has quit IRC (Read error: Connection reset by peer)
21:51 🔗 Dark_Star has joined #archiveteam
21:52 🔗 Froggypwn has quit IRC (Read error: Connection reset by peer)
22:02 🔗 app has quit IRC (Ping timeout: 258 seconds)
22:09 🔗 app has joined #archiveteam
22:13 🔗 dPhoenix has joined #archiveteam
22:16 🔗 ivan` has left
22:38 🔗 mistym has quit IRC (Remote host closed the connection)
22:45 🔗 app has quit IRC (Ping timeout: 258 seconds)
22:47 🔗 app has joined #archiveteam
22:52 🔗 mistym has joined #archiveteam
22:56 🔗 mistym has quit IRC (Remote host closed the connection)
22:57 🔗 BlueMaxim has joined #archiveteam
23:00 🔗 SimpBrain has quit IRC (Ping timeout: 258 seconds)
23:03 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
23:09 🔗 BlueMaxim has joined #archiveteam
23:26 🔗 kyan has quit IRC (Quit: Leaving)
23:26 🔗 kyan has joined #archiveteam
23:27 🔗 mistym has joined #archiveteam
23:38 🔗 Morbus has quit IRC (http://www.disobey.com/)
23:41 🔗 Morbus has joined #archiveteam
23:45 🔗 app has quit IRC (Ping timeout: 258 seconds)
23:45 🔗 app has joined #archiveteam
23:45 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:47 🔗 dashcloud has joined #archiveteam
23:58 🔗 app has quit IRC (Ping timeout: 258 seconds)
23:59 🔗 BlueMaxim has quit IRC (Ping timeout: 512 seconds)
