#archiveteam 2015-06-10,Wed

Time Nickname Message
00:01 🔗 winr5r has joined #archiveteam
00:01 🔗 Fusl_ is now known as Fusl
00:02 🔗 kisspunc- is now known as kisspunch
00:12 🔗 mistym has quit IRC (Remote host closed the connection)
00:26 🔗 mistym has joined #archiveteam
00:31 🔗 lexicon has quit IRC (Read error: Operation timed out)
00:32 🔗 Fletcher has quit IRC (Ping timeout: 252 seconds)
00:34 🔗 SadDM has quit IRC (Remote host closed the connection)
00:34 🔗 SadDM has joined #archiveteam
00:34 🔗 swebb sets mode: +o SadDM
00:35 🔗 lexicon has joined #archiveteam
00:43 🔗 BlueMaxim has joined #archiveteam
00:54 🔗 Fletcher has joined #archiveteam
01:18 🔗 JesseW has joined #archiveteam
01:22 🔗 schbirid2 has joined #archiveteam
01:24 🔗 username1 has quit IRC (Read error: Operation timed out)
01:55 🔗 sunnymilk has joined #archiveteam
01:57 🔗 McGEE has quit IRC (Quit: Connection closed for inactivity)
01:59 🔗 ripvanwin has quit IRC (Read error: Operation timed out)
02:13 🔗 ripvanwin has joined #archiveteam
02:25 🔗 JesseW has quit IRC (Quit: Leaving.)
02:31 🔗 McGEE has joined #archiveteam
02:33 🔗 Nyenkaht has joined #archiveteam
02:33 🔗 Nyenkaht Hi, I keep trying to assist the pomf.se project, but I keep getting told to "wait" another minute
02:34 🔗 Nyenkaht It's been over 12 minutes since I've been given something, I have only one connection to this project
02:35 🔗 aaaaaaaaa there are rate limits to keep from overloading their servers and you just haven't gotten lucky yet.
02:35 🔗 Nyenkaht so basically i have to run like 30+ servers in order to become a face on the project? theres people doing like 5+ units like every second
02:38 🔗 Nyenkaht aaaaaaaaa: I have one unit since I started close to a half hour ago
02:38 🔗 Nyenkaht that is sad
02:39 🔗 aaaaaaaaa I don't know what the traffic looks like, just that they do do limits
02:40 🔗 Nyenkaht well, clearly those limits arent effective
02:41 🔗 Nyenkaht Are there any active projects that don't have a restriction on 1 unit per world war
02:46 🔗 aaaaaaaaa URLTeam uses a different formula and you may have better luck with halo
02:47 🔗 Nyenkaht Yeah I've been contributing to URLTeam primarily
02:47 🔗 Nyenkaht 1 million records scanned and counting since i think sunday
02:48 🔗 achip Nyenkaht the rate limiting is done on the tracker. it hands out x jobs a minute to the first warriors that ask, not limiting you specifically. the limit may go up, but for the next few hours pomf.se'll remain at the same rate so we can make sure the system fares ok
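The behaviour achip describes (a fixed number of jobs per minute, handed to whichever warriors ask first) can be illustrated with a minimal sketch. This is not the real tracker code: the `Tracker` class, its `request` method, and the fixed 60-second window are all assumptions made for illustration.

```python
import collections


class Tracker:
    """Hypothetical tracker: hands out at most `per_minute` items per 60 s window."""

    def __init__(self, items, per_minute):
        self.queue = collections.deque(items)
        self.per_minute = per_minute
        self.window_start = None  # start time of the current window, in seconds
        self.handed_out = 0       # items handed out in the current window

    def request(self, warrior, now):
        """Return the next item, or None if this warrior has to wait."""
        # Open a fresh window once the previous 60 seconds have elapsed.
        if self.window_start is None or now - self.window_start >= 60:
            self.window_start = now
            self.handed_out = 0
        # The limit is global: it does not matter which warrior is asking.
        if self.handed_out >= self.per_minute or not self.queue:
            return None
        self.handed_out += 1
        return self.queue.popleft()


tracker = Tracker(items=range(10), per_minute=2)
print(tracker.request("warrior-a", now=0))   # gets an item
print(tracker.request("warrior-b", now=1))   # gets an item
print(tracker.request("warrior-c", now=2))   # limit reached, told to wait
print(tracker.request("warrior-c", now=61))  # new window, gets an item
```

Under this model a warrior with a single connection can wait many minutes simply because others asked first in every window, which matches what Nyenkaht reports.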
02:50 🔗 sirdancea has quit IRC (Read error: Operation timed out)
02:57 🔗 Nyenkaht achip: looks like i'm gonna fit in well with this halo project, i'm averaging 1MB/s apparently
03:15 🔗 Ymgve has quit IRC ()
03:16 🔗 bzc6p_ has joined #archiveteam
03:16 🔗 swebb sets mode: +o bzc6p_
03:19 🔗 bzc6p has quit IRC (Read error: Operation timed out)
03:21 🔗 mistym has quit IRC (Remote host closed the connection)
03:26 🔗 lexicon has quit IRC (Read error: Operation timed out)
03:26 🔗 Nyenkaht has quit IRC (Quit: .)
03:29 🔗 SadDM has quit IRC (Remote host closed the connection)
03:29 🔗 SadDM has joined #archiveteam
03:29 🔗 swebb sets mode: +o SadDM
03:30 🔗 lexicon has joined #archiveteam
03:32 🔗 yuvadm_ has joined #archiveteam
03:32 🔗 tephra_ has joined #archiveteam
03:32 🔗 useretail has quit IRC (Read error: Operation timed out)
03:33 🔗 garyrh has quit IRC (Read error: Operation timed out)
03:33 🔗 yuvadm has quit IRC (Read error: Operation timed out)
03:33 🔗 tephra has quit IRC (Read error: Operation timed out)
03:34 🔗 winr5r has quit IRC (Ping timeout: 255 seconds)
03:35 🔗 lytv has quit IRC (Read error: Operation timed out)
03:38 🔗 lytv has joined #archiveteam
03:42 🔗 winr4r has joined #archiveteam
03:49 🔗 mistym has joined #archiveteam
03:57 🔗 Ravenloft has quit IRC (Ping timeout: 362 seconds)
04:06 🔗 useretail has joined #archiveteam
04:09 🔗 yotta has quit IRC (Read error: Operation timed out)
04:09 🔗 Ctrl-S has quit IRC (Read error: Connection reset by peer)
04:09 🔗 Ctrl-S_ is now known as Ctrl-S
04:10 🔗 joepie91 has quit IRC (Read error: Operation timed out)
04:10 🔗 aMunster has quit IRC (Read error: Operation timed out)
04:10 🔗 toad1 has quit IRC (Read error: Operation timed out)
04:10 🔗 phuzion has quit IRC (Read error: Operation timed out)
04:10 🔗 mutoso has quit IRC (Read error: Operation timed out)
04:10 🔗 nwf has quit IRC (Read error: Operation timed out)
04:10 🔗 dinomite_ has quit IRC (Write error: Broken pipe)
04:11 🔗 dinomite has joined #archiveteam
04:11 🔗 S[h]O[r]T has quit IRC (Read error: Operation timed out)
04:11 🔗 marvinw has quit IRC (Read error: Operation timed out)
04:12 🔗 achip has quit IRC (Read error: Operation timed out)
04:12 🔗 ripvanwin has quit IRC (Read error: Operation timed out)
04:12 🔗 joepie91 has joined #archiveteam
04:13 🔗 vegbrasil has quit IRC (Ping timeout: 600 seconds)
04:16 🔗 bzc6p_ has quit IRC (Read error: Operation timed out)
04:18 🔗 mutoso has joined #archiveteam
04:20 🔗 sep332 has quit IRC (Ping timeout: 600 seconds)
04:22 🔗 mistym has quit IRC (Remote host closed the connection)
04:29 🔗 RichardG_ has joined #archiveteam
04:30 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
04:31 🔗 Emcy_ has joined #archiveteam
04:32 🔗 SketchCow POMF is being put in the wrong place on FOS, but I will deal.
04:34 🔗 Emcy has quit IRC (Ping timeout: 306 seconds)
04:34 🔗 lytv has quit IRC (Ping timeout: 306 seconds)
04:34 🔗 SketchCow Can someone help rip a site out of wayback to give to someone?
04:37 🔗 lytv has joined #archiveteam
04:38 🔗 aaaaaaaaa has quit IRC (Leaving)
04:38 🔗 marvinw has joined #archiveteam
04:38 🔗 phuzion has joined #archiveteam
04:38 🔗 ripvanwin has joined #archiveteam
04:38 🔗 nwf has joined #archiveteam
04:38 🔗 S[h]O[r]T has joined #archiveteam
04:38 🔗 vegbrasil has joined #archiveteam
04:39 🔗 Control-S has joined #archiveteam
04:39 🔗 achip has joined #archiveteam
04:40 🔗 aMunster has joined #archiveteam
04:40 🔗 Froggypwn has quit IRC (Ping timeout: 240 seconds)
04:41 🔗 sep332 has joined #archiveteam
04:42 🔗 toad1 has joined #archiveteam
04:44 🔗 rduser has quit IRC (Read error: Operation timed out)
04:44 🔗 midas has quit IRC (Read error: Operation timed out)
04:44 🔗 midas has joined #archiveteam
04:44 🔗 rduser has joined #archiveteam
04:47 🔗 SN4T14_ has joined #archiveteam
04:48 🔗 SN4T14 has quit IRC (Ping timeout: 369 seconds)
04:59 🔗 yipdw SketchCow: oh oops
05:00 🔗 yipdw I just put it in /1 because that's where everything else was
05:00 🔗 yipdw is it supposed to be /0?
05:04 🔗 mistym has joined #archiveteam
05:05 🔗 SketchCow traditionally it's in /1/CHFOO/warrior but I've already adapted.
05:07 🔗 McGEE has quit IRC (Quit: Connection closed for inactivity)
05:15 🔗 SketchCow Pomf now being loaded into archive
05:16 🔗 SketchCow halo continues, it's not flooding the system yet
05:16 🔗 SketchCow Did we get clearance off of Furraffinity or is there more?
05:20 🔗 SketchCow bsmith093: Clearance on the fanfiction?
05:25 🔗 Froggypwn has joined #archiveteam
05:53 🔗 JesseW has joined #archiveteam
06:02 🔗 bzc6p_ has joined #archiveteam
06:02 🔗 swebb sets mode: +o bzc6p_
06:12 🔗 SketchCow I almost made the same mistake with pomf that I did with baraza
06:12 🔗 SketchCow Items need to be much smaller, not larger.
06:31 🔗 bzc6p_ is now known as bzc6p
06:34 🔗 mistym has quit IRC (Remote host closed the connection)
06:40 🔗 garyrh has joined #archiveteam
06:54 🔗 JesseW has quit IRC (Quit: Leaving.)
06:54 🔗 ripvanwin has quit IRC (Read error: Connection reset by peer)
06:54 🔗 RichardG_ has quit IRC (Remote host closed the connection)
06:55 🔗 ripvanwin has joined #archiveteam
07:05 🔗 nox has quit IRC (Ping timeout: 252 seconds)
07:35 🔗 mistym has joined #archiveteam
07:36 🔗 godane has quit IRC (Read error: Operation timed out)
07:40 🔗 mistym has quit IRC (Ping timeout: 252 seconds)
07:40 🔗 joepie91 PSA
07:41 🔗 joepie91 .title https://torrentfreak.com/elsevier-cracks-down-on-pirated-scientific-articles-150609/
07:41 🔗 joepie91 (no botpie?
07:41 🔗 joepie91 Academic publishing company Elsevier has filed a complaint at a New York District Court, hoping to shut down the Library Genesis project and the SciHub.org search engine. The sites, which are particularly popular in developing nations where access to academic works is relatively expensive, are accused of pirating millions of scientific articles.
07:41 🔗 joepie91 so..
07:41 🔗 joepie91 yeah, maybe it's time for a copy
07:54 🔗 schbirid2 there are several mirrors
07:58 🔗 godane has joined #archiveteam
08:00 🔗 khaoohs_ has joined #archiveteam
08:00 🔗 khaoohs has quit IRC (Read error: Connection reset by peer)
08:07 🔗 bzc6p "net income of more than $1 billion [...] losses, which could run into the millions."
08:08 🔗 bzc6p so it *may* lose 1 of 1000 million dollars
08:28 🔗 DFJustin has quit IRC (Ping timeout: 740 seconds)
08:32 🔗 MMovie has joined #archiveteam
08:33 🔗 Swizzle__ has quit IRC (Read error: Connection reset by peer)
08:35 🔗 MMovie1 has quit IRC (Ping timeout: 306 seconds)
08:35 🔗 jmtd has quit IRC (Quit: ZNC - http://znc.in)
08:35 🔗 primus104 has joined #archiveteam
09:16 🔗 Froggypwn has quit IRC (Read error: Connection reset by peer)
09:16 🔗 primus104 has quit IRC (Leaving.)
09:17 🔗 Froggypwn has joined #archiveteam
09:25 🔗 mistym has joined #archiveteam
09:38 🔗 mistym has quit IRC (Read error: Operation timed out)
10:24 🔗 db48x has quit IRC (Read error: Connection reset by peer)
10:36 🔗 john4 has quit IRC (Ping timeout: 370 seconds)
10:44 🔗 vOYtEC has joined #archiveteam
10:47 🔗 vOYtEC has quit IRC (Read error: Connection reset by peer)
10:47 🔗 john4 has joined #archiveteam
10:48 🔗 vOYtEC has joined #archiveteam
10:50 🔗 vOYtEC has quit IRC (Read error: Connection reset by peer)
10:51 🔗 vOYtEC has joined #archiveteam
10:51 🔗 vOYtEC has quit IRC (Read error: Connection reset by peer)
11:14 🔗 mistym has joined #archiveteam
11:18 🔗 khaoohs has joined #archiveteam
11:18 🔗 bryan1 has joined #archiveteam
11:19 🔗 bryan1 hi
11:19 🔗 bryan1 is now known as _bryan
11:22 🔗 Ymgve has joined #archiveteam
11:24 🔗 mistym has quit IRC (Read error: Operation timed out)
11:24 🔗 khaoohs_ has quit IRC (Read error: Operation timed out)
11:26 🔗 sirdancea has joined #archiveteam
11:30 🔗 primus104 has joined #archiveteam
11:44 🔗 dinomite has quit IRC (Read error: Operation timed out)
11:44 🔗 dinomite has joined #archiveteam
11:48 🔗 nox has joined #archiveteam
12:14 🔗 midas this sucks http://classic.xfire.com/
12:16 🔗 midas still reachable via http://208.88.178.38/profile/%profilename% if someone has the time to grab it
12:40 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:03 🔗 mistym has joined #archiveteam
13:05 🔗 primus104 has quit IRC (Read error: Connection reset by peer)
13:10 🔗 mistym has quit IRC (Read error: Operation timed out)
13:14 🔗 Marc has joined #archiveteam
13:15 🔗 jmc has quit IRC ()
13:51 🔗 Start has quit IRC (Disconnected.)
13:51 🔗 Start has joined #archiveteam
13:51 🔗 Start has quit IRC (Client Quit)
13:55 🔗 Froggypwn has quit IRC (Ping timeout: 606 seconds)
13:56 🔗 Froggypwn has joined #archiveteam
13:58 🔗 primus104 has joined #archiveteam
14:04 🔗 mistym has joined #archiveteam
14:05 🔗 mistym has quit IRC (Remote host closed the connection)
14:05 🔗 mistym has joined #archiveteam
14:07 🔗 mistym has quit IRC (Remote host closed the connection)
14:09 🔗 DFJustin has joined #archiveteam
14:21 🔗 habi has joined #archiveteam
14:22 🔗 habi has left
14:27 🔗 habi has joined #archiveteam
14:32 🔗 mistym has joined #archiveteam
14:43 🔗 sirdancea has quit IRC (Read error: Operation timed out)
14:44 🔗 mistym has quit IRC (Remote host closed the connection)
14:47 🔗 habi has quit IRC (Quit: Leaving.)
14:48 🔗 Start has joined #archiveteam
14:54 🔗 primus104 has quit IRC (Leaving.)
14:55 🔗 bzc6p midas: deadline is June 12
14:55 🔗 bzc6p added to Deathwatch
14:56 🔗 Start has quit IRC (Disconnected.)
14:58 🔗 chrki has joined #archiveteam
14:58 🔗 chrki hey guys
14:58 🔗 mistym has joined #archiveteam
15:02 🔗 bzc6p_ has joined #archiveteam
15:02 🔗 swebb sets mode: +o bzc6p_
15:03 🔗 Start has joined #archiveteam
15:05 🔗 bzc6p has quit IRC (Read error: Operation timed out)
15:07 🔗 chrki I don't know if this fits your project's scope and all, I haven't found a good solution with archive.org (since they only let me archive pages one by one). The Xfire gaming messenger client is shutting down, along with all user profiles on their website. I found a way to access these still (through some Google trial and error), most links on there still work while the official page and links will all redirect to an "export
15:07 🔗 chrki your data page" or 404. Some example: http://google.comw.profile.xfire.com/profile/lkayral/ (not my profile) vs. http://xfire.com/profile/lkayral/. The google.comw.xfire.com domain seems to be aimed at search crawlers, Javascript links on there won't work, some things (like full gaming history on profiles) are simply hidden with CSS attributes
15:11 🔗 bzc6p_ is now known as bzc6p
15:11 🔗 bzc6p chrki: Welcome
15:11 🔗 bzc6p It absolutely fits into ArchiveTeam's scope, thanks for the report!
15:12 🔗 bzc6p You are not the only one: midas just mentioned it and found another way to access: http://208.88.178.38/profile/%profilename%
15:12 🔗 bzc6p (it might be the same as that google.comw whatever)
15:13 🔗 bzc6p chrki: do you have an idea how much that content is (num of profiles, amount in gigabytes) approximately?
15:13 🔗 bzc6p Just to see the scale.
15:14 🔗 chrki Wikipedia says 24 million users, each one probably has a profile, a friends and a screenshots page (although I would guess they are mostly empty), some profiles might be private, I don't know how much that could be
15:16 🔗 RichardG has joined #archiveteam
15:17 🔗 bzc6p There are videos also, aren't there?
15:17 🔗 chrki Yes there are
15:17 🔗 bzc6p And we have two days...
15:17 🔗 chrki Just checked a 34 second video, that's 3.2MB
15:20 🔗 bzc6p I don't know if we'll be able to set up a project quickly, but by scale it must be a Warrior one. At least let's start the conversation. I'll devote my afternoon to it.
15:20 🔗 bzc6p I suggest the channel #xfired
15:20 🔗 bzc6p chrki: can you stay for some discussion? We need to discover the site structure first.
15:21 🔗 chrki 39505 videos alone in a month for the top 10 games http://webcache.googleusercontent.com/search?q=cache:r0zq17n_ifcJ:de.xfire.com/cms/stats/+&cd=1&hl=de&ct=clnk&gl=de
15:21 🔗 chrki sure I'll be here
15:22 🔗 bzc6p please come to #xfired and other interested too. We do what we can.
15:50 🔗 Start has quit IRC (Disconnected.)
15:51 🔗 mistym has quit IRC (Remote host closed the connection)
16:08 🔗 vOYtEC has joined #archiveteam
16:08 🔗 chrki has quit IRC (Quit: Leaving)
16:08 🔗 mistym has joined #archiveteam
16:11 🔗 nertzy has joined #archiveteam
16:22 🔗 bzc6p So these xfire guys gave like 2 days for users to export their stuff (links are already broken)
16:22 🔗 bzc6p Waiting queue is 21 houts
16:22 🔗 bzc6p *hours
16:23 🔗 bzc6p The site may be hosting 360,000,000 videos (short game recordings)
16:23 🔗 bzc6p and who knows how many screenshots
16:23 🔗 bzc6p everything nuked on friday
16:25 🔗 kniffy yikes
16:27 🔗 nertzy has quit IRC (Quit: This computer has gone to sleep)
16:30 🔗 GLaDOS has quit IRC (Ping timeout: 252 seconds)
16:32 🔗 antomatic yow
16:36 🔗 GLaDOS has joined #archiveteam
16:49 🔗 aaaaaaaaa has joined #archiveteam
16:49 🔗 swebb sets mode: +o aaaaaaaaa
16:52 🔗 schbirid2 http://www.bbc.com/news/business-33076527 -> https://drive.google.com/file/d/0B-Kg8JC-9TqnN245SU1rT2Q3VDg/view
17:02 🔗 Start has joined #archiveteam
17:22 🔗 sb057 has joined #archiveteam
17:23 🔗 sb057 emojli ("joke", emoji-based social network) just announced they're shutting down July 30, and deleting everything
17:23 🔗 sb057 http://emoj.li/
17:25 🔗 bzc6p sb057: what scale? (e.g. how many users)
17:26 🔗 sb057 not sure, but it got featured in The Independent and Time, so presumably more than six
17:26 🔗 sunnymilk where does all the stuff you guys archive get stored
17:28 🔗 bzc6p sunnymilk: we upload them to the Internet Archive
17:30 🔗 bzc6p sb057: I've added it to our Deathwatch for now. Thanks for reporting.
17:40 🔗 Start has quit IRC (Disconnected.)
17:45 🔗 SimpBrain xfire has broken their profile page btw, cant access my own one. waiting for my data export
17:47 🔗 bzc6p SimpBrain: you may find some useful information on repairing your profile page: http://www.reddit.com/r/Games/comments/39a41v/xfire_social_profiles_shutdown_save_your/
17:47 🔗 SimpBrain cool
17:47 🔗 bzc6p SimpBrain: how long is the queue and the waiting time?
17:47 🔗 SimpBrain dunno
17:47 🔗 bzc6p Didn't it inform you?
17:47 🔗 SimpBrain im number 3958
17:48 🔗 SimpBrain no time
17:48 🔗 sb057 well, looking at the comments
17:48 🔗 bzc6p According to that reddit thread, it'll take long
17:48 🔗 sb057 You are #653 in queue (Approx 1306 minutes)
17:48 🔗 bzc6p I don't know however how accurate that estimation is.
17:49 🔗 primus104 has joined #archiveteam
17:51 🔗 aaaaaaaaa 135 hours if it is
17:52 🔗 aaaaaaaaa actually closer to 136
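The estimate above can be reproduced roughly from the figures quoted in the channel. A minimal sketch, assuming the export queue is processed at a constant rate and taking the Reddit comment ("#653 in queue (Approx 1306 minutes)") at face value; the function name and its parameters are hypothetical.

```python
def estimate_wait_hours(position, ref_position=653, ref_minutes=1306):
    """Scale the quoted reference estimate linearly to another queue position."""
    minutes_per_slot = ref_minutes / ref_position  # 2.0 with the quoted figures
    return position * minutes_per_slot / 60


# SimpBrain reported being number 3958 in the queue.
print(round(estimate_wait_hours(3958), 1))
```

With these inputs the result is about 132 hours, in the same ballpark as the 135 to 136 hours mentioned above, and well past the two-day deadline either way.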
17:53 🔗 bzc6p We just wondered with achip if we should archive the site even if we could. It would probably just slow down regular users' access even more.
17:57 🔗 c_b has joined #archiveteam
18:10 🔗 nertzy has joined #archiveteam
18:18 🔗 schbirid has joined #archiveteam
18:19 🔗 mutoso has quit IRC (Quit: leaving)
18:21 🔗 SimpBrain 3957 now lol, takes ages for xfire
18:24 🔗 bzc6p Strange. Just 7z-ing a bunch of files. (Although it's just a waste of time to 7z vids and pics.)
18:30 🔗 sirdancea has joined #archiveteam
18:37 🔗 primus104 has quit IRC (Leaving.)
18:39 🔗 c_b has quit IRC (Ping timeout: 252 seconds)
18:50 🔗 nertzy has quit IRC (This computer has gone to sleep)
18:51 🔗 habi has joined #archiveteam
18:53 🔗 Start has joined #archiveteam
18:57 🔗 sirdancea has quit IRC (Read error: Operation timed out)
19:01 🔗 aaaaaaaaa that is probably harder than it looks without affecting regular users.
19:01 🔗 habi has left
19:20 🔗 Start has quit IRC (Disconnected.)
19:21 🔗 Start has joined #archiveteam
19:25 🔗 Start has quit IRC (Client Quit)
19:25 🔗 aNthraXx has quit IRC (Read error: Operation timed out)
19:30 🔗 Start has joined #archiveteam
19:33 🔗 sb057 so how about those subreddits that just got deleted, eh?
19:34 🔗 kniffy /r/fatpeoplehate ?
19:35 🔗 Apathy_ sb057 drama is delicious
19:35 🔗 sb057 and others, and almost certainly more to follow
19:35 🔗 Apathy_ just stand from the sidelines and watch
19:36 🔗 sb057 well, I'm obviously no expert on AT's goals, but don't you think reddit might be worth archiving?
19:36 🔗 kniffy were the others marked as hateful like FPH?
19:36 🔗 Apathy_ kniffy there were like 4 other subs
19:36 🔗 sb057 FPH wasn't "hateful"
19:36 🔗 sb057 it was "harassing"
19:37 🔗 Apathy_ but all under 5k subs
19:37 🔗 mistym has quit IRC (Remote host closed the connection)
19:37 🔗 Apathy_ i dont think fatpeoplehate would be worth archiving
19:38 🔗 kniffy yeah, i think it would be questionable
19:38 🔗 sb057 right, but who can tell what's next?
19:38 🔗 Apathy_ slippery slope etc i understand
19:38 🔗 Apathy_ but finding creepshots of fat people with some ohsowitty caption isnt something we're lacking
19:38 🔗 Apathy_ maybe its just not on reddit/in a single subreddit anymore
19:47 🔗 aNthraXx has joined #archiveteam
19:52 🔗 mistym has joined #archiveteam
19:54 🔗 lolwhydoi has joined #archiveteam
19:54 🔗 lolwhydoi WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
19:54 🔗 lolwhydoi now I look like a fool.
19:54 🔗 ersi_ lolwhydoi: "yahoosucks" (without quotes)
19:55 🔗 aaaaaaaaa lolwhydoi: yahoosucks
19:55 🔗 ersi_ aaaaaaaaa: old!
19:55 🔗 lolwhydoi ahaha that's the best secret word - thanks guys.
19:56 🔗 lolwhydoi has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
19:58 🔗 sb058 has joined #archiveteam
19:59 🔗 bzc6p autist neckbeard
20:00 🔗 bzc6p Apathy_, kniffy, sb057: you can always save a single website with archive.org/save/URL
20:00 🔗 bzc6p go right into the wayback machine
20:00 🔗 bzc6p eeeeexcept if it's robots.txt protected, I don't know Reddit
20:00 🔗 sb058 yeah, I know
20:00 🔗 sb057 has quit IRC (Ping timeout: 252 seconds)
20:01 🔗 sb058 is now known as sb057
20:01 🔗 sb057 its just that it can be hard to predict what is going to get the ban hammer next
20:01 🔗 kniffy yeah, there wasnt any warning of the takedowns
20:02 🔗 bzc6p Once an ArchiveTeam project also touched some Reddit and some panic burst out:
20:02 🔗 bzc6p http://www.reddit.com/r/privacy/comments/1emh4r/urgent_delete_any_old_reddit_posts_you_dont_want/
20:02 🔗 sb057 plus, reddit itself shouldn't be that big
20:02 🔗 sb057 since its entirely hyperlinks and comments
20:03 🔗 sb057 plus some custom subreddit styling I guess
20:03 🔗 kniffy if one wanted to grab all images linked to the size would balloon insanely
20:03 🔗 bzc6p I like the part when a guy writes "it's kinda hard to read it when half of the old threads have been butchered by idiots deleting posts that were contributing to the discussion" and then comes like five [deleted]
20:03 🔗 chazchaz_ has joined #archiveteam
20:03 🔗 chazchaz_ has quit IRC (Remote host closed the connection)
20:03 🔗 sb057 yeah kniffy, that would essentially amount to backing up all of imgur
20:03 🔗 kniffy exactly
20:04 🔗 kniffy i assume you guys heard of the whole row imgur caused by suddenly not wanting NSFW images
20:04 🔗 kniffy duno if they're going and deleting stuff
20:04 🔗 sb057 I thought that only applied to comments?
20:04 🔗 sb057 and that it was reversed?
20:04 🔗 kniffy tbh i've got no idea on either of those
20:04 🔗 kniffy i dont use imgur much
20:08 🔗 chazchaz_ has joined #archiveteam
20:15 🔗 jmc has joined #archiveteam
20:16 🔗 xmc hah
20:21 🔗 Start has quit IRC (Disconnected.)
20:23 🔗 bzc6p SimpBrain: someone reported on Reddit that his export finished. Guess if it was a complete export or a 1 MB 7z with zero screenshots out of 4,000.
20:23 🔗 bzc6p (xfire)
20:23 🔗 SimpBrain nice
20:24 🔗 sb058 has joined #archiveteam
20:24 🔗 SimpBrain the 1mb export
20:24 🔗 bzc6p I've waited long for an opportunity to post this
20:24 🔗 bzc6p http://m.cdn.blog.hu/aj/ajemdibi/226166_453437977_big.jpg
20:25 🔗 McGEE has joined #archiveteam
20:25 🔗 Atluxity well done, sir
20:25 🔗 bzc6p what they actually 7z for hours, then, is a mystery
20:26 🔗 pikhq "We run 7z on an 8080"
20:26 🔗 n00b897_ has joined #archiveteam
20:26 🔗 McGEE lol
20:29 🔗 sb058_ has joined #archiveteam
20:31 🔗 Start has joined #archiveteam
20:31 🔗 sb057 has quit IRC (Ping timeout: 492 seconds)
20:32 🔗 sb058__ has joined #archiveteam
20:32 🔗 sb058__ is now known as sb057
20:32 🔗 sb057 guess I should set up my znc for efnet eh
20:33 🔗 SketchCow Movement is happening with POMF and HALO.
20:34 🔗 SketchCow Usage of disk space is up from 6% to 15% but I suspect that's about to get fixed, so we're holding out.
20:34 🔗 McGEE yay
20:35 🔗 sb057 has quit IRC (Client Quit)
20:35 🔗 sb058_ has quit IRC (Ping timeout: 306 seconds)
20:35 🔗 sb058 has quit IRC (Read error: Operation timed out)
20:41 🔗 db48x has joined #archiveteam
20:42 🔗 mistym has quit IRC (Remote host closed the connection)
20:42 🔗 sb057 has joined #archiveteam
20:44 🔗 DFJustin <sb057> well, I'm obviously no expert on AT's goals, but don't you think reddit might be worth archiving? <-- reddit as a whole is really big, we do archive individual subreddits from time to time using #archivebot
20:44 🔗 DFJustin there's nothing we can do if it's already deleted though
20:46 🔗 DFJustin looking at the archivebot dashboard we're currently grabbing /r/IMGXXXX/ /r/thebutton/ /r/Xcom/ /r/internetcollection/ /r/news/
20:48 🔗 DFJustin if you have nominations that would be good to archive let us know, the normal wayback machine crawls do get quite a bit as well though so it's helpful to check https://web.archive.org/ for completeness first
20:50 🔗 DFJustin ones we've done in the past: http://archive.fart.website/archivebot/viewer/domain/www.reddit.com
20:50 🔗 primus104 has joined #archiveteam
20:56 🔗 mistym has joined #archiveteam
21:05 🔗 sivoais has quit IRC (Remote host closed the connection)
21:07 🔗 schbirid /r/fatpeoplehate2/ might be a worthy target right now
21:08 🔗 schbirid considering it's all over /r/all
21:16 🔗 sivoais has joined #archiveteam
21:17 🔗 Start has quit IRC (Disconnected.)
21:17 🔗 DFJustin has quit IRC (Remote host closed the connection)
21:17 🔗 DFJustin has joined #archiveteam
21:17 🔗 swebb sets mode: +o DFJustin
21:18 🔗 DFJustin added
21:20 🔗 Howl has joined #archiveteam
21:31 🔗 SketchCow I always assumed our brothers-in-arms at /r/datahoarders were taking care of buziness.
21:32 🔗 SketchCow 17:31 < BotoX_> dafuq is WARC
21:32 🔗 Apathy_ [it begins]
21:32 🔗 SketchCow changes topic to: Archive Team: We're not archive.org | http://archiveteam.org/ | lengthy/off-topic in #archiveteam-bs | < BotoX_> dafuq is WARC
21:33 🔗 arkiver haha
21:37 🔗 Rotab :D
21:44 🔗 Rotab weeaboos..
21:47 🔗 Lord_Nigh #datahoarders on freenode
21:47 🔗 Lord_Nigh i dropped out of there a few days ago, hadn't said anything on channel in over 6 months
21:49 🔗 wp494 sb057: I think the admins said a few years back on world backup day the total size is like ~1 TB
21:49 🔗 wp494 I'd imagine it's grown to ~5
21:49 🔗 wp494 but then again it might not be a complete backup, maybe only a "just save content but don't save things like css images" one
21:49 🔗 wp494 so real size would likely be bigger
21:50 🔗 SketchCow Statistically, there's got to be one freak out there
21:52 🔗 wp494 here's the post: http://www.redditblog.com/2013/03/3rd-annual-world-backup-day-whats-in.html
21:52 🔗 wp494 so a little more over 1 TB
21:52 🔗 wp494 but it's compressed as hell
21:55 🔗 wp494 as for "would we not have archived FPH just because it's immoral"...I mean I agree that the subreddit needed to go at some point, but we shouldn't be the judges of whether or not something's moral and whether or not to archive it
21:55 🔗 wp494 you gotta stay neutral, otherwise bias reeks everywhere
21:55 🔗 Apathy_ "immoral"
21:55 🔗 Apathy_ nice meme
21:55 🔗 SketchCow What is FPH
21:56 🔗 pikhq r/fatpeoplehate
21:56 🔗 SketchCow Oh, oh. The fat people one.
21:56 🔗 pikhq The reddit drama, it's leaking.
21:56 🔗 SketchCow That's the one that reminds me that if I cut someone off on the highway, they probably post on that board
21:56 🔗 SketchCow And I feel better
21:56 🔗 SketchCow Or when you feel bad because someone donated organs and was just a young kid on a motorcycle
21:56 🔗 SketchCow He was probably on FPH
21:57 🔗 SketchCow Then you go "oh, that almost makes up for it"
21:58 🔗 SketchCow Also, please
21:58 🔗 SketchCow PLEASE
21:58 🔗 SketchCow P L E A S E
21:58 🔗 SketchCow People stop inviting Botox to do anything
21:58 🔗 SketchCow Just let this little grab project finish and move on
22:00 🔗 Sue_ has quit IRC (Ping timeout: 252 seconds)
22:00 🔗 WubTheCap There is one freak in #datahoarder with over 33 TB content, frequents many many IRC networks and channels
22:00 🔗 WubTheCap As far as I remember
22:00 🔗 Apathy_ SketchCow isnt EVERYONE a valued member of the group?
22:00 🔗 SketchCow No.
22:01 🔗 xmc :)
22:01 🔗 bzc6p I've been making much noise recently in these channels. I'll take back. Sorry.
22:02 🔗 bzc6p Good night
22:02 🔗 Apathy_ o/
22:03 🔗 Kazzy WubTheCap: There is one freak in #datahoarder with 1.4PB of content
22:04 🔗 wp494 PB?!
22:04 🔗 kniffy a petabyte? thats a lotta hentai
22:04 🔗 xmc gitorious continues apace
22:05 🔗 SketchCow A buddy of mine had a gigabyte of disk space in his apartment in 1988.
22:05 🔗 SketchCow He had to haul in ridiculous shit to do it, but he did it!
22:05 🔗 SketchCow Then got bored with it, moved on.
22:05 🔗 xmc gitorious: 663GB 42:37:13 [4.31MB/s] [====> ] 13% ETA 275:35:55
22:05 🔗 SketchCow Gave me some of it.
22:05 🔗 xmc my dad sold single gigabytes when he worked for ibm in the 70s
22:06 🔗 Fletcher would it be worth utilising the reddit api to archive subreddits instead of scraping? (if that isn't what's already being used)
22:06 🔗 xmc probably
22:10 🔗 wp494 I think API has a limit of like 1000 posts
22:10 🔗 wp494 so say if you hit /r/tf2/new, you'd only get the most recent 1000 posts at most
22:10 🔗 wp494 same goes for profiles (some say it's an anti-dox measure here), past 1000 comments/submissions and you're done
22:11 🔗 WubTheCap Public profiles are hardly dox.
22:11 🔗 xmc it's a decision that reddit made, not a thing to argue here
22:11 🔗 Fletcher hmm, I think you can define a starting point, not sure if that will go past 1000
22:16 🔗 wp494 yeah public profiles are hardly dox, that's true WubTheCap
22:16 🔗 wp494 (and is also what people said when people were flipping out in /r/privacy over google reader too)
22:20 🔗 Start has joined #archiveteam
22:21 🔗 arkiver SketchCow: did you remove the directory of trovebox from fos?
22:22 🔗 arkiver Making a small update to the scripts so we can finish that project tomorrow, but FOS isn't taking the files
22:23 🔗 godane SketchCow: at least 1gb of data in 1988 could at least be put on 2 cds
22:23 🔗 SketchCow I..... assume
22:23 🔗 godane so there was a way to off load it
22:23 🔗 SketchCow 18:05 <@SketchCow> A buddy of mine had a gigabyte of disk space in his apartment in 1988.
22:23 🔗 SketchCow Maybe it was a terabyte and 1990
22:23 🔗 SketchCow I am very old
22:23 🔗 godane ok then
22:24 🔗 xmc i vaguely remember terabyte parties being a thing
22:24 🔗 godane terabyte in 1990 you're just screwed then
22:24 🔗 godane unless it lasts for a good 10 to 15 years
22:24 🔗 godane that way it can be at least moved
22:25 🔗 SketchCow xmc: euphemism
22:25 🔗 xmc oh yeah?
22:26 🔗 SketchCow I just like it as a euphemism
22:26 🔗 xmc it could be a good one
22:26 🔗 SketchCow arkiver: I am sure I did.
22:27 🔗 arkiver SketchCow: ok, probably 50G-100G more will come from trovebox. Are you able to create the rsync again for trovebox? or maybe yipdw?
22:28 🔗 SketchCow Ask yipdw to
22:28 🔗 SketchCow I am blasting through things on that machine, cleaning it up
22:29 🔗 arkiver Ok, so next is last.fm. Looking into the 18 items that keep failing, after those are finished we have saved the full forum in all languages from last.fm.
22:29 🔗 arkiver That was what we wanted to grab from last.fm right? (user content)
22:30 🔗 Sue_ has joined #archiveteam
22:32 🔗 SketchCow Yes
22:32 🔗 SketchCow Although I think we grabbed a lot, and it was an insider who thinks they're going to fuck it up bad
22:35 🔗 arkiver Ok, so only 18 (problematic) items left for lastfm and then that's done. Baraza is done. And I'll sort all the discovered sites of blogger in the coming days, so we can start on that too
22:35 🔗 DFJustin well there is more to last.fm than forums, for example user profiles and comments on songs and artists
22:35 🔗 arkiver Two big projects coming up in the coming months: SourceForge (starting this month), Google Code (starting end of August)
22:36 🔗 Howl has quit IRC (Quit: afk now)
22:36 🔗 DFJustin I think events have comments too
22:37 🔗 arkiver SketchCow: I also think it'd be a good idea to start a torrentwebsites project.
22:38 🔗 arkiver Since torrentsites most of the time go offline without notice and contain a lot of metadata, comments, etc. I think we should start backing them up
22:39 🔗 arkiver We should then keep the .torrent files in a separate pack from the rest of the websites and only add those .torrent files to the wayback machine after they're not working anymore (to prevent IA getting in trouble with them)
22:40 🔗 arkiver What do you think of that? Size shouldn't be too big, there's no videos, audios and only some images.
22:41 🔗 arkiver DFJustin: I'll have a look at those, thanks
22:42 🔗 arkiver Magnet urls might be a problem thought ^^
22:43 🔗 arkiver though*
22:52 🔗 n00b897_ has quit IRC (Quit: Page closed)
23:04 🔗 chfoo has joined #archiveteam
23:14 🔗 SketchCow -----------------------------------------
23:15 🔗 SketchCow ARCHIVE.ORG is swapping some internal things
23:15 🔗 SketchCow As a result, things might act a little weird
23:15 🔗 SketchCow if you see some weird today, like missing stuff or timeouts
23:15 🔗 SketchCow now you know why. Not much to do, they're working
23:15 🔗 SketchCow hard to get it done quickly.
23:15 🔗 SketchCow -----------------------------------------
23:37 🔗 Ymgve has quit IRC ()
23:38 🔗 Muad-Dib has quit IRC (Ping timeout: 252 seconds)
23:46 🔗 TheLQ has joined #archiveteam
