#archiveteam-bs 2014-05-07,Wed


Time Nickname Message
01:10 dashcloud hi folks- Eurovision is starting to pop up in my twitter feed- can one of you tell me a little about it?
01:10 exmic sure man
01:10 exmic every country in europe gets to decide whose singers are shit
01:10 nico true
01:11 dashcloud actually shit, or just fashionable to say they are?
01:11 nico take the worst cliché of a country
01:11 exmic dashcloud: wikipedia can probably do a much better job explaining it than I can
01:12 nico and amplify them
01:12 nico that's the eurovision
01:13 dashcloud so, if I'm looking for a talent show, watch American Idol, but if I want to see a bad comedy hour, watch Eurovision?
01:13 exmic eurovision is actually really entertaining
01:13 nico for some value of entertaining
01:13 exmic they spend a shitpile of money making the fanciest performance they can
01:13 SketchCow Yes
01:13 SketchCow For a similar reason, the most insane DJ competitions are the same
01:14 SketchCow months of prep for a 5 minute routine
01:15 turnip Mixmaster Mike with the scratch routine
01:20 nico http://www.reddit.com/r/funny/comments/24w2l1/fuck_this_girl_in_particular/
01:20 nico hu?
01:23 SketchCow That's a rave.
01:23 nico look interesting
02:02 dashcloud an 800 page book of colors of all types, hundreds of years before the pantone color book: http://www.thisiscolossal.com/2014/05/color-book/ (also viewable online!)
02:08 Coderjoe back in 2007, this is the act that ukraine voted to enter: https://www.youtube.com/watch?v=hfjHJneVonE
02:14 Coderjoe same year, switzerland entered this song, by the guy behind the chihuahua song: https://www.youtube.com/watch?v=0ydRhwnwk-s
02:15 Coderjoe i kinda liked israel's entry: https://www.youtube.com/watch?v=424dX16SObQ
03:05 godane so i found a way to get video sitemap (maybe) from theguardian.com
03:11 godane here we go: http://spiderbytes.theguardian.com/sitemap/sitemap-2013.xml
03:11 godane master sitemap for the the sub-sections sitemaps for 2013
03:15 godane fun fact i may have don't theguardian.com before or at least i tryed: https://archive.org/search.php?query=collection%3A%22archiveteam-fire%22%20AND%20%28subject%3A%22www.theguardian.com%22%29
03:35 godane so i look at how i did before
03:35 godane it goes somethng like this url: http://www.theguardian.com/books/2001/dec/30/all
03:36 godane the problem with the web sitemap is there not all books urls
03:36 godane some go other sections
03:36 godane also note that the sitemaps maybe incomplete before 2000
03:38 godane they don't even have 1990 sitemap xml when i know there are on the website
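
For readers following the sitemap discussion above, here is a minimal sketch of how the sub-section sitemaps could be listed from a master sitemap. It assumes the file follows the standard sitemaps.org index layout, which the log does not confirm for the Guardian file. The function name `extract_locs` and the sample document are hypothetical, and fetching the XML is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (assumed to apply here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text):
    """Return every <loc> URL found in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


# Hypothetical sample shaped like a sitemap index; not real Guardian data.
SAMPLE_INDEX = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.com/sitemap/books-2013.xml</loc></sitemap>
  <sitemap><loc>http://example.com/sitemap/film-2013.xml</loc></sitemap>
</sitemapindex>
"""

for url in extract_locs(SAMPLE_INDEX):
    print(url)
```

Running the same function on each sub-section sitemap would give the article URLs, subject to the gaps godane mentions for older years.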
04:55 godane i getting another N64 promo video from myspleen
08:47 midas ohhdemgir: lots of people on the _albums one
08:48 midas schbirid: im ordering extra disks tonight
09:06 schbirid midas: yay :))
09:24 ohhdemgir midas, both have been seeding around 40MB/s for the last 15 hours or so
10:12 midas yeah 100mbit box is at 90Mbit all day
10:47 ohhdemgir midas, http://www.reddit.com/r/AmateurArchives/comments/24vr5r/rgonewild_history_20092013_torrents/chbq6fh
10:47 ohhdemgir oops...
11:18 midas oops.
11:18 midas small oops, but still :p
11:18 midas ohhdemgir: when will the blacklisted archive be posted ;)
11:18 ohhdemgir XD
11:18 midas hahaha
11:18 ohhdemgir that's bad (GREAT!!!!) idea
11:18 midas \o/
11:20 ohhdemgir clearly some people already use that as their treasure list, I get pms about it from time to time
11:22 midas haha
11:22 midas wonder why
11:22 ohhdemgir is the a wget flag to ignore certain size files?
11:26 ohhdemgir also, why? forbidden fruits!!
11:27 midas it was sarcasm :p
11:28 ohhdemgir the truth, people want what they're not meant to have
11:28 midas as always
11:32 ohhdemgir http://i.imgur.com/PIusERE.jpg
11:34 midas hahaha
11:38 midas so, Hostdeal Ltd just went belly up
11:38 midas website has been archived, but no way to tell how many sites it killed
11:42 ohhdemgir listen to the dude up to around 1 minutes - https://www.youtube.com/watch?v=7Zpc8VIYppc
12:03 midas ohhdemgir: whats wrong with this picture? http://i.imgur.com/lNG6XeH.png
12:03 midas ;-)
12:03 ohhdemgir XD
12:04 midas it's so limited to 100mbit :<
12:06 ohhdemgir RX bytes:26265137418752 (26.2 TB) TX bytes:91754905220702 (91.7 TB)
12:06 ohhdemgir 14:06:09 up 11 days, 20:41, 2 users, load average: 7.49, 7.88, 7.96
12:06 ohhdemgir uptime
12:06 midas lol
12:07 midas thats pritty badass
12:09 ohhdemgir I get messages now and again from the host with things like "Hey you managed to stay under 300TB this month, well done.."
12:09 midas lol
12:10 schbirid nice
18:05 DFJustin nice, android pirates are uploading straight to ia now https://archive.org/details/Androtreasure.net_20140507_1722
18:06 DFJustin saves us some work
18:09 garyrh http://techcrunch.com/2014/05/07/watch-michael-arringtons-fireside-chat-with-marissa-mayer-here-at-200-pm-edt/
18:33 Smiley lol DFJustin
18:46 exmic hahaha
18:46 exmic :)
19:29 nico Open Source Software >
19:29 nico of course
19:30 nico there are strange things on IA
19:30 nico https://archive.org/details/RA320
19:30 nico someone backup?
19:33 DFJustin I was gonna do a tumblr but then I never got around to updating http://weirdshitonarchivedotorg.tumblr.com/
19:35 nico the worst thing, i have a folder like that on my storage box
19:35 nico downloads folder of mobile device
19:36 schbirid hm, that anal game says it is darked but it isnt
19:38 schbirid i'll mail info@ about that comment line
19:38 schbirid there are 40k items according to google
19:39 DFJustin they were darked by mistake and then undarked
19:41 schbirid oh
19:41 schbirid too late, mail sent :)
19:43 nico https://www.google.com/search?q=lock_delete_darke_user.py
20:09 schbirid lol, some bug "Uncompressed size: 3307158438050 MB (3467806966336326157 bytes)"
20:11 nico zip bomb?
20:11 exmic seems legit
20:13 schbirid looked at a tar.lzo file with lzmainfo but lzma != lzo apparently :)
20:15 exmic lzop, yeah
20:16 schbirid yeah
21:26 SketchCow I've handed the upcoming /join #livingroom
22:29 ohhdemgir anyone got - https://www.fanfiction.net/
22:31 exmic iirc we did a crawl of ff.n about a year ago
22:39 ohhdemgir exmic, - http://www.reddit.com/r/DataHoarder/comments/245ij1/start_your_own_rgonewild_archive_automated_data/chc6shy?context=3
22:39 ohhdemgir "Ffnet throttles severely"
22:40 ohhdemgir would be nice to get it again but ain't no one got time the that limiting!!
22:43 nico the current ffnet downloader code is waiting for 30s every n requests
22:44 midas fuck yeah, im going to archive some internet gold
22:45 nico s/30/3/
22:45 nico 00:45:01 [nico@Gallifrey:/home/nico/Developpement/DeFFNetIzer] master 2 ± grep sleep deffnet.py time.sleep(3.0)
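
The "wait 3s every n requests" pattern nico describes might look roughly like the sketch below. This is not the actual deffnet.py code, which the log does not show beyond the `time.sleep(3.0)` line. The helper name `throttled`, the batch size `every=10`, and the injectable `sleep` parameter are assumptions made for illustration.

```python
import time


def throttled(items, every=10, pause=3.0, sleep=time.sleep):
    """Yield items one by one, pausing after each batch of `every` items.

    `every` is a made-up default: the log only says "every n requests".
    `pause` mirrors the time.sleep(3.0) call shown in the grep output.
    """
    for count, item in enumerate(items, start=1):
        yield item
        if count % every == 0:
            sleep(pause)  # runs when the caller asks for the next item


# Usage: iterate over work items; the caller does the actual request.
for story_id in throttled(range(3), every=2, pause=0.0):
    print("would fetch story", story_id)
```

Passing `sleep` as a parameter keeps the pause easy to swap out, for example to lengthen it if the site starts throttling harder.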
22:58 tsp__ Is there a difference between archiving and scraping? I've always thought archiving was preserving everything about a site, and scraping was just keeping the bits you want, but I could be wrong
23:03 nico tsp__: nowaday you've to scrape the website because everythings is loaded by ajax and other javascript monstruosities
23:35 DFJustin I would say scraping is pulling information off of a site in an automated way, that it wasn't intended for
23:36 DFJustin like scraping book titles off of amazon
23:36 DFJustin whereas archiving is just saving unmodified copies for later use
23:37 tsp__ For example, I want to pull forum posts out of a forum. I don't have any experience with the archiveteam scripts to actually archive it properly, but I just want its posts; I can do that trivially with python and requests. I guess that would be scraping
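
To make the scraping side of that distinction concrete, here is a minimal sketch that keeps only the post text from a forum page and discards everything else. It uses only the standard library so it runs offline; in practice the HTML would come from something like the `requests` library tsp__ mentions. The `<div class="post">` markup, the class name `PostExtractor`, and the sample page are all hypothetical, since real forums differ.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the text inside every <div class="post"> element."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._depth = 0    # how many <div>s deep we are inside a post
        self._chunks = []  # text fragments of the post being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1  # nested div inside a post
        elif "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse whitespace and store the finished post.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


# Hypothetical page; the markup is invented for this example.
SAMPLE_PAGE = """
<html><body>
  <div class="post"><b>alice</b>: first post</div>
  <div class="sidebar">ignore me</div>
  <div class="post">second <div class="quote">quoted</div> post</div>
</body></html>
"""

print(extract_posts(SAMPLE_PAGE))
```

An archive in DFJustin's sense would instead store the fetched page unmodified; this sketch throws away the layout, sidebar, and markup, which is what makes it scraping.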
