Time |
Nickname |
Message |
01:10
π
|
dashcloud |
hi folks- Eurovision is starting to pop up in my twitter feed- can one of you tell me a little about it? |
01:10
π
|
exmic |
sure man |
01:10
π
|
exmic |
every country in europe gets to decide whose singers are shit |
01:10
π
|
nico |
true |
01:11
π
|
dashcloud |
actually shit, or just fashionable to say they are? |
01:11
π
|
nico |
take the worst cliché of a country |
01:11
π
|
exmic |
dashcloud: wikipedia can probably do a much better job explaining it than I can |
01:12
π
|
nico |
and amplify them |
01:12
π
|
nico |
that's the eurovision |
01:13
π
|
dashcloud |
so, if I'm looking for a talent show, watch American Idol, but if I want to see a bad comedy hour, watch Eurovision? |
01:13
π
|
exmic |
eurovision is actually really entertaining |
01:13
π
|
nico |
for some value of entertaining |
01:13
π
|
exmic |
they spend a shitpile of money making the fanciest performance they can |
01:13
π
|
SketchCow |
Yes |
01:13
π
|
SketchCow |
For a similar reason, the most insane DJ competitions are the same |
01:14
π
|
SketchCow |
months of prep for a 5 minute routine |
01:15
π
|
turnip |
Mixmaster Mike with the scratch routine |
01:20
π
|
nico |
http://www.reddit.com/r/funny/comments/24w2l1/fuck_this_girl_in_particular/ |
01:20
π
|
nico |
hu? |
01:23
π
|
SketchCow |
That's a rave. |
01:23
π
|
nico |
looks interesting |
02:02
π
|
dashcloud |
an 800 page book of colors of all types, hundreds of years before the pantone color book: http://www.thisiscolossal.com/2014/05/color-book/ (also viewable online!) |
02:08
π
|
Coderjoe |
back in 2007, this is the act that ukraine voted to enter: https://www.youtube.com/watch?v=hfjHJneVonE |
02:14
π
|
Coderjoe |
same year, switzerland entered this song, by the guy behind the chihuahua song: https://www.youtube.com/watch?v=0ydRhwnwk-s |
02:15
π
|
Coderjoe |
i kinda liked israel's entry: https://www.youtube.com/watch?v=424dX16SObQ |
03:05
π
|
godane |
so i found a way to get video sitemap (maybe) from theguardian.com |
03:11
π
|
godane |
here we go: http://spiderbytes.theguardian.com/sitemap/sitemap-2013.xml |
03:11
π
|
godane |
master sitemap for the sub-section sitemaps for 2013 |
03:15
π
|
godane |
fun fact: i may have done theguardian.com before, or at least i tried: https://archive.org/search.php?query=collection%3A%22archiveteam-fire%22%20AND%20%28subject%3A%22www.theguardian.com%22%29 |
03:35
π
|
godane |
so i looked at how i did it before |
03:35
π
|
godane |
it goes something like this url: http://www.theguardian.com/books/2001/dec/30/all |
03:36
π
|
godane |
the problem with the web sitemap is they're not all books urls |
03:36
π
|
godane |
some go to other sections |
03:36
π
|
godane |
also note that the sitemaps may be incomplete before 2000 |
03:38
π
|
godane |
they don't even have a 1990 sitemap xml when i know there are pages from then on the website |
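A master sitemap like the one godane links is normally a sitemap index: a list of child sitemap URLs, one per sub-section. Below is a minimal sketch of pulling those child URLs out with the Python standard library. It assumes the file follows the standard sitemaps.org index schema, which has not been checked against the Guardian file; the sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol; assumed to apply to the real file.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def child_sitemaps(index_xml: str) -> list[str]:
    """Return the <loc> URL of every child sitemap listed in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample data, not the real Guardian index.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.com/sitemap/books-2013.xml</loc></sitemap>
  <sitemap><loc>http://example.com/sitemap/film-2013.xml</loc></sitemap>
</sitemapindex>"""

if __name__ == "__main__":
    for url in child_sitemaps(SAMPLE_INDEX):
        print(url)
```

Each child sitemap could then be parsed the same way, looking for `sm:url/sm:loc` instead of `sm:sitemap/sm:loc`. That would not fix the gaps godane mentions for years with no sitemap at all.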
04:55
π
|
godane |
i'm getting another N64 promo video from myspleen |
08:47
π
|
midas |
ohhdemgir: lots of people on the _albums one |
08:48
π
|
midas |
schbirid: im ordering extra disks tonight |
09:06
π
|
schbirid |
midas: yay :)) |
09:24
π
|
ohhdemgir |
midas, both have been seeding around 40MB/s for the last 15 hours or so |
10:12
π
|
midas |
yeah 100mbit box is at 90Mbit all day |
10:47
π
|
ohhdemgir |
midas, http://www.reddit.com/r/AmateurArchives/comments/24vr5r/rgonewild_history_20092013_torrents/chbq6fh |
10:47
π
|
ohhdemgir |
oops... |
11:18
π
|
midas |
oops. |
11:18
π
|
midas |
small oops, but still :p |
11:18
π
|
midas |
ohhdemgir: when will the blacklisted archive be posted ;) |
11:18
π
|
ohhdemgir |
XD |
11:18
π
|
midas |
hahaha |
11:18
π
|
ohhdemgir |
that's a bad (GREAT!!!!) idea |
11:18
π
|
midas |
\o/ |
11:20
π
|
ohhdemgir |
clearly some people already use that as their treasure list, I get pms about it from time to time |
11:22
π
|
midas |
haha |
11:22
π
|
midas |
wonder why |
11:22
π
|
ohhdemgir |
is there a wget flag to ignore files of a certain size? |
11:26
π
|
ohhdemgir |
also, why? forbidden fruits!! |
11:27
π
|
midas |
it was sarcasm :p |
11:28
π
|
ohhdemgir |
the truth, people want what they're not meant to have |
11:28
π
|
midas |
as always |
11:32
π
|
ohhdemgir |
http://i.imgur.com/PIusERE.jpg |
11:34
π
|
midas |
hahaha |
11:38
π
|
midas |
so, Hostdeal Ltd just went belly up |
11:38
π
|
midas |
website has been archived, but no way to tell how many sites it killed |
11:42
π
|
ohhdemgir |
listen to the dude up to around 1 minute - https://www.youtube.com/watch?v=7Zpc8VIYppc |
12:03
π
|
midas |
ohhdemgir: whats wrong with this picture? http://i.imgur.com/lNG6XeH.png |
12:03
π
|
midas |
;-) |
12:03
π
|
ohhdemgir |
XD |
12:04
π
|
midas |
it's so limited to 100mbit :< |
12:06
π
|
ohhdemgir |
RX bytes:26265137418752 (26.2 TB) TX bytes:91754905220702 (91.7 TB) |
12:06
π
|
ohhdemgir |
14:06:09 up 11 days, 20:41, 2 users, load average: 7.49, 7.88, 7.96 |
12:06
π
|
ohhdemgir |
uptime |
12:06
π
|
midas |
lol |
12:07
π
|
midas |
that's pretty badass |
12:09
π
|
ohhdemgir |
I get messages now and again from the host with things like "Hey you managed to stay under 300TB this month, well done.." |
12:09
π
|
midas |
lol |
12:10
π
|
schbirid |
nice |
18:05
π
|
DFJustin |
nice, android pirates are uploading straight to ia now https://archive.org/details/Androtreasure.net_20140507_1722 |
18:06
π
|
DFJustin |
saves us some work |
18:09
π
|
garyrh |
http://techcrunch.com/2014/05/07/watch-michael-arringtons-fireside-chat-with-marissa-mayer-here-at-200-pm-edt/ |
18:33
π
|
Smiley |
lol DFJustin |
18:46
π
|
exmic |
hahaha |
18:46
π
|
exmic |
:) |
19:29
π
|
nico |
Open Source Software > |
19:29
π
|
nico |
of course |
19:30
π
|
nico |
there are strange things on IA |
19:30
π
|
nico |
https://archive.org/details/RA320 |
19:30
π
|
nico |
someone backup? |
19:33
π
|
DFJustin |
I was gonna do a tumblr but then I never got around to updating http://weirdshitonarchivedotorg.tumblr.com/ |
19:35
π
|
nico |
the worst thing, i have a folder like that on my storage box |
19:35
π
|
nico |
downloads folder of mobile device |
19:36
π
|
schbirid |
hm, that anal game says it is darked but it isnt |
19:38
π
|
schbirid |
i'll mail info@ about that comment line |
19:38
π
|
schbirid |
there are 40k items according to google |
19:39
π
|
DFJustin |
they were darked by mistake and then undarked |
19:41
π
|
schbirid |
oh |
19:41
π
|
schbirid |
too late, mail sent :) |
19:43
π
|
nico |
https://www.google.com/search?q=lock_delete_darke_user.py |
20:09
π
|
schbirid |
lol, some bug "Uncompressed size: 3307158438050 MB (3467806966336326157 bytes)" |
20:11
π
|
nico |
zip bomb? |
20:11
π
|
exmic |
seems legit |
20:13
π
|
schbirid |
looked at a tar.lzo file with lzmainfo but lzma != lzo apparently :) |
20:15
π
|
exmic |
lzop, yeah |
20:16
π
|
schbirid |
yeah |
21:26
π
|
SketchCow |
I've handed the upcoming /join #livingroom |
22:29
π
|
ohhdemgir |
anyone got - https://www.fanfiction.net/ |
22:31
π
|
exmic |
iirc we did a crawl of ff.n about a year ago |
22:39
π
|
ohhdemgir |
exmic, - http://www.reddit.com/r/DataHoarder/comments/245ij1/start_your_own_rgonewild_archive_automated_data/chc6shy?context=3 |
22:39
π
|
ohhdemgir |
"Ffnet throttles severely" |
22:40
π
|
ohhdemgir |
would be nice to get it again but ain't no one got time for that limiting!! |
22:43
π
|
nico |
the current ffnet downloader code is waiting for 30s every n requests |
22:44
π
|
midas |
fuck yeah, im going to archive some internet gold |
22:45
π
|
nico |
s/30/3/ |
22:45
π
|
nico |
00:45:01 [nico@Gallifrey:/home/nico/Developpement/DeFFNetIzer] master 2 ± grep sleep deffnet.py time.sleep(3.0) |
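The throttling nico describes (pause for 3 seconds every n requests) can be sketched roughly as below. This is not the actual deffnet.py code, which is not shown in the log beyond its `time.sleep(3.0)` line. The names `throttled`, `fetch` and `every` are hypothetical, and the value of n is an assumption since the log never states it.

```python
import time


def throttled(urls, fetch, every=10, pause=3.0, sleep=time.sleep):
    """Yield fetch(url) for each url, pausing `pause` seconds after every `every` requests.

    `every=10` is a placeholder: the log only says "every n requests".
    `sleep` is injectable so the pacing can be tested without really waiting.
    """
    for count, url in enumerate(urls, start=1):
        yield fetch(url)
        if count % every == 0:
            sleep(pause)


if __name__ == "__main__":
    # Stand-in fetch function; a real downloader would make an HTTP request here.
    for body in throttled(["a", "b", "c", "d"], fetch=str.upper, every=2, pause=0.1):
        print(body)
```

A fixed pause like this is the simplest approach. A site that throttles as hard as ffnet reportedly does might also need backoff when requests start failing.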
22:58
π
|
tsp__ |
Is there a difference between archiving and scraping? I've always thought archiving was preserving everything about a site, and scraping was just keeping the bits you want, but I could be wrong |
23:03
π
|
nico |
tsp__: nowadays you have to scrape the website because everything is loaded by ajax and other javascript monstrosities |
23:35
π
|
DFJustin |
I would say scraping is pulling information off of a site in an automated way, that it wasn't intended for |
23:36
π
|
DFJustin |
like scraping book titles off of amazon |
23:36
π
|
DFJustin |
whereas archiving is just saving unmodified copies for later use |
23:37
π
|
tsp__ |
For example, I want to pull forum posts out of a forum. I don't have any experience with the archiveteam scripts to actually archive it properly, but I just want its posts; I can do that trivially with python and requests. I guess that would be scraping |
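The kind of scraping tsp__ describes (keeping only the posts and discarding the rest of the page) might look like the sketch below. To stay self-contained it parses an inline HTML string with the standard library rather than fetching a page with requests. The `div class="post"` markup is an assumption; a real forum would need its own selector.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the text inside every <div> whose class list contains `post_class`."""

    def __init__(self, post_class="post"):  # "post" is a hypothetical class name
        super().__init__()
        self.post_class = post_class
        self.posts = []
        self._depth = 0    # nesting depth of <div>s inside the current post
        self._chunks = []  # text fragments of the current post

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1  # nested div inside a post
        elif self.post_class in (dict(attrs).get("class") or "").split():
            self._depth = 1   # entering a new post
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Post closed: join fragments and normalise whitespace.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


# Hypothetical page fragment for illustration.
SAMPLE_PAGE = """
<div class="post"><p>First post!</p></div>
<div class="sidebar">ignore me</div>
<div class="post author-mod">Second <b>post</b>, with <div>a nested div</div>.</div>
"""

if __name__ == "__main__":
    for post in extract_posts(SAMPLE_PAGE):
        print(post)
```

This keeps only the post text and throws away markup, headers and assets, which is what makes it scraping in DFJustin's sense and not archiving: an archive would save the unmodified responses.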