#archiveteam-bs 2016-02-08,Mon


Time Nickname Message
00:45 🔗 JesseW has joined #archiveteam-bs
01:52 🔗 toad2 has joined #archiveteam-bs
01:54 🔗 no2pencil has quit IRC (Read error: Operation timed out)
01:54 🔗 toad1 has quit IRC (Read error: Operation timed out)
02:20 🔗 JesseW ivan`: I'm going to dump a bunch of Youtube channels of folk music into your form: https://docs.google.com/forms/d/1_kkpBe6abFQ5sznrMfWHhP7ZhdktKejJEpvCCcqVues/viewform -- lemme know if you'd like them as a single email instead (probably about a dozen channels or so).
02:23 🔗 kyan ivan`, are those being mirrored to IA? Is a list of items you've archived available, so that I could mirror them to IA off youtube if I wanted?
02:24 🔗 JesseW kyan: I know they aren't being mirrored to IA, because one of the points was to avoid burdening IA's servers with stuff they can't/don't want to hold.
02:25 🔗 kyan Ah, hmm
02:25 🔗 kyan I didn't know they didn't want to hold things
02:25 🔗 JesseW I don't know if a public list is available, but I'd be surprised if ivan` would mind privately sending you a list of what he has.
02:25 🔗 kyan thought they were more like, expanding as use expanded, or something
02:26 🔗 kyan Also, some of my most viewed uploads have been youtube videos I've mirrored to IA
02:26 🔗 kyan so I think it at least generates more traffic for IA to have discoverable content?
02:26 🔗 kyan Then again, they don't have ads so I guess traffic ≠ money
02:27 🔗 JesseW I don't know that IA minds -- more that I remember seeing ivan` mention he was specifically intending to provide a home for stuff unable to be mirrored at IA.
02:28 🔗 kyan Huh ok
02:28 🔗 kyan I'm not sure how "unable" anything could be, but whatever
02:28 🔗 kyan I mean, if the issue is with copyright, I guess
02:34 🔗 schbirid2 has joined #archiveteam-bs
02:35 🔗 JesseW I have no idea about why.
02:36 🔗 kyan ah :P
02:37 🔗 schbirid has quit IRC (Read error: Operation timed out)
02:39 🔗 kyan Also a question — how to sort search results on IA by size?
02:39 🔗 JesseW I'll be curious what Ivan has to say.
02:39 🔗 JesseW size of what, individual files, whole items, something else?
02:40 🔗 kyan Whole items
02:40 🔗 JesseW I don't *think* that's available through the Advanced Search.
02:40 🔗 JesseW Probably extracting it from the census data is your best bet.
02:40 🔗 kyan I don't see anything that looks promising there
02:40 🔗 kyan Ah, ok. Thanks.
02:40 🔗 * kyan can't be bothered atm
02:40 🔗 JesseW Heh
02:40 🔗 JesseW What were you interested in looking for?
02:40 🔗 kyan Also the census wouldn't fit on my drive lol
02:41 🔗 kyan I've got a bunch of WARCs uploaded
02:41 🔗 kyan the ones from one account have lots of views (10000+ per item, generally)
02:41 🔗 kyan while the ones from the other account have like 10–50 views
02:41 🔗 kyan I'd like to see if there's something wrong with the ones from the other account
02:42 🔗 kyan but a lot of the items are small and only have a few URLs in them, making it understandable that they'd have few views.
02:42 🔗 kyan By sorting by item size, I could see which ones have tens of thousands of URLs and try to see if there's something about them that's making them have few views.
02:43 🔗 kyan Namely these: https://archive.org/search.php?query=uploader%3A%22worldpeacehaven%40gmail.com%22+mediatype%3Aweb&sort=-downloads&page=2
02:43 🔗 kyan 46 views for the most viewed item, and going down from there
02:43 🔗 JesseW Ah, if you are only interested in a limited number of identifiers, I'd just hack up curl to download http://archive.org/metadata/{id} for each one, then sort them locally.
02:43 🔗 JesseW I thought you wanted to sort the whole corpus
02:44 🔗 JesseW s/hack up curl/hack up a shell script *using* curl/
02:44 🔗 kyan Compare to my other account https://archive.org/search.php?query=uploader%3A%22kolubat%40gmail.com%22+mediatype%3Aweb&sort=-downloads
02:44 🔗 kyan most views is 143K
02:44 🔗 kyan makes me think something might be wrong.
02:44 🔗 * JesseW needs to get around to uploading my census results
02:45 🔗 kyan JesseW, cool, that sounds promising! Thanks! :D
02:45 🔗 JesseW (and various shell commands)
02:45 🔗 JesseW but I need to figure out what exactly the next step is, too.
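
[Editor's sketch] JesseW's suggestion above (fetch http://archive.org/metadata/{id} for each identifier, then sort locally) could look roughly like the following. This is a minimal sketch, not anyone's actual script: the function names are hypothetical, and it assumes the metadata response is JSON with a top-level `item_size` field given in bytes.

```python
import json
from urllib.request import urlopen


def fetch_metadata(identifier):
    """Fetch the JSON metadata record for one item (makes a network request)."""
    with urlopen(f"https://archive.org/metadata/{identifier}") as resp:
        return json.load(resp)


def sort_by_item_size(identifiers, fetch=fetch_metadata):
    """Return (size, identifier) pairs, largest first.

    `fetch` is injectable so the sorting can be tried without network access.
    Assumption: each record has a top-level "item_size" in bytes; items
    without that field are treated as size 0.
    """
    sizes = []
    for ident in identifiers:
        record = fetch(ident)
        sizes.append((int(record.get("item_size", 0)), ident))
    return sorted(sizes, reverse=True)
```

Feeding it the identifiers from one uploader's search results would put the largest WARC items first, which is the ordering kyan wanted for comparing view counts.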
02:57 🔗 JetBalsa has quit IRC (hub.efnet.us irc.colosolutions.net)
02:57 🔗 SadDM has quit IRC (hub.efnet.us irc.colosolutions.net)
02:57 🔗 jspiros has quit IRC (hub.efnet.us irc.colosolutions.net)
02:57 🔗 matthusby has quit IRC (hub.efnet.us irc.colosolutions.net)
03:00 🔗 JesseW has quit IRC (Quit: Leaving.)
03:59 🔗 SN4T14 has quit IRC (Read error: Operation timed out)
03:59 🔗 SN4T14 has joined #archiveteam-bs
03:59 🔗 MrRadar has quit IRC (Read error: Operation timed out)
03:59 🔗 arkiver has quit IRC (Ping timeout: 360 seconds)
04:00 🔗 signius has quit IRC (Read error: Operation timed out)
04:00 🔗 joepie91 has quit IRC (Read error: Operation timed out)
04:01 🔗 phuzion has quit IRC (Read error: Operation timed out)
04:01 🔗 phuzion has joined #archiveteam-bs
04:01 🔗 zenguy has quit IRC (Ping timeout: 360 seconds)
04:01 🔗 dashcloud has quit IRC (Read error: Operation timed out)
04:01 🔗 atlogbot has quit IRC (Ping timeout: 360 seconds)
04:02 🔗 arkiver has joined #archiveteam-bs
04:02 🔗 joepie91 has joined #archiveteam-bs
04:02 🔗 signius has joined #archiveteam-bs
04:03 🔗 atlogbot has joined #archiveteam-bs
04:04 🔗 zenguy has joined #archiveteam-bs
04:04 🔗 dashcloud has joined #archiveteam-bs
04:04 🔗 phuzion has quit IRC (Read error: Operation timed out)
04:04 🔗 beardicus has quit IRC (Read error: Operation timed out)
04:06 🔗 phuzion has joined #archiveteam-bs
04:09 🔗 beardicus has joined #archiveteam-bs
04:14 🔗 kvieta has quit IRC (Ping timeout: 633 seconds)
04:14 🔗 kvieta has joined #archiveteam-bs
04:18 🔗 RedType has quit IRC (Remote host closed the connection)
04:21 🔗 MrRadar has joined #archiveteam-bs
04:22 🔗 beardicus has quit IRC (Read error: Operation timed out)
04:26 🔗 kvieta has quit IRC (Read error: Operation timed out)
04:27 🔗 SimpBrain has quit IRC (Ping timeout: 633 seconds)
04:36 🔗 SimpBrain has joined #archiveteam-bs
04:44 🔗 JesseW has joined #archiveteam-bs
04:46 🔗 toad2 has quit IRC (Ping timeout: 864 seconds)
04:47 🔗 kvieta has joined #archiveteam-bs
04:47 🔗 toad1 has joined #archiveteam-bs
04:47 🔗 beardicus has joined #archiveteam-bs
04:50 🔗 Swizzle has joined #archiveteam-bs
04:54 🔗 zerkalo has joined #archiveteam-bs
04:54 🔗 lbft_ has joined #archiveteam-bs
04:57 🔗 zerkalo_ has quit IRC (hub.efnet.us irc.Prison.NET)
04:57 🔗 chfoo has quit IRC (hub.efnet.us irc.Prison.NET)
04:57 🔗 achip has quit IRC (hub.efnet.us irc.Prison.NET)
04:57 🔗 lbft has quit IRC (hub.efnet.us irc.Prison.NET)
04:59 🔗 chfoo0 has joined #archiveteam-bs
05:04 🔗 achip has joined #archiveteam-bs
05:05 🔗 pikhq_ has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 PurpleSym has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 PotcFdk has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 coretx has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 altlabel has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 limebyte has quit IRC (hub.dk irc.homelien.no)
05:05 🔗 i0npulse has quit IRC (hub.dk irc.homelien.no)
05:06 🔗 Rotab has quit IRC (hub.se irc.du.se)
05:10 🔗 coretx_ has joined #archiveteam-bs
05:11 🔗 vitzli has joined #archiveteam-bs
05:16 🔗 xmc DFJustin: does wayback not support ftp at all, or can you construct ftp warcs and access them somehow
05:35 🔗 SmileyG has quit IRC (Read error: Connection reset by peer)
05:35 🔗 Smiley has joined #archiveteam-bs
05:35 🔗 will has quit IRC (Ping timeout: 252 seconds)
05:35 🔗 Rye has quit IRC (Ping timeout: 252 seconds)
05:38 🔗 will has joined #archiveteam-bs
05:40 🔗 useretail has quit IRC (Ping timeout: 252 seconds)
05:43 🔗 will has quit IRC (Ping timeout: 252 seconds)
05:45 🔗 will has joined #archiveteam-bs
05:45 🔗 Rye has joined #archiveteam-bs
05:45 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:47 🔗 JesseW Regarding ftp.esri.com, there are 11 wayback machine records from 2013, all of which returned 502 statuscodes.
05:47 🔗 JesseW (there are *only* those 11 records)
05:48 🔗 useretail has joined #archiveteam-bs
05:52 🔗 Sk1d has joined #archiveteam-bs
05:53 🔗 Swizzle has quit IRC (Quit: Leaving)
06:14 🔗 SimpBrain ok...
06:15 🔗 SimpBrain i get my nl vps suspended due to high loads, they send out a message saying they are going to do some work on the server. they send out another update saying they will put that server on a new server
06:15 🔗 SimpBrain win win for them i think
06:37 🔗 pikhq has joined #archiveteam-bs
06:37 🔗 i0npulse has joined #archiveteam-bs
06:37 🔗 altlabel has joined #archiveteam-bs
06:37 🔗 PurpleSym has joined #archiveteam-bs
06:37 🔗 PotcFdk has joined #archiveteam-bs
06:37 🔗 limebyte has joined #archiveteam-bs
06:38 🔗 vitzli JesseW, on IA census, I did IA.BAK census and found something, I don't know if it is only a bug in my 'ia search'/my mining script or if it is common to all - a) both ia-mine and "ia search" return duplicate items b) they miss some items (about 10 or 15 in a 600-item collection). I found this when I was doing "parallel --jobs 1" requests, and it happened with CLI calls of ia too (a week ago).
06:39 🔗 vitzli Right now ia-mine --search --itemlist seems to behave better - no item drops, but ia search returned one duplicate record
06:40 🔗 JesseW vitzli: https://archive.org/download/ia-bak-census_20150304/metamgr-norm-ids-20150304205357.txt.gz has a single duplicate (see http://archiveteam.org/index.php?title=Internet_Archive_Census#Contents_of_the_Census )
06:40 🔗 JesseW The other census files do seem to have a bunch of duplication -- I'm not sure why.
06:40 🔗 JesseW I found that IA search was ... unreliable.
06:41 🔗 vitzli it dropped one item and returned one duplicate, to be precise
06:41 🔗 JesseW For getting a definitive census of items in larger collections.
06:41 🔗 vitzli BUT - doing search multiple times and then sort|uniq it - worked and returned all elements in the collection
06:41 🔗 JesseW Feel free to drop a note to jake about it -- I can certainly confirm I've seen the same issue.
06:42 🔗 JesseW How many searches did you need to do?
06:43 🔗 JesseW I tried to get all the items with addeddates in a particular year, and gave up when I couldn't get consistent results from the search. I should hack up something to retry and combine results until the total matches the provided number (because searches do generate a total number even before any individual results are requested).
06:44 🔗 vitzli maybe 3 on 162 item collection, 3 or 4 on bigger collections (I think it was walnutcreekcdrom collection)
06:44 🔗 JesseW hm
06:46 🔗 vitzli got 162 items on the first 'ia search' run, but maybe 5 were duplicates, and did it again
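
[Editor's sketch] The workaround discussed above (repeat the search, merge the results, stop once the number of distinct identifiers matches the total the search reports) might be sketched as follows. `run_search` is a hypothetical callable standing in for whatever performs one search and yields identifiers; it is not part of the `ia` tool or any library.

```python
def collect_identifiers(run_search, expected_total, max_attempts=10):
    """Repeat a flaky search and union the results.

    run_search:     hypothetical zero-argument callable returning an iterable
                    of identifiers (may contain duplicates or miss items).
    expected_total: the total hit count reported by the search itself.
    Returns (set_of_identifiers, attempts_used).
    """
    seen = set()
    for attempt in range(1, max_attempts + 1):
        seen.update(run_search())  # the set drops duplicates automatically
        if len(seen) >= expected_total:
            return seen, attempt
    # Gave up: the caller should compare len(seen) with expected_total.
    return seen, max_attempts
```

This is the same idea as running the search several times and piping everything through `sort | uniq`, with an explicit stopping condition added.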
06:46 🔗 DFJustin xmc: as far as I know wayback doesn't support it at all. it is possible to construct ftp warcs but I don't know what tools are able to use them
06:46 🔗 chfoo0 is now known as chfoo
07:31 🔗 vitzli JesseW, right now: text file from ia search --itemlist 'collection:(prelingeritems)' :
07:31 🔗 vitzli sort prelingeritems.txt | wc -l : 6533
07:31 🔗 vitzli sort -u prelingeritems.txt | wc -l: 4895
07:31 🔗 JesseW ha
07:31 🔗 JesseW yeah, that's ... less than ideal
07:33 🔗 vitzli uh, just prelinger collection, not prelingeritems
07:42 🔗 JesseW vitzli: I got all 6533 distinct values the first time I made the search
07:43 🔗 vitzli 'ia search'?
07:45 🔗 vitzli JesseW, https://paste.ee/p/vkBQU
07:46 🔗 JesseW I was using the python interface.
07:50 🔗 robink has quit IRC (Ping timeout: 190 seconds)
07:50 🔗 vitzli a is a list of identifiers in collection, len(a): 6533; len(set(a)): 5071
07:51 🔗 vitzli is my install somehow broken?
07:51 🔗 robink has joined #archiveteam-bs
07:52 🔗 JesseW I'm not sure. I have to head to sleep now. Good luck.
07:52 🔗 vitzli good night
07:52 🔗 JesseW has quit IRC (Quit: Leaving.)
07:53 🔗 vitzli JesseW, on python/IA search results: https://paste.ee/p/iXqQo
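
[Editor's sketch] The `len(a)` versus `len(set(a))` comparison above shows that duplicates exist but not which identifiers are repeated. A small sketch using only the standard library, with a hypothetical function name, that reports both:

```python
from collections import Counter


def duplicate_report(identifiers):
    """Return (total, distinct, {identifier: count}) for repeated identifiers."""
    counts = Counter(identifiers)
    dupes = {ident: n for ident, n in counts.items() if n > 1}
    return len(identifiers), len(counts), dupes
```

Comparing the `dupes` keys across several runs would show whether the same identifiers are duplicated every time or whether the set changes from run to run.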
08:23 🔗 kyan has quit IRC (Ping timeout: 260 seconds)
08:47 🔗 RedType has joined #archiveteam-bs
09:58 🔗 HCross2 Hmm. What's the best way of getting a csv of a collection, listing the file name and the date it was uploaded, as well as the views?
10:00 🔗 Rotab has joined #archiveteam-bs
10:41 🔗 lytv has quit IRC (Ping timeout: 250 seconds)
10:41 🔗 vtyl has joined #archiveteam-bs
11:00 🔗 achip has quit IRC (hub.efnet.us irc.Prison.NET)
11:07 🔗 signius has quit IRC (Read error: Operation timed out)
11:17 🔗 achip has joined #archiveteam-bs
11:20 🔗 signius has joined #archiveteam-bs
13:01 🔗 arkiver3 has joined #archiveteam-bs
13:28 🔗 joepie91 SketchCow: https://www.youtube.com/watch?v=cPaij2G3wTQ
13:32 🔗 arkiver3 has quit IRC (Ping timeout: 252 seconds)
14:13 🔗 arkiver3 has joined #archiveteam-bs
14:24 🔗 arkiver3 has quit IRC (Ping timeout: 252 seconds)
14:29 🔗 arkiver3 has joined #archiveteam-bs
14:56 🔗 SketchCow Yeah, I've seen it.
15:05 🔗 ersi what an annoying voice
15:05 🔗 ersi but yes yes and yes for everything in it
15:11 🔗 Start has quit IRC (Quit: Disconnected.)
15:17 🔗 joepie91 ersi: haha, exactly my thoughts
15:17 🔗 joepie91 watched a few eps so far
15:17 🔗 joepie91 "jesus that voice is annoying, but he is so damn right about every single thing he says"
15:17 🔗 joepie91 - every ep
15:31 🔗 ersi I wouldn't watch more than that single episode
15:36 🔗 wednesday has quit IRC (Ping timeout: 252 seconds)
15:37 🔗 wednesday has joined #archiveteam-bs
15:49 🔗 Start has joined #archiveteam-bs
15:50 🔗 wednesday has quit IRC (Ping timeout: 252 seconds)
15:53 🔗 arkiver3 has quit IRC (Quit: Nettalk6 - www.ntalk.de)
16:48 🔗 schbirid2 https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg?hl=en cookies.txt
16:51 🔗 joepie91 schbirid2: handy
16:51 🔗 schbirid2 btw if you wget -x the same url you spent the night downloading it will start from 0 again \o/
17:07 🔗 Start has quit IRC (Quit: Disconnected.)
17:19 🔗 Start has joined #archiveteam-bs
17:24 🔗 Start has quit IRC (Quit: Disconnected.)
17:51 🔗 vitzli has quit IRC (Leaving)
17:52 🔗 dashcloud has quit IRC (Read error: Operation timed out)
17:55 🔗 dashcloud has joined #archiveteam-bs
18:20 🔗 Swizzle has joined #archiveteam-bs
18:43 🔗 Start has joined #archiveteam-bs
19:00 🔗 signius has quit IRC (Ping timeout: 300 seconds)
19:08 🔗 espes__ has quit IRC (Ping timeout: 252 seconds)
19:12 🔗 signius has joined #archiveteam-bs
19:14 🔗 Start has quit IRC (Quit: Disconnected.)
19:20 🔗 Start has joined #archiveteam-bs
19:24 🔗 acridAxid has quit IRC (Quit: marauder)
19:29 🔗 kyan has joined #archiveteam-bs
19:29 🔗 kyan my grab-site server is out of disk space :(
19:29 🔗 kyan downloads faster than it can upload
19:33 🔗 kyan Also TIL don't leave a grab-site --1 of a page that mentions Pinterest running over night unattended if you turned off the dupechecker. 56.5GB downloaded, 193k responses, almost all Pinterest 404s
19:33 🔗 kyan ffs
19:33 🔗 acridAxid has joined #archiveteam-bs
19:34 🔗 godane SketchCow: i'm grabbing more of Network World from google books
19:34 🔗 godane cause in part of what you said about googlebooks twitter account going private
19:42 🔗 joepie91 http://trumpdonald.org/
19:47 🔗 ivan` kyan: https://gist.github.com/ivan/5779ac8d43817092aca6
19:47 🔗 Swizzle has quit IRC (Quit: Leaving)
19:48 🔗 ivan` verify the df line before deploying
19:48 🔗 kyan ivan`: Ooh, cool, thanks! :D
19:49 🔗 ivan` kyan: not mirroring my 2M YouTube videos to IA. The plan is to scan my collection for deleted/private/unlisted videos and upload those. Just need to write software to check all the IDs and upload to IA.
19:50 🔗 midas lol joepie91
19:50 🔗 kyan ivan`, Aah, cool, that's a good solution
19:51 🔗 kyan Might make sense to add that gist to the readme for grab-site too, that's handy
19:51 🔗 ivan` yeah
19:53 🔗 ivan` does anyone have some existing infrastructure to hit a site through many proxies?
19:53 🔗 ivan` http://crawlera.com/ besides this commercial offering that I don't want to pay for
19:54 🔗 Silvan has quit IRC (Read error: Operation timed out)
19:54 🔗 kyan (Well, the warrior kind of does that)
19:55 🔗 kyan Wow, $25 for 150k requests per month. That's pretty expensive
20:01 🔗 joepie91 lol
20:01 🔗 joepie91 kyan: you want to see expensive?
20:01 🔗 joepie91 kyan: https://luminati.io/
20:02 🔗 kyan HAHAHAHAhaha ha .... ha?
20:02 🔗 kyan do they get any customers?
20:03 🔗 joepie91 kyan: yeah.
20:03 🔗 joepie91 quite a few
20:04 🔗 joepie91 kyan: it's used by companies scraping prices and shit
20:04 🔗 joepie91 from competitors
20:04 🔗 joepie91 their peers are Hola users
20:04 🔗 joepie91 so, almost all residential
20:04 🔗 kyan Hm, interesting
20:04 🔗 kyan I'm not sure how well it would work against sophisticated crawler prevention
20:05 🔗 kyan e.g. if they're scraping sequential IDs, that could be tracked between IP addresses
20:05 🔗 kyan captchas could be required on suspicious requests
20:05 🔗 kyan and also Bing would have search results as good as Google if it worked
20:17 🔗 SilSte has joined #archiveteam-bs
20:22 🔗 ivan` https://github.com/ludios/grab-site#automatically-pausing-grab-site-processes-when-free-disk-is-low
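
[Editor's sketch] The idea behind pausing crawls when free disk is low can be illustrated with a threshold check. This is not ivan`'s gist or the README script, whose contents are not shown in this log; the function names and the 20/40 GiB thresholds are assumptions, and how the crawler processes are actually paused or resumed is deliberately left out.

```python
import shutil


def free_gib(path="."):
    """Free space on the filesystem containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def next_action(free, paused, low=20.0, high=40.0):
    """Decide what to do given free space in GiB and the current state.

    Two thresholds (assumed values) give hysteresis, so the crawlers do not
    flap between paused and running while free space hovers near one limit.
    Returns "pause", "resume", or None for no change.
    """
    if not paused and free < low:
        return "pause"
    if paused and free > high:
        return "resume"
    return None
```

A periodic loop would call `next_action(free_gib("/data"), paused)` and act on the result. As ivan` says about his own script, the disk check should be verified against the real mount point before relying on it.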
20:26 🔗 ivan` I have a 3.5TB grab-site of http://digitalcollections.nypl.org/ going
20:26 🔗 ivan` and 2.5TB of http://downloads.dell.com/
20:28 🔗 SimpBrain nice old driver/software downloads is good to have
20:28 🔗 SimpBrain only a matter of time before they remove old downloads
20:36 🔗 yipdw joepie91: wtf on luminati
20:37 🔗 yipdw oh it's Hola
20:38 🔗 kyan has quit IRC (Quit: This computer has gone to sleep)
20:38 🔗 yipdw I thought they were actively infecting computers or some shit
20:41 🔗 JW_work has joined #archiveteam-bs
20:42 🔗 kyan has joined #archiveteam-bs
20:43 🔗 JW_work ivan`: Here is a list of 127 youtube channels of contra dance music (with various other random crap mixed in), if you'd like to archive them: https://0bin.net/paste/0MMv2M-eh1hSTydI#SbPWz+5z+HxWt4YurJDQQUEjI7iWskKNDbNLGOBF0ik
20:44 🔗 Start has quit IRC (Quit: Disconnected.)
20:44 🔗 JW_work It's in OPML XML format — I'm glad to work on transforming it into an easier to use format if that'd be helpful.
20:45 🔗 JW_work (the random crap is because various of the channels are their owner's personal channels, so they also uploaded various home video-type stuff — all the channels should have at least some contra dance music, and there aren't any channels focused on other topics, IIRC)
20:48 🔗 ivan` I can transform XML with my mad sublime text skills, don't worry about that part
20:48 🔗 ivan` that sure is a lot of channels
20:49 🔗 JW_work yep, I've been collecting them for a while
20:50 🔗 JW_work I've been (very slowly) working on indexing the contra dance videos on them on to MusicBrainz — when I heard about your archiving effort, I thought I'd send it over.
20:50 🔗 JW_work I can also give you a smaller list of higher value ones, if you'd like.
20:51 🔗 JW_work currently listening to https://www.youtube.com/watch?v=pthkg4f2HAo
20:51 🔗 yipdw the more I use curl, the more I am disgusted by HTTP libraries
20:52 🔗 ivan` JW_work: I will add all of them if you think they're all worth archiving
20:52 🔗 JW_work I think they are all worth archiving.
20:52 🔗 ivan` my script will work through them over about a month
20:53 🔗 JW_work Great! None of them are particularly in danger of vanishing right now, so a month should work fine.
20:53 🔗 * ivan` goes to write a program to turn channels into usernames
20:54 🔗 JW_work Yeah, the XML is just the output of https://www.youtube.com/subscription_manager?action_takeout=1
20:55 🔗 JW_work so if you write something to handle it, it will likely be generally useful
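
[Editor's sketch] Handling the subscription export mentioned above could start with pulling channel IDs out of the OPML. This is a minimal sketch, not ivan`'s program: it assumes each subscription is an `outline` element whose `xmlUrl` attribute is a feed URL carrying a `channel_id` query parameter, and it stops at channel IDs rather than resolving them to usernames.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlparse


def channel_ids_from_opml(opml_text):
    """Extract channel IDs from an OPML subscription export.

    Assumption: subscriptions appear as <outline xmlUrl="...?channel_id=...">.
    Outlines without an xmlUrl (such as the grouping outline) are skipped.
    """
    root = ET.fromstring(opml_text)
    ids = []
    for outline in root.iter("outline"):
        feed_url = outline.get("xmlUrl")
        if not feed_url:
            continue
        query = parse_qs(urlparse(feed_url).query)
        ids.extend(query.get("channel_id", []))
    return ids
```

Each ID can then be turned into a channel URL of the form `https://www.youtube.com/channel/<id>` for whatever archiver consumes the list.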
21:21 🔗 ivan` JW_work: OK, all of your subscriptions and spreadsheet submissions are queued
21:22 🔗 ivan` beware my youtube archiver is a stochastic process
21:22 🔗 ivan` and something like 1% of videos fail to download without manual intervention which I almost never bother with
21:22 🔗 ivan` youtube is great. announces formats that it fails to serve.
21:28 🔗 JW_work that shouldn't be a problem — the ones I *have* indexed on MusicBrainz should have already been grabbed by the musicbrainz external links warrior project recently, and I'll likely grab my high-value targets myself too; but it's very good to have another copy elsewhere, so thank you!
21:29 🔗 ivan` np
21:44 🔗 joepie91 yipdw: they are
21:44 🔗 ersi ivan`: Jeez, that's some huge fucking grabs!
21:44 🔗 joepie91 yipdw: with hola
21:44 🔗 joepie91 lol
21:44 🔗 joepie91 yipdw: http://adios-hola.org/
21:45 🔗 ersi yipdw: What in particular are you disgusted about with curl?
21:51 🔗 kyan_ has joined #archiveteam-bs
21:53 🔗 kyan has quit IRC (Ping timeout: 258 seconds)
21:53 🔗 kyan_ is now known as kyan
22:04 🔗 kyan don't have time to look into it right now but ftp://ftp.us.dell.com/video/ just got posted to /r/opendirectories. Lots of drivers
22:04 🔗 arkiver ooooh
22:05 🔗 arkiver will check that in for the ftp project
22:05 🔗 kyan might be stuff in the parent dir too
22:06 🔗 arkiver yep, will get that too
22:16 🔗 ersi http://www.bloomberg.com/features/2016-solar-power-buffett-vs-musk/img/buffett_vs_musk.gif
22:16 🔗 ersi hehehe
22:27 🔗 kyan has quit IRC (This computer has gone to sleep)
22:28 🔗 Smiley is there any chance of getting major to autovoice me in #archivebot ?
22:46 🔗 HCross Watching my warrior archive the Friends Reunited page for my old school is so satisfying
22:53 🔗 joepie91 wow
22:53 🔗 joepie91 when you think you've seen everything
22:53 🔗 joepie91 http://www.nieuwsbladtransport.nl/Nieuws/Article/tabid/85/ArticleID/40874/ArticleName/Samskipgaatreorganiseren/Default.aspx
22:53 🔗 joepie91 cc arkiver
22:54 🔗 joepie91 "Dear reader, After one year, the pictures in our articles are removed from the site. The texts themselves, however, will remain unchanged."
22:54 🔗 HCross ....
22:54 🔗 HCross wow
22:54 🔗 joepie91 so yeah, one for your newsbot
22:54 🔗 joepie91 lol
22:54 🔗 joepie91 amazing, though
22:54 🔗 joepie91 never seen this before, boggles the mind
22:54 🔗 HCross joepie91, I'll add it soon. Not atm though
22:55 🔗 HCross I'm not popular atm with the datacenter
22:55 🔗 joepie91 HCross: haha, how come
22:55 🔗 HCross Bandwidth, ALL OF THE BANDWIDTH
22:55 🔗 joepie91 lol
22:55 🔗 joepie91 HCross: oh, they paywall too
22:55 🔗 HCross feck
22:55 🔗 joepie91 might need to make sure you're grabbing it without cookies
22:56 🔗 vtyl has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 RedType has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 SimpBrain has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 phuzion has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 atlogbot has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 schbirid2 has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 Infreq has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 JW_work has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 mistym has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 dxrt has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 swebb has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 slyphic has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 chazchaz has quit IRC (hub.efnet.us irc.servercentral.net)
22:56 🔗 joepie91 yeah, this site is a bit special
22:56 🔗 joepie91 lol
22:57 🔗 RedType_ has joined #archiveteam-bs
22:57 🔗 phuzion_ has joined #archiveteam-bs
22:59 🔗 mistym- has joined #archiveteam-bs
23:00 🔗 lytv has joined #archiveteam-bs
23:00 🔗 dxrt_ has joined #archiveteam-bs
23:01 🔗 ersi joepie91: I guess they.. only license the images for one year?
23:01 🔗 Infreq_ has joined #archiveteam-bs
23:01 🔗 ersi Incredibly stupid though
23:01 🔗 SimpBrai1 has joined #archiveteam-bs
23:01 🔗 joepie91 very much so
23:01 🔗 joepie91 lol
23:01 🔗 joepie91 ersi: also, image licensing for news in NL is not usually time-limited...
23:02 🔗 schbirid has joined #archiveteam-bs
23:02 🔗 joepie91 they must've gotten the short end of the stick with their licensing agency :P
23:02 🔗 ersi or they just wanted cheaper pics
23:06 🔗 swebb has joined #archiveteam-bs
23:07 🔗 JW_work2 has joined #archiveteam-bs
23:08 🔗 chazchaz has joined #archiveteam-bs
23:09 🔗 slyphic has joined #archiveteam-bs
23:10 🔗 Start has joined #archiveteam-bs
23:11 🔗 HCross from argparse import ArgumentParser
23:11 🔗 HCross ImportError: No module named argparse
23:11 🔗 HCross arkiver, ^^
23:12 🔗 HCross nvm
23:22 🔗 xmc or they don't like paying for storage
23:45 🔗 Smiley k, my warrior stats are miles off
23:45 🔗 Smiley I'm on 30Mbit
23:45 🔗 Smiley it's telling me 280MB/s
23:45 🔗 Smiley D:
23:45 🔗 Smiley Oh wait reading the total XD
23:46 🔗 HCross I'm waiting to get Debian installed on this server then I will have a clue what I am doing
23:50 🔗 dxrt_ is now known as dxrt
