#archiveteam-bs 2016-08-20,Sat


Time Nickname Message
00:00 🔗 godane so lifehacker.com and deadspin.com sitemap urls are all updated
00:07 🔗 godane tay.kotaku.com redirect to tay.kinja.com
00:11 🔗 JesseW has joined #archiveteam-bs
00:12 🔗 BartoCH has joined #archiveteam-bs
00:15 🔗 JesseW Suggestions for scripts for owners of a private github repo to use to extract the issues and publish them? https://github.com/npm/www/issues/9
00:15 🔗 JesseW I suggested joeyh's github-backup -- but other suggestions would also be very welcome.
00:25 🔗 RichardG_ is now known as RichardG
00:27 🔗 godane https://archive.org/details/Making_of_Antarctica
00:27 🔗 JesseW has quit IRC (Read error: Operation timed out)
00:27 🔗 godane https://archive.org/details/Otaku_JJ_Beineix
00:28 🔗 godane https://archive.org/details/Raid_1954
00:28 🔗 godane https://archive.org/details/Audience_of_One_-_2007
00:29 🔗 godane https://archive.org/details/Showdown_in_Little_Tokyo_Uncut_CG
00:29 🔗 godane https://archive.org/details/Hollywood_Mavericks.1990.Florence_Dauman.Dale_Ann_Steiber.mkv
00:30 🔗 godane thats all of the Cinemageddon videos i uploaded to FOS
00:30 🔗 godane i figure people here would want them
00:37 🔗 hook54321 I'm pretty sure lycos ignores robots.txt. At least partially...
00:39 🔗 hook54321 Does Lycos have any search operators?
00:40 🔗 alembic looks like they lost most of them in 2004 when they switched to Yahoo! DB?
00:40 🔗 alembic http://www.searchengineshowdown.com/features/lycos/
00:45 🔗 hook54321 the advanced page search doesn't seem to exist anymore :/
00:47 🔗 BartoCH has quit IRC (Ping timeout: 260 seconds)
00:57 🔗 BlueMaxim has joined #archiveteam-bs
01:02 🔗 username1 has joined #archiveteam-bs
01:05 🔗 schbirid2 has quit IRC (Read error: Operation timed out)
01:59 🔗 schbirid2 has joined #archiveteam-bs
02:03 🔗 username1 has quit IRC (Read error: Operation timed out)
02:06 🔗 Aranje has joined #archiveteam-bs
02:19 🔗 hook54321 has left
03:05 🔗 hook54321 has joined #archiveteam-bs
03:13 🔗 hook54321 can someone re-op me in #archivebot? svchfoo disappeared
03:28 🔗 mutoso_ has joined #archiveteam-bs
03:29 🔗 Smiley has quit IRC (Read error: Operation timed out)
03:29 🔗 beardicus has quit IRC (Read error: Operation timed out)
03:31 🔗 Whopper_ has joined #archiveteam-bs
03:31 🔗 mutoso has quit IRC (Read error: Operation timed out)
03:31 🔗 VADemon has quit IRC (Quit: left4dead)
03:32 🔗 closure has quit IRC (Read error: Operation timed out)
03:34 🔗 Smiley has joined #archiveteam-bs
03:37 🔗 Whopper has quit IRC (Read error: Operation timed out)
03:47 🔗 closure has joined #archiveteam-bs
04:20 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:27 🔗 Sk1d has joined #archiveteam-bs
04:28 🔗 Aranje has quit IRC (Ping timeout: 260 seconds)
04:36 🔗 beardicus has joined #archiveteam-bs
04:49 🔗 godane i'm uploading a adland.tv web archive from 2014-09-01
04:50 🔗 godane just know it's incomplete, meaning it stopped before being completed
04:50 🔗 godane but its +500M of it
04:51 🔗 SketchCow Star Trek Beyond
04:52 🔗 SketchCow a-ok
04:52 🔗 godane i watched that on my birthday
04:52 🔗 godane i also went to five guys
04:53 🔗 godane https://archive.org/details/adland.tv-20140901
05:04 🔗 brayden has joined #archiveteam-bs
05:04 🔗 swebb sets mode: +o brayden
05:09 🔗 brayden_ has quit IRC (Read error: Operation timed out)
05:36 🔗 tomwsmf has quit IRC (Ping timeout: 255 seconds)
06:09 🔗 dashcloud has quit IRC (Read error: Operation timed out)
06:12 🔗 ranma wow
06:12 🔗 ranma The Cuban CDN http://hn.premii.com/#/article/12319063
06:13 🔗 dashcloud has joined #archiveteam-bs
07:14 🔗 JesseW has joined #archiveteam-bs
07:27 🔗 acridAxid has quit IRC (marauder)
07:28 🔗 acridAxid has joined #archiveteam-bs
07:51 🔗 Honno has joined #archiveteam-bs
07:59 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
08:29 🔗 GE has joined #archiveteam-bs
08:35 🔗 BartoCH has joined #archiveteam-bs
10:51 🔗 GE_ has joined #archiveteam-bs
10:54 🔗 GE has quit IRC (Ping timeout: 255 seconds)
10:54 🔗 GE_ is now known as GE
11:47 🔗 BartoCH has quit IRC (Ping timeout: 260 seconds)
11:48 🔗 BartoCH has joined #archiveteam-bs
12:03 🔗 BartoCH has quit IRC (Quit: WeeChat 1.5)
12:05 🔗 BartoCH has joined #archiveteam-bs
12:57 🔗 BartoCH has quit IRC (Ping timeout: 260 seconds)
13:02 🔗 BartoCH has joined #archiveteam-bs
13:08 🔗 luckcolor has quit IRC (Read error: Operation timed out)
13:08 🔗 luckcolor has joined #archiveteam-bs
13:21 🔗 davidar has quit IRC (Quit: Connection closed for inactivity)
13:26 🔗 GE has quit IRC (Quit: zzz)
14:08 🔗 atrocity has joined #archiveteam-bs
14:08 🔗 atrocity oh 90%, why was I so young...
14:08 🔗 atrocity 90's...
14:08 🔗 atrocity https://www.youtube.com/watch?v=IY2j_GPIqRA
14:41 🔗 BlueMaxim has quit IRC (Quit: Leaving)
15:04 🔗 RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
15:06 🔗 RichardG has joined #archiveteam-bs
15:19 🔗 GE has joined #archiveteam-bs
16:26 🔗 GE_ has joined #archiveteam-bs
16:28 🔗 GE has quit IRC (Ping timeout: 255 seconds)
16:28 🔗 GE_ is now known as GE
17:38 🔗 JesseW has joined #archiveteam-bs
17:59 🔗 godane i'm uploading more pdfs from the Sky and Telescope
18:59 🔗 GE_ has joined #archiveteam-bs
19:00 🔗 GE has quit IRC (Ping timeout: 255 seconds)
19:00 🔗 GE_ is now known as GE
19:27 🔗 Igloo^ Is that the one I tried yesterday HCross
19:27 🔗 Igloo^ 4 million items?
19:28 🔗 HCross is it the NASA funded docs?
19:28 🔗 Igloo^ Yis
19:28 🔗 Igloo^ (the NASA items aren't that big tho)
19:29 🔗 HCross ah I was looking at it too
19:29 🔗 HCross ArchiveBot prob wont get it all
19:29 🔗 HCross ive exported it as XML and am looking at it
19:29 🔗 Igloo^ Archivebot went a bit mad.
19:29 🔗 Igloo^ I was working through it as individual items
19:32 🔗 HCross Igloo^, doing some testing, but it may be easier to create a large list
19:36 🔗 RichardG has quit IRC (Ping timeout: 250 seconds)
19:37 🔗 Igloo^ Yeah, Create a list and whack it through AB
19:42 🔗 godane that nasa docs i uploaded is around 90k
19:42 🔗 RichardG has joined #archiveteam-bs
19:43 🔗 tomwsmf has joined #archiveteam-bs
19:44 🔗 HCross Igloo^, they let you export the ID #'s but its per page. If I list it by 100 then its only 9 pages. Ill then write something that will generate up the URLs
19:45 🔗 Igloo^ !ao works
19:45 🔗 Igloo^ However, Doesn't get the sub images
19:45 🔗 HCross or not. Managed to download a complete list
19:45 🔗 HCross yea
19:45 🔗 Igloo^ Which is a bit ropey.
19:46 🔗 HCross !ao gets too much other stuff
19:46 🔗 Igloo^ ao gets the full site for the waybackmachine
19:46 🔗 HCross it may be that its better if I do a grab-site instance with some custom ignores etc
19:46 🔗 Igloo^ I was thinking of doing a custom Heritrix run
19:46 🔗 Igloo^ BUT I don't think that'll work
19:46 🔗 Igloo^ as it'll be huge
19:46 🔗 HCross want me to generate a full list of URLs anyway
19:47 🔗 Igloo^ Sure
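
A minimal sketch of the "generate a full list of URLs" step discussed above, assuming the exported IDs are plain PMC numbers (with or without the "PMC" prefix) and that article pages live under the /pmc/articles/ path seen later in this log. The function name pmc_urls and the input format are illustrative assumptions, not what HCross actually ran.

```python
# Hypothetical helper: turn exported PMC IDs into article URLs.
# The base path is taken from the URLs that appear in this log.
BASE = "http://www.ncbi.nlm.nih.gov/pmc/articles/"


def pmc_urls(ids):
    """Return de-duplicated article URLs, preserving input order."""
    seen = set()
    urls = []
    for raw in ids:
        token = raw.strip().upper()
        if not token:
            continue  # skip blank lines in the export
        if not token.startswith("PMC"):
            token = "PMC" + token
        if not token[3:].isdigit():
            continue  # skip anything that is not PMC followed by digits
        if token in seen:
            continue
        seen.add(token)
        urls.append(BASE + token)
    return urls


if __name__ == "__main__":
    # Write one URL per line, ready to paste somewhere other than the channel.
    for url in pmc_urls(["4973959", "PMC4980455"]):
        print(url)
```

Writing the result to a file or a paste service, one URL per line, gives a list that can be fed to a crawler in bulk.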
20:07 🔗 HCross Igloo^, www.ncbi.nlm.nih.gov/pmc/articles/PMC4973959
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4980455
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4971634
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4964660
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4971156
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4939048
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4937211
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4917110
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4934352
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4926486
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4932956
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4872529
20:07 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4870578
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4896262
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4919777
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4848480
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4831017
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4846461
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4820435
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4814050
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4797119
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4794207
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4808930
20:08 🔗 Igloo^ Fucking pastebin it or something
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4866469
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4771323
20:08 🔗 Igloo^ Instead of several hundred lines
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4751316
20:08 🔗 Igloo^ :P
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4750446
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4760178
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4810239
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4738353
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4829277
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4770934
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4731148
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4729913
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4728390
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4727388
20:08 🔗 HCross www.ncbi.nlm.nih.gov/pmc/articles/PMC4718941
20:08 🔗 HCross was kicked by Frogging (HCross)
20:08 🔗 Igloo^ Thank you Frogging
20:10 🔗 Frogging I wonder if he meant to paste the pastebin link but still had the list in his clipboard :po
20:10 🔗 Frogging :p *
20:10 🔗 Igloo^ I think that's what he meant to do :P
20:10 🔗 Igloo^ But yaknow, noob etc
20:12 🔗 HCross2 Now waiting while my hexchat stops having a meltdown over that, sorry
20:12 🔗 Frogging no problem :p
20:12 🔗 Frogging sets mode: +o HCross2
20:13 🔗 HCross has joined #archiveteam-bs
20:13 🔗 HCross there we go
20:13 🔗 HCross http://paste.nerds.io/axorogoxif.avrasm
20:13 🔗 Frogging sets mode: +o HCross
20:13 🔗 HCross thanks
20:15 🔗 JesseW has quit IRC (Quit: Leaving.)
20:15 🔗 JesseW has joined #archiveteam-bs
20:16 🔗 kristian_ has joined #archiveteam-bs
20:26 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
20:29 🔗 arkiver is it 4 million docs?
20:29 🔗 HCross the new ones that they released are just 900 odd
20:34 🔗 Igloo^ It would be a good warrior job.
20:41 🔗 godane http://www.ncbi.nlm.nih.gov/pmc/journals/1978/
20:41 🔗 godane you grab by journal number
20:41 🔗 godane then grab the links from those pages
20:41 🔗 godane the pdfs are linked there
20:42 🔗 godane http://www.ncbi.nlm.nih.gov/pmc/issues/218561/
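
The journal-then-issue approach godane describes could be sketched roughly as follows. This assumes the journal and issue pages expose ordinary `<a href>` links, that issue links contain /pmc/issues/, and that PDF links end in .pdf; none of that is confirmed by the log beyond the two example URLs. The class and function names are hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href=...> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they came from.
            self.links.append(urljoin(self.base_url, href))


def all_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def issue_links(html, base_url):
    # Assumption: issue pages look like /pmc/issues/<number>/.
    return [u for u in all_links(html, base_url)
            if "/pmc/issues/" in urlparse(u).path]


def pdf_links(html, base_url):
    # Assumption: PDFs are linked directly with a .pdf path.
    return [u for u in all_links(html, base_url)
            if urlparse(u).path.lower().endswith(".pdf")]
```

Run issue_links over a journal page, then pdf_links over each issue page it returns, to build the download list.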
21:17 🔗 Coderjoe has quit IRC (Read error: Operation timed out)
21:45 🔗 GE has quit IRC (Quit: zzz)
21:56 🔗 GE has joined #archiveteam-bs
22:17 🔗 GE_ has joined #archiveteam-bs
22:20 🔗 GE has quit IRC (Ping timeout: 255 seconds)
22:20 🔗 GE_ is now known as GE
22:34 🔗 Start has quit IRC (Quit: Disconnected.)
22:34 🔗 Start has joined #archiveteam-bs
22:45 🔗 Coderjoe has joined #archiveteam-bs
22:59 🔗 GE has quit IRC (Remote host closed the connection)
23:13 🔗 kristian_ has quit IRC (Leaving)
23:14 🔗 Honno has quit IRC (Read error: Operation timed out)
23:14 🔗 tomwsmf has quit IRC (Read error: Operation timed out)
