Time |
Nickname |
Message |
00:20
🔗
|
godane |
so i should be up to 2008-09-20 with funny or die videos now |
00:22
🔗
|
godane |
btw i'm up to 790k items |
00:22
🔗
|
MrRadar |
Where do you find the time to do so much archiving? |
00:24
🔗
|
godane |
i have autism and i'm unemployed |
00:24
🔗
|
MrRadar |
Ah, that would definitely give you time |
00:25
🔗
|
godane |
that and most of it is scripting |
00:26
🔗
|
godane |
no different than what Jason does when uploading 1000s of items |
00:26
🔗
|
godane |
some stuff is more picky like the funny or die videos |
00:27
🔗
|
godane |
but that's cause description metadata doesn't always upload |
00:27
🔗
|
|
DoomTay has joined #archiveteam-bs |
00:28
🔗
|
godane |
i have to upload the missing files without descriptions so they at least get uploaded |
00:28
🔗
|
godane |
keywords, title and date metadata is still there |
00:30
🔗
|
MrRadar |
Thanks for all the work you do, you may be the most productive "unemployed" person ever |
00:32
🔗
|
Asparagir |
^ what he/she/they said. You do good work, and lots of it. |
00:32
🔗
|
JesseW |
Heh, there have been a LOT of unemployed people who have done lots of valuable work. You stand in a long tradition. |
00:33
🔗
|
xmc |
^ |
01:09
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
01:29
🔗
|
joepie91 |
JesseW: hrm. it's a very interesting concept, but it's failing to provide any good examples of real-world implementations or what they'd look like |
01:36
🔗
|
JesseW |
joepie91: what, WTFD? |
01:39
🔗
|
JesseW |
well, I came across it from https://github.com/hyperboria/docs/blob/master/achievements.md so that may provide something of an example |
01:46
🔗
|
|
RichardG has quit IRC (Ping timeout: 258 seconds) |
01:50
🔗
|
|
anjacks0n has joined #archiveteam-bs |
01:53
🔗
|
|
anjacks0n has quit IRC (Ping timeout: 190 seconds) |
01:54
🔗
|
godane |
i'm starting to download the july 2016 of kpfa |
01:55
🔗
|
Stiletto |
got an easy one - emulation9.com - Archive.org grabs 403, browser sees 403, Google Cache sees it fine. at the very least could someone please find me a workaround to view it? web-based proxy or something? :) |
01:55
🔗
|
Stiletto |
i tried a user-agent changer in Firefox but it didn't do the trick |
01:55
🔗
|
Stiletto |
man I am rusty at these tricks |
01:58
🔗
|
JesseW |
Stiletto: I can't read the language it's written in. What particular URL are you trying to get at, that isn't in Google Cache? |
02:00
🔗
|
Stiletto |
it's Japanese and IIRC they give error 403 to all non-Japan browsers. Google Cache sees it fine. so how can I view it normally and not in Google Cache? ;) |
02:00
🔗
|
joepie91 |
Stiletto: moment |
02:00
🔗
|
Stiletto |
thanks joepie91! :) |
02:01
🔗
|
joepie91 |
I think I might have a Japan-based VPS... |
02:01
🔗
|
* |
joepie91 digs |
02:02
🔗
|
JesseW |
interestingly, it fails using Google Translate, also |
02:02
🔗
|
joepie91 |
yes I do! |
02:03
🔗
|
Stiletto |
cool, thanks joepie91 |
02:03
🔗
|
JesseW |
btw, joepie91 -- was your comment about "it's a very interesting concept" in reference to the essay I posted about WTFM, or something else? |
02:03
🔗
|
Stiletto |
the website isn't in danger or anything, it's been online since 1999 |
02:04
🔗
|
joepie91 |
-bash: less: command not found |
02:04
🔗
|
joepie91 |
... k... |
02:04
🔗
|
joepie91 |
Stiletto: hm, I get a 403 from that VPS also |
02:04
🔗
|
JesseW |
"it's been online since 1999" -- that alone seems like a sign of danger |
02:04
🔗
|
joepie91 |
perhaps it's just broken? |
02:04
🔗
|
joepie91 |
and Google Cache happens to have a good copy |
02:04
🔗
|
joepie91 |
JesseW: WTFM |
02:05
🔗
|
JesseW |
the google cache copy is from Aug 1, 2016 01:09:45 GMT |
02:05
🔗
|
Stiletto |
well it's been dishing out 403 for years and years now. IIRC there was some drama in the emu-scene and the guy locked down the site somewhat |
02:05
🔗
|
Stiletto |
i still have it in my bookmarks tho and tonight i got to wondering about workarounds |
02:05
🔗
|
DoomTay |
We wouldn't happen to have anyone onboard who actually lives in Japan, would we? |
02:05
🔗
|
joepie91 |
fwiw, this is maxmind geolocation for my server: {"as":"AS20473 Choopa, LLC","city":"Heiwajima","country":"Japan","countryCode":"JP","isp":"Choopa, LLC","lat":35.5833,"lon":139.7483,"org":"Choopa, LLC","query":"108.61.200.70","region":"13","regionName":"Tokyo","status":"success","timezone":"Asia/Tokyo","zip":"143-0006"} |
02:06
🔗
|
joepie91 |
it might be locked to Japanese *ISPs* |
02:06
🔗
|
joepie91 |
(Choopa is very much not one :P) |
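A minimal sketch of the reasoning above, using only the record pasted in the channel: a country-level filter would accept this server, while a filter keyed on the network operator would not. The `ALLOWED_OPERATORS` set and both helper functions are hypothetical; how emulation9.com actually decides whom to block is not known from the log.

```python
import json

# Geolocation record exactly as pasted in the channel.
RECORD = (
    '{"as":"AS20473 Choopa, LLC","city":"Heiwajima","country":"Japan",'
    '"countryCode":"JP","isp":"Choopa, LLC","lat":35.5833,"lon":139.7483,'
    '"org":"Choopa, LLC","query":"108.61.200.70","region":"13",'
    '"regionName":"Tokyo","status":"success","timezone":"Asia/Tokyo",'
    '"zip":"143-0006"}'
)

# Hypothetical allowlist of operator names; a placeholder for illustration,
# not a real list of Japanese ISPs.
ALLOWED_OPERATORS = {"Example Japan Broadband"}


def passes_country_check(record, country_code="JP"):
    """True if the record geolocates to the given country."""
    return record.get("countryCode") == country_code


def passes_operator_check(record, allowed=ALLOWED_OPERATORS):
    """True if the record's ISP is on the (assumed) allowlist."""
    return record.get("isp") in allowed


record = json.loads(RECORD)
print("country check:", passes_country_check(record))
print("operator check:", passes_operator_check(record))
```

Under these assumptions the two checks disagree, which would be consistent with a VPS in Tokyo still getting a 403.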
02:06
🔗
|
JesseW |
and it's now 2:06 GMT, so I don't think it's broken. |
02:07
🔗
|
DoomTay |
If so, this would be the first time I've seen a site restricted by ISP |
02:07
🔗
|
joepie91 |
DoomTay: not for me :P |
02:07
🔗
|
JesseW |
locked to Japanese ISPs and Google, presumably |
02:07
🔗
|
joepie91 |
DoomTay: Tunisian government actually did this during arab spring |
02:07
🔗
|
Stiletto |
hrm |
02:07
🔗
|
joepie91 |
hm |
02:07
🔗
|
joepie91 |
may be Egyptian |
02:07
🔗
|
joepie91 |
not Tunisian |
02:07
🔗
|
JesseW |
can you go from one page to another *via* Google Cache? |
02:07
🔗
|
joepie91 |
can't recall which |
02:07
🔗
|
JesseW |
could have been both |
02:08
🔗
|
joepie91 |
Stiletto: if you have a few $ and you consider it important enough, you could try some actually-Japanese VPS services |
02:08
🔗
|
joepie91 |
see also https://www.exoticvps.com/country/japan |
02:09
🔗
|
joepie91 |
some of those are not really Japanese though |
02:10
🔗
|
JesseW |
Looking at the google cache page, I'm not seeing many internal links -- Stiletto, can you provide some examples? |
02:12
🔗
|
|
nightpool has joined #archiveteam-bs |
02:13
🔗
|
Stiletto |
all the internal links are on the left side between the two graphics linking to amazon |
02:13
🔗
|
Stiletto |
ex. http://www.emulation9.com/emulators/a_mame.html |
02:17
🔗
|
Stiletto |
come to think of it it would be nice to have a legit backup in the archive, they've got 17 years of emulator news headlines. That's pretty rare these days to find a news site that's been going that long. Looks like Google Cache is thorough, but you can't browse from page to page within Google Cache (tho I think there's a Firefox plugin that can do that or something) |
02:18
🔗
|
Stiletto |
ex: http://webcache.googleusercontent.com/search?q=cache:25iDTYHPnpYJ:www.emulation9.com/archives/ |
02:19
🔗
|
Stiletto |
anyhow, it's always been an annoyance of mine :) |
02:19
🔗
|
Stiletto |
sorry to bother you guys :) |
02:21
🔗
|
Stiletto |
hm google cache does not have some of those old headlines |
02:26
🔗
|
joepie91 |
Stiletto: try prefixing the URL with cache: in Google search |
02:26
🔗
|
joepie91 |
and wait a few days |
02:26
🔗
|
joepie91 |
chances are they will have it then |
02:26
🔗
|
Stiletto |
hm ok |
02:26
🔗
|
Stiletto |
maybe someone _should_ do a grab with a Japanese ISP for archive.org. I have no money for a VPS right now tho :-/ |
02:26
🔗
|
joepie91 |
(I'm fairly certain that Google obsessively collects URLs from all of their properties, and uses stuff like this to track demand for specific URLs) |
02:26
🔗
|
Stiletto |
(agreed) |
03:22
🔗
|
|
ndiddy has quit IRC (Read error: Connection reset by peer) |
03:22
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
03:23
🔗
|
|
ndiddy has joined #archiveteam-bs |
03:23
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
03:24
🔗
|
|
remsen has quit IRC (Read error: Operation timed out) |
03:24
🔗
|
|
remsen has joined #archiveteam-bs |
03:26
🔗
|
|
beardicus has joined #archiveteam-bs |
03:28
🔗
|
|
wp494 has quit IRC (Read error: Connection reset by peer) |
03:33
🔗
|
|
Coderjoe has joined #archiveteam-bs |
04:04
🔗
|
hook54321 |
Stiletto: try changing your useragent to the google cache UA |
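A rough sketch of that suggestion with Python's standard library: build a request that announces a different User-Agent header. The user agent string below is a placeholder, since the actual Google Cache user agent is not given anywhere in this log, and the server may well return 403 regardless.

```python
import urllib.request


def build_request(url, user_agent):
    """Return a GET request for `url` that announces `user_agent`."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder value, assumed for illustration only.
HYPOTHETICAL_UA = "ExampleCacheBot/1.0"

req = build_request("http://www.emulation9.com/", HYPOTHETICAL_UA)
print(req.get_method(), req.full_url, req.get_header("User-agent"))

# To actually send it (urlopen raises HTTPError if the server answers 403):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.status)
```

The network call is left commented out so the header handling can be checked without hitting the site.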
04:05
🔗
|
|
RichardG has joined #archiveteam-bs |
04:28
🔗
|
|
robink has quit IRC (Ping timeout: 246 seconds) |
04:34
🔗
|
|
Aranje has joined #archiveteam-bs |
04:38
🔗
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
04:40
🔗
|
|
robink has joined #archiveteam-bs |
04:46
🔗
|
|
Sk1d has joined #archiveteam-bs |
04:46
🔗
|
hook54321 |
Is it immoral to download and archive a movie that is available to rent for a limited time that you think might be at risk of not being streamable or available on a DVD later on? |
04:55
🔗
|
JesseW |
"We are not moral guardians." |
04:59
🔗
|
JesseW |
random survey request I got -- hard to tell if it's spam or not: https://paste2.org/M1PJ3GFa |
05:02
🔗
|
|
Asparagir has quit IRC (Asparagir) |
05:08
🔗
|
|
ndiddy has quit IRC (Quit: Leaving) |
05:13
🔗
|
hook54321 |
JesseW: if you can't be a moral guardian then what's your opinion on it. :P |
05:13
🔗
|
hook54321 |
I keep on getting emails from "The Hindu Daily Digest" and I don't remember signing up for it. |
05:14
🔗
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
05:30
🔗
|
|
DoomTay has quit IRC (Quit: Page closed) |
05:34
🔗
|
godane |
with pokemon go being the craze i have this for you guys: https://archive.org/details/Becoming_A_Master_-_The_Ultimate_Pokemon_Experience_1999_VHSRip |
05:34
🔗
|
godane |
i uploaded it awhile back |
05:38
🔗
|
|
wp494 has joined #archiveteam-bs |
05:56
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
06:03
🔗
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
06:05
🔗
|
|
nightpool has joined #archiveteam-bs |
06:12
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
06:27
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
06:28
🔗
|
|
REiN^ has joined #archiveteam-bs |
06:32
🔗
|
hook54321 |
godane: "FBI Warning" xD |
06:35
🔗
|
|
schbirid has joined #archiveteam-bs |
06:36
🔗
|
|
VADemon has joined #archiveteam-bs |
06:41
🔗
|
|
kristian_ has joined #archiveteam-bs |
06:44
🔗
|
|
Coderjoe has joined #archiveteam-bs |
07:06
🔗
|
|
nightpool has joined #archiveteam-bs |
07:13
🔗
|
|
nightpool has quit IRC (Ping timeout: 244 seconds) |
07:20
🔗
|
godane |
over 2k items in my godanefunnyordie account |
07:27
🔗
|
|
kristian_ has quit IRC (Leaving) |
07:34
🔗
|
godane |
looks like bombjack scanned 3 more Boardwatch Magazines |
07:35
🔗
|
SketchCow |
Where |
07:36
🔗
|
godane |
http://www.bombjack.org/commodore/magazines/boardwatch-magazine/boardwatch-magazine.htm |
07:37
🔗
|
SketchCow |
Exciting |
07:38
🔗
|
godane |
i know |
07:38
🔗
|
godane |
i hope they also get the missing Byte Magazine issues from the 1990s scanned too |
07:41
🔗
|
godane |
also this guy is uploading Boot Magazine : https://archive.org/details/@ckeck |
07:42
🔗
|
godane |
he uploaded more on reddit: https://www.reddit.com/user/KnuckleSangwich |
07:42
🔗
|
godane |
i have them all even if he doesn't upload them all to IA |
07:46
🔗
|
SketchCow |
https://archive.org/search.php?query=collection%3Aopensource+mediatype%3Atexts+magazine is always a fun search. |
08:02
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
08:03
🔗
|
|
Coderjoe has joined #archiveteam-bs |
08:09
🔗
|
|
nightpool has joined #archiveteam-bs |
08:13
🔗
|
|
nightpool has quit IRC (Ping timeout: 250 seconds) |
09:11
🔗
|
|
tomwsmf has quit IRC (Read error: Operation timed out) |
09:11
🔗
|
|
tomwsmf has joined #archiveteam-bs |
09:24
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
09:25
🔗
|
|
nightpool has joined #archiveteam-bs |
09:30
🔗
|
|
nightpool has quit IRC (Ping timeout: 250 seconds) |
09:31
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
09:31
🔗
|
|
tomwsmf has quit IRC (Ping timeout: 258 seconds) |
09:48
🔗
|
|
REiN^ has quit IRC () |
09:53
🔗
|
|
signius has quit IRC (Ping timeout: 260 seconds) |
10:05
🔗
|
|
signius has joined #archiveteam-bs |
10:14
🔗
|
|
REiN^ has joined #archiveteam-bs |
10:26
🔗
|
|
nightpool has joined #archiveteam-bs |
10:31
🔗
|
|
nightpool has quit IRC (Ping timeout: 260 seconds) |
11:26
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
11:27
🔗
|
|
nightpool has joined #archiveteam-bs |
11:35
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
12:28
🔗
|
|
nightpool has joined #archiveteam-bs |
12:37
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
12:52
🔗
|
|
nightpool has joined #archiveteam-bs |
12:57
🔗
|
|
nightpool has quit IRC (Ping timeout: 244 seconds) |
13:48
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
13:53
🔗
|
|
RichardG has quit IRC (Ping timeout: 633 seconds) |
13:55
🔗
|
|
vitzli has joined #archiveteam-bs |
13:59
🔗
|
|
nightpool has joined #archiveteam-bs |
14:03
🔗
|
|
JSharp___ has quit IRC (Read error: Connection reset by peer) |
14:03
🔗
|
|
Ctrl-S___ has quit IRC (Write error: Connection reset by peer) |
14:04
🔗
|
|
Boltsie has quit IRC (Read error: Connection reset by peer) |
14:04
🔗
|
|
sigkell has quit IRC (Ping timeout: 260 seconds) |
14:04
🔗
|
|
antonizoo has quit IRC (Ping timeout: 260 seconds) |
14:04
🔗
|
|
HCross2 has quit IRC (Read error: Connection reset by peer) |
14:04
🔗
|
|
johtso has quit IRC (Read error: Connection reset by peer) |
14:04
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
14:06
🔗
|
|
Coderjoe has joined #archiveteam-bs |
14:09
🔗
|
|
antonizoo has joined #archiveteam-bs |
14:12
🔗
|
|
JSharp___ has joined #archiveteam-bs |
14:12
🔗
|
|
deathy has quit IRC (Connection closed) |
14:13
🔗
|
Stiletto |
hook54321: I tried a few Googlebot useragents but they didn't work. Not sure if the google-cache useragent is known? this blog post from 2012 didn't seem to think so: http://www.coconutheadphones.com/google-crawling-behavior/ |
14:13
🔗
|
|
Ctrl-S___ has joined #archiveteam-bs |
14:15
🔗
|
|
Boltsie has joined #archiveteam-bs |
14:16
🔗
|
|
sigkell has joined #archiveteam-bs |
14:36
🔗
|
|
HCross2 has joined #archiveteam-bs |
14:37
🔗
|
|
johtso has joined #archiveteam-bs |
14:40
🔗
|
|
VADemon has quit IRC (Quit: left4dead) |
14:47
🔗
|
|
RichardG has joined #archiveteam-bs |
14:47
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
14:49
🔗
|
|
metalcamp has joined #archiveteam-bs |
15:26
🔗
|
|
Coderjoe has joined #archiveteam-bs |
15:31
🔗
|
|
nightpool has joined #archiveteam-bs |
15:35
🔗
|
|
JesseW has joined #archiveteam-bs |
15:52
🔗
|
|
Coderjoe_ has joined #archiveteam-bs |
15:52
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
15:57
🔗
|
|
DoomTay has joined #archiveteam-bs |
16:06
🔗
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
16:39
🔗
|
|
anjacks0n has joined #archiveteam-bs |
16:47
🔗
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
16:51
🔗
|
|
nightpool has quit IRC (Read error: Connection reset by peer) |
16:51
🔗
|
|
metalcamp has quit IRC (Quit: Bye) |
16:57
🔗
|
|
sep332 has quit IRC (Read error: Connection reset by peer) |
17:08
🔗
|
|
nightpool has joined #archiveteam-bs |
17:19
🔗
|
|
ndiddy has joined #archiveteam-bs |
17:35
🔗
|
|
Honno has joined #archiveteam-bs |
17:47
🔗
|
|
sep332 has joined #archiveteam-bs |
17:49
🔗
|
|
Honno_ has joined #archiveteam-bs |
17:51
🔗
|
|
Honno__ has joined #archiveteam-bs |
17:59
🔗
|
|
Honno has quit IRC (Read error: Operation timed out) |
18:03
🔗
|
|
Honno_ has quit IRC (Read error: Operation timed out) |
18:37
🔗
|
|
whydomain has joined #archiveteam-bs |
18:39
🔗
|
|
vitzli has quit IRC (Leaving) |
18:50
🔗
|
|
Coderjoe_ has quit IRC (Read error: Operation timed out) |
19:01
🔗
|
hook54321 |
If someone were to set up grab-site with a Facebook cookie and then just had it start archiving, could it potentially start sending friend requests to tons of people? I'm not planning on doing this, I just want to know if that or something similar could happen. |
19:02
🔗
|
yipdw |
not unless Facebook does friend requests via GETs |
19:02
🔗
|
yipdw |
which AFAIK they don't |
19:02
🔗
|
yipdw |
wpull/grab-site follows links, it won't click buttons unless you script it to do otherwise |
19:06
🔗
|
hook54321 |
How can I tell if a website uses GETs? |
19:07
🔗
|
yipdw |
you can investigate HTTP traffic using the web inspector and from there get a gradually better picture of what request verbs are being sent |
19:07
🔗
|
yipdw |
all that said, I don't think Facebook would do something like that |
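As a complement to watching traffic in the web inspector, here is a small sketch that scans a page's HTML for plain links (which a link-following crawler would fetch with GET) and for forms together with their declared method. The `VerbScanner` class and the sample markup are invented for illustration; they are not part of wpull or grab-site, and real sites may also trigger requests from JavaScript, which this does not see.

```python
from html.parser import HTMLParser


class VerbScanner(HTMLParser):
    """Collect link targets and form actions with their HTTP method."""

    def __init__(self):
        super().__init__()
        self.links = []  # hrefs a link-following crawler would GET
        self.forms = []  # (METHOD, action) pairs; only sent when submitted

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            # HTML forms default to GET when no method attribute is given.
            method = (attrs.get("method") or "get").upper()
            self.forms.append((method, attrs.get("action") or ""))


# Made-up markup: one harmless link, one state-changing GET link (the risky
# pattern), and one POST form that a crawler following links would not submit.
SAMPLE = """
<a href="/profile/123">profile</a>
<a href="/friends/add?id=123">add friend</a>
<form method="post" action="/friends/add"><button>Add friend</button></form>
"""

scanner = VerbScanner()
scanner.feed(SAMPLE)
print("links (GET):", scanner.links)
print("forms:", scanner.forms)
```

Any link in the first list whose URL looks like it performs an action is worth excluding from a crawl that runs with a logged-in cookie.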
19:08
🔗
|
hook54321 |
I'm not archiving anything on Facebook at the moment, it's a different site. |
19:13
🔗
|
|
sanquiAFK has quit IRC (Ping timeout: 260 seconds) |
19:14
🔗
|
|
DoomTay has quit IRC (Quit: Page closed) |
19:14
🔗
|
|
Coderjoe has joined #archiveteam-bs |
19:16
🔗
|
|
Honno__ has quit IRC (Read error: Operation timed out) |
19:21
🔗
|
|
Honno__ has joined #archiveteam-bs |
19:22
🔗
|
|
tomwsmf has joined #archiveteam-bs |
19:39
🔗
|
|
Sanqui has joined #archiveteam-bs |
19:43
🔗
|
godane |
looks like IA index is not updating |
19:50
🔗
|
godane |
anyways i'm uploading kpfa for 2016-07 |
19:57
🔗
|
|
DoomTay has joined #archiveteam-bs |
20:05
🔗
|
|
Honno__ has quit IRC (Read error: Operation timed out) |
20:07
🔗
|
godane |
ok now index is updating |
20:25
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
20:39
🔗
|
|
Coderjoe has quit IRC (Ping timeout: 260 seconds) |
20:48
🔗
|
|
Coderjoe has joined #archiveteam-bs |
20:53
🔗
|
|
RichardG has joined #archiveteam-bs |
20:56
🔗
|
|
zhongfu has quit IRC (Remote host closed the connection) |
20:58
🔗
|
|
zhongfu has joined #archiveteam-bs |
21:00
🔗
|
|
DiscantX has joined #archiveteam-bs |
21:01
🔗
|
|
RichardG has quit IRC (Ping timeout: 258 seconds) |
21:02
🔗
|
|
RichardG has joined #archiveteam-bs |
21:07
🔗
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
21:45
🔗
|
joepie91 |
[23:40] <salvum> Just an FYI for Defcon goers.... they are using Biometrics for the room locks at Bally's and Paris. you can opt out. |
21:57
🔗
|
|
whydomain has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client) |
21:59
🔗
|
|
REiN^ has quit IRC (Read error: Operation timed out) |
22:04
🔗
|
xmc |
def congoers |
22:04
🔗
|
xmc |
i kinda like the idea of not having to carry around your key |
22:05
🔗
|
xmc |
i don't like the idea of giving fingerprints to the mob |
22:10
🔗
|
|
DoomTay has quit IRC (Quit: Page closed) |
22:48
🔗
|
|
DoomTay has joined #archiveteam-bs |
23:01
🔗
|
|
nightpool has quit IRC (Read error: Operation timed out) |
23:02
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:30
🔗
|
|
DoomTay has quit IRC (Quit: Page closed) |
23:31
🔗
|
SketchCow |
ha |
23:34
🔗
|
|
RichardG has joined #archiveteam-bs |
23:47
🔗
|
godane |
SketchCow: http://www.bombjack.org/commodore/disks/?C=M;O=D |
23:47
🔗
|
godane |
a lot of disks released in the last month it looks like |