Time |
Nickname |
Message |
00:24
🔗
|
|
ndiddy has quit IRC (Read error: Operation timed out) |
00:44
🔗
|
|
JesseW has joined #archiveteam-bs |
01:05
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
02:10
🔗
|
|
DFJustin has quit IRC (Read error: Operation timed out) |
02:15
🔗
|
|
VADemon has quit IRC (Quit: left4dead) |
02:25
🔗
|
yipdw |
I'm glad IA *does* take a hardline approach to robots.txt, because in the long run it seems like the best way to de-escalate the blocking arms race |
02:25
🔗
|
yipdw |
imagine what would happen if they didn't |
02:25
🔗
|
yipdw |
you'd end up with the Internet equivalent of the 2016 US presidential election with assholes on all sides and a small contingent of reasonable people drowned out |
02:41
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
03:26
🔗
|
|
JesseW has joined #archiveteam-bs |
03:51
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
03:51
🔗
|
phillipsj |
Ducky_, don't URL shorteners mainly defeat the point of HTTPS? (by acting as a MITM) I suppose a service guaranteeing that any redirects use HTTPS may help against passive adversaries. |
03:55
🔗
|
murk |
phillipsj: there's a scarier bug with URL shorteners: lots of services like twitter shorten user input, and often that input contains secret URLs (auth tokens, private image URLs, etc), which the shortener turns into something that is trivial to enumerate. |
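To put rough numbers on murk's point, here is a minimal sketch comparing the search space of a short code with that of a long secret token. The 62-character alphabet, the 6-character code length, and the 32-character hex token are illustrative assumptions, not the parameters of any particular shortener.

```python
import math
import string

# Assumption: short codes are drawn from [a-zA-Z0-9]; real services vary.
ALPHABET = string.ascii_letters + string.digits


def keyspace(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct codes of the given length."""
    return alphabet_size ** length


def entropy_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Bits an attacker must guess if codes are uniformly random."""
    return length * math.log2(alphabet_size)


# Hypothetical 6-character short code vs. a 32-character hex secret.
short_codes = keyspace(6)
secret_tokens = keyspace(32, alphabet_size=16)

print(f"short code space:   {short_codes:,} ({entropy_bits(6):.1f} bits)")
print(f"secret token space: 2**{int(entropy_bits(32, 16))}")
```

Under these assumptions the short code space is small enough to walk exhaustively, while the original secret is not, so the shortened URL becomes the weakest link.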
04:27
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
04:30
🔗
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
04:39
🔗
|
|
Sk1d has joined #archiveteam-bs |
04:41
🔗
|
|
DFJustin has joined #archiveteam-bs |
04:41
🔗
|
|
swebb sets mode: +o DFJustin |
04:52
🔗
|
phillipsj |
You mean using https://encryptedpastebin.io/sdjkhhksd#ab3b2b2c22234f22e2222d2222a221a6 then using a url shortener makes the whole exercise a waste of CPU cycles? ;) |
04:58
🔗
|
|
JesseW has joined #archiveteam-bs |
05:00
🔗
|
phillipsj |
I was confused how encrypted pastebins were supposed to work until somebody pointed out to me that data after the '#' is not sent to the server (in a get request anyway). |
05:06
🔗
|
Frogging |
yep |
05:08
🔗
|
Frogging |
That's how mega works now too |
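The point about '#' can be sketched with Python's standard `urllib.parse`: the fragment is split off on the client and is not part of the request target sent to the server. The helper names `request_target` and `client_side_secret` are made up for this illustration, and the URL is the example from the message above.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Path and query a client would put on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # note: the fragment is never included


def client_side_secret(url: str) -> str:
    """The part after '#', which stays in the browser."""
    return urlsplit(url).fragment


url = "https://encryptedpastebin.io/sdjkhhksd#ab3b2b2c22234f22e2222d2222a221a6"
print(request_target(url))      # what the server sees
print(client_side_secret(url))  # what only the client (and anyone holding the full URL) sees
```

This is also why pasting such a URL into a shortener undoes the scheme: the shortener receives and stores the whole string, fragment included.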
05:13
🔗
|
godane |
i'm going to slowly upload this youtube channel since it has a lot of old vhs videos: https://www.youtube.com/user/jeburdick/ |
05:15
🔗
|
godane |
i found out about it cause of this: https://www.reddit.com/r/90s/comments/4kp8al/i_accidently_created_a_90s_youtube_archive_by/ |
05:17
🔗
|
JesseW |
ivan: Please grab that channel. |
05:19
🔗
|
hook54321 |
How do I look at a log for a job from the archivebot viewer? |
05:21
🔗
|
JesseW |
hook54321: link? |
05:21
🔗
|
JesseW |
or job ID, or whatever you are starting with...? |
05:22
🔗
|
hook54321 |
http://archive.fart.website/archivebot/viewer/job/1f84g |
05:23
🔗
|
JesseW |
hook54321: and you are looking for the wpull log? |
05:24
🔗
|
hook54321 |
I think so? Whatever lists the URLs I guess... |
05:25
🔗
|
JesseW |
It looks like that job didn't go through for some reason. Normally there are more files listed. |
05:25
🔗
|
JesseW |
e.g. http://archive.fart.website/archivebot/viewer/job/30015 |
05:25
🔗
|
JesseW |
you would download the -meta.warc.gz |
05:25
🔗
|
JesseW |
for the list of URLs |
05:26
🔗
|
hook54321 |
hmm |
05:26
🔗
|
hook54321 |
odd |
05:26
🔗
|
JesseW |
so I'd try running that job again |
05:26
🔗
|
yipdw |
OR |
05:26
🔗
|
yipdw |
it hasn't uploaded yet |
05:26
🔗
|
JesseW |
AHHHH. |
05:26
🔗
|
JesseW |
That's a *much* more likely reason. |
05:27
🔗
|
JesseW |
I didn't realize that the json files could end up in different items from the rest of the job |
05:27
🔗
|
yipdw |
they can |
05:27
🔗
|
yipdw |
two days is a bit long, I grant, but hey |
05:27
🔗
|
hook54321 |
Wouldn't it be fairly quick for it to upload the log though? |
05:27
🔗
|
yipdw |
no |
05:28
🔗
|
yipdw |
the log is a WARC record in the metawarc; that gets uploaded along with all the other warcs |
05:28
🔗
|
yipdw |
also the log can be quite large depending on number of ignore patterns etc |
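Since the log lives inside the -meta.warc.gz as an ordinary WARC record, it can be pulled out once the file has been downloaded. The following is a rough standard-library sketch, not ArchiveBot's own tooling. The default target URI `urn:X-wpull:log` is an assumption about how wpull labels its log record, so check the record headers of a real file before relying on it. The function names are hypothetical.

```python
import gzip
import io
from typing import BinaryIO, Dict, Iterator, Optional, Tuple


def iter_warc_records(stream: BinaryIO) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, payload) for each record in a decompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank lines separating records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers: Dict[str, str] = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers["content-length"])
        yield headers, stream.read(length)


def extract_log(source, target_uri: str = "urn:X-wpull:log") -> Optional[str]:
    """Return the first record whose WARC-Target-URI matches, or None.

    `source` is a path or a binary file object; the default URI is an assumption.
    """
    with gzip.open(source, "rb") as stream:
        for headers, payload in iter_warc_records(stream):
            if headers.get("warc-target-uri", "").strip("<>") == target_uri:
                return payload.decode("utf-8", errors="replace")
    return None
```

Usage would be along the lines of `print(extract_log("job-meta.warc.gz"))`, where the filename is a placeholder. As noted above, the log can be large, so for big jobs it may be better to stream the payload to disk instead of returning it as one string.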
05:29
🔗
|
hook54321 |
It'll be possible to view just the log for this job though, right? |
05:29
🔗
|
yipdw |
in any case that job was processed on ananiel-falconkirtaran-net-b |
05:29
🔗
|
yipdw |
that pipeline is generally pretty good but if something ends up going wrong talk to FalconK |
05:29
🔗
|
yipdw |
and yes, it will |
05:32
🔗
|
hook54321 |
k. I'm not in a huge rush or anything. At some point I just want to look for subdomains and do those as well if there are any. |
05:33
🔗
|
hook54321 |
Also, is there any reason for me to not ignore these links on my flashalert job? https://data.oregon.gov/w/5gg8-bhnv/k5vp-q3pt?cur=sOjJqG5wT9g&from=xVDEOAAhFUB |
05:34
🔗
|
hook54321 |
They appear to either go to a list or a map. |
05:57
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
06:07
🔗
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
06:18
🔗
|
|
Start has joined #archiveteam-bs |
07:56
🔗
|
|
bsmith093 has quit IRC (Ping timeout: 370 seconds) |
08:11
🔗
|
|
schbirid has joined #archiveteam-bs |
08:13
🔗
|
godane |
https://archive.org/details/Child_Safety_In_The_Home_1995_VHSRip |
08:17
🔗
|
|
atlogbot has quit IRC (Ping timeout: 370 seconds) |
08:18
🔗
|
|
bsmith093 has joined #archiveteam-bs |
08:26
🔗
|
midas |
thank you godane :D |
08:38
🔗
|
godane |
this is starting to be uploaded: https://archive.org/details/Becoming_A_Master_-_The_Ultimate_Pokemon_Experience_1999_VHSRip |
08:38
🔗
|
godane |
tape one is on there now |
08:39
🔗
|
godane |
then there is this: https://archive.org/details/How_To_Groom_Your_Pet_Professionally_199x_VHSRip |
08:41
🔗
|
ranma |
was expecting to be more entertained by https://yourmom.likesbuttse.xxx/stuff/Masters_of_Doom-David_Kushner-Read_by_Wil_Wheaton/ |
08:41
🔗
|
ranma |
but kinda zzz'd in the first 5-10 minutes |
08:50
🔗
|
|
Honno has joined #archiveteam-bs |
08:54
🔗
|
|
swebb has quit IRC (Ping timeout: 246 seconds) |
08:57
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
09:52
🔗
|
|
metalcamp has joined #archiveteam-bs |
10:47
🔗
|
|
r3c0d3x has quit IRC (Ping timeout: 260 seconds) |
10:57
🔗
|
|
r3c0d3x has joined #archiveteam-bs |
12:03
🔗
|
godane |
so one of that guy's tapes was blocked by copyright: https://www.youtube.com/watch?v=NquoeowJfv8 |
12:03
🔗
|
godane |
i think its a personal recording he did |
12:37
🔗
|
godane |
i'm uploading this: https://en.wikipedia.org/wiki/If_You_Could_See_What_I_Hear |
12:37
🔗
|
godane |
i got a rare vhs rip of it |
12:40
🔗
|
godane |
i'm also getting a rare copy of Dreamcast Chistmas Seaman |
12:41
🔗
|
godane |
*Christmas |
12:47
🔗
|
godane |
another video blocked : https://www.youtube.com/watch?v=WgmlqVjQ4lI |
12:47
🔗
|
godane |
by WMG |
13:51
🔗
|
|
RichardG has quit IRC (Ping timeout: 260 seconds) |
14:02
🔗
|
|
RichardG has joined #archiveteam-bs |
14:10
🔗
|
|
swebb has joined #archiveteam-bs |
14:18
🔗
|
|
JesseW has joined #archiveteam-bs |
14:21
🔗
|
|
atlogbot has joined #archiveteam-bs |
14:21
🔗
|
|
Honno has quit IRC (Read error: Operation timed out) |
14:29
🔗
|
|
Honno has joined #archiveteam-bs |
14:30
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
14:39
🔗
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
14:41
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
14:41
🔗
|
|
RichardG has joined #archiveteam-bs |
14:52
🔗
|
|
RichardG has quit IRC (Ping timeout: 499 seconds) |
14:55
🔗
|
|
RichardG has joined #archiveteam-bs |
16:08
🔗
|
ranma |
there was a guy that did an analysis of the Ronda Rousey fight that made her famous |
16:08
🔗
|
ranma |
i don't think he used more than 100-200 frames in the course of 4-5 minutes, but it was enough for the UFC to take it down :< |
16:09
🔗
|
ranma |
couldn't get the author to send it to me to upload it and fend off the UFC |
16:09
🔗
|
ranma |
it was 100% fair use :/ |
16:10
🔗
|
ranma |
analyzing posture, center of gravity etc |
16:33
🔗
|
|
Start has joined #archiveteam-bs |
16:54
🔗
|
|
fie has quit IRC (Read error: Operation timed out) |
16:57
🔗
|
|
fie has joined #archiveteam-bs |
17:12
🔗
|
xmc |
:( |
17:29
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
18:11
🔗
|
hook54321 |
Anyone know how I might be able to access this article for free? http://www.scientificamerican.com/article/long-live-the-web/ |
18:14
🔗
|
schbirid |
hook54321: https://ac-me.googlecode.com/files/Long%20Live%20the%20Web_%20A%20Call%20for%20Continued%20Open%20Standards%20and%20Neutrality_%20Scientific%20American.pdf apparently |
18:15
🔗
|
hook54321 |
how did you do that? :P |
18:15
🔗
|
schbirid |
i googled! :P |
18:16
🔗
|
schbirid |
https://www.cs.virginia.edu/~robins/Long_Live_the_Web.pdf is even better |
18:17
🔗
|
Atluxity |
google-fu is strong in this one |
18:20
🔗
|
hook54321 |
Me: Will we get graded down for using information from a source we didn't pay for? |
18:20
🔗
|
hook54321 |
Teacher: Do what you need to do, I don't care where you get it from. |
18:23
🔗
|
xmc |
modern academia |
18:24
🔗
|
Frogging |
hm? that sounds like a reasonable answer to me |
18:24
🔗
|
xmc |
ditto |
18:25
🔗
|
hook54321 |
I tried putting it into sci-hub and that didn't even work :P |
18:32
🔗
|
|
Start has joined #archiveteam-bs |
18:40
🔗
|
schbirid |
second url was just googling the title btw :P |
19:14
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
19:45
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
19:50
🔗
|
|
VADemon has joined #archiveteam-bs |
20:22
🔗
|
|
atrocity has quit IRC (Read error: Operation timed out) |
20:30
🔗
|
|
Muad-Dib has quit IRC (Quit: ZNC - http://znc.in) |
20:31
🔗
|
|
tfgbd has joined #archiveteam-bs |
20:48
🔗
|
tfgbd |
Has archiveteam downsized? I don't see much newer stuff listed when I click the archiveteam tag |
20:48
🔗
|
MrRadar |
tfgbd: Sort by date archived |
20:48
🔗
|
MrRadar |
You'll see there's quite a lot of stuff coming through |
20:49
🔗
|
MrRadar |
Don't miss the archivebot tag either: https://archive.org/search.php?query=subject%3A%22archivebot%22&sort=-publicdate |
20:50
🔗
|
tfgbd |
I see. |
20:50
🔗
|
tfgbd |
It's sorted by relevance and seems to be showing some of the stuff I uploaded |
20:51
🔗
|
MrRadar |
There are also a lot of untagged items in the archiveteam collection too: https://archive.org/details/archiveteam?&sort=-publicdate |
20:52
🔗
|
tfgbd |
I just had a HDD crash and lost most of the items I uploaded last year. |
20:52
🔗
|
tfgbd |
Fortunately, a lot of stuff is still public on IA |
20:52
🔗
|
xmc |
:( |
20:52
🔗
|
tfgbd |
Guess I'll just have to redownload it |
20:52
🔗
|
tfgbd |
But that won't get my precious IRC logs back :/ |
20:54
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:26
🔗
|
tfgbd |
Could I contact the author and see if he has backups? |
21:43
🔗
|
|
mutoso has quit IRC (Read error: Operation timed out) |
21:44
🔗
|
|
mutoso has joined #archiveteam-bs |
21:48
🔗
|
tfgbd |
So, I just picked up a bunch of CDs from the thrift store |
21:49
🔗
|
tfgbd |
Wordperfect 8! |
21:49
🔗
|
xmc |
\m/ |
21:57
🔗
|
tfgbd |
you prefer bin/cue? |
21:57
🔗
|
xmc |
bin/cue is better but it's not browseable online directly |
21:57
🔗
|
xmc |
so i suggest bin/cue and .iso |
22:00
🔗
|
tfgbd |
Last time I did both bin/cue and iso |
22:00
🔗
|
xmc |
sgtm |
22:01
🔗
|
tfgbd |
I just converted the bin/cue to iso |
22:29
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
22:39
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
22:50
🔗
|
DFJustin |
that works |
23:09
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
23:12
🔗
|
|
Honno has quit IRC (Read error: Operation timed out) |
23:12
🔗
|
hook54321 |
Is there a quick way to see if a page uses AJAX? |
23:36
🔗
|
yipdw |
there are heuristics like checking for $.ajax, new XMLHttpRequest, etc, but if you want to be absolutely sure you need to execute the page script |
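The heuristics yipdw describes can be sketched as a simple text scan. The patterns for `$.ajax` and `new XMLHttpRequest` come from the message above. The `fetch(` pattern and the extra jQuery shorthands are additions made for this example, and the function name `ajax_hints` is hypothetical. This only inspects the source text it is given, so scripts loaded from other files, or requests built dynamically, will be missed.

```python
import re
from typing import List

# Pattern names are arbitrary labels for this sketch.
AJAX_HINTS = {
    "jquery_ajax": re.compile(r"\$\.(ajax|get|post|getJSON)\s*\("),
    "xhr": re.compile(r"new\s+XMLHttpRequest\b"),
    "fetch": re.compile(r"\bfetch\s*\("),  # assumption: not mentioned in the log
}


def ajax_hints(source: str) -> List[str]:
    """Return the names of the heuristics that match the given HTML/JS text."""
    return [name for name, pattern in AJAX_HINTS.items() if pattern.search(source)]


sample = '<script>var r = new XMLHttpRequest(); $.ajax({url: "/x"});</script>'
print(ajax_hints(sample))
print(ajax_hints("<p>static page</p>"))
```

An empty result does not prove the page is static. As the message says, being absolutely sure means executing the page's scripts, for example in a headless browser, and watching the network requests.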
23:45
🔗
|
godane |
https://archive.org/details/If_You_Could_See_What_I_Hear_1982_VHSRip |