Time |
Nickname |
Message |
00:08
π
|
|
RichardG_ has joined #archiveteam-bs |
00:15
π
|
|
RichardG has quit IRC (Ping timeout: 499 seconds) |
00:16
π
|
|
RichardG_ is now known as RichardG |
00:20
π
|
|
ris has quit IRC () |
00:21
π
|
|
JesseW has joined #archiveteam-bs |
00:25
π
|
|
VADemon has quit IRC (left4dead) |
00:27
π
|
|
DoomTay has quit IRC (Ping timeout: 268 seconds) |
00:34
π
|
|
DoomTay has joined #archiveteam-bs |
00:55
π
|
|
Jeroen52 has quit IRC (Ping timeout: 260 seconds) |
00:58
π
|
|
coretx has quit IRC (Ping timeout: 506 seconds) |
01:02
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
01:03
π
|
|
Jeroen52 has joined #archiveteam-bs |
01:11
π
|
|
coretx has joined #archiveteam-bs |
01:37
π
|
|
mutoso has joined #archiveteam-bs |
01:37
π
|
|
mutoso_ has quit IRC (Read error: Connection reset by peer) |
01:41
π
|
|
davidar_ has quit IRC (Quit: Connection closed for inactivity) |
01:55
π
|
|
arkiver has quit IRC (Read error: Operation timed out) |
01:57
π
|
|
arkiver has joined #archiveteam-bs |
02:00
π
|
|
zenguy has quit IRC (Ping timeout: 370 seconds) |
02:01
π
|
|
zenguy has joined #archiveteam-bs |
02:02
π
|
|
dcmorton has quit IRC (Ping timeout: 370 seconds) |
02:03
π
|
|
winr5r has joined #archiveteam-bs |
02:03
π
|
|
winr4r has quit IRC (Read error: Operation timed out) |
02:07
π
|
|
dcmorton has joined #archiveteam-bs |
02:09
π
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
02:09
π
|
|
dcmorton has quit IRC (Excess Flood) |
02:10
π
|
|
dcmorton has joined #archiveteam-bs |
02:10
π
|
|
dcmorton has quit IRC (Excess Flood) |
02:11
π
|
|
dcmorton has joined #archiveteam-bs |
02:12
π
|
|
BlueMaxim has joined #archiveteam-bs |
02:35
π
|
|
dcmorton has quit IRC (Ping timeout: 370 seconds) |
02:41
π
|
|
dcmorton has joined #archiveteam-bs |
02:56
π
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
03:06
π
|
|
dcmorton has quit IRC (Ping timeout: 370 seconds) |
03:12
π
|
|
dcmorton has joined #archiveteam-bs |
03:21
π
|
|
Coderjoe has joined #archiveteam-bs |
03:21
π
|
|
nickname_ has joined #archiveteam-bs |
03:44
π
|
|
dcmorton has quit IRC (Excess Flood) |
03:47
π
|
|
dcmorton has joined #archiveteam-bs |
04:04
π
|
|
JesseW has joined #archiveteam-bs |
04:16
π
|
|
dcmorton has quit IRC (Max SendQ exceeded) |
04:19
π
|
|
dcmorton has joined #archiveteam-bs |
04:40
π
|
|
DFJustin has quit IRC (Remote host closed the connection) |
04:42
π
|
|
DFJustin has joined #archiveteam-bs |
04:42
π
|
|
swebb sets mode: +o DFJustin |
04:43
π
|
|
nickname_ has quit IRC (Read error: Operation timed out) |
05:01
π
|
|
jut has joined #archiveteam-bs |
05:01
π
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
05:10
π
|
|
Sk1d has joined #archiveteam-bs |
05:11
π
|
|
hook54321 has quit IRC (Quit: Connection closed for inactivity) |
05:30
π
|
|
antomati_ has quit IRC (Ping timeout: 258 seconds) |
05:50
π
|
godane |
SketchCow: i'm up to 2015 with deadspin.com grab |
05:51
π
|
godane |
i'm uploading 2013 to 2015 right now of it |
05:51
π
|
godane |
i'm also grabbing 2016-01 to 2016-05 of deadspin.com |
05:52
π
|
JesseW |
godane: btw, we're working on grabbing GSOC web pages right now in #archivebot -- your help could probably be useful |
05:58
π
|
|
DoomTay has quit IRC (Quit: Page closed) |
06:11
π
|
|
Cameron_D has quit IRC (Ping timeout: 370 seconds) |
06:17
π
|
|
Cameron_D has joined #archiveteam-bs |
06:22
π
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
06:24
π
|
|
BlueMaxim has joined #archiveteam-bs |
06:25
π
|
|
aschmitz has quit IRC (Read error: Operation timed out) |
06:48
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
06:58
π
|
|
aschmitz has joined #archiveteam-bs |
07:18
π
|
|
vtyl has joined #archiveteam-bs |
07:18
π
|
|
lytv has quit IRC (Ping timeout: 258 seconds) |
07:58
π
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
08:25
π
|
|
schbirid has joined #archiveteam-bs |
09:01
π
|
|
antomatic has joined #archiveteam-bs |
09:01
π
|
|
swebb sets mode: +o antomatic |
09:02
π
|
|
antomatic has quit IRC (Client Quit) |
09:09
π
|
|
antomatic has joined #archiveteam-bs |
09:09
π
|
|
swebb sets mode: +o antomatic |
09:16
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
09:20
π
|
|
dashcloud has joined #archiveteam-bs |
10:45
π
|
|
anjacks0n has joined #archiveteam-bs |
10:53
π
|
|
anjacks0n has quit IRC (anjacks0n) |
11:15
π
|
|
signius has quit IRC (Ping timeout: 260 seconds) |
11:22
π
|
|
signius has joined #archiveteam-bs |
11:40
π
|
|
anjacks0n has joined #archiveteam-bs |
11:41
π
|
|
anjacks0n has quit IRC (Client Quit) |
11:48
π
|
|
hook54321 has joined #archiveteam-bs |
12:24
π
|
|
jut has quit IRC (Read error: Connection reset by peer) |
12:37
π
|
|
anjacks0n has joined #archiveteam-bs |
12:53
π
|
|
Boppen has joined #archiveteam-bs |
12:58
π
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
13:02
π
|
|
anjacks0n has quit IRC (anjacks0n) |
13:23
π
|
|
anjacks0n has joined #archiveteam-bs |
13:41
π
|
|
anjacks0n has quit IRC (anjacks0n) |
13:42
π
|
|
anjacks0n has joined #archiveteam-bs |
13:47
π
|
|
kristian_ has joined #archiveteam-bs |
13:47
π
|
|
anjacks0n has quit IRC (anjacks0n) |
13:48
π
|
|
anjacks0n has joined #archiveteam-bs |
13:49
π
|
|
anjacks0n has quit IRC (Client Quit) |
13:50
π
|
|
anjacks0n has joined #archiveteam-bs |
13:50
π
|
|
anjacks0n has quit IRC (Client Quit) |
13:54
π
|
SketchCow |
So, don't spread to social media or post anywhere... |
13:54
π
|
SketchCow |
...there's a new beta version of the next iteration of the Wayback machine. |
13:57
π
|
SketchCow |
https://web-beta.archive.org |
13:57
π
|
SketchCow |
Please consider yourselves invited to bang the living shit out of it. |
13:58
π
|
SketchCow |
If you hit something SUPER broken, mail Mark at mark@archive.org. |
13:58
π
|
SketchCow |
He's head of Wayback |
14:00
π
|
Atluxity |
cool |
14:06
π
|
HCross |
it shows the source of the crawl, that is awesome. https://wayback-beta.archive.org/web/20160312075544/http://www.whtimes.co.uk/home :) |
14:07
π
|
Frogging |
is that good? it may lead to people going after WARCs to get them darked |
14:08
π
|
HCross |
wouldn't they just contact the IA and go "delete xxxx.co.uk please" |
14:08
π
|
HCross |
anyway, without the warc |
14:09
π
|
Frogging |
eh, maybe they'd just throw robots.txt at it |
14:09
π
|
Frogging |
dunno, just idle speculation :p |
14:11
π
|
|
hook54321 has quit IRC (Quit: Connection closed for inactivity) |
14:13
π
|
|
DoomTay has joined #archiveteam-bs |
14:17
π
|
SketchCow |
For everyone who wishes they could look at 1,500 of my Japan photos: https://www.flickr.com/photos/textfiles/albums/72157669136764700 |
14:18
π
|
|
anjacks0n has joined #archiveteam-bs |
14:25
π
|
DoomTay |
So how's that GCI sweeping going? |
14:37
π
|
|
j08nY has joined #archiveteam-bs |
14:38
π
|
|
anjacks0n has quit IRC (anjacks0n) |
14:39
π
|
|
nickname_ has joined #archiveteam-bs |
15:20
π
|
|
VADemon has joined #archiveteam-bs |
15:40
π
|
|
Kenshin has quit IRC (Remote host closed the connection) |
15:46
π
|
|
JesseW has joined #archiveteam-bs |
15:48
π
|
|
kristian_ has quit IRC (Leaving) |
15:48
π
|
|
nickname_ has quit IRC (Read error: Operation timed out) |
15:50
π
|
|
Kenshin has joined #archiveteam-bs |
15:53
π
|
|
nickname_ has joined #archiveteam-bs |
16:11
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
16:13
π
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
16:16
π
|
|
nickname_ has quit IRC (Read error: Connection reset by peer) |
16:21
π
|
|
RichardG has joined #archiveteam-bs |
17:17
π
|
joepie91 |
SketchCow: minor UI bug that makes it nigh impossible to click the "why" items in the domain timeline because the chart is overlapping it.... should I report that via that email as well, or is that just for severe functionality breakage? |
17:22
π
|
joepie91 |
meh, found some breakage, I'll just combine it into one email |
17:23
π
|
|
metalcamp has joined #archiveteam-bs |
17:24
π
|
joepie91 |
hm. Google is not visible due to robots.txt, in the current wayback machine? wut? |
17:25
π
|
|
Stilett0 has quit IRC () |
17:30
π
|
DoomTay |
Works fine for me |
17:34
π
|
* |
joepie91 plays QA engineer |
17:34
π
|
joepie91 |
up to 6 issues: 2 functionality issues, 2 UI quirks, 2 possible UI improvements |
17:52
π
|
|
VADemon has quit IRC (Quit: left4dead) |
17:53
π
|
|
VADemon has joined #archiveteam-bs |
18:05
π
|
|
JW_work has quit IRC (Quit: Leaving.) |
18:07
π
|
|
mutoso has quit IRC (Read error: Operation timed out) |
18:08
π
|
|
JW_work has joined #archiveteam-bs |
18:09
π
|
|
mutoso has joined #archiveteam-bs |
18:23
π
|
|
ris has joined #archiveteam-bs |
18:27
π
|
luckcolor |
guys just installed latest update of wpull 2.0.1 |
18:27
π
|
luckcolor |
Traceback (most recent call last): |
18:27
π
|
luckcolor |
File "/usr/local/bin/grab-site", line 4, in <module> |
18:27
π
|
luckcolor |
main.main() |
18:27
π
|
luckcolor |
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 716, in __call__ |
18:27
π
|
luckcolor |
return self.main(*args, **kwargs) |
18:27
π
|
luckcolor |
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 696, in main |
18:27
π
|
luckcolor |
rv = self.invoke(ctx) |
18:28
π
|
luckcolor |
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 889, in invoke |
18:28
π
|
luckcolor |
return ctx.invoke(self.callback, **ctx.params) |
18:28
π
|
luckcolor |
File "/usr/local/lib/python3.4/site-packages/click/core.py", line 534, in invoke |
18:28
π
|
luckcolor |
return callback(*args, **kwargs) |
18:28
π
|
luckcolor |
File "/usr/local/lib/python3.4/site-packages/libgrabsite/main.py", line 359, in main |
18:28
π
|
luckcolor |
from wpull.app import Application |
18:28
π
|
luckcolor |
ImportError: No module named 'wpull.app' |
18:28
π
|
luckcolor |
any ideas? |
18:28
π
|
luckcolor |
already tried to re install it |
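A quick way to narrow down an ImportError like the one in the traceback above is to ask Python which of the two names it can locate. This is only a diagnostic sketch: `module_available` is a hypothetical helper name, and it makes no claim about which wpull releases actually ship `wpull.app`.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if Python can locate `dotted_name`.

    The module itself is not imported, but its parent packages are.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. "wpull") is itself missing.
        return False


if __name__ == "__main__":
    # Names taken from the traceback; the result depends on the local install.
    for name in ("wpull", "wpull.app"):
        print(name, "found" if module_available(name) else "missing")
```

If `wpull` is found but `wpull.app` is not, the package is installed but no longer has the layout the importing code expects, which reinstalling the same version would not change.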
18:43
π
|
Meroje |
leftover .pyc files ? |
18:44
π
|
luckcolor |
mmh |
18:44
π
|
luckcolor |
i'll do a rm -r *.pyc |
18:44
π
|
Meroje |
this is not recursive |
18:45
π
|
Meroje |
I usually do `find . -name '*.pyc' -delete` |
18:46
π
|
Meroje |
(I missed the -r in your command, sorry) |
18:47
π
|
luckcolor |
nope |
18:47
π
|
luckcolor |
doesn't work |
18:47
π
|
luckcolor |
maybe it's grabsite? |
18:48
π
|
luckcolor |
i'll reboot archivebot and see if it has errors |
18:53
π
|
|
arrith has quit IRC (Read error: Operation timed out) |
19:03
π
|
luckcolor |
ok archivebot works |
19:03
π
|
luckcolor |
which means it's a grab.site bug |
19:03
π
|
luckcolor |
*grab-site |
19:05
π
|
luckcolor |
will file an issue; if someone can make a patch soon it would be amazing |
19:06
π
|
godane |
SketchCow: deadspin.com is up to 2016-05 now and all uploaded |
19:06
π
|
godane |
i'm grabbing gizmodo.com right now |
19:11
π
|
luckcolor |
https://github.com/ludios/grab-site/issues/92 |
19:11
π
|
luckcolor |
for whoever is interested |
19:13
π
|
joepie91 |
[20:46] <Meroje> (I missed the -r in your command, sorry) |
19:13
π
|
joepie91 |
still wouldn't make it recursive I think |
19:14
π
|
joepie91 |
since you're only selecting *.pyc and not all folders |
19:14
π
|
joepie91 |
you'd need something like... **/*.pyc? |
19:14
π
|
Meroje |
yeah I thought of the expansion after that |
19:14
π
|
joepie91 |
("zero or more path segments containing anything, followed by *.pyc") |
19:15
π
|
luckcolor |
anyways that wasn't the problem |
19:15
π
|
luckcolor |
well |
19:15
π
|
luckcolor |
for what i know ofc |
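For reference, the recursive cleanup discussed above can also be done from Python. `rm -r *.pyc` only matches names in the current directory because the shell expands the glob before `rm` runs, whereas `find . -name '*.pyc' -delete` descends into subdirectories. The sketch below mirrors the `find` behaviour; `delete_pyc` is a hypothetical helper name, not part of grab-site or wpull.

```python
from pathlib import Path


def delete_pyc(root):
    """Delete every *.pyc file under `root`, at any depth.

    Returns the list of paths that were removed.
    """
    removed = []
    # rglob("*.pyc") matches in `root` and in all of its subdirectories,
    # much like the shell pattern **/*.pyc. The matches are collected
    # first so that nothing is deleted while the tree is being walked.
    for path in list(Path(root).rglob("*.pyc")):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

Calling `delete_pyc(".")` from the top of a source tree would remove stale bytecode in every package directory below it.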
19:49
π
|
|
Stiletto has joined #archiveteam-bs |
20:09
π
|
DoomTay |
Okay, I started getting WARCs of a site that is going to change on the 20th. What's the next step? How do I get these into Wayback Machine? |
20:12
π
|
|
tomwsmf-a has joined #archiveteam-bs |
20:15
π
|
|
schbirid has quit IRC (Quit: Leaving) |
21:06
π
|
arkiver |
DoomTay: what site |
21:06
π
|
DoomTay |
portalgraphics.net |
21:07
π
|
DoomTay |
Yes, I sicced ArchiveBot on that site twice, though I remember that yipdw kinda threw a fit the second time |
21:08
π
|
DoomTay |
Plus if there's another advantage to the way I'm doing it now, cookie injection means the site will always come out in English |
21:09
π
|
arkiver |
why not just use /en/ for english? |
21:10
π
|
DoomTay |
Hmm..lemme try that real quick.... |
21:12
π
|
DoomTay |
Agh, knew it. Did no good for http://www.portalgraphics.net/pg/illust/?image_id=90308. Putting "&lang=en" did no good either |
21:14
π
|
arkiver |
What cookie are you using |
21:14
π
|
arkiver |
(haven't had a look at them yet) |
21:14
π
|
DoomTay |
langset=en |
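The cookie injection mentioned above amounts to sending a `Cookie` header with each request. A minimal sketch, assuming only the cookie name and value quoted in the chat (`langset=en`); `build_request` is a hypothetical helper, and whether the site honours the cookie on every page is not confirmed here.

```python
import urllib.request


def build_request(url, lang="en"):
    """Build (but do not send) a GET request carrying the language cookie."""
    request = urllib.request.Request(url)
    # Cookie name and value come from the chat ("langset=en").
    request.add_header("Cookie", "langset={}".format(lang))
    return request


if __name__ == "__main__":
    req = build_request("http://www.portalgraphics.net/pg/illust/?image_id=90308")
    print(req.full_url)
    print(req.get_header("Cookie"))
```

The request object can then be passed to `urllib.request.urlopen`; building it separately makes it easy to inspect the header before anything is fetched.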
21:17
π
|
|
decay has quit IRC (Read error: Operation timed out) |
21:17
π
|
|
decay has joined #archiveteam-bs |
21:17
π
|
|
Lord_Nigh has quit IRC (Read error: Operation timed out) |
21:17
π
|
arkiver |
And why English? It looks like the website wants to serve Japanese by default. |
21:18
π
|
arkiver |
so I'm not sure if you'd want to force english |
21:18
π
|
|
Lord_Nigh has joined #archiveteam-bs |
21:18
π
|
arkiver |
maybe do both |
21:18
π
|
arkiver |
Also, a normal grab probably won't grab http://www.portalgraphics.net/pg/illust/?image_id=90301 correctly |
21:19
π
|
|
ring has quit IRC (Read error: Operation timed out) |
21:19
π
|
|
luckcolor has quit IRC (Read error: Operation timed out) |
21:19
π
|
|
SilSte has quit IRC (Read error: Operation timed out) |
21:19
π
|
|
j08nY has quit IRC (Read error: Operation timed out) |
21:19
π
|
|
MrRadar has quit IRC (Read error: Operation timed out) |
21:19
π
|
arkiver |
it looks like the flash player loads http://www.portalgraphics.net/pg/movie/address.php?image%5Fid=90301 |
21:19
π
|
DoomTay |
Oh, right |
21:19
π
|
|
MrRadar has joined #archiveteam-bs |
21:19
π
|
|
luckcolor has joined #archiveteam-bs |
21:19
π
|
DoomTay |
Hmm... |
21:19
π
|
|
chazchaz_ has quit IRC (Read error: Operation timed out) |
21:19
π
|
arkiver |
which contains info, movie and image |
21:19
π
|
|
chazchaz has joined #archiveteam-bs |
21:19
π
|
DoomTay |
Apart from that, language selection seems to be random |
21:20
π
|
DoomTay |
Hell, I don't know if it would even be possible to save both |
21:20
π
|
|
Fletcher has quit IRC (Read error: Operation timed out) |
21:20
π
|
DoomTay |
And have them both on Wayback Machine |
21:20
π
|
|
alfie has quit IRC (Read error: Operation timed out) |
21:20
π
|
DoomTay |
Well, at least wget could pull them both |
21:20
π
|
arkiver |
It is possible. But language selection in the Wayback Machine would be 'random' too |
21:20
π
|
arkiver |
But it doesn't save the video items correctly |
21:21
π
|
|
brayden has quit IRC (Read error: Operation timed out) |
21:21
π
|
|
alfie has joined #archiveteam-bs |
21:21
π
|
|
Fletcher has joined #archiveteam-bs |
21:21
π
|
|
Baljem_ has quit IRC (Read error: Connection reset by peer) |
21:22
π
|
|
ring has joined #archiveteam-bs |
21:22
π
|
|
SilSte has joined #archiveteam-bs |
21:22
π
|
|
Baljem has joined #archiveteam-bs |
21:23
π
|
|
joepie91 has quit IRC (Excess Flood) |
21:23
π
|
DoomTay |
At least we know the movie file is at http://www.portalgraphics.net/data/movie/90000/90301.mp4 |
21:23
π
|
DoomTay |
The URL would be pretty easy to guess for others |
21:24
π
|
DoomTay |
I could probably fix the "not accessed properly" part on Wayback Machine with a userscript when it comes time |
21:24
π
|
arkiver |
the movie path is in http://www.portalgraphics.net/pg/movie/address.php?image%5Fid=90301 |
21:24
π
|
arkiver |
<uri type="movie" href="http://www.portalgraphics.net/pg/movie/movie.php?movie_path=90000/90301" /> |
21:24
π
|
arkiver |
http://www.portalgraphics.net/pg/movie/movie.php?movie_path=90000/90301 redirects to http://www.portalgraphics.net/data/movie/90000/90301.mp4 |
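The chain described above (address.php lists a `<uri type="movie">` element, whose `movie.php?movie_path=...` link redirects to the `.mp4`) can be sketched as follows. Assumptions: the address.php response is well-formed XML without namespaces, the `<uri>` elements carry `type` and `href` attributes as in the line quoted in the chat, and the redirect target always has the form `/data/movie/<movie_path>.mp4`, which is based on the single example given. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

BASE = "http://www.portalgraphics.net"


def address_url(image_id):
    """URL the Flash player reportedly loads for a given image id."""
    return "{}/pg/movie/address.php?image%5Fid={}".format(BASE, image_id)


def extract_uris(xml_text):
    """Map each <uri> element's `type` attribute to its `href`."""
    root = ET.fromstring(xml_text)
    return {el.get("type"): el.get("href") for el in root.iter("uri")}


def movie_file_url(movie_href):
    """Derive the direct .mp4 URL from a movie.php?movie_path=... link.

    ASSUMPTION: follows the one redirect observed in the chat.
    """
    query = parse_qs(urlsplit(movie_href).query)
    movie_path = query["movie_path"][0]
    return "{}/data/movie/{}.mp4".format(BASE, movie_path)
```

Reading `movie_path` out of the XML avoids having to guess how the directory part of the path (`90000` in the example) is derived from the image id.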
21:25
π
|
|
joepie91 has joined #archiveteam-bs |
21:25
π
|
|
midas sets mode: +o joepie91 |
21:29
π
|
arkiver |
also, URLs like http://www.portalgraphics.net/pg/movie/pg_player/res_movie_data.php?mid=90301&lang=en won't be extracted by wget or wpull from http://www.portalgraphics.net/pg/illust/?image_id=90301 |
21:29
π
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
21:29
π
|
arkiver |
they also contain URLs that should be grabbed |
21:32
π
|
|
Stiletto has quit IRC (Ping timeout: 260 seconds) |
21:33
π
|
DoomTay |
Hmm |
21:38
π
|
|
Aranje has joined #archiveteam-bs |
21:51
π
|
|
hook54321 has joined #archiveteam-bs |
21:56
π
|
|
Stiletto has joined #archiveteam-bs |
22:05
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
22:16
π
|
HCross |
https://en.wikipedia.org/wiki/Wikipedia:TWA/Portal this is pretty cool |
22:21
π
|
DoomTay |
Huh, there's http://www.portalgraphics.net/lang.php?lang=en&url=http://www.portalgraphics.net/pg/illust/?image_id=90301 |
22:22
π
|
DoomTay |
Okay, never mind, that completely failed |
22:37
π
|
DoomTay |
Still, getting that "information page" and fullsize image for each thing would be miles better than nothing |
22:39
π
|
|
JesseW has joined #archiveteam-bs |
22:46
π
|
DoomTay |
Besides, the URL for each of those can be guessed easily for other images |
22:46
π
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
22:51
π
|
arkiver |
when is portalgraphics closing |
23:10
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
23:18
π
|
|
BlueMaxim has joined #archiveteam-bs |
23:20
π
|
|
DoomTay has quit IRC (Ping timeout: 268 seconds) |
23:39
π
|
|
DoomTay has joined #archiveteam-bs |
23:39
π
|
DoomTay |
Well, it's not closing per se, but on 7/20, they will be deleting account user data and associated data, according to http://www.portalgraphics.net/pg/guide/news20160520.html |