Time |
Nickname |
Message |
01:09
🔗
|
mr_archiv |
I have some very interesting JSON files from my attempt to archive ctgmusic.com before it went down. They contain URLs that are not in the Wayback Machine because web scrapers could not get past the link that redirects to the real download link. |
01:10
🔗
|
|
Tenebrae has quit IRC (Read error: Operation timed out) |
01:10
🔗
|
mr_archiv |
http://163.172.39.176/ctgmusic/songs.json |
01:10
🔗
|
mr_archiv |
http://163.172.39.176/ctgmusic/artists.json |
01:11
🔗
|
mr_archiv |
http://163.172.39.176/ctgmusic/attemptedURLs.json |
01:11
🔗
|
|
Tenebrae has joined #archiveteam-ot |
01:11
🔗
|
mr_archiv |
I made these a while ago with a Python script and forgot about them. |
01:14
🔗
|
mr_archiv |
Note that this will not be a permanent link; I plan to take it down once the people who are interested have downloaded it. |
02:08
🔗
|
|
vectr0n has quit IRC (Quit: ZNC - https://znc.in) |
02:11
🔗
|
|
vectr0n has joined #archiveteam-ot |
03:08
🔗
|
|
odemg has quit IRC (Ping timeout: 260 seconds) |
03:20
🔗
|
|
odemg has joined #archiveteam-ot |
04:43
🔗
|
|
Muad-Dib has quit IRC (Ping timeout: 260 seconds) |
04:49
🔗
|
|
Muad-Dib has joined #archiveteam-ot |
07:50
🔗
|
|
jut has quit IRC (west.us.hub irc.Prison.NET) |
07:50
🔗
|
|
SketchCow has quit IRC (west.us.hub irc.Prison.NET) |
07:50
🔗
|
|
moufu has quit IRC (west.us.hub irc.Prison.NET) |
07:51
🔗
|
|
ivan` has joined #archiveteam-ot |
07:51
🔗
|
|
Frogging has quit IRC (Ping timeout: 246 seconds) |
07:51
🔗
|
|
JAA has quit IRC (Ping timeout: 246 seconds) |
07:51
🔗
|
|
ivan has quit IRC (Ping timeout: 246 seconds) |
07:55
🔗
|
|
jspiros has quit IRC (Ping timeout: 492 seconds) |
07:56
🔗
|
|
Frogging has joined #archiveteam-ot |
07:56
🔗
|
|
jut has joined #archiveteam-ot |
07:56
🔗
|
|
SketchCow has joined #archiveteam-ot |
07:56
🔗
|
|
moufu has joined #archiveteam-ot |
07:56
🔗
|
|
irc.Prison.NET sets mode: +o SketchCow |
08:03
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
08:03
🔗
|
|
ivan` has quit IRC (Read error: Operation timed out) |
08:03
🔗
|
|
vectr0n_ has joined #archiveteam-ot |
08:03
🔗
|
|
tyzoid has quit IRC (Read error: Operation timed out) |
08:03
🔗
|
|
C4K3 has quit IRC (Read error: Operation timed out) |
08:04
🔗
|
|
sep332 has quit IRC (Read error: Operation timed out) |
08:04
🔗
|
|
djsundog has quit IRC (Read error: Operation timed out) |
08:06
🔗
|
|
BlueMax has joined #archiveteam-ot |
08:08
🔗
|
|
vectr0n has quit IRC (Read error: Operation timed out) |
08:08
🔗
|
|
vectr0n_ has quit IRC (Quit: ZNC - https://znc.in) |
08:10
🔗
|
|
vectr0n has joined #archiveteam-ot |
08:18
🔗
|
|
tyzoid has joined #archiveteam-ot |
08:18
🔗
|
|
C4K3 has joined #archiveteam-ot |
08:20
🔗
|
|
ivan has joined #archiveteam-ot |
08:21
🔗
|
|
svchfoo1 sets mode: +o ivan |
08:50
🔗
|
|
jspiros has joined #archiveteam-ot |
08:50
🔗
|
|
JAA has joined #archiveteam-ot |
08:50
🔗
|
|
svchfoo3 sets mode: +o JAA |
08:51
🔗
|
|
bakJAA sets mode: +o JAA |
09:02
🔗
|
|
sep332 has joined #archiveteam-ot |
09:02
🔗
|
|
djsundog has joined #archiveteam-ot |
09:40
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
10:52
🔗
|
sun_shine |
mr_archiv, what did you use to create this JSON / do the archiving? I'm dealing with a similar problem right now |
11:19
🔗
|
|
Jens has quit IRC (Remote host closed the connection) |
11:20
🔗
|
|
Jens has joined #archiveteam-ot |
11:39
🔗
|
|
adinbied_ has joined #archiveteam-ot |
11:39
🔗
|
|
adinbied has quit IRC (Ping timeout: 260 seconds) |
13:32
🔗
|
|
icedice has joined #archiveteam-ot |
13:40
🔗
|
|
jut has quit IRC (Remote host closed the connection) |
13:42
🔗
|
|
jut has joined #archiveteam-ot |
14:07
🔗
|
|
adinbied_ is now known as adinbied |
14:24
🔗
|
|
apache2 has quit IRC (Remote host closed the connection) |
14:27
🔗
|
|
apache2 has joined #archiveteam-ot |
14:38
🔗
|
|
Stilett0 has quit IRC (Ping timeout: 268 seconds) |
16:07
🔗
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
16:08
🔗
|
|
Mateon1 has joined #archiveteam-ot |
16:21
🔗
|
|
sun_shine has quit IRC (Read error: Operation timed out) |
16:36
🔗
|
|
bitBaron has joined #archiveteam-ot |
16:53
🔗
|
|
ivan has quit IRC (Read error: Operation timed out) |
16:53
🔗
|
|
mal has quit IRC (Read error: Operation timed out) |
16:53
🔗
|
|
sep332 has quit IRC (Read error: Operation timed out) |
16:53
🔗
|
|
djsundog has quit IRC (Read error: Operation timed out) |
16:53
🔗
|
|
adinbied has quit IRC (Read error: Operation timed out) |
16:53
🔗
|
|
faolingf_ has joined #archiveteam-ot |
16:53
🔗
|
|
ivan has joined #archiveteam-ot |
16:53
🔗
|
|
svchfoo3 sets mode: +o ivan |
16:54
🔗
|
|
tyzoid has quit IRC (Read error: Operation timed out) |
16:55
🔗
|
|
adinbied has joined #archiveteam-ot |
16:57
🔗
|
|
C4K3 has quit IRC (Read error: Operation timed out) |
16:57
🔗
|
|
faolingfa has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
faolingf_ has quit IRC (Quit: Leaving) |
17:05
🔗
|
|
wp494 has quit IRC (Ping timeout: 252 seconds) |
17:05
🔗
|
|
wp494 has joined #archiveteam-ot |
17:06
🔗
|
|
faolingf_ has joined #archiveteam-ot |
17:40
🔗
|
|
jrwr_ has joined #archiveteam-ot |
17:46
🔗
|
|
chfoo-_ has joined #archiveteam-ot |
17:47
🔗
|
|
jrwr has quit IRC (Ping timeout: 633 seconds) |
17:47
🔗
|
|
jrwr_ is now known as jrwr |
17:47
🔗
|
|
bithippo has joined #archiveteam-ot |
17:47
🔗
|
|
chfoo has quit IRC (Ping timeout: 633 seconds) |
17:48
🔗
|
|
bithippo has quit IRC (Client Quit) |
18:14
🔗
|
|
caff has joined #archiveteam-ot |
19:19
🔗
|
|
sep332 has joined #archiveteam-ot |
19:23
🔗
|
|
tyzoid has joined #archiveteam-ot |
19:23
🔗
|
|
C4K3 has joined #archiveteam-ot |
19:24
🔗
|
|
djsundog has joined #archiveteam-ot |
19:28
🔗
|
|
mal has joined #archiveteam-ot |
20:25
🔗
|
|
betamax_ is now known as betamax |
20:26
🔗
|
betamax |
Not anything to archive, but Google are shutting down Inbox in March next year |
20:26
🔗
|
betamax |
(their own alternative to the Gmail app) |
20:26
🔗
|
betamax |
https://www.theverge.com/2018/9/12/17848500/google-inbox-shut-down-sunset-snooze-email-march-2019 |
20:32
🔗
|
ivan |
mobile applications, javascript blobs |
21:08
🔗
|
|
icedice has quit IRC (Quit: Leaving) |
21:12
🔗
|
|
svchfoo3 has quit IRC (Read error: Operation timed out) |
21:13
🔗
|
|
svchfoo3 has joined #archiveteam-ot |
21:13
🔗
|
|
svchfoo1 sets mode: +o svchfoo3 |
21:24
🔗
|
|
kiska has quit IRC (hub.dk irc.underworld.no) |
21:24
🔗
|
|
Flashfire has quit IRC (hub.dk irc.underworld.no) |
21:24
🔗
|
|
w0rmhole has quit IRC (hub.dk irc.underworld.no) |
21:24
🔗
|
|
hook54321 has quit IRC (hub.dk irc.underworld.no) |
21:24
🔗
|
|
dxrt has quit IRC (hub.dk irc.underworld.no) |
21:24
🔗
|
|
rektide_ has quit IRC (hub.dk irc.underworld.no) |
21:27
🔗
|
|
dxrt has joined #archiveteam-ot |
21:27
🔗
|
|
rektide_ has joined #archiveteam-ot |
21:27
🔗
|
|
irc.underworld.no sets mode: +o dxrt |
21:38
🔗
|
|
hook54321 has joined #archiveteam-ot |
21:38
🔗
|
|
BlueMax has joined #archiveteam-ot |
22:12
🔗
|
|
Flashfire has joined #archiveteam-ot |
22:32
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
22:57
🔗
|
mr_archiv |
@sun_shine I will post the source code. I use the Session feature of the Python library requests, which keeps track of cookies across requests. |
22:58
🔗
|
mr_archiv |
The website also had a bug where it would fail if you visited too many links, so I had to reset the session every so often. |
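The approach described above (a cookie-tracking requests session that is thrown away and recreated every so often, used to follow a redirecting link to the real download URL) might look roughly like the sketch below. This is not the original scrapeSongsSmart.py. The names `ResettingSession`, `new_requests_session`, `resolve_download_url`, and the `max_requests` threshold of 50 are all hypothetical, and the sketch assumes the site redirects with ordinary HTTP redirects rather than JavaScript or a meta refresh.

```python
from typing import Callable


def new_requests_session():
    """Create a real requests.Session, which stores cookies between calls.

    requests is imported lazily so the wrapper below can also be driven
    by a stub session object that exposes get() and close().
    """
    import requests
    return requests.Session()


class ResettingSession:
    """Wrap a cookie-tracking session and replace it after N requests."""

    def __init__(self, max_requests: int = 50,
                 session_factory: Callable = new_requests_session):
        # max_requests is an assumed threshold, not a value from the chat.
        self.max_requests = max_requests
        self.session_factory = session_factory
        self.session = session_factory()
        self.count = 0   # requests made with the current session
        self.resets = 0  # how many times the session was replaced

    def get(self, url, **kwargs):
        # Once the current session has served max_requests requests,
        # discard it (and its cookies) and start a fresh one.
        if self.count >= self.max_requests:
            self.session.close()
            self.session = self.session_factory()
            self.count = 0
            self.resets += 1
        self.count += 1
        return self.session.get(url, **kwargs)


def resolve_download_url(client, link_url):
    """Follow the redirecting link and return the URL it finally lands on."""
    response = client.get(link_url, allow_redirects=True, timeout=30)
    return response.url
```

With a real network connection, `client = ResettingSession()` followed by `resolve_download_url(client, some_link)` would return the final URL after redirects, which could then be collected into a list and written out with `json.dump`.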
23:26
🔗
|
mr_archiv |
Here is the script used to generate the JSONs: 163.172.39.176/ctgmusic/scrapeSongsSmart.py |
23:31
🔗
|
|
caff has quit IRC (Read error: Connection reset by peer) |