Time |
Nickname |
Message |
00:02
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
00:18
🔗
|
bsmith093 |
sho: gpodder, it even has a cli shell with gpo |
00:18
🔗
|
bsmith093 |
literally what it was built to do |
00:19
🔗
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
00:30
🔗
|
|
bitBaron has quit IRC (Quit: My computer has gone to sleep. ZZZzzz…) |
00:31
🔗
|
zino |
godane: Good job |
00:35
🔗
|
|
drumstick has quit IRC (Read error: Operation timed out) |
00:52
🔗
|
odemg |
godane, https://www.reddit.com/r/DataHoarder/comments/6r3dc5/youtube_request_so_the_channel_that_one_video/dl2jow8 |
00:54
🔗
|
odemg |
godane, started uploading those playlists already; derive is taking its sweet time, so they are there but 'unpublished' so far |
01:00
🔗
|
odemg |
godane, last uploaded was: https://archive.org/history/youtube-20hZnSkhDgs |
01:10
🔗
|
|
schbirid2 has joined #archiveteam-bs |
01:13
🔗
|
|
Swizzle has joined #archiveteam-bs |
01:14
🔗
|
|
schbirid has quit IRC (Read error: Operation timed out) |
01:40
🔗
|
|
Asparagir has joined #archiveteam-bs |
01:46
🔗
|
|
j08nY has quit IRC (Quit: Leaving) |
01:57
🔗
|
|
Swizzle has quit IRC (Quit: Leaving) |
02:01
🔗
|
|
drumstick has joined #archiveteam-bs |
02:03
🔗
|
|
dboard has quit IRC (Remote host closed the connection!) |
02:04
🔗
|
|
dboard has joined #archiveteam-bs |
02:04
🔗
|
|
dboard has quit IRC (Read error: Connection reset by peer) |
02:12
🔗
|
|
dboard has joined #archiveteam-bs |
02:12
🔗
|
|
dboard has quit IRC (Connection closed) |
02:13
🔗
|
godane |
odemg: i found tons of Late Night with David Letterman on youtube |
02:13
🔗
|
odemg |
how much is tons |
02:14
🔗
|
godane |
https://www.youtube.com/channel/UCqkkzIyGnwkEShBIGYRRgqQ/videos |
02:15
🔗
|
odemg |
he's currently uploading too.. expect more!! |
02:15
🔗
|
godane |
that is at least 100 videos there |
02:15
🔗
|
godane |
i know |
02:17
🔗
|
odemg |
512 videos |
02:18
🔗
|
odemg |
https://pastebin.com/raw/stksw9Y7 |
02:20
🔗
|
|
dboard2 has joined #archiveteam-bs |
02:32
🔗
|
|
drumstick has quit IRC (Read error: Operation timed out) |
02:36
🔗
|
|
drumstick has joined #archiveteam-bs |
02:50
🔗
|
|
pizzaiolo has quit IRC (Quit: pizzaiolo) |
03:11
🔗
|
|
drumstick has quit IRC (Read error: Operation timed out) |
03:35
🔗
|
|
kristian_ has joined #archiveteam-bs |
04:17
🔗
|
|
kristian_ has quit IRC (Quit: Leaving) |
04:18
🔗
|
|
wabu has quit IRC (Read error: Operation timed out) |
04:33
🔗
|
|
wabu has joined #archiveteam-bs |
04:37
🔗
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
04:44
🔗
|
|
Sk1d has joined #archiveteam-bs |
05:04
🔗
|
|
drumstick has joined #archiveteam-bs |
06:15
🔗
|
|
Asparagir has quit IRC (Asparagir) |
06:17
🔗
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
06:17
🔗
|
|
Mateon1 has joined #archiveteam-bs |
06:23
🔗
|
|
qw3rty14 has joined #archiveteam-bs |
06:27
🔗
|
|
qw3rty13 has quit IRC (Read error: Operation timed out) |
06:43
🔗
|
|
odemg has quit IRC (Read error: Operation timed out) |
06:49
🔗
|
|
drumstick has quit IRC (Read error: Operation timed out) |
06:52
🔗
|
|
REiN^ has joined #archiveteam-bs |
06:56
🔗
|
|
odemg has joined #archiveteam-bs |
07:44
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
07:45
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
07:49
🔗
|
|
drumstick has joined #archiveteam-bs |
08:39
🔗
|
|
drumstick has quit IRC (Ping timeout: 633 seconds) |
08:57
🔗
|
godane |
leffi: i have a problem with the youtube comment downloader not accepting youtube ids that start with a dash (-) |
08:57
🔗
|
godane |
-- doesn't work |
08:57
🔗
|
godane |
\ doesn't work |
08:57
🔗
|
godane |
" and ' don't work |
09:00
🔗
|
schbirid2 |
godane: do you run it in a linux shell? |
09:01
🔗
|
schbirid2 |
if so, try putting a single - in front of the id and make the id the last thing on your line, e.g. "youtube-dl -this --that - -ABCA" if "-ABCA" were such an id |
09:02
🔗
|
|
drumstick has joined #archiveteam-bs |
09:10
🔗
|
godane |
python downloader.py --youtube-dl -KaK2SOsiw4 --output -KaK2SOsiw4.json |
09:10
🔗
|
godane |
i'm using this: https://github.com/egbertbouman/youtube-comment-downloader |
09:11
🔗
|
schbirid2 |
ah crap, two times the id |
09:13
🔗
|
schbirid2 |
ah, won't work here |
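The leading-dash problem discussed above is typical of Python's `argparse`: a value such as `-KaK2SOsiw4` looks like another option, so the preceding flag appears to have no argument. A minimal sketch follows, assuming a hypothetical parser with `--youtubeid` and `--output` flags (these names are illustrative and not taken from `downloader.py`). It shows that attaching the value with `=` lets argparse accept an id that starts with a dash.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration only; the flag names are
    # assumptions, not copied from the real downloader.py.
    parser = argparse.ArgumentParser(prog="downloader-sketch")
    parser.add_argument("--youtubeid", help="ID of the video")
    parser.add_argument("--output", help="output file name")
    return parser


def parse(argv):
    """Return (youtubeid, output), or None if argparse rejects argv."""
    try:
        args = build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage error to stderr and exits when a
        # flag seems to be missing its value.
        return None
    return args.youtubeid, args.output


if __name__ == "__main__":
    video_id = "-KaK2SOsiw4"
    # Space-separated form: the id is mistaken for an option.
    print(parse(["--youtubeid", video_id, "--output", video_id + ".json"]))
    # "=" form: the id is bound to the flag explicitly.
    print(parse(["--youtubeid=" + video_id, "--output=" + video_id + ".json"]))
```

If the real script is built on argparse, the equivalent shell invocation would use the `--flag=value` form for both the id and the output name; whether that applies to this particular tool is an assumption.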
10:00
🔗
|
HCross2 |
Hmm. Anyone here good with Heritrix at all please? |
10:31
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
10:32
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
10:55
🔗
|
|
j08nY has joined #archiveteam-bs |
11:09
🔗
|
|
username1 has joined #archiveteam-bs |
11:14
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
11:28
🔗
|
|
drumstick has quit IRC (Ping timeout: 246 seconds) |
12:00
🔗
|
|
odemg has quit IRC (Read error: Operation timed out) |
12:16
🔗
|
|
odemg has joined #archiveteam-bs |
12:48
🔗
|
|
kristian_ has joined #archiveteam-bs |
12:53
🔗
|
|
Stiletti has quit IRC (Read error: Connection reset by peer) |
12:53
🔗
|
|
Stiletti has joined #archiveteam-bs |
13:12
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
13:18
🔗
|
|
j08nY has quit IRC (Quit: Leaving) |
13:38
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
13:38
🔗
|
|
Stiletti has joined #archiveteam-bs |
14:03
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
14:03
🔗
|
|
Stiletti has joined #archiveteam-bs |
14:30
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
14:30
🔗
|
|
Stiletti has joined #archiveteam-bs |
14:57
🔗
|
|
kristian_ has quit IRC (Quit: Leaving) |
16:03
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
16:03
🔗
|
|
Stiletti has joined #archiveteam-bs |
16:42
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
16:43
🔗
|
|
Stiletti has joined #archiveteam-bs |
16:52
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
16:53
🔗
|
|
pizzaiolo has left |
17:11
🔗
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
17:12
🔗
|
|
Mateon1 has joined #archiveteam-bs |
17:14
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
17:24
🔗
|
|
JensRex has quit IRC (Remote host closed the connection) |
17:24
🔗
|
|
JensRex has joined #archiveteam-bs |
18:08
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
18:08
🔗
|
|
Stiletti has joined #archiveteam-bs |
18:11
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 250 seconds) |
18:34
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
18:34
🔗
|
|
Stiletti has joined #archiveteam-bs |
19:29
🔗
|
|
Aranje has joined #archiveteam-bs |
19:42
🔗
|
|
Asparagir has joined #archiveteam-bs |
19:50
🔗
|
mundus |
Anyone have a suggested tool for crawling urls off sites? |
19:59
🔗
|
Asparagir |
Do you mean crawling links that a website *links to*? Or only the site itself, plus its outbound links? If the latter, try wpull. |
20:01
🔗
|
mundus |
the latter |
20:02
🔗
|
mundus |
Okay |
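For the distinction drawn above between a site's own pages and its outbound links, here is a rough standard-library sketch. It is not wpull and does not crawl. It only splits the links found in one page of HTML into internal and outbound lists. The class and function names (`LinkCollector`, `split_links`) and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def split_links(base_url, html):
    """Return (internal, outbound) link lists for one page of HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    internal = [u for u in collector.links if urlparse(u).netloc == host]
    outbound = [u for u in collector.links if urlparse(u).netloc != host]
    return internal, outbound


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="https://archive.org/">IA</a>'
    print(split_links("https://example.com/", page))
```

A real crawl would also need fetching, deduplication, politeness delays and WARC output, which is what a dedicated tool such as wpull is for.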
20:17
🔗
|
|
Mateon1 has joined #archiveteam-bs |
20:34
🔗
|
|
pikhq has quit IRC (Ping timeout: 245 seconds) |
20:40
🔗
|
|
pikhq has joined #archiveteam-bs |
20:53
🔗
|
|
username1 is now known as schbirid |
21:09
🔗
|
|
Stiletti has quit IRC (Read error: Operation timed out) |
21:09
🔗
|
|
Stiletti has joined #archiveteam-bs |
21:53
🔗
|
JAA |
Asparagir: "don't need much computing power" -- I thought others (FalconK?) said before that pipelines were mainly CPU-bound? |
21:54
🔗
|
Asparagir |
I've been okay running the 4 GB memory / 60 GB disk space droplets on Digital Ocean. But bigger is better, especially if they happen to be running a lot of phantomjs jobs. |
21:54
🔗
|
Asparagir |
But you can't really control if you happen to get a lot of those phantomjs jobs or not. |
21:55
🔗
|
Asparagir |
Also, wpull has known (but still not patched) memory leaks. So you need a little wiggle room...and probably need to restart the whole shebang once every few months. |
21:58
🔗
|
JAA |
Hopefully more often for security updates |
21:58
🔗
|
JAA |
But yeah |
22:29
🔗
|
|
Administr has joined #archiveteam-bs |
22:33
🔗
|
|
HCross has quit IRC (Ping timeout: 268 seconds) |
22:43
🔗
|
|
Administr has quit IRC (Ping timeout: 268 seconds) |
22:44
🔗
|
|
HarryCros has joined #archiveteam-bs |
22:55
🔗
|
|
drumstick has joined #archiveteam-bs |
23:04
🔗
|
|
HCross has joined #archiveteam-bs |
23:05
🔗
|
|
HarryCros has quit IRC (Ping timeout: 268 seconds) |
23:31
🔗
|
|
HCross has quit IRC (Read error: Connection reset by peer) |
23:32
🔗
|
|
HarryCros has joined #archiveteam-bs |