[00:00] *** killsushi has joined #archiveteam-bs
[00:05] *** LowLevelM has joined #archiveteam-bs
[00:06] *** astrid has joined #archiveteam-bs
[00:06] *** Fusl sets mode: +o astrid
[00:07] Ok Fusl, How do you get 5Tbps? I assume that is spread across many servers. right?
[00:07] 500 servers each 10gbit
[00:07] 500 servers. What?
[00:08] That must be soo expensive
[00:08] ¯\_(ツ)_/¯
[00:09] -ot this?
[00:09] no more comments need to be made so no
[00:13] *** systwi has joined #archiveteam-bs
[00:20] Fusl: :O
[00:21] that is niiiiice
[00:54] *** qwebirc20 has quit IRC (Ping timeout: 261 seconds)
[00:54] *** LowLevelM has quit IRC (Ping timeout: 261 seconds)
[01:14] *** LowLevelM has joined #archiveteam-bs
[01:19] *** LowLevelM has quit IRC (Ping timeout: 260 seconds)
[01:38] *** LowLevelM has joined #archiveteam-bs
[01:38] *** HashbangI has quit IRC (Read error: Connection reset by peer)
[01:46] *** HashbangI has joined #archiveteam-bs
[02:31] *** BlueMax has joined #archiveteam-bs
[03:09] *** qw3rty118 has joined #archiveteam-bs
[03:15] *** qw3rty117 has quit IRC (Read error: Operation timed out)
[03:48] *** odemgi_ has joined #archiveteam-bs
[03:49] *** odemg has quit IRC (Read error: Operation timed out)
[03:50] *** odemgi has quit IRC (Read error: Operation timed out)
[04:04] *** odemg has joined #archiveteam-bs
[05:02] *** killsushi has quit IRC (Read error: Operation timed out)
[05:42] *** m007a83_ is now known as m007a83
[06:19] *** LowLevelM has quit IRC (Ping timeout: 260 seconds)
[06:58] *** schbirid has joined #archiveteam-bs
[07:11] *** Atom has quit IRC (Ping timeout: 252 seconds)
[07:12] *** Atom has joined #archiveteam-bs
[07:53] I'm now running a script to see how many of the mirrored youtube videos are missing from youtube.
[07:59] SketchCow: on a related note, have you noticed that youtube is now aggressively blocking/rate-limiting youtube-dl and similar downloaders such that mass mirroring of youtube channels is not easily possible anymore?
[08:01] https://torrentfreak.com/youtube-blocks-popular-mp3-stream-ripping-sites-190710/
[08:02] it wasnt as big a problem blocking the sites but now blocking the software itself
[08:27] *** MillerBOS has quit IRC (Read error: Connection reset by peer)
[08:27] *** pikami_ has quit IRC (Write error: Broken pipe)
[08:27] *** odemgi_ has quit IRC (Write error: Broken pipe)
[08:27] *** thejsa has quit IRC (Write error: Broken pipe)
[08:27] *** dashcloud has quit IRC (Write error: Broken pipe)
[08:27] *** m007a83_ has joined #archiveteam-bs
[08:27] *** benjinss has joined #archiveteam-bs
[08:27] *** odemgi_ has joined #archiveteam-bs
[08:27] *** benjinss has quit IRC (Read error: Connection reset by peer)
[08:27] *** MillerBOS has joined #archiveteam-bs
[08:28] *** thejsa has joined #archiveteam-bs
[08:28] *** dashcloud has joined #archiveteam-bs
[08:28] *** pikami has joined #archiveteam-bs
[08:29] *** benjinss has joined #archiveteam-bs
[08:33] *** stapler11 has quit IRC (Read error: Operation timed out)
[08:33] *** benjinsmi has quit IRC (Ping timeout: 604 seconds)
[08:33] *** m007a83 has quit IRC (Read error: Operation timed out)
[08:34] *** stapler11 has joined #archiveteam-bs
[08:40] *** Igloo has quit IRC (Read error: Operation timed out)
[08:40] *** Igloo has joined #archiveteam-bs
[08:44] *** LeG0ax has joined #archiveteam-bs
[08:45] *** RichardG has quit IRC (Read error: Operation timed out)
[08:45] *** RichardG has joined #archiveteam-bs
[08:46] *** Ing3b0rg has quit IRC (Ping timeout: 506 seconds)
[08:46] *** LeG0ax is now known as Ing3b0rg
[08:47] *** nyany has quit IRC (Read error: Operation timed out)
[08:48] *** svchfoo3 has quit IRC (Ping timeout: 506 seconds)
[08:49] *** eientei95 has quit IRC (Ping timeout: 506 seconds)
[08:49] *** PurpleSym has quit IRC (Read error: Operation timed out)
[08:49] *** purplebot has quit IRC (Read error: Operation timed out)
[08:49] *** pikami has quit IRC (Ping timeout: 506 seconds)
[08:50] *** pikami has joined #archiveteam-bs
[08:50] *** PurpleSym has joined #archiveteam-bs
[08:51] *** eientei95 has joined #archiveteam-bs
[08:51] *** eientei95 has quit IRC (Handshake flooding)
[08:53] *** h3ndr1k_ has joined #archiveteam-bs
[08:53] *** eientei95 has joined #archiveteam-bs
[08:53] *** eientei95 has quit IRC (Handshake flooding)
[08:54] *** h3ndr1k has quit IRC (Ping timeout: 740 seconds)
[08:56] *** eientei95 has joined #archiveteam-bs
[09:00] *** h3ndr1k_ is now known as h3ndr1k
[09:43] *** nyany has joined #archiveteam-bs
[09:44] *** purplebot has joined #archiveteam-bs
[09:44] *** svchfoo3 has joined #archiveteam-bs
[09:44] *** Fusl sets mode: +o svchfoo3
[10:10] *** betamax_ is now known as betamax
[10:22] *** deevious has joined #archiveteam-bs
[11:12] *** Raccoon has joined #archiveteam-bs
[11:28] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[11:46] JAA: where do you want the nratv stuff uploaded?
[12:39] fuzzy8021: you around?
[12:45] arkiver: fyi, i'm pulling flickr out of jrwr's storage now and soon doing the others as well so if you have anything still running that pulls data together from there, now is a good time to kill all of that
[12:46] Fusl_: Ah, right, NRATV. So you have ~20k WARCs and ~20k video files, right?
[12:46] 22188
[12:47] Probably best to coordinate this with IA.
[12:47] We'll want the video files as items I assume, with the appropriate metadata.
[12:47] do we need JS for this or do you have contact with people at IA?
[12:48] Not sure about the WARCs, either as they are or megawarcs I guess.
[12:48] they're currently not megawarced
[12:48] Jason's probably the guy for that. I haven't spoken with anyone about that.
[12:48] What do you need from IA Fusl?
[12:48] I can go poke the slack.
[12:48] We'll want an "NRATV" collection I think.
[12:49] ideally we want two i guess, one for the videos and one for the raw warc files that contains the videos
[12:50] Yeah, "NRATV" for the videos and "NRATV WARCs" for the WARCs?
[12:50] *** Mateon1 has quit IRC (Read error: Operation timed out)
[12:50] whatever is fine for them
[12:51] It would be even better if we could throw video and WARC in one item, but that doesn't work I think due to the mediatype.
[12:51] *** Mateon1 has joined #archiveteam-bs
[12:51] Will have to extract the metadata also. I'll look into that later.
[12:59] If you don't get a response from JS or arkiver etc I can ping the slack when we know what we want.
[13:26] sup Fusl_
[13:29] fuzzy8021: 95.216.12.47 is yours, right?
[13:29] yep
[13:30] do you need it?
[13:32] *** luckcolor has quit IRC (Ping timeout: 246 seconds)
[13:34] if you dont need it anymore, i'd like to take over the server into my hetzner account so you dont have to pay for it anymore
[13:37] sure why not. havent gotten around to using it yet
[14:36] Fusl: I don´t have anything pulling from there
[14:36] and thanks for working on it!
[14:38] *** deevious has quit IRC (Quit: deevious)
[15:18] *** luckcolor has joined #archiveteam-bs
[15:22] What up
[15:26] SketchCow: 22k NRATV videos, each has a video file and a WARC (containing the playlist and all video segments)
[15:27] *** Verified_ has quit IRC (Ping timeout: 252 seconds)
[15:27] Metadata isn't ready yet, but I think I have it somewhere.
[15:27] OK... so we want to make a collection? OK.
[15:27] Isn't some stuff up
[15:27] Yeah
[15:29] archiveteam_nratv now exists
[15:29] *** killsushi has joined #archiveteam-bs
[15:29] Is there a consistency of naming of what's already up I can use to shove them in?
[15:45] I don't think anything's uploaded yet. At least not from us.
[15:46] OK, so just upload them, I'll shove them into the collection when you're ready.
[15:46] Or someone can ping me with access requests
[15:46] But I set it up and gave it an NRATV bio and whee
[15:46] Fusl_: ^ (Or if you want me to do it, let me know.)
[15:46] So I tried an experiment that failed
[15:46] I want to take a Youtube iD and know if the video's gone or not.
[15:47] I can't find a consistent way to check.
[15:47] There MUST be something out there
[15:57] SketchCow: `test 200 == $(curl -sfo/dev/null -w '%{http_code}' "http://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=${ID}")`
[15:58] Damn, that's dense
[15:58] Is that bash?
[15:58] aye
[16:03] What are the possible outputs
[16:03] Because for me it outputs blank
[16:03] it will give an exit value of either 0 or 1
[16:03] so you can use it within an if-condition
[16:03] Not here
[16:03] or follow it with && echo $?
[16:04] ; echo $?
[16:04] er right
[16:04] && echo $? would only print if it succeeds
[16:04] I don't want to seem ungrateful
[16:04] computers.
[16:04] But man, that's dense
[16:05] Also, the whole endeavor is getting right into my face how much absolute horseshit people upload to the archive
[16:05] Which is not a mood lightener
[16:06] Oh, 5,000 hours of thai television..... thank you
[16:06] Especially with the 100%, complete and utter lack of metadata
[16:06] The robots after I'm dead will thank you
[16:07] BOB=`test 200 == $(curl -sfo/dev/null -w '%{http_code}' "http://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=${ID}");echo $?`;echo $BOB
[16:08] start a streaming service that requires viewers to fill out metadata for you
[16:09] for each in `ia search collection:archiveteam_youtube --itemlist`; do YT=`echo $each | sed 's/youtube-//g'`; FOF=`test 200 == $(curl -sfo/dev/null -w '%{http_code}' "http://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=${ID}");echo $?`;echo "$FOF"; if [ "$FOF" = "1" ]; then echo "$YT exists."; else echo "Oh no.... $YT is gone gone gone!"; echo "$each" >> deads.txt; fi;
[16:09] done
[16:09] What could possibly go wrong
[16:09] Be careful you don't get banned by YT
[16:09] Not sure how they're testing that.
[16:10] Oh no
[16:10] banned by YT
[16:10] What will I do
[16:10] How will I spend that free time
[16:10] It will break your script.
[16:11] That's all I was saying.
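The oEmbed check discussed above can be sketched as a pair of small bash functions. This is only a sketch under stated assumptions: the function names `yt_is_available` and `yt_check_ids` are hypothetical, and it assumes the oEmbed endpoint answers 200 for a video that is still up and something else otherwise. A non-200 answer may also mean rate limiting or a private video, so it is a rough signal at best. Note that the loop pasted at 16:09 puts the extracted ID in `YT` but queries `${ID}`, and treats an exit status of 1 (the `test` failing, i.e. not 200) as "exists"; the sketch below passes the ID explicitly and treats status 0 as available.

```shell
#!/usr/bin/env bash
# Sketch only: yt_is_available and yt_check_ids are hypothetical helper names,
# not part of any existing tool. Assumption: the oEmbed endpoint returns
# HTTP 200 for a video that is still up and a non-200 status otherwise.

# Exit status 0 if the oEmbed endpoint answers 200 for the given video ID, 1 otherwise.
yt_is_available() {
  local id=$1 code
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    "https://www.youtube.com/oembed?url=https://www.youtube.com/watch?v=${id}")
  [ "$code" = "200" ]
}

# Read video IDs on stdin, one per line, and print "<id> available"
# or "<id> unavailable" for each non-empty line.
yt_check_ids() {
  local id
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    if yt_is_available "$id"; then
      printf '%s available\n' "$id"
    else
      printf '%s unavailable\n' "$id"
    fi
  done
}
```

With the item list from the log, usage might look like `ia search collection:archiveteam_youtube --itemlist | sed 's/^youtube-//' | yt_check_ids`, ideally with a delay between requests given the ban concerns raised above.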
[16:11] for gods sake make JAA use mips for this! :P
[16:11] 0idOIGRrbHU exists.
[16:11] 1
[16:11] 0ikhVJCblnk exists.
[16:11] 1
[16:11] 0j6aV3YSue8 exists.
[16:12] I'm mostly interested in seeing how many of these are actually missing
[16:12] And how many are straight up mirrors
[16:12] I´m putting my money on 0.8%
[16:12] 0.3%
[16:13] I've only got to work from the de-indexed set
[16:13] NON-de-indexed
[16:13] I´m putting my second money on Fusl being correct
[16:13] arkiver: thats not how it works :P
[16:13] :)
[16:13] You bid one dollar over
[16:13] And fuck them
[16:13] (That's how the Price is Right works)
[16:14] im too young for this
[16:17] By the way - so far none are missing.
[16:18] I choose random youtube IDs to go make sure things are fine, and I have not been delighted at the video chosen to be mirrored.
[16:18] Which tells me they're not choosing. They're mirroring almost random things
[16:19] mirroring whatever they find personally interesting
[16:19] No, I don't think so
[16:19] No, no.
[16:19] although there´s exceptions among those people
[16:19] How many times have you been Rick Rolled?
[16:19] Not when you mirror 15,000 videos
[16:19] No, that's just high-spectrum grab-bag snowplowing through someone else's harddrives
[16:20] yeah true
[16:24] *** Raccoon has quit IRC (Ping timeout: 265 seconds)
[16:25] Yeah, so far, zero percent down.
[16:25] Waiting for my ban
[16:25] DO IT
[16:25] DOOOO IT
[16:25] i say 1.5% are gone
[16:41] I say that before we're done, two will die, and one will be irrevocably changed
[17:39] *** Verified_ has joined #archiveteam-bs
[18:06] *** Ryz has joined #archiveteam-bs
[18:14] *** m007a83_ is now known as m007a83
[19:57] speaking of YouTube archiving, is ivan still the one running the GDrive-based archiver that only uploads videos once they're taken down?
[19:57] or is that now someone else?
[20:00] *** icedice has joined #archiveteam-bs
[20:03] It was, but it's also been banned mostly.
[20:05] ah, shame
[20:33] With the caveat that we made this shit up on the spot, 0% of the URLs I had access to are not still in youtube.
[20:47] SketchCow: I found that after a few years, ~8% of my YouTube was gone from YouTube
[20:47] but I'm not a tubeupper
[20:49] *** stapler11 has quit IRC (Leaving)
[20:56] Hypothetical question (asking here before I bother info@archive.org), anyone know if IAs system allows for items uploaded to one account to be transferred to another?
[20:57] so can we use Google Compute credit?
[21:00] Tell me the circumstances this would happen
[21:02] guy I know goes "I have £283 worth of Google Compute Dealie credit though, if anyone can think of a use for it?"
[21:02] I'm not sure if we can use the warrior scripts on it, or something
[21:03] PurpleSym: Awhile ago you asked if the Circavie archives exist anywhere, did you ever find them?
[21:06] Smiley: the outbound would be the issue
[21:06] I ran grab-site on GCE trial credit and got my servers and API project were removed with no warning
[21:06] and I think he was reffering to the movement of items to anothe ruser
[21:06] ivan_: dafaq :/
[21:20] betamax: yes it can be done
[21:22] good to know, thanks (don't want to waste time on impossible requests)
[21:29] betamax: As I wrote: Tell me the circumstances this would happen
[21:33] oh, sorry, thought you meant someone else
[21:33] In general, we entertain all requests but it should be for a good reason.
[21:33] basically I started writing scripts to mirror UK council webcasts (which are deleted after a set time) to IA, and initially used my personal IA account
[21:33] If someone's trying to put one over, we'll suss it out.
[21:34] But if you're able to prove you can log into both accounts, the effort is trivial.
[21:34] now I realise there's so many that it would be better to have a dedicated account as all my other items on that account are getting buried
[21:34] Yes.
[21:34] What we would do is:
[21:34] (this is currently hypothetical as I'm in the midst of re-writing the script and haven't made the second dedicated account yet)
[21:35] - Mail your old account's mailing address saying "You requested we do this. Is this you?"
[21:35] And you go yes.
[21:36] great. It won't be for a few weeks (finishing scripts, updating VM to debian 10, etc...) but knowing it is possible is a big help
[22:01] *** LowLevelM has joined #archiveteam-bs
[22:06] *** LowLevelM has quit IRC (Ping timeout: 260 seconds)
[22:10] *** LowLevelM has joined #archiveteam-bs
[22:22] *** LowLevelM has quit IRC (Ping timeout: 260 seconds)
[22:23] It is.
[22:23] You can come to me.
[22:59] *** BlueMax has joined #archiveteam-bs
[23:01] *** LowLevelM has joined #archiveteam-bs
[23:04] *** schbirid has quit IRC (Remote host closed the connection)
[23:35] *** yano_ is now known as yano
[23:57] SketchCow, get this shit.... people think I'm you/you're me and that it's you that runs the-eye