[00:17] so i'm going to be reuploading the koreanet 1 chuncheon pg butitv
[00:18] that's cause i screwed up the id names
[00:18] *** BlueMaxim has joined #archiveteam-bs
[00:19] joepie91: I'll have to do some more testing tomorrow, but I think I managed to port it correctly! :-)
[00:37] You're missing the pattern +!![] (== 1) in your script, by the way, and I think ++![] is a syntax error.
[01:26] *** ola_norsk has quit IRC (it's christmas! Drink your christmas beers https://youtu.be/NtV-EB8kvf8)
[01:43] *** LastNinja has quit IRC (Ping timeout: 260 seconds)
[01:47] *** ZexaronS- has joined #archiveteam-bs
[01:47] *** ZexaronS has quit IRC (Read error: Connection reset by peer)
[01:53] *** Pixi has quit IRC (Ping timeout: 255 seconds)
[01:55] *** Pixi has joined #archiveteam-bs
[02:33] *** Stilett0 is now known as Stiletto
[02:46] *** Odd0002 has quit IRC (Quit: ZNC - http://znc.in)
[02:53] *** Odd0002 has joined #archiveteam-bs
[03:22] *** ola_norsk has joined #archiveteam-bs
[03:24] if a WARC item is uploaded with 'Noindex: true' (or even as a 30-day 'test item'), does it still go to waybackmachine?
[03:24] regardless of how freakishly incomplete it may or may not be
[03:29] ola_norsk: Only WARCs from trusted sources go into the Wayback Machine. "trusted" is a subtle and not-exactly-documented quality, though.
[03:33] ok
[04:03] *** qw3rty112 has joined #archiveteam-bs
[04:07] *** qw3rty111 has quit IRC (Read error: Operation timed out)
[04:14] *** pizzaiolo has quit IRC (pizzaiolo)
[04:17] Somebody2: not sure this is true any more, anyone can start uploading mediatype:web into Community Texts
[04:17] I assume you can still get blacklisted
[04:25] ivan: Hm, neat.
[04:25] ivan: have you verified that random mediatype:web items are included in the Wayback Machine, though?
[04:26] well, mine are, but I doubt I got special treatment
[04:26] try it and see!
[04:27] ivan: i'm not risking getting blacklisted :D
[04:28] ivan: e.g what could cause that, btw?
[04:28] 1) cause what? 2) I have no idea probably
[04:29] < ivan> I assume you can still get blacklisted
[04:29] I assume if someone at IA notices your WARCs are full of fake responses
[04:29] or ISP/DNS content swaparoos
[04:30] but e.g warcs from webrecorder should be ok i guess?
[04:30] I would guess so
[04:31] would making them "NoIndex: true" prevent it from getting to waybackmachine?
[04:32] webrecorder makes these 'patches', i mean, that i'd rather not see on my listing
[04:33] preferably, i'd make it test items that are deleted after 30 days, as long as the warcs are processed by then
[04:38] if "Noindex: true" could prevent it being shown as item, and it still got submitted to wayback, that would be 100% nice :D
[04:40] e.g if that is/was the case, i could make an item containing the webrecorder warc, and its 'patch' warcs i guess
[04:46] ola_norsk: there's nothing wrong with having an item
[04:46] ivan: i have to look at it :D
[04:47] I don't really look at my items
[04:47] ivan: and, it would be quite rudimentary, since i wouldn't really bother with topics etc
[04:48] just tag with topic warcarchives
[04:51] then only one way to find out i guess :/ .. I just know the lack of thumbnail is going to pester me though :D
[04:53] I can only suggest OCDing about something else
[04:53] ;)
[04:58] community web is not listed in ia browser uploader, is there an argument for 'ia' tool to create an item of that sort?
[04:59] i've never used 'ia' tool to create item, only upload to or alter
[05:00] no community web, only Community Texts, and I think it'll land there by default
[05:00] k
[05:01] #internetarchive
[05:01] *** qw3rty113 has joined #archiveteam-bs
[05:04] ill just wing it with a dummyfile with random data in a test item and see where it lands when picking text
[05:04] aren't test items going to land in the test collection
[05:05] *** qw3rty112 has quit IRC (Read error: Operation timed out)
[05:09] i think that is a secondary entry of them
[05:10] can't be in two collections
[05:10] ivan: https://archive.org/details/dummy_test_data
[05:11] ah, I totally forgot
[05:11] look at this messed up thing though :D https://archive.org/details/vidme_AfterPrisonJoe
[05:12] 'community data' :D
[05:12] i'm thinking that's where random warcs might really belong at :D
[05:13] it's not an option when uploading in browser though i think
[05:14] might be what happens if making an item with 'ia' tool, without specifying anything, i cant remember
[05:15] it's still in 'texts' collection, but mediatype is 'data' :DD
[05:18] when using 'ia upload *' , i mean
[05:22] ivan: man, i just realized what you meant, i noticed first now the 'collection:' field on that dummyfile :D
[05:22] sry
[05:23] that could be 'collection: web' ?
[05:32] ivan: should i un-gz the warcs first?
[05:33] webrecorder downloads gzip'ed warcs it seems
[05:33] ola_norsk: mediatype:web, not collection
[05:33] don't un-gz
[05:33] k
[05:34] no idea how to use ia but over the S3 interface it's header x-archive-meta-mediatype:web
[05:39] i tried 'ia upload ola_norsk_AGP_warcs theangrygranpa-20171217053032.warc.gz' , and it did upload
[05:41] though, the item is (not yet) listed on my profile
[05:41] adding the later 'patch' to that item seems to go as well
[05:43] 'ia ls ola_norsk_AGP_warcs'
[05:44] seems to work fine :D , it's created the same derivs as when i added a warc to another item
[05:45] and it's 'data' though :/
[05:46] and mediatype can not be changed through web ui
[05:46] ivan: I'm pretty sure you are whitelisted -- you are certainly known. The test would be to create a new account, and upload a WARC
[05:46] from that, and see if it gets included.
[05:46] (Or we could just ask, I suppose.)
[05:49] (which I've now done)
[05:50] does it matter if 'mediatype' is set to 'web' on an item, for it to be applied to wayback ?
[05:50] Yes, I'm pretty sure mediatype:web is required.
[05:51] dang, how can https://archive.org/details/ola_norsk_AGP_warcs be changed from 'data' to 'web' ?
[05:52] info@archive.org
[05:52] i was afraid that would be the answer :D
[05:53] actually, has anyone tried changing mediatype over S3? is the change ignored?
[05:54] "only Archive admins can make that change." https://archive.org/post/1064443/change-media-type
[05:57] aye, and it can not be done through web ui
[06:00] ill type the mail later today when i've slept and am sober i think, since also the afterprisonjoe item needs changing
[06:03] for future reference, how might i specify 'mediatype: web' at upload with ia command line?
[06:07] nevermind, i realize what ivan meant by header 'x-archive-meta-mediatype:web'
[06:10] 'ia upload --header=x-archive-meta-mediatype:web ' i think
[06:11] No, I don't think that's right.
[06:11] -H, --header=... S3 HTTP headers to send with your request.
[06:11] You should use --metadata instead
[06:12] --metadata=mediatype:web
[06:12] The header form may work, too, though.
[06:13] https://github.com/vmbrasseur/IAS3API/blob/master/headers.md
[06:15] Yeah, that suggests that either way should work.
[06:16] ok
[06:17] i'm going to have to stay away from writing ia command lines i think, and just make e.g 'webiaarchive.sh' and 'videoiaarchive.sh' :D
[06:20] better yet, an 'archivefordummy.sh' using dialog :D
[06:21] ha
[06:21] Please do write them, yes.
[06:22] consider it halfassed! https://youtu.be/ATBl4qH9I54
[06:23] ...(seriously though, i'll try)
[06:24] "What type of media would you like to upload?" ..kind of thing
[06:25] *** ola_norsk has quit IRC (kicked by ICANN for internetting under the influence)
[06:43] *** RichardG_ has quit IRC (Ping timeout: 255 seconds)
[07:08] *** kimmer12 has joined #archiveteam-bs
[07:14] *** kimmer1 has quit IRC (Read error: Operation timed out)
[07:23] *** kimmer1 has joined #archiveteam-bs
[07:28] *** kimmer13 has joined #archiveteam-bs
[07:30] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[07:34] *** kimmer1 has quit IRC (Read error: Operation timed out)
[07:38] *** kimmer1 has joined #archiveteam-bs
[07:44] *** kimmer13 has quit IRC (Ping timeout: 633 seconds)
[08:37] *** ZexaronS- has quit IRC (Read error: Connection reset by peer)
[08:38] *** ZexaronS- has joined #archiveteam-bs
[09:17] is nforce entertainment b.v in any way related to the old(?) nforce site?
[09:17] the one with all the NFOs
[09:35] *** schbirid has joined #archiveteam-bs
[09:59] Seems to be, but don't see them outright saying it anywhere.
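Editorial sketch for the 06:03-06:15 exchange above: one way to avoid retyping the `ia upload` line is a small wrapper that always passes `--metadata=mediatype:web`, the form suggested in the channel. The function names `build_ia_upload` and `upload` and the dry-run default are assumptions for illustration, not part of the `ia` tool. Per the log, mediatype seems to be settable only when the item is first created, so the flag would have to go on the first upload.

```python
import shlex
import subprocess


def build_ia_upload(identifier, files, mediatype="web"):
    """Build the argv for `ia upload` with mediatype set via --metadata.

    Hypothetical helper: only assembles the command line discussed in the
    log; it does not check that `ia` is installed or configured.
    """
    return ["ia", "upload", identifier, *files,
            f"--metadata=mediatype:{mediatype}"]


def upload(identifier, files, dry_run=True):
    """Print the command (dry run, the default) or actually run it."""
    argv = build_ia_upload(identifier, files)
    if dry_run:
        # Show exactly what would be executed, shell-quoted.
        print(shlex.join(argv))
    else:
        # Assumes the `ia` CLI is on PATH and already configured.
        subprocess.run(argv, check=True)
    return argv
```

Calling `upload("some_new_item", ["capture.warc.gz"])` only prints the command; pass `dry_run=False` to run it. Both names in that call are placeholders.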
[11:12] *** schbirid has quit IRC (Quit: Leaving)
[11:39] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:13] *** pizzaiolo has joined #archiveteam-bs
[12:17] *** jschwart has joined #archiveteam-bs
[12:33] *** odemg has quit IRC (Quit: Leaving)
[12:41] *** kimmer1 has quit IRC (Remote host closed the connection)
[12:42] *** kimmer1 has joined #archiveteam-bs
[12:46] *** icedice has joined #archiveteam-bs
[12:49] ivan, Somebody2: My first (and so far only) upload a few weeks ago was included in the WM within a few hours, I believe. I don't know whether that was manually approved or not though. I'm pretty sure that the derive task ran immediately, but that doesn't really mean much I guess.
[12:50] *** ZexaronS- has quit IRC (Quit: Leaving)
[12:52] MrRadar, Sanqui: FYI, #msgbored is open again, we managed to cycle it. (I sent you an invite, but I guess you might've missed it.)
[13:12] *** odemg has joined #archiveteam-bs
[13:31] *** LastNinja has joined #archiveteam-bs
[14:31] *** RichardG has joined #archiveteam-bs
[14:39] *** icedice2 has joined #archiveteam-bs
[14:46] *** icedice has quit IRC (Ping timeout: 506 seconds)
[15:20] *** icedice2 has quit IRC (Quit: Leaving)
[15:40] *** kimmer12 has joined #archiveteam-bs
[15:47] *** kimmer1 has quit IRC (Ping timeout: 632 seconds)
[17:15] *** Stiletto has quit IRC (Read error: Operation timed out)
[17:49] JAA: What's your account name on IA?
[18:18] *** schbirid has joined #archiveteam-bs
[18:36] *** du_ has joined #archiveteam-bs
[19:12] *** Mateon1 has quit IRC (Ping timeout: 248 seconds)
[19:12] *** Mateon1 has joined #archiveteam-bs
[19:34] *** kimmer12 has quit IRC (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org)
[19:34] *** kimmer1 has joined #archiveteam-bs
[19:34] *** Stilett0 has joined #archiveteam-bs
[19:49] Ngh. Good old ContentID. "This video has 11 seconds of grass and men and balls being kicked. Blocked worldwide!" ....
[20:58] Somebody2: JustAnotherArchivist
[21:01] *** BlueMaxim has joined #archiveteam-bs
[21:52] *** schbirid has quit IRC (Quit: Leaving)
[22:11] Whoops, now I'm in the right channel.
[22:11] :-)
[22:44] *** jschwart has quit IRC (Quit: Konversation terminated!)
[22:52] *** pizzaiolo has quit IRC (pizzaiolo)
[22:54] *** RichardG_ has joined #archiveteam-bs
[22:54] biography.com video urls don't download anymore : https://pastebin.com/dRhf6y8U
[22:54] i figured i'd ask people here to see if anyone can fix it
[22:54] *** ndiddy_ has joined #archiveteam-bs
[22:54] last time i downloaded from the site was 2017-10-28
[22:55] *** K4k_ has joined #archiveteam-bs
[22:57] *** ppsym has joined #archiveteam-bs
[23:02] *** tuluu_ has joined #archiveteam-bs
[23:05] *** RichardG has quit IRC (se.hub irc.underworld.no)
[23:05] *** MrDignity has quit IRC (se.hub irc.underworld.no)
[23:05] *** ndiddy has quit IRC (se.hub irc.underworld.no)
[23:05] *** espes__ has quit IRC (se.hub irc.underworld.no)
[23:05] *** tuluu has quit IRC (se.hub irc.underworld.no)
[23:05] *** purplebot has quit IRC (se.hub irc.underworld.no)
[23:05] *** PurpleSym has quit IRC (se.hub irc.underworld.no)
[23:05] *** K4k has quit IRC (se.hub irc.underworld.no)
[23:05] *** Rai-chan has quit IRC (se.hub irc.underworld.no)
[23:05] *** i0npulse has quit IRC (se.hub irc.underworld.no)
[23:05] *** medowar has quit IRC (se.hub irc.underworld.no)
[23:08] *** espes___ has joined #archiveteam-bs
[23:20] *** ppsym is now known as PurpleSym
[23:45] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[23:46] *** BlueMaxim has joined #archiveteam-bs
[23:58] *** MrDignity has joined #archiveteam-bs