[00:16] SketchCow: maybe we fork this project: https://github.com/ikreymer/webarchiveplayer [00:17] we have to make search a folder cause right now you need to point to one warc.gz file [00:18] ok nevermind [00:18] looks like i can point to a folder and it will work [00:19] ok there is a bug [00:19] it needs full path of web archive [00:20] i copy a dump of my breitbart.com news index into a tmp does give me a index of the warc.gz [00:20] but not the links works [00:20] 2018-01-09 19:19:16,579: [INFO]: www.breitbart.com-news-index-20160314.warc.gz: Archive File Not Found [00:33] *** bwn has quit IRC (Read error: Operation timed out) [00:47] *** Valentin- has quit IRC (Read error: Operation timed out) [00:48] *** bwn has joined #archiveteam-bs [00:55] *** Valentine has joined #archiveteam-bs [01:07] *** zyphlar has joined #archiveteam-bs [01:09] *** zyphlar has left [01:12] *** RichardG has quit IRC (Ping timeout: 250 seconds) [01:16] *** RichardG has joined #archiveteam-bs [01:21] *** Valentine has quit IRC (Read error: Operation timed out) [01:32] *** Valentine has joined #archiveteam-bs [02:20] *** schbirid2 has joined #archiveteam-bs [02:24] *** schbirid has quit IRC (Read error: Operation timed out) [02:42] *** Ing3b0rg has quit IRC (Ping timeout: 260 seconds) [02:42] *** robink has quit IRC (Read error: Connection reset by peer) [02:42] *** robink has joined #archiveteam-bs [02:47] *** Ing3b0rg has joined #archiveteam-bs [03:21] *** k_o_ has quit IRC (Ping timeout: 260 seconds) [03:33] *** MatrixBri has joined #archiveteam-bs [03:34] *** zyphlar has joined #archiveteam-bs [03:37] hmm [03:43] *** MatrixBri has quit IRC (Remote host closed the connection) [03:44] *** MatrixBri has joined #archiveteam-bs [03:44] *** root[m] has joined #archiveteam-bs [03:48] ~~~~ bs ~~~~ [03:56] *** root[m] has quit IRC (Remote host closed the connection) [03:56] *** MatrixBri has quit IRC (Remote host closed the connection) [03:56] *** MatrixBri has joined #archiveteam-bs 
[03:56] *** root[m] has joined #archiveteam-bs [03:57] ?? [04:01] *** WillBradl has joined #archiveteam-bs [04:02] *** k_o has joined #archiveteam-bs [04:02] *** root[m] has left User left [04:08] *** WillBradl has quit IRC (Remote host closed the connection) [04:08] *** MatrixBri has quit IRC (Remote host closed the connection) [04:08] *** jacketcha has joined #archiveteam-bs [04:09] *** MatrixBri has joined #archiveteam-bs [04:09] *** WillBradl has joined #archiveteam-bs [04:09] *** WillBradl has quit IRC (Remote host closed the connection) [04:09] *** MatrixBri has quit IRC (Remote host closed the connection) [04:11] *** MatrixBri has joined #archiveteam-bs [04:12] *** WillBradl has joined #archiveteam-bs [04:12] *** WillBradl has quit IRC (Remote host closed the connection) [04:12] *** MatrixBri has quit IRC (Remote host closed the connection) [04:23] *** WillBradl has joined #archiveteam-bs [04:23] *** WillBradl has left [04:24] *** WillBradl has joined #archiveteam-bs [04:26] okay [04:28] *** WillBradl is now known as M-WillBra [04:35] i think the BS is done [04:35] *** zyphlar has left [04:46] ? [04:47] i set up a matrix.org bridge and this was my BS channel (also the one i care about joining <3 ) [04:48] cool [04:49] *** qw3rty15 has joined #archiveteam-bs [04:52] *** qw3rty14 has quit IRC (Read error: Operation timed out) [05:12] *** Asparagir has joined #archiveteam-bs [05:14] *** RichardG has quit IRC (Ping timeout: 245 seconds) [05:32] *** godane has quit IRC (Read error: Operation timed out) [06:01] *** Asparag-1 has joined #archiveteam-bs [06:03] hey does anybody know the password for the archivebot logs at archive.fart.websiite? [06:03] *** Asparagir has quit IRC (Read error: Operation timed out) [06:33] *** k_o has quit IRC (Quit: Page closed) [06:45] *** zyphlar has joined #archiveteam-bs [06:49] *** zyphlar has left [07:13] *** godane has joined #archiveteam-bs [07:25] so i'm on my new comcast modem now [07:46] how is it? 
[08:22] *** Asparagir has joined #archiveteam-bs [08:26] *** Asparag-1 has quit IRC (Ping timeout: 600 seconds) [08:32] SketchCow: I’ll update my scripts so they upload to collection:archiveteam_yahoogroups directly. [08:33] Great [08:33] Give them a logo if you can [08:38] *** schbirid2 has quit IRC (Quit: Leaving) [08:50] *** SilSte has quit IRC (Read error: Operation timed out) [09:08] *** Valentine has quit IRC (Ping timeout: 506 seconds) [09:25] *** Valentine has joined #archiveteam-bs [09:42] *** Valentine has quit IRC (Read error: Operation timed out) [09:46] *** RichardG has joined #archiveteam-bs [10:01] *** BlueMaxim has quit IRC (Leaving) [10:26] *** Asparagir has quit IRC (Ping timeout: 600 seconds) [10:29] *** Asparagir has joined #archiveteam-bs [10:46] *** Ing3b0rg has quit IRC (hub.dk irc.underworld.no) [10:46] *** Rai-chan has quit IRC (hub.dk irc.underworld.no) [10:46] *** i0npulse has quit IRC (hub.dk irc.underworld.no) [10:46] *** purplebot has quit IRC (hub.dk irc.underworld.no) [10:48] *** robink has quit IRC (Ping timeout: 246 seconds) [10:51] *** robink has joined #archiveteam-bs [10:53] *** Valentine has joined #archiveteam-bs [10:55] *** LeG0ax has joined #archiveteam-bs [11:02] *** LeG0ax is now known as Ing3b0rg [11:16] *** Valentine has quit IRC (Ping timeout: 506 seconds) [12:09] *** purplebot has joined #archiveteam-bs [12:11] *** Rai-chan has joined #archiveteam-bs [12:21] *** Asparag-1 has joined #archiveteam-bs [12:24] *** Asparagir has quit IRC (Read error: Operation timed out) [12:26] *** HCross2 has joined #archiveteam-bs [12:27] *** svchfoo3 sets mode: +o HCross2 [12:33] *** medowar has joined #archiveteam-bs [13:14] *** Valentine has joined #archiveteam-bs [13:19] *** Mateon1 has quit IRC (Ping timeout: 255 seconds) [13:19] *** Mateon1 has joined #archiveteam-bs [13:57] SketchCow: Done. You might need to bulk-move a few items that were created before the change. 
[14:12] *** Lord_Nigh has quit IRC (Read error: Operation timed out) [14:12] *** Lord_Nigh has joined #archiveteam-bs [14:23] *** Asparagir has joined #archiveteam-bs [14:24] *** Asparag-1 has quit IRC (Read error: Operation timed out) [15:12] *** Valentine has quit IRC (Read error: Connection reset by peer) [15:23] *** Valentin- has joined #archiveteam-bs [15:30] *** Valentin- has quit IRC (Read error: Connection reset by peer) [15:32] *** Valentine has joined #archiveteam-bs [15:33] *** Valentine has quit IRC (Read error: Connection reset by peer) [15:37] *** Valentin- has joined #archiveteam-bs [15:41] *** Valentin- has quit IRC (Read error: Connection reset by peer) [15:48] *** Valentine has joined #archiveteam-bs [15:50] *** Valentine has quit IRC (Read error: Connection reset by peer) [15:53] *** Asparagir has quit IRC (Asparagir) [15:54] *** Valentine has joined #archiveteam-bs [15:54] *** atrocity has joined #archiveteam-bs [15:57] *** tomaspark has joined #archiveteam-bs [16:06] *** Valentine has quit IRC (Read error: Operation timed out) [16:07] *** Valentine has joined #archiveteam-bs [16:11] That's fine, I'll find them. [16:13] So, Software just became more interesting, CD-ROM wise. 
[16:19] *** Valentine has quit IRC (Read error: Connection reset by peer) [16:23] *** Valentine has joined #archiveteam-bs [16:43] *** Kimmer has quit IRC (Read error: Connection reset by peer) [17:49] *** schbirid has joined #archiveteam-bs [17:59] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:01] *** Valentine has joined #archiveteam-bs [18:11] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:20] *** Valentine has joined #archiveteam-bs [18:35] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:35] *** Valentine has joined #archiveteam-bs [18:40] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:40] there is still no way for an uploader to change the collection an item has been put into, is there? [18:41] *** Valentine has joined #archiveteam-bs [18:42] i think you can if you have access to the target collection [18:42] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:42] *** Valentine has joined #archiveteam-bs [18:44] for me the whole "collection" input box is non-editable [18:44] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:46] snag the "undisable" bookmarklet from here and then you can do it ;) https://www.squarefree.com/bookmarklets/forms.html [18:46] hm, I don't really think that will work :) [18:46] works for me! [18:47] yes you can edit it then, but I don't think clicking "update" will apply this. It would be a security leak if it did ... 
[18:48] *** second has quit IRC (Read error: Connection reset by peer) [18:48] idk it worked for me last time i did it [18:49] ok let me try [18:50] *** second has joined #archiveteam-bs [18:51] We just had the Reckoning of All Reckonings about uploading WARCs to the Archive [18:51] Impromptu 8 person meeting on the slack, it was glorious [18:51] *** Valentine has joined #archiveteam-bs [18:52] So, the default is that just anyone can't add WARCs to archive and it will automatically go into Wayback. That'll be only if I clear off an account's uploads. [18:52] *** Valentine has quit IRC (Read error: Connection reset by peer) [18:52] So I need to find out who a couple people are. [18:52] I don't want to drop their e-mails anywhere [18:52] heh, seems that it indeed works, at least for putting it into open_source_software ... I still have no access to put it where it belongs though ;-) [18:52] *** Valentine has joined #archiveteam-bs [18:53] but at least it's not in "media" anymore... [18:53] Darkstar: i think SketchCow can help find the right collection maybe [18:54] the right collection would be "cdromsoftware" or "cd-roms" (don't know the difference between those two), and "cdinstall" for some others, but I have not yet been able to find out how to get access [18:55] I usually just wait, and after some days/weeks the items magically move to one of these collections (I think someone moves them, but they forget the occasional item from time to time ;-) [19:32] *** qw3rty15 has quit IRC (Nettalk6 - www.ntalk.de) [19:32] *** AeonG_ has quit IRC (Ping timeout: 600 seconds) [20:02] *** AeonG_ has joined #archiveteam-bs [20:20] ola_norsk: i don't know about the api, but maybe they never intended to "drop" a file type? 
kinda like "unknown" is "nil" so it'd be removing a value instead of changing it [20:25] *** AeonG_ has quit IRC (Read error: Operation timed out) [20:27] *** ola_norsk has joined #archiveteam-bs [20:27] Harzilein: btw, would you happen to know what happens to Items flagged as "Broken" ? [20:29] *** ZexaronS has joined #archiveteam-bs [20:30] ola_norsk: as i said, i know nothing at all [20:31] Harzilein: then we are two :D [20:31] my completion was broken and i thought you were here already. i wrote: "ola_norsk: i don't know about the api, but maybe they never intended to "drop" a file type? kinda like "unknown" is "nil" so it'd be removing a value instead of changing it" [20:31] *** AeonG_ has joined #archiveteam-bs [20:34] ah sry. But what i notice is that the item has become more messed up/broken, ever since i posted that issue on github. Now suddenly there's not only a few .d64 roms wrongly detected as "DV Audio" and "DV Video". there were no zip files at that time. [20:35] Internet Archive is haunted! :[ [20:35] lol [20:35] the playlist is funny :D [20:35] wonder what it is about these files that makes it detect them as dv [20:36] aye [20:36] ah, okay, the playlist is for "aac files" [20:36] ola_norsk: i have a squashfs file i uploaded that thinks it's a wav file [20:37] oh, no, it's for the oggs actually [20:37] https://archive.org/details/slackwarearm-14.2-20170906-kiwix [20:38] what could cause it though? [20:38] for the simpsons and steve keene private spy ones, i can actually get a glitchy sound by playing and then seeking in them :D [20:38] (with the web player) [20:39] the zak one does not seem to have even valid frames ;) [20:40] it must be some kind of fileheader/data detection thingy that does it? [20:40] ola_norsk: so only the uploader can change the types through the web interface? 
[20:41] Harzilein: it's possible to do it by using the _ia_ tool, but like the issue shows there seems to be a bug [20:42] ola_norsk: so i think what happened is that 106 files got mis-detected as aac, then of those 3 happened to contain at least one good aac frame in them [20:43] Harzilein: sadly the issue is that choosing format through the web gui for 2000+ files is something i gave up on trying to do. [20:43] there were nowhere near 106 a while back :D [20:43] it's just that i don't see that option anywhere in the web gui [20:44] it's a drop down menu at every file [20:44] the good thing is that the name did not get changed. so the system saw an opportunity to provide unencumbered oggs instead [20:45] when i click on "show all" i only get the download view [20:47] what seems to have gone wrong is that "Format" was wrongly detected/set on some during upload. [20:48] even if the name extension was the same. My hope was to set all to same "Unknown" format metadata by using [20:48] ia metadata 2813_d64_C64_roms_wwwC64com --target="Metal_Warrior_3.d64" --modify="format:Unknown" [20:48] yeah. but you mentioned that isn't possible due to an api bug. [20:49] but you also mentioned that it'd be cumbersome but possible to change the type in the web gui [20:49] it is [20:49] but i can't find that gui element and i assume that's because i'm not the uploader [20:49] yeah, that is done in "Edit Item/Edit files" [20:50] *** AeonG_ has quit IRC (Read error: Operation timed out) [20:51] if you check one of your own uploaded items, by clicking "Edit" > "Edit metadata", it will be on the bottom of that page. [20:54] *** Asparagir has joined #archiveteam-bs [20:54] to be fair though, Internet Archive does have a warning against items containing >1000 files i think :D [20:55] yeah it hard limits you somewhere around 1000 i think [20:55] i have a slightly different interface (i kind of get the "non-expert" wording: "Edit" > "I want to change the information (metadata) about my item. 
[20:55] For example, I want to change my [title] or [description]." [20:55] ) [20:55] astrid: i didn't hard limit my item :/ [20:55] astrid: it* [20:55] how many files did you put into it? [20:56] 2813 [20:56] ok i suspect there's a limit somewhere but idk where it is [20:56] most seem to be ok [20:57] it seemed to me like it's just a caution notice/warning [20:57] one sec [20:58] *** Asparagir has quit IRC (Client Quit) [20:58] "Because items can "break" we typically recommend that you not exceed 1,000 files and/or 50GB per item page." [20:58] so i think i should've heeded that :D [20:59] that, or not have done them all in one upload session [21:00] ola_norsk: i think you should try checking "block archive.php from queueing a derive (needed only in special circumstances; if you’re unsure, leave blank)" next time you initially upload d64 images [21:00] i figured since the files were so minuscule, it would be ok [21:01] Harzilein: doesn't that just apply to audio (and/or video) files? [21:01] ola_norsk: i'm pretty sure the files uploaded fine, but whatever tries to make oggs for the web player might have too aggressively checked the file types [21:02] ola_norsk: for all the system knows they _are_ audio files w/ a weird extension ;) [21:02] *** AeonG_ has joined #archiveteam-bs [21:02] i'll try that then [21:03] ola_norsk: i'm kind of curious what caused it, let me download the simpsons, zak mc cracken and steve keene private spy items [21:04] it must have been some heuristic detection thingy that did it [21:04] perhaps there's such a thing as "raw aac frames"? 
[21:05] then every file that roughly fits some constraint could be a glitchy aac [21:05] some of the d64s did not even pass virus checking at upload [21:05] that's why there's not 2813 of them [21:06] *** Asparagir has joined #archiveteam-bs [21:06] "In addition to the MP4, 3GP and other container formats based on ISO base media file format for file storage, AAC audio data was first packaged in a file for the MPEG-2 standard using Audio Data Interchange Format (ADIF), consisting of a single header followed by the raw AAC audio data blocks." [21:10] maybe by running a diff comparison on a couple of files, including godane's slackware squashfs file, one might find where the similarity comparison failed? [21:11] well, the decoding fails really early, hence the very small derived files [21:11] *** BlueMaxim has joined #archiveteam-bs [21:12] and i'd expect the values in the derived frames to be severely clipped too [21:14] Zak_Mckracken_and_the_Alien_Mindbenders_[Boot].d64 file info: [21:14] RAW [21:14] Error: Bitstream value not allowed by specification [21:14] how does Internet Archive try to detect filetypes though? 
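Editorial note: the guess above, that a loose header check could label arbitrary bytes as AAC, can be illustrated with a minimal sketch. It is not how Internet Archive detects file types (the log never establishes that). It only shows the two well-known AAC stream markers: the ASCII tag `ADIF` at the start of an ADIF file, and the 12-bit all-ones syncword that opens each ADTS frame. The function name `sniff_aac_container` is hypothetical.

```python
def sniff_aac_container(data: bytes) -> str:
    """Classify the first bytes of a file as 'adif', 'adts', or 'unknown'.

    Illustrative only: a real detector would parse and validate whole frames.
    """
    # ADIF streams start with the four ASCII bytes "ADIF".
    if data[:4] == b"ADIF":
        return "adif"
    # ADTS frames start with a 12-bit syncword of all ones (0xFFF), then a
    # 1-bit MPEG version flag and a 2-bit layer field that is always 0.
    # Mask 0xF6 keeps the low syncword nibble and the two layer bits.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xF6) == 0xF0:
        return "adts"
    return "unknown"


if __name__ == "__main__":
    # Two bytes are enough to satisfy the ADTS test, so unrelated binary
    # data that happens to start this way would be flagged as well.
    for sample in (b"ADIF\x00\x00", b"\xff\xf1\x50\x80", b"\xff\xfb\x90\x00", b"\x00\x00"):
        print(sample[:4], "->", sniff_aac_container(sample))
```

Because the ADTS test constrains only about fifteen bits, a check this shallow would accept a small fraction of arbitrary files, which is consistent with (but does not prove) the "fits some constraint" theory in the chat.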
[21:15] what software is used, i mean [21:17] it would be lovely if it was as easy as just running queries on the item's sqlite database hehe [21:18] *** AeonG_ has quit IRC (Ping timeout: 633 seconds) [21:28] *** second has quit IRC (Read error: Connection reset by peer) [21:30] in hindsight, i think it might've been smarter of me to use the torrent upload ability of IA on that item [21:30] *** second has joined #archiveteam-bs [21:33] damn, even the item's torrent is broken :D [21:41] i'll just wait and see what happens now that it's flagged as a Broken item [22:16] *** schbirid has quit IRC (Quit: Leaving) [22:31] *** RichardG has quit IRC (Ping timeout: 506 seconds) [22:36] *** godane has quit IRC (Read error: Operation timed out) [22:45] *** dashcloud has joined #archiveteam-bs [22:47] *** godane has joined #archiveteam-bs [23:16] *** qw3rty15 has joined #archiveteam-bs [23:17] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [23:27] *** AeonG_ has joined #archiveteam-bs [23:30] *** AeonG__ has joined #archiveteam-bs [23:38] *** AeonG_ has quit IRC (Read error: Operation timed out) [23:46] *** RichardG has joined #archiveteam-bs
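Editorial note: the per-file `ia metadata` call quoted at [20:48] would have to be repeated for every file in the item. The sketch below only builds those command lines as strings and runs nothing. The flag layout (`--target=`, `--modify=format:...`) is copied from the log and not verified here, the log itself reports that this call hit a bug, and the helper name `build_format_commands` is hypothetical.

```python
import shlex


def build_format_commands(identifier, filenames, new_format="Unknown"):
    """Return one shell-quoted `ia metadata` command line per filename.

    Assumption: the argument layout mirrors the command quoted in the log.
    Nothing is executed; the caller decides whether to run the strings.
    """
    commands = []
    for name in filenames:
        argv = [
            "ia",
            "metadata",
            identifier,
            f"--target={name}",
            f"--modify=format:{new_format}",
        ]
        # shlex.quote protects names containing brackets or spaces, such as
        # the "[Boot]" file mentioned in the chat.
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


if __name__ == "__main__":
    names = [
        "Metal_Warrior_3.d64",
        "Zak_Mckracken_and_the_Alien_Mindbenders_[Boot].d64",
    ]
    for line in build_format_commands("2813_d64_C64_roms_wwwC64com", names):
        print(line)
```

Generating the commands first makes it possible to review them, or try a single one, before touching all 2000+ files.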