#archiveteam-bs 2018-01-10,Wed


Who / What / When
<godane> SketchCow: maybe we fork this project: https://github.com/ikreymer/webarchiveplayer
we have to make it search a folder, because right now you need to point it at one warc.gz file
ok nevermind
looks like i can point to a folder and it will work
ok there is a bug
it needs the full path of the web archive
i copied a dump of my breitbart.com news index into a tmp dir and it does give me an index of the warc.gz
but the links don't work
2018-01-09 19:19:16,579: [INFO]: www.breitbart.com-news-index-20160314.warc.gz: Archive File Not Found
[00:16]
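The full-path problem godane describes can be sidestepped by resolving every archive to an absolute path before handing it to a player. The following is only a minimal sketch under that assumption: `find_warcs` is a hypothetical helper, not part of webarchiveplayer, and it does not call the player at all.

```python
from pathlib import Path


def find_warcs(folder):
    """Return absolute paths of every .warc.gz file directly inside folder.

    Hypothetical helper for illustration; results are sorted so the order
    is stable between runs.
    """
    return sorted(p.resolve() for p in Path(folder).glob("*.warc.gz"))
```

Calling `find_warcs("tmp")` would give a list of absolute `Path` objects, which avoids depending on the directory the tool happens to be started from.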
***bwn has quit IRC (Read error: Operation timed out) [00:33]
Valentin- has quit IRC (Read error: Operation timed out)
bwn has joined #archiveteam-bs
[00:47]
Valentine has joined #archiveteam-bs [00:55]
zyphlar has joined #archiveteam-bs
zyphlar has left
RichardG has quit IRC (Ping timeout: 250 seconds)
RichardG has joined #archiveteam-bs
[01:07]
Valentine has quit IRC (Read error: Operation timed out) [01:21]
Valentine has joined #archiveteam-bs [01:32]
.......... (idle for 48mn)
schbirid2 has joined #archiveteam-bs
schbirid has quit IRC (Read error: Operation timed out)
[02:20]
.... (idle for 18mn)
Ing3b0rg has quit IRC (Ping timeout: 260 seconds)
robink has quit IRC (Read error: Connection reset by peer)
robink has joined #archiveteam-bs
[02:42]
Ing3b0rg has joined #archiveteam-bs [02:47]
....... (idle for 34mn)
k_o_ has quit IRC (Ping timeout: 260 seconds) [03:21]
MatrixBri has joined #archiveteam-bs
zyphlar has joined #archiveteam-bs
[03:33]
<zyphlar> hmm [03:37]
***MatrixBri has quit IRC (Remote host closed the connection)
MatrixBri has joined #archiveteam-bs
root[m] has joined #archiveteam-bs
[03:43]
<zyphlar> ~~~~ bs ~~~~ [03:48]
***root[m] has quit IRC (Remote host closed the connection)
MatrixBri has quit IRC (Remote host closed the connection)
MatrixBri has joined #archiveteam-bs
root[m] has joined #archiveteam-bs
[03:56]
<root[m]> ?? [03:57]
***WillBradl has joined #archiveteam-bs
k_o has joined #archiveteam-bs
root[m] has left User left
[04:01]
WillBradl has quit IRC (Remote host closed the connection)
MatrixBri has quit IRC (Remote host closed the connection)
jacketcha has joined #archiveteam-bs
MatrixBri has joined #archiveteam-bs
WillBradl has joined #archiveteam-bs
WillBradl has quit IRC (Remote host closed the connection)
MatrixBri has quit IRC (Remote host closed the connection)
MatrixBri has joined #archiveteam-bs
WillBradl has joined #archiveteam-bs
WillBradl has quit IRC (Remote host closed the connection)
MatrixBri has quit IRC (Remote host closed the connection)
[04:08]
WillBradl has joined #archiveteam-bs
WillBradl has left
WillBradl has joined #archiveteam-bs
[04:23]
<zyphlar> okay [04:26]
***WillBradl is now known as M-WillBra [04:28]
<M-WillBra> i think the BS is done [04:35]
***zyphlar has left [04:35]
<jacketcha> ? [04:46]
<M-WillBra> i set up a matrix.org bridge and this was my BS channel (also the one i care about joining <3 ) [04:47]
<jacketcha> cool [04:48]
***qw3rty15 has joined #archiveteam-bs
qw3rty14 has quit IRC (Read error: Operation timed out)
[04:49]
..... (idle for 20mn)
Asparagir has joined #archiveteam-bs
RichardG has quit IRC (Ping timeout: 245 seconds)
[05:12]
.... (idle for 18mn)
godane has quit IRC (Read error: Operation timed out) [05:32]
...... (idle for 29mn)
Asparag-1 has joined #archiveteam-bs [06:01]
<jacketcha> hey does anybody know the password for the archivebot logs at archive.fart.websiite? [06:03]
***Asparagir has quit IRC (Read error: Operation timed out) [06:03]
....... (idle for 30mn)
k_o has quit IRC (Quit: Page closed) [06:33]
zyphlar has joined #archiveteam-bs
zyphlar has left
[06:45]
..... (idle for 24mn)
godane has joined #archiveteam-bs [07:13]
<godane> so i'm on my new comcast modem now [07:25]
..... (idle for 21mn)
<jacketcha> how is it? [07:46]
........ (idle for 36mn)
***Asparagir has joined #archiveteam-bs
Asparag-1 has quit IRC (Ping timeout: 600 seconds)
[08:22]
<PurpleSym> SketchCow: I’ll update my scripts so they upload to collection:archiveteam_yahoogroups directly. [08:32]
<SketchCow> Great
Give them a logo if you can
[08:33]
***schbirid2 has quit IRC (Quit: Leaving) [08:38]
SilSte has quit IRC (Read error: Operation timed out) [08:50]
.... (idle for 18mn)
Valentine has quit IRC (Ping timeout: 506 seconds) [09:08]
.... (idle for 17mn)
Valentine has joined #archiveteam-bs [09:25]
.... (idle for 17mn)
Valentine has quit IRC (Read error: Operation timed out)
RichardG has joined #archiveteam-bs
[09:42]
.... (idle for 15mn)
BlueMaxim has quit IRC (Leaving) [10:01]
...... (idle for 25mn)
Asparagir has quit IRC (Ping timeout: 600 seconds)
Asparagir has joined #archiveteam-bs
[10:26]
.... (idle for 17mn)
Ing3b0rg has quit IRC (hub.dk irc.underworld.no)
Rai-chan has quit IRC (hub.dk irc.underworld.no)
i0npulse has quit IRC (hub.dk irc.underworld.no)
purplebot has quit IRC (hub.dk irc.underworld.no)
robink has quit IRC (Ping timeout: 246 seconds)
robink has joined #archiveteam-bs
Valentine has joined #archiveteam-bs
LeG0ax has joined #archiveteam-bs
[10:46]
LeG0ax is now known as Ing3b0rg [11:02]
Valentine has quit IRC (Ping timeout: 506 seconds) [11:16]
........... (idle for 53mn)
purplebot has joined #archiveteam-bs
Rai-chan has joined #archiveteam-bs
[12:09]
Asparag-1 has joined #archiveteam-bs
Asparagir has quit IRC (Read error: Operation timed out)
HCross2 has joined #archiveteam-bs
svchfoo3 sets mode: +o HCross2
[12:21]
medowar has joined #archiveteam-bs [12:33]
......... (idle for 41mn)
Valentine has joined #archiveteam-bs [13:14]
Mateon1 has quit IRC (Ping timeout: 255 seconds)
Mateon1 has joined #archiveteam-bs
[13:19]
........ (idle for 38mn)
<PurpleSym> SketchCow: Done. You might need to bulk-move a few items that were created before the change. [13:57]
.... (idle for 15mn)
***Lord_Nigh has quit IRC (Read error: Operation timed out)
Lord_Nigh has joined #archiveteam-bs
[14:12]
Asparagir has joined #archiveteam-bs
Asparag-1 has quit IRC (Read error: Operation timed out)
[14:23]
.......... (idle for 48mn)
Valentine has quit IRC (Read error: Connection reset by peer) [15:12]
Valentin- has joined #archiveteam-bs [15:23]
Valentin- has quit IRC (Read error: Connection reset by peer)
Valentine has joined #archiveteam-bs
Valentine has quit IRC (Read error: Connection reset by peer)
Valentin- has joined #archiveteam-bs
Valentin- has quit IRC (Read error: Connection reset by peer)
[15:30]
Valentine has joined #archiveteam-bs
Valentine has quit IRC (Read error: Connection reset by peer)
Asparagir has quit IRC (Asparagir)
Valentine has joined #archiveteam-bs
atrocity has joined #archiveteam-bs
tomaspark has joined #archiveteam-bs
[15:48]
Valentine has quit IRC (Read error: Operation timed out)
Valentine has joined #archiveteam-bs
[16:06]
<SketchCow> That's fine, I'll find them.
So, Software just became more interesting, CD-ROM wise.
[16:11]
***Valentine has quit IRC (Read error: Connection reset by peer)
Valentine has joined #archiveteam-bs
[16:19]
..... (idle for 20mn)
Kimmer has quit IRC (Read error: Connection reset by peer) [16:43]
.............. (idle for 1h6mn)
schbirid has joined #archiveteam-bs [17:49]
Valentine has quit IRC (Read error: Connection reset by peer)
Valentine has joined #archiveteam-bs
[17:59]
Valentine has quit IRC (Read error: Connection reset by peer) [18:11]
Valentine has joined #archiveteam-bs [18:20]
.... (idle for 15mn)
Valentine has quit IRC (Read error: Connection reset by peer)
Valentine has joined #archiveteam-bs
[18:35]
Valentine has quit IRC (Read error: Connection reset by peer) [18:40]
<Darkstar> there is still no way for an uploader to change the collection an item has been put into, is there? [18:40]
***Valentine has joined #archiveteam-bs [18:41]
<astrid> i think you can if you have access to the target collection [18:42]
***Valentine has quit IRC (Read error: Connection reset by peer)
Valentine has joined #archiveteam-bs
[18:42]
<Darkstar> for me the whole "collection" input box is non-editable [18:44]
***Valentine has quit IRC (Read error: Connection reset by peer) [18:44]
<astrid> snag the "undisable" bookmarklet from here and then you can do it ;) https://www.squarefree.com/bookmarklets/forms.html [18:46]
<Darkstar> hm, I don't really think that will work :) [18:46]
<astrid> works for me! [18:46]
<Darkstar> yes you can edit it then, but I don't think clicking "update" will apply this. It would be a security leak if it did ... [18:47]
***second has quit IRC (Read error: Connection reset by peer) [18:48]
<astrid> idk it worked for me last time i did it [18:48]
<Darkstar> ok let me try [18:49]
***second has joined #archiveteam-bs [18:50]
<SketchCow> We just had the Reckoning of All Reckonings about uploading WARCs to the Archive
Impromptu 8 person meeting on the slack, it was glorious
[18:51]
***Valentine has joined #archiveteam-bs [18:51]
<SketchCow> So, the default is that not just anyone can add WARCs to the archive and have them automatically go into Wayback. That'll happen only if I clear off an account's uploads. [18:52]
***Valentine has quit IRC (Read error: Connection reset by peer) [18:52]
<SketchCow> So I need to find out who a couple people are.
I don't want to drop their e-mails anywhere
[18:52]
<Darkstar> heh, seems that it indeed works, at least for putting it into open_source_software ... I still have no access to put it where it belongs though ;-) [18:52]
***Valentine has joined #archiveteam-bs [18:52]
<Darkstar> but at least it's not in "media" anymore... [18:53]
<astrid> Darkstar: i think SketchCow can help find the right collection maybe [18:53]
<Darkstar> the right collection would be "cdromsoftware" or "cd-roms" (don't know the difference between those two), and "cdinstall" for some others, but I have not yet been able to find out how to get access
I usually just wait, and after some days/weeks the items magically move to one of these collections (I think someone moves them, but they forget the occasional item from time to time ;-)
[18:54]
........ (idle for 37mn)
***qw3rty15 has quit IRC (Nettalk6 - www.ntalk.de)
AeonG_ has quit IRC (Ping timeout: 600 seconds)
[19:32]
....... (idle for 30mn)
AeonG_ has joined #archiveteam-bs [20:02]
.... (idle for 18mn)
<Harzilein> ola_norsk: i don't know about the api, but maybe they never intended to "drop" a file type? kinda like "unknown" is "nil" so it'd be removing a value instead of changing it [20:20]
***AeonG_ has quit IRC (Read error: Operation timed out)
ola_norsk has joined #archiveteam-bs
[20:25]
<ola_norsk> Harzilein: btw, would you happen to know what happens to Items flagged as "Broken" ? [20:27]
***ZexaronS has joined #archiveteam-bs [20:29]
<Harzilein> ola_norsk: as i said, i know nothing at all [20:30]
<ola_norsk> Harzilein: then we are two :D [20:31]
<Harzilein> my completion was broken and i thought you were here already. i wrote: "ola_norsk: i don't know about the api, but maybe they never intended to "drop" a file type? kinda like "unknown" is "nil" so it'd be removing a value instead of changing it" [20:31]
***AeonG_ has joined #archiveteam-bs [20:31]
<ola_norsk> ah sry. But what i notice is that the item has become more messed up/broken since i posted that issue on github. Now suddenly there's not only a few .d64 roms wrongly detected as "DV Audio" and "DV Video". there were no zip files at that time.
Internet Archive is haunted! :[
lol
[20:34]
<Harzilein> the playlist is funny :D
wonder what it is about these files that makes it detect them as dv
[20:35]
<ola_norsk> aye [20:36]
<Harzilein> ah, okay, the playlist is for "aac files" [20:36]
<godane> ola_norsk: i have a squashfs file i uploaded that thinks it's a wav file [20:36]
<Harzilein> oh, no, it's for the oggs actually [20:37]
<godane> https://archive.org/details/slackwarearm-14.2-20170906-kiwix [20:37]
<ola_norsk> what could cause it though? [20:38]
<Harzilein> for the simpsons and steve keene private spy ones, i can actually get a glitchy sound by playing and then seeking in them :D
(with the web player)
the zak one does not seem to have even valid frames ;)
[20:38]
<ola_norsk> it must be some kind of fileheader/data detection thingy that does it? [20:40]
<Harzilein> ola_norsk: so only the uploader can change the types through the web interface? [20:40]
<ola_norsk> Harzilein: it's possible to do it by using the _ia_ tool, but like the issue shows there seems to be a bug [20:41]
<Harzilein> ola_norsk: so i think what happened is that 106 files got mis-detected as aac, then of those 3 happened to contain at least one good aac frame in them
<ola_norsk> Harzilein: sadly the issue is that choosing format through the web gui for 2000+ files is something i gave up on trying to do.
[20:42]
<ola_norsk> there were nowhere near 106 a while back :D [20:43]
<Harzilein> it's just that i don't see that option anywhere in the web gui [20:43]
<ola_norsk> it's a drop down menu at every file [20:44]
<Harzilein> the good thing is that the name did not get changed. so the system saw an opportunity to provide unencumbered oggs instead
when i click on "show all" i only get the download view
[20:44]
<ola_norsk> what seems to have gone wrong is that "Format" was wrongly detected/set on some during upload.
even if name extension was the same. My hope was to set all to same "Unknown" format metadata by using
ia metadata 2813_d64_C64_roms_wwwC64com --target="Metal_Warrior_3.d64" --modify="format:Unknown"
[20:47]
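Applying the command ola_norsk quotes to a couple of thousand files by hand is impractical, so the per-file invocations could be generated instead. This is a rough sketch only: `build_format_commands` is a hypothetical helper, the flags simply mirror the command shown in the chat, and nothing is executed, since the conversation suggests this path currently runs into a bug.

```python
def build_format_commands(identifier, filenames, new_format="Unknown"):
    """Build one `ia metadata` argument list per file.

    The flags copy the command quoted in the log; whether the tool accepts
    them for every file is not verified here, so this only prepares the
    argument lists and never runs them.
    """
    commands = []
    for name in filenames:
        commands.append([
            "ia", "metadata", identifier,
            f"--target={name}",
            f"--modify=format:{new_format}",
        ])
    return commands
```

Each returned list could be passed to `subprocess.run` once the underlying issue is resolved; keeping generation and execution separate makes it easy to review the commands first.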
<Harzilein> yeah. but you mentioned that isn't possible due to an api bug.
but you also mentioned that it'd be cumbersome but possible to change the type in the web gui
[20:48]
<ola_norsk> it is [20:49]
<Harzilein> but i can't find that gui element and i assume that's because i'm not the uploader [20:49]
<ola_norsk> yeah, that is done in "Edit Item/Edit files" [20:49]
***AeonG_ has quit IRC (Read error: Operation timed out) [20:50]
<ola_norsk> if you check one of your own uploaded items, by clicking "Edit" > "Edit metadata", it will be on the bottom of that page. [20:51]
***Asparagir has joined #archiveteam-bs [20:54]
<ola_norsk> to be fair though, Internet Archive does have a warning against items containing >1000 files i think :D [20:54]
<astrid> yeah it hard limits you somewhere around 1000 i think [20:55]
<Harzilein> i have a slightly different interface (i kind of get the "non-expert" wording: "Edit" > "I want to change the information (metadata) about my item.
For example, I want to change my [title] or [description]."
)
[20:55]
<ola_norsk> astrid: i didn't hard limit my item :/
astrid: it*
[20:55]
<astrid> how many files did you put into it? [20:55]
<ola_norsk> 2813 [20:56]
<astrid> ok i suspect there's a limit somewhere but idk where it is [20:56]
<ola_norsk> most seems to be ok
it seemed to me as it's just a caution notice/warning
one sec
[20:56]
***Asparagir has quit IRC (Client Quit) [20:58]
<ola_norsk> "Because items can "break" we typically recommend that you not exceed 1,000 files and/or 50GB per item page."
so i think i should've heeded that :D
that, or not have done them all in one upload session
[20:58]
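One way to stay under the recommendation quoted above is to split a large file list into batches before uploading, one batch per item. A minimal sketch, assuming a cap of 1,000 files per item; `split_into_batches` is a hypothetical helper and it only counts files, it does not check the 50GB size guidance.

```python
def split_into_batches(filenames, max_files=1000):
    """Split a list of filenames into consecutive batches of at most max_files."""
    if max_files < 1:
        raise ValueError("max_files must be at least 1")
    return [
        filenames[start:start + max_files]
        for start in range(0, len(filenames), max_files)
    ]
```

For the 2813 files mentioned in the chat this would produce three batches, each of which could become its own item.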
<Harzilein> ola_norsk: i think you should try checking "block archive.php from queueing a derive (needed only in special circumstances; if you’re unsure, leave blank)" next time you initially upload d64 images [21:00]
<ola_norsk> i figured since the files were so minuscule, it would be ok
Harzilein: doesn't that just apply to audio (and or video) files?
[21:00]
<Harzilein> ola_norsk: i'm pretty sure the files uploaded fine, but whatever tries to make oggs for the web player might have too aggressively checked the file types
ola_norsk: for all the system knows they _are_ audio files w/ a weird extension ;)
[21:01]
***AeonG_ has joined #archiveteam-bs [21:02]
<ola_norsk> i'll try that then [21:02]
<Harzilein> ola_norsk: i'm kind of curious what caused it, let me download the simpsons, zak mc cracken and steve keene private spy items [21:03]
<ola_norsk> it must have been some heuristic detection thingy that did it [21:04]
<Harzilein> perhaps there's such a thing as "raw aac frames"?
then every file that roughly fits some constraint could be a glitchy aac
[21:04]
<ola_norsk> some of the d64 did not even pass virus checking at upload
that's why there's not 2813 of them
[21:05]
***Asparagir has joined #archiveteam-bs [21:06]
<Harzilein> "In addition to the MP4, 3GP and other container formats based on ISO base media file format for file storage, AAC audio data was first packaged in a file for the MPEG-2 standard using Audio Data Interchange Format (ADIF),[43] consisting of a single header followed by the raw AAC audio data blocks." [21:06]
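To illustrate why arbitrary binary data can pass for AAC, here is a rough sketch of a byte-level sniff. It is not the Internet Archive's actual detection logic, which the chat does not identify; the function names are hypothetical. It relies on two format facts: an ADIF stream begins with the ASCII bytes `ADIF`, and an ADTS frame header begins with a 12-bit syncword of all ones followed by a layer field that is always zero.

```python
def looks_like_adif(data: bytes) -> bool:
    """True if the data starts with the ADIF magic bytes."""
    return data[:4] == b"ADIF"


def looks_like_adts_frame(data: bytes) -> bool:
    """True if the first two bytes look like the start of an ADTS header.

    Checks the 12-bit syncword (0xFFF) and the 2-bit layer field (00).
    The ID and protection bits are ignored.
    """
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xF6) == 0xF0


def count_adts_candidates(data: bytes) -> int:
    """Count byte offsets where an ADTS-like header pattern appears."""
    return sum(
        1 for i in range(len(data) - 1)
        if looks_like_adts_frame(data[i:i + 2])
    )
```

Because the check covers so few bits, a disk image of a few hundred kilobytes can easily contain matching byte pairs by chance, which fits Harzilein's guess that any file roughly meeting some constraint could be taken for glitchy AAC.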
<ola_norsk> maybe by running a diff comparison on a couple of files, including godane's slackware squashfs file, one might find where the similarity comparison failed? [21:10]
<Harzilein> well, the decoding fails really early, hence the very small derived files [21:11]
***BlueMaxim has joined #archiveteam-bs [21:11]
<Harzilein> and i'd expect the values in the derived frames to be severely clipped too
Zak_Mckracken_and_the_Alien_Mindbenders_[Boot].d64 file info:
RAW
Error: Bitstream value not allowed by specification
[21:12]
<ola_norsk> how does Internet Archive try to detect filetypes though?
what software is used, i mean
it would be lovely if it was as easy as just running queries on the item's sqlite database hehe
[21:14]
***AeonG_ has quit IRC (Ping timeout: 633 seconds) [21:18]
second has quit IRC (Read error: Connection reset by peer) [21:28]
<ola_norsk> in hindsight, i think it might've been smarter of me to use the torrent upload ability of IA on that item [21:30]
***second has joined #archiveteam-bs [21:30]
<ola_norsk> damn, even the item's torrent is broken :D [21:33]
i'll just wait and see what happens now that it's flagged as Broken item [21:41]
........ (idle for 35mn)
***schbirid has quit IRC (Quit: Leaving) [22:16]
.... (idle for 15mn)
RichardG has quit IRC (Ping timeout: 506 seconds) [22:31]
godane has quit IRC (Read error: Operation timed out) [22:36]
dashcloud has joined #archiveteam-bs
godane has joined #archiveteam-bs
[22:45]
...... (idle for 29mn)
qw3rty15 has joined #archiveteam-bs
BartoCH has quit IRC (Ping timeout: 260 seconds)
[23:16]
AeonG_ has joined #archiveteam-bs
AeonG__ has joined #archiveteam-bs
[23:27]
AeonG_ has quit IRC (Read error: Operation timed out) [23:38]
RichardG has joined #archiveteam-bs [23:46]
