[04:06] *** odemg has quit IRC (Read error: Operation timed out)
[04:11] *** odemg has joined #internetarchive
[06:28] *** sahya has joined #internetarchive
[12:02] Why does this not update existing metadata with the internetarchive library? https://github.com/WikiTeam/wikiteam/commit/5db991bfbbf10dc86a29243eceeb6aa6fd22cbd9#diff-78e323ef0b4f5972e99865640365018bR281
[12:06] *** HCross has quit IRC (Read error: Operation timed out)
[12:07] *** HCross has joined #internetarchive
[12:10] Huh. That's exactly what 'ia metadata' does as well, which has always worked for me (except during a derive, which is an issue I've just reported to Jake yesterday).
[12:11] Unless it returns an error and that's ignored. 'ia metadata' does this error checking: https://github.com/jjjake/internetarchive/blob/27e387a7245699a1ead14e2214261bff5629333d/internetarchive/cli/ia_metadata.py#L72-L87
[12:12] Item.modify_metadata does no error-checking whatsoever, it just sends the request and returns the response.
[12:13] "except during a derive" is probably the issue, since we're launching that right after the upload
[12:13] I could try to set the upload to not trigger a derive, or wait some seconds
[12:15] Even during a derive, most metadata should be fine. I've only had it happen for the 'date' field, which gets reset to the previous value when the derive finishes on mediatype:web items with WARC files.
[12:16] Ah right, I had forgotten this.
[12:17] At any rate, let's disable the derive.
[12:17] Also, it seems that it should be sufficient to specify the metadata on the upload call.
[12:18] Though the documentation for Item.upload_file about the metadata parameter says "Metadata used to create a new item.", so not sure.
[12:18] It's a shame that the backend isn't open source.
[12:20] The IA S3 API had a specific option to make the new metadata override the old, but I rarely had success using it.
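Since Item.modify_metadata only returns the response, the caller has to inspect it. A minimal sketch of such a check, loosely modeled on what the linked 'ia metadata' code does. The helper name check_metadata_response is hypothetical, and the JSON shape (a "success" flag and an "error" string) and the "no changes" wording are assumptions taken from that CLI code, not a documented contract.

```python
def check_metadata_response(response):
    """Return (ok, message) for a response returned by Item.modify_metadata.

    Assumes the body is JSON with a boolean "success" key and, on failure,
    an "error" string (assumption based on the 'ia metadata' CLI code).
    """
    try:
        body = response.json()
    except ValueError:
        # The body was not JSON at all; report the HTTP status instead.
        return False, "non-JSON response (HTTP %s)" % response.status_code

    if body.get("success"):
        return True, "metadata updated"

    error = str(body.get("error", "unknown error"))
    # Treat a "no changes" answer as a harmless no-op rather than a failure.
    if "no changes" in error.lower():
        return True, "no changes to apply"
    return False, error


# Possible usage with the real library (needs network access and credentials;
# "some-identifier" is a placeholder):
#
#   import internetarchive
#   item = internetarchive.get_item("some-identifier")
#   ok, message = check_metadata_response(item.modify_metadata({"title": "New title"}))
#   if not ok:
#       raise RuntimeError(message)
```

Returning a tuple instead of raising keeps the decision about retrying (for example after a derive has finished) with the caller.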
[12:22] o to update _meta.xml do a bucket PUT with the header
[12:22] x-archive-ignore-preexisting-bucket:1
[12:22] this will erase the old _meta.xml and replace it with
[12:22] a new _meta.xml generated from the x-archive-meta-* headers in the PUT
[12:22] https://archive.org/help/abouts3.txt
[12:22] Yeah, just saw that.
[12:23] So I guess the metadata dict is really only used on item creation.
[12:23] Unless you specify that header, that is.
[12:25] Header which I think we're no longer supposed to use since 2013, maybe https://blog.archive.org/2013/07/04/metadata-api/
[12:26] The only mention I see in the repo is this curl response https://github.com/jjjake/internetarchive/issues/48#issuecomment-33986273
[12:27] :-| The API situation at IA is really a mess...
[12:27] But yeah, that metadata API is the one Item.modify_metadata uses.
[12:34] *** HCross has quit IRC (Read error: Connection reset by peer)
[12:39] *** HCross has joined #internetarchive
[12:56] *** mistym has quit IRC (Quit: ZNC - http://znc.in)
[13:05] *** mistym has joined #internetarchive
[13:11] *** HCross_ has joined #internetarchive
[13:16] *** HCross has quit IRC (Read error: Operation timed out)
[13:16] *** HCross_ is now known as HCross
[17:58] *** sahya has quit IRC (Read error: Operation timed out)
[18:59] *** sahya has joined #internetarchive
[19:47] *** sahya has quit IRC (Read error: Operation timed out)
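For reference, the bucket PUT described in the abouts3.txt quote above could be prepared roughly as follows. This is only a sketch: build_s3_headers and its parameters are hypothetical names, it handles single-valued metadata fields only, and the "LOW accesskey:secret" authorization form and the x-archive-queue-derive header are taken from abouts3.txt as I understand it. As noted in the log, this route may be superseded by the Metadata API.

```python
def build_s3_headers(metadata, access_key, secret_key,
                     replace_existing=False, queue_derive=True):
    """Build request headers for an IA S3 bucket PUT.

    metadata: dict of single-valued fields, e.g. {"title": "Example"}.
    replace_existing: if True, ask IA to regenerate _meta.xml from the
        x-archive-meta-* headers instead of keeping the old one.
    queue_derive: if False, ask IA not to queue a derive after the PUT.
    """
    headers = {"authorization": "LOW %s:%s" % (access_key, secret_key)}

    # One x-archive-meta-<field> header per metadata field.
    for key, value in metadata.items():
        headers["x-archive-meta-%s" % key] = str(value)

    if replace_existing:
        headers["x-archive-ignore-preexisting-bucket"] = "1"
    if not queue_derive:
        headers["x-archive-queue-derive"] = "0"
    return headers


# Possible usage (needs the third-party requests package, network access and
# real keys; "some-identifier" is a placeholder):
#
#   import requests
#   headers = build_s3_headers({"title": "Example"}, "ACCESS", "SECRET",
#                              replace_existing=True, queue_derive=False)
#   requests.put("https://s3.us.archive.org/some-identifier", headers=headers)
```

Building the headers separately from sending the request makes it easy to print and inspect them before touching a real item.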