#internetarchive 2018-05-04,Fri


Who | What | When
*** odemg has quit IRC (Read error: Operation timed out) [04:06]
odemg has joined #internetarchive [04:11]
............................ (idle for 2h17mn)
sahya has joined #internetarchive [06:28]
................................................................... (idle for 5h34mn)
Nemo_bis: Why does this not update existing metadata with the internetarchive library? https://github.com/WikiTeam/wikiteam/commit/5db991bfbbf10dc86a29243eceeb6aa6fd22cbd9#diff-78e323ef0b4f5972e99865640365018bR281 [12:02]
*** HCross has quit IRC (Read error: Operation timed out)
HCross has joined #internetarchive
[12:06]
JAA: Huh. That's exactly what 'ia metadata' does as well, which has always worked for me (except during a derive, which is an issue I reported to Jake just yesterday).
Unless it returns an error and that's ignored. 'ia metadata' does this error checking: https://github.com/jjjake/internetarchive/blob/27e387a7245699a1ead14e2214261bff5629333d/internetarchive/cli/ia_metadata.py#L72-L87
Item.modify_metadata does no error-checking whatsoever, it just sends the request and returns the response.
[12:10]
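[Editor's sketch] The point about Item.modify_metadata returning the raw response can be illustrated with a small check on that response. This is a minimal sketch, not the code from 'ia metadata': the names MetadataWriteError and ensure_metadata_written are hypothetical, and the JSON shape (a "success" flag plus an optional "error" string) is an assumption about what the metadata API returns.

```python
class MetadataWriteError(Exception):
    """Raised when a metadata write is not confirmed by the server."""


def ensure_metadata_written(response):
    """Return the parsed JSON body, or raise if the write was not confirmed.

    `response` is any object with `.status_code` and `.json()`, such as the
    response object that Item.modify_metadata hands back.
    """
    try:
        body = response.json()
    except ValueError:
        # The body was not valid JSON; treat it as an unconfirmed write.
        body = {}
    if not isinstance(body, dict):
        body = {}
    # Assumption: a successful write is HTTP 200 with {"success": true}.
    if response.status_code != 200 or not body.get("success"):
        reason = body.get("error", "no error message in response")
        raise MetadataWriteError(f"HTTP {response.status_code}: {reason}")
    return body


# Possible usage (needs the internetarchive package and credentials;
# the identifier is a placeholder):
#   item = internetarchive.get_item("some-identifier")
#   ensure_metadata_written(item.modify_metadata({"title": "New title"}))
```

Raising instead of returning a flag means a failed write cannot be ignored silently, which is the failure mode suspected above.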
Nemo_bis: "except during a derive" is probably the issue, since we're launching that right after the upload
I could try to set the upload to not trigger a derive, or wait some seconds
[12:13]
JAA: Even during a derive, most metadata should be fine. I've only had it happen for the 'date' field, which gets reset to the previous value when the derive finishes on mediatype:web items with WARC files. [12:15]
Nemo_bis: Ah right, I had forgotten this.
At any rate, let's disable the derive.
[12:16]
JAA: Also, it seems that it should be sufficient to specify the metadata on the upload call.
Though the documentation for Item.upload_file about the metadata parameter says "Metadata used to create a new item.", so not sure.
It's a shame that the backend isn't open source.
[12:17]
Nemo_bis: The IA S3 API had a specific option to make the new metadata override the old, but I rarely had success using it.
o to update _meta.xml do a bucket PUT with the header
x-archive-ignore-preexisting-bucket:1
this will erase the old _meta.xml and replace it with
a new _meta.xml generated from the x-archive-meta-* headers in the PUT
https://archive.org/help/abouts3.txt
[12:20]
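[Editor's sketch] The quoted abouts3.txt passage describes a bucket PUT carrying x-archive-ignore-preexisting-bucket:1 together with x-archive-meta-* headers. A minimal sketch of building that header set follows. The function name build_meta_replace_headers is hypothetical, only plain single-valued keys are handled, and no request is sent. Whether the header is still honoured is questioned further down in this log.

```python
def build_meta_replace_headers(metadata):
    """Build the headers for a bucket PUT that regenerates _meta.xml.

    Only simple string-like values are handled here; repeated fields and
    keys that need escaping are out of scope for this sketch.
    """
    # Per abouts3.txt, this header erases the old _meta.xml and replaces it
    # with one generated from the x-archive-meta-* headers of the same PUT.
    headers = {"x-archive-ignore-preexisting-bucket": "1"}
    for key, value in metadata.items():
        headers[f"x-archive-meta-{key}"] = str(value)
    return headers


if __name__ == "__main__":
    for name, value in build_meta_replace_headers(
        {"title": "Example wiki dump", "mediatype": "web"}
    ).items():
        print(f"{name}: {value}")
```

Since the old _meta.xml is replaced rather than merged, the metadata passed in would need to be the complete set, not just the changed fields.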
JAA: Yeah, just saw that.
So I guess the metadata dict is really only used on item creation.
Unless you specify that header, that is.
[12:22]
Nemo_bis: A header which I think we're no longer supposed to use since 2013, maybe https://blog.archive.org/2013/07/04/metadata-api/
The only mention I see in the repo is this curl response https://github.com/jjjake/internetarchive/issues/48#issuecomment-33986273
[12:25]
JAA: :-| The API situation at IA is really a mess...
But yeah, that metadata API is the one Item.modify_metadata uses.
[12:27]
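[Editor's sketch] Putting the two ideas from this exchange together (skip the derive, and do not rely on the upload metadata for an item that already exists), the flow might look like the sketch below. The function name upload_then_set_metadata is hypothetical. `item` is assumed to behave like internetarchive.Item, whose upload method accepts queue_derive and whose modify_metadata method goes through the metadata API.

```python
def upload_then_set_metadata(item, files, metadata):
    """Upload without queueing a derive, then write the metadata explicitly.

    `item` is expected to offer `upload(...)` and `modify_metadata(...)`
    the way internetarchive.Item does.
    """
    # queue_derive=False asks the library not to start a derive after the
    # upload, so the metadata write does not race with one.
    upload_responses = item.upload(files, metadata=metadata, queue_derive=False)
    # The upload metadata appears to be used only when the item is created,
    # so the same dict is written again through the metadata API.
    metadata_response = item.modify_metadata(metadata)
    return upload_responses, metadata_response


# Possible usage (needs the internetarchive package and credentials;
# the identifier and file name are placeholders):
#   item = internetarchive.get_item("some-identifier")
#   upload_then_set_metadata(item, ["dump.7z"], {"title": "New title"})
```

The second response is still unchecked here, so it would need the same kind of status and error inspection that 'ia metadata' performs.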
*** HCross has quit IRC (Read error: Connection reset by peer) [12:34]
HCross has joined #internetarchive [12:39]
.... (idle for 17mn)
mistym has quit IRC (Quit: ZNC - http://znc.in) [12:56]
mistym has joined #internetarchive [13:05]
HCross_ has joined #internetarchive [13:11]
HCross has quit IRC (Read error: Operation timed out)
HCross_ is now known as HCross
[13:16]
......................................................... (idle for 4h42mn)
sahya has quit IRC (Read error: Operation timed out) [17:58]
............. (idle for 1h1mn)
sahya has joined #internetarchive [18:59]
.......... (idle for 48mn)
sahya has quit IRC (Read error: Operation timed out) [19:47]
