| Time |
Nickname |
Message |
|
00:17
🔗
|
godane |
so i'm going to be reuploading the koreanet 1 chuncheon pg butitv |
|
00:18
🔗
|
godane |
thats cause i screwed up the id names |
|
00:18
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
|
00:19
🔗
|
JAA |
joepie91: I'll have to do some more testing tomorrow, but I think I managed to port it correctly! :-) |
|
00:37
🔗
|
JAA |
You're missing the pattern +!![] (== 1) in your script, by the way, and I think ++![] is a syntax error. |
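The `+!![]` pattern JAA mentions can be illustrated with a small sketch. The script under discussion is not shown in this log, so everything here is an assumption for illustration only: the function name `js_bool_sum` is hypothetical, and the grammar is deliberately narrow. It accepts only sums of `!+[]` and `!![]` terms (each of which is `true` in JavaScript), with an optional leading unary `+`, plus the special case `+[]`. Anything else is rejected, which includes `++![]`, though that is because it falls outside this grammar and not because the sketch performs a real syntax check.

```python
import re

# One term that evaluates to `true` in JavaScript: !+[] or !![]
_TERM = r"(?:!\+\[\]|!!\[\])"
# Optional unary plus, then one or more terms joined by binary plus
_SUM = re.compile(rf"\+?{_TERM}(?:\+{_TERM})*")


def js_bool_sum(expr: str) -> int:
    """Return the numeric value of a restricted JS expression.

    Hypothetical helper: only handles the narrow grammar described
    above and raises ValueError for everything else.
    """
    if expr == "+[]":
        # Unary plus on an empty array coerces to 0
        return 0
    if not _SUM.fullmatch(expr):
        raise ValueError(f"unsupported expression: {expr!r}")
    # Each term is `true`, and true + true + ... coerces to a count
    return len(re.findall(_TERM, expr))


if __name__ == "__main__":
    for sample in ("+!![]", "!+[]+!![]+!![]", "+[]"):
        print(sample, "=>", js_bool_sum(sample))
```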
|
01:26
🔗
|
|
ola_norsk has quit IRC (it's christmas! Drink your christmas beers https://youtu.be/NtV-EB8kvf8) |
|
01:43
🔗
|
|
LastNinja has quit IRC (Ping timeout: 260 seconds) |
|
01:47
🔗
|
|
ZexaronS- has joined #archiveteam-bs |
|
01:47
🔗
|
|
ZexaronS has quit IRC (Read error: Connection reset by peer) |
|
01:53
🔗
|
|
Pixi has quit IRC (Ping timeout: 255 seconds) |
|
01:55
🔗
|
|
Pixi has joined #archiveteam-bs |
|
02:33
🔗
|
|
Stilett0 is now known as Stiletto |
|
02:46
🔗
|
|
Odd0002 has quit IRC (Quit: ZNC - http://znc.in) |
|
02:53
🔗
|
|
Odd0002 has joined #archiveteam-bs |
|
03:22
🔗
|
|
ola_norsk has joined #archiveteam-bs |
|
03:24
🔗
|
ola_norsk |
if a WARC item is uploaded with 'Noindex: true' , (or even with 30-days 'test item'). Does it still go to waybackmachine? |
|
03:24
🔗
|
ola_norsk |
regardless of how freakishly incomplete it may or may not be |
|
03:29
🔗
|
Somebody2 |
ola_norsk: Only WARCs from trusted sources go into the Wayback Machine. "trusted" is a subtle and not-exactly-documented quality, though. |
|
03:33
🔗
|
ola_norsk |
ok |
|
04:03
🔗
|
|
qw3rty112 has joined #archiveteam-bs |
|
04:07
🔗
|
|
qw3rty111 has quit IRC (Read error: Operation timed out) |
|
04:14
🔗
|
|
pizzaiolo has quit IRC (pizzaiolo) |
|
04:17
🔗
|
ivan |
Somebody2: not sure this is true any more, anyone can start uploading mediatype:web into Community Texts |
|
04:17
🔗
|
ivan |
I assume you can still get blacklisted |
|
04:25
🔗
|
Somebody2 |
ivan: Hm, neat. |
|
04:25
🔗
|
Somebody2 |
ivan: have you verified that random mediatype:web items are included in the Wayback Machine, though? |
|
04:26
🔗
|
ivan |
well, mine are, but I doubt I got special treatment |
|
04:26
🔗
|
ivan |
try it and see! |
|
04:27
🔗
|
ola_norsk |
ivan: i'm not risking blacklisted :D |
|
04:28
🔗
|
ola_norsk |
ivan: e.g what could cause that, btw? |
|
04:28
🔗
|
ivan |
1) cause what? 2) I have no idea probably |
|
04:29
🔗
|
ola_norsk |
< ivan> I assume you can still get blacklisted |
|
04:29
🔗
|
ivan |
I assume if someone at IA notices your WARCs are full of fake responses |
|
04:29
🔗
|
ivan |
or ISP/DNS content swaparoos |
|
04:30
🔗
|
ola_norsk |
but e.g warcs from webrecorder should be ok i guess? |
|
04:30
🔗
|
ivan |
I would guess so |
|
04:31
🔗
|
ola_norsk |
would making them "NoIndex: true" prevent it from getting to waybackmachine? |
|
04:32
🔗
|
ola_norsk |
webrecorder makes these 'patches', i mean, that i'd rather not see on my listing |
|
04:33
🔗
|
ola_norsk |
preferably, i'd make it test items that are deleted after 30 days, as long as the warcs are processed by then |
|
04:38
🔗
|
ola_norsk |
if "Noindex: true" could prevent it being shown as item, and it still got submitted to wayback, that would be 100% nice :D |
|
04:40
🔗
|
ola_norsk |
e.g if that is/was the case, i could make an item containing the webrecorder warc, and it's 'patch' warcs i guess |
|
04:46
🔗
|
ivan |
ola_norsk: there's nothing wrong with having an item |
|
04:46
🔗
|
ola_norsk |
ivan: i have to look at it :D |
|
04:47
🔗
|
ivan |
I don't really look at my items |
|
04:47
🔗
|
ola_norsk |
ivan: and, it would be quite rudimentary, since i wouldn't really bother with topics etc |
|
04:48
🔗
|
ivan |
just tag with topic warcarchives |
|
04:51
🔗
|
ola_norsk |
then only one way to find out i guess :/ .. I just know the lack of thumbnail is going to pester me though :D |
|
04:53
🔗
|
ivan |
I can only suggest OCDing about something else |
|
04:53
🔗
|
ola_norsk |
;) |
|
04:58
🔗
|
ola_norsk |
community web is not listed in ia browser uploader, is there argument for 'ia' tool to create an item of that sort? |
|
04:59
🔗
|
ola_norsk |
i've never used 'ia' tool to create item, only upload to or alter |
|
05:00
🔗
|
ivan |
no community web, only Community Texts, and I think it'll land there by default |
|
05:00
🔗
|
ola_norsk |
k |
|
05:01
🔗
|
ivan |
#internetarchive |
|
05:01
🔗
|
|
qw3rty113 has joined #archiveteam-bs |
|
05:04
🔗
|
ola_norsk |
ill just wing it with a dummyfile with random data in a test item and see where it lands when picking text |
|
05:04
🔗
|
ivan |
aren't test items going to land in the test collection |
|
05:05
🔗
|
|
qw3rty112 has quit IRC (Read error: Operation timed out) |
|
05:09
🔗
|
ola_norsk |
i think that is a secondary entry of them |
|
05:10
🔗
|
ivan |
can't be in two collections |
|
05:10
🔗
|
ola_norsk |
ivan: https://archive.org/details/dummy_test_data |
|
05:11
🔗
|
ivan |
ah, I totally forgot |
|
05:11
🔗
|
ola_norsk |
look at this messed up thing though :D https://archive.org/details/vidme_AfterPrisonJoe |
|
05:12
🔗
|
ola_norsk |
'community data' :D |
|
05:12
🔗
|
ola_norsk |
i'm thinking that's where random warcs might really belong at :D |
|
05:13
🔗
|
ola_norsk |
it's not an option when uploading in browser though i think |
|
05:14
🔗
|
ola_norsk |
might be what happens if making an item with 'ia' tool, without specifying anything, i cant remember |
|
05:15
🔗
|
ola_norsk |
it's still in 'texts' collection, but mediatype is 'data' :DD |
|
05:18
🔗
|
ola_norsk |
when using 'ia upload <somerandomitemid> *' , i mean |
|
05:22
🔗
|
ola_norsk |
ivan: man, i just realized what you meant, i noticed first now the 'collection:' field on that dummyfile :D |
|
05:22
🔗
|
ola_norsk |
sry |
|
05:23
🔗
|
ola_norsk |
that could be 'collection: web' ? |
|
05:32
🔗
|
ola_norsk |
ivan: should i un-gz the warcs first? |
|
05:33
🔗
|
ola_norsk |
webrecorder downloads gzip'ed warcs it seems |
|
05:33
🔗
|
ivan |
ola_norsk: mediatype:web, not collection |
|
05:33
🔗
|
ivan |
don't un-gz |
|
05:33
🔗
|
ola_norsk |
k |
|
05:34
🔗
|
ivan |
no idea how to use ia but over the S3 interface it's header x-archive-meta-mediatype:web |
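The header ivan mentions can be sketched as follows. This is a minimal illustration, not a definitive client: the function name `build_ias3_put` is hypothetical, the endpoint and the `LOW accesskey:secret` authorization format reflect my understanding of the IA S3-like API, and the keys and item name are placeholders. The function only builds the URL and headers and sends nothing.

```python
from urllib.parse import quote

# Assumed endpoint of the Internet Archive S3-like API
S3_ENDPOINT = "https://s3.us.archive.org"


def build_ias3_put(identifier, filename, access_key, secret_key, metadata):
    """Return (url, headers) for a PUT that uploads a file into an item.

    Each metadata key becomes an x-archive-meta-<key> header, so
    {"mediatype": "web"} yields "x-archive-meta-mediatype: web".
    """
    url = f"{S3_ENDPOINT}/{quote(identifier)}/{quote(filename)}"
    headers = {
        # Placeholder credentials; real keys come from the IA account
        "authorization": f"LOW {access_key}:{secret_key}",
        # Ask IA to create the item if it does not exist yet
        "x-archive-auto-make-bucket": "1",
    }
    for key, value in metadata.items():
        headers[f"x-archive-meta-{key}"] = str(value)
    return url, headers


if __name__ == "__main__":
    url, headers = build_ias3_put(
        "example_item", "example.warc.gz",
        "ACCESS", "SECRET", {"mediatype": "web"},
    )
    print(url)
    for name, value in headers.items():
        print(f"{name}: {value}")
```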
|
05:39
🔗
|
ola_norsk |
i tried 'ia upload ola_norsk_AGP_warcs theangrygranpa-20171217053032.warc.gz' , and it did upload |
|
05:41
🔗
|
ola_norsk |
though, the item is (not yet) listed on my profile |
|
05:41
🔗
|
ola_norsk |
adding the later 'patch' to that item seems to go as well |
|
05:43
🔗
|
ola_norsk |
'ia ls ola_norsk_AGP_warcs' |
|
05:44
🔗
|
ola_norsk |
seems to work fine :D , it's created the same derivs as when i added a warc to another item |
|
05:45
🔗
|
ola_norsk |
and it's 'data' though :/ |
|
05:46
🔗
|
ola_norsk |
and mediatype can not be changed through web ui |
|
05:46
🔗
|
Somebody2 |
ivan: I'm pretty sure you are whitelisted -- you are certainly known. The test would be to create a new account, and upload a WARC |
|
05:46
🔗
|
Somebody2 |
from that, and see if it gets included. |
|
05:46
🔗
|
Somebody2 |
(Or we could just ask, I suppose.) |
|
05:49
🔗
|
Somebody2 |
(which I've now done) |
|
05:50
🔗
|
ola_norsk |
does it matter if 'mediatype' is set to 'web' on an item, for it to be applied to wayback ? |
|
05:50
🔗
|
Somebody2 |
Yes, I'm pretty sure mediatype:web is required. |
|
05:51
🔗
|
ola_norsk |
dang, how can https://archive.org/details/ola_norsk_AGP_warcs be changed from 'data' to 'web' ? |
|
05:52
🔗
|
ivan |
info@archive.org |
|
05:52
🔗
|
ola_norsk |
i was afraid that would be the answer :D |
|
05:53
🔗
|
ivan |
actually, has anyone tried changing mediatype over S3? is the change ignored? |
|
05:54
🔗
|
ivan |
"only Archive admins can make that change." https://archive.org/post/1064443/change-media-type |
|
05:57
🔗
|
ola_norsk |
aye, and it can not be done through web ui |
|
06:00
🔗
|
ola_norsk |
i'll type the mail later today when i've slept and am sober i think, since also the afterprisonjoe item needs changing |
|
06:03
🔗
|
ola_norsk |
for future reference, how might i specify 'mediatype: web' at upload with ia command line? |
|
06:07
🔗
|
ola_norsk |
nevermind, i realize what ivan meant by header 'x-archive-meta-mediatype:web' |
|
06:10
🔗
|
ola_norsk |
'ia upload --header=x-archive-meta-mediatype:web <item> <file>' i think |
|
06:11
🔗
|
Somebody2 |
No, I don't think that's right. |
|
06:11
🔗
|
ola_norsk |
-H, --header=<key:value>... S3 HTTP headers to send with your request. |
|
06:11
🔗
|
Somebody2 |
You should use --metadata instead |
|
06:12
🔗
|
Somebody2 |
--metadata=mediatype:web |
|
06:12
🔗
|
Somebody2 |
The header form may work, too, though. |
|
06:13
🔗
|
ola_norsk |
https://github.com/vmbrasseur/IAS3API/blob/master/headers.md |
|
06:15
🔗
|
Somebody2 |
Yeah, that suggests that either way should work. |
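The `--metadata` form Somebody2 suggests can be sketched as a small argument builder. This is an illustration under stated assumptions: the function name `build_ia_upload_argv` and the item and file names are hypothetical, and the sketch only assembles the command line for the `ia` tool without running it. Since the log notes that only Archive admins can change the mediatype afterwards, the metadata would need to be present on the upload that creates the item.

```python
import shlex


def build_ia_upload_argv(identifier, files, metadata):
    """Build an argv list for `ia upload` with --metadata options.

    Hypothetical helper: it returns the arguments and executes nothing.
    """
    argv = ["ia", "upload", identifier, *files]
    for key, value in metadata.items():
        # One --metadata=<key>:<value> option per metadata field
        argv.append(f"--metadata={key}:{value}")
    return argv


if __name__ == "__main__":
    argv = build_ia_upload_argv(
        "example_item", ["example.warc.gz"], {"mediatype": "web"}
    )
    # Print the shell-quoted command for review before running it
    print(shlex.join(argv))
    # To execute for real (requires a configured `ia` tool):
    #   import subprocess; subprocess.run(argv, check=True)
```

Building a list instead of a single string avoids shell quoting problems when file names contain spaces.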
|
06:16
🔗
|
ola_norsk |
ok |
|
06:17
🔗
|
ola_norsk |
i'm going to have to stay away from writing ia command lines i think, and just make e.g 'webiaarchive.sh' and 'videoiaarchive.sh' :D |
|
06:20
🔗
|
ola_norsk |
better yet, an 'archivefordummy.sh' using dialog :D |
|
06:21
🔗
|
Somebody2 |
ha |
|
06:21
🔗
|
Somebody2 |
Please do write them, yes. |
|
06:22
🔗
|
ola_norsk |
consider it halfassed! https://youtu.be/ATBl4qH9I54 |
|
06:23
🔗
|
ola_norsk |
...(seriously though, i'll try) |
|
06:24
🔗
|
ola_norsk |
"What type of media would you like to upload?" ..kind of thing |
|
06:25
🔗
|
|
ola_norsk has quit IRC (kicked by ICANN for internetting under the influence) |
|
06:43
🔗
|
|
RichardG_ has quit IRC (Ping timeout: 255 seconds) |
|
07:08
🔗
|
|
kimmer12 has joined #archiveteam-bs |
|
07:14
🔗
|
|
kimmer1 has quit IRC (Read error: Operation timed out) |
|
07:23
🔗
|
|
kimmer1 has joined #archiveteam-bs |
|
07:28
🔗
|
|
kimmer13 has joined #archiveteam-bs |
|
07:30
🔗
|
|
kimmer12 has quit IRC (Ping timeout: 633 seconds) |
|
07:34
🔗
|
|
kimmer1 has quit IRC (Read error: Operation timed out) |
|
07:38
🔗
|
|
kimmer1 has joined #archiveteam-bs |
|
07:44
🔗
|
|
kimmer13 has quit IRC (Ping timeout: 633 seconds) |
|
08:37
🔗
|
|
ZexaronS- has quit IRC (Read error: Connection reset by peer) |
|
08:38
🔗
|
|
ZexaronS- has joined #archiveteam-bs |
|
09:17
🔗
|
ranma |
is nforce entertainment b.v in any way related to the old(?) nforce site? |
|
09:17
🔗
|
ranma |
the one with all the NFOs |
|
09:35
🔗
|
|
schbirid has joined #archiveteam-bs |
|
09:59
🔗
|
vantec |
Seems to be, but don't see them outright saying it anywhere. |
|
11:12
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
|
11:39
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
12:13
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
|
12:17
🔗
|
|
jschwart has joined #archiveteam-bs |
|
12:33
🔗
|
|
odemg has quit IRC (Quit: Leaving) |
|
12:41
🔗
|
|
kimmer1 has quit IRC (Remote host closed the connection) |
|
12:42
🔗
|
|
kimmer1 has joined #archiveteam-bs |
|
12:46
🔗
|
|
icedice has joined #archiveteam-bs |
|
12:49
🔗
|
JAA |
ivan, Somebody2: My first (and so far only) upload a few weeks ago was included in the WM within a few hours, I believe. I don't know whether that was manually approved or not though. I'm pretty sure that the derive task ran immediately, but that doesn't really mean much I guess. |
|
12:50
🔗
|
|
ZexaronS- has quit IRC (Quit: Leaving) |
|
12:52
🔗
|
JAA |
MrRadar, Sanqui: FYI, #msgbored is open again, we managed to cycle it. (I sent you an invite, but I guess you might've missed it.) |
|
13:12
🔗
|
|
odemg has joined #archiveteam-bs |
|
13:31
🔗
|
|
LastNinja has joined #archiveteam-bs |
|
14:31
🔗
|
|
RichardG has joined #archiveteam-bs |
|
14:39
🔗
|
|
icedice2 has joined #archiveteam-bs |
|
14:46
🔗
|
|
icedice has quit IRC (Ping timeout: 506 seconds) |
|
15:20
🔗
|
|
icedice2 has quit IRC (Quit: Leaving) |
|
15:40
🔗
|
|
kimmer12 has joined #archiveteam-bs |
|
15:47
🔗
|
|
kimmer1 has quit IRC (Ping timeout: 632 seconds) |
|
17:15
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
|
17:49
🔗
|
Somebody2 |
JAA: What's your account name on IA? |
|
18:18
🔗
|
|
schbirid has joined #archiveteam-bs |
|
18:36
🔗
|
|
du_ has joined #archiveteam-bs |
|
19:12
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 248 seconds) |
|
19:12
🔗
|
|
Mateon1 has joined #archiveteam-bs |
|
19:34
🔗
|
|
kimmer12 has quit IRC (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org) |
|
19:34
🔗
|
|
kimmer1 has joined #archiveteam-bs |
|
19:34
🔗
|
|
Stilett0 has joined #archiveteam-bs |
|
19:49
🔗
|
antomatic |
Ngh. Good old ContentID. "This video has 11 seconds of grass and men and balls being kicked. Blocked worldwide!" .... |
|
20:58
🔗
|
JAA |
Somebody2: JustAnotherArchivist |
|
21:01
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
|
21:52
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
|
22:11
🔗
|
Somebody2 |
Whoops, now I'm in the right channel. |
|
22:11
🔗
|
JAA |
:-) |
|
22:44
🔗
|
|
jschwart has quit IRC (Quit: Konversation terminated!) |
|
22:52
🔗
|
|
pizzaiolo has quit IRC (pizzaiolo) |
|
22:54
🔗
|
|
RichardG_ has joined #archiveteam-bs |
|
22:54
🔗
|
godane |
biography.com video urls don't download anymore : https://pastebin.com/dRhf6y8U |
|
22:54
🔗
|
godane |
i figured i'd ask people here to see if anyone can fix it |
|
22:54
🔗
|
|
ndiddy_ has joined #archiveteam-bs |
|
22:54
🔗
|
godane |
last time i downloaded from the site was 2017-10-28 |
|
22:55
🔗
|
|
K4k_ has joined #archiveteam-bs |
|
22:57
🔗
|
|
ppsym has joined #archiveteam-bs |
|
23:02
🔗
|
|
tuluu_ has joined #archiveteam-bs |
|
23:05
🔗
|
|
RichardG has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
MrDignity has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
ndiddy has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
espes__ has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
tuluu has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
purplebot has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
PurpleSym has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
K4k has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
Rai-chan has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
i0npulse has quit IRC (se.hub irc.underworld.no) |
|
23:05
🔗
|
|
medowar has quit IRC (se.hub irc.underworld.no) |
|
23:08
🔗
|
|
espes___ has joined #archiveteam-bs |
|
23:20
🔗
|
|
ppsym is now known as PurpleSym |
|
23:45
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
|
23:46
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
|
23:58
🔗
|
|
MrDignity has joined #archiveteam-bs |