Time |
Nickname |
Message |
00:17
🔗
|
godane |
so i'm going to be reuploading the koreanet 1 chuncheon pg butitv |
00:18
🔗
|
godane |
thats cause i screwed up the id names |
00:18
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
00:19
🔗
|
JAA |
joepie91: I'll have to do some more testing tomorrow, but I think I managed to port it correctly! :-) |
00:37
🔗
|
JAA |
You're missing the pattern +!![] (== 1) in your script, by the way, and I think ++![] is a syntax error. |
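For readers unfamiliar with the patterns JAA mentions: in JavaScript, `!![]` is `true` and `+!![]` coerces it to `1`, while `++![]` fails to parse. The following is only a minimal sketch of how such atoms could be summed, assuming a plain string-matching approach; the function name `sum_atoms` and the `ATOMS` table are hypothetical and are not taken from the script under discussion.

```python
# Hypothetical table of JSFuck-style atoms and the number each contributes
# when JavaScript adds them together. Longest pattern first, so "+!![]"
# is matched as one atom rather than "+" followed by "!![]".
ATOMS = [
    ("+!![]", 1),  # +true -> 1 (the pattern mentioned in the log)
    ("!+[]", 1),   # !0 -> true, which counts as 1 in a numeric sum
    ("!![]", 1),   # true, which counts as 1 in a numeric sum
]


def sum_atoms(expr: str) -> int:
    """Sum a '+'-joined chain of known atoms; raise ValueError otherwise."""
    total = 0
    pos = 0
    need_atom = True  # alternate between expecting an atom and a joining '+'
    while pos < len(expr):
        if need_atom:
            for pattern, value in ATOMS:
                if expr.startswith(pattern, pos):
                    total += value
                    pos += len(pattern)
                    need_atom = False
                    break
            else:
                raise ValueError(f"unrecognised pattern at offset {pos}: {expr[pos:]!r}")
        else:
            if expr[pos] != "+":
                raise ValueError(f"expected '+' at offset {pos}")
            pos += 1
            need_atom = True
    if need_atom:
        raise ValueError("expression is empty or ends with a dangling '+'")
    return total
```

This is not a JavaScript parser: it only recognises the three atoms listed, and anything else, including `++![]`, is rejected with a `ValueError`.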
01:26
🔗
|
|
ola_norsk has quit IRC (it's christmas! Drink your christmas beers https://youtu.be/NtV-EB8kvf8) |
01:43
🔗
|
|
LastNinja has quit IRC (Ping timeout: 260 seconds) |
01:47
🔗
|
|
ZexaronS- has joined #archiveteam-bs |
01:47
🔗
|
|
ZexaronS has quit IRC (Read error: Connection reset by peer) |
01:53
🔗
|
|
Pixi has quit IRC (Ping timeout: 255 seconds) |
01:55
🔗
|
|
Pixi has joined #archiveteam-bs |
02:33
🔗
|
|
Stilett0 is now known as Stiletto |
02:46
🔗
|
|
Odd0002 has quit IRC (Quit: ZNC - http://znc.in) |
02:53
🔗
|
|
Odd0002 has joined #archiveteam-bs |
03:22
🔗
|
|
ola_norsk has joined #archiveteam-bs |
03:24
🔗
|
ola_norsk |
if a WARC item is uploaded with 'Noindex: true' , (or even with 30-days 'test item'). Does it still go to waybackmachine? |
03:24
🔗
|
ola_norsk |
regardless of how freakishly incomplete it may or may not be |
03:29
🔗
|
Somebody2 |
ola_norsk: Only WARCs from trusted sources go into the Wayback Machine. "trusted" is a subtle and not-exactly-documented quality, though. |
03:33
🔗
|
ola_norsk |
ok |
04:03
🔗
|
|
qw3rty112 has joined #archiveteam-bs |
04:07
🔗
|
|
qw3rty111 has quit IRC (Read error: Operation timed out) |
04:14
🔗
|
|
pizzaiolo has quit IRC (pizzaiolo) |
04:17
🔗
|
ivan |
Somebody2: not sure this is true any more, anyone can start uploading mediatype:web into Community Texts |
04:17
🔗
|
ivan |
I assume you can still get blacklisted |
04:25
🔗
|
Somebody2 |
ivan: Hm, neat. |
04:25
🔗
|
Somebody2 |
ivan: have you verified that random mediatype:web items are included in the Wayback Machine, though? |
04:26
🔗
|
ivan |
well, mine are, but I doubt I got special treatment |
04:26
🔗
|
ivan |
try it and see! |
04:27
🔗
|
ola_norsk |
ivan: i'm not risking blacklisted :D |
04:28
🔗
|
ola_norsk |
ivan: e.g what could cause that, btw? |
04:28
🔗
|
ivan |
1) cause what? 2) I have no idea probably |
04:29
🔗
|
ola_norsk |
< ivan> I assume you can still get blacklisted |
04:29
🔗
|
ivan |
I assume if someone at IA notices your WARCs are full of fake responses |
04:29
🔗
|
ivan |
or ISP/DNS content swaparoos |
04:30
🔗
|
ola_norsk |
but e.g warcs from webrecorder should be ok i guess? |
04:30
🔗
|
ivan |
I would guess so |
04:31
🔗
|
ola_norsk |
would making them "NoIndex: true" prevent it from getting to waybackmachine? |
04:32
🔗
|
ola_norsk |
webrecorder makes these 'patches', i mean, that i'd rather not see on my listing |
04:33
🔗
|
ola_norsk |
preferably, i'd make it test items that are deleted after 30 days, as long as the warcs are processed by then |
04:38
🔗
|
ola_norsk |
if "Noindex: true" could prevent it being shown as item, and it still got submitted to wayback, that would be 100% nice :D |
04:40
🔗
|
ola_norsk |
e.g if that is/was the case, i could make an item containing the webrecorder warc, and its 'patch' warcs i guess |
04:46
🔗
|
ivan |
ola_norsk: there's nothing wrong with having an item |
04:46
🔗
|
ola_norsk |
ivan: i have to look at it :D |
04:47
🔗
|
ivan |
I don't really look at my items |
04:47
🔗
|
ola_norsk |
ivan: and, it would be quite rudimentary, since i wouldn't really bother with topics etc |
04:48
🔗
|
ivan |
just tag with topic warcarchives |
04:51
🔗
|
ola_norsk |
then only one way to find out i guess :/ .. I just know the lack of thumbnail is going to pester me though :D |
04:53
🔗
|
ivan |
I can only suggest OCDing about something else |
04:53
🔗
|
ola_norsk |
;) |
04:58
🔗
|
ola_norsk |
community web is not listed in ia browser uploader, is there argument for 'ia' tool to create an item of that sort? |
04:59
🔗
|
ola_norsk |
i've never used 'ia' tool to create item, only upload to or alter |
05:00
🔗
|
ivan |
no community web, only Community Texts, and I think it'll land there by default |
05:00
🔗
|
ola_norsk |
k |
05:01
🔗
|
ivan |
#internetarchive |
05:01
🔗
|
|
qw3rty113 has joined #archiveteam-bs |
05:04
🔗
|
ola_norsk |
ill just wing it with a dummyfile with random data in a test item and see where it lands when picking text |
05:04
🔗
|
ivan |
aren't test items going to land in the test collection |
05:05
🔗
|
|
qw3rty112 has quit IRC (Read error: Operation timed out) |
05:09
🔗
|
ola_norsk |
i think that is a secondary entry of them |
05:10
🔗
|
ivan |
can't be in two collections |
05:10
🔗
|
ola_norsk |
ivan: https://archive.org/details/dummy_test_data |
05:11
🔗
|
ivan |
ah, I totally forgot |
05:11
🔗
|
ola_norsk |
look at this messed up thing though :D https://archive.org/details/vidme_AfterPrisonJoe |
05:12
🔗
|
ola_norsk |
'community data' :D |
05:12
🔗
|
ola_norsk |
i'm thinking that's where random warcs might really belong at :D |
05:13
🔗
|
ola_norsk |
it's not an option when uploading in browser though i think |
05:14
🔗
|
ola_norsk |
might be what happens if making an item with 'ia' tool, without specifying anything, i cant remember |
05:15
🔗
|
ola_norsk |
it's still in 'texts' collection, but mediatype is 'data' :DD |
05:18
🔗
|
ola_norsk |
when using 'ia upload <somerandomitemid> *' , i mean |
05:22
🔗
|
ola_norsk |
ivan: man, i just realized what you meant, i noticed first now the 'collection:' field on that dummyfile :D |
05:22
🔗
|
ola_norsk |
sry |
05:23
🔗
|
ola_norsk |
that could be 'collection: web' ? |
05:32
🔗
|
ola_norsk |
ivan: should i un-gz the warcs first? |
05:33
🔗
|
ola_norsk |
webrecorder downloads gzip'ed warcs it seems |
05:33
🔗
|
ivan |
ola_norsk: mediatype:web, not collection |
05:33
🔗
|
ivan |
don't un-gz |
05:33
🔗
|
ola_norsk |
k |
05:34
🔗
|
ivan |
no idea how to use ia but over the S3 interface it's header x-archive-meta-mediatype:web |
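The S3-style header ivan mentions can be illustrated with a minimal sketch that only builds the URL and headers for an upload and sends nothing. It assumes the publicly documented IAS3 endpoint `s3.us.archive.org` and `LOW access:secret` authorization; the function name `ias3_upload_request` and the key values in the example are placeholders, not anything used in the log.

```python
from urllib.parse import quote


def ias3_upload_request(identifier, filename, access_key, secret_key, mediatype="web"):
    """Return (url, headers) for an IAS3 PUT that sets the item's mediatype.

    Nothing is sent here; the caller would PUT the file body to `url`
    with `headers` using an HTTP client of their choice.
    """
    url = f"https://s3.us.archive.org/{quote(identifier)}/{quote(filename)}"
    headers = {
        # IAS3 uses its own "LOW" scheme with the account's S3 keys.
        "authorization": f"LOW {access_key}:{secret_key}",
        # Create the item if it does not exist yet.
        "x-archive-auto-make-bucket": "1",
        # The header from the log: item metadata field "mediatype".
        "x-archive-meta-mediatype": mediatype,
    }
    return url, headers


if __name__ == "__main__":
    url, headers = ias3_upload_request(
        "ola_norsk_AGP_warcs",
        "theangrygranpa-20171217053032.warc.gz",
        "ACCESS_KEY_PLACEHOLDER",
        "SECRET_KEY_PLACEHOLDER",
    )
    print(url)
    print(headers["x-archive-meta-mediatype"])
```

Keeping the request construction separate from the actual upload makes it easy to inspect the headers before anything touches a real item.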
05:39
🔗
|
ola_norsk |
i tried 'ia upload ola_norsk_AGP_warcs theangrygranpa-20171217053032.warc.gz' , and it did upload |
05:41
🔗
|
ola_norsk |
though, the item is (not yet) listed on my profile |
05:41
🔗
|
ola_norsk |
adding the later 'patch' to that item seems to go as well |
05:43
🔗
|
ola_norsk |
'ia ls ola_norsk_AGP_warcs' |
05:44
🔗
|
ola_norsk |
seems to work fine :D , it's created the same derivs as when i added a warc to another item |
05:45
🔗
|
ola_norsk |
and it's 'data' though :/ |
05:46
🔗
|
ola_norsk |
and mediatype can not be changed through web ui |
05:46
🔗
|
Somebody2 |
ivan: I'm pretty sure you are whitelisted -- you are certainly known. The test would be to create a new account, and upload a WARC |
05:46
🔗
|
Somebody2 |
from that, and see if it gets included. |
05:46
🔗
|
Somebody2 |
(Or we could just ask, I suppose.) |
05:49
🔗
|
Somebody2 |
(which I've now done) |
05:50
🔗
|
ola_norsk |
does it matter if 'mediatype' is set to 'web' on an item, for it to be applied to wayback ? |
05:50
🔗
|
Somebody2 |
Yes, I'm pretty sure mediatype:web is required. |
05:51
🔗
|
ola_norsk |
dang, how can https://archive.org/details/ola_norsk_AGP_warcs be changed from 'data' to 'web' ? |
05:52
🔗
|
ivan |
info@archive.org |
05:52
🔗
|
ola_norsk |
i was afraid that would be the answer :D |
05:53
🔗
|
ivan |
actually, has anyone tried changing mediatype over S3? is the change ignored? |
05:54
🔗
|
ivan |
"only Archive admins can make that change." https://archive.org/post/1064443/change-media-type |
05:57
🔗
|
ola_norsk |
aye, and it can not be done through web ui |
06:00
🔗
|
ola_norsk |
ill type the mail later today when i've slept and am sober i think, since also the afterprisonjoe item needs changing |
06:03
🔗
|
ola_norsk |
for future reference, how might i specify 'mediatype: web' at upload with ia command line? |
06:07
🔗
|
ola_norsk |
nevermind, i realize what ivan meant by header 'x-archive-meta-mediatype:web' |
06:10
🔗
|
ola_norsk |
'ia upload --header=x-archive-meta-mediatype:web <item> <file>' i think |
06:11
🔗
|
Somebody2 |
No, I don't think that's right. |
06:11
🔗
|
ola_norsk |
-H, --header=<key:value>... S3 HTTP headers to send with your request. |
06:11
🔗
|
Somebody2 |
You should use --metadata instead |
06:12
🔗
|
Somebody2 |
--metadata=mediatype:web |
06:12
🔗
|
Somebody2 |
The header form may work, too, though. |
06:13
🔗
|
ola_norsk |
https://github.com/vmbrasseur/IAS3API/blob/master/headers.md |
06:15
🔗
|
Somebody2 |
Yeah, that suggests that either way should work. |
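The two forms discussed here can be put side by side in a small sketch that only assembles the `ia upload` argument list and does not run it. The `--metadata` and `--header` options are the ones quoted in the log; the helper name `ia_upload_command` and the item and file names in the example are assumptions for illustration.

```python
import shlex


def ia_upload_command(identifier, files, mediatype="web", use_header=False):
    """Build the argument list for `ia upload` with the mediatype set.

    use_header=False -> --metadata=mediatype:<value>
    use_header=True  -> --header=x-archive-meta-mediatype:<value>
    """
    if use_header:
        option = f"--header=x-archive-meta-mediatype:{mediatype}"
    else:
        option = f"--metadata=mediatype:{mediatype}"
    return ["ia", "upload", identifier, *files, option]


if __name__ == "__main__":
    # Hypothetical item and file names, printed as a copy-pasteable command.
    cmd = ia_upload_command("example_item_warcs", ["example.warc.gz"])
    print(shlex.join(cmd))
```

Since, per the discussion above, only Archive admins can change the mediatype afterwards, the option has to be present on the upload that creates the item.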
06:16
🔗
|
ola_norsk |
ok |
06:17
🔗
|
ola_norsk |
i'm going to have to stay away from writing ia command lines i think, and just make e.g 'webiaarchive.sh' and 'videoiaarchive.sh' :D |
06:20
🔗
|
ola_norsk |
better yet, an 'archivefordummy.sh' using dialog :D |
06:21
🔗
|
Somebody2 |
ha |
06:21
🔗
|
Somebody2 |
Please do write them, yes. |
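The wrapper idea floated here could start from something as small as a lookup from a menu choice to a mediatype value. This is only a sketch of that idea, not ola_norsk's script: the menu labels, the `MEDIA_MENU` table, and both function names are hypothetical, and only four of the Internet Archive's mediatype values are listed.

```python
# Hypothetical menu: choice key -> (label shown to the user, IA mediatype value).
MEDIA_MENU = {
    "1": ("Web archive (WARC)", "web"),
    "2": ("Video", "movies"),
    "3": ("Audio", "audio"),
    "4": ("Text", "texts"),
}


def menu_text():
    """Render the question and the numbered choices as one string."""
    lines = ["What type of media would you like to upload?"]
    for key, (label, _mediatype) in MEDIA_MENU.items():
        lines.append(f"  {key}) {label}")
    return "\n".join(lines)


def mediatype_for(choice):
    """Map a menu choice to its mediatype; raise ValueError on unknown input."""
    try:
        return MEDIA_MENU[choice.strip()][1]
    except KeyError:
        raise ValueError(f"unknown choice: {choice!r}") from None


if __name__ == "__main__":
    print(menu_text())
    print(mediatype_for(input("> ")))
```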
06:22
🔗
|
ola_norsk |
consider it halfassed! https://youtu.be/ATBl4qH9I54 |
06:23
🔗
|
ola_norsk |
...(seriously though, i'll try) |
06:24
🔗
|
ola_norsk |
"What type of media would you like to upload?" ..kind of thing |
06:25
🔗
|
|
ola_norsk has quit IRC (kicked by ICANN for internetting under the influence) |
06:43
🔗
|
|
RichardG_ has quit IRC (Ping timeout: 255 seconds) |
07:08
🔗
|
|
kimmer12 has joined #archiveteam-bs |
07:14
🔗
|
|
kimmer1 has quit IRC (Read error: Operation timed out) |
07:23
🔗
|
|
kimmer1 has joined #archiveteam-bs |
07:28
🔗
|
|
kimmer13 has joined #archiveteam-bs |
07:30
🔗
|
|
kimmer12 has quit IRC (Ping timeout: 633 seconds) |
07:34
🔗
|
|
kimmer1 has quit IRC (Read error: Operation timed out) |
07:38
🔗
|
|
kimmer1 has joined #archiveteam-bs |
07:44
🔗
|
|
kimmer13 has quit IRC (Ping timeout: 633 seconds) |
08:37
🔗
|
|
ZexaronS- has quit IRC (Read error: Connection reset by peer) |
08:38
🔗
|
|
ZexaronS- has joined #archiveteam-bs |
09:17
🔗
|
ranma |
is nforce entertainment b.v in any way related to the old(?) nforce site? |
09:17
🔗
|
ranma |
the one with all the NFOs |
09:35
🔗
|
|
schbirid has joined #archiveteam-bs |
09:59
🔗
|
vantec |
Seems to be, but don't see them outright saying it anywhere. |
11:12
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
11:39
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
12:13
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
12:17
🔗
|
|
jschwart has joined #archiveteam-bs |
12:33
🔗
|
|
odemg has quit IRC (Quit: Leaving) |
12:41
🔗
|
|
kimmer1 has quit IRC (Remote host closed the connection) |
12:42
🔗
|
|
kimmer1 has joined #archiveteam-bs |
12:46
🔗
|
|
icedice has joined #archiveteam-bs |
12:49
🔗
|
JAA |
ivan, Somebody2: My first (and so far only) upload a few weeks ago was included in the WM within a few hours, I believe. I don't know whether that was manually approved or not though. I'm pretty sure that the derive task ran immediately, but that doesn't really mean much I guess. |
12:50
🔗
|
|
ZexaronS- has quit IRC (Quit: Leaving) |
12:52
🔗
|
JAA |
MrRadar, Sanqui: FYI, #msgbored is open again, we managed to cycle it. (I sent you an invite, but I guess you might've missed it.) |
13:12
🔗
|
|
odemg has joined #archiveteam-bs |
13:31
🔗
|
|
LastNinja has joined #archiveteam-bs |
14:31
🔗
|
|
RichardG has joined #archiveteam-bs |
14:39
🔗
|
|
icedice2 has joined #archiveteam-bs |
14:46
🔗
|
|
icedice has quit IRC (Ping timeout: 506 seconds) |
15:20
🔗
|
|
icedice2 has quit IRC (Quit: Leaving) |
15:40
🔗
|
|
kimmer12 has joined #archiveteam-bs |
15:47
🔗
|
|
kimmer1 has quit IRC (Ping timeout: 632 seconds) |
17:15
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
17:49
🔗
|
Somebody2 |
JAA: What's your account name on IA? |
18:18
🔗
|
|
schbirid has joined #archiveteam-bs |
18:36
🔗
|
|
du_ has joined #archiveteam-bs |
19:12
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 248 seconds) |
19:12
🔗
|
|
Mateon1 has joined #archiveteam-bs |
19:34
🔗
|
|
kimmer12 has quit IRC (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org) |
19:34
🔗
|
|
kimmer1 has joined #archiveteam-bs |
19:34
🔗
|
|
Stilett0 has joined #archiveteam-bs |
19:49
🔗
|
antomatic |
Ngh. Good old ContentID. "This video has 11 seconds of grass and men and balls being kicked. Blocked worldwide!" .... |
20:58
🔗
|
JAA |
Somebody2: JustAnotherArchivist |
21:01
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
21:52
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
22:11
🔗
|
Somebody2 |
Whoops, now I'm in the right channel. |
22:11
🔗
|
JAA |
:-) |
22:44
🔗
|
|
jschwart has quit IRC (Quit: Konversation terminated!) |
22:52
🔗
|
|
pizzaiolo has quit IRC (pizzaiolo) |
22:54
🔗
|
|
RichardG_ has joined #archiveteam-bs |
22:54
🔗
|
godane |
biography.com video urls don't download anymore : https://pastebin.com/dRhf6y8U |
22:54
🔗
|
godane |
i figured i'd ask people here to see if anyone can fix it |
22:54
🔗
|
|
ndiddy_ has joined #archiveteam-bs |
22:54
🔗
|
godane |
last time i downloaded from the site was 2017-10-28 |
22:55
🔗
|
|
K4k_ has joined #archiveteam-bs |
22:57
🔗
|
|
ppsym has joined #archiveteam-bs |
23:02
🔗
|
|
tuluu_ has joined #archiveteam-bs |
23:05
🔗
|
|
RichardG has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
MrDignity has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
ndiddy has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
espes__ has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
tuluu has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
purplebot has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
PurpleSym has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
K4k has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
Rai-chan has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
i0npulse has quit IRC (se.hub irc.underworld.no) |
23:05
🔗
|
|
medowar has quit IRC (se.hub irc.underworld.no) |
23:08
🔗
|
|
espes___ has joined #archiveteam-bs |
23:20
🔗
|
|
ppsym is now known as PurpleSym |
23:45
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
23:46
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:58
🔗
|
|
MrDignity has joined #archiveteam-bs |