Time |
Nickname |
Message |
00:00
|
Darkstar |
the error was "500 internal server error" |
00:00
|
Darkstar |
I get the feeling that it has to do with the item I'm trying to upload (or one specific file in it), however I uploaded many similar items just fine the last hour or so... |
00:01
|
Darkstar |
is there a blacklist on filenames or any scans that are being done while I upload (as opposed to post-processing) that I'm triggering? |
00:03
|
FalconK |
Darkstar: just out of curiosity, when you look at http://catalogd.archive.org/catalog.php?justme=1 do you have a ton of outstanding jobs? |
00:04
|
FalconK |
it looks like things have been sluggish with lots of failed disks this week |
00:04
|
Darkstar |
yeah, 2 jobs in queue (one waiting, one running), but they were from ~2h ago and I uploaded a few other items after that just fine... |
00:05
|
FalconK |
could be that there are globally too many jobs too |
00:05
|
FalconK |
somewhere in the S3 documentation is a place you can check your quota |
00:05
|
xmc |
i've noticed derives taking a really long time lately |
00:05
|
FalconK |
also in the archivebot uploader there is code to do it |
00:05
|
Darkstar |
I'm wondering because it seems to trigger only with that (slightly bigger) item I'm uploading (and by "bigger" I mean ~220 mb, nothing too fancy) |
00:05
|
FalconK |
derives have been taking a long time all month |
00:06
|
xmc |
yep |
00:06
|
Darkstar |
the item with the 2 jobs in my queue is ~23mb so I'm wondering why it takes so long |
00:06
|
|
kristian_ has joined #archiveteam |
00:06
|
xmc |
computers are tired |
00:06
|
FalconK |
Darkstar: probably not - I upload 2-3GB files probably 50 times a day at least and it keeps chugging away. |
00:06
|
Darkstar |
I uploaded a ~200mb one after that and that one seems to have cleared the queue already |
00:07
|
Darkstar |
that's why I was wondering if it's maybe something with the item. but it's just a ZIP file, a few JPGs and a few disk image files (of which I uploaded a few already today without problems) |
00:09
|
Darkstar |
I'll try uploading a different (smaller) item ... |
00:11
|
|
nertzy has joined #archiveteam |
00:14
|
Darkstar |
okay, that worked... |
00:15
|
Darkstar |
strange. I don't get it... |
00:19
|
Darkstar |
ok, let's try again |
00:20
|
Darkstar |
I changed the order of the files in the uploader now, maybe I can find out if it stops at a specific file or something |
00:21
|
Darkstar |
if it doesn't work I'll try again tomorrow by uploading only one file and adding the others one by one later... |
00:24
|
Darkstar |
strange. now it worked |
00:25
|
Darkstar |
oh well. probably cosmic rays or something :) |
00:25
|
Darkstar |
I'll keep an eye on my job queue though, to see if the stuck tasks are getting through by tomorrow. otherwise I'll probably re-derive that one item |
00:25
|
|
i336_ has joined #archiveteam |
00:28
|
FalconK |
oh, if you aren't setting the content-length and content-md5 headers, do so |
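For anyone scripting uploads instead of using the web uploader, here is a minimal sketch of computing the two headers FalconK mentions. Content-MD5 is the base64 encoding of the raw MD5 digest (per RFC 1864), not the hex string. The helper name `upload_headers` is made up for illustration, and how the values get attached to the request depends on whichever HTTP client is used.

```python
import base64
import hashlib
import os


def upload_headers(path, chunk_size=1024 * 1024):
    """Return Content-Length and Content-MD5 header values for a file.

    Hypothetical helper, not taken from any uploader mentioned in this log.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return {
        "Content-Length": str(os.path.getsize(path)),
        # Base64 of the raw 16-byte digest, not of the hex representation.
        "Content-MD5": base64.b64encode(md5.digest()).decode("ascii"),
    }
```

Sending both values lets the server reject a truncated or corrupted body instead of storing it silently.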
00:31
|
Darkstar |
huh? I'm using the web-based uploader, I would think that it sets these headers correctly by default? |
00:31
|
FalconK |
aah, likely |
00:31
|
FalconK |
I guess the web uploader uses the S3 API now then. cool. |
00:33
|
|
Kironide has joined #archiveteam |
00:35
|
Darkstar |
yeah, looks like it (from the error message). I have yet to find a nice (and usable) tool to upload items from the command line |
00:35
|
Kironide |
are there any good tools available for backing up your Facebook activity? |
00:35
|
Kironide |
- the official tool doesn't download everything (e.g. your comments in other places) |
00:36
|
Kironide |
- digi.me / socialsafe is just horrible to use |
00:36
|
Darkstar |
they all require me to either specify the metadata in some archaic CSV file (where it's not at all clear how to use cr/lf), or plain just don't work |
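On the cr/lf question, a minimal sketch of how standard CSV quoting carries a multi-line value: a field containing a line break is wrapped in double quotes and the break stays inside the quotes. The column names and values below are invented for illustration, and whether a particular upload tool accepts multi-line fields like this is an assumption that would need checking against that tool.

```python
import csv
import io

# Hypothetical metadata row; identifiers and column names are illustrative only.
rows = [
    {
        "identifier": "example-item-1",
        "title": "Example disk image",
        "description": "First line of the description.\r\nSecond line.",
    },
]

# newline="" stops Python from translating line endings on its own,
# which the csv module documentation recommends for CSV files.
buf = io.StringIO(newline="")
writer = csv.DictWriter(buf, fieldnames=["identifier", "title", "description"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()

# Reading the text back returns the embedded CR/LF unchanged.
parsed = list(csv.DictReader(io.StringIO(csv_text, newline="")))
```

Opening real files with `newline=""` matters for the same reason: otherwise the platform's newline translation can alter the breaks inside quoted fields.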
00:39
|
FalconK |
Darkstar: consider basing one off of my IA S3 uploader in github.com/falconkirtaran/ArchiveBot |
00:57
|
|
nertzy has quit IRC (Read error: Operation timed out) |
01:16
|
|
ravetcofx has joined #archiveteam |
01:43
|
|
pizzaiolo has quit IRC (Remote host closed the connection) |
01:52
|
|
Swizzle has joined #archiveteam |
01:54
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
01:58
|
|
dashcloud has joined #archiveteam |
01:59
|
|
odemg has quit IRC (Remote host closed the connection) |
01:59
|
|
odemg has joined #archiveteam |
02:03
|
|
kristian_ has quit IRC (Quit: Leaving) |
02:06
|
wp494 |
Kironide: the only thing I find bad about it is its scheduler sometimes goes off whack but I haven't seen anything other than standard api limits that turn it into shite |
02:09
|
Kironide |
wp494, I have lots of trouble exporting my data elsewhere, I have to do it in very small chunks or the program locks up |
02:09
|
Kironide |
and the export formats (CSV and PDF) are both very user-unfriendly |
02:10
|
Kironide |
presumably I could look into the database files directly, except I read online that digi.me's database files are encrypted with something that isn't your password |
02:12
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
02:13
|
Kironide |
do you know of any better solutions? I've failed to find any |
02:15
|
wp494 |
not that I've seen |
02:17
|
|
dashcloud has joined #archiveteam |
02:19
|
Kironide |
also it seems that it doesn't download comments you've made on other people's posts |
02:23
|
|
Hobart has joined #archiveteam |
02:28
|
|
VADemon has quit IRC (Quit: left4dead) |
02:31
|
|
BlueMaxim has joined #archiveteam |
02:36
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
02:41
|
|
dashcloud has joined #archiveteam |
03:09
|
|
Swizzle has quit IRC (Quit: Leaving) |
03:11
|
|
RetroRomp has joined #archiveteam |
03:12
|
RetroRomp |
Uh... How do we notify you guys of whole sites and forums that are in danger? |
03:12
|
RetroRomp |
Anyone? |
03:13
|
dxrt |
In here |
03:13
|
dxrt |
feel free to tell us |
03:13
|
RetroRomp |
ringplus.net is an MVNO that is soon to be shut down. |
03:14
|
RetroRomp |
Currently they are migrating all of their users to another service in preparation for it, but there are several tens of thousands of messages plus their cell plans were unique. |
03:15
|
dxrt |
Ah, I just checked and we're on that already with ArchiveBot. |
03:15
|
dxrt |
You can have a look at the dashboard http://dashboard.at.ninjawedding.org/3 |
03:15
|
RetroRomp |
Great! Even social.ringplus.net? |
03:15
|
dxrt |
that specifically. |
03:16
|
RetroRomp |
Why did I even worry? Do you guys need / want the user dashboard that is only available behind an account? |
03:20
|
dxrt |
It's probably a bit out of the scope, the ArchiveBot job is just grabbing all the available public data. |
03:21
|
dxrt |
Is there much significance to it? |
03:48
|
|
RetroRomp has quit IRC (Read error: Connection reset by peer) |
03:50
|
|
Hobart has left |
04:01
|
|
Burak has joined #archiveteam |
04:01
|
|
Svekla has quit IRC (Read error: Connection reset by peer) |
04:08
|
|
ravetcofx has quit IRC (Read error: Operation timed out) |
04:15
|
|
SmileyG has quit IRC (Read error: Connection reset by peer) |
04:25
|
|
ravetcofx has joined #archiveteam |
04:29
|
|
Smiley has joined #archiveteam |
04:59
|
yipdw |
whenever people say MVNO I think of DVNO and everything becomes printed in gold |
04:59
|
yipdw |
details make the girls sweat even more |
05:00
|
Frogging |
^^^^^ |
05:00
|
|
ndiddy has quit IRC (Read error: Connection reset by peer) |
05:16
|
|
i336_ has quit IRC (Ping timeout: 260 seconds) |
05:18
|
|
maelstrom has quit IRC (Quit: Leaving) |
05:52
|
ranma |
wat |
05:52
|
ranma |
i love my MVNO |
06:15
|
|
ravetcofx has quit IRC (Read error: Operation timed out) |
06:16
|
|
QBcrusher has joined #archiveteam |
06:17
|
|
ravetcofx has joined #archiveteam |
06:29
|
|
unkn0wn_ has joined #archiveteam |
07:19
|
|
crwbot_ has quit IRC (Quit: leaving) |
07:34
|
|
anomie has quit IRC (Ping timeout: 250 seconds) |
07:42
|
|
achip has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
LastNinja has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
tpw_rules has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
K4k has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
sivoais has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
mundus201 has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
ColdIce has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
midas1 has quit IRC (west.us.hub irc.Prison.NET) |
07:42
|
|
achip has joined #archiveteam |
07:42
|
|
LastNinja has joined #archiveteam |
07:42
|
|
tpw_rules has joined #archiveteam |
07:42
|
|
K4k has joined #archiveteam |
07:42
|
|
sivoais has joined #archiveteam |
07:42
|
|
mundus201 has joined #archiveteam |
07:42
|
|
ColdIce has joined #archiveteam |
07:42
|
|
midas1 has joined #archiveteam |
07:42
|
|
irc.Prison.NET sets mode: +o midas1 |
07:49
|
|
spiko has joined #archiveteam |
07:53
|
|
anomie has joined #archiveteam |
07:58
|
|
FalconK has quit IRC (Ping timeout: 260 seconds) |
08:08
|
|
odemg has quit IRC (Remote host closed the connection) |
08:43
|
|
pikhq_ has quit IRC (Ping timeout: 245 seconds) |
08:50
|
|
pikhq has joined #archiveteam |
08:58
|
|
schbirid has joined #archiveteam |
09:31
|
|
i336_ has joined #archiveteam |
09:56
|
|
FalconK has joined #archiveteam |
10:50
|
|
ravetcofx has quit IRC (Read error: Operation timed out) |
11:12
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
11:15
|
|
Morbus has joined #archiveteam |
11:25
|
|
BiggieJon has joined #archiveteam |
11:28
|
|
Morbus has quit IRC (Quit: http://www.disobey.com/) |
11:59
|
|
Morbus has joined #archiveteam |
12:22
|
scyther |
How would I go about uploading about 3TB of data? I tried messaging SketchCow, but he didn't respond |
12:41
|
|
pizzaiolo has joined #archiveteam |
13:07
|
|
i336_ has quit IRC (Ping timeout: 260 seconds) |
13:40
|
|
nertzy has joined #archiveteam |
13:55
|
|
nertzy has quit IRC (Ping timeout: 255 seconds) |
14:09
|
|
closure has quit IRC (Ping timeout: 244 seconds) |
14:15
|
|
nertzy has joined #archiveteam |
14:18
|
|
closure has joined #archiveteam |
14:44
|
|
will has quit IRC (Ping timeout: 244 seconds) |
14:45
|
|
will has joined #archiveteam |
14:50
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
14:51
|
|
HCross has quit IRC (Read error: Connection reset by peer) |
14:52
|
|
kris33 has joined #archiveteam |
14:52
|
|
HarryCros has joined #archiveteam |
14:55
|
|
paparus has joined #archiveteam |
14:55
|
|
Boppen has quit IRC (Ping timeout: 194 seconds) |
14:56
|
|
Boppen has joined #archiveteam |
15:02
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
15:03
|
|
kris33 has joined #archiveteam |
15:04
|
|
nertzy has quit IRC (Ping timeout: 255 seconds) |
15:05
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
15:07
|
|
nertzy has joined #archiveteam |
15:07
|
|
kris33 has joined #archiveteam |
15:30
|
|
ravetcofx has joined #archiveteam |
15:49
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
15:51
|
|
kris33 has joined #archiveteam |
15:59
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
16:31
|
rocode |
scyther, use the IA CLI tool, and if possible cut it up into 400GB chunks |
16:32
|
scyther |
okay, so no special permission needed for stuff this big? |
16:34
|
rocode |
Nope. Do try to break it into smaller chunks though so the derive task doesn't murder you in your sleep. |
16:34
|
rocode |
What are you uploading? |
16:49
|
scyther |
backup of nintendo game servers |
16:50
|
scyther |
encrypted binaries of games, but when they pull them (they already pulled some), people who have the title keys can download from archive |
17:02
|
arkiver |
nice |
17:02
|
arkiver |
are these individual games? |
17:03
|
arkiver |
or like packs of games? |
17:03
|
arkiver |
if you have any metadata and depending on the number of games it might be nice to upload them as multiple items |
17:23
|
|
odemg has joined #archiveteam |
17:23
|
|
odemg has quit IRC (Connection closed) |
17:24
|
|
odemg has joined #archiveteam |
17:29
|
|
jonty has joined #archiveteam |
17:29
|
jonty |
Hello! |
17:30
|
jonty |
I'd like to ensure that all of gov.uk is archived - a random sampling of URLs from the sitemap shows it has very little |
17:30
|
jonty |
What's the best way to go about doing this? |
17:30
|
jonty |
It's about 300k urls |
17:30
|
jonty |
I was about to just hit https://web.archive.org/save/ for all of them, but then realised there must be a better way using the appliance or something |
17:31
|
jonty |
Could someone point me at the right thing to be doing? |
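The spot check jonty describes (pull URLs from a sitemap, sample a few at random) could be sketched roughly as below. The function names and the example sitemap are hypothetical, the save prefix is the one quoted above, and checking whether each sampled URL already has a capture is left out because it needs network access.

```python
import random
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
SAVE_PREFIX = "https://web.archive.org/save/"  # endpoint quoted in the log


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def sample_urls(urls, k, seed=None):
    """Pick up to k URLs at random without replacement."""
    rng = random.Random(seed)
    return rng.sample(urls, min(k, len(urls)))


# Made-up two-entry sitemap, for illustration only.
example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.gov.uk/example-a</loc></url>
  <url><loc>https://www.gov.uk/example-b</loc></url>
</urlset>"""

urls = sitemap_urls(example)
picked = sample_urls(urls, 1, seed=0)
save_requests = [SAVE_PREFIX + u for u in picked]
```

At around 300k URLs, requesting the save endpoint one URL at a time would be slow and unfriendly to the service, which is why a coordinated crawl is the better route.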
17:39
|
rocode |
jonty, #cheetoflee and http://archiveteam.org/index.php?title=Government_Backup |
17:45
|
jonty |
Thanks! |
17:58
|
|
btfo has quit IRC (Read error: Operation timed out) |
17:59
|
|
unkn0wn_ has quit IRC () |
18:04
|
|
Morbus has quit IRC (http://www.disobey.com/) |
18:10
|
|
odemg has quit IRC (Remote host closed the connection) |
18:53
|
|
unkn0wn_ has joined #archiveteam |
18:54
|
|
odemg has joined #archiveteam |
18:59
|
|
maz1324 has joined #archiveteam |
19:01
|
|
maz1324 has quit IRC (Remote host closed the connection) |
19:04
|
|
maz1324 has joined #archiveteam |
19:07
|
|
odemg has quit IRC (Remote host closed the connection) |
19:10
|
|
ZexaronS has joined #archiveteam |
19:11
|
schbirid |
scyther/rocode: iirc a much smaller size is nicer, eg 50G |
19:11
|
|
odemg has joined #archiveteam |
19:11
|
schbirid |
that enables IA to distribute items across storage much nicer |
19:19
|
yipdw |
yeah, it also helps when people want to download it |
19:19
|
yipdw |
it happens sometimes |
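The chunking advice above (400GB per rocode, closer to 50G per schbirid) can be sketched as a simple greedy grouping of files into per-item batches. The function name and the example sizes are hypothetical, and the right limit is whatever the channel settles on, so it is a parameter here.

```python
def chunk_by_size(files, limit_bytes):
    """Group (name, size) pairs, in order, into batches of at most limit_bytes.

    A single file larger than the limit ends up alone in its own batch.
    Hypothetical helper for planning items; it does not upload anything.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the limit.
        if current and current_size + size > limit_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


GB = 1000 ** 3
# Made-up file sizes, for illustration only.
files = [("a.bin", 30 * GB), ("b.bin", 20 * GB), ("c.bin", 30 * GB), ("d.bin", 70 * GB)]
plan = chunk_by_size(files, 50 * GB)
```

Each batch would then become one item, which also fits arkiver's suggestion of uploading as multiple items when metadata is available.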
19:24
|
|
maz1324 has quit IRC (Quit: http://chat.efnet.org ) |
19:55
|
|
mls has quit IRC (Ping timeout: 250 seconds) |
20:05
|
|
maelstrom has joined #archiveteam |
20:18
|
|
Ravenloft has joined #archiveteam |
20:19
|
|
odemg has quit IRC (Remote host closed the connection) |
20:24
|
|
mls has joined #archiveteam |
20:33
|
|
lordcosmo has joined #archiveteam |
20:39
|
|
lordcosmo has quit IRC (Read error: Connection reset by peer) |
20:40
|
|
lordcosmo has joined #archiveteam |
20:49
|
|
lordcosmo has quit IRC (Ping timeout: 250 seconds) |
20:50
|
|
lordcosmo has joined #archiveteam |
20:55
|
|
lordcosmo has quit IRC (Ping timeout: 250 seconds) |
20:56
|
|
lordcosmo has joined #archiveteam |
21:03
|
|
tsr has quit IRC (Quit: foo) |
21:05
|
|
lordcosmo has quit IRC (Ping timeout: 250 seconds) |
21:07
|
|
lordcosmo has joined #archiveteam |
21:07
|
|
tsr has joined #archiveteam |
21:12
|
nightpool |
cool article, if you missed it: http://venturebeat.com/2017/02/14/the-internet-archive-wants-to-host-pacer-records-from-u-s-courts-and-make-them-available-for-free/ |
21:15
|
|
cheez has joined #archiveteam |
21:16
|
cheez |
just heard about this effort, wanted to say thanks for making it |
21:16
|
|
schbirid has quit IRC (Quit: Leaving) |
21:18
|
|
VADemon has joined #archiveteam |
21:26
|
Lord_Nigh |
SketchCow: how goes the noaa archiving thing? iirc someone on /r/datahoarders has the whole thing (400GB or something) |
21:29
|
|
n00b616 has joined #archiveteam |
21:34
|
|
n00b616 has quit IRC (Quit: Page closed) |
21:35
|
|
BlueMaxim has joined #archiveteam |
21:37
|
FalconK |
cf. recap |
21:38
|
FalconK |
we'll get them, will they or won't they. |
21:38
|
|
lordcosmo has quit IRC (Ping timeout: 250 seconds) |
21:45
|
|
icedice has joined #archiveteam |
22:10
|
|
Honno has quit IRC (Read error: Connection reset by peer) |
22:13
|
|
ndiddy has joined #archiveteam |
22:21
|
|
crwbot has joined #archiveteam |
22:33
|
Kaz |
SketchCow: when you've got a few minutes - /pipeline is broken again |
22:35
|
|
Oddy has joined #archiveteam |
22:41
|
Oddy |
good evening |
22:41
|
|
btfo has joined #archiveteam |
22:54
|
|
icedice has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac) |
22:55
|
|
icedice has joined #archiveteam |
23:01
|
rocode |
https://www.reddit.com/r/HumansBeingBros/comments/5u1aj6/on_saturday_morning_200_hackers_at_uc_berkeley/ |
23:02
|
rocode |
^- actually related. |
23:42
|
|
QBcrusher has quit IRC (Ping timeout: 492 seconds) |
23:44
|
|
odemg has joined #archiveteam |
23:48
|
|
odemg has quit IRC (Remote host closed the connection) |
23:49
|
|
odemg has joined #archiveteam |
23:57
|
|
Famicoman has joined #archiveteam |
23:58
|
|
ats has quit IRC (Read error: Operation timed out) |