#archiveteam 2017-02-14,Tue

Time Nickname Message
00:00 🔗 Darkstar the error was "500 internal server error"
00:00 🔗 Darkstar I get the feeling that it has to do with the item I'm trying to upload (or one specific file in it), however I uploaded many similar items just fine the last hour or so...
00:01 🔗 Darkstar is there a blacklist on filenames or any scans that are being done while I upload (as opposed to post-processing) that I'm triggering?
00:03 🔗 FalconK Darkstar: just out of curiosity, when you look at http://catalogd.archive.org/catalog.php?justme=1 do you have a ton of outstanding jobs?
00:04 🔗 FalconK it looks like things have been sluggish with lots of failed disks this week
00:04 🔗 Darkstar yeah, 2 jobs in queue (one waiting, one running), but they were from ~2h ago and I uploaded a few other items after that just fine...
00:05 🔗 FalconK could be that there are globally too many jobs too
00:05 🔗 FalconK somewhere in the S3 documentation is a place you can check your quota
00:05 🔗 xmc i've noticed derives taking a really long time lately
00:05 🔗 FalconK also in the archivebot uploader there is code to do it
00:05 🔗 Darkstar I'm wondering because it seems to trigger only with that (slightly bigger) item I'm uploading (and by "bigger" I mean ~220 mb, nothing too fancy)
00:05 🔗 FalconK derives have been taking a long time all month
00:06 🔗 xmc yep
00:06 🔗 Darkstar the item with the 2 jobs in my queue is ~23mb so I'm wondering why it takes so long
00:06 🔗 kristian_ has joined #archiveteam
00:06 🔗 xmc computers are tired
00:06 🔗 FalconK Darkstar: probably not - I upload 2-3GB files probably 50 times a day at least and it keeps chugging away.
00:06 🔗 Darkstar I uploaded a ~200mb one after that and that one seems to have cleared the queue already
00:07 🔗 Darkstar that's why I was wondering if it's maybe something with the item. but it's just a ZIP file, a few JPGs and a few disk image files (of which I uploaded a few already today without problems)
00:09 🔗 Darkstar I'll try uploading a different (smaller) item ...
00:11 🔗 nertzy has joined #archiveteam
00:14 🔗 Darkstar okay, that worked...
00:15 🔗 Darkstar strange. I don't get it...
00:19 🔗 Darkstar ok, let's try again
00:20 🔗 Darkstar I changed the order of the files in the uploader now, maybe I can find out if it stops at a specific file or something
00:21 🔗 Darkstar if it doesn't work I'll try again tomorrow by uploading only one file and adding the others one by one later...
00:24 🔗 Darkstar strange. now it worked
00:25 🔗 Darkstar oh well. probably cosmic rays or something :)
00:25 🔗 Darkstar I'll keep an eye on my job queue though, to see if the stuck tasks are getting through by tomorrow. otherwise I'll probably re-derive that one item
00:25 🔗 i336_ has joined #archiveteam
00:28 🔗 FalconK oh, if you aren't setting the content-length and content-md5 headers, do so
00:31 🔗 Darkstar huh? I'm using the web-based uploader, I would think that it sets these headers correctly by default?
00:31 🔗 FalconK aah, likely
00:31 🔗 FalconK I guess the web uploader uses the S3 API now then. cool.
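A minimal sketch of the header advice above, for anyone scripting uploads rather than using the web uploader. It assumes the standard HTTP meaning of the two headers: Content-Length is the byte count, and Content-MD5 is the base64 encoding of the raw MD5 digest. The function name `upload_headers` is hypothetical, and nothing here is taken from the IA uploader or from FalconK's code.

```python
import base64
import hashlib
import os


def upload_headers(path, chunk_size=1024 * 1024):
    """Return Content-Length and Content-MD5 headers for the file at `path`."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return {
        "Content-Length": str(os.path.getsize(path)),
        # Base64 of the raw 16-byte digest, not the hex string.
        "Content-MD5": base64.b64encode(md5.digest()).decode("ascii"),
    }
```

The resulting dictionary can be passed as extra headers to whatever HTTP client performs the PUT, so the server can reject a truncated or corrupted body instead of storing it.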
00:33 🔗 Kironide has joined #archiveteam
00:35 🔗 Darkstar yeah, looks like it (from the error message). I have yet to find a nice (and usable) tool to upload items from the command line
00:35 🔗 Kironide are there any good tools available for backing up your Facebook activity?
00:35 🔗 Kironide - the official tool doesn't download everything (e.g. your comments in other places)
00:36 🔗 Kironide - digi.me / socialsafe is just horrible to use
00:36 🔗 Darkstar they all require me to either specify the metadata in some archaic CSV file (where it's not at all clear how to use cr/lf), or plain just don't work
00:39 🔗 FalconK Darkstar: consider basing one off of my IA S3 uploader in github.com/falconkirtaran/ArchiveBot
00:57 🔗 nertzy has quit IRC (Read error: Operation timed out)
01:16 🔗 ravetcofx has joined #archiveteam
01:43 🔗 pizzaiolo has quit IRC (Remote host closed the connection)
01:52 🔗 Swizzle has joined #archiveteam
01:54 🔗 dashcloud has quit IRC (Read error: Operation timed out)
01:58 🔗 dashcloud has joined #archiveteam
01:59 🔗 odemg has quit IRC (Remote host closed the connection)
01:59 🔗 odemg has joined #archiveteam
02:03 🔗 kristian_ has quit IRC (Quit: Leaving)
02:06 🔗 wp494 Kironide: the only thing I find bad about it is its scheduler sometimes goes off whack but I haven't seen anything other than standard api limits that turn it into shite
02:09 🔗 Kironide wp494, I have lots of trouble exporting my data elsewhere, I have to do it in very small chunks or the program locks up
02:09 🔗 Kironide and the export formats (CSV and PDF) are both very user-unfriendly
02:10 🔗 Kironide presumably I could look into the database files directly, except I read online that digi.me's database files are encrypted with something that isn't your password
02:12 🔗 dashcloud has quit IRC (Read error: Operation timed out)
02:13 🔗 Kironide do you know of any better solutions? I've failed to find any
02:15 🔗 wp494 not that I've seen
02:17 🔗 dashcloud has joined #archiveteam
02:19 🔗 Kironide also it seems that it doesn't download comments you've made on other people's posts
02:23 🔗 Hobart has joined #archiveteam
02:28 🔗 VADemon has quit IRC (Quit: left4dead)
02:31 🔗 BlueMaxim has joined #archiveteam
02:36 🔗 dashcloud has quit IRC (Read error: Operation timed out)
02:41 🔗 dashcloud has joined #archiveteam
03:09 🔗 Swizzle has quit IRC (Quit: Leaving)
03:11 🔗 RetroRomp has joined #archiveteam
03:12 🔗 RetroRomp Uh... How do we notify you guys of whole sites and forums that are in danger?
03:12 🔗 RetroRomp Anyone?
03:13 🔗 dxrt In here
03:13 🔗 dxrt feel free to tell us
03:13 🔗 RetroRomp ringplus.net is an MVNO that is soon to be shut down.
03:14 🔗 RetroRomp Currently they are migrating all of their users to another service in preparation for it, but there are several tens of thousands of messages plus their cell plans were unique.
03:15 🔗 dxrt Ah, I just checked and we're on that already with ArchiveBot.
03:15 🔗 dxrt You can have a look at the dashboard http://dashboard.at.ninjawedding.org/3
03:15 🔗 RetroRomp Great! Even social.ringplus.net?
03:15 🔗 dxrt that specifically.
03:16 🔗 RetroRomp Why did I even worry? Do you guys need / want the user dashboard that is only available behind an account?
03:20 🔗 dxrt It's probably a bit out of the scope, the ArchiveBot job is just grabbing all the available public data.
03:21 🔗 dxrt Is there much significance to it?
03:48 🔗 RetroRomp has quit IRC (Read error: Connection reset by peer)
03:50 🔗 Hobart has left
04:01 🔗 Burak has joined #archiveteam
04:01 🔗 Svekla has quit IRC (Read error: Connection reset by peer)
04:08 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
04:15 🔗 SmileyG has quit IRC (Read error: Connection reset by peer)
04:25 🔗 ravetcofx has joined #archiveteam
04:29 🔗 Smiley has joined #archiveteam
04:59 🔗 yipdw whenever people say MVNO I think of DVNO and everything becomes printed in gold
04:59 🔗 yipdw details make the girls sweat even more
05:00 🔗 Frogging ^^^^^
05:00 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
05:16 🔗 i336_ has quit IRC (Ping timeout: 260 seconds)
05:18 🔗 maelstrom has quit IRC (Quit: Leaving)
05:52 🔗 ranma wat
05:52 🔗 ranma i love my MVNO
06:15 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
06:16 🔗 QBcrusher has joined #archiveteam
06:17 🔗 ravetcofx has joined #archiveteam
06:29 🔗 unkn0wn_ has joined #archiveteam
07:19 🔗 crwbot_ has quit IRC (Quit: leaving)
07:34 🔗 anomie has quit IRC (Ping timeout: 250 seconds)
07:42 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 LastNinja has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 tpw_rules has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 K4k has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 sivoais has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 mundus201 has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 ColdIce has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 midas1 has quit IRC (west.us.hub irc.Prison.NET)
07:42 🔗 achip has joined #archiveteam
07:42 🔗 LastNinja has joined #archiveteam
07:42 🔗 tpw_rules has joined #archiveteam
07:42 🔗 K4k has joined #archiveteam
07:42 🔗 sivoais has joined #archiveteam
07:42 🔗 mundus201 has joined #archiveteam
07:42 🔗 ColdIce has joined #archiveteam
07:42 🔗 midas1 has joined #archiveteam
07:42 🔗 irc.Prison.NET sets mode: +o midas1
07:49 🔗 spiko has joined #archiveteam
07:53 🔗 anomie has joined #archiveteam
07:58 🔗 FalconK has quit IRC (Ping timeout: 260 seconds)
08:08 🔗 odemg has quit IRC (Remote host closed the connection)
08:43 🔗 pikhq_ has quit IRC (Ping timeout: 245 seconds)
08:50 🔗 pikhq has joined #archiveteam
08:58 🔗 schbirid has joined #archiveteam
09:31 🔗 i336_ has joined #archiveteam
09:56 🔗 FalconK has joined #archiveteam
10:50 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
11:12 🔗 BlueMaxim has quit IRC (Quit: Leaving)
11:15 🔗 Morbus has joined #archiveteam
11:25 🔗 BiggieJon has joined #archiveteam
11:28 🔗 Morbus has quit IRC (Quit: http://www.disobey.com/)
11:59 🔗 Morbus has joined #archiveteam
12:22 🔗 scyther How would I go about uploading about 3TB of data? I tried messaging SketchCow, but he didn't respond
12:41 🔗 pizzaiolo has joined #archiveteam
13:07 🔗 i336_ has quit IRC (Ping timeout: 260 seconds)
13:40 🔗 nertzy has joined #archiveteam
13:55 🔗 nertzy has quit IRC (Ping timeout: 255 seconds)
14:09 🔗 closure has quit IRC (Ping timeout: 244 seconds)
14:15 🔗 nertzy has joined #archiveteam
14:18 🔗 closure has joined #archiveteam
14:44 🔗 will has quit IRC (Ping timeout: 244 seconds)
14:45 🔗 will has joined #archiveteam
14:50 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
14:51 🔗 HCross has quit IRC (Read error: Connection reset by peer)
14:52 🔗 kris33 has joined #archiveteam
14:52 🔗 HarryCros has joined #archiveteam
14:55 🔗 paparus has joined #archiveteam
14:55 🔗 Boppen has quit IRC (Ping timeout: 194 seconds)
14:56 🔗 Boppen has joined #archiveteam
15:02 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
15:03 🔗 kris33 has joined #archiveteam
15:04 🔗 nertzy has quit IRC (Ping timeout: 255 seconds)
15:05 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
15:07 🔗 nertzy has joined #archiveteam
15:07 🔗 kris33 has joined #archiveteam
15:30 🔗 ravetcofx has joined #archiveteam
15:49 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
15:51 🔗 kris33 has joined #archiveteam
15:59 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
16:31 🔗 rocode scyther, use the IA CLI tool, and if possible cut it up into 400 GB chunks
16:32 🔗 scyther okay, so no special permission needed for stuff this big?
16:34 🔗 rocode Nope. Do try to break it into smaller chunks though so the derive task doesn't murder you in your sleep.
16:34 🔗 rocode What are you uploading?
16:49 🔗 scyther backup of nintendo game servers
16:50 🔗 scyther encrypted binaries of games, but when they pull them (they already pulled some), people who have the title keys can download from archive
17:02 🔗 arkiver nice
17:02 🔗 arkiver are these individual games?
17:03 🔗 arkiver or like packs of games?
17:03 🔗 arkiver if you have any metadata and depending on the number of games it might be nice to upload them as multiple items
17:23 🔗 odemg has joined #archiveteam
17:23 🔗 odemg has quit IRC (Connection closed)
17:24 🔗 odemg has joined #archiveteam
17:29 🔗 jonty has joined #archiveteam
17:29 🔗 jonty Hello!
17:30 🔗 jonty I'd like to ensure that all of gov.uk is archived - a random sampling of URLs from the sitemap shows it has very little
17:30 🔗 jonty What's the best way to go about doing this?
17:30 🔗 jonty It's about 300k urls
17:30 🔗 jonty I was about to just hit https://web.archive.org/save/ for all of them, but then realised there must be a better way using the appliance or something
17:31 🔗 jonty Could someone point me at the right thing to be doing?
17:39 🔗 rocode jonty, #cheetoflee and http://archiveteam.org/index.php?title=Government_Backup
17:45 🔗 jonty Thanks!
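For reference, the "hit https://web.archive.org/save/ for all of them" idea jonty mentions could be sketched as below. This only builds the list of save URLs from a sitemap and sends no requests. It assumes the sitemap follows the standard sitemaps.org schema, and the function name `save_urls_from_sitemap` is hypothetical. For about 300k URLs, the channel and wiki page rocode points to are the better route.

```python
import xml.etree.ElementTree as ET

# Save endpoint as quoted in the log; the namespace is the standard sitemap schema.
SAVE_ENDPOINT = "https://web.archive.org/save/"
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def save_urls_from_sitemap(sitemap_xml):
    """Return one save URL per <loc> entry found in the sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [
        SAVE_ENDPOINT + loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]
```

Any real run would also need rate limiting and retries, which this sketch leaves out.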
17:58 🔗 btfo has quit IRC (Read error: Operation timed out)
17:59 🔗 unkn0wn_ has quit IRC ()
18:04 🔗 Morbus has quit IRC (http://www.disobey.com/)
18:10 🔗 odemg has quit IRC (Remote host closed the connection)
18:53 🔗 unkn0wn_ has joined #archiveteam
18:54 🔗 odemg has joined #archiveteam
18:59 🔗 maz1324 has joined #archiveteam
19:01 🔗 maz1324 has quit IRC (Remote host closed the connection)
19:04 🔗 maz1324 has joined #archiveteam
19:07 🔗 odemg has quit IRC (Remote host closed the connection)
19:10 🔗 ZexaronS has joined #archiveteam
19:11 🔗 schbirid scyther/rocode: iirc a much smaller size is nicer, eg 50G
19:11 🔗 odemg has joined #archiveteam
19:11 🔗 schbirid that enables IA to distribute items across storage much nicer
19:19 🔗 yipdw yeah, it also helps when people want to download it
19:19 🔗 yipdw it happens sometimes
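The chunking advice above (400 GB per rocode, closer to 50 GB per schbirid) amounts to grouping files into items under a size cap. A minimal sketch, assuming a simple in-order greedy grouping. The function name `batch_by_size` and the `(name, size)` input shape are hypothetical and are not part of any IA tool.

```python
def batch_by_size(files, limit_bytes):
    """Group (name, size) pairs into consecutive batches of at most `limit_bytes`.

    A single file larger than the limit ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the cap.
        if current and current_size + size > limit_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

With `limit_bytes=50 * 10**9`, each returned batch could become one item. Grouping by game or title first, as arkiver suggests, would give more meaningful items than a purely size-based split.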
19:24 🔗 maz1324 has quit IRC (Quit: http://chat.efnet.org )
19:55 🔗 mls has quit IRC (Ping timeout: 250 seconds)
20:05 🔗 maelstrom has joined #archiveteam
20:18 🔗 Ravenloft has joined #archiveteam
20:19 🔗 odemg has quit IRC (Remote host closed the connection)
20:24 🔗 mls has joined #archiveteam
20:33 🔗 lordcosmo has joined #archiveteam
20:39 🔗 lordcosmo has quit IRC (Read error: Connection reset by peer)
20:40 🔗 lordcosmo has joined #archiveteam
20:49 🔗 lordcosmo has quit IRC (Ping timeout: 250 seconds)
20:50 🔗 lordcosmo has joined #archiveteam
20:55 🔗 lordcosmo has quit IRC (Ping timeout: 250 seconds)
20:56 🔗 lordcosmo has joined #archiveteam
21:03 🔗 tsr has quit IRC (Quit: foo)
21:05 🔗 lordcosmo has quit IRC (Ping timeout: 250 seconds)
21:07 🔗 lordcosmo has joined #archiveteam
21:07 🔗 tsr has joined #archiveteam
21:12 🔗 nightpool cool article, if you missed it: http://venturebeat.com/2017/02/14/the-internet-archive-wants-to-host-pacer-records-from-u-s-courts-and-make-them-available-for-free/
21:15 🔗 cheez has joined #archiveteam
21:16 🔗 cheez just heard about this effort, wanted to say thanks for making it
21:16 🔗 schbirid has quit IRC (Quit: Leaving)
21:18 🔗 VADemon has joined #archiveteam
21:26 🔗 Lord_Nigh SketchCow: how goes the noaa archiving thing? iirc someone on /r/datahoarders has the whole thing (400GB or something)
21:29 🔗 n00b616 has joined #archiveteam
21:34 🔗 n00b616 has quit IRC (Quit: Page closed)
21:35 🔗 BlueMaxim has joined #archiveteam
21:37 🔗 FalconK cf. recap
21:38 🔗 FalconK we'll get them, will they or won't they.
21:38 🔗 lordcosmo has quit IRC (Ping timeout: 250 seconds)
21:45 🔗 icedice has joined #archiveteam
22:10 🔗 Honno has quit IRC (Read error: Connection reset by peer)
22:13 🔗 ndiddy has joined #archiveteam
22:21 🔗 crwbot has joined #archiveteam
22:33 🔗 Kaz SketchCow: when you've got a few minutes - /pipeline is broken again
22:35 🔗 Oddy has joined #archiveteam
22:41 🔗 Oddy good evening
22:41 🔗 btfo has joined #archiveteam
22:54 🔗 icedice has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
22:55 🔗 icedice has joined #archiveteam
23:01 🔗 rocode https://www.reddit.com/r/HumansBeingBros/comments/5u1aj6/on_saturday_morning_200_hackers_at_uc_berkeley/
23:02 🔗 rocode ^- actually related.
23:42 🔗 QBcrusher has quit IRC (Ping timeout: 492 seconds)
23:44 🔗 odemg has joined #archiveteam
23:48 🔗 odemg has quit IRC (Remote host closed the connection)
23:49 🔗 odemg has joined #archiveteam
23:57 🔗 Famicoman has joined #archiveteam
23:58 🔗 ats has quit IRC (Read error: Operation timed out)