#archiveteam-bs 2016-12-12,Mon


Time Nickname Message
00:08 🔗 i336_ the ex.ua project just got started in the tracker, everyone, so you might like to start running it. it's currently at 0
00:31 🔗 brayden has joined #archiveteam-bs
00:31 🔗 swebb sets mode: +o brayden
01:07 🔗 godane can anyone figure out why this domain is going slow: http://archive1.rthk.hk/mp4/tv/2013/201303261900_h.mp4
01:12 🔗 HCross2 It's in Hong Kong, a long way from you in the states
01:26 🔗 godane i may be able to get a few shows from their english site
01:27 🔗 godane but still is all hosted in Hong Kong
01:27 🔗 godane good news is ffmpeg downloads the streams faster than wget
01:41 🔗 i336_ godane: maybe wget's UA is ratelimited
01:42 🔗 i336_ also, just to let everyone know, ex.ua is up, but not many are running it - check your warriors!
02:05 🔗 i336_ wat
02:05 🔗 i336_ my warrior is sitting there doing nothing
02:05 🔗 i336_ what do I do now?
02:05 🔗 i336_ can I ^C and restart it?
02:08 🔗 yipdw is it really doing nothing, or is it working
02:08 🔗 yipdw you have 23 open claims
02:08 🔗 i336_ the last lines are:
02:08 🔗 i336_ Finished WgetDownload for Item filelists:58570500-58570999
02:08 🔗 i336_ Starting PrepareStatsForTracker for Item filelists:58570500-58570999
02:08 🔗 i336_ Finished PrepareStatsForTracker for Item filelists:58570500-58570999
02:08 🔗 i336_ Starting MoveFiles for Item filelists:58570500-58570999
02:08 🔗 i336_ Finished MoveFiles for Item filelists:58570500-58570999
02:08 🔗 i336_ (sorry)
02:08 🔗 i336_ but yeah. it's stuck. sitting there.
02:10 🔗 yipdw please use pastebins or similar for that sort of stuff
02:10 🔗 yipdw anyway, the next step after MoveFiles is a concurrency-limited rsync upload
02:10 🔗 i336_ oh...
02:10 🔗 i336_ and sorry
02:10 🔗 yipdw the default is 1 rsync uploader per process
02:11 🔗 i336_ I did --concurrency 20
02:11 🔗 yipdw if you have 23 runners in the same process, 22 are going to wait
02:11 🔗 yipdw there's the problem
02:11 🔗 i336_ ???
02:11 🔗 i336_ did I break my crawl? :(
02:11 🔗 yipdw https://github.com/ArchiveTeam/exua-grab/blob/master/pipeline.py#L282
02:12 🔗 yipdw rsync processes per pipeline process is limited to 4, defaults to 1
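The stall yipdw describes (many workers, one upload slot) can be illustrated with a small sketch. This is not the exua-grab or seesaw code; `run_pipeline`, `max_uploads`, and the sleep standing in for rsync are all hypothetical names chosen for illustration. It only shows the general pattern of a shared semaphore gating the last stage of a pipeline.

```python
import threading
import time


def run_pipeline(workers, max_uploads=1, upload_seconds=0.01):
    """Run `workers` fake work items and return the peak number of
    uploads that were ever in flight at the same time."""
    upload_slots = threading.Semaphore(max_uploads)  # shared by all workers
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker():
        # The fetch and MoveFiles stages would run here, fully in parallel.
        with upload_slots:  # blocks until an upload slot is free
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(upload_seconds)  # stand-in for the rsync upload
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    # 20 workers, 1 upload slot: only one upload ever runs at a time.
    print(run_pipeline(workers=20, max_uploads=1))
```

With 20 workers and a single slot, 19 of them sit idle at the upload stage, which from the outside can look like a process doing nothing.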
02:12 🔗 i336_ hmm.
02:12 🔗 yipdw there's a few reasons why we do this
02:12 🔗 yipdw the biggest reason is that that's been part of the code that gets copied across projects
02:12 🔗 i336_ I see
02:12 🔗 yipdw the other reasons include connections typically being asymmetric and limited number of connections per rsync host
02:13 🔗 i336_ mmmm, right
02:13 🔗 i336_ well, I may have a bigger problem: iftop on the machine in question is currently only talking to 192.168.x.x and my IP address.
02:13 🔗 yipdw in any case, your --concurrent 20 is going to block 19 workers at the rsync stage
02:13 🔗 i336_ showing the machine talking*
02:13 🔗 i336_ oh yikes
02:13 🔗 i336_ if I ^C it, will it restart from scratch?
02:13 🔗 yipdw no, the claims you have out will remain out until someone recycles them
02:14 🔗 yipdw restarting the process will restart the pipeline from the fetch stage
02:14 🔗 i336_ I mean - will the data I've downloaded be found and sent?
02:14 🔗 i336_ ah.
02:14 🔗 yipdw no, that data will only be sent as part of the rsync stage
02:14 🔗 yipdw typically it doesn't matter
02:14 🔗 i336_ okay. so should I ^C it then? how many workers should I run with/
02:14 🔗 i336_ ?
02:14 🔗 yipdw 4
02:14 🔗 yipdw or 2
02:15 🔗 * i336_ restarts it with 3 :P
02:15 🔗 yipdw the warrior VM has a limit of 6
02:15 🔗 i336_ I see
02:15 🔗 * i336_ uses 4 then
02:15 🔗 jrwr has joined #archiveteam-bs
02:15 🔗 i336_ ....it's now saying "stopping when current tasks are completed" and waiting.
02:15 🔗 i336_ I'm curious what it's waiting for.
02:15 🔗 yipdw waiting for tasks to complete
02:16 🔗 i336_ yeah - but... what?
02:16 🔗 yipdw a task is one full trip through the pipeline
02:16 🔗 yipdw i.e. rsync upload
02:16 🔗 i336_ rsync isn't running _at all_
02:16 🔗 i336_ and it is installed, fwiw
02:16 🔗 RichardG_ is now known as RichardG
02:17 🔗 i336_ checking htop, run-pipeline has no child processes running underneath it.
02:17 🔗 yipdw interrupt again to force-quit the process and this time run with a more sensible number of workers
02:17 🔗 yipdw like 2
02:17 🔗 yipdw if there is a problem spawning rsync, that'll make it much easier to diagnose
02:17 🔗 i336_ okay :(
02:17 🔗 i336_ done
02:20 🔗 i336_ argh. "out" just went from 23 to 3 :<
02:20 🔗 yipdw it's fine
02:20 🔗 yipdw I requeued them
02:20 🔗 i336_ yeah
02:20 🔗 i336_ and I just figured it out
02:20 🔗 i336_ I started 20 processes
02:20 🔗 i336_ the ratelimiter was pausing them
02:20 🔗 i336_ so they were literally waiting to run
02:20 🔗 i336_ right?
02:21 🔗 yipdw they may have gotten to some stage of completion
02:21 🔗 yipdw but they're not going to be counted as done until the work item makes it through the pipeline and checked in
02:21 🔗 i336_ yup.
02:22 🔗 i336_ I saw a lot of messages about "we don't want to overload this resource so we're waiting" at the start of the run
02:22 🔗 i336_ heh
02:22 🔗 yipdw that's a tracker-side rate limit
02:22 🔗 i336_ oh okay
02:22 🔗 i336_ I'm not sure then.
02:22 🔗 i336_ hopefully this works
02:23 🔗 nicolas17 is there any point in adding more warrior nodes once the tracker rate limiter is already being hit?
02:23 🔗 arkiver the limit is currently at 2 items per minute
02:23 🔗 arkiver I'll raise it if the site can handle it
02:23 🔗 yipdw nicolas17: for a given project, not really
02:24 🔗 yipdw they might be better utilized on some other warrior project
02:24 🔗 arkiver for this project it wouldn't hurt, since we don't really know where the limit is. Just keeping it low for now to see if the site can handle it
02:25 🔗 i336_ I see
02:25 🔗 arkiver and will raise it as long as the site remains stable (since we also have only 20 days)
02:26 🔗 i336_ mmmm
02:32 🔗 yipdw exua-grab so far is proceeding normally here
02:32 🔗 i336_ ok. it just got to where it stalled before
02:33 🔗 i336_ yipdw: htop is showing nothing running underneath python again
02:33 🔗 i336_ yipdw: note that this is running the crawler directly from git, on freebsd
02:34 🔗 i336_ a) what can I look for/at? where's the debug/status info? what can I inspect for sanity? b) you can SSH in if you want
02:35 🔗 yipdw we don't really run this code on FreeBSD that often
02:35 🔗 i336_ I realize that - but my friend's PC with the ZFS pool is running freebsd, so I'm trying to use it
02:35 🔗 i336_ Kaz didn't mention if I could use his VPS for this so I haven't
02:36 🔗 joepie91 hey, a FreeBSD user
02:36 🔗 joepie91 :P
02:36 🔗 yipdw a warrior client doesn't need a ZFS pool
02:36 🔗 i336_ in this context I mean "pile of diskspace"
02:36 🔗 yipdw I know what a ZFS pool is
02:36 🔗 i336_ right
02:36 🔗 yipdw it's still not really needed for a warrior
02:36 🔗 joepie91 warriors don't usually need a lot of disk space fwiw
02:36 🔗 i336_ yeah
02:37 🔗 nicolas17 warriors download, upload, delete
02:37 🔗 yipdw I have a FreeBSD system here, I'll try to debug
02:37 🔗 i336_ unfortunately, at this point I also mean "PC with bandwidth", my own internet is 50GB/mo and HTML5(TM) uses most of that sadly (yup)
02:37 🔗 i336_ yipdw: wget-lua was fun to build, but I got it working
02:37 🔗 nicolas17 i336_: hope you use an adblocker
02:37 🔗 yipdw I have exua-grab uploading filelists:82940500-82940999
02:37 🔗 i336_ nicolas17: yup, /etc/hosts file
02:37 🔗 yipdw and done
02:37 🔗 joepie91 i336_: I'm guessing you ran into this? https://github.com/joepie91/isohunt-grab#for-freebsd
02:38 🔗 * nicolas17 gets 50MB/day on his phone
02:38 🔗 yipdw ok, so we know the grab works on Ubuntu
02:38 🔗 i336_ nicolas17: wow.
02:38 🔗 i336_ joepie91: yup, and managed to get past it
02:38 🔗 nicolas17 still cheaper than communicating over SMS :P
02:38 🔗 i336_ lol, yeah
02:38 🔗 joepie91 i336_: always fun to hear that issues from several years ago are still issues :P
02:38 🔗 i336_ hahaha
02:39 🔗 * i336_ swat
02:43 🔗 yipdw I'll resume poking at this in a bit; I need to hop on a conference call
02:43 🔗 i336_ okay. thanks!
02:44 🔗 yipdw in the meantime, if you can run the grabber on a Linux-ish machine you may have better luck
02:44 🔗 * nicolas17 has a bored EC2, should look into it
02:45 🔗 * i336_ volunteers to be sysadmin
02:45 🔗 i336_ (for exua crawling specifically :P)
02:45 🔗 i336_ although it's not hard, tbh.
02:57 🔗 ndiddy has quit IRC (Quit: Leaving)
02:59 🔗 compu_85 the warrior seems to be running fine for me on this project
03:00 🔗 compu_85 so far
03:06 🔗 i336_ okay, </lunch>
03:06 🔗 i336_ time to see if I can figure out what's going on
03:06 🔗 i336_ hopefully I can
03:06 🔗 i336_ it's still stalled
03:14 🔗 RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
03:32 🔗 Fletcher has quit IRC (Ping timeout: 244 seconds)
03:40 🔗 Fletcher has joined #archiveteam-bs
04:01 🔗 jrwr has quit IRC (Remote host closed the connection)
04:07 🔗 i336_ okay, this is really annoying
04:07 🔗 i336_ I enabled the python debugger, but ex.ua has ratelimited me so it's taking hours to crash
04:07 🔗 i336_ I feel so stupid downloading content just to make it crash >.<
04:08 🔗 i336_ I think I need to take a break for a little while. I'm just frustrated that I don't have resources. not anyone's fault but my own, I'm just really aware that ex.ua will be gone in a few days and there's really nothing I can do about it and it makes me really sad :(
04:45 🔗 SketchCow Hurrah
04:46 🔗 SketchCow PurpleSym: The next round through, I'm sure that'll happen
04:48 🔗 i336_ SketchCow: I got your email reply back - thanks so much for that. The exua archiver project is running in the tracker! arkiver's current project is to save the file references, I'm also hoping we can save the discussions on the site as well. There are a lot of access vectors ex.ua forgot to turn off :D
04:49 🔗 i336_ SketchCow: I understand (but have no real information) that there might be some discussions on Monday regarding the actual content on the site. That will be /interesting/, I'm sure. :)
04:56 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
05:17 🔗 SketchCow You got Arkiver's
05:17 🔗 SketchCow Arkiver's the dude
05:17 🔗 i336_ ^^
05:18 🔗 i336_ okay then
05:19 🔗 i336_ I'll reiterate my question
05:20 🔗 i336_ I'm trying to figure out how to prototype new crawler projects locally... where should I look on the wiki?
05:22 🔗 i336_ ...I don't really want to set up a full tracker. is there no local mode?
05:23 🔗 i336_ I'm going out now, I'm really looking forward to figuring this out, if anyone can answer while I'm gone I'd really appreciate it. I want to try and help with my own archiving code
05:24 🔗 i336_ i336_ will disconnect in a couple minutes but i336 is still here, so I'll still see what everyone says
05:25 🔗 yipdw there isn't a local mode; you're running a test tracker project, a test tracker, or you substitute in a mock
05:25 🔗 yipdw the third option hasn't been implemented
05:26 🔗 yipdw you can accelerate a local tracker setup with https://github.com/ArchiveTeam/archiveteam-dev-env
05:26 🔗 yipdw specifically, the linked OVA
05:26 🔗 yipdw #warrior is for discussion of these tools
05:29 🔗 i336_ has quit IRC (Read error: Operation timed out)
05:33 🔗 ravetcofx has joined #archiveteam-bs
05:36 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:43 🔗 Sk1d has joined #archiveteam-bs
05:51 🔗 nicolas17 has quit IRC (Quit: nuff 4 2day)
06:07 🔗 Sk1d has quit IRC (Ping timeout: 194 seconds)
06:08 🔗 Sk1d has joined #archiveteam-bs
06:19 🔗 BlueMaxim has joined #archiveteam-bs
07:24 🔗 Start has quit IRC (Quit: Disconnected.)
07:31 🔗 Start has joined #archiveteam-bs
07:38 🔗 Start has quit IRC (Quit: Disconnected.)
08:02 🔗 GE has joined #archiveteam-bs
08:48 🔗 BlueMaxim has quit IRC (Quit: Leaving)
09:13 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
09:23 🔗 GE has quit IRC (Remote host closed the connection)
10:41 🔗 BlueMaxim has joined #archiveteam-bs
11:06 🔗 GE has joined #archiveteam-bs
11:21 🔗 arkiver If anyone here 'lain'? or anyone knows who that it?
11:21 🔗 arkiver is*
11:28 🔗 Sanqui there may be several lains. I know somebody who has used that nick in the past.
11:28 🔗 Sanqui (several years ago, though.)
11:54 🔗 GE has quit IRC (Remote host closed the connection)
12:03 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:41 🔗 Whopper has joined #archiveteam-bs
13:15 🔗 RichardG has joined #archiveteam-bs
13:33 🔗 RichardG has quit IRC (Ping timeout: 244 seconds)
13:34 🔗 RichardG has joined #archiveteam-bs
13:42 🔗 GE has joined #archiveteam-bs
13:53 🔗 _desu___ has joined #archiveteam-bs
13:59 🔗 Ctrl-S___ has joined #archiveteam-bs
14:00 🔗 antonizoo has quit IRC ()
14:01 🔗 antonizoo has joined #archiveteam-bs
15:12 🔗 hook54321 has quit IRC ()
15:12 🔗 Yoshimura has quit IRC (Ping timeout: 255 seconds)
15:12 🔗 hook54321 has joined #archiveteam-bs
15:53 🔗 sep332 has joined #archiveteam-bs
16:06 🔗 Start has joined #archiveteam-bs
16:28 🔗 Yoshimura has joined #archiveteam-bs
17:10 🔗 godane so RTHK radio3 Hong Kong Today is starting to get uploaded: https://archive.org/search.php?query=subject%3A%22Hong+Kong+Today%22
17:18 🔗 t2t2 has quit IRC (Ping timeout: 633 seconds)
17:35 🔗 godane so i may be going through RTHK TV videos
17:36 🔗 godane it will be done sort of like how i did kpfa since the urls have the same patterns
19:42 🔗 ravetcofx has joined #archiveteam-bs
20:29 🔗 godane so turns out there are pdfs of The Tech Newspaper
20:29 🔗 godane a newspaper from MIT going back to 1881
20:41 🔗 godane how do i get this to not be invalid date: date -d "November 16, 1881" +%Y-%m-%d
20:41 🔗 xmc seems to work for me
20:41 🔗 xmc $ date -d "November 16, 1881" +%Y-%m-%d
20:41 🔗 xmc 1881-11-16
20:43 🔗 godane i keep getting invalid date on my slackware
20:44 🔗 HCross2 godane: does the CMOS battery in your motherboard still have power in it?
20:44 🔗 HCross2 Do you get a bios error on startup?
20:45 🔗 godane hwclock still works
20:45 🔗 godane i also never noticed any bios errors on boot
20:45 🔗 xmc that probably has nothing to do with it ...
20:46 🔗 xmc um, is your machine 32 or 64 bit
20:46 🔗 xmc and when was it compiled?
20:47 🔗 xmc could be you have a 32-bit time_t
20:47 🔗 HCross2 Ahh, I misread. Formatting dates, not telling the current one. Sorry
20:47 🔗 xmc also, what's your timezone set to
20:47 🔗 godane i have 64 bit system with i486 slackware
20:47 🔗 godane i don't think my timezone is set
20:47 🔗 xmc try "echo $TZ"
20:48 🔗 godane its blank
20:48 🔗 xmc you're in eastern time, right? try TZ=EST5EDT date -d "November 16, 1881" +%Y-%m-%d
20:49 🔗 godane its still invalid
20:49 🔗 xmc hrum
20:49 🔗 xmc try dates in 1899, 1900, and 1901?
20:49 🔗 xmc basically, can you figure out when it becomes valid
20:50 🔗 godane even this doesn't work: date -d "18811116" +%Y-%m-%d\
20:51 🔗 xmc i suspect it doesn't like 1881. but it could be that it's before 1900, before timezones were invented, or more than two billion seconds before 0 unix time
20:52 🔗 godane looks like 1981 worked fine
20:53 🔗 godane 1902 is the earliest i could get
20:54 🔗 xmc more narrowly: try dec 12 1901 and dec 14 1901
20:55 🔗 godane it works on december 14 1901 but not december 12 or 13 1901
20:55 🔗 xmc ding ding ding
20:56 🔗 xmc you need a system with a 64-bit time_t
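The boundary godane found (December 14, 1901 works, December 12 and 13 do not) matches the lower limit of a signed 32-bit `time_t`, which counts seconds from 1970-01-01 UTC. A minimal sketch of that arithmetic, with `earliest_32bit_time` and `fits_in_32bit_time_t` as illustrative helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def earliest_32bit_time():
    """Earliest instant a signed 32-bit time_t can represent."""
    return EPOCH + timedelta(seconds=-2**31)


def fits_in_32bit_time_t(moment):
    """True if `moment` (timezone-aware) fits in a signed 32-bit time_t."""
    seconds = (moment - EPOCH) // timedelta(seconds=1)
    return -2**31 <= seconds <= 2**31 - 1


if __name__ == "__main__":
    print(earliest_32bit_time().isoformat())
    print(fits_in_32bit_time_t(datetime(1881, 11, 16, tzinfo=timezone.utc)))
```

The limit falls on the evening of 1901-12-13 UTC, so midnight on December 13 is still out of range in US Eastern time while December 14 is in range.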
20:56 🔗 xmc i486 slackware won't have that, you need x86_64
20:56 🔗 xmc or you can figure out a workaround that doesn't use `date`
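One possible workaround that avoids `date` is to do the reformatting in Python, whose `datetime` type is not backed by `time_t` and accepts years from 1 to 9999. This is a sketch, not something anyone in the channel used; `to_iso_date` is a hypothetical helper, and parsing month names with `%B` assumes an English (or C) locale.

```python
from datetime import datetime

# Input formats tried in order; both appear in the conversation above.
FORMATS = ("%B %d, %Y", "%Y%m%d")


def to_iso_date(text):
    """Convert 'November 16, 1881' or '18811116' to 'YYYY-MM-DD'."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError("unrecognized date: %r" % (text,))


if __name__ == "__main__":
    print(to_iso_date("November 16, 1881"))
```

Using `isoformat()` rather than `strftime` sidesteps platform differences in how `strftime` treats very old years.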
20:57 🔗 xmc computers are kind of garbage, sorry
21:12 🔗 godane i just figure i will do the The Tech newspaper with volume and issue
21:13 🔗 xmc and just put the date into the archive.org date-published field, that should be plenty
21:13 🔗 godane i can't put it as November 16, 1881 format
21:14 🔗 godane anyways i will be able to put date for issues past volume 21
21:14 🔗 * xmc nods
21:22 🔗 johansch has joined #archiveteam-bs
21:24 🔗 johansch So.. to continue my stream from #archiveteam (apologies =) )
21:25 🔗 johansch 98 GB per day (2012) for imgur.com averages out to about 9 Mbit/s, if i did the calculations correctly
21:27 🔗 johansch assuming they did a 1.5x growth per year, that's about 45 Mbit/s today
21:28 🔗 johansch or about 500 GB/day
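johansch's back-of-the-envelope numbers can be checked with a few lines of arithmetic. The sketch below assumes decimal gigabytes (1 GB = 10^9 bytes) and four years of growth, from 2012 to the 2016 date of this log; the 1.5x yearly growth is johansch's own guess, not a measured figure.

```python
SECONDS_PER_DAY = 86400


def gb_per_day_to_mbit_per_s(gb_per_day):
    """Average bit rate in Mbit/s for a daily volume in decimal GB."""
    return gb_per_day * 1e9 * 8 / SECONDS_PER_DAY / 1e6


growth = 1.5 ** (2016 - 2012)               # 1.5x per year over 4 years
rate_2012 = gb_per_day_to_mbit_per_s(98)    # starting point from the log
volume_2016 = 98 * growth                   # projected GB per day
rate_2016 = gb_per_day_to_mbit_per_s(volume_2016)

if __name__ == "__main__":
    print(round(rate_2012, 1), round(rate_2016, 1), round(volume_2016))
```

Under those assumptions the figures come out close to the ones quoted: roughly 9 Mbit/s in 2012, roughly 46 Mbit/s and just under 500 GB/day in 2016.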
21:28 🔗 johansch that seems like something archive.org could handle...
21:29 🔗 johansch but that would of course assume there's an efficient mechanism to get new items...
21:36 🔗 SketchCow I don't think archive.org is going to back up imgur
21:36 🔗 SketchCow Maybe they might save the top, say, 100,000 items
21:37 🔗 RedType has left
21:37 🔗 johansch wow, this is a testy place..
21:38 🔗 xmc TRYING to keep the announcements channel clear from discussion of potential project logistics
21:38 🔗 xmc thank you for cooperating
21:38 🔗 johansch so maybe then rename it #archiveteam-announcements and #archiveteam-discussions ?
21:39 🔗 xmc no
21:39 🔗 johansch just a newb's point of view.
21:39 🔗 xmc it's in the topic
21:39 🔗 Frogging 500GB per day is a lot, especially given that most of it is crap
21:40 🔗 Frogging A better way would be selectively archiving popular or unique/original content as linked from various communities
21:40 🔗 bwn has quit IRC (Read error: Operation timed out)
21:41 🔗 Igloo^_^ Have a criteria for x likes etc
21:42 🔗 Igloo^_^ Could work, but getting access to the statistics (c/w)ould be difficult
21:42 🔗 johansch @xmc i've go to congratulated you - way to make someone being enthusiastic about this topic feeling not-very-welcome..
21:42 🔗 Frogging Igloo^_^: A lot of it isn't submitted to the Imgur "gallery" with likes and all, it'd be linked from forums and subreddits
21:43 🔗 johansch s/congratulated/congratulate
21:43 🔗 xmc i'm not going to argue this inane topic with you endlessly, so as to leave this channel available for discussing the thing that you so clearly want to talk about
21:44 🔗 xmc you're in the right place! continue!
21:44 🔗 Frogging sets mode: +oo HCross2 joepie91
21:44 🔗 Igloo^_^ True Frogging, You'd need to have internal stats to get the information. Best have a method to have people request somehow (like how we're doing vine)
21:45 🔗 Igloo^_^ is now known as Igloo
21:45 🔗 SketchCow Like I said. Top 100,000 probably
21:45 🔗 Frogging top by what measure?
21:46 🔗 xmc why not by all six measures you can think of, at most it'll be less than a million pictures
21:46 🔗 Frogging true
21:46 🔗 johansch you could crawl reddit.. get the list of the top 10k subs, crawl them continously to catch imgur references
21:48 🔗 johansch anything that gets on the frontpages of those 10k subs would be a pretty good candidate for archival
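The "catch imgur references" step of that idea could look something like the sketch below. It does not crawl reddit or fetch anything; it only pulls imgur links out of text that has already been collected. The URL shapes matched by `IMGUR_URL` and the helper name `find_imgur_links` are assumptions made for illustration, not a complete description of imgur's URL scheme.

```python
import re

# Assumed URL shapes: imgur.com, i.imgur.com and m.imgur.com, http or https.
IMGUR_URL = re.compile(r"https?://(?:i\.|m\.)?imgur\.com/[A-Za-z0-9/_.-]+")


def find_imgur_links(text):
    """Return unique imgur URLs found in `text`, in order of appearance."""
    seen = set()
    links = []
    for match in IMGUR_URL.finditer(text):
        url = match.group(0).rstrip(".,)")  # drop trailing punctuation
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


if __name__ == "__main__":
    sample = "look: http://i.imgur.com/abc123.jpg and https://imgur.com/gallery/xyz"
    print(find_imgur_links(sample))
```

The resulting list would still need to be deduplicated across runs and handed to whatever does the actual archiving.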
21:48 🔗 bwn has joined #archiveteam-bs
21:53 🔗 godane looks like i can get video from rthk.hk going back to at least 2012-09
22:02 🔗 BlueMaxim has joined #archiveteam-bs
22:32 🔗 squires johansch: the best way to process reddit data is through bigquery. it will be insanely more efficient to write your filter in bigquery than to crawl reddit pages.
22:32 🔗 johansch how do you figure? you mean all of the reddit data is already in BQ?
22:33 🔗 bwn has quit IRC (Ping timeout: 244 seconds)
22:34 🔗 squires yes
22:49 🔗 bwn has joined #archiveteam-bs
22:51 🔗 i336 johansch: hey there. I'm also very new here. I was similarly squashed when I tried to post to #archiveteam. that channel is basically for announcements only that everyone needs to read, so general chat makes people go "oh something important" and then when it's not they get mad
22:51 🔗 godane so i'm at 1019k items now
22:51 🔗 i336 johansch: chat in here is fine though. about imgur, that is a _lot_ of data. like many GB of data. the thing is, most of it is junk.
22:52 🔗 i336 godane: I'm running slackware too, November 16, 1881 works fine for me, as does November 10, 1790
22:52 🔗 godane weird
22:53 🔗 johansch hi i336.. i found someone who seems quite initiated who was happy to talk to me in private... :)
22:53 🔗 i336 johansch: oh that's great to hear ^^
22:55 🔗 i336 godane: the only thing I can think is that our /etc/localtime is different. mine's EST
22:56 🔗 johansch I would still like to re-state a fact: the way you guys are naming this pair of channels is like you're setting up newbs to run away, screaming.
22:57 🔗 xmc thank you for your input
22:57 🔗 i336 it is a bit unintuitive. no disrespect but I did feel like I got pelted with a bucket of water yesterday - but only a bit, I did get it in the end.
22:57 🔗 xmc unfortunately it is impossible to rename channels once they're established
22:57 🔗 johansch well, no, it's not
22:57 🔗 * i336 mumbles something inaudible about channel forwards
22:57 🔗 xmc efnet does not provide any facilities in that regard
22:58 🔗 i336 o.o
22:58 🔗 xmc efnet does not provide any services at all
22:58 🔗 xmc we're camping in the mountains
22:58 🔗 i336 wow. I see. maybe a topic change could be in order
22:58 🔗 xmc have to bring your own water, have to pack out all your cookie boxes
22:58 🔗 i336 haha
23:00 🔗 * i336 s/lengthy\/off-topic in #archiveteam-bs/this channel is for alerts only - use #archiveteam-bs for ALL discussions/
23:01 🔗 * i336 queues an edit to the wiki that says "if you're new, you probably want to join #archiveteam-bs"
23:01 🔗 * i336 nudges the s// and queue in xmc's direction
23:01 🔗 xmc why don't you edit the wiki?
23:01 🔗 i336 okay!
23:01 🔗 xmc it's a -*- wiki -*-
23:01 🔗 i336 I didn't know I had edit permission. caveat emptor
23:01 🔗 xmc everyone does, once they join
23:02 🔗 i336 oh.
23:02 🔗 * i336 facepalm
23:03 🔗 i336 xmc: I can't edit the front page. that's where I'd want to add this.
23:04 🔗 xmc turns out i can't either
23:04 🔗 * i336 is unsure what to say at this point
23:04 🔗 xmc SketchCow: beep
23:06 🔗 i336_ has joined #archiveteam-bs
23:06 🔗 i336_ that's better
23:06 🔗 i336_ local irssi ftw
23:06 🔗 xmc http://archiveteam.org/index.php?title=IRC#Special_ArchiveTeam_IRC_rules
23:06 🔗 xmc i should point out that everything you've brought up is already listed
23:07 🔗 i336_ you're right
23:08 🔗 i336_ I was originally going to try and put #archiveteam-bs references to all the spots the channel is referenced on the main page, but as I read this, I switched tactics and decided it would be a better idea to just put links to this page near all the IRC references instead
23:08 🔗 SketchCow Whut
23:09 🔗 SketchCow A little hint
23:10 🔗 SketchCow If you walk into a 7 year old channel going "u dun it rong"
23:10 🔗 SketchCow You might not have all the facts.
23:11 🔗 i336_ SketchCow: I completely agree. I figured this was an established place. I wanted to edit the wiki to make the rules more visible/accessible to newcomers so they can get up to speed much more quickly on the established way things are done. I have zero problem with them.
23:11 🔗 xmc go ahead and edit the wiki to your satisfaction
23:11 🔗 * i336_ reiterates the small tidbit about being unable to edit the main page
23:12 🔗 i336_ maybe I can create/use a sandbox page, and someone can copy it over if they like it
23:12 🔗 xmc other than that page, go ahead and edit the wiki to your satisfaction
23:12 🔗 i336_ lol. okay
23:15 🔗 i336_ (it isn't finished yet)
23:17 🔗 GE has quit IRC (Remote host closed the connection)
23:19 🔗 Aranje has joined #archiveteam-bs
23:21 🔗 SketchCow Edit it, put in your suggestions, and I'll approve them over.
23:27 🔗 SketchCow I'm having feelings
23:27 🔗 SketchCow Do I need to break some overzealous dreams here?
23:28 🔗 zino You know you want to.
23:28 🔗 i336_ okay, about to tackle the sandbox page. my two edits to http://archiveteam.org/index.php?title=IRC are the bold bit at the top and the first bullet point in the special IRC rules section.
23:28 🔗 i336_ (thoughts/suggestions welcome)
23:29 🔗 SketchCow This is all going to end sadly, I can see that.
23:30 🔗 SketchCow So, hi, I'm Jason.
23:30 🔗 SketchCow Somewhere, down in the bedrock of Archiveteam, is me.
23:31 🔗 SketchCow You gotta really, really, really work hard these days to dig that far down.
23:31 🔗 SketchCow These are good folks, they get amazing work done.
23:31 🔗 SketchCow So 99.9999% of the time I'm not even needed in the channel. Magic happens.
23:31 🔗 SketchCow You have successfully dug down.
23:31 🔗 SketchCow Now you have me. Hi.
23:31 🔗 * xmc waves quietly
23:31 🔗 * i336_ waves quietly too
23:32 🔗 SketchCow Now, you have multiple projects you've dreamed up.
23:32 🔗 SketchCow One is to save ex.ua.
23:32 🔗 xmc i've been around since day 1 also, but i'm different
23:32 🔗 SketchCow One is to fuck with rover.info to get to ex.ua.
23:33 🔗 i336_ SketchCow: rover.info has mostly the same info on it, but better
23:33 🔗 SketchCow Somewhere down here, you have now encountered several roadblocks, enough that multiple people are messaging me.
23:33 🔗 i336_ o.o
23:33 🔗 i336_ okay, I don't want to do that. that's a bit of a freakout
23:33 🔗 SketchCow It's pretty hard to get multiple people to message me, unless you are all buying me a cake.
23:33 🔗 i336_ okay... whatever line I've stepped on I'd like to say I'm sorry upfront
23:33 🔗 i336_ so, sorry
23:34 🔗 SketchCow Realize there's no single way you can get past me.
23:34 🔗 SketchCow Let's start with that.
23:34 🔗 i336_ okay. not trying to do that, if I seem to be trying to do that then I've made some mistakes somewhere
23:34 🔗 johansch this place does seem very talented at discouraging newbies.
23:35 🔗 i336_ johansch: shh. let me figure out what's happened first. for what it's worth I have possibly been trying too hard.
23:35 🔗 SketchCow sets mode: +b *!*webchat@*.02-2-6c6b701.cust.bredbandsbolaget.se
23:35 🔗 johansch was kicked by SketchCow (johansch)
23:35 🔗 SketchCow Yes, it is.
23:35 🔗 i336_ yikes
23:35 🔗 Frogging lol
23:35 🔗 SketchCow Let me tell you what to not get hung up on.
23:36 🔗 SketchCow - Do not get hung up on someone losing a massive collection of hollywood films
23:36 🔗 SketchCow - Do not get hung up on a crappy .ua version of what_cd
23:36 🔗 SketchCow - Do get hung up on unique Ukrainian culture
23:36 🔗 SketchCow - Do get hung up on unique support materials for same
23:37 🔗 SketchCow If you are capable of finding the first two, then Archive Team can help
23:37 🔗 SketchCow And the Internet Archive can probably take it.
23:37 🔗 SketchCow If not, no.
23:37 🔗 Stiletto has quit IRC (Ping timeout: 246 seconds)
23:38 🔗 SketchCow But don't dream up ridiculous rube goldberg approaches using a combination of darknets and dvrs and dragging us down into DC+++ and god knows what else.
23:38 🔗 i336_ oh lol
23:38 🔗 SketchCow Do you know someone who can be mailed a hard drive, who can do the work.
23:38 🔗 i336_ alright.
23:38 🔗 i336_ noone with fast internet, unfortunately. :(
23:38 🔗 SketchCow Do you know some way to grab unique material without trying to flood all our channels
23:39 🔗 SketchCow Because we're about to ctrl+c and ctrl+v the government over here
23:39 🔗 SketchCow I've hit refresh 3 times and I've still not had any mail to archiveteam from a furious johansch
23:39 🔗 SketchCow I am disappoint.
23:40 🔗 i336_ OK. let me try and explain a bit
23:40 🔗 SketchCow We have interest and are willing to support trying to grab some useful parts of ex.ua
23:40 🔗 SketchCow There's not much left to explain, but go ahead.
23:40 🔗 SketchCow In here.
23:40 🔗 SketchCow And not through /msg mania
23:41 🔗 i336_ fussy but relevant background: I've had some long-term issues with storage and diskspace for about the last 10 years. TL;DR being given old people's computers and no central file server w/ a big disk = snowball of duplicates. trying to solve it, financial issues. so saving stuff is a big deal for me. #2. I saw all the people going "noooo D:" about what.cd, and I guess I fixated on that not happening
23:41 🔗 i336_ again. #3. I don't come up with good ideas when things are going 1,000 miles an hour like this (20 days to save a gigantic website). I have known bugs seeing the bigger picture.
23:42 🔗 xmc trust us, the what.cd data is safe (if inaccessible)
23:43 🔗 i336_ so. unhelpful biases combined with a brain that focuses on detail too much and takes time to come up with ideas that are actually good = I'll acknowledge I've messed around with and mildly annoyed everyone here to some extent
23:43 🔗 i336_ xmc: that is awesome to hear. is that 99% of it, or a 100% snapshot, if I may ask? I have nothing I can do with the info except smile, if it's 100% :)
23:43 🔗 SketchCow What exactly is the difference between 99% and 100%
23:44 🔗 i336_ "we had arrangements in place but when the plug got pulled our $mirroring_system didn't get the last bit"
23:44 🔗 i336_ pretty much any applicable direct interpretation of that
23:45 🔗 SketchCow Yes, I'm just philosophically asking why that's a measurement of happiness.
23:45 🔗 SketchCow Getting hung up on "we got most of it" vs. "we got all of it" is how you end up drinking too much and drowning in a bathtub
23:45 🔗 SketchCow Ever hear of fdupes? Use fdupes
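For readers unfamiliar with fdupes, the general idea behind duplicate finders of that kind is to group files by size and then compare content hashes within each group. The sketch below shows that idea only; it is not fdupes' implementation, and `find_duplicates` and `file_digest` are hypothetical names.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under `root` with identical content."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


if __name__ == "__main__":
    for group in find_duplicates("."):
        print(group)
```

Grouping by size first means most files are never hashed at all, which matters on slow or failing disks.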
23:46 🔗 i336_ fdupes would be awesome if I didn't have 10 HDDs I can't all have plugged in at once... some of which are clicking and need backing up to other disks before I can even use them
23:48 🔗 i336_ I do hear you, I've thought of a lot of ideas, most of them boil down to "agh, TB HDDs are hundreds of dollars, and I have all this other medical stuff that's chewing up my funds first." (annoying and boring long story)
23:49 🔗 xmc as a first approximation, https://archive.org/upload can help take the edge off
23:50 🔗 * i336_ shakes fist at 50GB/mo ISP bandwidth cap (which is pretty much at capacity as the end of the month rolls around)
23:50 🔗 i336_ besides that, I'm on ADSL2+. 80KB/s upload.
23:51 🔗 godane i336_: you may need a local wikipedia
23:51 🔗 xmc you can mail me sd cards or usb drives or whatever and i'll upload it, if you include metadata
23:51 🔗 godane http://download.kiwix.org/zim/wikipedia/?C=M;O=D
23:51 🔗 i336_ godane: I would actually consider that, but I don't browse the site frequently enough
23:51 🔗 i336_ xmc: huh, nice. I'll keep that in mind :)
23:51 🔗 godane i sort of figure that
23:52 🔗 godane but its something
23:52 🔗 xmc i336_: and i'll mail them back, because it sounds like money is an issue for you
23:52 🔗 godane also grab No-Intro collection for tons of retro gaming
23:52 🔗 i336_ godane: oh, okay :>
23:53 🔗 godane this is also the RACHEL project from world possible : http://dev.worldpossible.org/cgi/rachelmods.pl
23:53 🔗 godane it has wikihow
23:53 🔗 i336_ xmc: well, not for the normal/average reasons - I'm just on disability support for mental health issues, and I understand that means I can't earn very much income because of it. but the various medicinal things I require chew up most of the budget. interesting deadlock I've been trying to headscratch how to solve for a while
23:54 🔗 xmc mmm, tricky, that
23:54 🔗 SketchCow godane would know nothing about that
23:54 🔗 i336_ godane: this is interesting. what is this?
23:54 🔗 i336_ SketchCow: what do you mean?
23:55 🔗 i336_ godane: ...oh, it's a portable internet-in-a-box. cool!
23:55 🔗 SketchCow godane, pretty much our most prolific contributor, is on disability
23:55 🔗 SketchCow and is also awesome
23:55 🔗 i336_ wow
23:55 🔗 SketchCow never gets kicked much
23:55 🔗 SketchCow Listens
23:55 🔗 SketchCow I like that guy
23:55 🔗 godane i have a inbox box
23:56 🔗 i336_ I have problems with listening and understanding too; I often don't reach the "aha" moment until some measure of irritation and "what is this guy even..." has happened a bit. Known bug, trying to fix.
23:56 🔗 xmc interim measure, keep your mouth closed a bit more often than is comfortable
23:56 🔗 xmc it actually works really well
23:56 🔗 i336_ I'll try that. interesting way of putting it.
23:57 🔗 i336_ thanks
23:57 🔗 xmc but yeah. when you think "this is confusing" just stay confused and read some more, then if you're *still* confused after a kinda uncomfortably long time, do what you would have done
23:57 🔗 xmc but you have the advantage of having done your homework first
