[00:08] *** BlueMaxim has joined #archiveteam-bs
[01:04] *** ranma has quit IRC (Read error: Operation timed out)
[01:04] *** ranma_ has joined #archiveteam-bs
[01:05] *** ranma_ is now known as ranma
[01:14] *** JesseW has joined #archiveteam-bs
[01:38] *** Ravenloft has quit IRC (Ping timeout: 244 seconds)
[02:20] *** xmc has joined #archiveteam-bs
[02:20] *** swebb sets mode: +o xmc
[02:30] what's the difference between having the bot back up a site vs waiting for archive.org to crawl a site?
[02:30] is the bot more likely to grab all of the assets?
[02:36] *** Stiletto has quit IRC (Read error: Operation timed out)
[02:42] ranma: The Internet Archive does a wide variety of different sorts of crawls -- which ones were you thinking of?
[02:43] The only one that is available to the general public is http://web.archive.org/save/ which only saves a single page at a time (it doesn't follow any links to other pages).
[02:44] They also have the Archive-It service, which they sell access and support for to large institutions so the institutions can run crawls of specific sites or lists of sites they are interested in.
[02:45] They also regularly crawl various lists of popular or otherwise important sites.
[02:45] But those lists are not generally public, or subject to public suggestions.
[02:46] They also re-check URLs they previously archived, although I don't know on what schedule.
[02:46] And there are probably other crawls, too.
[02:52] at this point IA has tools that are more likely to get more of a page
[02:52] e.g. ivan has pointed out to me several IA tools that use Chromium as a crawler, which we could adapt for ArchiveBot
[02:52] really it's just an augment
[02:55] hm, I didn't know IA was actively using Chromium-using tools
[02:55] Do you remember any links?
[02:55] they're in the #archivebot logs
[02:56] I don't know if they're actively being used
[02:56] but they are in the internetarchive github account
[02:56] Ah, ok.
[02:57] *** bsmith093 has quit IRC (Ping timeout: 370 seconds)
[02:58] *** bsmith093 has joined #archiveteam-bs
[03:20] *** fie has joined #archiveteam-bs
[03:52] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[03:53] *** BlueMaxim has joined #archiveteam-bs
[04:00] *** Honno has joined #archiveteam-bs
[04:15] *** Honno_ has joined #archiveteam-bs
[04:17] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:20] *** Honno has quit IRC (Ping timeout: 492 seconds)
[04:28] *** Sk1d has joined #archiveteam-bs
[04:28] *** Sk1d has quit IRC (Connection closed)
[04:32] *** fie has quit IRC (Ping timeout: 244 seconds)
[04:50] *** fie has joined #archiveteam-bs
[05:05] *** Honno__ has joined #archiveteam-bs
[05:07] *** Honno has joined #archiveteam-bs
[05:08] *** jspiros has quit IRC (Read error: Operation timed out)
[05:08] *** jspiros has joined #archiveteam-bs
[05:13] *** Honno__ has quit IRC (Read error: Operation timed out)
[05:17] *** Honno_ has quit IRC (Read error: Operation timed out)
[05:26] *** froakie has joined #archiveteam-bs
[05:27] *** froakie has quit IRC (Client Quit)
[05:32] *** Honno_ has joined #archiveteam-bs
[05:36] *** yakfish has quit IRC (Read error: Operation timed out)
[05:39] *** yakfish has joined #archiveteam-bs
[05:39] *** Honno has quit IRC (Read error: Operation timed out)
[05:54] I feel like usage of python-pip on Debian is intentionally difficult because of some stupid easy_install/pip holy war
[05:55] all I want is to be able to use pessimistic version constraints on Debian jessie
[05:55] the version of pip shipped in jessie is too old for that and it's even too old to be upgraded via pip install --upgrade
[06:03] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[06:03] ugh.. I've had some.. issues with debian jessie as well
[06:06] when developers of development tools bikeshed, everyone loses
[06:07] it's too bad there's no way to redirect that misery to said developers
[06:08] or, rather, in a constructive manner that doesn't involve internet deathmobs
[06:09] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[06:13] hah, seems like it usually ends in pitchforks or yet another shed
[06:43] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[06:44] *** dashcloud has joined #archiveteam-bs
[07:29] *** balrog has quit IRC (Read error: Operation timed out)
[07:30] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[07:32] *** BlueMaxim has joined #archiveteam-bs
[07:35] *** balrog has joined #archiveteam-bs
[07:35] *** swebb sets mode: +o balrog
[07:44] *** schbirid has joined #archiveteam-bs
[07:44] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[07:45] *** BlueMaxim has joined #archiveteam-bs
[07:46] *** Cameron_D has quit IRC (Ping timeout: 370 seconds)
[07:46] *** Cameron_D has joined #archiveteam-bs
[07:48] i've started hating debian since they switched to systemd
[07:48] don't like it.
[07:48] at all.
[07:56] mimimi
[08:06] *** dashcloud has quit IRC (Read error: Operation timed out)
[08:07] systemd is nice
[08:11] https://www.reddit.com/r/linux/comments/4lh7yv/systemd_developer_asks_tmux_and_other_programs_to/
[08:11] *** dashcloud has joined #archiveteam-bs
[08:17] systemd sucks and is designed badly, but there aren't any better options right now. The init system is way too old, the code for it is horrible and it lacks a lot of modern functionality. So systemd is the thing we have to roll with right now, whether we like it or not.
[08:18] Medowar: that's a non-argument
[08:18] the latter doesn't follow from the former
[08:18] there's no moratorium on developing better solutions
[08:18] so no, "we have to roll with it right now, whether we like it or not" is nonsense
[08:19] it is perfectly valid to refuse to use it, ESPECIALLY if it is trying to take over everything and drive out alternatives compatibility-wise, and at the same time get a better thing developed in some way
[08:19] that fixes the issues of old systems without introducing the new systemd issues
[08:20] going "oh well it's all we have right now so let's just shut up and use it" can be actively harmful
[08:23] as a user i love most of systemd
[08:23] journalctl dumping huge coredumps by default is stupid though
[08:24] I don't like how journalctl deals with large logs
[08:24] seeking to the end can take forever
[08:28] joepie91: I am coming from a supercomputing background. For us, init was the thing that caused most of our problems. It is way too slow, the codebase is horrible to adapt, it is overly complicated, it lacks a lot of functionality.
[08:29] We are using our own Linux images and it was an absolute nightmare to build a new system (even though we based it on Scientific Linux) for new hardware or new software
[08:29] optimizing for certain workloads was impossible.
[08:30] When we implemented systemd, we saw a 9% performance gain.
[08:30] Which, for our standards, is HUGE.
[08:31] But yes, it is getting overly complicated, is introducing unnecessary dependencies, and is morally questionable
[08:31] But for us, it was the better choice, so we went with it
[08:34] *** asie has joined #archiveteam-bs
[08:34] hi
[08:35] hi
[08:58] *** JW_work1 has quit IRC (Read error: Operation timed out)
[09:01] *** JW_work has joined #archiveteam-bs
[09:20] *** jut has joined #archiveteam-bs
[10:13] *** SN4T14 has quit IRC (west.us.hub irc.mzima.net)
[10:13] *** xXx_ndidd has quit IRC (west.us.hub irc.mzima.net)
[10:13] *** mutoso has quit IRC (west.us.hub irc.mzima.net)
[10:17] *** SN4T14 has joined #archiveteam-bs
[10:17] *** xXx_ndidd has joined #archiveteam-bs
[10:17] *** mutoso has joined #archiveteam-bs
[10:48] *** midas has quit IRC (Read error: Operation timed out)
[10:48] *** midas has joined #archiveteam-bs
[10:58] *** Honno__ has joined #archiveteam-bs
[10:58] *** Honno_ has quit IRC (Read error: Connection reset by peer)
[11:03] *** antomati_ is now known as antomatic
[11:39] *** hictooth_ has quit IRC (Remote host closed the connection)
[11:56] *** dashcloud has quit IRC (Read error: Operation timed out)
[12:00] *** dashcloud has joined #archiveteam-bs
[12:02] *** n00bLurke has joined #archiveteam-bs
[12:10] *** n00bLurke has quit IRC (n00bLurke)
[12:21] *** arkiver has quit IRC (Ping timeout: 257 seconds)
[12:22] *** sigkell has quit IRC (Ping timeout: 260 seconds)
[12:22] *** sigkell_ is now known as sigkell
[12:33] *** sigkell_ has joined #archiveteam-bs
[12:36] *** arkiver has joined #archiveteam-bs
[12:56] *** n00bLurke has joined #archiveteam-bs
[13:04] *** Honno__ has quit IRC (Ping timeout: 492 seconds)
[13:20] *** dashcloud has quit IRC (Read error: Operation timed out)
[13:24] *** dashcloud has joined #archiveteam-bs
[13:50] *** Boppen has joined #archiveteam-bs
[13:52] *** Sk1d has joined #archiveteam-bs
[13:55] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[13:58] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[13:59] *** Sk1d has joined #archiveteam-bs
[14:20] *** Stiletto has joined #archiveteam-bs
[14:21] *** Boppen has joined #archiveteam-bs
[14:32] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[14:52] *** Cameron_D has quit IRC (Ping timeout: 370 seconds)
[14:52] *** Cameron_D has joined #archiveteam-bs
[14:57] *** Boppen has joined #archiveteam-bs
[15:05] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[15:22] *** Stiletto has quit IRC (Read error: Operation timed out)
[15:22] *** Stiletto has joined #archiveteam-bs
[15:30] *** Sk1d has joined #archiveteam-bs
[15:36] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[15:38] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[15:38] *** Boppen has joined #archiveteam-bs
[15:39] *** Sk1d has joined #archiveteam-bs
[15:53] *** JesseW has joined #archiveteam-bs
[16:03] *** VADemon has joined #archiveteam-bs
[16:08] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[16:09] *** BlueMaxim has quit IRC (Quit: Leaving)
[16:09] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[16:23] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:31] Decentralized Web Summit - Live From The Internet Archive : https://www.youtube.com/watch?v=Yth7O6yeZRE
[16:37] *** Boppen has joined #archiveteam-bs
[16:37] *** Sk1d has joined #archiveteam-bs
[16:42] *** SilSte has quit IRC (Ping timeout: 633 seconds)
[16:46] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[16:46] *** Boppen has joined #archiveteam-bs
[16:46] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[16:48] *** Sk1d has joined #archiveteam-bs
[16:52] *** dan- has quit IRC (Ping timeout: 260 seconds)
[16:54] godane: Ironically streaming on YouTube, probably the largest single point of failure in the modern web
[16:55] *** dan- has joined #archiveteam-bs
[17:12] *** yakfish has quit IRC (Read error: Operation timed out)
[17:15] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[17:16] *** yakfish has joined #archiveteam-bs
[17:19] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[17:43] *** Boppen has joined #archiveteam-bs
[17:45] *** Sk1d has joined #archiveteam-bs
[17:50] "Debian packaging is not that hard."
[17:50] ha
[17:53] *** Sk1d has quit IRC (Ping timeout: 190 seconds)
[17:53] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[18:09] who said that, yipdw?
[18:11] https://wiki.debian.org/HowToPackageForDebian
[18:12] all lies and slander
[18:14] well I gotta build them for something so
[18:16] *** Sk1d has joined #archiveteam-bs
[18:17] *** Boppen has joined #archiveteam-bs
[18:17] this is probably easy in the easy case, but I need to have a package install several systemd unit files
[18:17] the documentation around dh_systemd is uh
[18:17] it's wiki
[18:26] *** Boppen has quit IRC (Ping timeout: 190 seconds)
[18:27] *** Sk1d has quit IRC (hub.se irc.du.se)
[18:40] *** tomwsmf-a has joined #archiveteam-bs
[18:43] *** xXx_ndidd has quit IRC (Read error: Operation timed out)
[18:50] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:51] looks like Watch Dogs 2 is SF
[18:51] *is in SF
[18:52] it's been announced?
[18:52] It would be really cool to see the IA in there
[18:52] all i can think of is a mod to put the Internet Archive building into it
[18:52] https://www.youtube.com/watch?v=m2qEYCuFxGs
[18:54] *** dashcloud has joined #archiveteam-bs
[18:55] it'd be cool if they did somewhere in Canada
[18:55] just for a random change :p
[19:02] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:06] *** dashcloud has joined #archiveteam-bs
[19:15] cough
[19:15] https://i.imgur.com/HLp8z11.jpg
[19:15] * ranma ducks
[19:24] *** closure has joined #archiveteam-bs
[19:44] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:48] *** jut has quit IRC (Leaving)
[19:49] *** Simpbra1 has quit IRC (Leaving)
[19:49] *** mutoso has quit IRC (Read error: Operation timed out)
[19:52] *** mutoso has joined #archiveteam-bs
[20:06] *** dashcloud has joined #archiveteam-bs
[20:35] *** schbirid has quit IRC (Quit: Leaving)
[20:59] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:03] *** dashcloud has joined #archiveteam-bs
[21:22] *** Simpbrain has joined #archiveteam-bs
[21:38] *** ndiddy has joined #archiveteam-bs
[22:01] Hi all. Got an old PC sat in the corner with 400GB or so of disk, and was thinking of throwing it on internetarchive.bak - is that still a thing? If so, would a system on a 200/12 domestic home connection work?
[22:04] 400GB seems a bit small, but the network connection should be OK. (This is a mostly ignorant opinion, though)
[22:05] yeah, it's just an old PC that I got donated from a friend who isn't using it
[22:11] AFAIK, IA.BAK is still a thing, it's just in hibernation mainly due to needing lots more donated space and/or significant improvements to ease-of-installation. But I haven't looked into it in detail, so this may be wrong.
[22:14] *** n00bLurke has quit IRC (n00bLurke)
[22:14] JW_work, thanks. Going to have a look at setting it up and seeing what happens
[22:15] nice!
[22:16] I'm interested in doing so too, but I'll probably buy a 1TB drive and just use that.
[22:16] the script is just doing its SSH thing now
[22:23] *** dashcloud has quit IRC (Read error: Operation timed out)
[22:28] *** dashcloud has joined #archiveteam-bs
[22:35] *** xXx_ndidd has joined #archiveteam-bs
[22:36] http://paste.harrycross.me/view/b011721a
[22:37] JW_work, seem to have run into an issue
[22:38] that does look like a bug :-)
[22:38] IDK more than that.
[22:39] *** ndiddy has quit IRC (Read error: Operation timed out)
[22:39] I may look into it more this evening, but there are likely more knowledgeable people here.
[22:40] JW_work, restarted it and it seems to now be downloading away
[22:41] still worth opening a ticket in the appropriate repo
[22:41] or not
[22:42] nvm, shuf is doing its thing slowly
[22:58] JW_work, now verification is failing all over the place
[22:58] we should probably move this to #internetarchive.bak
[23:24] *** yakfish has quit IRC (Read error: Operation timed out)
[23:32] *** yakfish has joined #archiveteam-bs
[23:51] so looks like MBC Newsdesk for 2003-09-11 only has half of the broadcast for some reason
[23:53] there are even pictures of the weather forecast that's not in the video on their pages for that date