[00:12] *** Start has quit IRC (Read error: Connection reset by peer)
[00:27] *** furrie_ has joined #archiveteam-bs
[00:28] im not sure where else to ask but how do i use wpull's --reject-regex parameter
[00:29] i mean like more than once. this is how to use it once --reject-regex "/login\.php"
[00:30] in parentheses, comma separated, like in -X option, what
[00:37] maybe use nested ors
[00:38] what's an example
[00:38] "login\.php" OR "calendar\.php"?
[00:39] (OR capitalized on purpose)
[00:40] --reject-regex 'login\.php' --reject-regex 'calendar\.php'
[00:40] or
[00:40] --reject-regex "(login|calendar)\.php"
[00:41] thanks
[00:41] It may be worth using grab-site as well, as it allows ignores to be injected while it is running
[00:41] is it okay if i ask how to use grab-site and what it is
[00:43] https://github.com/ludios/grab-site
[00:43] It's like an ArchiveBot you can run locally.
[00:43] https://github.com/ludios/grab-site it is like an archivebot that runs on your local computer, but without the long setup process.
[00:43] Awesome
[00:45] is grab-site an archiveteam software
[00:48] yeah i really wanted something to enter commands in thanks a bundle guys
[00:48] imma head off to try this out
[00:48] *** furrie_ has quit IRC (Quit: Page closed)
[00:51] *** furrie_ has joined #archiveteam-bs
[00:52] alright so i dont quite understand something i installed grab-site but
[00:52] https://github.com/ludios/grab-site - this url says to do this next:
[00:52] To avoid having to type out ~/.local/bin/ below, add this to your ~/.bashrc or ~/.zshrc: PATH="$PATH:$HOME/.local/bin"
[00:53] *** Start has joined #archiveteam-bs
[00:53] there isn't a local/bin on my computer
[00:54] it's /bin/
[00:55] wait wait wait disregard all of that.
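Editor's sketch of the two --reject-regex forms suggested at [00:40], checked with Python's `re` module. It assumes the patterns are applied as unanchored searches against the full URL (an assumption about wpull, not something confirmed in the chat); the URLs and the helper `rejected_by_any` are hypothetical.

```python
import re

# Two separate patterns, mirroring --reject-regex being passed twice.
separate = [re.compile(r"login\.php"), re.compile(r"calendar\.php")]

# One combined pattern using alternation, mirroring "(login|calendar)\.php".
combined = re.compile(r"(login|calendar)\.php")


def rejected_by_any(url, patterns):
    """Return True if any pattern matches somewhere in the URL (assumed semantics)."""
    return any(p.search(url) for p in patterns)


# Hypothetical URLs used only for illustration.
urls = [
    "http://example.com/login.php",
    "http://example.com/calendar.php?month=5",
    "http://example.com/index.php",
]

for url in urls:
    by_separate = rejected_by_any(url, separate)
    by_combined = combined.search(url) is not None
    # Under the stated assumption, both forms agree for every URL.
    print(url, by_separate, by_combined)
```

Under that assumption the two forms reject the same URLs, so the choice is mostly about readability; the backslash before the dot keeps it from matching any character.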
[00:55] i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work
[00:55] *** BlueMaxim has joined #archiveteam-bs
[00:57] im confused
[01:01] okay wait i think i may be on to somethin
[01:10] aaaaaaaaa: grab-site asks me to add PATH="$PATH:$HOME/.local/bin" to my .bashrc. problem is where is my bashrc file
[01:13] in your home folder
[01:13] it doesn't exist is that a problem (im using debian)
[01:14] ls -a
[01:15] *** ivan` has joined #archiveteam-bs
[01:15] furrie_: what OS?
[01:15] oh
[01:15] thanks aaaaaaaaa for being my front-line tech support
[01:15] someone's got to do it
[01:16] i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work
[01:16] you can't cd into a file, only a directory
[01:16] don't cd there anyway
[01:17] if pip3 succeeded you can literally just type ~/.local/bin/gs-server
[01:17] follow by the Enter key
[01:18] also, if you are not aware, you can tab-complete filenames by typing something like ~/.loc and watching it complete ~/.local
[01:18] .local/bin/gs-server: No such file or directory
[01:18] i got that for an error
[01:18] can you paste the command you ran
[01:20] Wait, i installed it on / and not ~
[01:20] it works now
[01:20] how did you install it on /?
[01:20] did you omit the --user?
[01:21] i meant i ran the installation commands while in / in terminal
[01:21] the current directly does not affect where pip3 installs things
[01:21] directory*
[01:21] ~ means your home directory i.e. $HOME
[01:22] yeah well never mind it's working
[01:22] I am worried more about you having an idea of what you're doing ;)
[01:23] did pip3 on debian really install to /.local i.e. / the root directory? that would make no sense
[01:23] ls -la /.local/bin please
[01:25] comp ~ $ ls -la /.local/bin ls: cannot access /.local/bin: No such file or directory comp ~ $ ls -la /.local/bin ls: cannot access /.local/bin: No such file or directory
[01:25] yeah, I thought so
[01:25] i'm on ~ in terminal not /
[01:25] but ~/.local/bin/gs-server strangely works
[01:26] yes, that's exactly what the README says to do
[01:27] that ~/ resolves to your home directory and it does not matter where you are
[01:27] yeah
[01:27] *** yakfish has joined #archiveteam-bs
[01:27] i'm actually on the part trying to figure out how to add a site onto the grab-site dashboard
[01:28] furrie_: Have you read the README the steps are all very well documented
[01:28] almost too well.... What kind of docs are user friendly?
[01:29] https://github.com/ludios/grab-site -- isn't the readme on here
[01:31] yes
[01:33] http://pastebin.com/1w20eMaW - i didn't use tmux but i did this in a separate tab from the tab where grab-site is running
[01:33] hopefully im not frustrating you
[01:34] no, that looks like a problem that could be my fault
[01:34] hm
[01:34] oh
[01:35] so do you know what to do to fix the issue ivan
[01:35] try pip3 install --user --upgrade trollius
[01:35] hey, now it works
[01:36] cool
[01:36] you should tmux if you plan to ever disconnect from your terminal session
[01:37] just install tmux via apt and run tmux before doing stuff
[01:37] ctrl-b d to detach from your tmux, tmux attach to reattach
[01:38] how do i add ignoresets while grabsite is running
[01:40] edit DIR/igsets
[01:41] or more specifically add commands while the site's running
[01:41] you can't really give grab-site commands, you edit files in whatever directory it created, and it picks up the changes
[01:41] Oh
[01:43] they'
[01:43] what about for regex ignores
[01:43] DIR/ignores
[01:44] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
[01:44] separated line by line or comma
[01:45] line
[01:45] "DIR/ignores is a newline-separated list of Python 3 regular expressions to use in addition to the ignore sets."
[01:50] *** kyan has joined #archiveteam-bs
[01:51] *** kyanz`bot has joined #archiveteam-bs
[01:53] *** furrie_ has quit IRC (Ping timeout: 240 seconds)
[02:02] *** furrie has joined #archiveteam-bs
[02:02] last important question--how do you change the directory in which grab-site crawls are being saved to (default: home dir)
[02:06] it's the current working directory
[02:07] (it's in the readme: https://github.com/ludios/grab-site/)
[02:07] furrie: ^
[02:07] yeah, just cd to another directory first
[02:07] awesome, thanks.
[02:08] yeah because my computer has limited memory, and i have an external hard drive that could do with some use.
[02:10] *** furrie has quit IRC (Quit: Page closed)
[02:17] *** kyanz`bot has quit IRC (Quit: KeyboardInterrupt)
[02:19] *** yakfish has quit IRC (Quit: Read error: Operation timed out)
[02:20] *** yakfish has joined #archiveteam-bs
[02:20] *** furrie has joined #archiveteam-bs
[02:21] *** kyanz`bot has joined #archiveteam-bs
[02:21] *** kyanz`bot has quit IRC (Remote host closed the connection)
[02:22] okay sorry for coming back yet again. i have a problem--i cannot use grab-site in my external hard drive
[02:22] furrie: well, what went wrong
[02:22] http://pastebin.com/KeG4Q3ir
[02:22] ah, yes, sorry, it looks like it is broken with spaces in the directory name
[02:23] so it can usually work in HDD otherwise?
[02:23] yes
[02:24] i assume there's no workaround other than changing the HDD's name
[02:24] you can create a symbolic link to the directory with ln -s and cd into that directory instead
[02:24] I will bbl
[02:25] okay
[02:26] *** kyanz`bot has joined #archiveteam-bs
[02:26] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[02:27] *** zenguy_pc has joined #archiveteam-bs
[02:27] bash: cd: bone: Too many levels of symbolic links
[02:29] thats the error i get from trying to cd into the directory within my HDD with a space in it
[02:29] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[02:30] *** zenguy_pc has joined #archiveteam-bs
[02:32] *** xtr-201 has quit IRC (Read error: Operation timed out)
[02:35] *** kyanz`bot has quit IRC (Remote host closed the connection)
[02:35] oh wait, i'm figuring it out
[02:35] *** furrie has quit IRC (Quit: Page closed)
[02:35] Try absolute paths, that way you don't have to worry about the strange way ln treats relative links
[02:35] or not
[03:01] *** mistym has joined #archiveteam-bs
[03:48] *** rejk has quit IRC (Ping timeout: 483 seconds)
[03:53] if you see furrie tell him the space issue is fixed and he just needs to install grab-site again
[04:10] *** aaaaaaaaa has quit IRC (Leaving)
[04:42] *** mistym_ has joined #archiveteam-bs
[04:43] *** mistym has quit IRC (Ping timeout: 252 seconds)
[04:55] *** mistym has joined #archiveteam-bs
[05:01] *** mistym_ has quit IRC (Read error: Operation timed out)
[05:41] *** JesseW has joined #archiveteam-bs
[07:12] *** JesseW has quit IRC (Quit: Leaving.)
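Editor's sketch of the symlink workaround with absolute paths from [02:24] and [02:35], written in Python rather than `ln -s`. It assumes a POSIX system; the directory names ("My Drive", "crawls", "crawls-link") are hypothetical stand-ins created under a temporary directory, not the paths from the chat. A relative link target is resolved against the directory that holds the link, not the current directory, which is one common way to end up with a link that points back at itself and produces "Too many levels of symbolic links".

```python
import os
import tempfile

# Work inside a throwaway directory so the sketch does not touch real data.
base = tempfile.mkdtemp()

# Hypothetical stand-in for an external drive mounted under a name with a space.
target = os.path.join(base, "My Drive", "crawls")
os.makedirs(target)

# Space-free link name. Both paths are absolute, so the link target does not
# depend on which directory the command happened to be run from.
link = os.path.join(base, "crawls-link")
os.symlink(os.path.abspath(target), link)

# The link resolves to the directory on the "drive".
print(os.path.realpath(link) == os.path.realpath(target))

# The stored link target is absolute.
print(os.path.isabs(os.readlink(link)))
```

Per [03:53], the underlying problem with spaces was later fixed in grab-site itself, so the link is only a stopgap.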
[07:28] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[07:35] *** mistym has quit IRC (Remote host closed the connection)
[07:38] *** BlueMaxim has joined #archiveteam-bs
[08:05] *** superkuh has quit IRC (Read error: Operation timed out)
[08:23] *** superkuh has joined #archiveteam-bs
[08:36] *** mistym has joined #archiveteam-bs
[08:41] *** schbirid has joined #archiveteam-bs
[08:43] *** mistym has quit IRC (Read error: Operation timed out)
[09:01] *** SimpBrain has joined #archiveteam-bs
[09:19] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:01] *** BlueMaxim has joined #archiveteam-bs
[12:04] *** brayden has joined #archiveteam-bs
[12:06] *** schbirid has quit IRC (Read error: Operation timed out)
[12:08] *** brayden has quit IRC (Read error: Connection reset by peer)
[12:18] *** schbirid has joined #archiveteam-bs
[12:40] *** ivan` has left
[12:46] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:55] *** RichardG has quit IRC (Remote host closed the connection)
[12:58] *** RichardG has joined #archiveteam-bs
[13:25] *** brayden has joined #archiveteam-bs
[14:16] *** tomwsmf-a has joined #archiveteam-bs
[14:17] *** kyan has quit IRC (Quit: This computer has gone to sleep)
[14:19] *** useretail has quit IRC (Remote host closed the connection)
[14:41] *** mistym has joined #archiveteam-bs
[14:51] *** mistym has quit IRC (Ping timeout: 606 seconds)
[15:08] *** furrie has joined #archiveteam-bs
[15:10] regex question: say i want to exclude a site in grab-site's DIR/ignores file. how can i make it so that i exclude, for instance www.example.com and example.com both on the same line?
[15:11] would i just have to add http://(www)?\.example\.com/? or what
[15:12] you want to put the . in the parens with the www
[15:12] but yeah
[15:13] also the slash before the period right
[15:14] yes
[15:14] so in total -- http://(www\.)?example\.com/? correct
[15:15] yep
[15:17] *** mistym has joined #archiveteam-bs
[15:20] okay thanks
[15:21] and to only exclude, for instance, &sid= only at the end of a url i do &sid=.*$ probably not right, but what's the correct answer
[15:22] does archivebot and grabsite allow the regex .* to be used in any case
[15:24] maybe to exclude &sid= only at the end of a url for instance i might do ^.*&sid= correct
[15:25] yeah sorry if this is unclear
[15:27] disregard maybe it's ^http://www\.example\.com/.*/?&sid=
[15:30] nm actually
[15:31] *** furrie has quit IRC (Quit: Page closed)
[15:44] *** JesseW has joined #archiveteam-bs
[15:45] *** brayden_ has joined #archiveteam-bs
[15:50] *** brayden has quit IRC (Read error: Operation timed out)
[16:27] *** Apathy has quit IRC (Quit: OOOOoooooooooo................)
[16:33] *** mistym_ has joined #archiveteam-bs
[16:35] *** mistym has quit IRC (Ping timeout: 492 seconds)
[16:50] *** godane has quit IRC (Read error: Operation timed out)
[17:08] *** godane has joined #archiveteam-bs
[17:10] does any here use rtl8821ae for wifi?
[17:22] *** kniffy has quit IRC (Ping timeout: 483 seconds)
[17:24] *** mistym_ has quit IRC (Remote host closed the connection)
[17:29] *** Apathy has joined #archiveteam-bs
[17:39] *** SadDM has quit IRC (Ping timeout: 483 seconds)
[17:43] *** SadDM has joined #archiveteam-bs
[18:08] *** JesseW has quit IRC (Quit: Leaving.)
[18:12] *** kniffy has joined #archiveteam-bs
[18:13] *** JesseW has joined #archiveteam-bs
[18:14] *** dashcloud has quit IRC (Ping timeout: 265 seconds)
[18:16] *** JesseW has quit IRC (Client Quit)
[18:18] *** ivan` has joined #archiveteam-bs
[18:21] *** JesseW has joined #archiveteam-bs
[18:27] *** dashcloud has joined #archiveteam-bs
[18:40] *** useretail has joined #archiveteam-bs
[18:45] *** JesseW has quit IRC (Quit: Leaving.)
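Editor's sketch for the regex questions at [15:10] through [15:27], which went partly unanswered in the channel. It treats a hypothetical DIR/ignores file as newline-separated Python 3 regular expressions (as quoted at [01:45]) and assumes each pattern is applied as an unanchored search against the URL; that matching behaviour is an assumption, not something stated in the chat. The pattern `&sid=[^&]*$` is the editor's suggestion for "sid as the last parameter", and `is_ignored` and the forum.example.org URLs are hypothetical.

```python
import re

# Hypothetical contents of DIR/ignores: one Python 3 regex per line.
ignores_text = r"""
http://(www\.)?example\.com/?
&sid=[^&]*$
"""

# Compile every non-empty line as its own pattern.
patterns = [re.compile(line) for line in ignores_text.splitlines() if line.strip()]


def is_ignored(url):
    """Return True if any ignore pattern matches somewhere in the URL (assumed semantics)."""
    return any(p.search(url) for p in patterns)


# The optional group (www\.)? covers both host spellings.
print(is_ignored("http://www.example.com/page"))
print(is_ignored("http://example.com"))

# sid is the last parameter here, so the second pattern matches.
print(is_ignored("http://forum.example.org/viewtopic.php?t=1&sid=abc123"))

# sid is followed by another parameter, so [^&]*$ cannot reach the end.
print(is_ignored("http://forum.example.org/viewtopic.php?t=1&sid=abc123&start=20"))

# By contrast, &sid=.*$ matches wherever &sid= appears, because .* runs to the end.
loose = re.compile(r"&sid=.*$")
print(loose.search("http://forum.example.org/viewtopic.php?t=1&sid=abc123&start=20") is not None)
```

In Python's `re` syntax `.*` is ordinary and legal; the catch is that it is greedy and matches almost anything, so on its own it does not pin a parameter to the end of the URL.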
[19:39] *** aaaaaaaaa has joined #archiveteam-bs
[19:52] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[19:56] *** schbirid has quit IRC (Leaving)
[20:02] *** dashcloud has quit IRC (Read error: Operation timed out)
[20:05] *** dashcloud has joined #archiveteam-bs
[20:44] *** tomwsmf-a has joined #archiveteam-bs
[20:54] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[21:18] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[21:18] *** godane has quit IRC (Quit: Leaving.)
[21:22] *** dashcloud has joined #archiveteam-bs
[21:24] *** godane has joined #archiveteam-bs
[21:28] i need help guys
[21:28] my rtl8821ae wifi card sucks
[21:29] *** Asparagir has joined #archiveteam-bs
[21:30] i can't really connect to internet with my alfa awus036h usb
[21:31] my main problem is that i will be limited to what i can grab with mplayer -dumpstream
[21:33] *** marvinw has quit IRC (Remote host closed the connection)
[21:40] *** marvinw has joined #archiveteam-bs
[21:41] godane: what's wrong with it?
[21:43] its slower then the other wifi card i was using
[21:48] *** Kazzy has quit IRC (Ping timeout: 252 seconds)
[21:48] ah, hm, weird
[21:49] if you stay plugged in, try turning off power saving modes or try a better (or DIY) antenna.
[21:52] the wifi card is at
[21:52] 72.2 Mb/s
[21:52] i normally never had a card go at 72.2Mb/s
[21:57] Isn't that the maximum, at least if you don't channel bond?
[22:00] *** godane has quit IRC (Quit: Leaving.)
[22:02] *** godane has joined #archiveteam-bs
[22:02] so wifi hates having channel changed
[22:09] *** BlueMaxim has joined #archiveteam-bs
[22:10] *** Asparagir has quit IRC (Asparagir)
[22:16] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[22:34] *** ripvanwin has quit IRC (Leaving)
[22:34] *** SadDM has quit IRC (Ping timeout: 483 seconds)
[22:42] *** ripvanwin has joined #archiveteam-bs
[22:48] *** SadDM has joined #archiveteam-bs
[23:01] *** BlueMaxim has joined #archiveteam-bs
[23:22] *** Balrog_ has joined #archiveteam-bs
[23:23] *** Balrog_ is now known as Balrog-34
[23:46] *** Kazzy has joined #archiveteam-bs
[23:59] Balrog-34: oh, I meant just from what I've seen in recent months
[23:59] archive jobs going across the warrior and archivebot and other things
[23:59] it's quite possible that I am just looking at all the wrong logs and/or have a terrible selection bias but yeah
[23:59] so it's not you