#archiveteam-bs 2015-08-02,Sun

Time Nickname Message
00:12 🔗 Start has quit IRC (Read error: Connection reset by peer)
00:27 🔗 furrie_ has joined #archiveteam-bs
00:28 🔗 furrie_ im not sure where else to ask but how do i use wpull's --reject-regex parameter
00:29 🔗 furrie_ i mean like more than once. this is how to use it once --reject-regex "/login\.php"
00:30 🔗 furrie_ in parentheses, comma separated, like in -X option, what
00:37 🔗 aaaaaaaaa maybe use nested ors
00:38 🔗 furrie_ what's an example
00:38 🔗 furrie_ "login\.php" OR "calendar\.php"?
00:39 🔗 furrie_ (OR capitalized on purpose)
00:40 🔗 xmc --reject-regex 'login\.php' --reject-regex 'calendar\.php'
00:40 🔗 xmc or
00:40 🔗 xmc --reject-regex "(login|calendar)\.php"
00:41 🔗 furrie_ thanks
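A minimal sketch of why the two forms xmc suggests reject the same URLs, using Python's `re` module. The URLs are made up for illustration, and treating a pattern as a hit when it is found anywhere in the URL is an assumption here, not something confirmed in the channel.

```python
import re

# Hypothetical URLs, used only for illustration.
urls = [
    "http://example.com/login.php",
    "http://example.com/calendar.php?month=8",
    "http://example.com/index.php",
]

# Form 1: two separate patterns, as if --reject-regex were given twice.
separate = [re.compile(r"login\.php"), re.compile(r"calendar\.php")]

# Form 2: a single pattern using alternation.
combined = re.compile(r"(login|calendar)\.php")


def rejected_by_any(url, patterns):
    """Return True if any pattern is found anywhere in the URL (assumed semantics)."""
    return any(p.search(url) for p in patterns)


for url in urls:
    by_separate = rejected_by_any(url, separate)
    by_combined = bool(combined.search(url))
    print(url, by_separate, by_combined)
```

Both columns agree for every URL in the list, so the choice between the two forms is mostly about readability.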
00:41 🔗 aaaaaaaaa It may be worth using grab-site as well, as it allows ignores to be injected while it is running
00:41 🔗 furrie_ is it okay if i ask how to use grab-site and what it is
00:43 🔗 garyrh https://github.com/ludios/grab-site
00:43 🔗 garyrh It's like an ArchiveBot you can run locally.
00:43 🔗 aaaaaaaaa https://github.com/ludios/grab-site it is like an archivebot that runs on your local computer, but without the long setup process.
00:43 🔗 furrie_ Awesome
00:45 🔗 furrie_ is grab-site an archiveteam software
00:48 🔗 furrie_ yeah i really wanted something to enter commands in thanks a bundle guys
00:48 🔗 furrie_ imma head off to try this out
00:48 🔗 furrie_ has quit IRC (Quit: Page closed)
00:51 🔗 furrie_ has joined #archiveteam-bs
00:52 🔗 furrie_ alright so i dont quite understand something. i installed grab-site but
00:52 🔗 furrie_ https://github.com/ludios/grab-site - this url says to do this next:
00:52 🔗 furrie_ To avoid having to type out ~/.local/bin/ below, add this to your ~/.bashrc or ~/.zshrc: PATH="$PATH:$HOME/.local/bin"
00:53 🔗 Start has joined #archiveteam-bs
00:53 🔗 furrie_ there isn't a local/bin on my computer
00:54 🔗 furrie_ it's /bin/
00:55 🔗 furrie_ wait wait wait disregard all of that.
00:55 🔗 furrie_ i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work
00:55 🔗 BlueMaxim has joined #archiveteam-bs
00:57 🔗 furrie_ im confused
01:01 🔗 furrie_ okay wait i think i may be on to somethin
01:10 🔗 furrie_ aaaaaaaaa: grab-site asks me to add PATH="$PATH:$HOME/.local/bin" to my .bashrc. problem is where is my bashrc file
01:13 🔗 aaaaaaaaa in your home folder
01:13 🔗 furrie_ it doesn't exist is that a problem (im using debian)
01:14 🔗 aaaaaaaaa ls -a
01:15 🔗 ivan` has joined #archiveteam-bs
01:15 🔗 ivan` furrie_: what OS?
01:15 🔗 ivan` oh
01:15 🔗 ivan` thanks aaaaaaaaa for being my front-line tech support
01:15 🔗 aaaaaaaaa someone's got to do it
01:16 🔗 ivan` <furrie_> i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work
01:16 🔗 ivan` you can't cd into a file, only a directory
01:16 🔗 ivan` don't cd there anyway
01:17 🔗 ivan` if pip3 succeeded you can literally just type ~/.local/bin/gs-server
01:17 🔗 ivan` followed by the Enter key
01:18 🔗 ivan` also, if you are not aware, you can tab-complete filenames by typing something like ~/.loc<TAB> and watching it complete ~/.local
01:18 🔗 furrie_ .local/bin/gs-server: No such file or directory
01:18 🔗 furrie_ i got that for an error
01:18 🔗 ivan` can you paste the command you ran
01:20 🔗 furrie_ Wait, i installed it on / and not ~
01:20 🔗 furrie_ it works now
01:20 🔗 ivan` how did you install it on /?
01:20 🔗 ivan` did you omit the --user?
01:21 🔗 furrie_ i meant i ran the installation commands while in / in terminal
01:21 🔗 ivan` the current directly does not affect where pip3 installs things
01:21 🔗 ivan` directory*
01:21 🔗 ivan` ~ means your home directory i.e. $HOME
01:22 🔗 furrie_ yeah well never mind it's working
01:22 🔗 ivan` I am worried more about you having an idea of what you're doing ;)
01:23 🔗 ivan` did pip3 on debian really install to /.local i.e. / the root directory? that would make no sense
01:23 🔗 ivan` ls -la /.local/bin please
01:25 🔗 furrie_ comp ~ $ ls -la /.local/bin ls: cannot access /.local/bin: No such file or directory
01:25 🔗 ivan` yeah, I thought so
01:25 🔗 furrie_ i'm on ~ in terminal not /
01:25 🔗 furrie_ but ~/.local/bin/gs-server strangely works
01:26 🔗 ivan` yes, that's exactly what the README says to do
01:27 🔗 ivan` that ~/ resolves to your home directory and it does not matter where you are
01:27 🔗 furrie_ yeah
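ivan`'s point that `~/` resolves to the home directory no matter where you are can be checked with a short sketch. It uses Python's `os.path.expanduser`, which mimics the shell's tilde expansion; moving to the temporary directory is only a stand-in for "being somewhere else".

```python
import os
import tempfile

path = "~/.local/bin/gs-server"

# Expand the tilde from the current directory.
before = os.path.expanduser(path)

# Move to a different directory and expand again.
original_cwd = os.getcwd()
os.chdir(tempfile.gettempdir())
after = os.path.expanduser(path)
os.chdir(original_cwd)

# The expansion depends on the home directory, not on the working directory.
print(before == after)
```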
01:27 🔗 yakfish has joined #archiveteam-bs
01:27 🔗 furrie_ i'm actually on the part trying to figure out how to add a site onto the grab-site dashboard
01:28 🔗 dxrt furrie_: Have you read the README the steps are all very well documented
01:28 🔗 aaaaaaaaa almost too well.... What kind of docs are user friendly?
01:29 🔗 furrie_ https://github.com/ludios/grab-site -- isn't the readme on here
01:31 🔗 ivan` yes
01:33 🔗 furrie_ http://pastebin.com/1w20eMaW - i didn't use tmux but i did this in a separate tab from the tab where grab-site is running
01:33 🔗 furrie_ hopefully im not frustrating you
01:34 🔗 ivan` no, that looks like a problem that could be my fault
01:34 🔗 furrie_ hm
01:34 🔗 furrie_ oh
01:35 🔗 furrie_ so do you know what to do to fix the issue ivan
01:35 🔗 ivan` try pip3 install --user --upgrade trollius
01:35 🔗 furrie_ hey, now it works
01:36 🔗 ivan` cool
01:36 🔗 ivan` you should tmux if you plan to ever disconnect from your terminal session
01:37 🔗 ivan` just install tmux via apt and run tmux before doing stuff
01:37 🔗 ivan` ctrl-b d to detach from your tmux, tmux attach to reattach
01:38 🔗 furrie_ how do i add ignoresets while grab-site is running
01:40 🔗 ivan` edit DIR/igsets
01:41 🔗 furrie_ or more specifically add commands while the site's running
01:41 🔗 ivan` you can't really give grab-site commands, you edit files in whatever directory it created, and it picks up the changes
01:41 🔗 furrie_ Oh
01:43 🔗 furrie_ they'
01:43 🔗 furrie_ what about for regex ignores
01:43 🔗 ivan` DIR/ignores
01:44 🔗 tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
01:44 🔗 furrie_ separated line by line or comma
01:45 🔗 ivan` line
01:45 🔗 ivan` "DIR/ignores is a newline-separated list of Python 3 regular expressions to use in addition to the ignore sets."
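A rough sketch of what "a newline-separated list of Python 3 regular expressions" looks like in practice. The file contents and the helper names `load_patterns` and `is_ignored` are hypothetical, and matching anywhere in the URL is an assumption about how the patterns are applied, not grab-site's confirmed behavior.

```python
import re

# Hypothetical contents of DIR/ignores: one Python 3 regex per line.
ignores_text = r"""
/login\.php
/calendar\.php
"""


def load_patterns(text):
    """Compile each non-empty line as a regular expression."""
    return [re.compile(line) for line in text.splitlines() if line.strip()]


def is_ignored(url, patterns):
    # Assumption: a pattern found anywhere in the URL counts as an ignore.
    return any(p.search(url) for p in patterns)


patterns = load_patterns(ignores_text)
print(is_ignored("http://example.com/login.php", patterns))
print(is_ignored("http://example.com/about.html", patterns))
```

Because each line is its own pattern, no commas or parentheses are needed to combine them.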
01:50 🔗 kyan has joined #archiveteam-bs
01:51 🔗 kyanz`bot has joined #archiveteam-bs
01:53 🔗 furrie_ has quit IRC (Ping timeout: 240 seconds)
02:02 🔗 furrie has joined #archiveteam-bs
02:02 🔗 furrie last important question--how do you change the directory in which grab-site crawls are being saved to (default: home dir)
02:06 🔗 kyan it's the current working directory
02:07 🔗 kyan (it's in the readme: https://github.com/ludios/grab-site/)
02:07 🔗 kyan furrie: ^
02:07 🔗 ivan` yeah, just cd to another directory first
02:07 🔗 furrie awesome, thanks.
02:08 🔗 furrie yeah because my computer has limited memory, and i have an external hard drive that could do with some use.
02:10 🔗 furrie has quit IRC (Quit: Page closed)
02:17 🔗 kyanz`bot has quit IRC (Quit: KeyboardInterrupt)
02:19 🔗 yakfish has quit IRC (Quit: Read error: Operation timed out)
02:20 🔗 yakfish has joined #archiveteam-bs
02:20 🔗 furrie has joined #archiveteam-bs
02:21 🔗 kyanz`bot has joined #archiveteam-bs
02:21 🔗 kyanz`bot has quit IRC (Remote host closed the connection)
02:22 🔗 furrie okay sorry for coming back yet again. i have a problem--i cannot use grab-site in my external hard drive
02:22 🔗 ivan` furrie: well, what went wrong
02:22 🔗 furrie http://pastebin.com/KeG4Q3ir
02:22 🔗 ivan` ah, yes, sorry, it looks like it is broken with spaces in the directory name
02:23 🔗 furrie so it can usually work in HDD otherwise?
02:23 🔗 ivan` yes
02:24 🔗 furrie i assume there's no workaround other than changing the HDD's name
02:24 🔗 ivan` you can create a symbolic link to the directory with ln -s and cd into that directory instead
02:24 🔗 ivan` I will bbl
02:25 🔗 furrie okay
02:26 🔗 kyanz`bot has joined #archiveteam-bs
02:26 🔗 zenguy_pc has quit IRC (Read error: Connection reset by peer)
02:27 🔗 zenguy_pc has joined #archiveteam-bs
02:27 🔗 furrie bash: cd: bone: Too many levels of symbolic links
02:29 🔗 furrie thats the error i get from trying to cd into the directory within my HDD with a space in it
02:29 🔗 zenguy_pc has quit IRC (Read error: Connection reset by peer)
02:30 🔗 zenguy_pc has joined #archiveteam-bs
02:32 🔗 xtr-201 has quit IRC (Read error: Operation timed out)
02:35 🔗 kyanz`bot has quit IRC (Remote host closed the connection)
02:35 🔗 furrie oh wait, i'm figuring it out
02:35 🔗 furrie has quit IRC (Quit: Page closed)
02:35 🔗 aaaaaaaaa Try absolute paths, that way you don't have to worry about the strange way ln treats relative links
02:35 🔗 aaaaaaaaa or not
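A small sketch of the symlink workaround and of one common way to end up with "Too many levels of symbolic links". It uses Python's `os.symlink` in place of `ln -s`, assumes a POSIX system, and works in a throwaway temporary directory; the directory names are invented for the example.

```python
import errno
import os
import tempfile

# Throwaway area standing in for the external drive.
base = tempfile.mkdtemp()
target = os.path.join(base, "My Drive", "crawls")  # directory name with a space
os.makedirs(target)

# An absolute target resolves the same way wherever the link is created.
link = os.path.join(base, "crawls")
os.symlink(os.path.abspath(target), link)
link_resolves = os.path.realpath(link) == os.path.realpath(target)
print(link_resolves)

# A relative target is interpreted relative to the link's own directory,
# so a link named "bone" that points at "bone" refers to itself.
loop = os.path.join(base, "bone")
os.symlink("bone", loop)
try:
    os.stat(loop)  # following the link never reaches a real file
    loop_errno = None
except OSError as exc:
    loop_errno = exc.errno
print(loop_errno == errno.ELOOP)
```

The second half only illustrates a plausible cause of the error; the log does not show which `ln -s` command was actually run.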
03:01 🔗 mistym has joined #archiveteam-bs
03:48 🔗 rejk has quit IRC (Ping timeout: 483 seconds)
03:53 🔗 ivan` if you see furrie tell him the space issue is fixed and he just needs to install grab-site again
04:10 🔗 aaaaaaaaa has quit IRC (Leaving)
04:42 🔗 mistym_ has joined #archiveteam-bs
04:43 🔗 mistym has quit IRC (Ping timeout: 252 seconds)
04:55 🔗 mistym has joined #archiveteam-bs
05:01 🔗 mistym_ has quit IRC (Read error: Operation timed out)
05:41 🔗 JesseW has joined #archiveteam-bs
07:12 🔗 JesseW has quit IRC (Quit: Leaving.)
07:28 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
07:35 🔗 mistym has quit IRC (Remote host closed the connection)
07:38 🔗 BlueMaxim has joined #archiveteam-bs
08:05 🔗 superkuh has quit IRC (Read error: Operation timed out)
08:23 🔗 superkuh has joined #archiveteam-bs
08:36 🔗 mistym has joined #archiveteam-bs
08:41 🔗 schbirid has joined #archiveteam-bs
08:43 🔗 mistym has quit IRC (Read error: Operation timed out)
09:01 🔗 SimpBrain has joined #archiveteam-bs
09:19 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:01 🔗 BlueMaxim has joined #archiveteam-bs
12:04 🔗 brayden has joined #archiveteam-bs
12:06 🔗 schbirid has quit IRC (Read error: Operation timed out)
12:08 🔗 brayden has quit IRC (Read error: Connection reset by peer)
12:18 🔗 schbirid has joined #archiveteam-bs
12:40 🔗 ivan` has left
12:46 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:55 🔗 RichardG has quit IRC (Remote host closed the connection)
12:58 🔗 RichardG has joined #archiveteam-bs
13:25 🔗 brayden has joined #archiveteam-bs
14:16 🔗 tomwsmf-a has joined #archiveteam-bs
14:17 🔗 kyan has quit IRC (Quit: This computer has gone to sleep)
14:19 🔗 useretail has quit IRC (Remote host closed the connection)
14:41 🔗 mistym has joined #archiveteam-bs
14:51 🔗 mistym has quit IRC (Ping timeout: 606 seconds)
15:08 🔗 furrie has joined #archiveteam-bs
15:10 🔗 furrie regex question: say i want to exclude a site in grab-site's DIR/ignores file. how can i make it so that i exclude, for instance www.example.com and example.com both on the same line?
15:11 🔗 furrie would i just have to add http://(www)?\.example\.com/? or what
15:12 🔗 xmc you want to put the . in the parens with the www
15:12 🔗 xmc but yeah
15:13 🔗 furrie also the slash before the period right
15:14 🔗 xmc yes
15:14 🔗 furrie so in total -- http://(www\.)?example\.com/? correct
15:15 🔗 xmc yep
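A quick check of why the dot belongs inside the parentheses, written with Python's `re` module since the ignores are Python 3 regular expressions. The two URLs are illustrative.

```python
import re

# Final version from the discussion: the whole "www." prefix is optional.
good = re.compile(r"http://(www\.)?example\.com/?")

# First attempt: only "www" is optional, so the dot is always required.
first_try = re.compile(r"http://(www)?\.example\.com/?")

urls = ["http://www.example.com/", "http://example.com/"]
good_results = [bool(good.match(u)) for u in urls]
first_try_results = [bool(first_try.match(u)) for u in urls]

print(good_results)
print(first_try_results)
```

The first attempt still matches the `www.` form but misses the bare domain, because it would need a URL like `http://.example.com/`.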
15:17 🔗 mistym has joined #archiveteam-bs
15:20 🔗 furrie okay thanks
15:21 🔗 furrie and to only exclude, for instance, &sid= only at the end of a url i do &sid=.*$ probably not right, but what's the correct answer
15:22 🔗 furrie does archivebot and grabsite allow the regex .* to be used in any case
15:24 🔗 furrie maybe to exclude &sid= only at the end of a url for instance i might do ^.*&sid= correct
15:25 🔗 furrie yeah sorry if this is unclear
15:27 🔗 furrie disregard maybe it's ^http://www\.example\.com/.*/?&sid=
15:30 🔗 furrie nm actually
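The `&sid=` question went unanswered in the channel, so the following is only a sketch of one possible reading: "at the end of a URL" is taken to mean that `sid` is the last query parameter. The URLs and the pattern `&sid=[^&]*$` are assumptions for illustration, not a recommendation from anyone in the log.

```python
import re

# Assumption: match only when sid is the final parameter of the URL.
sid_at_end = re.compile(r"&sid=[^&]*$")

# The pattern from the question: .* can consume everything up to the end,
# so this matches wherever &sid= appears.
sid_anywhere = re.compile(r"&sid=.*$")

urls = [
    "http://www.example.com/viewtopic.php?t=1&sid=abc123",      # sid is last
    "http://www.example.com/viewtopic.php?t=1&sid=abc123&p=2",  # sid in the middle
    "http://www.example.com/viewtopic.php?t=1",                 # no sid at all
]

at_end_results = [bool(sid_at_end.search(u)) for u in urls]
anywhere_results = [bool(sid_anywhere.search(u)) for u in urls]

print(at_end_results)
print(anywhere_results)
```

As for whether `.*` is allowed: it is ordinary Python regular expression syntax, so it is valid in any pattern that Python's `re` module accepts.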
15:31 🔗 furrie has quit IRC (Quit: Page closed)
15:44 🔗 JesseW has joined #archiveteam-bs
15:45 🔗 brayden_ has joined #archiveteam-bs
15:50 🔗 brayden has quit IRC (Read error: Operation timed out)
16:27 🔗 Apathy has quit IRC (Quit: OOOOoooooooooo................)
16:33 🔗 mistym_ has joined #archiveteam-bs
16:35 🔗 mistym has quit IRC (Ping timeout: 492 seconds)
16:50 🔗 godane has quit IRC (Read error: Operation timed out)
17:08 🔗 godane has joined #archiveteam-bs
17:10 🔗 godane does anyone here use rtl8821ae for wifi?
17:22 🔗 kniffy has quit IRC (Ping timeout: 483 seconds)
17:24 🔗 mistym_ has quit IRC (Remote host closed the connection)
17:29 🔗 Apathy has joined #archiveteam-bs
17:39 🔗 SadDM has quit IRC (Ping timeout: 483 seconds)
17:43 🔗 SadDM has joined #archiveteam-bs
18:08 🔗 JesseW has quit IRC (Quit: Leaving.)
18:12 🔗 kniffy has joined #archiveteam-bs
18:13 🔗 JesseW has joined #archiveteam-bs
18:14 🔗 dashcloud has quit IRC (Ping timeout: 265 seconds)
18:16 🔗 JesseW has quit IRC (Client Quit)
18:18 🔗 ivan` has joined #archiveteam-bs
18:21 🔗 JesseW has joined #archiveteam-bs
18:27 🔗 dashcloud has joined #archiveteam-bs
18:40 🔗 useretail has joined #archiveteam-bs
18:45 🔗 JesseW has quit IRC (Quit: Leaving.)
19:39 🔗 aaaaaaaaa has joined #archiveteam-bs
19:52 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
19:56 🔗 schbirid has quit IRC (Leaving)
20:02 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:05 🔗 dashcloud has joined #archiveteam-bs
20:44 🔗 tomwsmf-a has joined #archiveteam-bs
20:54 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
21:18 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
21:18 🔗 godane has quit IRC (Quit: Leaving.)
21:22 🔗 dashcloud has joined #archiveteam-bs
21:24 🔗 godane has joined #archiveteam-bs
21:28 🔗 godane i need help guys
21:28 🔗 godane my rtl8821ae wifi card sucks
21:29 🔗 Asparagir has joined #archiveteam-bs
21:30 🔗 godane i can't really connect to internet with my alfa awus036h usb
21:31 🔗 godane my main problem is that i will be limited to what i can grab with mplayer -dumpstream
21:33 🔗 marvinw has quit IRC (Remote host closed the connection)
21:40 🔗 marvinw has joined #archiveteam-bs
21:41 🔗 joepie91 godane: what's wrong with it?
21:43 🔗 godane it's slower than the other wifi card i was using
21:48 🔗 Kazzy has quit IRC (Ping timeout: 252 seconds)
21:48 🔗 joepie91 ah, hm, weird
21:49 🔗 aaaaaaaaa if you stay plugged in, try turning off power saving modes or try a better (or DIY) antenna.
21:52 🔗 godane the wifi card is at
21:52 🔗 godane 72.2 Mb/s
21:52 🔗 godane i normally never had a card go at 72.2Mb/s
21:57 🔗 aaaaaaaaa Isn't that the maximum, at least if you don't channel bond?
22:00 🔗 godane has quit IRC (Quit: Leaving.)
22:02 🔗 godane has joined #archiveteam-bs
22:02 🔗 godane so wifi hates having channel changed
22:09 🔗 BlueMaxim has joined #archiveteam-bs
22:10 🔗 Asparagir has quit IRC (Asparagir)
22:16 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
22:34 🔗 ripvanwin has quit IRC (Leaving)
22:34 🔗 SadDM has quit IRC (Ping timeout: 483 seconds)
22:42 🔗 ripvanwin has joined #archiveteam-bs
22:48 🔗 SadDM has joined #archiveteam-bs
23:01 🔗 BlueMaxim has joined #archiveteam-bs
23:22 🔗 Balrog_ has joined #archiveteam-bs
23:23 🔗 Balrog_ is now known as Balrog-34
23:46 🔗 Kazzy has joined #archiveteam-bs
23:59 🔗 yipdw Balrog-34: oh, I meant just from what I've seen in recent months
23:59 🔗 yipdw archive jobs going across the warrior and archivebot and other things
23:59 🔗 yipdw it's quite possible that I am just looking at all the wrong logs and/or have a terrible selection bias but yeah
23:59 🔗 yipdw so it's not you