Time |
Nickname |
Message |
00:12
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
00:27
🔗
|
|
furrie_ has joined #archiveteam-bs |
00:28
🔗
|
furrie_ |
im not sure where else to ask but how do i use wpull's --reject-regex parameter |
00:29
🔗
|
furrie_ |
i mean like more than once. this is how to use it once --reject-regex "/login\.php" |
00:30
🔗
|
furrie_ |
in parentheses, comma separated, like in -X option, what |
00:37
🔗
|
aaaaaaaaa |
maybe use nested ors |
00:38
🔗
|
furrie_ |
what's an example |
00:38
🔗
|
furrie_ |
"login\.php" OR "calendar\.php"? |
00:39
🔗
|
furrie_ |
(OR capitalized on purpose) |
00:40
🔗
|
xmc |
--reject-regex 'login\.php' --reject-regex 'calendar\.php' |
00:40
🔗
|
xmc |
or |
00:40
🔗
|
xmc |
--reject-regex "(login|calendar)\.php" |
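A quick way to check the alternation form xmc suggests is to run the combined pattern against a few paths with Python's `re` module. This is only a sketch: the URLs are made up, and treating the pattern as a search anywhere in the URL is an assumption, not something confirmed in the log.

```python
import re

# Combined pattern from the log: one regex that covers both pages.
REJECT = re.compile(r"(login|calendar)\.php")


def is_rejected(url):
    """Return True if the pattern matches anywhere in the URL.

    Search (rather than full-match) semantics are an assumption here.
    """
    return REJECT.search(url) is not None


# Hypothetical URLs, for illustration only.
for url in (
    "http://forum.example.com/login.php",
    "http://forum.example.com/calendar.php?month=3",
    "http://forum.example.com/viewtopic.php?t=1",
):
    print(url, is_rejected(url))
```

The single alternation pattern and the two separate `--reject-regex` flags shown above are meant to reject the same URLs; the combined form is just shorter.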
00:41
🔗
|
furrie_ |
thanks |
00:41
🔗
|
aaaaaaaaa |
It may be worth using grab-site as well, as it allows ignores to be injected while it is running |
00:41
🔗
|
furrie_ |
is it okay if i ask how to use grab-site and what it is |
00:43
🔗
|
garyrh |
https://github.com/ludios/grab-site |
00:43
🔗
|
garyrh |
It's like an ArchiveBot you can run locally. |
00:43
🔗
|
aaaaaaaaa |
https://github.com/ludios/grab-site it is like an archivebot that runs on your local computer, but without the long setup process. |
00:43
🔗
|
furrie_ |
Awesome |
00:45
🔗
|
furrie_ |
is grab-site an archiveteam software |
00:48
🔗
|
furrie_ |
yeah i really wanted something to enter commands in thanks a bundle guys |
00:48
🔗
|
furrie_ |
imma head off to try this out |
00:48
🔗
|
|
furrie_ has quit IRC (Quit: Page closed) |
00:51
🔗
|
|
furrie_ has joined #archiveteam-bs |
00:52
🔗
|
furrie_ |
alright so i dont quite understand something i installed grab-site but |
00:52
🔗
|
furrie_ |
https://github.com/ludios/grab-site - this url says to do this next: |
00:52
🔗
|
furrie_ |
To avoid having to type out ~/.local/bin/ below, add this to your ~/.bashrc or ~/.zshrc: PATH="$PATH:$HOME/.local/bin" |
00:53
🔗
|
|
Start has joined #archiveteam-bs |
00:53
🔗
|
furrie_ |
there isn't a local/bin on my computer |
00:54
🔗
|
furrie_ |
it's /bin/ |
00:55
🔗
|
furrie_ |
wait wait wait disregard all of that. |
00:55
🔗
|
furrie_ |
i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work |
00:55
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
00:57
🔗
|
furrie_ |
im confused |
01:01
🔗
|
furrie_ |
okay wait i think i may be on to somethin |
01:10
🔗
|
furrie_ |
aaaaaaaaa: grab-site asks me to add PATH="$PATH:$HOME/.local/bin" to my .bashrc. problem is where is my bashrc file |
01:13
🔗
|
aaaaaaaaa |
in your home folder |
01:13
🔗
|
furrie_ |
it doesn't exist is that a problem (im using debian) |
01:14
🔗
|
aaaaaaaaa |
ls -a |
01:15
🔗
|
|
ivan` has joined #archiveteam-bs |
01:15
🔗
|
ivan` |
furrie_: what OS? |
01:15
🔗
|
ivan` |
oh |
01:15
🔗
|
ivan` |
thanks aaaaaaaaa for being my front-line tech support |
01:15
🔗
|
aaaaaaaaa |
someone's got to do it |
01:16
🔗
|
ivan` |
<furrie_> i try to head to local/bin/ by typing 'cd ~/.local/bin/gs-server' but it doesnt work |
01:16
🔗
|
ivan` |
you can't cd into a file, only a directory |
01:16
🔗
|
ivan` |
don't cd there anyway |
01:17
🔗
|
ivan` |
if pip3 succeeded you can literally just type ~/.local/bin/gs-server |
01:17
🔗
|
ivan` |
followed by the Enter key |
01:18
🔗
|
ivan` |
also, if you are not aware, you can tab-complete filenames by typing something like ~/.loc<TAB> and watching it complete ~/.local |
01:18
🔗
|
furrie_ |
.local/bin/gs-server: No such file or directory |
01:18
🔗
|
furrie_ |
i got that for an error |
01:18
🔗
|
ivan` |
can you paste the command you ran |
01:20
🔗
|
furrie_ |
Wait, i installed it on / and not ~ |
01:20
🔗
|
furrie_ |
it works now |
01:20
🔗
|
ivan` |
how did you install it on /? |
01:20
🔗
|
ivan` |
did you omit the --user? |
01:21
🔗
|
furrie_ |
i meant i ran the installation commands while in / in terminal |
01:21
🔗
|
ivan` |
the current directly does not affect where pip3 installs things |
01:21
🔗
|
ivan` |
directory* |
01:21
🔗
|
ivan` |
~ means your home directory i.e. $HOME |
01:22
🔗
|
furrie_ |
yeah well never mind it's working |
01:22
🔗
|
ivan` |
I am worried more about you having an idea of what you're doing ;) |
01:23
🔗
|
ivan` |
did pip3 on debian really install to /.local i.e. / the root directory? that would make no sense |
01:23
🔗
|
ivan` |
ls -la /.local/bin please |
01:25
🔗
|
furrie_ |
comp ~ $ ls -la /.local/bin ls: cannot access /.local/bin: No such file or directory comp ~ $ ls -la /.local/bin ls: cannot access /.local/bin: No such file or directory |
01:25
🔗
|
ivan` |
yeah, I thought so |
01:25
🔗
|
furrie_ |
i'm on ~ in terminal not / |
01:25
🔗
|
furrie_ |
but ~/.local/bin/gs-server strangely works |
01:26
🔗
|
ivan` |
yes, that's exactly what the README says to do |
01:27
🔗
|
ivan` |
that ~/ resolves to your home directory and it does not matter where you are |
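ivan`'s point that `~/` does not depend on the current directory can be checked with a small sketch. Python's `os.path.expanduser` stands in for the shell's tilde expansion here, and the helper name `expand_from` is hypothetical.

```python
import os
import tempfile


def expand_from(cwd, path):
    """Expand a leading ~ in `path` while the working directory is `cwd`."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return os.path.expanduser(path)
    finally:
        os.chdir(previous)


# Expand the same ~ path from two different working directories.
from_root = expand_from(os.sep, "~/.local/bin/gs-server")
from_tmp = expand_from(tempfile.gettempdir(), "~/.local/bin/gs-server")

print(from_root)
print(from_root == from_tmp)
```

Both calls produce the same path, because the expansion is based on the home directory and not on wherever the terminal happens to be.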
01:27
🔗
|
furrie_ |
yeah |
01:27
🔗
|
|
yakfish has joined #archiveteam-bs |
01:27
🔗
|
furrie_ |
i'm actually on the part trying to figure out how to add a site onto the grab-site dashboard |
01:28
🔗
|
dxrt |
furrie_: Have you read the README? The steps are all very well documented |
01:28
🔗
|
aaaaaaaaa |
almost too well.... What kind of docs are user friendly? |
01:29
🔗
|
furrie_ |
https://github.com/ludios/grab-site -- isn't the readme on here |
01:31
🔗
|
ivan` |
yes |
01:33
🔗
|
furrie_ |
http://pastebin.com/1w20eMaW - i didn't use tmux but i did this in a separate tab from the tab where grab-site is running |
01:33
🔗
|
furrie_ |
hopefully im not frustrating you |
01:34
🔗
|
ivan` |
no, that looks like a problem that could be my fault |
01:34
🔗
|
furrie_ |
hm |
01:34
🔗
|
furrie_ |
oh |
01:35
🔗
|
furrie_ |
so do you know what to do to fix the issue ivan |
01:35
🔗
|
ivan` |
try pip3 install --user --upgrade trollius |
01:35
🔗
|
furrie_ |
hey, now it works |
01:36
🔗
|
ivan` |
cool |
01:36
🔗
|
ivan` |
you should tmux if you plan to ever disconnect from your terminal session |
01:37
🔗
|
ivan` |
just install tmux via apt and run tmux before doing stuff |
01:37
🔗
|
ivan` |
ctrl-b d to detach from your tmux, tmux attach to reattach |
01:38
🔗
|
furrie_ |
how do i add ignoresets while grab-site is running |
01:40
🔗
|
ivan` |
edit DIR/igsets |
01:41
🔗
|
furrie_ |
or more specifically add commands while the site's running |
01:41
🔗
|
ivan` |
you can't really give grab-site commands, you edit files in whatever directory it created, and it picks up the changes |
01:41
🔗
|
furrie_ |
Oh |
01:43
🔗
|
furrie_ |
they' |
01:43
🔗
|
furrie_ |
what about for regex ignores |
01:43
🔗
|
ivan` |
DIR/ignores |
01:44
🔗
|
|
tomwsmf-a has quit IRC (Ping timeout: 258 seconds) |
01:44
🔗
|
furrie_ |
separated line by line or comma |
01:45
🔗
|
ivan` |
line |
01:45
🔗
|
ivan` |
"DIR/ignores is a newline-separated list of Python 3 regular expressions to use in addition to the ignore sets." |
01:50
🔗
|
|
kyan has joined #archiveteam-bs |
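The README line ivan` quotes above describes DIR/ignores as a newline-separated list of Python 3 regular expressions. A minimal sketch of how such a list could be read and applied follows. It is not grab-site's actual code: the helper names, the sample patterns, and the use of `search` are all assumptions for illustration.

```python
import re


def load_ignores(text):
    """Compile one pattern per non-empty line of a newline-separated list."""
    return [re.compile(line) for line in text.splitlines() if line.strip()]


def is_ignored(url, patterns):
    """Return True if any pattern matches somewhere in the URL (assumed semantics)."""
    return any(p.search(url) for p in patterns)


# Hypothetical contents of DIR/ignores: two patterns, one per line.
ignores_text = "login\\.php\ncalendar\\.php\n"
patterns = load_ignores(ignores_text)

print(len(patterns))
print(is_ignored("http://forum.example.com/login.php", patterns))
print(is_ignored("http://forum.example.com/index.php", patterns))
```

Each line is its own pattern, so there is no need for commas or for joining everything into one large alternation.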
01:51
🔗
|
|
kyanz`bot has joined #archiveteam-bs |
01:53
🔗
|
|
furrie_ has quit IRC (Ping timeout: 240 seconds) |
02:02
🔗
|
|
furrie has joined #archiveteam-bs |
02:02
🔗
|
furrie |
last important question--how do you change the directory in which grab-site crawls are being saved to (default: home dir) |
02:06
🔗
|
kyan |
it's the current working directory |
02:07
🔗
|
kyan |
(it's in the readme: https://github.com/ludios/grab-site/) |
02:07
🔗
|
kyan |
furrie: ^ |
02:07
🔗
|
ivan` |
yeah, just cd to another directory first |
02:07
🔗
|
furrie |
awesome, thanks. |
02:08
🔗
|
furrie |
yeah because my computer has limited memory, and i have an external hard drive that could do with some use. |
02:10
🔗
|
|
furrie has quit IRC (Quit: Page closed) |
02:17
🔗
|
|
kyanz`bot has quit IRC (Quit: KeyboardInterrupt) |
02:19
🔗
|
|
yakfish has quit IRC (Quit: Read error: Operation timed out) |
02:20
🔗
|
|
yakfish has joined #archiveteam-bs |
02:20
🔗
|
|
furrie has joined #archiveteam-bs |
02:21
🔗
|
|
kyanz`bot has joined #archiveteam-bs |
02:21
🔗
|
|
kyanz`bot has quit IRC (Remote host closed the connection) |
02:22
🔗
|
furrie |
okay sorry for coming back yet again. i have a problem--i cannot use grab-site in my external hard drive |
02:22
🔗
|
ivan` |
furrie: well, what went wrong |
02:22
🔗
|
furrie |
http://pastebin.com/KeG4Q3ir |
02:22
🔗
|
ivan` |
ah, yes, sorry, it looks like it is broken with spaces in the directory name |
02:23
🔗
|
furrie |
so it can usually work in HDD otherwise? |
02:23
🔗
|
ivan` |
yes |
02:24
🔗
|
furrie |
i assume there's no workaround other than changing the HDD's name |
02:24
🔗
|
ivan` |
you can create a symbolic link to the directory with ln -s and cd into that directory instead |
02:24
🔗
|
ivan` |
I will bbl |
02:25
🔗
|
furrie |
okay |
02:26
🔗
|
|
kyanz`bot has joined #archiveteam-bs |
02:26
🔗
|
|
zenguy_pc has quit IRC (Read error: Connection reset by peer) |
02:27
🔗
|
|
zenguy_pc has joined #archiveteam-bs |
02:27
🔗
|
furrie |
bash: cd: bone: Too many levels of symbolic links |
02:29
🔗
|
furrie |
thats the error i get from trying to cd into the directory within my HDD with a space in it |
02:29
🔗
|
|
zenguy_pc has quit IRC (Read error: Connection reset by peer) |
02:30
🔗
|
|
zenguy_pc has joined #archiveteam-bs |
02:32
🔗
|
|
xtr-201 has quit IRC (Read error: Operation timed out) |
02:35
🔗
|
|
kyanz`bot has quit IRC (Remote host closed the connection) |
02:35
🔗
|
furrie |
oh wait, i'm figuring it out |
02:35
🔗
|
|
furrie has quit IRC (Quit: Page closed) |
02:35
🔗
|
aaaaaaaaa |
Try absolute paths, that way you don't have to worry about the strange way ln treats relative links |
02:35
🔗
|
aaaaaaaaa |
or not |
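The symlink workaround with an absolute target, as aaaaaaaaa suggests, might look like the sketch below. It uses a temporary directory in place of the external drive; the directory name "My Passport", the link name "hdd", and the helper `link_without_spaces` are hypothetical. A relative target given to `ln -s` is resolved relative to the link's own location rather than the current directory, which is one common way to end up with a broken or looping link.

```python
import os
import tempfile


def link_without_spaces(target_dir, link_path):
    """Create a symlink at `link_path` pointing at the absolute path of `target_dir`."""
    target = os.path.abspath(target_dir)  # absolute, so it does not depend on the cwd
    os.symlink(target, link_path)
    return link_path


with tempfile.TemporaryDirectory() as base:
    # Stand-in for a mount point whose name contains a space.
    spaced = os.path.join(base, "My Passport")
    os.mkdir(spaced)

    link = link_without_spaces(spaced, os.path.join(base, "hdd"))

    print(os.path.islink(link))
    print(os.path.realpath(link) == os.path.realpath(spaced))
    print(" " in os.path.basename(link))
```

Changing into the link (`hdd` here) reaches the same directory, but the path typed on the command line no longer contains the space in its last component.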
03:01
🔗
|
|
mistym has joined #archiveteam-bs |
03:48
🔗
|
|
rejk has quit IRC (Ping timeout: 483 seconds) |
03:53
🔗
|
ivan` |
if you see furrie tell him the space issue is fixed and he just needs to install grab-site again |
04:10
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:42
🔗
|
|
mistym_ has joined #archiveteam-bs |
04:43
🔗
|
|
mistym has quit IRC (Ping timeout: 252 seconds) |
04:55
🔗
|
|
mistym has joined #archiveteam-bs |
05:01
🔗
|
|
mistym_ has quit IRC (Read error: Operation timed out) |
05:41
🔗
|
|
JesseW has joined #archiveteam-bs |
07:12
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
07:28
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
07:35
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
07:38
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
08:05
🔗
|
|
superkuh has quit IRC (Read error: Operation timed out) |
08:23
🔗
|
|
superkuh has joined #archiveteam-bs |
08:36
🔗
|
|
mistym has joined #archiveteam-bs |
08:41
🔗
|
|
schbirid has joined #archiveteam-bs |
08:43
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
09:01
🔗
|
|
SimpBrain has joined #archiveteam-bs |
09:19
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
12:01
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
12:04
🔗
|
|
brayden has joined #archiveteam-bs |
12:06
🔗
|
|
schbirid has quit IRC (Read error: Operation timed out) |
12:08
🔗
|
|
brayden has quit IRC (Read error: Connection reset by peer) |
12:18
🔗
|
|
schbirid has joined #archiveteam-bs |
12:40
🔗
|
|
ivan` has left |
12:46
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
12:55
🔗
|
|
RichardG has quit IRC (Remote host closed the connection) |
12:58
🔗
|
|
RichardG has joined #archiveteam-bs |
13:25
🔗
|
|
brayden has joined #archiveteam-bs |
14:16
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
14:17
🔗
|
|
kyan has quit IRC (Quit: This computer has gone to sleep) |
14:19
🔗
|
|
useretail has quit IRC (Remote host closed the connection) |
14:41
🔗
|
|
mistym has joined #archiveteam-bs |
14:51
🔗
|
|
mistym has quit IRC (Ping timeout: 606 seconds) |
15:08
🔗
|
|
furrie has joined #archiveteam-bs |
15:10
🔗
|
furrie |
regex question: say i want to exclude a site in grab-site's DIR/ignores file. how can i make it so that i exclude, for instance www.example.com and example.com both on the same line? |
15:11
🔗
|
furrie |
would i just have to add http://(www)?\.example\.com/? or what |
15:12
🔗
|
xmc |
you want to put the . in the parens with the www |
15:12
🔗
|
xmc |
but yeah |
15:13
🔗
|
furrie |
also the slash before the period right |
15:14
🔗
|
xmc |
yes |
15:14
🔗
|
furrie |
so in total -- http://(www\.)?example\.com/? correct |
15:15
🔗
|
xmc |
yep |
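The difference between furrie's first attempt and the version xmc confirms can be seen by testing both against the two hostnames. This is a small sketch using Python's `re` module; the constant names are hypothetical.

```python
import re

# First attempt from the log: the dot sits outside the optional group,
# so a literal "." is always required before "example".
FIRST_TRY = re.compile(r"http://(www)?\.example\.com/?")

# Corrected form: the dot is inside the group and disappears with "www".
CORRECTED = re.compile(r"http://(www\.)?example\.com/?")

for url in ("http://www.example.com/", "http://example.com/"):
    print(url, bool(FIRST_TRY.search(url)), bool(CORRECTED.search(url)))
```

The first pattern matches only the `www.` form, because without `www` it still demands a leading dot; the corrected pattern matches both.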
15:17
🔗
|
|
mistym has joined #archiveteam-bs |
15:20
🔗
|
furrie |
okay thanks |
15:21
🔗
|
furrie |
and to only exclude, for instance, &sid= only at the end of a url i do &sid=.*$ probably not right, but what's the correct answer |
15:22
🔗
|
furrie |
does archivebot and grabsite allow the regex .* to be used in any case |
15:24
🔗
|
furrie |
maybe to exclude &sid= only at the end of a url for instance i might do ^.*&sid= correct |
15:25
🔗
|
furrie |
yeah sorry if this is unclear |
15:27
🔗
|
furrie |
disregard maybe it's ^http://www\.example\.com/.*/?&sid= |
15:30
🔗
|
furrie |
nm actually |
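The `&sid=` question goes unanswered in the log. The sketch below is one possible reading, not advice from the channel: it assumes "at the end of a url" means `sid` is the last query parameter, and that patterns are applied as a search. The URLs and constant names are hypothetical.

```python
import re

# furrie's first idea: ".*" runs to the end of the string, so this matches
# any URL that contains "&sid=" at all, wherever it appears.
LOOSE = re.compile(r"&sid=.*$")

# Assumed alternative: allow no further "&" after the sid value, so the
# match succeeds only when sid is the final parameter.
STRICT = re.compile(r"&sid=[^&]*$")

urls = (
    "http://www.example.com/viewtopic.php?t=1&sid=abc123",
    "http://www.example.com/viewtopic.php?t=1&sid=abc123&start=20",
)
for url in urls:
    print(url, bool(LOOSE.search(url)), bool(STRICT.search(url)))
```

Under these assumptions the loose pattern matches both URLs, while the strict one matches only the first.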
15:31
🔗
|
|
furrie has quit IRC (Quit: Page closed) |
15:44
🔗
|
|
JesseW has joined #archiveteam-bs |
15:45
🔗
|
|
brayden_ has joined #archiveteam-bs |
15:50
🔗
|
|
brayden has quit IRC (Read error: Operation timed out) |
16:27
🔗
|
|
Apathy has quit IRC (Quit: OOOOoooooooooo................) |
16:33
🔗
|
|
mistym_ has joined #archiveteam-bs |
16:35
🔗
|
|
mistym has quit IRC (Ping timeout: 492 seconds) |
16:50
🔗
|
|
godane has quit IRC (Read error: Operation timed out) |
17:08
🔗
|
|
godane has joined #archiveteam-bs |
17:10
🔗
|
godane |
does anyone here use rtl8821ae for wifi? |
17:22
🔗
|
|
kniffy has quit IRC (Ping timeout: 483 seconds) |
17:24
🔗
|
|
mistym_ has quit IRC (Remote host closed the connection) |
17:29
🔗
|
|
Apathy has joined #archiveteam-bs |
17:39
🔗
|
|
SadDM has quit IRC (Ping timeout: 483 seconds) |
17:43
🔗
|
|
SadDM has joined #archiveteam-bs |
18:08
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
18:12
🔗
|
|
kniffy has joined #archiveteam-bs |
18:13
🔗
|
|
JesseW has joined #archiveteam-bs |
18:14
🔗
|
|
dashcloud has quit IRC (Ping timeout: 265 seconds) |
18:16
🔗
|
|
JesseW has quit IRC (Client Quit) |
18:18
🔗
|
|
ivan` has joined #archiveteam-bs |
18:21
🔗
|
|
JesseW has joined #archiveteam-bs |
18:27
🔗
|
|
dashcloud has joined #archiveteam-bs |
18:40
🔗
|
|
useretail has joined #archiveteam-bs |
18:45
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
19:39
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
19:52
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
19:56
🔗
|
|
schbirid has quit IRC (Leaving) |
20:02
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
20:05
🔗
|
|
dashcloud has joined #archiveteam-bs |
20:44
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
20:54
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
21:18
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
21:18
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
21:22
🔗
|
|
dashcloud has joined #archiveteam-bs |
21:24
🔗
|
|
godane has joined #archiveteam-bs |
21:28
🔗
|
godane |
i need help guys |
21:28
🔗
|
godane |
my rtl8821ae wifi card sucks |
21:29
🔗
|
|
Asparagir has joined #archiveteam-bs |
21:30
🔗
|
godane |
i can't really connect to internet with my alfa awus036h usb |
21:31
🔗
|
godane |
my main problem is that i will be limited to what i can grab with mplayer -dumpstream |
21:33
🔗
|
|
marvinw has quit IRC (Remote host closed the connection) |
21:40
🔗
|
|
marvinw has joined #archiveteam-bs |
21:41
🔗
|
joepie91 |
godane: what's wrong with it? |
21:43
🔗
|
godane |
it's slower than the other wifi card i was using |
21:48
🔗
|
|
Kazzy has quit IRC (Ping timeout: 252 seconds) |
21:48
🔗
|
joepie91 |
ah, hm, weird |
21:49
🔗
|
aaaaaaaaa |
if you stay plugged in, try turning off power saving modes or try a better (or DIY) antenna. |
21:52
🔗
|
godane |
the wifi card is at |
21:52
🔗
|
godane |
72.2 Mb/s |
21:52
🔗
|
godane |
i normally never had a card go at 72.2Mb/s |
21:57
🔗
|
aaaaaaaaa |
Isn't that the maximum, at least if you don't channel bond? |
22:00
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
22:02
🔗
|
|
godane has joined #archiveteam-bs |
22:02
🔗
|
godane |
so wifi hates having channel changed |
22:09
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
22:10
🔗
|
|
Asparagir has quit IRC (Asparagir) |
22:16
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
22:34
🔗
|
|
ripvanwin has quit IRC (Leaving) |
22:34
🔗
|
|
SadDM has quit IRC (Ping timeout: 483 seconds) |
22:42
🔗
|
|
ripvanwin has joined #archiveteam-bs |
22:48
🔗
|
|
SadDM has joined #archiveteam-bs |
23:01
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:22
🔗
|
|
Balrog_ has joined #archiveteam-bs |
23:23
🔗
|
|
Balrog_ is now known as Balrog-34 |
23:46
🔗
|
|
Kazzy has joined #archiveteam-bs |
23:59
🔗
|
yipdw |
Balrog-34: oh, I meant just from what I've seen in recent months |
23:59
🔗
|
yipdw |
archive jobs going across the warrior and archivebot and other things |
23:59
🔗
|
yipdw |
it's quite possible that I am just looking at all the wrong logs and/or have a terrible selection bias but yeah |
23:59
🔗
|
yipdw |
so it's not you |