Time |
Nickname |
Message |
00:00
🔗
|
|
xf2e has joined #archiveteam-bs |
00:08
🔗
|
|
oldcad has quit IRC (Quit: Leaving.) |
00:17
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
00:17
🔗
|
|
Aranje has joined #archiveteam-bs |
00:27
🔗
|
godane |
2008 dailymail.co.uk sitemap urls are fully uploaded now |
00:28
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
00:35
🔗
|
|
Aranje has quit IRC (Ping timeout: 240 seconds) |
00:41
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
00:42
🔗
|
|
mistym has joined #archiveteam-bs |
00:44
🔗
|
|
kyan has quit IRC (Quit: This computer has gone to sleep) |
00:47
🔗
|
|
Aranje has joined #archiveteam-bs |
01:35
🔗
|
szalwia |
trying to archive a site that uses ajax for pagination. is there any way to make archivebot click on things in phantomjs mode? |
01:36
🔗
|
|
kyan has joined #archiveteam-bs |
01:36
🔗
|
godane |
i'm grabbing MotherboardTV youtube channel |
01:37
🔗
|
godane |
its not that big it looks like |
01:42
🔗
|
kyan |
I ta |
01:42
🔗
|
kyan |
have to stop using google code. Eventually. I'm still using it all the time |
01:58
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
02:10
🔗
|
|
mistym has joined #archiveteam-bs |
02:11
🔗
|
|
bzc6p_ has joined #archiveteam-bs |
02:11
🔗
|
|
swebb sets mode: +o bzc6p_ |
02:14
🔗
|
|
bzc6p has quit IRC (Read error: Operation timed out) |
02:35
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
02:36
🔗
|
|
Aranje has joined #archiveteam-bs |
02:38
🔗
|
xf2e |
So I'm interested in (and have been) backing up Quora |
02:49
🔗
|
|
primus104 has quit IRC (Leaving.) |
02:58
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
03:08
🔗
|
|
Asparagir has joined #archiveteam-bs |
03:09
🔗
|
|
robink has joined #archiveteam-bs |
03:23
🔗
|
|
mistym has joined #archiveteam-bs |
03:27
🔗
|
|
bzc6p_ is now known as bzc6p |
03:27
🔗
|
bzc6p |
szalwia: AFAIK it doesn't. In such cases, I try to find out the number of pages from the source code and generate the URLs for the pages. |
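The workaround bzc6p describes (read the page count from the source, then generate one URL per page) can be sketched roughly as follows. This is a minimal illustration only: the `page` query parameter and the example.com URL are hypothetical, and it assumes the site exposes plain numbered pages.

```python
from __future__ import annotations


def page_urls(base_url: str, last_page: int, param: str = "page") -> list[str]:
    """Build one URL per page, numbered 1..last_page.

    `param` is a hypothetical query parameter name; check the real site's
    pagination links to find out what it actually uses.
    """
    if last_page < 1:
        return []
    # Append with "&" if the base URL already has a query string.
    sep = "&" if "?" in base_url else "?"
    return [f"{base_url}{sep}{param}={n}" for n in range(1, last_page + 1)]


if __name__ == "__main__":
    # The page count (3 here) would be read by hand from the page source.
    for url in page_urls("http://example.com/gallery", 3):
        print(url)
```

The printed list can be saved to a file and handed to whatever crawler is in use.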
03:29
🔗
|
|
Asparagir has quit IRC (Asparagir) |
03:30
🔗
|
szalwia |
bzc6p: no way to do this in my case, the ajax call returns javascript that then gets eval'd ;x |
03:31
🔗
|
szalwia |
i wrote my own scraper that works, but it only gets me image urls https://gist.github.com/szalwia/d8658efd2f66a7584050 |
03:31
🔗
|
bzc6p |
If you analyze the JS code, you might find out what URLs are GETted with ajax |
03:33
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
03:34
🔗
|
|
Aranje has joined #archiveteam-bs |
03:34
🔗
|
szalwia |
bzc6p: already did and as i said, they return additional javascript that gets eval()'d |
03:35
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
03:35
🔗
|
|
Aranje has joined #archiveteam-bs |
03:36
🔗
|
bzc6p |
szalwia: what is the site, if I may ask? (Not because I don't believe you, just to see that bastard myself.) |
03:37
🔗
|
szalwia |
bzc6p: ask.fm |
03:38
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
03:39
🔗
|
|
godane has quit IRC (Ping timeout: 306 seconds) |
03:40
🔗
|
szalwia |
bzc6p: that's how the "View more" button on user profiles works |
03:42
🔗
|
bzc6p |
I must be logged in to use that button? |
03:44
🔗
|
szalwia |
bzc6p: no |
03:44
🔗
|
szalwia |
http://ask.fm/ameeraaxxxxxxx |
03:44
🔗
|
szalwia |
it's at the bottom of the page |
03:44
🔗
|
bzc6p |
I see. It didn't work for me, now I realize it needs cookies to work. |
03:45
🔗
|
bzc6p |
A think I'll never understand. |
03:45
🔗
|
bzc6p |
*thing |
03:46
🔗
|
bzc6p |
It makes a POST request. Argh. |
03:46
🔗
|
|
Aranje has joined #archiveteam-bs |
03:48
🔗
|
|
godane has joined #archiveteam-bs |
03:49
🔗
|
godane |
modem keeps going out |
03:58
🔗
|
yipdw |
https://en.wikipedia.org/wiki/Resiniferatoxin hooooly shit |
04:05
🔗
|
DFJustin |
countdown to youtube challenge |
04:05
🔗
|
yipdw |
hah |
04:07
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
04:08
🔗
|
bzc6p |
szalwia: it's indeed awful. I still believe that someone with advanced knowledge of JS can find out what POST request is actually triggered. And it's probably also WARCable, at least I managed to capture some POSTy shit with webrecorder.io, so I guess it's possible. – If you don't mind, I won't go deeper into that awful JS, otherwise I'll go insane. |
04:09
🔗
|
bzc6p |
szalwia: Also, ask.fm is a big site, thorough archiving would be a Warrior project one day when it's endangered. For archiving single profiles, you should try webrecorder.io, I did manage to archive e.g. Facebook pages with that |
04:10
🔗
|
bzc6p |
so it may work with this one too. |
04:13
🔗
|
szalwia |
https://webrecorder.io/ looks down from here |
04:22
🔗
|
|
Asparagir has joined #archiveteam-bs |
04:30
🔗
|
bzc6p |
szalwia: here too |
04:32
🔗
|
|
bzc6p sets mode: +o chfoo |
04:33
🔗
|
aaaaaaaaa |
works here |
04:33
🔗
|
|
bzc6p sets mode: +oooo godane garyrh Infreq Kazzy |
04:33
🔗
|
|
bzc6p sets mode: +oooo Kenshin midas Start wp494 |
04:33
🔗
|
bzc6p |
aaaaaaaaa: the front page does, but the archiving page gives 504 |
04:33
🔗
|
aaaaaaaaa |
ah ok, sorry for the confusion then. |
04:34
🔗
|
bzc6p |
aaaaaaaaa: no, me sorry, it works for other sites, |
04:34
🔗
|
bzc6p |
it just doesn't like ask.fm apparently |
04:37
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:37
🔗
|
bzc6p |
Now the front page also 504s. |
04:39
🔗
|
|
mistym has joined #archiveteam-bs |
04:57
🔗
|
|
bzc6p has left |
05:09
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 240 seconds) |
05:39
🔗
|
|
Start_ has joined #archiveteam-bs |
05:39
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
05:55
🔗
|
|
bsmith093 has quit IRC (Read error: Operation timed out) |
06:09
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
06:11
🔗
|
|
Start has joined #archiveteam-bs |
06:12
🔗
|
|
Start_ has quit IRC (Read error: Connection reset by peer) |
06:15
🔗
|
|
Aranje has joined #archiveteam-bs |
06:16
🔗
|
|
Aranje has quit IRC (Read error: Connection reset by peer) |
06:24
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
07:10
🔗
|
|
Asparagir has quit IRC (Asparagir) |
07:25
🔗
|
|
mistym has joined #archiveteam-bs |
07:28
🔗
|
|
yipdw has quit IRC (Quit: No Ping reply in 180 seconds.) |
07:30
🔗
|
|
yipdw has joined #archiveteam-bs |
07:34
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
07:34
🔗
|
|
primus104 has joined #archiveteam-bs |
07:40
🔗
|
|
schbirid has joined #archiveteam-bs |
07:58
🔗
|
|
schbirid is now known as schbiridw |
07:58
🔗
|
|
schbiridw is now known as schbiwork |
08:06
🔗
|
|
toad2 has joined #archiveteam-bs |
08:08
🔗
|
|
toad1 has quit IRC (Read error: Operation timed out) |
08:29
🔗
|
|
primus104 has quit IRC (Leaving.) |
09:41
🔗
|
|
dx has quit IRC (Read error: Operation timed out) |
09:42
🔗
|
|
primus104 has joined #archiveteam-bs |
09:54
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
10:09
🔗
|
|
dx has joined #archiveteam-bs |
10:12
🔗
|
|
godane has quit IRC (Ping timeout: 265 seconds) |
10:25
🔗
|
|
godane has joined #archiveteam-bs |
10:57
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
11:06
🔗
|
|
godane has joined #archiveteam-bs |
11:38
🔗
|
|
xf2e has quit IRC (Ping timeout: 483 seconds) |
11:57
🔗
|
|
ohhdemgir has quit IRC (Read error: Connection reset by peer) |
12:00
🔗
|
|
balrog has quit IRC (Read error: Operation timed out) |
12:01
🔗
|
|
ripvanwin has quit IRC (Read error: Operation timed out) |
12:02
🔗
|
|
ohhdemgir has joined #archiveteam-bs |
12:05
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
12:08
🔗
|
|
balrog has joined #archiveteam-bs |
12:08
🔗
|
|
swebb sets mode: +o balrog |
12:11
🔗
|
|
dashcloud has joined #archiveteam-bs |
12:46
🔗
|
|
Stilett0 has joined #archiveteam-bs |
12:50
🔗
|
|
Stiletto has quit IRC (Ping timeout: 370 seconds) |
13:11
🔗
|
|
primus104 has quit IRC (Leaving.) |
13:26
🔗
|
godane |
SketchCow: a brief interview with the cuba SNET guy: http://media2.wptv.com/video/video_studio/2015/01/26/Jovenes_cubanos_construyeron_en_secreto__250724.mp4 |
13:27
🔗
|
godane |
we have video and words directly from him now |
13:27
🔗
|
godane |
there are no subs sadly |
13:43
🔗
|
|
Panasonic has joined #archiveteam-bs |
13:43
🔗
|
|
Panasonic is now known as Ravenloft |
13:44
🔗
|
Ravenloft |
https://retrogamingnr.wordpress.com/2015/07/06/17/ |
13:44
🔗
|
Ravenloft |
this is a nice write up |
13:47
🔗
|
Ravenloft |
except when it isnt |
13:51
🔗
|
Ravenloft |
the AVS by bunnyboy will be top class. he dismissed it, in a not elegant way, by showing a pic of a proto incarnation of the HDMI solution that involved a top loader and was cancelled. that happened before the change to a brand new FPGA system, which is fully developed and in production |
14:30
🔗
|
|
mistym has joined #archiveteam-bs |
14:40
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
15:07
🔗
|
|
mistym has joined #archiveteam-bs |
16:08
🔗
|
|
primus104 has joined #archiveteam-bs |
16:43
🔗
|
|
goekesmi_ has quit IRC (Remote host closed the connection) |
16:45
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 606 seconds) |
16:50
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
16:55
🔗
|
|
goekesmi has joined #archiveteam-bs |
17:05
🔗
|
|
mistym has joined #archiveteam-bs |
17:09
🔗
|
|
primus104 has quit IRC (Leaving.) |
17:39
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
17:44
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
17:44
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
17:45
🔗
|
|
dashcloud has joined #archiveteam-bs |
17:53
🔗
|
schbiwork |
nice, booting into rescue mode at oneprovider.com and port 22 is filtered |
17:56
🔗
|
schbiwork |
oh wait, it just took a "while" |
18:05
🔗
|
|
ripvanwin has joined #archiveteam-bs |
18:29
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
18:35
🔗
|
|
dashcloud has joined #archiveteam-bs |
18:43
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
18:59
🔗
|
|
dashcloud has joined #archiveteam-bs |
19:02
🔗
|
|
primus104 has joined #archiveteam-bs |
19:18
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
19:33
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
19:33
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
19:42
🔗
|
joepie91 |
for future reference, to mirror an entire github user (repos only): curl "https://api.github.com/users/$USERNAME/repos" | jq -r '.[].clone_url' | xargs -L 1 git clone --mirror |
19:42
🔗
|
xmc |
nice |
19:48
🔗
|
joepie91 |
xmc: jq is basically magic. :P |
19:48
🔗
|
xmc |
yuuup |
19:48
🔗
|
joepie91 |
but the docs are shit, so: |
19:48
🔗
|
joepie91 |
-r means raw output, ie. no quote marks around the strings |
19:48
🔗
|
xmc |
i used it the other day in a makefile to avoid writing a proper client for an api |
19:48
🔗
|
joepie91 |
.[] means "for every item in the array" |
19:49
🔗
|
joepie91 |
.propname gets the propname |
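For readers who don't have jq at hand, the three pieces joepie91 explains (`-r`, `.[]`, `.propname`) correspond roughly to the Python below. It is a sketch of the semantics only, not of how jq is implemented; the sample JSON and the `example-user` account are made up for illustration.

```python
import json

# Made-up sample shaped like an array of repository objects.
SAMPLE = """[
  {"name": "alpha", "clone_url": "https://github.com/example-user/alpha.git"},
  {"name": "beta", "clone_url": "https://github.com/example-user/beta.git"}
]"""


def clone_urls(api_response):
    """Rough equivalent of the jq filter `.[].clone_url`."""
    repos = json.loads(api_response)  # parse the JSON array
    # `.[]` visits every item in the array, `.clone_url` reads that property.
    return [repo["clone_url"] for repo in repos]


if __name__ == "__main__":
    for url in clone_urls(SAMPLE):
        # Printing the bare string, without JSON quotes, is what -r does.
        print(url)
```

One caveat: as far as I know the GitHub API paginates this listing (30 repositories per page by default), so the one-liner only sees the first page for larger accounts.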
19:49
🔗
|
joepie91 |
xmc: hehe |
19:49
🔗
|
joepie91 |
honestly syntax is pretty simple, it just has really poor docs |
19:49
🔗
|
joepie91 |
lol |
19:49
🔗
|
xmc |
jq | curl | jq --exit-status '.success == true' && success-stuff |
19:50
🔗
|
joepie91 |
oh, heh, clever |
19:50
🔗
|
xmc |
yeah i was pretty pleased |
19:50
🔗
|
joepie91 |
didn't know about exit-status |
19:50
🔗
|
joepie91 |
but I can guess what it does |
19:50
🔗
|
joepie91 |
"exit 0 if the expression is true" |
19:50
🔗
|
xmc |
yep |
19:50
🔗
|
joepie91 |
possibly also something like "or if it's a number, use that as exit code" |
19:50
🔗
|
xmc |
i could have done just '.success' but i don't always trust truthiness |
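The distinction xmc draws between `.success == true` and plain `.success` can be illustrated with a small Python sketch. Python's truthiness rules are not the same as jq's (in jq only `false` and `null` count as falsy), so this shows the general idea rather than jq's exact behaviour; the response bodies are invented.

```python
import json


def is_success_strict(body):
    """True only when the "success" field is the JSON boolean true."""
    data = json.loads(body)
    return isinstance(data, dict) and data.get("success") is True


def is_success_truthy(body):
    """True for any truthy "success" value, which is easier to fool."""
    data = json.loads(body)
    return isinstance(data, dict) and bool(data.get("success"))


if __name__ == "__main__":
    # An API that sends the *string* "false" passes the truthy check.
    sloppy = '{"success": "false"}'
    print(is_success_truthy(sloppy))
    print(is_success_strict(sloppy))
```

The strict check rejects strings, numbers and missing fields alike, which is the safer default when the API's behaviour on errors is unknown.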
19:50
🔗
|
joepie91 |
(wild guess) |
19:50
🔗
|
joepie91 |
yeah |
19:50
🔗
|
joepie91 |
good call :P |
19:51
🔗
|
joepie91 |
too many people do.. it's like one of the top 3 things I end up correcting |
19:51
🔗
|
joepie91 |
when reviewing people's code |
19:51
🔗
|
joepie91 |
"you should not do that. use == null instead" |
19:51
🔗
|
joepie91 |
(it's related to truthiness) |
19:53
🔗
|
DFJustin |
http://www.cs.utah.edu/~gk/atwork/ |
19:54
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
19:54
🔗
|
schbiwork |
curl https://api.github.com/users/"${USERNAME}"/repos | grep -Eo "https://github.com/"${USERNAME}"/.*\.git" | xargs -L 1 git clone --mirror |
19:54
🔗
|
schbiwork |
:P |
19:54
🔗
|
HCross |
schbiwork is it in their Paris location? |
19:55
🔗
|
schbiwork |
HCross: yeah, currently we reached "Can we run some test on the server? This will cause a short downtime." |
19:55
🔗
|
HCross |
Its basically Online.net rebranded |
19:55
🔗
|
joepie91 |
schbiwork: good bit less reliable |
19:55
🔗
|
HCross |
http://www.online.net/en/dedicated-server - go straight to the source |
19:56
🔗
|
schbiwork |
HCross: i know i know but their offer was good |
19:56
🔗
|
xmc |
schbiwork: you can also sed -e 's/"/\n/g' | grep '\.git$' |
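xmc's split-on-quotes trick translates to Python roughly as below. The sample is invented, and it assumes (as far as I recall) that GitHub's repository objects also carry `git_url` and `ssh_url` fields ending in `.git`, in which case the plain suffix match returns more than just the HTTPS clone URL.

```python
# Invented sample with the three URL fields a repository object may carry.
SAMPLE = """{
    "git_url": "git://github.com/example-user/alpha.git",
    "ssh_url": "git@github.com:example-user/alpha.git",
    "clone_url": "https://github.com/example-user/alpha.git",
    "homepage": "https://example.com/"
}"""


def git_strings(raw):
    """Split on double quotes and keep the pieces that end in .git."""
    return [piece for piece in raw.split('"') if piece.endswith(".git")]


def https_clone_urls(raw):
    """Same idea, narrowed to HTTPS URLs so each repo appears once."""
    return [piece for piece in git_strings(raw) if piece.startswith("https://")]


if __name__ == "__main__":
    print(len(git_strings(SAMPLE)))
    print(https_clone_urls(SAMPLE))
```

Neither function parses JSON, so both share the fragility of the sed and grep versions: they work only as long as the interesting strings contain no escaped quotes.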
19:56
🔗
|
schbiwork |
:) |
20:27
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
20:35
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
20:38
🔗
|
|
dashcloud has joined #archiveteam-bs |
20:40
🔗
|
|
mistym has joined #archiveteam-bs |
21:00
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
21:16
🔗
|
|
mistym has joined #archiveteam-bs |
21:54
🔗
|
|
RichardG has quit IRC (Remote host closed the connection) |
21:58
🔗
|
|
RichardG has joined #archiveteam-bs |
22:04
🔗
|
|
kyan has quit IRC (Quit: This computer has gone to sleep) |
22:28
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
22:28
🔗
|
|
swebb sets mode: +o aaaaaaaaa |
22:32
🔗
|
aaaaaaaaa |
so I just discovered my second least favorite call to get from my family. |
22:34
🔗
|
aaaaaaaaa |
"my computer is acting funny." "what were you doing before that?" "I went to a website and the screen was all red, so my neighbor said to try it with internet explorer." |
22:35
🔗
|
joepie91 |
;_; |
22:35
🔗
|
RedType |
im hoping it was at least 10 or 11 |
22:35
🔗
|
joepie91 |
aaaaaaaaa: you're going to want to go over there right now if you can |
22:35
🔗
|
joepie91 |
and ensure it's not a cryptolocker |
22:36
🔗
|
aaaaaaaaa |
that is where I was |
22:36
🔗
|
joepie91 |
ah ok |
22:36
🔗
|
joepie91 |
also teach them to not ignore 'there's danger ahead' screens... >.> |
22:36
🔗
|
joepie91 |
(because $5 that that was the "all red screen" they were complaining about) |
22:36
🔗
|
RedType |
opendns + adblock everything + ghostery/disconnect to block oauth prompts |
22:37
🔗
|
joepie91 |
RedType: fuck ghostery |
22:37
🔗
|
joepie91 |
privacy badger |
22:37
🔗
|
aaaaaaaaa |
yeah, the default web browser is chrome, so that is what I am thinking too. |
22:37
🔗
|
joepie91 |
(ghostery is operated by a marketing company who collect and resell data on tracker blocking) |
22:37
🔗
|
joepie91 |
(seriously) |
22:37
🔗
|
RedType |
"this site redirected me to google? it's safe to log in to google!" |
22:37
🔗
|
aaaaaaaaa |
seriously? |
22:37
🔗
|
joepie91 |
yes, seriously |
22:37
🔗
|
aaaaaaaaa |
heh, guess so |
22:37
🔗
|
joepie91 |
hence, fuck ghostery |
22:37
🔗
|
joepie91 |
privacy badger is EFF |
22:37
🔗
|
joepie91 |
no such crap |
22:37
🔗
|
joepie91 |
:p |
22:38
🔗
|
joepie91 |
it also uses heuristics rather than blocklists |
22:38
🔗
|
joepie91 |
false positive rate is nonzero, but very very low |
22:38
🔗
|
aaaaaaaaa |
I'm thinking of just getting them a chromebook as my birthday present to myself. |
22:38
🔗
|
joepie91 |
lol |
22:38
🔗
|
RedType |
id like some proof on that claim about ghostery though, it looks like they have that as opt in |
22:38
🔗
|
joepie91 |
aaaaaaaaa: chromebooks are not immune to malware |
22:38
🔗
|
RedType |
the claim that they're collecting data |
22:38
🔗
|
RedType |
not the claim of who owns what |
22:39
🔗
|
joepie91 |
RedType: that may have been a recent change, but frankly, I would not trust a "keep marketing companies out" extension developed by a marketing company |
22:39
🔗
|
joepie91 |
regardless of opt in or opt out |
22:39
🔗
|
joepie91 |
massive conflict of interest |
22:39
🔗
|
joepie91 |
(the issue is pretty well documented from around a year or two ago) |
22:40
🔗
|
joepie91 |
also, aaaaaaaaa, you have installed unchecky, right? |
22:40
🔗
|
|
xf2e has joined #archiveteam-bs |
22:41
🔗
|
RedType |
well documented but you made the claim. anyways, fwiw i use noscript instead of That Jazz |
22:41
🔗
|
|
dashcloud has quit IRC (Remote host closed the connection) |
22:42
🔗
|
aaaaaaaaa |
never heard of unchecky, but they don't have admin privileges. Although that is more an ignorance defense than malware defense, it seems. |
22:43
🔗
|
joepie91 |
aaaaaaaaa: you'll want to install it regardless |
22:43
🔗
|
joepie91 |
it just auto-declines any bundled 'offers' (ie. malware) |
22:43
🔗
|
|
dashcloud has joined #archiveteam-bs |
22:43
🔗
|
joepie91 |
RedType: yes, and too busy to dig it up right now :P |
22:43
🔗
|
joepie91 |
just wanted to give a quick tip |
22:43
🔗
|
joepie91 |
also wtf, a frying pan has gone missing |
22:43
🔗
|
RedType |
fair enough |
22:44
🔗
|
RedType |
also, one of the best defences against cryptolockers is offline backups or backups that don't allow you to overwrite previous revisions |
22:45
🔗
|
RedType |
that latter one is actually a difficult to come by solution for consumers |
22:45
🔗
|
RedType |
at least in an open source/free form |
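The "never overwrite a previous revision" idea RedType mentions can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the naming scheme (timestamp plus a short content hash) and the function name are hypothetical, and it protects only against this script clobbering older copies, not against malware that has write access to the backup directory.

```python
import hashlib
import shutil
import time
from pathlib import Path


def backup_revision(source, backup_dir):
    """Copy `source` into `backup_dir` under a new, never-reused name."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Timestamp plus a short content hash keeps distinct revisions apart.
    digest = hashlib.sha256(source.read_bytes()).hexdigest()[:12]
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"{source.name}.{stamp}.{digest}"

    # Mode "xb" raises FileExistsError instead of overwriting an existing file.
    with open(source, "rb") as src, open(target, "xb") as out:
        shutil.copyfileobj(src, out)
    return target


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "notes.txt"
        doc.write_text("first version")
        print(backup_revision(doc, Path(tmp) / "backups").name)
```

For real protection the backup store itself has to refuse modification, for example by having a separate machine pull the files rather than letting the client push them.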
22:49
🔗
|
aaaaaaaaa |
A friend at work has it set up so user folders are shared on his home network and then has another computer pull any changes over that way. |
22:52
🔗
|
RedType |
deaugh |
22:53
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
22:58
🔗
|
|
dashcloud has joined #archiveteam-bs |
23:25
🔗
|
|
Asparagir has joined #archiveteam-bs |
23:26
🔗
|
|
w0rp has quit IRC (Read error: Operation timed out) |
23:45
🔗
|
|
Jonimus has quit IRC (Ping timeout: 370 seconds) |
23:56
🔗
|
|
Jonimus has joined #archiveteam-bs |