Time |
Nickname |
Message |
00:38
🔗
|
|
lunik198 has quit IRC (:x) |
00:39
🔗
|
|
lunik198 has joined #archiveteam-ot |
01:09
🔗
|
|
MrRadar2 has quit IRC (Remote host closed the connection) |
01:12
🔗
|
|
thuban1 has joined #archiveteam-ot |
01:14
🔗
|
|
MrRadar2 has joined #archiveteam-ot |
01:17
🔗
|
|
thuban has quit IRC (Read error: Operation timed out) |
01:21
🔗
|
|
Frogging has quit IRC (Quit: Close the World, Open the nExt) |
01:26
🔗
|
|
katocala has joined #archiveteam-ot |
01:35
🔗
|
|
Frogging has joined #archiveteam-ot |
01:35
🔗
|
|
X-Scale` has joined #archiveteam-ot |
01:36
🔗
|
|
X-Scale has quit IRC (Ping timeout: 240 seconds) |
01:36
🔗
|
|
X-Scale` is now known as X-Scale |
01:46
🔗
|
X-Scale |
Just saw this :( https://www.vice.com/en_us/article/8xwe9p/yahoo-groups-is-winding-down-and-all-content-will-be-permanently-removed |
01:47
🔗
|
X-Scale |
Is there a concerted effort to mirror it all before it's unplugged?
01:48
🔗
|
astrid |
yes #yahoosucks |
01:49
🔗
|
X-Scale |
Ah, thanks. Otherwise, it would be yet another Library of Alexandria down the drain. |
01:50
🔗
|
astrid |
it will be |
01:51
🔗
|
astrid |
but we are going to see what we can do to _mitigate_ that |
01:52
🔗
|
X-Scale |
That's a great and noble effort. |
02:38
🔗
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
03:01
🔗
|
|
qw3rty2 has joined #archiveteam-ot |
03:08
🔗
|
|
qw3rty has quit IRC (Ping timeout: 745 seconds) |
04:00
🔗
|
|
SynMonger has quit IRC (Quit: Wait, what?) |
04:01
🔗
|
|
qw3rty has joined #archiveteam-ot |
04:01
🔗
|
|
SynMonger has joined #archiveteam-ot |
04:10
🔗
|
|
qw3rty2 has quit IRC (Ping timeout: 745 seconds) |
04:42
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
04:42
🔗
|
|
BlueMax has joined #archiveteam-ot |
04:51
🔗
|
|
godane has quit IRC (Leaving.) |
05:07
🔗
|
|
manjaro-u has quit IRC (Konversation terminated!) |
05:30
🔗
|
|
dhyan_nat has joined #archiveteam-ot |
05:38
🔗
|
|
godane has joined #archiveteam-ot |
06:42
🔗
|
|
dhyan_nat has quit IRC (Read error: Operation timed out) |
07:05
🔗
|
|
katocala has quit IRC (Read error: Operation timed out) |
07:05
🔗
|
|
katocala has joined #archiveteam-ot |
07:46
🔗
|
|
killsushi has quit IRC (Read error: Connection reset by peer) |
07:46
🔗
|
|
killsushi has joined #archiveteam-ot |
07:50
🔗
|
|
Frogging has quit IRC (Read error: Operation timed out) |
07:50
🔗
|
|
JAA has quit IRC (Read error: Operation timed out) |
07:51
🔗
|
|
Frogging has joined #archiveteam-ot |
07:51
🔗
|
|
simon816 has quit IRC (Ping timeout: 246 seconds) |
07:51
🔗
|
|
lunik198 has quit IRC (Ping timeout (120 seconds)) |
07:51
🔗
|
|
dxrt has quit IRC (ZNC - http://znc.sourceforge.net) |
07:51
🔗
|
|
dxrt has joined #archiveteam-ot |
07:51
🔗
|
|
ats has quit IRC (Read error: Operation timed out) |
07:52
🔗
|
|
Fusl____ sets mode: +o dxrt |
07:52
🔗
|
|
Fusl sets mode: +o dxrt |
07:52
🔗
|
|
Fusl_ sets mode: +o dxrt |
07:53
🔗
|
|
lunik198 has joined #archiveteam-ot |
07:54
🔗
|
|
simon816 has joined #archiveteam-ot |
07:54
🔗
|
|
JAA has joined #archiveteam-ot |
07:54
🔗
|
|
Fusl____ sets mode: +o JAA |
07:54
🔗
|
|
Fusl sets mode: +o JAA |
07:54
🔗
|
|
Fusl_ sets mode: +o JAA |
07:55
🔗
|
|
ats has joined #archiveteam-ot |
07:55
🔗
|
|
AlsoJAA sets mode: +o JAA |
08:23
🔗
|
|
schbirid has joined #archiveteam-ot |
13:23
🔗
|
JAA |
I wrote a little script to automate discovery of social media given a list of web pages. That sounds fancier than it is; it just fetches the URL and extracts anything that looks like or could point to a Facebook, Flickr, Instagram, Twitter, VK, or YouTube page. The output requires a lot of manual cleanup obviously, but still saves time. It's here for anyone interested: |
13:23
🔗
|
JAA |
https://github.com/JustAnotherArchivist/little-things/blob/master/website-extract-social-media |
13:25
🔗
|
JAA |
There's also wiki-website-extract-social-media, which takes input in the form of a new-viewer wiki page and formats the output accordingly again. I use it by copying the wiki page source to the clipboard, then xclip -selection c -o | ./wiki-website-extract-social-media | ./social-media-normalise | ./youtube-normalise (plus redirection to a file because it would mangle with errors otherwise) |
13:25
🔗
|
JAA |
Ryz: ^ might be of interest to you. |
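For readers who want a feel for the extraction step JAA describes, here is a minimal Python sketch. It is not the linked script: the function name `extract_social_links`, the regular expressions, and the choice to only look at `href` attributes are all assumptions made for illustration, and fetching the page is left out so the example runs offline. As in the original, the output would still need manual cleanup.

```python
import re
from urllib.parse import urljoin

# Hosts treated as social media; the list mirrors the services named in the
# message above. The pattern itself is an illustrative assumption.
SOCIAL_HOST_RE = re.compile(
    r"^https?://(?:[a-z0-9-]+\.)*"
    r"(?:facebook\.com|flickr\.com|instagram\.com|twitter\.com|vk\.com|youtube\.com)"
    r"(?:/|$)",
    re.IGNORECASE,
)

# Deliberately naive href matcher; a real tool might use an HTML parser instead.
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_social_links(html, base_url):
    """Return social media URLs found in href attributes, in order, without duplicates."""
    found = []
    for href in HREF_RE.findall(html):
        # Resolve relative and protocol-relative links against the page URL.
        url = urljoin(base_url, href)
        if SOCIAL_HOST_RE.match(url) and url not in found:
            found.append(url)
    return found
```

Calling `extract_social_links(page_source, page_url)` on already-downloaded HTML returns a list of candidate links; anything not on one of the six hosts is ignored.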
13:51
🔗
|
|
icedice has joined #archiveteam-ot |
13:57
🔗
|
|
systwi_ is now known as systwi |
14:34
🔗
|
|
icedice2 has joined #archiveteam-ot |
14:39
🔗
|
|
icedice has quit IRC (Ping timeout: 252 seconds) |
14:39
🔗
|
|
icedice2 has quit IRC (Client Quit) |
14:39
🔗
|
|
icedice has joined #archiveteam-ot |
14:43
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
14:43
🔗
|
|
killsushi has quit IRC (Quit: Leaving) |
15:14
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
15:21
🔗
|
|
DogsRNice has joined #archiveteam-ot |
15:28
🔗
|
|
icedice has quit IRC (Quit: Leaving) |
15:44
🔗
|
|
thuban1 has quit IRC (Read error: Connection reset by peer) |
15:45
🔗
|
|
thuban1 has joined #archiveteam-ot |
15:58
🔗
|
|
girst has quit IRC (Quit: ZNC 1.7.3 - https://znc.in) |
16:01
🔗
|
|
girst has joined #archiveteam-ot |
17:31
🔗
|
|
MaximeleG has joined #archiveteam-ot |
17:39
🔗
|
JAA |
Huh, TIL you can access Facebook posts based only on the post ID. E.g. https://www.facebook.com/10162401954220022 which then redirects to https://www.facebook.com/jungegruene.jeunesverts/posts/10162401954220022 |
17:41
🔗
|
DogsRNice |
makes sense |
17:41
🔗
|
JAA |
Yeah, except nothing else makes sense on Facebook. :-P |
17:56
🔗
|
JAA |
One example: you can insert periods anywhere in the username and still get the profile. |
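The two Facebook observations above can be turned into a small helper for deduplicating collected URLs. This is only a sketch resting on what JAA reports: that a bare post ID redirects to the full post URL, and that periods in the username are ignored. The function names are hypothetical, and the code assumes the first path segment is a username, which will not hold for every Facebook URL.

```python
from urllib.parse import urlsplit


def facebook_post_url(post_id):
    """Build the short URL that, per the message above, redirects to the full post."""
    return f"https://www.facebook.com/{post_id}"


def facebook_profile_key(url):
    """Return the first path segment with periods removed, for grouping profile URLs.

    Assumes the first path segment is a username; other URL shapes are not handled.
    """
    first_segment = urlsplit(url).path.strip("/").split("/")[0]
    return first_segment.replace(".", "")
```

Two URLs with the same `facebook_profile_key` are candidates for pointing at the same profile, which still needs checking by hand.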
18:11
🔗
|
|
manjaro-u has joined #archiveteam-ot |
18:11
🔗
|
|
thuban1 is now known as thuban |
18:17
🔗
|
Raccoon |
++ |
18:18
🔗
|
Raccoon |
well, facebook isn't alone with the periods. gmail / google accounts do that too |
18:18
🔗
|
Raccoon |
and i presume youtube usernames |
18:19
🔗
|
|
girst has quit IRC (Quit: ZNC 1.7.5 - https://znc.in) |
18:19
🔗
|
|
girst has joined #archiveteam-ot |
18:20
🔗
|
JAA |
That's something completely different. |
18:20
🔗
|
JAA |
And no, YouTube doesn't do this. |
18:21
🔗
|
|
manjaro-u has quit IRC (Quit: Konversation terminated!) |
18:21
🔗
|
JAA |
Well ok, not completely different, but still not the same thing. |
18:24
🔗
|
|
manjaro-u has joined #archiveteam-ot |
18:24
🔗
|
|
katocala has quit IRC () |
18:33
🔗
|
hook54321 |
I've seen lots of email providers do that |
18:33
🔗
|
Raccoon |
you can log in / email / etc the user john.smith or johnsmith |
18:33
🔗
|
Raccoon |
just because the corporate world made the dot format so prevalent
18:34
🔗
|
Raccoon |
maybe the university world |
20:11
🔗
|
|
thuban has quit IRC (Read error: Connection reset by peer) |
20:12
🔗
|
|
thuban1 has joined #archiveteam-ot |
20:12
🔗
|
|
thuban1 is now known as thuban |
20:18
🔗
|
|
icedice has joined #archiveteam-ot |
20:19
🔗
|
markedL |
Fusl: this (now outdated) paper suggests that an IP address is worth $100/day when run as a google cookie factory http://nsl.cs.columbia.edu/papers/2016/recaptcha.eurosp16.pdf
20:21
🔗
|
Fusl |
hm? |
20:23
🔗
|
Ryz |
Watching Vsauce recently, and I'm reminded of this video he made 6 years ago, 'Where Do Deleted Files Go?': https://www.youtube.com/watch?v=G5s4-Kak49o
20:24
🔗
|
markedL |
oh, well the summary is Google will use a user's tracking cookie as part of the risk assessment for whether to bypass an image recaptcha. So they set up automation to emulate human browsing behavior for a bit, to be able to present cookies and avoid the image captcha solving part.
20:52
🔗
|
|
MaximeleG has quit IRC (Quit: MaximeleG) |
21:13
🔗
|
|
icedice has quit IRC (Quit: icedice) |
22:36
🔗
|
|
BlueMax has joined #archiveteam-ot |
22:58
🔗
|
|
godane has quit IRC (Ping timeout: 246 seconds) |
23:09
🔗
|
|
dhyan_nat has joined #archiveteam-ot |
23:15
🔗
|
|
godane has joined #archiveteam-ot |