#projectnewsletter 2015-12-04,Fri

Time	Nickname	Message
15:19 ^🔗		Start has quit IRC (Quit: Disconnected.)
16:24 ^🔗		Start has joined #projectnewsletter
16:24 ^🔗		Start has quit IRC (Client Quit)
16:24 ^🔗		Start has joined #projectnewsletter
17:07 ^🔗		Start has quit IRC (Quit: Disconnected.)
17:12 ^🔗		Start has joined #projectnewsletter
18:38 ^🔗		Start has quit IRC (Quit: Disconnected.)
19:36 ^🔗		Start has joined #projectnewsletter
20:37 ^🔗		Start has quit IRC (Quit: Disconnected.)
20:43 ^🔗		Start has joined #projectnewsletter
20:45 ^🔗		Start has quit IRC (Client Quit)
20:48 ^🔗		Start has joined #projectnewsletter
21:19 ^🔗		nickname has joined #projectnewsletter
21:19 ^🔗	nickname	hello
21:21 ^🔗		nickname has quit IRC (Client Quit)
21:33 ^🔗		nickname has joined #projectnewsletter
21:33 ^🔗	nickname	anyone scraped google yet?
21:50 ^🔗	achip	nickname, feel free to, it's likely changed since the last time
21:51 ^🔗	nickname	what should I use?
21:52 ^🔗	achip	good question. My usual goto is just open a browser, enter the search and ctrl+click to open each page of the results in a tab. then use a "copy links" extension to copy all the links on the tabs. paste that into a document then regex
21:52 ^🔗	achip	it's manually intensive but it works
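The "paste that into a document then regex" step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the URL pattern is a loose approximation, and the name `extract_links` is hypothetical, not something used in the channel.

```python
import re

# Loose URL pattern (an assumption): http(s):// followed by anything that is
# not whitespace, a quote, or an angle bracket.
URL_RE = re.compile(r'https?://[^\s"\'<>]+')

def extract_links(text):
    """Return the unique links found in text, in order of first appearance."""
    seen = set()
    links = []
    for url in URL_RE.findall(text):
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links
```

Calling `extract_links` on the pasted text of the copied tabs would give a de-duplicated list that can be written out one link per line.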
21:56 ^🔗	nickname	I found a big list of email pages, only problem is that they're in some strange compressed JS.
21:57 ^🔗	nickname	Here's an unrelated pastebin scrape link: https://pastebin.com/raw.php?i=0tZYHKRP
22:03 ^🔗	achip	here's what I regex'd from that list http://paste.nerds.io/raw/qipuzanafa
22:08 ^🔗	nickname	Here's another link, some of the links on it may be dead and it's in JS: https://pastebin.com/raw.php?i=8LKpiZD6
22:14 ^🔗	achip	and what I got from that: http://paste.nerds.io/raw/anadaresug
22:14 ^🔗	nickname	Thank you
22:15 ^🔗	nickname	I'm on windows, so I can't do the awesome command line text manipulation that is the *nix terminal
22:16 ^🔗	achip	no problem, for future reference that was: cat thingy2.txt | egrep -oE "\.mba\":\"[^\"]*" | sed "s/^.mba\":\"//" > thingy2-res.txt
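For anyone without a *nix shell (nickname mentions being on Windows), the pipeline above can be approximated in Python. This is a minimal sketch, not what achip ran: the function names `extract_values` and `convert` are made up for illustration, and only the file names come from the command above.

```python
import re
from pathlib import Path

# Same idea as the egrep pattern: the literal text .mba":" followed by
# everything up to the next double quote. The capture group does the job of
# the sed step, which strips the .mba":" prefix from each match.
# "\n" is excluded so matches stay on one line, as they do with grep.
PATTERN = re.compile(r'\.mba":"([^"\n]*)')

def extract_values(text):
    """Return every value that follows .mba":" in text."""
    return PATTERN.findall(text)

def convert(src, dst):
    """Read src, extract the values, and write them to dst one per line."""
    text = Path(src).read_text(encoding="utf-8", errors="replace")
    values = extract_values(text)
    Path(dst).write_text("".join(v + "\n" for v in values), encoding="utf-8")
```

Running `convert("thingy2.txt", "thingy2-res.txt")` from the folder holding the pasted file should produce roughly the same output as the one-liner.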
22:16 ^🔗		Start has quit IRC (Quit: Disconnected.)
22:18 ^🔗	nickname	There's a whole git repository of just gnu mailman links
22:18 ^🔗	achip	perfect!
22:19 ^🔗	nickname	It's at bitbucket: https://bitbucket.org/themailbait/themailbait.bitbucket.org/src/501cbbc613d2ebc56d77ccc0f3288c88c0a0d042/jsonp/?at=master
22:19 ^🔗	nickname	They're all in JS
22:19 ^🔗	nickname	and there may be some dead links, but it's a start!
22:21 ^🔗	achip	that mail bait project is interesting (http://www.mailbait.info) at least I think it's the same
22:22 ^🔗	nickname	It's the same, it's linked in the source of the page
22:22 ^🔗	nickname	proof: check the source of this page www.mailbait.info/run.html?pack=52
22:23 ^🔗	achip	nice, good find
22:38 ^🔗	nickname	Another one: https://pastebin.com/raw.php?i=2Ui43VaE
22:47 ^🔗	*	nickname slaps achip around a bit with a large fishbot
22:49 ^🔗	achip	pretty similar: http://paste.nerds.io/raw/aseyitohum
22:52 ^🔗	nickname	As for the problem of archiving the email messages, I suggest using gmail, but having 1,000 variations on 1 account, such as wearegoingtorescue@gmail.com and we.are.going.t.o.res.cu.e@gmail.com
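The dot-variation idea above can be sketched as a small generator. It assumes Gmail ignores dots in the local part of gmail.com addresses; whether a given mailing list accepts such addresses is untested here. The name `dot_variants` is hypothetical.

```python
from itertools import islice, product

def dot_variants(local, domain="gmail.com"):
    """Yield every way of placing single dots between the characters of local."""
    if not local:
        return
    gaps = len(local) - 1  # one optional dot between each pair of characters
    for mask in product(("", "."), repeat=gaps):
        body = "".join(ch + sep for ch, sep in zip(local, mask)) + local[-1]
        yield f"{body}@{domain}"

# First 1,000 variations of the example address from the log.
first_thousand = list(islice(dot_variants("wearegoingtorescue"), 1000))
```

An 18-character local part has 17 gaps, so there are 2**17 possible variants, far more than the 1,000 suggested.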
22:55 ^🔗		nickname_ has joined #projectnewsletter
22:55 ^🔗	nickname_	woah
22:55 ^🔗	nickname_	this is weird
22:56 ^🔗		nickname has quit IRC (Ping timeout: 240 seconds)
22:57 ^🔗		nickname has joined #projectnewsletter
22:57 ^🔗	nickname	I am here
22:57 ^🔗	nickname	achip: here is another one: /mailman/subscribe/
22:58 ^🔗	nickname	ignore that
22:58 ^🔗	nickname	I have too much stuff in my clipboard
23:00 ^🔗		nickname_ has quit IRC (Ping timeout: 240 seconds)
23:17 ^🔗	nickname	achip: another one: https://pastebin.com/raw.php?i=WnTrt6NL
23:29 ^🔗		Start has joined #projectnewsletter
23:30 ^🔗		svchfoo1 sets mode: +o Start
23:36 ^🔗	Start	achip: does project newsletter regularly upload archived newsletters to archive.org yet?
23:41 ^🔗	arkiver	Start: the project isn't completely finished
23:41 ^🔗	arkiver	yet*
23:41 ^🔗	arkiver	We should get back to work on it soon
23:44 ^🔗	Start	alright, i was just wondering because mail1-3.newsletter.nerds.io have all been archiving various newsletters for several months
23:45 ^🔗	Start	i'd personally love to have a warrior project for discovering newsletters, although that would likely be very hard to code
23:54 ^🔗	nickname	I found a git repo of JS files containing GNU mailman links
23:54 ^🔗	nickname	Here it is: https://bitbucket.org/themailbait/themailbait.bitbucket.org/src/501cbbc613d2ebc56d77ccc0f3288c88c0a0d042/jsonp/?at=master
