#projectnewsletter 2015-12-04,Fri


Time Nickname Message
15:19 🔗 Start has quit IRC (Quit: Disconnected.)
16:24 🔗 Start has joined #projectnewsletter
16:24 🔗 Start has quit IRC (Client Quit)
16:24 🔗 Start has joined #projectnewsletter
17:07 🔗 Start has quit IRC (Quit: Disconnected.)
17:12 🔗 Start has joined #projectnewsletter
18:38 🔗 Start has quit IRC (Quit: Disconnected.)
19:36 🔗 Start has joined #projectnewsletter
20:37 🔗 Start has quit IRC (Quit: Disconnected.)
20:43 🔗 Start has joined #projectnewsletter
20:45 🔗 Start has quit IRC (Client Quit)
20:48 🔗 Start has joined #projectnewsletter
21:19 🔗 nickname has joined #projectnewsletter
21:19 🔗 nickname hello
21:21 🔗 nickname has quit IRC (Client Quit)
21:33 🔗 nickname has joined #projectnewsletter
21:33 🔗 nickname anyone scraped google yet?
21:50 🔗 achip nickname, feel free to, it's likely changed since the last time
21:51 🔗 nickname what should I use?
21:52 🔗 achip good question. My usual go-to is to just open a browser, enter the search, and ctrl+click to open each page of the results in a tab. Then use a "copy links" extension to copy all the links on the tabs, paste that into a document, then regex
21:52 🔗 achip it's manually intensive but it works
21:56 🔗 nickname I found a big list of email pages, only problem is that they're in some strange compressed JS.
21:57 🔗 nickname Here's an unrelated pastebin scrape link: https://pastebin.com/raw.php?i=0tZYHKRP
22:03 🔗 achip here's what I regex'd from that list http://paste.nerds.io/raw/qipuzanafa
22:08 🔗 nickname Here's another link, some of the links on it may be dead and it's in JS: https://pastebin.com/raw.php?i=8LKpiZD6
22:14 🔗 achip and what I got from that: http://paste.nerds.io/raw/anadaresug
22:14 🔗 nickname Thank you
22:15 🔗 nickname I'm on Windows, so I can't do the awesome command line text manipulation that is the *nix terminal
22:16 🔗 achip no problem, for future reference that was: cat thingy2.txt | egrep -oE "\.mba\":\"[^\"]*" | sed "s/^.mba\":\"//" > thingy2-res.txt
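For anyone without a *nix shell, the pipeline above can be approximated in Python. This is only a sketch: it assumes, as the command does, that each wanted value sits in a string right after a key ending in `.mba`. The function name `extract_values` and the command-line handling are illustrative and not part of the log.

```python
import re
import sys

# Same idea as the egrep pattern: a literal `.mba":"` followed by
# everything up to the next double quote (kept on one line, as grep would).
PATTERN = re.compile(r'\.mba":"([^"\n]*)')


def extract_values(text):
    """Return every value that follows a `.mba":"` key fragment in text."""
    # The capture group already drops the prefix that the sed step removed.
    return PATTERN.findall(text)


if __name__ == "__main__":
    # Hypothetical usage: python extract.py thingy2.txt > thingy2-res.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for value in extract_values(handle.read()):
            print(value)
```

The capture group does the work of both the egrep and the sed stages, so no second pass is needed.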
22:16 🔗 Start has quit IRC (Quit: Disconnected.)
22:18 🔗 nickname There's a whole git repository of just gnu mailman links
22:18 🔗 achip perfect!
22:19 🔗 nickname It's at bitbucket: https://bitbucket.org/themailbait/themailbait.bitbucket.org/src/501cbbc613d2ebc56d77ccc0f3288c88c0a0d042/jsonp/?at=master
22:19 🔗 nickname They're all in JS
22:19 🔗 nickname and there may be some dead links, but it's a start!
22:21 🔗 achip that mail bait project is interesting (http://www.mailbait.info) at least I think it's the same
22:22 🔗 nickname It's the same, it's linked in the source of the page
22:22 🔗 nickname proof: check the source of this page www.mailbait.info/run.html?pack=52
22:23 🔗 achip nice, good find
22:38 🔗 nickname Another one: https://pastebin.com/raw.php?i=2Ui43VaE
22:47 🔗 * nickname slaps achip around a bit with a large fishbot
22:49 🔗 achip pretty similar: http://paste.nerds.io/raw/aseyitohum
22:52 🔗 nickname As for the problem of archiving the email messages, I suggest using gmail, but having 1,000 variations on 1 account, such as, wearegoingtorescue@gmail.com to we.are.going.t.o.res.cu.e@gmail.com
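The dotted-address idea above relies on Gmail ignoring dots in the local part of an address. A minimal sketch of generating such variations follows; the function name `dot_variants` is hypothetical, and whether list operators would accept that many sign-ups for one mailbox is not something this log establishes.

```python
from itertools import islice, product


def dot_variants(local_part, domain="gmail.com"):
    """Yield addresses with a dot inserted in every combination of gaps."""
    if not local_part:
        return
    gaps = len(local_part) - 1
    # Each mask entry says whether a dot goes before the matching character.
    for mask in product((False, True), repeat=gaps):
        pieces = [local_part[0]]
        for dot, ch in zip(mask, local_part[1:]):
            pieces.append("." + ch if dot else ch)
        yield "".join(pieces) + "@" + domain


# An 18-character local part has 17 gaps, so 2**17 variants exist;
# take only the first 1,000 mentioned in the discussion.
first_thousand = list(islice(dot_variants("wearegoingtorescue"), 1000))
```

Since each variant is distinct as a string, the address a message was sent to could also serve as a label for which list it came from.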
22:55 🔗 nickname_ has joined #projectnewsletter
22:55 🔗 nickname_ woah
22:55 🔗 nickname_ this is weird
22:56 🔗 nickname has quit IRC (Ping timeout: 240 seconds)
22:57 🔗 nickname has joined #projectnewsletter
22:57 🔗 nickname I am here
22:57 🔗 nickname achip: here is another one: /mailman/subscribe/
22:58 🔗 nickname ignore that
22:58 🔗 nickname I have too much stuff in my clipboard
23:00 🔗 nickname_ has quit IRC (Ping timeout: 240 seconds)
23:17 🔗 nickname achip: another one: https://pastebin.com/raw.php?i=WnTrt6NL
23:29 🔗 Start has joined #projectnewsletter
23:30 🔗 svchfoo1 sets mode: +o Start
23:36 🔗 Start achip: does project newsletter regularly upload archived newsletters to archive.org yet?
23:41 🔗 arkiver Start: the project isn't completely finished
23:41 🔗 arkiver yet*
23:41 🔗 arkiver We should get back to work on it soon
23:44 🔗 Start alright, i was just wondering because mail1-3.newsletter.nerds.io have all been archiving various newsletters for several months
23:45 🔗 Start i'd personally love to have a warrior project for discovering newsletters, although that would likely be very hard to code
23:54 🔗 nickname I found a git repo of JS files containing GNU mailman links
23:54 🔗 nickname Here it is: https://bitbucket.org/themailbait/themailbait.bitbucket.org/src/501cbbc613d2ebc56d77ccc0f3288c88c0a0d042/jsonp/?at=master
