#archiveteam-bs 2017-12-05,Tue


Who What When
ola_norskstill funny though..I see some even take it as real news :D
shits gonna go viral :D
[00:00]
***jspiros has quit IRC (Read error: Operation timed out) [00:01]
ola_norsk'Irish villagers" lol ...you know , the kind with one pub in town, one local drunk; and massive Viagra factory :D
sad thing is, these things might need archiving.. There's a proposal for 5 years imprisonment for 'sharing fake news' in Ireland :/
[00:03]
***jspiros has joined #archiveteam-bs [00:07]
ola_norskCoolCanuk: i kind of hope this a fake news paper as well (it's not though) https://www.independent.ie/irish-news/politics/five-years-in-jail-for-spreading-fake-news-under-ff-proposal-36375745.html [00:08]
CoolCanuknope [00:08]
ola_norsk'biggest selling newspaper to Irish in Britain' might need archiving then [00:09]
CoolCanuklmao
also, why cant the fake news sites just say theyre fake
theyre allowed to be creative. parody and such is nice. But taking away their creativity is not
[00:09]
ola_norskit people know they are, it just exploded kind of account of that proposal :d
i think*
[00:10]
CoolCanukis fanfiction illegal too? conspiracy? predictions?
:P
[00:10]
ola_norskaye
writing a parody of a new article that catches on, 5 years in prison risk...that is bad
news*
[00:10]
CoolCanukI see what theyre trying to stop [00:12]
ola_norskaye. But it's like throwing the baby out with the bath water though :/ [00:12]
CoolCanukbut I hope it doesnt impact people from sharing news. Sharing news you wrote and know is fake is one thing, but people have a hard time knowing if it's fake [00:13]
ola_norskyeah, but "sharing" it? [00:13]
CoolCanuklike, people would be uncertain if the news is real, so newspapers die because no one shares and eventually everyone forgets
I share news sometimes
[00:13]
ola_norskwho doesn't ? [00:14]
CoolCanukI care about the environment, so if there's a report that the EPA head person isn't sure how much lead is safe in water (hint: NONE), I might share it [00:14]
ola_norskwell, if you reside in Ireland, you might want to make sure it's factual and real.. :D [00:15]
CoolCanukye
then
there's the issue of accidental fake news
like a news reporting a crash on a highway, but it's for a movie skit and they didnt know
[00:15]
ola_norskanyway, before i stray. There's no way that 'news site' is real..But, it's funny :D [00:16]
CoolCanuk:P
http://paste.nerds.io/ appears to be gone, and we link to some pastes. Nice -_-
[00:16]
ola_norskthe world needs funny stuffs
pastebin is no less prone than t.co :/
some morning some nerd has the idea "Hey! Let's change subdomain!" ...
[00:17]
SketchCowhttps://archive.org/details/manga_library?and%5B%5D=addeddate%3A2017-12%2A&sort=-publicdate&page=1
1,000 in.
102,000 to go.
[00:20]
***Ing3b0rg has quit IRC (Ping timeout: 260 seconds)
icedice has joined #archiveteam-bs
Ing3b0rg has joined #archiveteam-bs
[00:20]
ola_norskin wget documentation, anyone know what "--follow-tags=LIST" might mean? could that mean "alt=" in url links ?
--follow-tags=LIST comma-separated list of followed HTML tags.
not sure what is meant by HTML tags
[00:28]
***kristian_ has joined #archiveteam-bs
bubblymic has joined #archiveteam-bs
[00:31]
bubblymichi [00:33]
ola_norskhello [00:33]
***bubblymic has quit IRC (Client Quit) [00:34]
ola_norskseems like --follow-tags is just <img>, <a> etc... [00:38]
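Editor's sketch of the point settled above: "HTML tags" here means element names such as `a` or `img`, not attributes such as `alt=`. The small Python parser below only mimics the idea of restricting link extraction to a list of tags. It is not how wget is implemented; the class name, the tag-to-attribute table and the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser


class TagFilteredLinkParser(HTMLParser):
    """Collect link URLs, but only from an allow-list of tag names."""

    # Hypothetical table: which attribute carries the URL for each tag.
    URL_ATTRS = {"a": "href", "img": "src", "link": "href", "script": "src"}

    def __init__(self, follow_tags):
        super().__init__()
        self.follow_tags = set(follow_tags)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tags outside the allow-list are skipped entirely.
        if tag not in self.follow_tags:
            return
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def extract_links(html, follow_tags):
    parser = TagFilteredLinkParser(follow_tags)
    parser.feed(html)
    return parser.links


# Made-up sample page: note that alt= never produces a link.
SAMPLE = (
    '<a href="/page2" alt="x">next</a>'
    '<img src="/pic.png" alt="a picture">'
    '<script src="/app.js"></script>'
)

print(extract_links(SAMPLE, ["a"]))
print(extract_links(SAMPLE, ["a", "img"]))
```

With only `a` in the list, the image and script URLs are never queued, which is roughly the effect the option is described as having.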
CoolCanuki wish we had svg uploads so bad >:(
png sucks
[00:40]
ola_norskwhat are the domains that twitter user internally? (apart from t.co) is it just twitter.com and twimng.com ?
uses*
[00:50]
bithippotwimg.com [00:51]
ola_norskaye , ill trust this list https://github.com/dyne/domain-list/blob/master/data/twitter
sick and tired of wget crawling where it shouldn't
--exclude-domains can even keep it from trying t.co etc..
[00:52]
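Editor's sketch of the kind of host filtering discussed above. It assumes a domain exclusion list is matched against the host name by suffix (the host itself or any subdomain); that matching rule is an assumption made for illustration, not a statement about wget's internals. Putting `twimg.com` on the list is also only an example; `t.co` is the one domain named in the log.

```python
from urllib.parse import urlsplit

# Example exclusion list; only t.co is taken from the conversation.
EXCLUDED = ["t.co", "twimg.com"]


def is_excluded(url, excluded_domains):
    """Return True if the URL's host is an excluded domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in excluded_domains)


for candidate in (
    "https://t.co/abc",
    "https://pbs.twimg.com/x.jpg",
    "https://twitter.com/some_user",
):
    print(candidate, is_excluded(candidate, EXCLUDED))
```

Matching on a leading dot keeps unrelated hosts that merely end in the same letters (for example `at.co`) from being dropped by accident.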
CoolCanukkids tv says "must be 18 years or older to log on" when mentioning a website [00:55]
ola_norskwget is gold stuff it seems [00:55]
CoolCanukbut 99% of the websites dont require you to login xD [00:55]
bithippoCompliance hack :-P [00:55]
ola_norskCoolCanuk: just put in https://i.imgur.com/atGxO9o.jpg [01:00]
CoolCanukto the archive? [01:00]
ola_norskinternet archive asked if you are 18?
CoolCanuk: you were asked to verify age on IA ?
[01:00]
CoolCanukno. [01:01]
ola_norski missed the "kids tv" part [01:04]
***ola_norsk has quit IRC (I blame beers!) [01:04]
icedice has quit IRC (Quit: Leaving) [01:15]
icedice has joined #archiveteam-bs
icedice has quit IRC (Client Quit)
icedice has joined #archiveteam-bs
[01:22]
......... (idle for 43mn)
Stiletto has quit IRC () [02:09]
.... (idle for 15mn)
Stilett0 has joined #archiveteam-bs
kristian_ has quit IRC (Quit: Leaving)
[02:24]
pizzaiolo has quit IRC (Remote host closed the connection)
Stilett0 is now known as Stiletto
[02:38]
.... (idle for 16mn)
icedice has quit IRC (Quit: Leaving)
Ing3b0rg has quit IRC (Ping timeout: 248 seconds)
[02:56]
Ing3b0rg has joined #archiveteam-bs [03:04]
odemgwhy is this bot in this channel? [03:15]
CoolCanuksomeone thought it was a good idea
not sure who
[03:16]
***icedice has joined #archiveteam-bs
icedice2 has joined #archiveteam-bs
icedice2 has quit IRC (Client Quit)
[03:19]
icedice has quit IRC (Quit: Leaving) [03:28]
BlueMaxim has joined #archiveteam-bs [03:42]
..... (idle for 20mn)
bithippo has quit IRC (My MacBook Air has gone to sleep. ZZZzzz…) [04:02]
..... (idle for 24mn)
qw3rty118 has joined #archiveteam-bs [04:26]
qw3rty117 has quit IRC (Read error: Operation timed out) [04:32]
astridit's usually a good idea
nothing wrong with editing the wiki
thanks for the work, CoolCanuk!
hey who wants to work with me in the next few hours on compuswerve
er, compuserve
i'm gonna go do my laundry first. seeking: anyone who knows how to write a seesaw/pipeline script
[04:41]
***voidsta has joined #archiveteam-bs
voidsta has quit IRC (Connection closed)
[04:47]
voidsta has joined #archiveteam-bs [04:52]
CoolCanukLol it was hardly any work but :) thx [04:54]
***voidsta has left [04:56]
..... (idle for 22mn)
zalgovidme backup starting tomorrow? [05:18]
astridhm? [05:20]
..... (idle for 20mn)
***dashcloud has quit IRC (Read error: Operation timed out)
dashcloud has joined #archiveteam-bs
voidsta has joined #archiveteam-bs
[05:40]
...... (idle for 28mn)
voidsta has quit IRC (Quit: leaving) [06:11]
voidsta has joined #archiveteam-bs [06:17]
...... (idle for 25mn)
voidsta has quit IRC (Quit: leaving) [06:42]
bwn has quit IRC (Ping timeout: 260 seconds) [06:53]
CoolCanuk has quit IRC (Quit: Connection closed for inactivity) [07:02]
bwn has joined #archiveteam-bs
CoolCanuk has joined #archiveteam-bs
[07:08]
..... (idle for 23mn)
wp494that's what arkiver said before going to sleep [07:33]
.... (idle for 17mn)
***Boppen has quit IRC (Ping timeout: 186 seconds)
ReimuHaku has joined #archiveteam-bs
Boppen has joined #archiveteam-bs
[07:50]
PurpleSymJAA, ola_norsk: The code using Chrome seems to be working fine. Just needs some automated tests and a distributed task queue. I’m looking at Kue and Celery. Suggestions are welcome. [07:55]
***ReimuHaku has quit IRC (Read error: Connection reset by peer)
ReimuHaku has joined #archiveteam-bs
tuluu has quit IRC (Quit: No Ping reply in 180 seconds.)
tuluu has joined #archiveteam-bs
[07:58]
dashcloud has quit IRC (Ping timeout: 260 seconds) [08:12]
wp494_ has joined #archiveteam-bs
dashcloud has joined #archiveteam-bs
wp494 has quit IRC (Ping timeout: 492 seconds)
wp494_ is now known as wp494
[08:18]
...... (idle for 26mn)
jschwart has joined #archiveteam-bs [08:51]
..... (idle for 21mn)
CoolCanuk has quit IRC (Quit: Connection closed for inactivity) [09:12]
..... (idle for 24mn)
Stiletto has quit IRC (Ping timeout: 246 seconds) [09:36]
nyaomi has quit IRC (Ping timeout: 250 seconds) [09:47]
nyaomi has joined #archiveteam-bs [09:57]
Stilett0 has joined #archiveteam-bs [10:06]
..... (idle for 21mn)
pizzaiolo has joined #archiveteam-bs [10:27]
.............. (idle for 1h5mn)
BlueMaxim has quit IRC (Quit: Leaving) [11:32]
SketchCowThe bot is an important part of this. [11:44]
.................. (idle for 1h26mn)
JAASketchCow: How many requests is the wiki getting roughly? I think you mentioned 100k visitors (unique accesses, or requests?) per month some while ago; is that still accurate? If so, that's only one request per 25 seconds...
(On average, obviously.)
[13:10]
Just to avoid misunderstandings: I'm grateful that you're paying and running this! It's just really frustrating to do anything on the wiki currently, and I think we can do better. [13:17]
SketchCowWhat I need to do is sit down and see how the whole thing is structured. Probably with Astrid.
And then have something to figure out what to fix.
I'm in SF all next week, likely then
Ah, here we are.
I have a cpanel login into the machine.
Cpu Usage is 99/100 for some reason.
Memory is 419/1048 so that's fine.
Processes is 10/20 and Disk space is 2.62gb/9gb, also fine.
Bandwidth is at 13gb out of a terabyte a month.
Looks like we shoot out about 100gb a month in bandwidth
[13:19]
odemgLinux Journal is going away https://twitter.com/linuxjournal/status/936679052370481154 get the archive here: https://secure2.linuxjournal.com/pdf/dljdownload.php [13:23]
jrwrhow much are you paying SketchCow [13:23]
SketchCowSomething is blowing up memory usage. [13:23]
jrwrThe wiki is [13:23]
SketchCowYES THE WIKI IS [13:23]
jrwrmediawiki is a memory hog overall [13:23]
SketchCowTHANKS SHERLOCK [13:23]
jrwr:P
LOOK HERE MISTER
[13:23]
SketchCowI can seriously stop talking about this.
I'd rather, I don't know, go outside and sort leaves by color
Or put a nail into my eye
[13:24]
jrwrI'm giving you shit SketchCow, I've ran a few larger mediawikis, I can profile it and see whats going on
Up to you
[13:24]
SketchCowI'm not in the mood for being given shit
I'm mostly in the mood to report on stats people were asking for.
[13:25]
jrwr<3 [13:25]
SketchCowMonthly Statistics for November 2017
Total Hits 4034995
Total Files 3478298
Total Pages 2480175
Total Visits 414542
Total KBytes 102906318
Total Unique Sites 271552
Total Unique URLs 8952
Total Unique Referrers 23884
so, 271,000 unique visitors
[13:25]
jrwrthats not too bad [13:25]
odemg/month? [13:26]
SketchCowthat's november.
Monthly Statistics for December 2017
Total Hits 647812
Total Files 521735
Total Pages 431106
Total Visits 85111
Total KBytes 16705965
Total Unique Sites 43119
Total Unique URLs 2849
Total Unique Referrers 5524
Total Unique User Agents 5849
So far it's been 43,000 this month.
Note total kbytes.
[13:26]
JAAYeah, I agree, that doesn't look too bad.
I don't know much about MediaWiki though.
But 1.5 hits per second on average should be quite easy to achieve.
[13:27]
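Editor's check of the averages quoted above, using only the November 2017 figures pasted in the log. Two things are assumed: a 30-day month, and that "KBytes" can be read as decimal kilobytes (the stats tool may well mean 1024-byte units, which would shift the total slightly).

```python
SECONDS_PER_DAY = 86_400
NOVEMBER_DAYS = 30

# Figures copied from the November 2017 statistics in the log.
total_hits = 4_034_995
total_kbytes = 102_906_318

seconds_in_month = NOVEMBER_DAYS * SECONDS_PER_DAY

# Average load implied by the hit count.
hits_per_second = total_hits / seconds_in_month

# Outbound volume, assuming 1 GB = 1,000,000 KB.
gigabytes = total_kbytes / 1_000_000

# The earlier "100k per month" guess, expressed as seconds between requests.
seconds_per_request_100k = seconds_in_month / 100_000

print(f"{hits_per_second:.2f} hits/s on average")
print(f"{gigabytes:.1f} GB transferred")
print(f"one request every {seconds_per_request_100k:.2f} s at 100k/month")
```

These are monthly averages, so they say nothing about peaks; a burst of crawler traffic could still saturate the CPU while the average stays low.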
SketchCowI'm sure something is set low.
Now, my hosting bill
[13:27]
JAAodemg: Thanks for that, I'll throw it into ArchiveBot. (We're already grabbing the entire website through there as well.) [13:28]
SketchCow$239.40 USD a year. [13:29]
jrwronly thing I can suggest is turning on php7.0+
it will help a ton with mediawiki
[13:29]
odemgSpeaking of stats, CloudFlare are trying to push their enterprise plan on me now: https://imgur.com/a/oljZt [13:30]
SketchCowPHP version: 5.6.30 [13:30]
JAA$20 per month? That seems expensive. But I don't know US pricing too well. [13:30]
jrwrHrm, a dreamhost account is 119 [13:30]
SketchCowAh, see, this is where you are all still children
you think $ + ? = $
[13:30]
jrwrya, if you can turn on php7.0 then it will help a ton [13:31]
SketchCowI have a host here who hosts textfiles.com, as well as you
When I was come after for having bomb info, and they blocked my entire ISP/host for me having bomb info, this host moved me to a new subnet, claimed he'd banned me (so all his other customers weren't affected), and won't blink.
The australian government has called me a terrorist link
He has stood firm
I take loyalty seriously.
So $239/yr is just fine
We've not had anyone go after archiveteam to shut us down recently, but if they do, we have a sandbag levee in place
I'll look into php 7.0
[13:31]
JAASee, that's why I wrote that it "seems" expensive. I wasn't aware of that. [13:33]
SketchCowYou're not supposed to have to be
A lot of delightfully dark forces have things to say to me a lot
Also, I hate admining
I'll choose one of you fucksticks to take it over, but I'd prefer to meet you first
I'm still one of those oldschool "what am I dealing with" people
[13:33]
odemgThis shit https://parazite.the-eye.eu/ gets me many angry emails, not quite been called a terrorist yet though [13:35]
SketchCowThe sex with dogs FAQ on textfiles.com gets a lot of hatemail but they don't know where to send the hate mail
In other news, I went to a dentist for the first time since 2006 and that went well
One cavity
[13:35]
odemgIt's the sex with corpses that does it for us [13:36]
jrwrwell if you are ever up here in NY, come down to interlock, I hang out there all the time, I hear you visited once [13:37]
SketchCowEveryone has their waterloo
Are you fucking down the street from me
I'm in Beacon
[13:37]
odemgI guess we have the dog sex thing as well, I barely know what's included, the site is much less organised and sane than textfiles: https://parazite.the-eye.eu/files.html#notpopsex [13:38]
SketchCowAlso, as a couple of you already know, I'm uploading 102,000 issues of manga.
2,663 issues so far
[13:39]
jrwroh man, thats a drive, but not too bad [13:40]
SketchCowOh, Rochester. [13:41]
jrwrya, im in Rochester [13:41]
SketchCowI always confuse interlock and the poughkeepsie guys [13:41]
jrwrlol
I'm rebuilding their website for interlock since its a little bit in disarray and no one wants to take it up
[13:41]
also SketchCow, I'm still open to anything you need digitized over at strong, wouldn't mind spending my weekends doing that [13:48]
***Valentine has quit IRC (Read error: Operation timed out) [13:50]
SketchCowI'll try and figure out how to get access to the machine via ssh
Sounds like an update should happen, and I should hand admin keys over to someone or another
[13:50]
***Valentine has joined #archiveteam-bs [13:51]
jrwrsounds like a plan, also vid.me should be starting today as well [14:01]
JAAYep, and we should get Roblox going ASAP too. [14:02]
jrwrholy shit they are closing too [14:04]
JAAYeah [14:04]
jrwrhrm,
that will be interesting to save
[14:04]
JAAWe have the code, but wp494 noticed that the old posts (from the sections they "deleted" a few months ago) are still online as well, just not easily discoverable.
If we wanted to grab everything, we'd have to try ~230 million post IDs.
I'm not sure if that's feasible.
[14:05]
jrwrmaybe with a discoery warrior
discovery
[14:05]
JAAThere is no way to discover posts in the hidden sections though. [14:06]
jrwris there anyway to get to them [14:06]
JAAYou can only access it by the post ID as far as I can see. [14:06]
jrwrah
so
discovery would legit be, try block 10000-20000
[14:06]
JAAAnd if we iterate over all possible post IDs, we'd grab everything multiple times, too. [14:06]
jrwrya, some dedupe would be needed [14:07]
JAALet's move this to #robloxd. [14:07]
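Editor's sketch of the "try block 10000-20000" idea: split the ID space into fixed-size work items and see what request rate brute-forcing would need. The ~230 million figure and the six-day window both come from this log; the block size of 10,000, the starting ID of 1 and the function names are illustrative assumptions, not the project's actual design.

```python
def id_blocks(first_id, last_id, block_size):
    """Yield inclusive (start, end) ID ranges covering first_id..last_id."""
    for start in range(first_id, last_id + 1, block_size):
        yield start, min(start + block_size - 1, last_id)


def required_rate(total_ids, days):
    """Requests per second needed to try every ID once within the deadline."""
    return total_ids / (days * 86_400)


TOTAL_IDS = 230_000_000   # "~230 million post IDs" from the log
BLOCK_SIZE = 10_000       # assumed work-item size

blocks = list(id_blocks(1, TOTAL_IDS, BLOCK_SIZE))
print(len(blocks), "work items")
print(blocks[0], blocks[-1])
print(f"{required_rate(TOTAL_IDS, 6):.0f} requests/s to finish in 6 days")
```

The rate assumes one request per ID and no retries. Since several IDs can lead to the same content, a real run would also need to track what has already been fetched, which this sketch leaves out.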
***ZexaronS- has joined #archiveteam-bs
ZexaronS has quit IRC (Read error: Connection reset by peer)
ZexaronS- has quit IRC (Client Quit)
superkuh has quit IRC (Remote host closed the connection)
superkuh has joined #archiveteam-bs
[14:15]
..... (idle for 24mn)
bithippo has joined #archiveteam-bs [14:45]
DrasticAcMy postgres database of parsed Miiverse posts is now sitting at around 200 GBs. Will probably end up being 250~ or so when it's over.
The DB should be done by the end of the week, and the web frontend should hopefully be done by the end of the month.
[14:49]
godaneso i made a linux journal archive 2017 zim file [14:51]
DrasticAcIt should be pretty cool, since I can show all the posts in chronological order for all content, something you couldn't do on the site when it was online. And hopefully people will stop asking where they can find their posts; they can just search for them on this. [14:52]
***MrDignity has quit IRC (Read error: Operation timed out) [14:58]
......... (idle for 40mn)
sec0ndHow does one join the Archiveteam? [15:38]
***kimmer2 has joined #archiveteam-bs
Stilett0 is now known as Stiletto
[15:40]
JAADrasticAc: Sounds great!
sec0nd: By being here and sharing our common goals. There is no membership signup form. ;-)
[15:42]
SketchCowsec0nd: You're already in Archive team [15:54]
godanei'm at 3957 items right now for this month: https://archive.org/details/@chris85?and[]=addeddate:2017-12 [16:04]
***Dimtree has quit IRC (Ping timeout: 506 seconds)
bithippo has quit IRC (Read error: Connection reset by peer)
[16:12]
SketchCowUnstoppable
I need help from someone on an easy way to pull files
Well, let me take one more shot at it.
[16:16]
.... (idle for 18mn)
***pizzaiolo has quit IRC (Ping timeout: 246 seconds)
ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[16:36]
.......... (idle for 47mn)
ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[17:24]
CoolCanuk has joined #archiveteam-bs
ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[17:32]
.... (idle for 18mn)
jschwart has quit IRC (Read error: Connection reset by peer)
jschwart has joined #archiveteam-bs
[17:52]
...... (idle for 26mn)
icedice has joined #archiveteam-bs
icedice has quit IRC (Connection closed)
[18:19]
.... (idle for 16mn)
w0rp has quit IRC (Read error: Operation timed out)
w0rp has joined #archiveteam-bs
[18:35]
.......... (idle for 47mn)
ld1 has quit IRC (Quit: ~) [19:24]
ld1 has joined #archiveteam-bs [19:36]
....... (idle for 30mn)
ld1 has quit IRC (Ping timeout: 260 seconds)
ld1 has joined #archiveteam-bs
[20:06]
...... (idle for 29mn)
wp494oh wow, look whose colleague made the news
https://globalnews.ca/news/3897163/airbnb-hidden-camera-room/
[20:35]
CoolCanukHehe
Global news reports everything
Best free resource to learn Python?
Id love to be able to help code warriors etc one day
[20:35]
***ZexaronS has joined #archiveteam-bs [20:38]
Arctic has joined #archiveteam-bs [20:44]
ArcticWHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD [20:44]
***Arctic has quit IRC (Client Quit)
Arctic has joined #archiveteam-bs
Arctic has left
Arctic has joined #archiveteam-bs
[20:45]
ArcticWHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD [20:48]
LastNinjao.O [20:49]
sorry Arctic - i just realised where that quote comes from! :-) it has been a long time....and i need a cup of tea [20:56]
CoolCanukone moment Arctic. Please stand by
an op will be with you shortly
[20:57]
jrwrWhat do you plan on editing Arctic [20:57]
CoolCanukcheck #archiveteam , jrwr :) [20:57]
***schbirid has joined #archiveteam-bs [20:59]
ArcticI was actually going to propose a project. [21:00]
***Arctic has quit IRC (Quit: Page closed)
ola_norsk has joined #archiveteam-bs
[21:09]
CoolCanukhopefully he comes back :( [21:10]
schbiridlolwat [21:10]
***schbirid has quit IRC (Quit: Leaving)
Mateon1 has quit IRC (Read error: Operation timed out)
Mateon1 has joined #archiveteam-bs
[21:10]
CoolCanukwe cant do much since it's behind cloudflare anyway lol [21:11]
JAAEh, CloudFlare isn't too hard to defeat. [21:15]
CoolCanukohoho
:P
[21:15]
JAAWe have the code, just need to port it to Python so we can integrate it into wpull and do fun things with it. [21:15]
CoolCanukorly [21:15]
JAAYou basically just need to evaluate a JSFuck string. [21:16]
CoolCanukI guess you haven't been introduced to this https://www.cloudflare.com/products/cloudflare-warp/ (:
no direct access to origin that way, only via cloudflare. :(
[21:16]
JAAYes, I'm talking about access via CF. [21:17]
CoolCanuko [21:17]
JAAThere might be other things in place, but the default "Just checking your browser" thing can be cracked easily. [21:17]
CoolCanukif you're suspected as a bot, you might receive a captcha.
your IP can grow to have permanent captchas over all cloudflare sites unless whitelisted.
[21:18]
***BlueMaxim has joined #archiveteam-bs [21:18]
JAAYeah, I've seen those before. I haven't had any issues so far though. [21:19]
CoolCanukI'll set up a cloudflare environment we can test in ;) including warp [21:19]
ola_norskCoolCanuk: I think that is where those 'earn bitcoin by solving captchas'-websites come in :d [21:30]
CoolCanuksomething like 2captcha is an option, yes. [21:30]
ola_norskno offense, but it's kind of nasty business though, to put it mildly ;)
literally often involving child labor..
[21:32]
CoolCanukwe can possibly use https://www.ghacks.net/2017/09/18/wave-goodbye-to-cloudflare-captchas-cloudflare-edge-pass-lands/ . (we'd need to re-code it into wput or something)
data entry is not child labour :P
[21:33]
ola_norskif they can recognize letters, they can do the task of punching them in :/
anyway, my joke got too dark
[21:34]
CoolCanuka little lol
plus I cant find any proof
[21:35]
ola_norskwhat do you think the 'tech support scam centers' do?
while they're having people on the phone i mean..
there's no way it would be a thing unless there was profit in it..and the profit is fucking small
anyway, this is unrelated to what you guys were talking about :D and my joke is now making me queasy :/
..it's an option though, albeit an unethical one ;)
[21:36]
CoolCanuk:P [21:41]
***BlueMaxim has quit IRC (Read error: Connection reset by peer)
Arctic has joined #archiveteam-bs
[21:46]
ArcticI'm here.
So how are we going to do this?
[21:47]
CoolCanukProjects are usually quite large. We might use archivebot to just crawl [21:48]
ArcticOkay.
Is there a way to use that with Android?
[21:48]
CoolCanukwe do it on our side :) [21:48]
***BlueMaxim has joined #archiveteam-bs [21:49]
ArcticOkay. [21:49]
Kazarchivebot might not work too well, especially if there's not an easy way to list all posts [21:51]
ola_norsknoticed today that ~12 hours of a twitter hashtag capture = ~250MB [21:51]
ArcticI'll see if Arian will hand over the data. [21:52]
CoolCanukyou're right. there is infinite scroll.. ugh [21:52]
ola_norskif anyone is good at math, what would that entail, back to ~2014-2016? [21:53]
ArcticIt actually goes back to sometime in September 2017. [21:53]
CoolCanuktwitter usage might be less in 2014 ;) [21:53]
ArcticOh, it's Twitter. [21:53]
ola_norskaye [21:53]
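Editor's rough answer to the math question above, under a strong assumption: that the hashtag produced data at a constant ~250 MB per 12 hours over the whole period. As noted in the conversation, activity in earlier years may have been lower, so this is an upper-bound style estimate rather than a prediction. The start date of 1 January 2014 is picked arbitrarily from the "~2014-2016" range.

```python
from datetime import date

MB_PER_DAY = 250 * 2  # ~250 MB per ~12 hours, as reported in the log


def estimated_gb(start, end, mb_per_day=MB_PER_DAY):
    """Estimate capture size in decimal GB, assuming a constant daily volume."""
    days = (end - start).days
    return days * mb_per_day / 1000


log_date = date(2017, 12, 5)      # date of this log
assumed_start = date(2014, 1, 1)  # arbitrary choice within "~2014-2016"

print((log_date - assumed_start).days, "days")
print(estimated_gb(assumed_start, log_date), "GB")
```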
CoolCanuk2 convos at once. Yours is still important :) [21:54]
ArcticOkay.
I was just confused.
[21:54]
ola_norskforget about damn twitter, it's just bullshit there anyway [21:54]
JAAIt seems that you can access the posts through https://closed.pizza/posts/$postid, and $postid is only ~180k by now. [21:55]
ArcticWhat's the Twitter thing about anyway?
I'll try to find user id
[21:55]
ola_norskCoolCanuk: btw, for infinite scroll https://github.com/webrecorder/webrecorder [21:56]
ArcticCommunities are easy to find.
Where are you finding this?
[21:56]
ola_norskCoolCanuk: sadly their only service are limited to 3 hours of scroll [21:56]
CoolCanuka lot of manual work lol [21:57]
ola_norskno, press "Auto scroll", and it goes
oh
[21:57]
ArcticHow can we use the $postid URL? It doesn't show all posts. [21:57]
CoolCanukhttps://closed.pizza/posts/1 https://closed.pizza/posts/2 [21:57]
JAAArctic: Not literally $postid, but https://closed.pizza/posts/1, https://closed.pizza/posts/2, etc. [21:58]
CoolCanuk(pretend 2 works) [21:58]
ArcticOh, I see now. [21:58]
JAAIt might not grab all comments to a post though. [21:58]
CoolCanukcan use offset
https://closed.pizza/communities/7?offset=12 or https://closed.pizza/communities/7?offset=50
[21:59]
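Editor's sketch of the two URL patterns mentioned above: sequential post IDs and community listings paged by `offset`. The URL shapes are taken from the log; the page size used to step the offset is a pure guess (the log only shows `offset=12` and `offset=50`), and nothing here is fetched.

```python
POST_URL = "https://closed.pizza/posts/{post_id}"
COMMUNITY_URL = "https://closed.pizza/communities/{community_id}?offset={offset}"


def post_urls(max_post_id):
    """Generate one URL per post ID, from 1 up to max_post_id inclusive."""
    return (POST_URL.format(post_id=i) for i in range(1, max_post_id + 1))


def community_page_urls(community_id, page_size, max_offset):
    """Generate listing URLs for one community; page_size is an assumed step."""
    return [
        COMMUNITY_URL.format(community_id=community_id, offset=offset)
        for offset in range(0, max_offset + 1, page_size)
    ]


print(list(post_urls(2)))
print(community_page_urls(7, 25, 50))
```

At roughly 180k post IDs the full list is small enough to hand to a crawler as a plain URL file, though, as noted above, that alone might not capture every comment on a post.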
ArcticCan you access closed.pizza/users/$userid ? [22:00]
JAAAnyway, this is probably done best in a warrior project rather than ArchiveBot or wpull.
But we're kind of busy in that area right now due to vid.me, Roblox, CompuServe, and Wine.Woot shutting down.
[22:00]
ArcticIs there a way to use Warrior with Android or within a web browser?
Roblox is shutting down?
[22:00]
JAASo unless this platform is in immediate danger, I'd say we postpone it until we have more time and resources.
Only the forums, but yeah.
230 million posts or something.
[22:01]
ArcticWoah.
Okay.
[22:01]
ola_norskArctic: if that Android device has the ability to run virtual machines, it might [22:02]
CoolCanukdont forget sears @JAA :P [22:02]
ArcticI don't believe so. [22:02]
astridJAA: which of these are warrior projects? compuswerve at the least is an archivebot job [22:02]
ArcticSears is shutting down? [22:03]
CoolCanukhttps://sears.ca Canada only [22:03]
JAACoolCanuk: Well yeah, but I don't see what we can do in terms of warrior projects there. The ArchiveBot job is doing what it can. [22:03]
CoolCanuk* https://www.sears.ca/ [22:03]
ola_norskArctic: don't be so quick to dismiss though.. [22:03]
JAAastrid: Right. I'm still not convinced that the ArchiveBot job will capture everything though. Wine.Woot is also only ArchiveBot at the moment, but I'd like to make a project out of it because that job will likely not grab everything either. [22:04]
ArcticI won't. Once again, I'll contact Arian soon to see if he'll hand over an archive of Closedverse data. [22:04]
ola_norskArctic: https://github.com/limboemu/limbo [22:04]
astridJAA: i looked into it the other day and convinced myself that archivebot *would* get all of compuswerve
=> #compuswerve for further of this
[22:04]
ola_norskArctic: maybe that is something? [22:04]
ArcticIs there a way to run programs in it?
If so, it'd be a godsend.
[22:05]
ola_norskit's qemu, that's all i saw..so...yes...i think.. :D
cant help any more than that :D
[22:06]
ArcticOkay. Thanks! [22:06]
ola_norski'm not saying it's a good idea though :/ [22:07]
ArcticHm...
I'll see how well it works.
[22:07]
JAAYeah, probably not a good idea. [22:09]
ArcticYou're probably right. [22:10]
JAANot much storage (some projects can have quite large item sizes), very energy-inefficient I'd guess, etc. [22:10]
ola_norskit would depend on the device, wouldn't it? But, a phone or a tablet.. that's gonna hurt :/
if anyone has a spare Ouya box, maybe they could test it?
i'm putting my chips on it not being a good idea though
[22:11]
JAAAh true, Android is used everywhere nowadays. I always forget that. [22:14]
ArcticI could probably see if I can split the archive and put it up for download.
I'll try Webrecorder
.
[22:15]
ola_norskMaybe this is a solution..a shitload of cheap, otherwise discarded Ouya Boxes, placed all around the world. [22:19]
ArcticWe'll call it... OuyaNet... Or something. [22:20]
ola_norskaye.. There's playstation supercomputers, so why not? https://phys.org/news/2010-12-air-playstation-3s-supercomputer.html [22:21]
ArcticSo, Ouya supercomputer? [22:21]
ola_norskcapture-puter
all running Warrior..
[22:22]
ArcticA peer to peer network of PCs all running Warrior and forming a supernetwork to store archives? [22:24]
astridthere are some plans afoot to revamp archivebot to allow it to run on warrior-style disposable untrusted machines [22:24]
ArcticInteresting...
The more we go on with this, the more confusing it gets!
[22:25]
astridit's a lot of engineering effort and people are spread pretty thin [22:25]
ArcticIt's still an interesting proposal. Maybe I can ask around on the Lost Media Wiki. They are interested in rediscovering lost media and archiving media before it becomes lost.
We'll need a GUI frontend for each computer and code for the backend to connect each computer for the supernetwork.
[22:27]
ola_norskDHT? [22:29]
ArcticWhat's that? [22:29]
ola_norskArctic: https://en.wikipedia.org/wiki/Distributed_hash_table
it's e.g how torrent clients find each other without trackers (i think)
[22:30]
ArcticAh. Seems promising. [22:31]
astridprobably fancier than we need [22:31]
ArcticYeah. [22:31]
JAAWe'll probably still want something central to be able to keep track of what's running etc. [22:32]
astridthere's nothing wrong with running a central node
makes design a lot simpler
[22:32]
ArcticYep. [22:32]
JAAIndeed [22:32]
ola_norskall we need is some geniuses to put it into action! https://youtu.be/EPHPu4PV-Bw
Decentralized warriors...how would that work?
[22:33]
astridlet's not waste time on that right now [22:34]
ola_norskoptional tasks? + other tasks? :D
aye
[22:34]
ArcticBut where would we find some?...
The Genius Bar maybe...
Archiving Closedverse?
[22:34]
JAAFor now, we need to focus on Roblox (6 days) and Vidme (10 days). [22:36]
ArcticOkay. What can I do? [22:36]
JAACompuServe might work through ArchiveBot, though I'm not sure yet if we'll manage to get everything in time.
Not much at the moment I'm afraid.
You can join the relevant channels, #robloxd and #vidmeh, if you want to follow the progress and discussions.
[22:36]
ArcticSure. I might soon. [22:37]
wp494FWIW the vidme project is supposed to start up at some point today [22:39]
ArcticWould using Archive.org to archive work? [22:39]
JAAWe are pushing our data to IA, but we're doing the actual archival directly. [22:40]
wp494that's where our stuff will end up anyway! [22:40]
ola_norskwp494: that's good news seeing as i paid $1 dollar to boost this an hour ago :D https://www.minds.com/newsfeed/784890241298210818
perfect timing, to say the least...hope someone notices it
[22:40]
ArcticI would love to try to port Warrior to Android, but I'm still learning HTML and CSS. [22:42]
JAABefore we port the warrior to new platforms, I'd say we need to upgrade it to a modern system (and ensure that it stays like that). [22:43]
ola_norskAndroid is pretty much Linux is it not? But, i'm thinking even the majority of tools used would require rooted Android device [22:43]
wp494Warrior on ARM would be pretty interesting but cellular carriers love to screw with stuff and restrict stuff too [22:43]
JAAI believe the VM is still running on an unpatched six-year-old system.
As in, unpatched kernel and everything.
[22:43]
ArcticI have a 4th generation Kindle Fire HD 7. Would that work? Would you need the OS version? [22:44]
ola_norskif it can run a virtual machine, it could do it..But, not a good idea i would say [22:45]
ArcticAh.
Is there anything else I can do?
[22:46]
ola_norski'm not running Warrior, i can't answer that question. I just manually archive shit :D [22:47]
***dboard2 is now known as dboard [22:47]
ola_norskin 'who we are' on archiveteam.org .. I'd be classified as 'Loudmouth' :d [22:48]
ArcticHow do you manually archive stuff? [22:48]
ola_norske.g web.archive.org/save/ [22:48]
ArcticOh. [22:49]
ola_norskor uploading pdf's or using e.g youtube-dl
..or even just subtitling shit..
[22:49]
ArcticI know this sounds stupid, but can you provide an example of the subtitling? [22:50]
ola_norskaye..one sec [22:50]
ArcticOkay, thanks. [22:51]
ola_norskArctic: https://archive.org/details/Filmavisen_1941_08_25
and no i did not steal the video, it belongs to norwegian people, and that includes me :d
[22:51]
ArcticOh, you mean subtitling videos! [22:52]
ola_norskyeah
SRT files
[22:52]
ArcticWhy are the Roblox forums shutting down? [22:54]
ola_norskthese are not even mine, sometimes i ask whoever i know that speaks the language to translate https://github.com/DuckHP/archive_org_subtitles/tree/master/Misc [22:54]
JAAArctic: https://forum.roblox.com/Forum/ShowPost.aspx?PostID=228429979 [22:55]
ola_norskArctic: i use this program to make the subtitles http://www.gnomesubtitles.org/ [22:56]
ArcticCool.
So Roblox mods are re-building the forums?
Re-build as in massacre, right?
[22:57]
JAAMore like adding different features and thus nuking the forums.
The groups aren't publicly accessible, by the way.
You need at least an account.
[22:57]
ArcticWelp, as if Roblox wasn't already fucked. [22:58]
ola_norskArctic: when coming across people willing to help transcribing, this is valuable tool http://otranscribe.com/ [22:58]
ArcticSounds great!
By the way, Archive.is ignores ROBOTS.TXT so we can archive websites Archive.org can't.
[22:59]
JAAWe also ignore robots.txt in all our efforts. [23:03]
ArcticThis is how I archived a lot of old Nintendo of Europe Flash promos in collaboration with members of the Lost Media Forums. [23:03]
JAAThere's a page on it on the wiki. [23:04]
ArcticI know, it's juat easier to archive that way. [23:04]
JAA(If you get it to load.) [23:04]
ArcticI read the page.
*just
[23:04]
JAAIt's unfortunate that IA follows robots.txt, but there are good reasons for it I'm sure. [23:05]
ola_norskwget -q --limit-rate=256k --user-agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/62.0.3202.89 Chrome/62.0.3202.89 Safari/537.36" --delete-after --page-requisites -e robots=off "https://web.archive.org/save/https://twitter.com/hashtag/netneutrality?f=tweets" [23:05]
JAA(Mostly legal reasons, I think.) [23:05]
ola_norskis it not the '-e robots=off' that prevents that? [23:05]
ArcticNot sure. [23:05]
JAAYes, but in this case it's about ignoring IA's robots.txt. [23:06]
CoolCanukmhm [23:06]
JAA(Which prevents you from saving the page requisites with this technique.) [23:06]
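A minimal sketch of the /save/ technique from the wget line above, written in Python. The helper name `build_save_url` is made up for illustration; only the `https://web.archive.org/save/` prefix and the example target come from the conversation. Requesting the resulting URL with any HTTP client asks the Wayback Machine to capture the page itself; as noted above, the page requisites are not reliably saved this way.

```python
# Prefix taken from the wget command in the log.
SAVE_PREFIX = "https://web.archive.org/save/"


def build_save_url(target_url: str) -> str:
    """Return the /save/ URL for target_url (hypothetical helper, not an official API)."""
    if not target_url.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) URL")
    # The target URL is appended verbatim, exactly as in the wget example.
    return SAVE_PREFIX + target_url


if __name__ == "__main__":
    print(build_save_url("https://twitter.com/hashtag/netneutrality?f=tweets"))
```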
ArcticIf you'd like, I can send you the link to the post about Nintendo of Europe's old flash promos if you want to see them. [23:06]
JAAThere are several people in here dealing with those kinds of things, so sure. [23:07]
ArcticOkay. Be right back. [23:07]
ola_norskarchive.is seems kind of selfish .. http://web.archive.org/web/20171205230717/http://archive.is/ [23:07]
JAAWhat's most annoying re: robots.txt are the domain squatters. [23:08]
ola_norskoh, cloudflare [23:08]
ArcticWhat do you mean by archive.is seeming selfish? [23:09]
JAAParking domains often have a generic "disallow all" robots.txt, and that retroactively blocks all pages under that domain on IA... [23:09]
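To illustrate what a generic "disallow all" robots.txt means for a crawler that honours it, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text, the `example-crawler` user agent and the example.com URL are placeholders, not anything taken from a real parked domain or from IA's crawler.

```python
from urllib.robotparser import RobotFileParser

# A typical "disallow all" file, as a parking page might serve (assumed content).
PARKED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Every path on the domain is blocked for a crawler that follows the file.
    print(is_allowed(PARKED_ROBOTS_TXT, "example-crawler", "http://example.com/old/page.html"))
```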
ola_norsktrying to /save/ an archive.is fails [23:09]
***qw3rty119 has joined #archiveteam-bs [23:10]
ola_norskArctic: ideally, they should be uploading their shit willingly.. [23:10]
JAAYeah, I hope when archive.is shuts down, the owner will be open to donating the collection to IA directly or something.
Even though it's in a very different format, so integrating it into the Wayback Machine might prove challenging.
[23:10]
ArcticHere is the Nintendo flash promo link: http://forums.lostmediawiki.com/thread/1267/nintendo-flash-games-ca-2006
What do you guys think?
[23:11]
astridwhat in the world are you talking about [23:13]
ola_norskArctic: http://web.archive.org/web/20170311070231/http://archive.is/ [23:13]
astridthere are entirely too many conversations here
and the loudest one wins, which makes me uncomfortable
[23:13]
ola_norskastrid: i think it's just 2-3 topics going on [23:14]
***qw3rty118 has quit IRC (Read error: Operation timed out) [23:14]
astridyes and it's all super eager people who don't understand how archiveteam works but really want to change things
it's overwhelming
[23:14]
ola_norskrobots.txt , and related to archive.is' selfishness..and some other thing [23:14]
ArcticHow did we go from archiving Closedverse to how Archive.is is selfish? [23:15]
ola_norskrobots.txt and cloudflare [23:15]
ArcticSorry, I'm new to all of this. [23:15]
astridwe know :) [23:15]
ola_norskso am i, sorry for spamming [23:16]
ArcticSo... What now? [23:17]
ola_norskArctic: if archive.is could be done by wget..
or webrecorder.io
[23:18]
ArcticSounds good. [23:20]
JAAola_norsk: I'm sure that's possible. But be aware that archive.is is several hundred TB in size. Webrecorder doesn't actually store the stuff on their servers (long-term, I mean) as far as I know. [23:20]
astridoh i just realized why it feels like everything is being lit on fire this month [23:21]
ola_norskJAA: so do so many use archive.is? :/
JAA: why*
[23:21]
astridit's because it's the end of the calendar year and lots of more or less unprofitable shit is being lit on fire for tax reasons [23:21]
JAAYeah [23:21]
ArcticFor some reason, a lot of websites are shutting down on December 15th. [23:22]
astridit's before all the employees leave for christmas [23:22]
ola_norskchristmas bonuses are expensive [23:22]
astridand it's a friday [23:22]
JAAola_norsk: Because archive.is can grab pages behind robots.txt, handles Google Cache very well, is capable of dealing with JavaScript, and various other reasons.
It's a quite good platform really. Just a shame that it isn't open source and that it isn't saved as WARCs (as far as we know).
[23:22]
astridso far as we can tell, it saves the DOM of the page, not the resources used to construct it [23:23]
ola_norskJAA: check out webrecorder.io .. I'm pretty sure i saw 'java, flash' in one of the browsers they offer to use :/ [23:23]
ArcticAOL is shutting down Plaxo by the way. [23:23]
JAAYep, that's probably correct.
ola_norsk: I'm aware of Webrecorder. I was just explaining why so many people use archive.is.
[23:23]
ArcticI mean Comcast. [23:24]
astridArctic: hmm, thanks for the heads-up. does that still have any public content? [23:24]
JAAWebrecorder doesn't offer keeping the saved pages for others to see, so that's a big -1 for many people. [23:24]
ola_norskJAA: when registered they do..but yeah, i see that point
JAA: and no pdf download i think, only the WARC
[23:25]
ArcticWhat's WARC? [23:25]
ola_norskArctic: https://en.wikipedia.org/wiki/Web_ARChive [23:27]
JAAAlso http://archiveteam.org/index.php?title=The_WARC_Ecosystem [23:27]
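For a rough idea of what is inside a WARC file: it is a sequence of records, each with a version line, a few named header fields, a blank line, the content block, and two trailing CRLFs. The sketch below hand-builds a single `resource` record to show that layout. It is an illustration only, not the tooling ArchiveTeam uses, and the function name and example values are made up.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "text/plain") -> bytes:
    """Build one WARC/1.0 'resource' record by hand (illustrative helper)."""
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        # Content-Length counts the bytes of the content block only.
        f"Content-Length: {len(payload)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # A record ends with two CRLFs after the content block.
    return head + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_warc_resource_record("http://example.com/hello.txt", b"hello")
    print(record.decode("utf-8"))
```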
ArcticSounds good. [23:28]
ola_norskwhy doesn't archiveteam.org utilize redirection to wayback when failing to load? [23:28]
***jschwart has quit IRC (Konversation terminated!) [23:29]
ArcticNot sure.
Is there a way to use WARC files on Android?
[23:29]
JAAola_norsk: Why would it? The information there would be outdated for many pages anyway. [23:30]
CoolCanukI have a channel similar to this on my satellite tv. wtf? http://teletext.mb21.co.uk/gallery/ceefax/bbc-world-210a-032000.gif [23:31]
JAAAnd stuff will be done to improve the performance of the wiki. [23:31]
ArcticCool. [23:31]
JAATeletext, nice. [23:31]
CoolCanukya [23:31]
JAAI wonder if there are any archives of that. [23:31]
ola_norskJAA: it does present sensible text. I'm not sure how often the wiki changes. But good point. [23:32]
CoolCanukis it actual text that comes to the tv?
this appears to be just images
[23:32]
JAAhttps://en.wikipedia.org/wiki/Teletext
Short story: it's complicated.
[23:32]
CoolCanukit looks ecactly like this http://1.bp.blogspot.com/-MqF4BKkQ_mE/VOvxolBzuZI/AAAAAAAAGWI/bFWfZAFsOBw/s1600/retrotextnews.jpg
*exactly
[23:32]
ArcticInteresting. [23:33]
DrasticAcI just read the convo in #archiveteam about Closedverse. Would be sad for the Miiverse community replacement to go down so soon. [23:34]
***Arctic_ has joined #archiveteam-bs [23:34]
Arctic_I'm back.
Accidentally closed the tab with the channel.
[23:34]
CoolCanukno problem :) [23:35]
Arctic_Is it possible to use WARC files on Android devices without a virtual machine? [23:36]
ola_norskArctic: webrecorder.io does allow upload of WARC files. It might work in any browser of Android devices. [23:36]
Arctic_How do I make WARC files on Android? [23:37]
ola_norskwebrecorder.io would be my best bet :D [23:37]
JAAYeah, likely. [23:37]
***Arctic has quit IRC (Ping timeout: 260 seconds) [23:37]
JAABasically every software we use is written for real computers. [23:37]
Arctic_Real computers? [23:38]
JAA;-) [23:38]
ola_norsk...JAA being a devicivist.... [23:38]
JAAMostly unixoid systems, really. Some stuff might also work on Windows to some degree. [23:38]
ola_norsk:D [23:38]
Arctic_Welp... [23:39]
ola_norskthere's debian for Wii...just sayin' :D [23:39]
JAA...butwhy.gif [23:40]
Arctic_I don't have an SD card to use with my Wii U. RIP. [23:40]
ola_norskwhy not? :D [23:40]
DrasticAcThe stuff I wrote for the Miiverse stuff was in .NET, run on Windows, but it should work in Mono. [23:40]
ola_norskJAA: why not?
Arctic_: i meant 'old wii'
[23:41]
JAASure, I guess. [23:41]
Arctic_There is a Wii mode on the Wii U that most Wii hacks work with. [23:41]
CoolCanukdebian write speed on wii and wii u are too slow [23:42]
ola_norskanyway, it's not a good idea even if it works..https://archive.org/details/iaCSS64_test .. 'two layer emulation' as someone here pointed out [23:42]
JAAI just don't see why you'd want to install a generic OS on a platform specifically designed for gaming, but whatever. [23:42]
Arctic_Any options for Android? [23:42]
JAA(Other than to figure out how to do it.) [23:43]
Arctic_I'm trying to figure that out. [23:43]
JAAI'm sure you can build something that kind-of works on Android.
Whether that's worth the time and effort is another question.
And whether you'll want to use it afterwards.
[23:44]
Arctic_True. [23:44]
DrasticAcWhy do you want to? [23:44]
ola_norskArctic_: i'm unsure what you mean by options. There are certainly options, but if it's a Kindle device, i'd say it's a bad idea to run a virtual machine on it that's running Warrior [23:44]
JAAYou could always rent a small VPS, use some SSH app to connect to it, and then do whatever you want there. [23:45]
Arctic_I just want to find a way to help archiving efforts with my shitty Android device. [23:46]
JAAI doubt that's possible without a *lot* of work. [23:46]
ola_norskJAA: that's bollocks.. [23:46]
JAAYeah, you can archive stuff, but not in the context of ArchiveTeam really. [23:47]
CoolCanukas I'm sure you've all seen, Archive.org currently has someone matching donations 3 to 1. Which, I think is 3x or $3 per $1 [23:47]
astridso you insert $1 and IA gets $4 [23:47]
Arctic_Is there a way to bulk archive pages on Android? Is there a website I can use? [23:48]
CoolCanukso 4x? :O [23:48]
astrid1, +3x [23:48]
ola_norskJAA: what about Manual projects ? [23:48]
JAAastrid: But it says "triple your impact"? [23:48]
astridwell maybe i'm misunderstanding [23:48]
ola_norskArctic_: yes, webrecorder.io [23:48]
Arctic_Okay, thanks! [23:49]
ola_norskArctic_: you'd have to register to share the captures though [23:49]
JAAola_norsk: Do you mean running the warrior scripts manually or manually doing independent stuff? [23:49]
ola_norskthe latter i guess [23:50]
Arctic_I registered earlier. [23:50]
ola_norskJAA: i'm not in any way being offensive or aggressive, but do i seriously have to be running Warrior software to be here in "BS" ? [23:52]
Arctic_Not sure. [23:53]
JAANot at all. [23:53]
ola_norskskål! :D [23:53]
DrasticAcMiiverse didn’t use Warrior.
Well, we tried to at one point, but it started slowing down/stopping their servers and they started banning ips
But our manual jobs went through fine.
[23:54]
ola_norski'm mainly here secretly watching for ways to record ENTIRE twitter hashtags :/ [23:55]
DrasticAcBut our tooling was done on Windows and Linux.
And I had a bunch of Azure VMs running with the stuff I wrote.
[23:55]
JAAola_norsk: That's something I'd like to look into at some point. Also entire user accounts. Also Instagram and other services. [23:56]
ola_norskJAA: i'm just focused on '#netneutrality' at the moment, seeing as it's a big deal atm [23:57]
JAAastrid: On /donate, it explicitly says "That means for every dollar you donate right now, the Internet Archive will receive $4 in all." [23:57]
Arctic_I'm going to go contact Arian about sending us a Closedverse archive. [23:57]
JAABut yeah, it should be "quadruple your impact". [23:57]
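The arithmetic behind the 3-to-1 match, as a tiny sketch: the donor gives an amount, the matcher adds three times that, so the total is four times the donation. The function name and the default ratio parameter are illustrative only.

```python
def total_received(donation: float, match_ratio: int = 3) -> float:
    """Total the recipient gets when each donated dollar is matched match_ratio to 1."""
    # Donor's own gift plus the matched amount: d + 3d = 4d for a 3-to-1 match.
    return donation + match_ratio * donation


if __name__ == "__main__":
    print(total_received(1))  # $1 donated, $3 matched
```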
ola_norskJAA: there's gonna be so much digging into 'netneutrality' , whatever way the voting goes, i figured it shouldn't just be a privilege of organizations that's got money to pay twitter for archive that should get to look into it [23:59]
astridthumbs up emoji, ola_norsk [23:59]
