Time |
Nickname |
Message |
00:10
π
|
|
Stiletto has quit IRC (Read error: Connection reset by peer) |
00:11
π
|
|
Stiletto has joined #archiveteam-bs |
00:15
π
|
|
Muad-Dib has quit IRC (Ping timeout: 260 seconds) |
00:17
π
|
|
Muad-Dib has joined #archiveteam-bs |
00:23
π
|
|
Stiletto has quit IRC (Read error: Connection reset by peer) |
00:23
π
|
|
Stiletto has joined #archiveteam-bs |
00:24
π
|
|
Stiletto has quit IRC (Read error: Connection reset by peer) |
00:24
π
|
|
Chorca has quit IRC (Read error: Operation timed out) |
00:25
π
|
|
Stiletto has joined #archiveteam-bs |
00:29
π
|
|
Chorca has joined #archiveteam-bs |
00:43
π
|
|
Stiletto has quit IRC (Read error: Connection reset by peer) |
00:44
π
|
|
Stiletto has joined #archiveteam-bs |
01:07
π
|
kyan |
I really want Terastash to be out. I'd like to hack it to use IA as a backend for rolling archives |
01:07
π
|
HCross |
kyan, you can send custom UA's |
01:07
π
|
kyan |
Wait, what? |
01:08
π
|
HCross |
--user-agent-alias firefox |
01:08
π
|
kyan |
OH! about the earlier comment |
01:08
π
|
HCross |
--user-agent-alias=firefox |
01:08
π
|
kyan |
Yes, I did that |
01:08
π
|
kyan |
thanks :) |
01:09
π
|
kyan |
I thought that was a reply to the one I just made, and was like wut |
01:23
π
|
|
Stiletto has quit IRC (Ping timeout: 246 seconds) |
01:37
π
|
|
SN4T14 has joined #archiveteam-bs |
01:52
π
|
|
Stiletto has joined #archiveteam-bs |
01:53
π
|
|
toad2 has joined #archiveteam-bs |
01:56
π
|
|
toad1 has quit IRC (Read error: Operation timed out) |
02:10
π
|
|
vitzli has joined #archiveteam-bs |
02:46
π
|
|
Frogging1 is now known as Frogging |
03:26
π
|
|
altlabel has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
i0npulse has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
PotcFdk has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
limebyte has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
coretx has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
pikhq has quit IRC (hub.dk irc.homelien.no) |
03:26
π
|
|
PurpleSym has quit IRC (hub.dk irc.homelien.no) |
04:02
π
|
|
achip has quit IRC (hub.efnet.us irc.Prison.NET) |
04:18
π
|
snape_ |
Heh, just found a zip file of nine text files from a BBS in 1994, none of which seem to be on textfiles.com, or anywhere online for that matter. :3 |
04:22
π
|
kyan |
Upload upload upload |
04:22
π
|
SketchCow |
Let's be clear. |
04:22
π
|
kyan |
:D |
04:22
π
|
SketchCow |
Internet Archive does not "slow things down" because we're "running out of space" |
04:23
π
|
SketchCow |
And I guarantee that when we get to the edge, we'll see another couple petabytes pop up |
04:24
π
|
kyan |
Some CC licensed media apparently got darked today |
04:25
π
|
kyan |
The Internet Archive is wonderful, but sometimes they do things that really seem at odds with their mission |
04:26
π
|
SketchCow |
You literally fucking have an IA guy here |
04:26
π
|
SketchCow |
Let's try and use that fact to get actual information |
04:26
π
|
|
Stiletto has quit IRC (Remote host closed the connection) |
04:26
π
|
SketchCow |
Instead of acting like you're guessing what the climate of Neptune is |
04:26
π
|
kyan |
That is why I mention it :) |
04:26
π
|
|
Stiletto has joined #archiveteam-bs |
04:26
π
|
SketchCow |
You did it wrong. |
04:26
π
|
SketchCow |
23:25 < kyan> The Internet Archive is wonderful, but sometimes they do things that really seem at odds with their mission |
04:26
π
|
SketchCow |
See that? |
04:27
π
|
kyan |
Yep, I think I have a pretty good point |
04:27
π
|
SketchCow |
That's how you piss me off. |
04:27
π
|
kyan |
Ok, so? I think I have valid concerns. If you don't like it, feel free to ban me from the channel |
04:29
π
|
SketchCow |
Oh, no doubt they are valid. |
04:29
π
|
SketchCow |
It's just the ridiculous conspiratorial way you're putting it. It's pathetic. |
04:29
π
|
SketchCow |
Something out of nothing. |
04:30
π
|
kyan |
Conspiratorial? I'm hardly conspiring with anyone |
04:30
π
|
SketchCow |
"They do things" |
04:30
π
|
kyan |
(or accusing anyone of conspiring) |
04:30
π
|
|
Stiletto has quit IRC (Remote host closed the connection) |
04:30
π
|
kyan |
Yeah, IA sometimes makes decisions that I strongly disagree with |
04:30
π
|
kyan |
That's not nothing. It's a difference of viewpoints. |
04:31
π
|
|
Stiletto has joined #archiveteam-bs |
04:31
π
|
SketchCow |
Sigh. |
04:31
π
|
snape_ |
What was the media? Was it disturbing jihadi shit? I'm betting it was disturbing jihadi shit. |
04:31
π
|
kyan |
It was some guy playing EVE Online |
04:31
π
|
|
achip has joined #archiveteam-bs |
04:31
π
|
godane |
SketchCow: youtubearchive collection is blocked |
04:31
π
|
SketchCow |
Give me the item name, retard. |
04:31
π
|
SketchCow |
godane: Which? |
04:32
π
|
SketchCow |
Blocked which way |
04:32
π
|
kyan |
It was part of that collection, I assume. Heard about it from Fletcher |
04:32
π
|
Fletcher |
"We're contacting you as a courtesy to let you know that the items in the collection youtubearchive have been removed and your account locked. The uploaded items appear to not adhere to the archive.org terms of use (https://archive.org/about/terms.php)." emailed four hours ago |
04:33
π
|
Fletcher |
http://archive.org/details/youtubearchive |
04:33
π
|
snape_ |
http://69.30.218.174/EDENASC.ZIP <- Oct 1994 BBS text files |
04:34
π
|
Fletcher |
SketchCow, the CC items referenced above were in that collection and included the tag "SeamusDonohueEVE" (for easy searching) |
04:35
π
|
SketchCow |
OK, so, news flash |
04:35
π
|
SketchCow |
There are basically 2 people who do this work. |
04:35
π
|
SketchCow |
They are called Jeff and Chris. |
04:35
π
|
SketchCow |
Chris is on vacation. |
04:35
π
|
SketchCow |
So Jeff. |
04:35
π
|
SketchCow |
JEFF> |
04:35
π
|
kyan |
but it seems like part of a more systemic issue: 1. Bookplates suggested as a way to track donated books - they damage the books, and labeling the boxes would be better. 2. No way to search darked items to see if something's been archived yet. 3. No way to access non-public WARCs available by Wayback, even on an individual request basis. 4. No clear response to nicely worded requests for explanation / official policy on removal of Dabiq magazine. |
04:35
π
|
kyan |
5. Spam is darked, rather than noindexed and made browsable. |
04:35
π
|
SketchCow |
OK, kyan, you moron? Not a staff, not a council. |
04:35
π
|
SketchCow |
It's a guy. |
04:35
π
|
SketchCow |
He listens. |
04:36
π
|
SketchCow |
See, I'm doing this for Fletcher. I'm done talking to you. |
04:36
π
|
SketchCow |
Fletcher: I'm sure Jeff got worried and knocked it out in total. |
04:36
π
|
kyan |
Ok, bye for now then. However, I stand by my concerns. |
04:36
π
|
SketchCow |
Go to the hell the other hells are afraid of. |
04:36
π
|
SketchCow |
I'll wait eagerly for your own |
04:37
π
|
Fletcher |
SketchCow, is there a set process for getting my account unbanned? |
04:37
π
|
SketchCow |
Writing to info@archive.org to discuss the issue. |
04:37
π
|
SketchCow |
But I'm also in there. I'm going to ask Jeff. |
04:38
π
|
Fletcher |
Thanks |
04:38
π
|
Fletcher |
On a side note, could you make sure Jeff knows he's sending emails as "collections-service@archive.org"? |
04:38
π
|
|
Chorca has quit IRC (Ping timeout: 252 seconds) |
04:38
π
|
SketchCow |
That's the accurate name. |
04:39
π
|
Fletcher |
kk |
04:39
π
|
SketchCow |
23:35 < kyan> 5. Spam is darked, rather than noindexed and made browsable. |
04:39
π
|
SketchCow |
Holy shit, what a moron |
04:39
π
|
kyan |
O RLY I think I've got a good point? |
04:39
π
|
kyan |
If spam is filtered out of browsing and search results, what's wrong with having it there? |
04:40
π
|
SketchCow |
You think the spammers who use IA bandwidth to put up porn and movies in our environment won't just use those links on their spam sites so they have free bandwidth AND they don't have to host, AND we are the ones who get banned? |
04:40
π
|
SketchCow |
Idiot. |
04:40
π
|
SketchCow |
Take a day off. |
04:40
π
|
|
SketchCow sets mode: +b *!*kyan@184.75.223.* |
04:40
π
|
|
kyan was kicked by SketchCow (kyan) |
04:40
π
|
|
Chorca has joined #archiveteam-bs |
04:40
π
|
SketchCow |
Idiot. |
04:41
π
|
SketchCow |
15,021 items in youtubearchive all blocked. |
04:41
π
|
SketchCow |
Darked, really. Interesting |
04:41
π
|
SketchCow |
(Blocked isn't the right term; blocked is) |
04:42
π
|
SketchCow |
Sorry |
04:42
π
|
SketchCow |
So angry I'm swapping words. |
04:42
π
|
SketchCow |
(Blocked isn't the right term; darked is) |
04:42
π
|
SketchCow |
I'm assuming Jeff got a message about something mirrored on youtubearchive |
04:43
π
|
SketchCow |
I pinged him, but it's Sunday and we have tomorrow off. |
04:43
π
|
SketchCow |
He could be anywhere |
04:43
π
|
SketchCow |
Might not be resolved until Tuesday |
04:44
π
|
MrRadar |
Thank you for your work SketchCow |
04:44
π
|
Fletcher |
No problem, that's still a decent response time in the long run |
04:44
π
|
SketchCow |
Shhh, shh. I'm being unprofessional |
04:46
π
|
SketchCow |
I'm assuming we got a threat and Jeff did a non-surgical strike |
04:46
π
|
godane |
anyways i'm uploading 2008-07 of kpfa |
04:46
π
|
SketchCow |
Thanks. |
04:46
π
|
godane |
we have half of 2008 of kpfa done |
04:47
π
|
vitzli |
Fletcher, I only PMed you, no any further emails/PMs |
04:47
π
|
Fletcher |
vitzli? |
04:47
π
|
Fletcher |
oh right, no problem :) |
04:47
π
|
godane |
in other news i got a 128gb USB for $20 at staples |
04:48
π
|
SketchCow |
Yeah, they're getting nuts |
04:48
π
|
SketchCow |
https://twitter.com/kolubat/status/699091432741675008 |
04:48
π
|
SketchCow |
Double idiot |
04:48
π
|
SketchCow |
A cry in the wilderness with 7 followers |
04:48
π
|
Fletcher |
:/ |
04:48
π
|
MrRadar |
Wow |
04:49
π
|
godane |
wow exactly |
04:49
π
|
godane |
i thought what he was talking about was just in here |
04:50
π
|
SketchCow |
I have a special trigger for when someone has something like 9% of the information, quickly fills in the other 91% and just goes off on their situation. |
04:51
π
|
SketchCow |
Blocked him. |
04:52
π
|
SketchCow |
Archive Team: Don't Be Dumb |
04:53
π
|
SketchCow |
I'm positive that some letter came in because with 15,000 videos of youtube rips, someone got pissed. |
04:53
π
|
SketchCow |
And Jeff overreached because it's Sunday and Chris is on vacation |
04:54
π
|
Fletcher |
Given that it included The Fine Bros, Gametrailers and IGN I'm not really surprised |
04:54
π
|
vitzli |
I bet it's React folks |
04:54
π
|
SketchCow |
Right. |
04:54
π
|
SketchCow |
So, Jeff did a weekend, HOLIDAY WEEKEND staunch-the-bleed |
04:54
π
|
SketchCow |
Because unlike some moron and his army of 7, lawyers |
04:55
π
|
godane |
luckily my stuff didn't get blocked yet: https://archive.org/details/godaneinbox?and[]=subject%3A%22TheFineBros%22 |
04:56
π
|
SketchCow |
So close the store first, then go back and make sure just the brand of soup is not on the shelves. |
04:56
π
|
SketchCow |
They know you're mine, godane |
04:56
π
|
SketchCow |
They'll come to me |
04:56
π
|
godane |
ok |
04:56
π
|
MrRadar |
SketchCow: while you might not always be "professional" you do keep this group from falling victim to the geek social fallacies (http://www.plausiblydeniable.com/opinion/gsf.html) |
04:57
π
|
SketchCow |
Well aware of that document |
04:57
π
|
* |
SketchCow kicks out cat piss guy once a month |
04:58
π
|
snape_ |
Sad butthurt saved for posterity: https://web.archive.org/web/20160215044951/https:/twitter.com/kolubat/status/699091432741675008 |
04:59
π
|
SketchCow |
Ridiculous. |
05:01
π
|
vitzli |
Is there a proper way of doing an index/audit for a collection? I grabbed the list and counted all users in the telenor grab |
05:02
π
|
Fletcher |
SketchCow, for slack should I send an email through or is irc sufficient? |
05:02
π
|
SketchCow |
It's still up in the air |
05:03
π
|
SketchCow |
We definitely should do more. |
05:03
π
|
SketchCow |
I'm working with a guy to do a mediawiki-archive.org bridge |
05:03
π
|
SketchCow |
So it pulls in all the metadata from a collection, we edit it, and then it goes in. |
05:05
π
|
vitzli |
and since I said about it - there are 12151 users (~username) that have 200 OK page |
05:05
π
|
SketchCow |
So a bunch of people can do a group edit, and then boom. |
05:06
π
|
vitzli |
would it be possible to add collections? |
05:07
π
|
SketchCow |
You still have to ask. |
05:07
π
|
vitzli |
like peer-reviewed or something |
05:09
π
|
SketchCow |
Fletcher: It's because they're all React videos |
05:09
π
|
SketchCow |
(Or many) |
05:11
π
|
SketchCow |
And Jeff doesn't have time to sort right now |
05:11
π
|
Fletcher |
around 1400/15000, would it be possible to just dark those items and I'll work out another solution for archiving dmca magnets? |
05:11
π
|
SketchCow |
It's more this. |
05:11
π
|
SketchCow |
We need you and people to: |
05:11
π
|
SketchCow |
1. Find the items that are clearly CC |
05:12
π
|
SketchCow |
2. Apparently for the moment we are darking Youtube videos that are mirrors of actual hosted materials up |
05:14
π
|
|
Swizzle has quit IRC (Read error: Operation timed out) |
05:38
π
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
05:45
π
|
|
Sk1d has joined #archiveteam-bs |
05:54
π
|
SketchCow |
ANYWAY |
05:55
π
|
SketchCow |
Two quick ranty things, dovetailing into a conversation someone wanted me to have. |
05:56
π
|
SketchCow |
First, Archive Team is bigger than archive.org - archive.org is a grateful and excellent vendor and available place for archiveteam's output. But it's also going to have limits. |
05:57
π
|
SketchCow |
It is not archive.org's job to go everywhere, everyplace, in every radical direction and be considered shitlords for not doing that. |
05:58
π
|
SketchCow |
Second, the example of archiving /r/gonewild was brought up. |
06:00
π
|
SketchCow |
Within the context of what I think we've done, I do realize gonewild is a public forum and a public posting, and so therefore, archiving it is likely to happen, and we certainly have developed an amazing suite of tools to archive everything, agnostically, and quickly. |
06:00
π
|
SketchCow |
There's a chance archive.org will refuse to archive it. |
06:01
π
|
SketchCow |
And there's a chance that archive team members will all say "fuck it, not our project" |
06:01
π
|
SketchCow |
And I think it's up to whoever thinks their thing is a thing to then take tools and use them, but realize they might not have the full backup of the whole team and very likely might have to go pay the $60 and store it themselves. |
06:02
π
|
SketchCow |
Make sense? Thoughts? |
06:04
π
|
MrRadar |
That definitely makes sense to me. |
06:05
π
|
snape_ |
Makes sense to me, but my (personal) preference is for essentially curated collections of limited scope rather than unwieldy hoards, so... |
06:09
π
|
|
oldcad has quit IRC (Quit: Leaving.) |
06:11
π
|
snape_ |
SketchCow, if you don't mind, could you elaborate on the Archive darking material still on Youtube? Is it automated, or...? |
06:26
π
|
|
i0npulse has joined #archiveteam-bs |
06:26
π
|
|
altlabel has joined #archiveteam-bs |
06:26
π
|
|
PotcFdk has joined #archiveteam-bs |
06:26
π
|
|
limebyte has joined #archiveteam-bs |
06:26
π
|
|
coretx has joined #archiveteam-bs |
06:26
π
|
|
pikhq has joined #archiveteam-bs |
06:26
π
|
|
PurpleSym has joined #archiveteam-bs |
06:26
π
|
|
irc.homelien.no sets mode: +o PurpleSym |
06:30
π
|
SketchCow |
Not much to elaborate - clearly marked "don't distribute" - harder |
06:52
π
|
|
JW_work2 has joined #archiveteam-bs |
07:13
π
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
07:15
π
|
|
jut has joined #archiveteam-bs |
08:09
π
|
Fletcher |
rsync is flying along now \o/ |
08:38
π
|
yipdw |
~8.5 TB of docstoc from archiveteam.kenshin.sg uploaded |
08:38
π
|
yipdw |
the speed on this machine is fantastic, ~630 Mbit/s sustained |
08:38
π
|
yipdw |
up to IA that is |
08:47
π
|
|
signius has quit IRC (Ping timeout: 300 seconds) |
08:49
π
|
|
antomatic has quit IRC (Read error: Connection reset by peer) |
08:50
π
|
|
antomatic has joined #archiveteam-bs |
08:51
π
|
vitzli |
HCross, viola-beach from soundcloud was archived to https://archive.org/details/soundcloud-indp-viola-beach |
09:00
π
|
|
signius has joined #archiveteam-bs |
09:28
π
|
|
schbirid has joined #archiveteam-bs |
09:46
π
|
arkiver |
<SketchCow>I'm working with a guy to do a mediawiki-archive.org bridge |
09:46
π
|
arkiver |
Will this be something like our wikiteam project? |
09:46
π
|
arkiver |
Because the WARC-part of the wikiteam project will save mediawiki to WARCs |
09:46
π
|
arkiver |
and external links from mediawikis to WARCs |
09:53
π
|
|
PurpleSym sets mode: +o arkiver |
10:25
π
|
|
lytv has quit IRC (Read error: Operation timed out) |
10:26
π
|
|
lytv has joined #archiveteam-bs |
10:31
π
|
SmileyG |
is there anymore to load for fotolog? |
11:11
π
|
|
VADemon has joined #archiveteam-bs |
11:16
π
|
|
Swizzle has joined #archiveteam-bs |
11:33
π
|
|
Swizzle has quit IRC (Read error: Operation timed out) |
11:43
π
|
|
i0npulse has quit IRC (leaving) |
11:47
π
|
|
i0npulse has joined #archiveteam-bs |
12:26
π
|
|
arkiver3 has joined #archiveteam-bs |
12:44
π
|
|
Rickster has quit IRC (Ping timeout: 260 seconds) |
12:44
π
|
|
marvinw has quit IRC (Ping timeout: 260 seconds) |
12:46
π
|
|
Kenshin has quit IRC (Read error: Connection reset by peer) |
12:46
π
|
|
Kenshin has joined #archiveteam-bs |
12:46
π
|
|
Famicoman has quit IRC (Ping timeout: 260 seconds) |
12:47
π
|
|
goekesmi has quit IRC (Ping timeout: 260 seconds) |
12:47
π
|
|
goekesmi has joined #archiveteam-bs |
12:55
π
|
|
Rickster has joined #archiveteam-bs |
13:00
π
|
|
marvinw has joined #archiveteam-bs |
13:33
π
|
Fletcher |
arkiver, the mediawiki bridge will allow group editing of metadata through the mediawiki interface that is then mirrored on IA |
13:34
π
|
|
VADemon has quit IRC (Read error: Operation timed out) |
13:36
π
|
|
Famicoman has joined #archiveteam-bs |
13:47
π
|
|
arkiver3 has quit IRC (Ping timeout: 252 seconds) |
13:51
π
|
|
arkiver3 has joined #archiveteam-bs |
13:53
π
|
arkiver3 |
Fletcher: I see, so it's totally different from the WikiTeam WARC project |
13:54
π
|
arkiver3 |
I still haven't started the grab of the actual mediawikis yet |
13:54
π
|
arkiver3 |
Needs a bit of more testing, but it should be almost ready to go |
13:54
π
|
Fletcher |
yeah, it's just a way to get around the not fantastic IA user/item management |
13:55
π
|
phuzion |
arkiver3: I've got a question about that videobot you were talking about. Is it supposed to be more archivebot or newsbuddy? Meaning is it intended for on-demand archival or regular recurring archival? |
13:56
π
|
arkiver3 |
phuzion: currently I'm thinking more archivebot |
13:56
π
|
arkiver3 |
But I'd like to add an option to automatically scrape channels periodically |
13:57
π
|
arkiver3 |
Basically the project will have special scripts for the supported websites to ensure as good playback as possible and the extraction of as much metadata as possible. |
13:58
π
|
HCross |
arkiver3, weren't you thinking of adding Selenium or something to newsbuddy? Might that also work here? |
14:01
π
|
phuzion |
arkiver3: I only ask because podcasts would be great to add to it. |
14:01
π
|
phuzion |
Throw it an RSS and say "Check it every sunday" or whenever the podcast updates, and have the podcast automatically pushed to the proper collection and everything. |
14:01
π
|
HCross |
I know this is a long shot, what about Apple podcasts |
14:02
π
|
phuzion |
What about them? |
14:02
π
|
HCross |
could we get them? |
14:02
π
|
phuzion |
If they can be imported into a podcast program, then theoretically yeah |
14:02
π
|
phuzion |
Podcasts are basically RSS files pointing to audio files. |
14:03
π
|
phuzion |
http://feeds.twit.tv/twit.xml TWiT for example |
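Editorial sketch of phuzion's point that a podcast is basically an RSS file pointing to audio files: each RSS `<item>` can carry an `<enclosure>` element whose `url` attribute is the audio file. The feed below is an invented sample (not the TWiT feed), and the function name `list_enclosures` is an assumption made for illustration, not part of videobot or any Archive Team tool.

```python
import xml.etree.ElementTree as ET

# Invented sample feed for illustration; real feeds carry many more fields.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp3" length="123" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/ep2.mp3" length="456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>
"""


def list_enclosures(feed_xml):
    """Return (episode title, audio URL) pairs for items that have an enclosure."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            # Items without an enclosure have no audio file to fetch.
            continue
        title = item.findtext("title", default="(untitled)")
        episodes.append((title, enclosure.get("url")))
    return episodes


if __name__ == "__main__":
    for title, url in list_enclosures(SAMPLE_FEED):
        print(title, url)
```

A recurring job of the kind phuzion describes would re-fetch the feed on a schedule and download only the enclosure URLs it has not seen before.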
14:04
π
|
HCross |
ah, it's possible but requires someone to use iTunes to grab it http://superuser.com/questions/78415/get-rss-feed-from-itunes-podcast-links/782413 |
14:05
π
|
phuzion |
Oh wow |
14:05
π
|
HCross |
It's Apple. They love doing stuff like this |
14:05
π
|
phuzion |
Didn't know Apple had a proprietary format for podcasts |
14:05
π
|
phuzion |
which is stupid |
14:22
π
|
|
arkiver3 has quit IRC (Ping timeout: 252 seconds) |
14:49
π
|
|
Boltsie__ has joined #archiveteam-bs |
14:50
π
|
|
Boltsie__ is now known as Boltsie |
14:55
π
|
|
VADemon has joined #archiveteam-bs |
14:57
π
|
|
arkiver3 has joined #archiveteam-bs |
15:17
π
|
|
arkiver3 has quit IRC (Ping timeout: 252 seconds) |
15:29
π
|
|
RichardG has quit IRC (Read error: Operation timed out) |
15:48
π
|
|
GLaDOS has quit IRC (Read error: Operation timed out) |
15:49
π
|
|
ndiddy has joined #archiveteam-bs |
15:56
π
|
|
RichardG has joined #archiveteam-bs |
16:14
π
|
|
GLaDOS has joined #archiveteam-bs |
16:23
π
|
|
VADemon has quit IRC (Quit: left4dead) |
16:26
π
|
SketchCow |
arkiver: This is not a wikimedia to archive.org bridge like "save a wiki". This is being able to group-edit archive.org descriptions and metadata. |
16:26
π
|
SketchCow |
I see this was answered. Sorry, just got up. |
17:16
π
|
arkiver |
Thanks |
17:22
π
|
arkiver |
HCross: phuzion: podcasts is also something we can add to videobot |
17:22
π
|
phuzion |
ok cool |
17:22
π
|
arkiver |
And also add the option to upload the files as audio item to IA |
17:23
π
|
arkiver |
Other things might be support for some photo sites |
17:25
π
|
vitzli |
gif/webm? |
17:26
π
|
arkiver |
sure |
17:26
π
|
arkiver |
Basically videobot will be a bot with specially written support for websites to do a grab as good as possible |
17:26
π
|
arkiver |
SketchCow, more info on videobot ^ |
17:27
π
|
arkiver |
Where archivebot is a more general archiving bot, videobot would not support all websites, but the websites it does support will be grabbed better using videobot than archivebot |
17:29
π
|
phuzion |
arkiver: Would it be possible to have archivebot intelligently forward requests to videobot when it knows that videobot can handle it better? |
17:30
π
|
arkiver |
yeah, that can be added |
17:30
π
|
phuzion |
For example, if someone does !ao on a youtube channel, have archivebot be like "Hey, that would be a great job for videobot. Forwarding the request for your convenience" |
17:31
π
|
arkiver |
yes, but that would also need some change on the side of archivebot |
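Editorial sketch of the forwarding idea discussed above: decide by hostname whether a request should go to the specialised bot or to the general one. This is only an illustration under stated assumptions. The host list `VIDEOBOT_HOSTS` and the function `choose_bot` are hypothetical and do not come from ArchiveBot or videobot.

```python
from urllib.parse import urlparse

# Hypothetical list of sites with dedicated support; the log names no such list.
VIDEOBOT_HOSTS = {"youtube.com", "vimeo.com"}


def choose_bot(url):
    """Return "videobot" for hosts with dedicated support, else "archivebot"."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return "videobot" if host in VIDEOBOT_HOSTS else "archivebot"


if __name__ == "__main__":
    print(choose_bot("https://www.youtube.com/user/example"))
    print(choose_bot("http://example.com/"))
```

As arkiver notes, the real change would also have to happen on the archivebot side, since that is where the incoming command is first seen.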
17:31
π
|
|
espes__ has quit IRC (Read error: Operation timed out) |
17:37
π
|
Fletcher |
Given the sometimes sporadic support for youtube-dl in archivebot it may be a good idea. |
18:02
π
|
|
vitzli has quit IRC (Leaving) |
18:06
π
|
|
Swizzle has joined #archiveteam-bs |
18:07
π
|
|
i0npulse has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
altlabel has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
PotcFdk has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
limebyte has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
coretx has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
pikhq has quit IRC (hub.dk irc.homelien.no) |
18:07
π
|
|
PurpleSym has quit IRC (hub.dk irc.homelien.no) |
18:38
π
|
|
i0npulse has joined #archiveteam-bs |
18:38
π
|
|
PurpleSym has joined #archiveteam-bs |
18:38
π
|
|
altlabel has joined #archiveteam-bs |
18:38
π
|
|
PotcFdk has joined #archiveteam-bs |
18:38
π
|
|
limebyte has joined #archiveteam-bs |
18:38
π
|
|
coretx has joined #archiveteam-bs |
18:38
π
|
|
pikhq has joined #archiveteam-bs |
18:56
π
|
|
wyatt8740 has joined #archiveteam-bs |
19:01
π
|
|
JW_work2 has quit IRC (Leaving.) |
19:25
π
|
|
zino has joined #archiveteam-bs |
19:31
π
|
zino |
I should probably move to a less analogue method of handling the data on my home servers... |
19:31
π
|
Frogging |
arkiver: So, how does archiveteam work? Who downloads files, and where do they get stored, and who submits them to IA? |
19:31
π
|
zino |
https://goo.gl/photos/2A56FZk448zWhGmL6 |
19:31
π
|
Frogging |
And how is it ensured that people aren't redundantly downloading the same things |
19:32
π
|
schbirid |
it's all REALLY WELL ORGANIZED |
19:33
π
|
Frogging |
Is it? :p |
19:34
π
|
Fletcher |
Frogging, smaller sites/individual pages are handled by ArchiveBot (#archivebot) which outsources jobs to volunteer pipelines. Those pipelines will get a copy of the content in WARC format and upload it to a staging server where it's then loaded into the wayback machine for general consumption |
19:35
π
|
snape_ |
It's the BEST ORGANIZED independent autonomous decentralized worldwide volunteer anarcho-syndicalist commune of archivist downloaders on the web today! |
19:35
π
|
|
metalcamp has joined #archiveteam-bs |
19:36
π
|
Fletcher |
Large projects are handled by a central tracker that hands out individual items to anyone running the ArchiveTeam Warrior (vm that runs wpull) and people running the scripts directly |
19:36
π
|
|
RichardG has quit IRC (Read error: Operation timed out) |
19:37
π
|
Fletcher |
From there content is uploaded to a staging server where it's prepared for uploading into IA, combining individual items into 50G warcs and performing any other tasks to make the content easily viewable |
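Editorial sketch relating to Frogging's question about redundant downloads: a central tracker can avoid handing the same item to two workers by moving each item from a to-do queue into a claimed table when it is handed out. This is a minimal in-memory illustration, not the real Archive Team tracker or its protocol. The class `Tracker`, its method names, and the item names are all hypothetical.

```python
from collections import deque


class Tracker:
    """Toy work tracker: each item is handed to at most one worker."""

    def __init__(self, items):
        self.todo = deque(items)   # items nobody has claimed yet
        self.claimed = {}          # item -> worker currently holding it
        self.done = set()          # items reported as finished

    def request_item(self, worker):
        """Hand out the next unclaimed item, or None if the queue is empty."""
        if not self.todo:
            return None
        item = self.todo.popleft()
        self.claimed[item] = worker
        return item

    def mark_done(self, worker, item):
        """Record completion; only the worker holding the claim may report it."""
        if self.claimed.get(item) != worker:
            raise ValueError(f"{item!r} is not claimed by {worker!r}")
        del self.claimed[item]
        self.done.add(item)


if __name__ == "__main__":
    tracker = Tracker(["user:alice", "user:bob"])
    first = tracker.request_item("warrior-1")
    second = tracker.request_item("warrior-2")
    print(first, second, tracker.request_item("warrior-3"))
```

A production tracker would also need to requeue items whose worker disappears; that is left out here to keep the sketch short.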
19:37
π
|
joepie91 |
snape_: out of three! |
19:37
π
|
joepie91 |
:P |
19:37
π
|
zino |
There are 3? Do tell. |
19:38
π
|
joepie91 |
bibanon does a similar thing, there's another one I forgot the name of |
19:38
π
|
* |
zino Googles bibanon |
19:38
π
|
schbirid |
http://www.stephenfry.com/2016/02/15/peedinthepool/ |
19:38
π
|
* |
arkiver googles bibanon |
19:38
π
|
Frogging |
I see |
19:39
π
|
arkiver |
bibanon is mostly 4chan archiving it seems |
19:44
π
|
|
GLaDOS has quit IRC (Ping timeout: 260 seconds) |
19:45
π
|
joepie91 |
they do a bunch of stuff, also run an archivebot instance |
19:45
π
|
joepie91 |
as in |
19:46
π
|
joepie91 |
tracker |
19:46
π
|
joepie91 |
not quite archiveteam scale afaik |
19:46
π
|
xmc |
ah nice |
19:46
π
|
arkiver |
where is that located? |
19:46
π
|
joepie91 |
but they exist nevertheless :P |
19:46
π
|
arkiver |
ArchiveTeam still biggest :) |
19:46
π
|
xmc |
always nice to hear that quality things are getting reused |
19:46
π
|
joepie91 |
arkiver: best ask on their IRC |
19:50
π
|
schbirid |
http://nugnugnug.com/pc/master/ |
19:52
π
|
|
Laverne has joined #archiveteam-bs |
19:57
π
|
arkiver |
I see Fletcher just joined too |
19:57
π
|
Fletcher |
:D |
20:00
π
|
Fletcher |
"Archive Team is another rogue archiving organisation, one that's bigger than us. DO NOT bring up our organisation in their channels or in discussions with them unless you discuss it with Antonizoon or Dan_ on the IRC channel beforehand." |
20:00
π
|
arkiver |
wut |
20:00
π
|
arkiver |
where did you read that |
20:01
π
|
Fletcher |
the rules in their topic |
20:02
π
|
arkiver |
interesting |
20:02
π
|
arkiver |
I see a dan here |
20:04
π
|
ersi |
derpanon |
20:08
π
|
joepie91 |
Fletcher: arkiver: there's a reason for that |
20:08
π
|
joepie91 |
had some issues here in the past with some of their users |
20:08
π
|
arkiver |
only dec3199? |
20:08
π
|
joepie91 |
another one whose name I forgot |
20:08
π
|
joepie91 |
that has been resolved |
20:09
π
|
joepie91 |
but it's been decided to keep cross-pollination to a minimum to avoid future drama :P |
20:09
π
|
joepie91 |
which is probably a wise decision |
20:09
π
|
joepie91 |
so yeah, that topic is just a drama-prevention measure |
20:09
π
|
snape_ |
Do not poke the autistic weaboos, check. |
20:10
π
|
* |
ersi giggles |
20:10
π
|
arkiver |
I only know about dec3199 |
20:11
π
|
arkiver |
as for the rest, I might have missed something |
20:12
π
|
schbirid |
"organisation" http://images.memes.com/character/meme/dr-evil |
20:13
π
|
snape_ |
"organization" sounds better than "gang", you have to admit... |
20:14
π
|
godane |
i'm downloading all of the glennbeck facebook videos |
20:14
π
|
Frogging |
Archive Gang |
20:18
π
|
snape_ |
We could pretend to be a 3l33t scene group. 4RCHiV3T34M |
20:19
π
|
midas |
arkiver: dec was...wow |
20:19
π
|
zino |
AI is the new exclusive topsite. |
20:30
π
|
|
metalcamp has quit IRC (Ping timeout: 492 seconds) |
20:30
π
|
|
espes__ has joined #archiveteam-bs |
20:37
π
|
ersi |
midas: mindblowingly silly, to be mild |
20:41
π
|
godane |
i'm uploading 2008-08 of kpfa mp3s |
21:25
π
|
|
jut has quit IRC (Read error: Connection reset by peer) |
21:29
π
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
21:34
π
|
|
RichardG has joined #archiveteam-bs |
21:36
π
|
|
schbirid has quit IRC (Quit: Leaving) |
21:46
π
|
|
RichardG has quit IRC (Ping timeout: 633 seconds) |
22:06
π
|
dan- |
yeah pretty much drama prevention stuff |
22:07
π
|
dan- |
as you guys probably know, users can be pretty annoyingly insistent regarding 4chan archiving ;) |
22:14
π
|
dan- |
usually best if they yell at us first, before running to you guys |
22:16
π
|
|
RichardG has joined #archiveteam-bs |
22:23
π
|
|
RichardG has quit IRC (Ping timeout: 360 seconds) |
22:32
π
|
|
Lord_Nigh has quit IRC (Ping timeout: 252 seconds) |
22:33
π
|
joepie91 |
dan-: ohai :P |
22:33
π
|
dan- |
joepie91: heyo! |
22:35
π
|
|
superkuh has quit IRC (Ping timeout: 252 seconds) |
22:37
π
|
|
Lord_Nigh has joined #archiveteam-bs |
22:39
π
|
|
superkuh has joined #archiveteam-bs |
23:28
π
|
|
mismatch has quit IRC (Remote host closed the connection) |
23:28
π
|
|
mismatch has joined #archiveteam-bs |
23:53
π
|
godane |
btw I'm getting a copy of Movie Magic tv series from 1994 |