Time |
Nickname |
Message |
00:57
🔗
|
kyan |
I'm still seeing nothing for abklex.de (doing a google search for "abklex" site:archive.org) |
00:58
🔗
|
kyan |
It was archivebotted a week or so ago. I didn't see it anywhere in the archivebot collection items from around when I thought it would have finished, either |
00:58
🔗
|
kyan |
(also: am I still banned from #archivebot? That would probably be a better place for these questions) |
01:05
🔗
|
|
kyan has quit IRC (Quit: Leaving) |
01:06
🔗
|
|
kyan has joined #archiveteam-bs |
01:07
🔗
|
chfoo |
kyan: you got banned? |
01:08
🔗
|
kyan |
chfoo: yes don't remember the circumstances exactly but if I recall correctly, I tried to give someone ops following a tutorial online that suggested using mode flags, typoed, and fscked up the channel modes |
01:08
🔗
|
kyan |
decided not to have ops in other peoples' channels any more :P |
01:10
🔗
|
chfoo |
kyan: but you were in archivebot recently though |
01:11
🔗
|
kyan |
chfoo: a few seconds ago by accident |
01:11
🔗
|
chfoo |
kyan: so you're not banned :) |
01:11
🔗
|
kyan |
wrong server - was trying to join #archivebot on localhost |
01:11
🔗
|
kyan |
oh? it would prevent me from joining if i was? |
01:12
🔗
|
kyan |
(this = why I don't accept ops anymore :P) |
01:13
🔗
|
chfoo |
kyan: yeah, that's basically what a ban is; it won't let you in. we have voice now so people with voice can use it now without needing ops. |
01:14
🔗
|
chfoo |
also archivebot suffered some unfortunate accident so things are being redone |
01:14
🔗
|
kyan |
chfoo, I see. Thanks! I'll join then and ask about that archive since I'm worried about it :) |
01:14
🔗
|
kyan |
Aah thats probably what ate it then |
01:14
🔗
|
aaaaaaaaa |
kyan: yours was requeued, so it will show up eventually. |
01:14
🔗
|
kyan |
I've got my own archivebot instance now so I'll probably just do it myself |
01:15
🔗
|
kyan |
aaaaaaaaa, ah, thanks :) I guess i won't then |
01:15
🔗
|
danneh_ |
ah, was that when fos died? |
01:16
🔗
|
chfoo |
no, it was human error |
01:18
🔗
|
danneh_ |
fair enough |
01:20
🔗
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
01:23
🔗
|
|
brayden__ has joined #archiveteam-bs |
01:24
🔗
|
chfoo |
let's just say uploading files into localhost into the same directory is not a good idea |
01:31
🔗
|
|
brayden_ has quit IRC (Read error: Operation timed out) |
01:37
🔗
|
|
Coderjoe has joined #archiveteam-bs |
02:02
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
02:02
🔗
|
|
norbert79 has quit IRC (Read error: Operation timed out) |
02:03
🔗
|
|
schbirid2 has joined #archiveteam-bs |
02:04
🔗
|
|
primus104 has quit IRC (Leaving.) |
02:08
🔗
|
|
norbert79 has joined #archiveteam-bs |
02:32
🔗
|
|
ex-parro1 has joined #archiveteam-bs |
02:52
🔗
|
godane |
i'm downloading those warc.gz that are randomly in my cbsnews.com video files |
02:53
🔗
|
godane |
i'm going to put them in another item so i can then delete them out of my cbsnews.com-video items |
02:54
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
02:59
🔗
|
|
zenguy_pc has joined #archiveteam-bs |
03:00
🔗
|
godane |
SketchCow: do who ximm@archive.org is? |
03:00
🔗
|
godane |
*do you know who |
03:02
🔗
|
godane |
SketchCow: anyways he uploaded warc.gz files to my cbsnews.com-video items |
03:03
🔗
|
godane |
and i'm pissed off about it |
03:04
🔗
|
godane |
the way the items should be named in cbsnews.com collection he shouldn't have added them to my item names |
03:07
🔗
|
|
schbirid2 has joined #archiveteam-bs |
03:24
🔗
|
kyan |
godane, I would assume: http://monoskop.org/Aaron_Ximm |
03:28
🔗
|
godane |
ok |
03:32
🔗
|
godane |
i would guess that these warc.gz files are not even in wayback machine |
03:34
🔗
|
godane |
anyways i'm downloading the warc.gz files to put into one item |
03:34
🔗
|
godane |
in archiveteam-fire collection maybe |
03:35
🔗
|
godane |
also this way i can clean my items he started adding warc.gz to |
03:36
🔗
|
godane |
i'm on 2004-02-28 and i'm still finding warc.gz in them |
03:47
🔗
|
godane |
i'm on 2014-03-20 and still finding warc.gz files |
03:47
🔗
|
godane |
:-/ |
03:48
🔗
|
godane |
i took some screen shots of these pages and history logs |
03:48
🔗
|
godane |
just in case anyone asks for proof |
03:55
🔗
|
|
Sellyme has quit IRC (Read error: Connection reset by peer) |
04:09
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
04:41
🔗
|
|
ex-parro1 has quit IRC (Leaving.) |
04:49
🔗
|
|
mistym has joined #archiveteam-bs |
04:52
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
05:34
🔗
|
|
APerti has joined #archiveteam-bs |
05:50
🔗
|
|
APerti has quit IRC () |
05:53
🔗
|
|
Famicoman has quit IRC (Read error: Connection reset by peer) |
06:05
🔗
|
|
godane has quit IRC (Ping timeout: 492 seconds) |
06:15
🔗
|
|
godane has joined #archiveteam-bs |
06:33
🔗
|
godane |
SketchCow: i think aaron ximm added warc.gz files to a lot of my items now |
06:34
🔗
|
godane |
fuck me |
06:34
🔗
|
godane |
https://archive.org/details/cbsnews.com-video-2003-11-30 |
06:35
🔗
|
godane |
web archives are there too |
06:50
🔗
|
joepie91 |
can whoever alerted me to the bitcasa fiasco initially, PM me? I forgot who you were... it's kind of important |
07:04
🔗
|
joepie91 |
ohh there we go |
07:04
🔗
|
joepie91 |
damnit, not here anymore |
07:07
🔗
|
joepie91 |
k, found them |
07:10
🔗
|
|
midas has quit IRC (Read error: Operation timed out) |
07:11
🔗
|
|
midas has joined #archiveteam-bs |
07:12
🔗
|
|
espes__ has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
07:12
🔗
|
|
Mayonaise has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
07:12
🔗
|
|
jk[SVP] has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
07:12
🔗
|
|
closure has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
07:12
🔗
|
|
joepie91 has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
07:15
🔗
|
|
espes__ has joined #archiveteam-bs |
07:15
🔗
|
|
Mayonaise has joined #archiveteam-bs |
07:15
🔗
|
|
jk[SVP] has joined #archiveteam-bs |
07:15
🔗
|
|
closure has joined #archiveteam-bs |
07:15
🔗
|
|
joepie91 has joined #archiveteam-bs |
07:15
🔗
|
|
irc.teksavvy.ca sets mode: +oo closure joepie91 |
07:38
🔗
|
|
primus104 has joined #archiveteam-bs |
08:01
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
08:20
🔗
|
|
xtr-107 has joined #archiveteam-bs |
08:26
🔗
|
|
xtr-201 has quit IRC (Read error: Operation timed out) |
08:49
🔗
|
|
primus104 has quit IRC (Leaving.) |
09:44
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:38
🔗
|
godane |
i'm over 50k items |
10:39
🔗
|
godane |
in my godaneinbox collection |
10:53
🔗
|
|
Sellyme has joined #archiveteam-bs |
10:55
🔗
|
godane |
so i have found about 11gb of warc.gz files so far in my cbsnews.com-videos collection |
10:59
🔗
|
midas |
damn |
11:01
🔗
|
midas |
1d19h36s | ... Percent Done: 5.2% Peers: ^ 0 kB/s to 0, v 227 kB/s from 1, of 1 (Ratio: 0.0) (0s idle) |
11:11
🔗
|
godane |
i'm now at 12gb |
11:12
🔗
|
godane |
:-/ |
11:12
🔗
|
midas |
still from the same guy godane ? |
11:12
🔗
|
godane |
i think so |
11:12
🔗
|
godane |
its all cbsnews.com web archives too |
11:12
🔗
|
godane |
he created that collection by the looks of things |
11:13
🔗
|
schbirid2 |
no reply from him yet? |
11:13
🔗
|
godane |
no reply yet |
11:14
🔗
|
godane |
i still don't see how he could have made this type of mistake |
11:14
🔗
|
godane |
just based on the item names it should have been different |
11:15
🔗
|
|
primus104 has joined #archiveteam-bs |
11:15
🔗
|
godane |
his collection: https://archive.org/details/cbsnews.com |
11:15
🔗
|
midas |
and based on the collectionname too |
11:15
🔗
|
godane |
https://archive.org/details/cbsnews.com-20140324-205801 |
11:16
🔗
|
godane |
thats an example of an item name |
11:16
🔗
|
godane |
i shouldn't have had video or dash in the dates |
11:16
🔗
|
schbirid2 |
still, just assume good faith and wait for his reply |
11:16
🔗
|
schbirid2 |
no use worrying and speculating :) |
11:17
🔗
|
godane |
i will have to reupload it anyway |
11:17
🔗
|
midas |
why? |
11:17
🔗
|
godane |
its not in the web collection |
11:17
🔗
|
godane |
for starters |
11:18
🔗
|
godane |
also i want to 'clean' the warc.gz out of my video tiems |
11:18
🔗
|
godane |
*items |
11:18
🔗
|
midas |
yeah but they (archive.org) can move the warc files from your collection to the correct collection |
11:18
🔗
|
midas |
atleast, i assume they should be able to do that |
11:18
🔗
|
godane |
i think they would move the full item |
11:19
🔗
|
godane |
not just warc.gz but everything in the item |
11:19
🔗
|
godane |
thats why this has to be done |
11:20
🔗
|
midas |
lets wait for the guy to reply or SketchCow to reply. |
11:20
🔗
|
godane |
i'm just downloading the warc.gz for now |
11:21
🔗
|
tfgbd |
Is there a section for OEM/bundled stuff? |
11:31
🔗
|
|
robink has quit IRC (ny.us.hub west.us.hub) |
11:31
🔗
|
|
torvik has quit IRC (ny.us.hub west.us.hub) |
11:31
🔗
|
|
lysobit has quit IRC (ny.us.hub west.us.hub) |
11:31
🔗
|
|
amerrykan has quit IRC (ny.us.hub west.us.hub) |
11:31
🔗
|
|
Baljem_ has quit IRC (ny.us.hub west.us.hub) |
11:31
🔗
|
|
cloudmons has quit IRC (ny.us.hub west.us.hub) |
12:02
🔗
|
|
robink has joined #archiveteam-bs |
12:02
🔗
|
|
cloudmons has joined #archiveteam-bs |
12:02
🔗
|
|
Baljem_ has joined #archiveteam-bs |
12:02
🔗
|
|
amerrykan has joined #archiveteam-bs |
12:02
🔗
|
|
lysobit has joined #archiveteam-bs |
12:02
🔗
|
|
torvik has joined #archiveteam-bs |
12:02
🔗
|
|
west.us.hub sets mode: +o Baljem_ |
12:06
🔗
|
|
username1 has joined #archiveteam-bs |
12:10
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
12:55
🔗
|
|
SadDM has joined #archiveteam-bs |
13:13
🔗
|
|
SN4T14 has joined #archiveteam-bs |
13:14
🔗
|
|
SN4T14_ has quit IRC (Ping timeout: 246 seconds) |
13:18
🔗
|
godane |
so i have +18gb of warc.gz in my cbsnews.com-video collection |
13:45
🔗
|
|
BiggieJon has quit IRC (Leaving.) |
14:03
🔗
|
|
sankin has joined #archiveteam-bs |
14:04
🔗
|
SadDM |
So here's a question... is there any reason that I shouldn't just use wpull instead of wget on a day-to-day basis? |
14:41
🔗
|
godane |
any plans on archiving deviantART? |
14:42
🔗
|
godane |
looks like the older stuff may be premium now |
14:52
🔗
|
midas |
wow, that will be huge godane |
14:56
🔗
|
godane |
ok |
14:57
🔗
|
godane |
i wish we could go after the older images first |
14:58
🔗
|
godane |
i found a way |
14:58
🔗
|
godane |
www.deviantart.com/download/170936604 |
14:58
🔗
|
godane |
we can brute force it by ranges |
14:59
🔗
|
ersi |
godane: I remember SketchCow talking about Deviantart before. Think he knows the people behind it etc |
14:59
🔗
|
godane |
ok |
15:00
🔗
|
godane |
if anything else we at least know we can brute force it by range now |
15:00
🔗
|
arkiver |
godane: I'd like to create a project for it :) |
15:00
🔗
|
godane |
ok |
15:00
🔗
|
arkiver |
but it will be very big |
15:01
🔗
|
arkiver |
SketchCow needs to approve first |
15:01
🔗
|
godane |
ok |
15:01
🔗
|
godane |
i figure that |
15:04
🔗
|
midas |
Downloaded: 1164905 files, 709G in 2d 18h 48m 44s (3.02 MB/s) 728G 2014.11.ftp.sunet.se-X11.tar |
15:04
🔗
|
midas |
what? |
15:04
🔗
|
midas |
28GB bigger when compressed? |
15:05
🔗
|
arkiver |
midas: I'd think it handles the small files not as efficiently as your os |
15:06
🔗
|
midas |
hm thats one option |
15:06
🔗
|
|
primus104 has quit IRC (Leaving.) |
15:06
🔗
|
midas |
the content textfile is already 100MB in size :p |
15:12
🔗
|
SadDM |
man DeviantArt... that thing is as much a social network as it is an image sharing site. Lots of comments, and blogs, and user inter-connectedness. |
15:12
🔗
|
midas |
and furries |
15:12
🔗
|
midas |
lots and lots of furries |
15:17
🔗
|
ersi |
How 'bout imgur? |
15:19
🔗
|
midas |
ah imgur, the site thats famous for being totally crap. they are rather good at breaking stuff. we could do something there yeah |
15:19
🔗
|
balrog |
did anyone see vice's article about archive.is |
15:20
🔗
|
balrog |
http://motherboard.vice.com/read/dear-gamergate-please-stop-stealing-our-shit |
15:23
🔗
|
username1 |
ugh vice |
15:25
🔗
|
balrog |
I'm not pro gamergate |
15:25
🔗
|
balrog |
but archive.today has been used by both sides |
15:25
🔗
|
balrog |
well, the other side has used it to prevent not-so-nice stuff from disappearing |
15:25
🔗
|
balrog |
as is fairly common |
15:27
🔗
|
godane |
i never know what gamergate was |
15:28
🔗
|
balrog |
archive.today pulls from google cache which IA doesn't |
15:28
🔗
|
ersi |
it's some debacle about gaming and sexes |
15:30
🔗
|
godane |
thats what i thought it was |
15:30
🔗
|
balrog |
how do I explain this |
15:31
🔗
|
balrog |
http://digiday.com/brands/wtf-gamergate/ |
15:31
🔗
|
balrog |
or look at the @-replies to anyone who speaks out against gamergate on twitter (especially if they're a woman) |
15:34
🔗
|
username1 |
or just realise that 99% is trolling and your life is better off by not getting involved in poop flinging contests |
15:35
🔗
|
godane |
ximm has still not replied |
15:35
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
15:35
🔗
|
SketchCow |
Do not archive DeviantArt |
15:35
🔗
|
godane |
ok |
15:36
🔗
|
balrog |
what's going on with DA? |
15:36
🔗
|
godane |
SketchCow: do you know why Ximm added warc.gz to my cbsnews.com-video collection |
15:36
🔗
|
SketchCow |
I would not know that, no. |
15:36
🔗
|
SketchCow |
Give me an example. |
15:37
🔗
|
godane |
https://archive.org/details/cbsnews.com-video-2003-12-25 |
15:38
🔗
|
SketchCow |
Interesting. |
15:38
🔗
|
SketchCow |
His name is Aaron Ximm, by the way. |
15:39
🔗
|
godane |
i know |
15:39
🔗
|
SketchCow |
If you mailed him, I'm sure he'll respond. This happened 203 days ago, as you know - so there's maybe some thing that was being done for a reason, or an unintended overlap of IDs |
15:39
🔗
|
godane |
i'm guessing overlap |
15:40
🔗
|
godane |
image of one too: http://tinypic.com/r/bjg9b9/8 |
15:40
🔗
|
godane |
SketchCow: just know its not just one |
15:40
🔗
|
godane |
its lot of them in the cbsnews.com-video collection |
15:41
🔗
|
godane |
best i can tell it happens to items with fewer videos in them |
15:42
🔗
|
godane |
his collection for cbsnews.com: https://archive.org/details/cbsnews.com |
15:42
🔗
|
arkiver |
Are the videos in the items that have the warc's combined less than 5 GB? |
15:42
🔗
|
godane |
normal item name: https://archive.org/details/cbsnews.com-20140324-205801 |
15:42
🔗
|
arkiver |
and an item with a lot of videos that doesn't have warc's more than 5 GB? |
15:42
🔗
|
godane |
? |
15:43
🔗
|
godane |
none of the cbsnews.com videos have anything close to 5gb |
15:43
🔗
|
arkiver |
ah ok |
15:43
🔗
|
godane |
it was like 200mb to 400mb between 2004 to 2006 |
15:44
🔗
|
arkiver |
He might be finding all items with a name cbsnews.com-* |
15:44
🔗
|
arkiver |
Then checks them for how big they are |
15:45
🔗
|
arkiver |
and the items that aren't the limit size get new warc's |
15:45
🔗
|
arkiver |
so the warc's end up in your items |
15:45
🔗
|
arkiver |
but that's just speculation, not sure about that |
15:46
🔗
|
arkiver |
for example liveweb items are all max 5 GB, so that might also be the case for those websites and newsites crawls of aaron |
15:47
🔗
|
godane |
ok |
15:47
🔗
|
joepie91 |
balrog: some leaked email document ended up on PDFy about gamergate also |
15:47
🔗
|
joepie91 |
got reasonably much traffic |
15:48
🔗
|
antomatic |
(frets) |
15:48
🔗
|
godane |
the size limit would have to be much smaller |
15:48
🔗
|
antomatic |
What? Deviantart? Why mention Deviantart? It's alright isn't it? It's not closing is it? IS IT? Answer me!!1!! |
15:48
🔗
|
* |
antomatic paces up and down |
15:49
🔗
|
midas |
not that we know of |
15:49
🔗
|
godane |
antomatic: its not closing |
15:49
🔗
|
antomatic |
phew |
15:49
🔗
|
godane |
we are not archiving it |
15:49
🔗
|
|
mistym has joined #archiveteam-bs |
15:50
🔗
|
godane |
SketchCow: i emailed him yesterday |
15:50
🔗
|
godane |
i have not got a reply yet |
15:51
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
15:52
🔗
|
aaaaaaaaa |
He's probably afraid of contacting you back; you are the great godane archiving machine after all. |
15:53
🔗
|
godane |
also his score is 86k items |
15:53
🔗
|
godane |
i'm past +270k |
15:53
🔗
|
godane |
:-D |
15:54
🔗
|
midas |
https://i.imgur.com/A7cdb.jpg fits godane |
16:19
🔗
|
|
mistym has joined #archiveteam-bs |
16:34
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
16:40
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
17:11
🔗
|
|
primus104 has joined #archiveteam-bs |
17:14
🔗
|
|
sankin has quit IRC (Leaving.) |
17:22
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
17:38
🔗
|
|
mistym has joined #archiveteam-bs |
18:54
🔗
|
|
tfgbd has quit IRC (Ping timeout: 265 seconds) |
19:06
🔗
|
|
primus_ has quit IRC (Remote host closed the connection) |
19:13
🔗
|
|
primus has joined #archiveteam-bs |
19:48
🔗
|
username1 |
haha, nice. i want to buy a book but all i get is pdfs and a torrent :) http://www.engineerguy.com/fourier/ |
19:58
🔗
|
joepie91 |
username1: ehhh, they need to do something about their presentation |
19:58
🔗
|
joepie91 |
first mental association that page brought up was "ugh, affiliate marketing ebooks" |
19:59
🔗
|
joepie91 |
:P |
19:59
🔗
|
username1 |
:) |
19:59
🔗
|
username1 |
watch the youtube vids, it's awesome |
20:01
🔗
|
joepie91 |
it seems cool, it's just the online presentation that's a bit eh |
20:03
🔗
|
|
ex-parro1 has joined #archiveteam-bs |
20:12
🔗
|
|
eprillios has quit IRC (Ping timeout: 252 seconds) |
20:12
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
20:15
🔗
|
|
eprillios has joined #archiveteam-bs |
20:19
🔗
|
joepie91 |
SketchCow: you were faster than me :) |
20:20
🔗
|
|
balrog_ has joined #archiveteam-bs |
20:20
🔗
|
|
swebb sets mode: +o balrog_ |
20:21
🔗
|
|
Kazzy_ has joined #archiveteam-bs |
20:21
🔗
|
|
Arkiver2 has joined #archiveteam-bs |
20:21
🔗
|
Kazzy_ |
zzz, what happened |
20:21
🔗
|
|
arkiver has quit IRC (Write error: Broken pipe) |
20:21
🔗
|
|
balrog has quit IRC (Read error: Connection reset by peer) |
20:21
🔗
|
|
RainbowCo has quit IRC (Read error: Connection reset by peer) |
20:21
🔗
|
|
GLaDOS has quit IRC (Write error: Connection reset by peer) |
20:21
🔗
|
|
Kazzy has quit IRC (Write error: Connection reset by peer) |
20:21
🔗
|
|
Kazzy_ is now known as Kazzy |
20:22
🔗
|
|
balrog_ is now known as balrog |
20:22
🔗
|
|
GLaDOS has joined #archiveteam-bs |
20:22
🔗
|
|
swebb sets mode: +o GLaDOS |
20:23
🔗
|
|
deathy___ has joined #archiveteam-bs |
20:24
🔗
|
|
RainbowCo has joined #archiveteam-bs |
20:33
🔗
|
|
mistym has joined #archiveteam-bs |
20:43
🔗
|
|
antomati_ has joined #archiveteam-bs |
20:45
🔗
|
|
antomatic has quit IRC (Read error: Operation timed out) |
21:06
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
21:09
🔗
|
|
antomatic has joined #archiveteam-bs |
21:09
🔗
|
|
antomati_ has quit IRC (Read error: Connection reset by peer) |
21:23
🔗
|
|
mistym has joined #archiveteam-bs |
22:22
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 272 seconds) |
23:11
🔗
|
|
GLaDOS has joined #archiveteam-bs |
23:11
🔗
|
|
swebb sets mode: +o GLaDOS |
23:40
🔗
|
|
Jonimus has quit IRC (Read error: Operation timed out) |