Time |
Nickname |
Message |
00:35
🔗
|
|
chazchaz has quit IRC (Read error: Operation timed out) |
00:36
🔗
|
|
HCross has quit IRC (Read error: Connection reset by peer) |
00:37
🔗
|
|
HCross has joined #archiveteam-bs |
00:40
🔗
|
|
chazchaz has joined #archiveteam-bs |
00:47
🔗
|
|
Stiletto has quit IRC (Ping timeout: 246 seconds) |
01:06
🔗
|
kyan |
bzc6p: FWIW: IMO unless you have really slow upload, there's no sense in using any lossy compression on scans. Given a folder of TIFFs, you can just put it into a losslessly compressed windows-style "ZIP" file, upload it to IA, and set it as a "Generic Raw Book Zip" and queue a rederive. It should work like a charm. |
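A minimal sketch of the packaging step kyan describes, assuming Python and a hypothetical helper named `zip_tiffs`. It only builds the ZIP; the upload to IA, the "Generic Raw Book Zip" format setting, and the rederive are not shown. DEFLATE is lossless, so the TIFFs come back out bit-for-bit.

```python
import zipfile
from pathlib import Path


def zip_tiffs(folder, out_path):
    """Pack every .tif/.tiff file in `folder` into a losslessly compressed ZIP.

    Hypothetical helper for illustration; returns the archived file names
    in the order they were written (sorted by path, i.e. page order if the
    scans are numbered).
    """
    folder = Path(folder)
    pages = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in (".tif", ".tiff")
    )
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for page in pages:
            # Store files flat, without the parent directory in the archive path.
            zf.write(page, arcname=page.name)
    return [p.name for p in pages]
```

Write the output ZIP outside the scan folder so a rerun does not pick up stray files; non-TIFF files in the folder are skipped.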
01:17
🔗
|
|
logan has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
zenguy has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
Mayonaise has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
closure has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
Nertsy has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
Baljem has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:17
🔗
|
|
mr-b has quit IRC (ircd.choopa.net irc.teksavvy.ca) |
01:21
🔗
|
|
JesseW has joined #archiveteam-bs |
01:35
🔗
|
|
tephra has joined #archiveteam-bs |
01:35
🔗
|
|
tephra_ has quit IRC (Read error: Connection reset by peer) |
01:36
🔗
|
|
logan has joined #archiveteam-bs |
01:38
🔗
|
|
zenguy has joined #archiveteam-bs |
01:38
🔗
|
|
closure has joined #archiveteam-bs |
01:38
🔗
|
|
midas sets mode: +o closure |
01:38
🔗
|
|
Nertsy has joined #archiveteam-bs |
01:39
🔗
|
|
lbft_ has quit IRC (Read error: Operation timed out) |
01:39
🔗
|
|
mr-b has joined #archiveteam-bs |
01:39
🔗
|
|
GLaDOS has quit IRC (Ping timeout: 633 seconds) |
01:39
🔗
|
|
achip has quit IRC (Read error: Operation timed out) |
01:40
🔗
|
|
GLaDOS has joined #archiveteam-bs |
01:40
🔗
|
|
midas sets mode: +o GLaDOS |
01:40
🔗
|
|
lbft has joined #archiveteam-bs |
01:41
🔗
|
|
kvieta has quit IRC (Excess Flood) |
01:41
🔗
|
|
wyatt8750 has quit IRC (Read error: Operation timed out) |
01:41
🔗
|
|
Baljem has joined #archiveteam-bs |
01:45
🔗
|
|
Mayonaise has joined #archiveteam-bs |
01:46
🔗
|
|
kyan_ has joined #archiveteam-bs |
01:48
🔗
|
|
kvieta has joined #archiveteam-bs |
01:53
🔗
|
|
JesseW has quit IRC (Leaving.) |
01:55
🔗
|
|
kyan has quit IRC (Ping timeout: 663 seconds) |
01:56
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
01:57
🔗
|
|
lbft has quit IRC (Ping timeout: 626 seconds) |
01:58
🔗
|
|
arkiver has quit IRC (Read error: Operation timed out) |
01:58
🔗
|
|
lbft has joined #archiveteam-bs |
02:00
🔗
|
|
jk[SVP] has quit IRC (Read error: Operation timed out) |
02:01
🔗
|
|
jk[[SVP]] has joined #archiveteam-bs |
02:01
🔗
|
|
jk[[SVP]] is now known as jk[SVP] |
02:01
🔗
|
|
JesseW has joined #archiveteam-bs |
02:02
🔗
|
|
kyan has joined #archiveteam-bs |
02:02
🔗
|
|
arkiver has joined #archiveteam-bs |
02:02
🔗
|
|
wyatt8750 has joined #archiveteam-bs |
02:02
🔗
|
|
kvieta has quit IRC (Ping timeout: 629 seconds) |
02:02
🔗
|
|
Sanqui has quit IRC (Ping timeout: 629 seconds) |
02:10
🔗
|
|
kyan_ has quit IRC (Ping timeout: 620 seconds) |
02:10
🔗
|
|
phuzion has quit IRC (Read error: Operation timed out) |
02:10
🔗
|
|
toad1 has quit IRC (Ping timeout: 851 seconds) |
02:10
🔗
|
|
mutoso_ has quit IRC (Read error: Operation timed out) |
02:10
🔗
|
|
Sanqui has joined #archiveteam-bs |
02:10
🔗
|
|
mutoso has joined #archiveteam-bs |
02:21
🔗
|
|
logchfoo3 starts logging #archiveteam-bs at Tue Jan 19 02:21:49 2016 |
02:21
🔗
|
|
logchfoo3 has joined #archiveteam-bs |
02:21
🔗
|
|
logchfoo3 has quit IRC (Connection closed) |
02:22
🔗
|
|
logchfoo4 starts logging #archiveteam-bs at Tue Jan 19 02:22:52 2016 |
02:22
🔗
|
|
logchfoo4 has joined #archiveteam-bs |
02:26
🔗
|
|
username1 has joined #archiveteam-bs |
02:29
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
02:49
🔗
|
|
beardicus has joined #archiveteam-bs |
02:49
🔗
|
|
kvieta has joined #archiveteam-bs |
02:56
🔗
|
|
kvieta has quit IRC (Read error: Operation timed out) |
03:14
🔗
|
|
kyan_ has joined #archiveteam-bs |
03:15
🔗
|
|
kyan has quit IRC (Ping timeout: 260 seconds) |
03:16
🔗
|
|
kyan_ is now known as kyan |
03:18
🔗
|
kyan |
We might want to archive the BBC iPlayer stuff. |
03:18
🔗
|
kyan |
I've brought this up before in the context of the iPM Radio 4 podcast |
03:18
🔗
|
kyan |
but I've found this page (needs UK IP) http://www.bbc.co.uk/iplayer/cbbc/a-z |
03:19
🔗
|
kyan |
shows a bunch of TV shows |
03:19
🔗
|
kyan |
most of which are only available for a few days, apparently. |
03:19
🔗
|
kyan |
The TV shows mostly look dreadful, but that doesn't mean they shouldn't be archived |
03:19
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
03:21
🔗
|
kyan |
I mean who green-lit a show called "Gangsta Granny", with what appears to be an entirely white cast. SRSLY? |
03:23
🔗
|
kyan |
And it requires Adobe Flush, I mean Adobe Crash, um, well the thing that crashes and was flushed down the toilet in fucking 2007 by the iPhone |
03:24
🔗
|
|
kvieta has joined #archiveteam-bs |
03:24
🔗
|
|
beardicus has joined #archiveteam-bs |
03:24
🔗
|
dashcloud |
kyan: if you set your user-agent to an iPad string, you can usually get around the Flash requirement |
03:25
🔗
|
kyan |
dashcloud: Ooh, that sounds handy! (why doesn't it detect a lack of flash for desktop. Browser plugins are so Netscape 4) |
03:25
🔗
|
kyan |
Thanks :) |
03:27
🔗
|
|
dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.) |
03:29
🔗
|
|
dashcloud has joined #archiveteam-bs |
03:29
🔗
|
kyan |
Unfortunately that didn't work, AFAICT. Cleared bbc cookies, cleared cache, and reloaded; still getting served Flash |
03:33
🔗
|
kyan |
Urgh it's hosted as some sort of fragmented streaming thing like YouTube, as .f4f and .f4m files. At least it looks like there's an app to convert that |
03:41
🔗
|
kyan |
As it turns out, Firefox doesn't like having too many bookmarks, apparently. Opened my bookmarks menu and it crashed -_- |
03:44
🔗
|
kyan |
Man BBC's web site sucks. Gives an error message, and pressing refresh keyboard shortcut makes the error message smaller. (Why can plugins capture browser keyboard shortcuts, lol) (#WTF) |
03:55
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
04:08
🔗
|
kyan |
Ok, got a simple-ish way working to download those videos. Unfortunately the ones I was actually visiting the BBC's Web site for say "Sorry, this episode is not currently available" |
04:27
🔗
|
kyan |
LOL binsearch's retention time is shorter than my provider |
05:34
🔗
|
JesseW |
anyone happen to know the uncompressed size of https://archive.org/download/ia-bak-census_20150304/public-file-size-md_20150304205357.json.gz ? |
05:34
🔗
|
JesseW |
it's over 12G |
05:38
🔗
|
JesseW |
sorry, 17 (and still going) |
05:40
🔗
|
JesseW |
20 |
05:44
🔗
|
JesseW |
21 G is the final total: 22522862598 bytes |
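One way to get a total like this without writing the decompressed file to disk is to stream it and count bytes. This is a sketch in Python with a hypothetical function name `uncompressed_size`; it is not necessarily how JesseW measured it. The size field stored in a gzip trailer is only the length modulo 2^32, so for a file of more than 4 GiB uncompressed, like this one, that field cannot be trusted and a full pass is needed.

```python
import gzip


def uncompressed_size(path, chunk_size=1 << 20):
    """Return the decompressed size in bytes of a .gz file.

    Reads the stream in fixed-size chunks, so memory use stays flat
    regardless of how large the decompressed data is.
    """
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of stream
                break
            total += len(chunk)
    return total
```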
05:51
🔗
|
JesseW |
now added to http://archiveteam.org/index.php?title=Internet_Archive_Census |
06:01
🔗
|
kyan |
Getting connection refused from IA |
06:01
🔗
|
kyan |
for their homepage |
06:01
🔗
|
kyan |
and for s3 api |
06:03
🔗
|
|
kyan_ has joined #archiveteam-bs |
06:04
🔗
|
kyan_ |
Still happening on another IP |
06:07
🔗
|
|
kyan has quit IRC (Ping timeout: 260 seconds) |
06:07
🔗
|
|
kyan_ is now known as kyan |
06:09
🔗
|
kyan |
who.is says archive.org is down https://who.is/whois/archive.org |
06:09
🔗
|
kyan |
I guess it's not my imagination |
06:14
🔗
|
kyan |
Ah, up again. Yay! |
06:18
🔗
|
kyan |
Theoretically this can download a Jamendo album in FLAC. https://gist.github.com/niclashoyer/10426194 |
06:18
🔗
|
kyan |
I can't get it to work, but if it can be made to work, then it might be some good stuff to archive |
06:19
🔗
|
kyan |
(FLACs are only downloadable by the website with a pricy commercial license) |
06:24
🔗
|
SketchCow |
Back |
06:25
🔗
|
JesseW |
was that the result of the people DDOSing? |
06:32
🔗
|
kyan |
Out of the top 5 public non-collection items on IA, numbers 1, 3, 4, and 5 are Islamic religious works. Number 2 is a movie called "About Bananas". Wut. |
06:33
🔗
|
* |
kyan loves IA |
06:35
🔗
|
JesseW |
top 5 by what count? views? |
06:36
🔗
|
kyan |
yep |
06:36
🔗
|
kyan |
https://archive.org/search.php?query=mediatype%3A*+-mediatype%3Acollection&sort=-downloads&page=2 |
06:36
🔗
|
kyan |
or, not the &page=2, but yeah |
06:37
🔗
|
kyan |
https://twitter.com/kolubat/status/689335622645972992 |
06:37
🔗
|
JesseW |
This ... looks wrong: https://archive.org/metadata/landthatisdesolaTest000001mbp -- check the "identifier" value. |
06:38
🔗
|
kyan |
cause there are 2 values? |
06:39
🔗
|
JesseW |
yeah. that shouldn't happen, afaik... |
07:05
🔗
|
JesseW |
and the census file has that item list 5 times... (?!) |
07:06
🔗
|
JesseW |
s/list/listed/ |
07:06
🔗
|
JesseW |
along with one item without an id at all |
07:06
🔗
|
JesseW |
such interesting oddities |
07:09
🔗
|
JesseW |
And there's one identifier duplicated in the identifier list. It is: "e-dv212_boston_14_harvardsquare_09-05_001.ogg" (obviously! :-) ) |
07:16
🔗
|
kyan |
JesseW, on the topic of interesting identifiers: my user uploads page links to https://archive.org/details/__new_item__ |
07:16
🔗
|
kyan |
(it's not a valid item) |
07:16
🔗
|
JesseW |
hm, interesting |
07:20
🔗
|
|
phuzion has quit IRC (Quit: No Ping reply in 180 seconds.) |
07:24
🔗
|
|
phuzion has joined #archiveteam-bs |
07:27
🔗
|
JesseW |
The item without an identifier is https://archive.org/details/lecture_10195 (jake, the person who ran the census, fixed it soon after running the census) |
08:00
🔗
|
JesseW |
Hm, the main census file has only 13,075,195 normal string identifiers, with 113 duplicates |
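A duplicate check like the one described can be sketched as follows. The layout of the census JSON is not shown in the log, so this assumes the identifiers have already been extracted into an iterable of strings; `find_duplicates` is a hypothetical name.

```python
from collections import Counter


def find_duplicates(identifiers):
    """Map each identifier that appears more than once to its count."""
    counts = Counter(identifiers)
    return {ident: n for ident, n in counts.items() if n > 1}


# Usage example with made-up identifiers:
print(find_duplicates(["a", "b", "a"]))  # prints {'a': 2}
```

This keeps one counter entry per distinct identifier in memory, which should be acceptable for a list on the order of 13 million short strings; a sort-based pass would be an alternative if memory is tight.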
08:03
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
08:04
🔗
|
yipdw |
proposal: add Archive Team motto http://oddstuffmagazine.com/wp-content/uploads/2015/12/All-your-files-are-exactly-where-you-left-them.jpg |
08:48
🔗
|
|
JesseW has joined #archiveteam-bs |
09:40
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
10:03
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
10:05
🔗
|
|
SmileyG has quit IRC (Read error: Operation timed out) |
10:10
🔗
|
|
ersi_ is now known as ersi |
10:50
🔗
|
|
Smiley has joined #archiveteam-bs |
11:01
🔗
|
|
PotcFdk has quit IRC (Ping timeout: 506 seconds) |
11:20
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
11:39
🔗
|
|
brayden has joined #archiveteam-bs |
11:45
🔗
|
espes___ |
kyan: https://github.com/get-iplayer/get_iplayer, dunno if it still works |
12:19
🔗
|
|
Ravenloft has joined #archiveteam-bs |
12:29
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
12:32
🔗
|
joepie91 |
I think this will be of interest to people here |
12:32
🔗
|
joepie91 |
https://worldbuilding.stackexchange.com/questions/33559/a-treasure-chest-for-your-post-apocalyptic-children |
12:45
🔗
|
coretx |
"Bottlecaps... lots of Bottlecaps. As many as he can find." |
13:22
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
13:34
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
13:37
🔗
|
|
dashcloud has joined #archiveteam-bs |
14:06
🔗
|
|
brayden has joined #archiveteam-bs |
14:06
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 370 seconds) |
14:34
🔗
|
|
HCross2 has joined #archiveteam-bs |
14:45
🔗
|
|
Ravenloft has joined #archiveteam-bs |
16:15
🔗
|
|
chazchaz has quit IRC (Read error: Operation timed out) |
16:16
🔗
|
|
Ravenloft has quit IRC (Ping timeout: 360 seconds) |
16:20
🔗
|
|
chazchaz has joined #archiveteam-bs |
16:34
🔗
|
ersi |
modarchive.org apparently has failed |
16:41
🔗
|
SketchCow |
Tah dahhh |
17:21
🔗
|
|
zerkalo_ has quit IRC (Read error: Connection reset by peer) |
17:50
🔗
|
|
JesseW has joined #archiveteam-bs |
17:59
🔗
|
|
JesseW has quit IRC (Leaving.) |
18:13
🔗
|
|
JW_work has quit IRC (Read error: Operation timed out) |
18:21
🔗
|
|
JW_work has joined #archiveteam-bs |
20:19
🔗
|
SketchCow |
https://www.freelancer.com/projects/php/Web-Scraping-entire-forum-sub/ |
20:19
🔗
|
SketchCow |
Who the fuck |
20:23
🔗
|
JW_work |
lol — pointing them at https://archive-it.org/ might be a good idea. |
20:27
🔗
|
|
JW_work has quit IRC (Quit: Leaving.) |
20:31
🔗
|
SimpBrain |
lol |
20:32
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
20:46
🔗
|
|
godane has joined #archiveteam-bs |
21:00
🔗
|
|
VADemon has joined #archiveteam-bs |
21:05
🔗
|
phuzion |
Would it be unethical to claim the cash and just archivebot that? |
21:06
🔗
|
SimpBrain |
call it a donation ;) |
21:07
🔗
|
xmc |
if you turn the cash into hosting fees for an archivebot pipeline, seems legit to me |
21:07
🔗
|
phuzion |
Would it be more unethical to say "This project would require at least six hours of my time, but if you put up $300, I'll work on it nonstop until it's done"? |
21:08
🔗
|
xmc |
if you keep an eye on the logs that sounds reasonable |
21:08
🔗
|
xmc |
:) |
21:08
🔗
|
phuzion |
Would it be even more unethical to mug the guy IRL and use his credit cards to pay for digital ocean instances to run archivebot pipelines? |
21:09
🔗
|
phuzion |
Maybe I'm getting a bit carried away. |
21:09
🔗
|
SimpBrain |
charge per url |
21:09
🔗
|
Sanqui |
and don't ignore /calendar/ |
21:10
🔗
|
HCross |
Even more UK news is going to begin marching its way into the archive :) |
21:11
🔗
|
|
JesseW has joined #archiveteam-bs |
21:12
🔗
|
|
JW_work has joined #archiveteam-bs |
21:18
🔗
|
|
JW_work has quit IRC (Quit: Leaving.) |
21:34
🔗
|
|
antomatic has joined #archiveteam-bs |
21:34
🔗
|
|
LordNigh2 has joined #archiveteam-bs |
21:35
🔗
|
|
brayden has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
tephra has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
Start has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
antomati_ has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
w0rp has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
mismatch_ has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
afics has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
Lord_Nigh has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
Apathy has quit IRC (hub.dk irc.inet.tele.dk) |
21:35
🔗
|
|
dan- has quit IRC (hub.dk irc.inet.tele.dk) |
21:36
🔗
|
|
w0rp_ has joined #archiveteam-bs |
21:36
🔗
|
|
mismatchm has joined #archiveteam-bs |
21:37
🔗
|
|
lytv has quit IRC (Quit: Leaving) |
21:41
🔗
|
|
tephra_ has joined #archiveteam-bs |
21:49
🔗
|
|
lytv has joined #archiveteam-bs |
21:50
🔗
|
|
Apathy has joined #archiveteam-bs |
21:50
🔗
|
|
afics has joined #archiveteam-bs |
21:50
🔗
|
|
slyphic is now known as slyphic|a |
21:50
🔗
|
|
w0rp_ is now known as w0rp |
21:51
🔗
|
|
LordNigh2 is now known as Lord_Nigh |
22:09
🔗
|
ersi |
Good! Make her majesty's government fear The Bot |
22:13
🔗
|
Kazzy |
what's important right now? been away for a while |
22:14
🔗
|
HCross |
Friends Reunited starting soon, Gcode and MyVIP |
22:14
🔗
|
|
achip has joined #archiveteam-bs |
22:15
🔗
|
arkiver |
Hey Kazzy |
22:16
🔗
|
Kazzy |
hiya |
22:16
🔗
|
|
Start has joined #archiveteam-bs |
22:18
🔗
|
HCross |
I should have said Hello! |
22:18
🔗
|
Kazzy |
o/ |
22:49
🔗
|
|
dan- has joined #archiveteam-bs |
23:10
🔗
|
|
JesseW has quit IRC (Leaving.) |
23:54
🔗
|
|
Rotab has quit IRC (hub.se irc.du.se) |
23:54
🔗
|
|
Boppen has quit IRC (hub.se irc.du.se) |
23:54
🔗
|
|
w0rp has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
mutoso has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
wednesday has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
ersi has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
midas has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
Rye has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
Fletcher has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
espes___ has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
useretai- has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
will has quit IRC (hub.se irc.underworld.no) |
23:54
🔗
|
|
antomatic has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
HCross2 has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
kyan has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
GLaDOS has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
unstable has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
_desu___ has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
wp494 has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Muad-Dib has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Famicoma1 has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
ivan` has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
SilSte has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Kazzy has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
mistym has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
zhongfu has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
pikhq has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Kenshin has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Rickster has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Ctrl-S___ has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
zyphlar_ has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
bauruine has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
sigkell has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
Fusl has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
joepie91 has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
SadDM has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
JSharp___ has quit IRC (hub.se efnet.port80.se) |
23:54
🔗
|
|
deathy has quit IRC (hub.se efnet.port80.se) |
23:55
🔗
|
|
JesseW has joined #archiveteam-bs |
23:59
🔗
|
|
Rotab has joined #archiveteam-bs |
23:59
🔗
|
|
Boppen has joined #archiveteam-bs |