#archiveteam-bs 2017-09-10,Sun

Who What When
*** kristian_ has quit IRC (Quit: Leaving) [00:22]
.... (idle for 17mn)
Dimtree has quit IRC (Quit: Peace.)
dashcloud has quit IRC (Ping timeout: 255 seconds)
[00:39]
drumstick has quit IRC (Ping timeout: 255 seconds)
dashcloud has joined #archiveteam-bs
[00:48]
.... (idle for 16mn)
Dimtree has joined #archiveteam-bs [01:06]
.... (idle for 18mn)
BlueMaxim has joined #archiveteam-bs [01:24]
.... (idle for 17mn)
schbirid2 has joined #archiveteam-bs
schbirid has quit IRC (Read error: Operation timed out)
[01:41]
..... (idle for 20mn)
drumstick has joined #archiveteam-bs [02:04]
...... (idle for 29mn)
Mateon1 has quit IRC (Ping timeout: 260 seconds)
Mateon1 has joined #archiveteam-bs
jspiros has quit IRC (leaving)
jspiros has joined #archiveteam-bs
[02:33]
........... (idle for 54mn)
<kisspunch> SketchCow: does IA want 500TB of github.com
I don't have it yet, working out a plan to buy drives+bandwidth
And if I can sneakernet drives once a month it saves me $5-$10K
I plan to grab all git hosting sites, but in terms of % data that's a good ballpark number, it's mostly github these days
[03:33]
............ (idle for 58mn)
<Frogging> when was it ever not mostly github? :p [04:35]
.... (idle for 19mn)
*** Sk1d has quit IRC (Ping timeout: 194 seconds) [04:54]
Sk1d has joined #archiveteam-bs [05:00]
<hook54321> kisspunch: I think there's already a github archive [05:08]
<kisspunch> nope. there's githubarchive.org, which archives only the timeline, and ghtorrent, which archives all the metadata
or do you mean on IA?
also there are a lot of projects that claim they will someday be an archive of github and have <1TB of data
[05:10]
i have most of ghtorrent, except the really old stuff hosted on a separate server that's never accessible
and all of the timeline
[05:17]
uh some of the 1TB projects did archive the top-starred sites, which is pretty worthwhile, just not what i'm going to do
not trying to disparage any of these existing projects :)
[05:24]
..... (idle for 23mn)
*** Stilett0 has joined #archiveteam-bs [05:48]
...... (idle for 26mn)
drumstick has quit IRC (Ping timeout: 255 seconds) [06:14]
<hook54321> kisspunch: I thought githubarchive.org does archive the stuff on the repos [06:21]
*** wabu has quit IRC (Read error: Operation timed out)
TC04 has quit IRC (Read error: Operation timed out)
[06:22]
<godane> !a https://diskprices.com/
oops wrong channel
its in the right one now
[06:23]
*** TC01 has joined #archiveteam-bs [06:24]
wabu has joined #archiveteam-bs [06:32]
..... (idle for 20mn)
<kisspunch> hook54321: githubarchive.org does not even archive all the metadata, no. it's just the events timeline. [06:52]
*** drumstick has joined #archiveteam-bs [06:56]
<kisspunch> as per usual, i strongly encourage someone to start the second archiver: that's ephemeral data and only one person is grabbing it [06:58]
<godane> !a http://lite.cnn.io/
sorry did it again
[07:01]
........... (idle for 52mn)
*** jspiros_ has joined #archiveteam-bs
closure has quit IRC (Read error: Operation timed out)
jspiros has quit IRC (Read error: Operation timed out)
refeed has joined #archiveteam-bs
refeed has quit IRC (Connection closed)
refeed has joined #archiveteam-bs
_refeed_ has joined #archiveteam-bs
[07:53]
kristian_ has joined #archiveteam-bs
_refeed_ has quit IRC (Remote host closed the connection)
[08:05]
kristian_ has quit IRC (Quit: Leaving)
closure has joined #archiveteam-bs
midas sets mode: +o closure
[08:17]
.... (idle for 16mn)
BlueMaxim has quit IRC (Read error: Connection reset by peer) [08:34]
.... (idle for 16mn)
drumstick has quit IRC (Ping timeout: 255 seconds) [08:50]
drumstick has joined #archiveteam-bs [09:02]
schbirid2 has quit IRC (Quit: Leaving) [09:09]
BartoCH has joined #archiveteam-bs
refeed has quit IRC (Ping timeout: 633 seconds)
[09:14]
......................... (idle for 2h3mn)
drumstick has quit IRC (Ping timeout: 255 seconds) [11:17]
....... (idle for 31mn)
refeed has joined #archiveteam-bs
refeed has quit IRC (Client Quit)
[11:48]
namibj1 has quit IRC (Read error: Operation timed out) [11:58]
.... (idle for 17mn)
namibj1 has joined #archiveteam-bs [12:15]
....... (idle for 30mn)
odemg has quit IRC (Read error: Operation timed out) [12:45]
......... (idle for 41mn)
namibj1 has quit IRC (Ping timeout: 506 seconds)
SilSte has quit IRC (Ping timeout: 194 seconds)
[13:26]
........ (idle for 35mn)
godane has quit IRC (Read error: Operation timed out) [14:05]
...... (idle for 27mn)
godane has joined #archiveteam-bs [14:32]
............. (idle for 1h0mn)
odemg has joined #archiveteam-bs [15:32]
....... (idle for 31mn)
fie_ has quit IRC (Ping timeout: 250 seconds) [16:03]
fie_ has joined #archiveteam-bs [16:17]
...... (idle for 27mn)
<hook54321> JAA: I can't talk much right now, but thought I should mention that the number of URLs for imgh.us on the Wayback Machine doesn't seem to be going up. [16:44]
<JAA> hook54321: Almost all of those jobs were on zino's pipeline, so they go through FOS first and might only show up on IA a few days later. Also, there were recently some issues where ArchiveBot grabs didn't show up in Wayback Machine even after weeks and although the derive job ran. So no reason to be alarmed (yet). [16:48]
.......................... (idle for 2h8mn)
*** what_the_ has quit IRC (Quit: Page closed) [18:56]
.... (idle for 19mn)
<hook54321> JAA (and anyone else that's interested in imgh.us):
Here's some of the things I think we should consider doing next
Grab imgh.us links from Reddit, then change them into the new URL format.
Use a simple list of words in a text file to try to bruteforce more.
Attempt to compile all the URLs that we successfully grabbed.
[19:15]
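A rough sketch of the wordlist and compile steps hook54321 lists above might look like the following. The log never states the imgh.us URL format, so `URL_TEMPLATE` and `EXTENSIONS` are purely hypothetical placeholders, and the function names are made up for illustration. The sketch only builds and merges URL lists; it does not fetch anything.

```python
from typing import Iterable, Iterator, List

# Hypothetical template: the real imgh.us URL format is not given in the log.
URL_TEMPLATE = "https://imgh.us/{name}.{ext}"
# Assumed file extensions, for illustration only.
EXTENSIONS = ("jpg", "png", "gif")


def candidate_urls(words: Iterable[str],
                   template: str = URL_TEMPLATE,
                   extensions: Iterable[str] = EXTENSIONS) -> Iterator[str]:
    """Yield one candidate URL per (word, extension) pair.

    Words are stripped and lowercased; blank and repeated words are skipped.
    """
    exts = tuple(extensions)
    seen = set()
    for raw in words:
        word = raw.strip().lower()
        if not word or word in seen:
            continue
        seen.add(word)
        for ext in exts:
            yield template.format(name=word, ext=ext)


def merge_url_lists(*lists: Iterable[str]) -> List[str]:
    """Combine several URL lists into one sorted, de-duplicated list."""
    merged = set()
    for urls in lists:
        merged.update(u.strip() for u in urls if u.strip())
    return sorted(merged)
```

The words would come from a plain text file, one per line (for example `candidate_urls(open("words.txt"))`, where `words.txt` is an assumed filename), and the output of `merge_url_lists` could be saved as the compiled list of grabbed URLs.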
.... (idle for 17mn)
*** TheLovina has quit IRC (Read error: Operation timed out) [19:35]
....... (idle for 30mn)
dashcloud has quit IRC (Read error: Connection reset by peer) [20:05]
.... (idle for 15mn)
DogsRNice has joined #archiveteam-bs [20:20]
<DogsRNice> hello [20:21]
*** frontop has joined #archiveteam-bs [20:22]
.... (idle for 19mn)
jspiros has joined #archiveteam-bs
jspiros_ has quit IRC (Ping timeout: 492 seconds)
[20:41]
.... (idle for 15mn)
DogsRNice has quit IRC (Quit: Page closed) [20:59]
.... (idle for 18mn)
n00b646 has joined #archiveteam-bs [21:17]
<n00b646> hey all [21:18]
.... (idle for 19mn)
*** n00b646 has quit IRC (Quit: Page closed) [21:37]
.... (idle for 15mn)
kristian_ has joined #archiveteam-bs
TheLovina has joined #archiveteam-bs
[21:52]
drumstick has joined #archiveteam-bs [22:05]
........ (idle for 38mn)
kristian_ has quit IRC (Quit: Leaving) [22:43]
......... (idle for 43mn)
BartoCH has quit IRC (Quit: WeeChat 1.9)
etudier has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[23:26]
