Time  | Nickname  | Message
00:38 |           | godane has joined #archiveteam-ot
00:40 |           | adinbied has quit IRC (Read error: Operation timed out)
00:41 |           | adinbied has joined #archiveteam-ot
01:25 | hook54321 | Someone ran into the library that I'm sitting in
01:25 | hook54321 | (with their car)
01:27 | vectr0n   | 0o
01:27 |           | BlueMax has joined #archiveteam-ot
01:39 |           | vectr0n has quit IRC (ZNC - https://znc.in)
01:39 |           | vectr0n has joined #archiveteam-ot
02:40 |           | kiskabak has quit IRC (Read error: Connection reset by peer)
02:40 |           | Albardin has quit IRC (Read error: Connection reset by peer)
02:40 |           | w0rmybak has quit IRC (Read error: Connection reset by peer)
03:11 | Flashfire | lol what?
04:11 | wmvhater  | does anyone have any experience with working with ZIM files?
04:14 | wmvhater  | how good is MWoffliner?
05:25 |           | adinbied has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | Mateon1 has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | mr_archiv has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | sknebel has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | robogoat has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | svchfoo3 has quit IRC (hub.efnet.us west.us.hub)
05:25 |           | Jusque has quit IRC (hub.efnet.us west.us.hub)
05:29 |           | Stiletto has joined #archiveteam-ot
05:30 |           | Mateon1 has joined #archiveteam-ot
05:30 |           | mr_archiv has joined #archiveteam-ot
05:30 |           | sknebel has joined #archiveteam-ot
05:30 |           | robogoat has joined #archiveteam-ot
05:30 |           | svchfoo3 has joined #archiveteam-ot
05:30 |           | Jusque has joined #archiveteam-ot
05:30 |           | irc.mzima.net sets mode: +o svchfoo3
05:33 |           | Stilett0 has quit IRC (Read error: Operation timed out)
05:41 |           | Stilett0 has joined #archiveteam-ot
05:46 |           | Stiletto has quit IRC (Read error: Operation timed out)
05:53 |           | adinbied has joined #archiveteam-ot
06:27 |           | robogoat has quit IRC (Read error: Operation timed out)
06:47 |           | BlueMax has quit IRC (Read error: Connection reset by peer)
06:48 |           | BlueMax has joined #archiveteam-ot
07:00 |           | robogoat has joined #archiveteam-ot
07:05 |           | BlueMax has quit IRC (Read error: Connection reset by peer)
07:06 |           | BlueMax has joined #archiveteam-ot
08:05 |           | robogoat has quit IRC (Read error: Operation timed out)
08:13 |           | robogoat has joined #archiveteam-ot
08:23 |           | robogoat has quit IRC (Read error: Operation timed out)
09:00 |           | robogoat has joined #archiveteam-ot
09:35 |           | BlueMax has quit IRC (Quit: Leaving)
12:03 |           | VerifiedJ has joined #archiveteam-ot
12:55 | jrwr      | JAA arkiver you guys around? I have a question
12:57 | JAA       | jrwr: Yeah
13:01 | jrwr      | I PM'd you JAA
14:11 |           | eggplanti has quit IRC (Read error: Connection reset by peer)
15:45 | jrwr      | JAA: should I use a single item every time or a new item for every backup I do
15:50 | JAA       | jrwr: Whichever you prefer. Note that large items (over several hundred GB) tend to cause issues apparently. Maybe group by month/season/year, depending on the size and frequency?
15:57 | *         | jrwr pokes SketchCow
15:57 | jrwr      | I guess he would be the best to ask
15:58 | jrwr      | I have a website I want to export its dataset to the Internet Archive on a weekly basis, as my userbase has expressed they want this. So far it's 10 files and about 50GB
15:59 | jrwr      | What's the way you'd prefer it be published to the IA (including metadata/collection)?
16:00 | fenn      | in general IA doesn't like data that is continuously updated
16:01 | fenn      | you could scrape your website as .warc files, and there is a process to put warcs in the wayback machine
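As an aside on fenn's suggestion: one common way to produce .warc files from a small site is the warcio Python library (an assumption here; fenn doesn't name a tool, and the URLs and filename below are placeholders). A minimal sketch:

```python
# Sketch: capture a few pages of a site into a WARC file with warcio.
# 'site-snapshot.warc.gz' and the example.com URLs are placeholders.
from warcio.capture_http import capture_http
import requests  # per warcio's docs, import requests after capture_http

with capture_http('site-snapshot.warc.gz'):
    requests.get('https://example.com/')
    requests.get('https://example.com/about')
```

The resulting WARC can then be uploaded to IA; getting it indexed in the Wayback Machine is the separate, IA-side process fenn alludes to.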
16:01 | jrwr      | I mostly want somewhere to put the data for long-term storage for the public, so that someone could clone the site or process the database information
16:02 | jrwr      | the current plan was a weekly torrent
16:02 | fenn      | whatever it is, weekly is too often, in my opinion
16:02 | jrwr      | I get about 2000 new items per week right now
16:03 | jrwr      | I can do once a month
16:05 | JAA       | How about this? You create weekly dumps on your server and make them available there, and less frequently, you push to IA.
16:09 | JAA       | I do agree that weekly dumps are probably a bit much to upload to IA.
16:09 | JAA       | Unless you want to do monthly or seasonal full dumps and weekly increments.
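A rough sketch of what JAA's monthly-full-dump suggestion could look like with the internetarchive Python library (the item identifier, metadata values, and file names below are hypothetical, not jrwr's actual setup):

```python
# Sketch: one IA item per monthly full dump, uploaded with the
# internetarchive library. Assumes credentials are already configured
# via `ia configure`. Identifier, metadata, and files are examples only.
from datetime import date
from internetarchive import upload

month = date.today().strftime('%Y-%m')
identifier = f'example-site-dump-{month}'  # e.g. example-site-dump-2019-03

responses = upload(
    identifier,
    files=['dump-full.tar'],
    metadata={
        'title': f'Example site data set, {month} full dump',
        'mediatype': 'data',
        'date': month,
    },
)
for r in responses:
    print(r.status_code, r.request.url)
```

Weekly increments could then be added to the same monthly item as extra dated files, keeping any single item well below the several-hundred-GB range JAA mentions.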
17:05 |           | schbirid has joined #archiveteam-ot
17:12 |           | wp494 has quit IRC (Ping timeout: 364 seconds)
17:12 |           | wp494 has joined #archiveteam-ot
18:53 |           | Mateon1 has quit IRC (Read error: Operation timed out)
18:54 |           | Mateon1 has joined #archiveteam-ot
19:05 |           | godane has quit IRC (Ping timeout: 260 seconds)
19:06 | Vito`     | jrwr: dat archive instead? p2p mirrors instead of IA, and it supports incremental updates (as new files; changed files get re-pushed whole)
19:07 | Vito`     | then you host a peer privately and your interested users can peer as well
20:29 | jrwr      | @Vito` it's done :0 https://datbase.org/beatsaver/BeatSaver-Data-Set
20:29 | jrwr      | I like this protocol
20:29 | jrwr      | since it updates itself
20:32 | Vito`     | yeah so if your updates are just deltas or diffs as new files, I think peers won't have to redownload everything (I think if you replace json.tar and songs.tar they would)
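To illustrate Vito`'s point about deltas: if each export writes a new, dated file instead of regenerating json.tar and songs.tar, a mirror (dat here, but the same holds for rsync or IA items) only has to fetch the new data. A hypothetical sketch of such an export step; the deltas/ layout and naming are illustrative, not jrwr's actual pipeline:

```python
# Sketch: write each week's new records to a new, dated delta file instead of
# rewriting one big archive, so peers only transfer what changed.
import json
from datetime import date
from pathlib import Path

def export_weekly_delta(out_dir: Path, new_items: list) -> Path:
    """Write this week's new items to deltas/items-YYYY-MM-DD.json."""
    out_dir.mkdir(parents=True, exist_ok=True)
    delta_path = out_dir / f'items-{date.today().isoformat()}.json'
    delta_path.write_text(json.dumps(new_items, indent=2))
    return delta_path

if __name__ == '__main__':
    # Hypothetical usage: new_items would come from the site's database.
    export_weekly_delta(Path('deltas'), [{'id': 1234, 'name': 'example item'}])
```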
20:33 | jrwr      | Ya
20:34 | Vito`     | also if your dat gets over 300GB the current implementations have issues last I heard
20:34 | Vito`     | but they're aware of them and are fixing them
20:36 | jrwr      | ya
20:54 |           | godane has joined #archiveteam-ot
21:24 |           | tuluu has quit IRC (Remote host closed the connection)
21:25 |           | tuluu has joined #archiveteam-ot
21:39 |           | BlueMax has joined #archiveteam-ot
21:50 |           | schbirid has quit IRC (Remote host closed the connection)
22:48 |           | adinbied has quit IRC (Read error: Operation timed out)
22:51 |           | adinbied has joined #archiveteam-ot
23:43 | SketchCow | What
23:44 | SketchCow | What are the items