#archiveteam-bs 2017-08-19,Sat


***username1 has joined #archiveteam-bs
schbirid2 has quit IRC (Read error: Operation timed out)
pizzaiolo has joined #archiveteam-bs
[01:53]
...... (idle for 29mn)
j08nY has quit IRC (Remote host closed the connection)
Mateon1 has quit IRC (Ping timeout: 268 seconds)
Mateon1 has joined #archiveteam-bs
[02:28]
.......... (idle for 49mn)
Ravenloft has joined #archiveteam-bs
pizzaiolo has quit IRC (Remote host closed the connection)
[03:21]
<bluesoul> so, i have my first warc and it is confirmed working and playable with web archive player 1.4.7
now what?
[03:26]
***qw3rty111 has joined #archiveteam-bs [03:26]
<DFJustin> https://archive.org/upload/ [03:28]
<bluesoul> i guess the web uploader has settings to allow for a 1GB+ warc that might take half an hour to upload
do i throw this cdx file in with it?
[03:29]
***qw3rty119 has quit IRC (Read error: Operation timed out)
fie has quit IRC (Ping timeout: 250 seconds)
[03:30]
<bluesoul> eh, we'll try it [03:35]
<DFJustin> you don't need to, it will generate one
shouldn't hurt anything though
[03:42]
<bluesoul> we will find out if it hurt anything eventually
i suspect nothing will catch fire
[03:43]
***Asparagir has joined #archiveteam-bs [03:52]
Stilett0 has quit IRC (Read error: Operation timed out) [04:02]
.... (idle for 15mn)
Stilett0 has joined #archiveteam-bs [04:17]
Sk1d has quit IRC (Ping timeout: 250 seconds)
TheLovina has quit IRC (Ping timeout: 370 seconds)
TheLovina has joined #archiveteam-bs
[04:22]
<bluesoul> https://archive.org/details/dep.ca does this look right? [04:26]
***Sk1d has joined #archiveteam-bs [04:29]
<DFJustin> seems ok
I'd recommend putting a crawl date on it as well
[04:32]
<bluesoul> would that just be the generic date field? wasn't sure if that was supposed to be the age of the data or the archive [04:35]
<DFJustin> this is how IA's go in https://archive.org/details/WIDE-20170819012251-crawl802
I don't think the details matter that much as long as it's basically sensible
something so you can tell it apart if the same site gets crawled more than once
[04:45]
<bluesoul> ah i see, firstfiledate and lastfiledate [04:46]
<DFJustin> those are probably internal use fields
the exact timestamps are stored in the warc itself
yeah I think it generates those fields when they process it for the wayback machine, here's one I uploaded and I never filled in firstfiledate etc https://archive.org/details/archive.pdp11.org.ru-20130504
but I put a date in the title and item name for identification when browsing through
[04:47]
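A minimal sketch of DFJustin's point that the exact timestamps are stored in the WARC itself: each WARC record carries a `WARC-Date` header, so the first and last capture times can be read straight from the file. The function name `iter_warc_dates` is hypothetical, and the sketch assumes an uncompressed `.warc` stream (a `.warc.gz` would need to be decompressed first).

```python
import io


def iter_warc_dates(stream):
    """Yield the WARC-Date header of each record in an uncompressed WARC stream.

    Hypothetical helper for illustration; not part of any archive.org tooling.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records
        headers = {}
        # Read header lines until the blank line that ends the header block.
        for raw in iter(stream.readline, b""):
            raw = raw.rstrip(b"\r\n")
            if not raw:
                break
            name, _, value = raw.partition(b":")
            headers[name.strip().lower()] = value.strip()
        # Skip the record payload so its bytes are never parsed as headers.
        stream.read(int(headers.get(b"content-length", b"0")))
        if b"warc-date" in headers:
            yield headers[b"warc-date"].decode("ascii")
```

Taking `min()` and `max()` over the yielded strings gives the earliest and latest capture times, since `WARC-Date` values in the usual UTC `YYYY-MM-DDThh:mm:ssZ` form sort chronologically as text.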
<bluesoul> okay, i think i'm starting to understand the logic/structure of things
i need to recategorize this as a "just in time grab" as the company's bankrupt so i just change the collection name to archiveteam-fire yeah?
[04:53]
<DFJustin> you can't, only IA admins have access to do that
for now adding archiveteam to the subject keywords would be good
btw for small jobs like this we have #archivebot which does the crawl and upload automatically if you just give it a base URL
[04:54]
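The advice above (a crawl date in the item name and title, plus "archiveteam" in the subject keywords) could be sketched roughly as follows. The helper name `build_item_metadata` is hypothetical, and the field names `title`, `date`, and `subject` are assumed to match the metadata fields shown on the archive.org upload form; the identifier pattern simply imitates the `archive.pdp11.org.ru-20130504` example from the conversation.

```python
from datetime import date


def build_item_metadata(site, crawl_date, extra_subjects=()):
    """Return (identifier, metadata) for a WARC upload of one site crawl.

    Hypothetical helper: the field names are assumptions, not a documented schema.
    """
    stamp = crawl_date.strftime("%Y%m%d")
    # Date in the item name so repeated crawls of the same site can be told apart.
    identifier = f"{site}-{stamp}"
    metadata = {
        "title": f"{site} crawl {crawl_date.isoformat()}",
        "date": crawl_date.isoformat(),
        # "archiveteam" keyword, as suggested when the collection cannot be changed.
        "subject": ["archiveteam", *extra_subjects],
    }
    return identifier, metadata
```

The returned pair can then be entered in the web uploader by hand or passed to whatever upload tool is in use.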
<bluesoul> cool, archivebot was brought up but this was a good learning experience for me as well [05:00]
..... (idle for 22mn)
***Asparagir has quit IRC (Asparagir) [05:22]
....... (idle for 34mn)
Honno has joined #archiveteam-bs [05:56]
alfie has quit IRC (Ping timeout: 260 seconds) [06:08]
.......... (idle for 47mn)
BlueMaxim has joined #archiveteam-bs [06:55]
Ravenloft has quit IRC (Read error: Connection reset by peer)
Honno_ has joined #archiveteam-bs
[07:08]
Honno has quit IRC (Read error: Operation timed out)
schbirid2 has joined #archiveteam-bs
username1 has quit IRC (Read error: Operation timed out)
[07:22]
.... (idle for 16mn)
fie has joined #archiveteam-bs [07:45]
..................... (idle for 1h41mn)
pie_ has joined #archiveteam-bs [09:26]
<pie_> hi guys, is there a way to get archive to start recursively archiving a website?
also hm, so archive.is isnt a frontend for archive.org?
[09:26]
***zyphlar has joined #archiveteam-bs [09:31]
<Aoede> it isn't
for archiving a website see #archivebot
[09:31]
***schbirid2 has quit IRC (Quit: Leaving) [09:31]
<Aoede> pie_ ^ [09:32]
<pie_> thanks [09:33]
..... (idle for 24mn)
***schbirid has joined #archiveteam-bs [09:57]
j08nY has joined #archiveteam-bs [10:04]
<Lotheric> bluesoul: I downloaded the torrent, what do I do with a .warc file ? [10:04]
<schbirid> cherish it
archive it
love it
hexdump it
[10:05]
<Lotheric> :) [10:06]
I got it to work with webarchiveplayer.exe
sweet! thanks a lot :)
Seems like it got everything
[10:19]
.......... (idle for 48mn)
***Mateon1 has quit IRC (Remote host closed the connection)
BlueMaxim has quit IRC (Read error: Operation timed out)
[11:07]
......... (idle for 40mn)
ld1 has quit IRC (Ping timeout: 260 seconds)
Mateon1 has joined #archiveteam-bs
[11:51]
zyphlar has quit IRC (Quit: Connection closed for inactivity) [12:00]
................. (idle for 1h24mn)
RichardG has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
[13:24]
........ (idle for 36mn)
<odemg> SketchCow, replied on reddit [14:00]
<SketchCow> Sorry, I never assume I know the people, even though I often do.
Just give me the link when I can grab it, I'll pull them down, make a collection, shove them up.
I wish someone would write descriptions of them
[14:06]
<odemg> SketchCow, sound, his upload is going steady so I'm not syncing to the dhevel server until it's all done then I'll put it in all the places, post the torrent and nudge you here.
At his current speed it should be done in a little over 5 hours.
[14:09]
<SketchCow> Great.
I'll be out on a date, will be back tonight, we'll have a nice collection
[14:13]
<odemg> Ohh sweet, have fun! I'll be having a bbq myself so late tonight works :D [14:16]
<arkiver> what is this about? [14:23]
...... (idle for 28mn)
<pie_> the fuck https://www.youtube.com/watch?v=CO-NaKJIXPA
fuck wrong channel
[14:51]
......... (idle for 44mn)
<joepie91> this fella needs some archival directed at them, I think: https://twitter.com/themaddimension
one of the 'unite the right' people, apparently deleting tweets
[15:35]
........... (idle for 50mn)
***BnAboyZ66 has quit IRC (Ping timeout: 260 seconds)
fie has quit IRC (Ping timeout: 250 seconds)
Meroje has quit IRC (Ping timeout: 260 seconds)
[16:25]
Meroje has joined #archiveteam-bs
fie has joined #archiveteam-bs
[16:33]
.... (idle for 16mn)
BartoCH has quit IRC (Ping timeout: 260 seconds) [16:53]
RichardG has quit IRC (Read error: Connection reset by peer)
brayden has quit IRC (Read error: Connection reset by peer)
brayden_ has joined #archiveteam-bs
swebb sets mode: +o brayden_
brayden_ is now known as brayden
[17:01]
j08nY has quit IRC (Read error: Operation timed out) [17:10]
.... (idle for 17mn)
Pudsey has joined #archiveteam-bs
Asparagir has joined #archiveteam-bs
[17:27]
Pudsey_ has joined #archiveteam-bs
pizzaiolo has joined #archiveteam-bs
Pudsey has quit IRC (Ping timeout: 245 seconds)
[17:32]
BartoCH has joined #archiveteam-bs [17:42]
odemg has quit IRC (Read error: Operation timed out) [17:49]
..... (idle for 23mn)
Pudsey_ has quit IRC (Remote host closed the connection)
RichardG has joined #archiveteam-bs
[18:12]
RichardG_ has joined #archiveteam-bs
RichardG has quit IRC (Ping timeout: 370 seconds)
[18:24]
schbirid2 has joined #archiveteam-bs
schbirid has quit IRC (Read error: Operation timed out)
[18:41]
Odd0002 has quit IRC (Remote host closed the connection) [18:51]
<JAA> joepie91: I already archived him five days ago. :-) [18:56]
***RichardG_ has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
[18:58]
<HCross2> Kenshin: you around? [19:09]
***RichardG has quit IRC (Read error: No route to host)
RichardG has joined #archiveteam-bs
[19:10]
RichardG_ has joined #archiveteam-bs
RichardG has quit IRC (Ping timeout: 250 seconds)
RichardG_ has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
[19:18]
<Asparagir> Here are some posts that ArchiveTeam might find interesting, about Twitter archiving and massive data processing of the tweetstreams:
https://inkdroid.org/2017/08/15/utr/
and its follow-up post https://inkdroid.org/2017/08/18/delete-forensics/
It ran off a collection of 165,314 tweets, of which 16,492 (9.9%) were later deleted. Has interesting stats and musings.
[19:35]
***Odd0002 has joined #archiveteam-bs [19:47]
wp494 has quit IRC (Read error: Connection reset by peer) [20:00]
Odd0002 has quit IRC (Read error: Operation timed out) [20:10]
Odd0002 has joined #archiveteam-bs
Lotheric has quit IRC (Leaving)
[20:15]
............ (idle for 56mn)
wp494 has joined #archiveteam-bs [21:12]
ZexaronS has joined #archiveteam-bs [21:20]
.... (idle for 16mn)
TheLovina has quit IRC (Ping timeout: 370 seconds)
Famicoman has joined #archiveteam-bs
Stilett0 has quit IRC (Ping timeout: 250 seconds)
[21:36]
<godane> so i have uploaded 2527 items so far this month [21:55]
<Asparagir> Awesome! [22:01]
***Odd0002 has quit IRC (Remote host closed the connection) [22:06]
Odd0002 has joined #archiveteam-bs [22:12]
.... (idle for 16mn)
Odd0002_ has joined #archiveteam-bs
Odd0002 has quit IRC (Read error: Operation timed out)
Odd0002_ has quit IRC (Remote host closed the connection)
[22:28]
Odd0002 has joined #archiveteam-bs [22:40]
.... (idle for 16mn)
pie_ has quit IRC (Ping timeout: 246 seconds) [22:56]
....... (idle for 31mn)
Asparagir has quit IRC (Read error: Connection reset by peer)
Asparagir has joined #archiveteam-bs
[23:27]
..... (idle for 21mn)
dashcloud has quit IRC (Read error: Operation timed out) [23:50]
dashcloud has joined #archiveteam-bs
Ravenloft has joined #archiveteam-bs
[23:55]
<Ravenloft> http://www.blackfalcongames.net/?p=183 [23:55]
