Time |
Nickname |
Message |
00:00
π
|
|
fie has quit IRC (Ping timeout: 268 seconds) |
00:18
π
|
|
Asparagir has quit IRC (Asparagir) |
00:27
π
|
|
BlueMax has joined #archiveteam-bs |
00:28
π
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
00:38
π
|
|
Mateon1 has joined #archiveteam-bs |
00:40
π
|
|
Jusque has joined #archiveteam-bs |
00:44
π
|
|
Jusque_ has joined #archiveteam-bs |
00:47
π
|
|
fie has joined #archiveteam-bs |
00:54
π
|
|
Jusque has quit IRC (Read error: Operation timed out) |
00:54
π
|
|
Jusque_ is now known as Jusque |
01:11
π
|
|
kisspunch has quit IRC (Ping timeout: 260 seconds) |
01:14
π
|
|
kisspunch has joined #archiveteam-bs |
02:06
π
|
|
ta9le has quit IRC (Quit: Connection closed for inactivity) |
02:16
π
|
|
BlueMax has quit IRC (Leaving) |
02:54
π
|
|
BlueMax has joined #archiveteam-bs |
02:57
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
03:00
π
|
|
Sk1d has joined #archiveteam-bs |
03:19
π
|
|
qw3rty115 has joined #archiveteam-bs |
03:22
π
|
|
odemg has quit IRC (Ping timeout: 260 seconds) |
03:23
π
|
|
qw3rty114 has quit IRC (Read error: Operation timed out) |
03:34
π
|
|
odemg has joined #archiveteam-bs |
04:36
π
|
|
Asparagir has joined #archiveteam-bs |
04:37
π
|
|
svchfoo1 sets mode: +o Asparagir |
04:39
π
|
bsmith093 |
so, my fanfic dumps got dmca'd. |
04:39
π
|
bsmith093 |
just ffnet not ao3 |
04:51
π
|
tapedrive |
bsmith093: which item? I can still access https://archive.org/download/fanfictiondotnet_repack and https://archive.org/download/Fanfictiondotnet1011dump fine. |
04:52
π
|
bsmith093 |
those and the smaller 100k-links-at-a-time ones. |
04:52
π
|
bsmith093 |
should i just keeps going? |
04:52
π
|
bsmith093 |
*keep |
04:52
π
|
tapedrive |
Do you have them somewhere else? |
04:53
π
|
bsmith093 |
not really, once i found out i grabbed them asap, where's a good place? |
05:19
π
|
godane |
SketchCow: a lot of recently uploaded items got dark: https://archive.org/details/@seanpaulk |
05:19
π
|
godane |
there |
05:20
π
|
godane |
all i got from one of the items is this for why its dark: Darked using /home/jake/scripts/lock_delete_darke_user.py |
05:21
π
|
godane |
https://catalogd.archive.org/log/900209219 |
05:21
π
|
godane |
maybe the malware script darked them but not sure |
05:23
π
|
godane |
for all the items that got dark: http://archive.org/metamgr.php?&w_uploader=seanpaulk@hotmail.com&mode=more |
05:24
π
|
godane |
even the guys fav-seanpaulk collection got dark |
05:27
π
|
hook54321 |
godane: Is there a way for me to get a list of what items of mine are darked? |
05:28
π
|
godane |
using your email address for your account yes |
05:29
π
|
hook54321 |
I tried putting my address in there and still got "not authorized to access this service" |
05:30
π
|
godane |
what is your user name? |
05:30
π
|
hook54321 |
hook54321a, dm me what it says is darked though. |
05:35
π
|
godane |
my account size: size: 131,113,480,968 KB |
05:36
π
|
godane |
131.1TB |
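The conversion behind the figure above can be checked with a minimal Python sketch. It assumes the reported "KB" are decimal kilobytes (1,000 bytes each); the function name is only illustrative.

```python
def kb_to_tb(kilobytes, bytes_per_kb=1000):
    """Convert a size in kilobytes to decimal terabytes (10**12 bytes)."""
    return kilobytes * bytes_per_kb / 10**12


account_kb = 131_113_480_968  # account size quoted in the log

# Assumption: decimal kilobytes (1 KB = 1000 bytes).
print(round(kb_to_tb(account_kb), 1))

# Alternative reading: binary kibibytes (1 KiB = 1024 bytes).
print(round(kb_to_tb(account_kb, 1024), 1))
```

Under the decimal assumption the result rounds to 131.1, which matches the 131.1TB quoted above; if the unit were really KiB the total would be somewhat larger.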
05:37
π
|
hook54321 |
What exactly is the Meta-manager used for? |
05:38
π
|
godane |
i think maybe you can't see it cause you're not admin |
05:38
π
|
godane |
anyways i can see your items |
05:38
π
|
godane |
http://archive.org/metamgr.php?&w_uploader=hook54321a@gmail.com&mode=more |
05:39
π
|
godane |
anyways you got 10 items that are dark |
05:40
π
|
godane |
2 are in test_collectrion |
05:40
π
|
godane |
*test_collection |
05:51
π
|
godane |
my full item count include darked ones is : 1,416,872 |
05:55
π
|
|
DragonMon has joined #archiveteam-bs |
06:07
π
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
06:07
π
|
|
Mateon1 has joined #archiveteam-bs |
06:10
π
|
|
Asparagir has quit IRC (Asparagir) |
07:41
π
|
|
omglolbah has quit IRC (Ping timeout: 268 seconds) |
09:46
π
|
|
ta9le has joined #archiveteam-bs |
09:49
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
09:50
π
|
|
Sk1d has joined #archiveteam-bs |
10:09
π
|
|
elise has joined #archiveteam-bs |
10:12
π
|
|
elise has quit IRC (Quit: leaving) |
10:13
π
|
|
elise has joined #archiveteam-bs |
10:13
π
|
elise |
08:59 < elise> Hello archivers ! |
10:13
π
|
elise |
08:59 < elise> I have a question regarding youtube content |
10:13
π
|
elise |
09:00 < elise> more specifically metadata, not the videos content per se |
10:13
π
|
elise |
09:00 < elise> I saw that there is an ongoing effort regarding the videos, but is there something similar for metadata ? |
10:13
π
|
elise |
09:01 < elise> What are the tools used for the organization of metadata ? |
10:16
π
|
ivan |
would you like to start a project to archive all of youtube's metadata? I can provide design assistance |
10:16
π
|
ivan |
it's just a few billion public videos so it should fit on a server |
10:20
π
|
ivan |
well, maybe more than one if you're grabbing video descriptions and comments and such |
10:22
π
|
ivan |
keeping track of all this metadata would help a lot with figuring out what to archive and keeping track of unlisted videos |
10:30
π
|
|
omglolbah has joined #archiveteam-bs |
10:33
π
|
elise |
Sorry, I have not been clear : I am not interested in the video, only the metadata : channels, playlist, comments, descriptions, timestamps |
10:34
π
|
elise |
If nothing exists, I would like to do something on my own |
10:34
π
|
elise |
If something exists, I am curious about the implementation |
10:34
π
|
ivan |
I haven't heard of anyone archiving youtube metadata |
10:35
π
|
elise |
ok |
10:36
π
|
elise |
my raw idea would be to actually organize everything and be able to request resources the same way you query the YT API |
10:36
π
|
ivan |
is the youtube API any good? I would want to just run SQL queries against the database |
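To illustrate what "just run SQL queries against the database" could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and sample rows are all hypothetical; nobody in the channel describes an actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical video metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE videos ("
        " video_id TEXT PRIMARY KEY,"
        " channel_id TEXT NOT NULL,"
        " title TEXT,"
        " published_at TEXT)"
    )
    # Made-up sample rows, for illustration only.
    rows = [
        ("vid001", "chanA", "first upload", "2018-01-02T03:04:05Z"),
        ("vid002", "chanA", "second upload", "2018-02-03T04:05:06Z"),
        ("vid003", "chanB", "other channel", "2018-03-04T05:06:07Z"),
    ]
    conn.executemany("INSERT INTO videos VALUES (?, ?, ?, ?)", rows)
    return conn


def videos_per_channel(conn):
    """Count stored videos per channel, ordered by channel ID."""
    cur = conn.execute(
        "SELECT channel_id, COUNT(*) FROM videos"
        " GROUP BY channel_id ORDER BY channel_id"
    )
    return cur.fetchall()


conn = build_demo_db()
print(videos_per_channel(conn))
```

At the scale of billions of rows a single SQLite file may not be the right store; the sketch only shows the query style, not a storage recommendation.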
10:37
π
|
elise |
I am thinking about mirroring kind of so that the same code can be used against the archive |
10:38
π
|
hook54321 |
elise: Like, metadata of the videos? |
10:38
π
|
elise |
yes |
10:38
π
|
|
hook54321 sets mode: +o ivan |
10:38
π
|
elise |
well, it's broader, it's any resources that are not videos |
10:39
π
|
hook54321 |
tubeup (a common set of scripts that people oftentimes use to upload videos to archive.org) uploads at least some metadata relating to videos. |
10:44
π
|
elise |
I already have some bits of code to request resources from the API, I am currently thinking about how to organize all of this to query it easily |
10:45
π
|
elise |
also, wanted your input as archivers on my idea of mirroring |
10:45
π
|
elise |
I don't know if you would use this kind of solution to archive database that are accessible by a public API |
11:02
π
|
|
DragonMon has quit IRC (Quit: Leaving) |
11:05
π
|
ivan |
what kind of rate limits do you get on their public API? |
11:37
π
|
PurpleSym |
arkiver: I'll stop my grab then? |
11:37
π
|
arkiver |
PurpleSym: maybe wait until this project is successful |
11:37
π
|
PurpleSym |
Ok. |
11:37
π
|
arkiver |
in case something goes wrong |
11:37
π
|
PurpleSym |
My last estimate was ~5 TB, btw. |
11:37
π
|
arkiver |
5 TB is fine too |
11:39
π
|
arkiver |
all numbers before the URLs give me 722417136690 |
11:39
π
|
arkiver |
which is ~722GB |
11:39
π
|
arkiver |
assuming the number in front are sizes? |
11:40
π
|
PurpleSym |
No, the first number in audiourls.txt is the original song ID. |
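The mix-up above is easy to reproduce. The sketch below sums the first whitespace-separated field of each line, which is presumably how the 722417136690 figure came about. The line layout (number first, then URL) and the sample lines are assumptions; the log only says that the first number is the song ID.

```python
def sum_first_column(lines):
    """Sum the leading integer of every non-empty line."""
    total = 0
    for line in lines:
        parts = line.split()
        if parts:
            total += int(parts[0])
    return total


def bytes_to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return num_bytes / 10**9


# Hypothetical lines in the assumed "<number> <url>" layout.
sample = [
    "101 http://example.com/a.mp3",
    "205 http://example.com/b.mp3",
]
print(sum_first_column(sample))

# Reading the quoted total as bytes gives the ~722 GB estimate.
print(round(bytes_to_gb(722417136690)))
```

The sum is only a size estimate if the column actually holds byte counts; since it holds song IDs, the result says nothing about the size of the grab.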
11:40
π
|
arkiver |
ah oops |
11:41
π
|
arkiver |
but 5 TB should be fine too, yeah |
11:41
π
|
PurpleSym |
I'm not sure whether filenames are unique, so I'm saving to {id}.{ext} |
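A minimal sketch of the {id}.{ext} naming scheme, assuming the extension is taken from the path part of the download URL. The function name and the example URL are hypothetical and not taken from PurpleSym's actual script.

```python
import posixpath
from urllib.parse import urlsplit


def target_name(song_id, url):
    """Build '<id><ext>' from a song ID and the extension of the URL path."""
    path = urlsplit(url).path          # drops query string and fragment
    ext = posixpath.splitext(path)[1]  # includes the leading dot, or "" if none
    return f"{song_id}{ext}"


print(target_name(123, "http://example.com/songs/My%20Song.mp3?x=1"))
```

Because the ID is unique per song, two uploads with the same original filename end up under different names.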
11:41
π
|
arkiver |
I see |
11:44
π
|
PurpleSym |
How many files did you combine into one item, arkiver ? |
11:45
π
|
arkiver |
5 URLs/item |
11:45
π
|
arkiver |
all items are in the tracker now |
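Grouping URLs into tracker items of five can be sketched as follows. This is only an illustration of the batching idea; how the tracker items were really generated is not stated in the log.

```python
def chunk(urls, size=5):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Hypothetical URLs, for illustration only.
urls = [f"http://example.com/song/{n}" for n in range(12)]
items = chunk(urls)
print(len(items), [len(item) for item in items])
```

The last item is simply shorter when the URL count is not a multiple of five.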
11:49
π
|
PurpleSym |
Any chance I can get access to the rsync target once the grab finished? Moving 5 TB from IA to Europe and back is quite painful. |
11:49
π
|
arkiver |
it's already going to IA now |
11:49
π
|
arkiver |
to FOS |
11:50
π
|
PurpleSym |
Yeah, would be easier if I extracted the WARCs directly on FOS to upload individual items. (If that's still something we want.) |
11:51
π
|
arkiver |
I could let a script run on IA servers to process the WARCs when they are in IA |
11:51
π
|
arkiver |
as in when they are in items on IA |
11:51
π
|
PurpleSym |
Sure, whatever works. |
11:52
π
|
arkiver |
I think we can sort something out |
11:52
π
|
PurpleSym |
Are you an IA employee nowadays, arkiver ? |
11:54
π
|
arkiver |
yep |
12:05
π
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
12:07
π
|
|
BlueMax has joined #archiveteam-bs |
12:08
π
|
|
DragonMon has joined #archiveteam-bs |
12:39
π
|
|
elise has quit IRC (hub.efnet.us irc.efnet.nl) |
12:39
π
|
|
godane has quit IRC (hub.efnet.us irc.efnet.nl) |
12:43
π
|
|
elise_ has joined #archiveteam-bs |
13:01
π
|
|
godane has joined #archiveteam-bs |
13:01
π
|
|
svchfoo1 sets mode: +o godane |
13:12
π
|
|
schbirid has joined #archiveteam-bs |
14:00
π
|
elise_ |
ivan: 1 million reads per day |
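A rough back-of-the-envelope sketch of what that quota means. The video count of 2 billion is purely an assumption standing in for "a few billion", and one read per video is also assumed; neither figure comes from the log.

```python
def days_to_fetch(total_items, reads_per_day, reads_per_item=1):
    """Whole days needed to read every item once under a daily quota."""
    total_reads = total_items * reads_per_item
    return -(-total_reads // reads_per_day)  # integer ceiling division


ASSUMED_VIDEOS = 2_000_000_000  # assumption, not a measured number
QUOTA_PER_DAY = 1_000_000       # figure quoted in the log

print(days_to_fetch(ASSUMED_VIDEOS, QUOTA_PER_DAY))
```

Under these assumptions a single API key would need thousands of days, so a full metadata crawl would presumably need many keys or a different access path.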
14:05
π
|
|
SilSte has joined #archiveteam-bs |
14:16
π
|
arkiver |
What do you think of archiving websites where images can be stored? |
14:16
π
|
arkiver |
For example, websites used to host images on forums, etc. |
14:16
π
|
arkiver |
Also for websites that are not currently in dange |
14:16
π
|
arkiver |
danger* |
14:19
π
|
tapedrive |
It would be great if it happened. |
14:20
π
|
tapedrive |
I know most of the free services used delete images after a while. |
14:50
π
|
|
BlueMax has quit IRC (Leaving) |
15:09
π
|
tyzoid |
PurpleSym: I'm uploading my downloads to https://archive.org/download/tyzoid-acidplanet-audio |
15:10
π
|
PurpleSym |
Ok, tyzoid. arkiver: β |
15:20
π
|
tyzoid |
arkiver / tapedrive: There was that one image hosting service that was really popular ~2002, which expires images after a while |
15:20
π
|
tyzoid |
can't remember the name of the service, but I think it's a good idea - especially on internet message boards |
15:21
π
|
tyzoid |
blogs / etc, I'm not as concerned with the images, since they tend to be hosted first-party. |
15:21
π
|
tyzoid |
on a forum, it's almost always 3rd party |
15:23
π
|
eientei95 |
Photobucket? |
15:24
π
|
tyzoid |
I think so |
15:31
π
|
tyzoid |
PurpleSym: Do you know if there's a way to get wget to remove the leftover files once they've been added to the warc? |
15:32
π
|
tyzoid |
or is it a matter of just pausing the download, cleaning it up, and restarting it? |
15:32
π
|
PurpleSym |
Not sure, -O /dev/null perhaps? |
15:34
π
|
tyzoid |
Maybe, we'll see. Btw, I'm uploading in 1gb warc chunks |
15:37
π
|
arkiver |
I'm not sure if there already is one, but we could start listing image hosters |
15:38
π
|
tyzoid |
Do we not have a category on archiveteam.org for it? |
15:40
π
|
tyzoid |
Unrelated question: Anyone here have wiki adminship? The password recovery tool says there's no email for 'tyzoid', but account creation says the username 'tyzoid' is already in use. |
15:42
π
|
JAA |
tyzoid, PurpleSym: --delete-after is the option you're looking for. |
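A minimal sketch of how the suggested option could be combined with WARC output, written as a Python helper that only builds the command line. `--input-file`, `--warc-file`, `--warc-max-size` and `--delete-after` are standard GNU wget options; the file names and the 1G chunk size are placeholders, and this is not the command tyzoid actually ran.

```python
import shlex


def build_wget_args(url_list_path, warc_prefix, warc_max_size="1G"):
    """Build a wget command that writes WARCs and deletes the plain files."""
    return [
        "wget",
        f"--input-file={url_list_path}",   # read URLs from a file
        f"--warc-file={warc_prefix}",      # write responses to WARC files
        f"--warc-max-size={warc_max_size}",  # split WARCs at roughly this size
        "--delete-after",                  # remove each file once downloaded
    ]


# Placeholder names; nothing is executed here.
args = build_wget_args("urls.txt", "my-grab")
print(shlex.join(args))
```

The helper prints the command instead of running it, so it can be inspected before being handed to `subprocess.run`.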
15:43
π
|
tyzoid |
huh, nice. |
16:05
π
|
|
DoragonMo has joined #archiveteam-bs |
16:06
π
|
|
DoragonMo has quit IRC (Remote host closed the connection) |
16:17
π
|
odemg |
https://pastebin.com/raw/Gh2xbSbG |
16:18
π
|
|
SilSte has quit IRC (Read error: Operation timed out) |
16:33
π
|
|
HCross has quit IRC (Read error: Connection reset by peer) |
16:34
π
|
|
HCross has joined #archiveteam-bs |
16:51
π
|
|
Asparagir has joined #archiveteam-bs |
16:52
π
|
|
svchfoo1 sets mode: +o Asparagir |
16:52
π
|
tyzoid |
arkiver: Should I stop my download, then? |
16:52
π
|
tyzoid |
now that we've got a warrior project? |
16:56
π
|
tyzoid |
odemg: Once it uploads, it'll be viewable here; https://archive.fart.website/archivebot/viewer/job/36ynv |
16:57
π
|
odemg |
tyzoid, I also stuck it here; https://the-eye.eu/public/Random/psvita_devnetleakv1/ |
17:09
π
|
odemg |
nobody even told me about acid, but I'm getting |
17:09
π
|
odemg |
configure: error: lua not found |
17:09
π
|
odemg |
wget-lua not successfully built. |
17:11
π
|
odemg |
presuming, liblua5.1-dev will fix... |
17:11
π
|
odemg |
trying |
17:12
π
|
lindalap |
"I'll be shutting things down and upgrading tonight; aiming to have the plugin itself installed in the next couple days." https://community.tulpa.info/thread-gdpr |
17:12
π
|
lindalap |
Already grabbed the whole forum few days ago. |
17:12
π
|
lindalap |
More precise link: https://community.tulpa.info/thread-gdpr?pid=203752#pid203752 |
17:18
π
|
odemg |
tyzoid, arkiver channel for acid? |
17:19
π
|
odemg |
If we don't have #acidburns ? |
17:19
π
|
Aoede |
<tyzoid> Aoede / arkiver: I'm holding #acidrain, but I don't really see a need to use it |
17:20
π
|
odemg |
okay, I need pizza then will fire up a few more machines for this |
19:02
π
|
Jens |
"Process RsyncUpload returned exit code 5 for Item[...blah...]" |
19:02
π
|
Jens |
What's failing here? |
19:03
π
|
Jens |
Disregard that... "max connections (120) reached" |
19:03
π
|
Jens |
I just don't know on what end. |
19:15
π
|
|
wp494 has quit IRC (Ping timeout: 260 seconds) |
19:16
π
|
|
wp494 has joined #archiveteam-bs |
19:16
π
|
|
svchfoo1 sets mode: +o wp494 |
19:16
π
|
|
Asparagir has quit IRC (Asparagir) |
19:19
π
|
|
wp494 has quit IRC (Client Quit) |
19:20
π
|
JAA |
Jens: That's almost certainly because the connection limit on the rsync target (i.e. FOS) has been reached. Specifically, "ERROR: max connections (120) reached -- try again later". |
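Since the target asks clients to "try again later", one way to handle this is to retry with a growing delay whenever rsync exits with code 5. The sketch below is a generic illustration with hypothetical names; it is not how pipeline.py handles the error.

```python
import subprocess
import time

RSYNC_START_PROTOCOL_ERROR = 5  # exit code quoted in the log


def upload_with_retry(run, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `run()` until it stops returning exit code 5 or attempts run out.

    `run` is any callable returning an exit code. The delay doubles after
    each failed attempt (exponential backoff).
    """
    code = RSYNC_START_PROTOCOL_ERROR
    for attempt in range(1, max_attempts + 1):
        code = run()
        if code != RSYNC_START_PROTOCOL_ERROR:
            return code
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    return code


def run_rsync(source, target):
    """Run rsync once and return its exit code (paths are placeholders)."""
    return subprocess.run(["rsync", "-a", source, target]).returncode
```

A call would look like `upload_with_retry(lambda: run_rsync("item.warc.gz", "rsync://example.invalid/module/"))`, where the target URL is made up. Other exit codes are returned immediately, because retrying does not help with, say, a missing source file.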
19:20
π
|
|
wp494 has joined #archiveteam-bs |
19:21
π
|
JAA |
Reading the output of pipeline.py isn't always easy because everything from all processes is interleaved and you often don't really know where which line is coming from. |
19:21
π
|
JAA |
Plus rsync's --progress messes it up even further because you end up with multiple things on the same line. |
19:21
π
|
|
svchfoo1 sets mode: +o wp494 |
19:26
π
|
odemg |
Are there any more items after this round or? |
19:28
π
|
|
chirlu has quit IRC (Ping timeout: 255 seconds) |
19:36
π
|
Jens |
FOS has terrible transfer speed to EU. |
19:37
π
|
Jens |
I'm getting <1 MB/s from Germany. |
19:37
π
|
Jens |
>10 MB/s from communist California. |
19:40
π
|
JAA |
What else is new? |
19:41
π
|
Jens |
Well, it's new to me. FOS hasn't really been the bottleneck in other projects I've done. |
19:46
π
|
Jens |
It's interesting how each project is bottlenecked. My SF VM is now CPU limited. |
19:49
π
|
|
t2t2 has quit IRC (Remote host closed the connection) |
19:49
π
|
Jens |
By wget-lua it seems. |
19:50
π
|
Jens |
http://www.goatse.sx/img/acid.png |
19:50
π
|
Jens |
And rsync. |
19:57
π
|
|
t2t2 has joined #archiveteam-bs |
20:02
π
|
|
schbirid has quit IRC (Quit: Leaving) |
20:50
π
|
godane |
so i have like 5 tapes to digitize from what i got from ebay |
20:50
π
|
godane |
so i did like 20 tapes since tuesday |
21:00
π
|
odemg |
Jens, we don't always sync to FOS |
21:06
π
|
|
DragonMon has quit IRC (se.hub irc.underworld.no) |
21:06
π
|
|
i0npulse has quit IRC (se.hub irc.underworld.no) |
21:06
π
|
|
medowar has quit IRC (se.hub irc.underworld.no) |
21:06
π
|
|
Jens has quit IRC (se.hub irc.underworld.no) |
21:06
π
|
|
PurpleSym has quit IRC (se.hub irc.underworld.no) |
21:06
π
|
|
decay__ has quit IRC (se.hub irc.underworld.no) |
21:11
π
|
odemg |
arkiver, archive image hosts, I've been doing 1TB/day of imgur for the last 4 months, is that worth putting on ia? |
21:11
π
|
odemg |
I've been hashing everything for my own search by image tool |
21:13
π
|
odemg |
expanded upon this, https://github.com/4pr0n/irarchives I'll host it again at some point, I let that project die |
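The log does not say which kind of hash odemg uses. As a simplified stand-in, the sketch below indexes files by SHA-256, which only finds byte-identical copies; a real search-by-image tool would more likely use a perceptual hash so that resized or recompressed images still match. All names and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_hash_index(files):
    """Map SHA-256 hex digest -> list of names whose bytes are identical.

    `files` is an iterable of (name, bytes) pairs.
    """
    index = defaultdict(list)
    for name, data in files:
        index[hashlib.sha256(data).hexdigest()].append(name)
    return index


def find_matches(index, data):
    """Return the names of indexed files with exactly the same bytes."""
    return index.get(hashlib.sha256(data).hexdigest(), [])


# Made-up byte strings standing in for image files.
index = build_hash_index([
    ("a.png", b"image-bytes-1"),
    ("b.png", b"image-bytes-1"),
    ("c.png", b"image-bytes-2"),
])
print(find_matches(index, b"image-bytes-1"))
```

Exact hashing is cheap and also useful for deduplicating a large grab before upload, which is why it is shown here even though it is weaker than similarity search.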
21:19
π
|
|
decay_ has joined #archiveteam-bs |
21:28
π
|
|
Jens has joined #archiveteam-bs |
21:28
π
|
|
medowar has joined #archiveteam-bs |
21:28
π
|
|
PurpleSym has joined #archiveteam-bs |
21:28
π
|
|
decay__ has joined #archiveteam-bs |
21:28
π
|
|
Jens has quit IRC (Ping timeout: 252 seconds) |
21:28
π
|
|
decay__ has quit IRC (Ping timeout: 252 seconds) |
21:28
π
|
|
medowar has quit IRC (Ping timeout: 252 seconds) |
21:28
π
|
|
PurpleSym has left |
21:29
π
|
|
medowar has joined #archiveteam-bs |
21:30
π
|
|
i0npulse has joined #archiveteam-bs |
21:40
π
|
|
Jens has joined #archiveteam-bs |
22:14
π
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
22:19
π
|
|
Sk1d has joined #archiveteam-bs |
22:20
π
|
|
bmcginty has quit IRC (Read error: Operation timed out) |
22:41
π
|
odemg |
arkiver, no fucking around on this one (acid) any more items after this? |
23:28
π
|
|
BlueMax has joined #archiveteam-bs |
23:36
π
|
|
bmcginty has joined #archiveteam-bs |
23:36
π
|
|
bmcginty has quit IRC (Connection closed) |
23:38
π
|
godane |
SketchCow: did you ever upload Wizard Magazine? |
23:39
π
|
godane |
i only have it cause i found it here: http://empire-dcp-minutemen-scanss.blogspot.com/2015/11/wizard-price-guide-magazine-1991-2011.html |
23:39
π
|
godane |
but can't find it on archive.org, not sure if you did upload it and then it went dark |
23:39
π
|
godane |
anyways grabbing for my private collection |