Time |
Nickname |
Message |
00:00
🔗
|
|
acridAxid has joined #archiveteam-bs |
00:58
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
01:43
🔗
|
|
Raccoon has quit IRC (Remote host closed the connection) |
02:14
🔗
|
|
HashbangI has quit IRC (Remote host closed the connection) |
02:25
🔗
|
|
HashbangI has joined #archiveteam-bs |
02:42
🔗
|
|
ivan- has joined #archiveteam-bs |
02:42
🔗
|
|
Fusl____ sets mode: +o ivan- |
02:42
🔗
|
|
Fusl sets mode: +o ivan- |
02:42
🔗
|
|
Fusl_ sets mode: +o ivan- |
02:49
🔗
|
|
godane has quit IRC (Ping timeout: 360 seconds) |
02:52
🔗
|
|
ivan_ has quit IRC (Ping timeout: 746 seconds) |
03:00
🔗
|
|
ivan- is now known as ivan_ |
03:15
🔗
|
|
godane has joined #archiveteam-bs |
03:26
🔗
|
|
qw3rty has joined #archiveteam-bs |
03:34
🔗
|
|
odemgi_ has joined #archiveteam-bs |
03:35
🔗
|
|
qw3rty2 has quit IRC (Ping timeout: 745 seconds) |
03:36
🔗
|
|
odemgi has quit IRC (Ping timeout: 252 seconds) |
03:47
🔗
|
|
DopefishJ has joined #archiveteam-bs |
03:51
🔗
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
03:51
🔗
|
|
DFJustin has quit IRC (Ping timeout: 745 seconds) |
04:00
🔗
|
|
SynMonger has quit IRC (Quit: Wait, what?) |
04:01
🔗
|
|
SynMonger has joined #archiveteam-bs |
04:56
🔗
|
|
Raccoon has joined #archiveteam-bs |
05:15
🔗
|
|
BlueMax has joined #archiveteam-bs |
08:03
🔗
|
|
VADemon has quit IRC (Read error: Connection reset by peer) |
08:09
🔗
|
|
schbirid has joined #archiveteam-bs |
09:33
🔗
|
|
zhongfu has quit IRC (Quit: cya losers) |
09:37
🔗
|
|
zhongfu has joined #archiveteam-bs |
10:32
🔗
|
|
kiskabak has quit IRC (Ping timeout (120 seconds)) |
10:33
🔗
|
|
kiskabak has joined #archiveteam-bs |
10:33
🔗
|
|
Fusl____ sets mode: +o kiskabak |
10:33
🔗
|
|
Fusl sets mode: +o kiskabak |
10:33
🔗
|
|
Fusl_ sets mode: +o kiskabak |
10:33
🔗
|
|
TC01_ has quit IRC (Read error: Operation timed out) |
10:33
🔗
|
|
dxrt has quit IRC (Remote host closed the connection) |
10:33
🔗
|
|
dxrt has joined #archiveteam-bs |
10:33
🔗
|
|
Fusl____ sets mode: +o dxrt |
10:33
🔗
|
|
TC01 has joined #archiveteam-bs |
10:33
🔗
|
|
Fusl sets mode: +o dxrt |
10:33
🔗
|
|
Fusl_ sets mode: +o dxrt |
10:33
🔗
|
|
asdf0101 has quit IRC (Read error: Operation timed out) |
10:34
🔗
|
|
asdf0101 has joined #archiveteam-bs |
10:46
🔗
|
JAA |
Soo, I have a size estimate for picosong: 72 million requests, 9.9 TB rx, 4.9 TB WARCs |
10:49
🔗
|
JAA |
Test for random 1 ‰ of the IDs took 9 minutes, so if they let me run at this speed continuously, the whole thing would take about 6 days. |
10:49
🔗
|
|
m007a83_ has joined #archiveteam-bs |
10:51
🔗
|
|
m007a83 has quit IRC (Ping timeout: 252 seconds) |
10:53
🔗
|
JAA |
Oh, actually, some of the downloads failed because I got banned already from the S3 bucket. So it'll be a bit larger and might not work at that speed. |
10:55
🔗
|
JAA |
1358 of the 29822 attempted song IDs exist. |
10:58
🔗
|
JAA |
Didn't get banned from S3, just some random requests failed with a 403 for whatever reason. I actually saw that earlier in tests but couldn't reproduce it. Maybe I need to add a delay there. |
10:58
🔗
|
JAA |
Will look into this further in the evening. |
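The extrapolation in JAA's messages above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the 1 ‰ sample was uniformly random over the ID space and that the crawl could sustain the sample's speed, which JAA himself doubts. The function and constant names are illustrative and do not come from any tool mentioned in the log.

```python
# Figures quoted in the log; all names here are hypothetical.
SCALE = 1000                 # the sample covered 1 per mille of the IDs
SAMPLE_MINUTES = 9           # wall-clock time of the sample run
SAMPLE_ATTEMPTED = 29822     # song IDs attempted in the sample
SAMPLE_EXISTING = 1358       # IDs in the sample that actually exist


def full_run_days(sample_minutes: float, scale: int) -> float:
    """Scale the sample duration up to the full ID space, in days."""
    return sample_minutes * scale / 60 / 24


def hit_rate(existing: int, attempted: int) -> float:
    """Fraction of attempted IDs that resolve to a song."""
    return existing / attempted


def estimated_existing(existing: int, scale: int) -> int:
    """Rough count of existing songs, assuming a representative sample."""
    return existing * scale


print(f"full run: {full_run_days(SAMPLE_MINUTES, SCALE):.2f} days")
print(f"hit rate: {hit_rate(SAMPLE_EXISTING, SAMPLE_ATTEMPTED):.2%}")
print(f"estimated existing songs: {estimated_existing(SAMPLE_EXISTING, SCALE)}")
```

The 6.25 days this yields matches the "about 6 days" in the log; failed requests or an added delay would push the real figure higher.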
11:46
🔗
|
arkiver |
ah forgot about that one |
11:46
🔗
|
arkiver |
JAA: I see deadline is at october 21 |
11:48
🔗
|
arkiver |
JAA: can you please add the info to https://www.archiveteam.org/index.php?title=Picosong ? |
12:12
🔗
|
|
themadpr0 has joined #archiveteam-bs |
12:13
🔗
|
themadpr0 |
hey hey hey |
12:16
🔗
|
themadpr0 |
I'm planning on starting a circle at my school for teaching people about web archiving |
12:17
🔗
|
themadpr0 |
Any tips for what might make for some good topics? |
12:34
🔗
|
markedL |
I don't have the examples, but it would be interesting to go into examples of unexpected uses by researchers. |
12:39
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
12:43
🔗
|
asie |
this, and also just teaching people that (a) they should be archiving stuff they value (and why), and (b) how to do it |
12:58
🔗
|
themadpr0 |
asie yeah that's basically what I'm going for |
12:59
🔗
|
themadpr0 |
though I do plan to also throw in some discussion on 'why some stuff in particular doesn't stay on the internet forever' (censorship, digital distribution etc.) |
13:00
🔗
|
themadpr0 |
but again, I'm looking for some nice openers |
13:00
🔗
|
themadpr0 |
the WHY |
13:01
🔗
|
themadpr0 |
I'm going to get a lot of "how much money is there in this" and "aren't there people already doing this for us" or "maybe it's for the best that these do vanish" |
13:02
🔗
|
themadpr0 |
so how do I keep people entertained long enough to avoid derailing (o-o) |
13:02
🔗
|
asie |
you won't |
13:02
🔗
|
asie |
you'd need to be a high-grade comedian to sustain an audience for that long |
13:02
🔗
|
asie |
let alone make them care |
13:03
🔗
|
Kaz |
the 'why' is different for everyone |
13:17
🔗
|
markedL |
Most people know the Nazis' burning of books and art, but it's portrayed as an anomaly when it's not unusual. If you're concerned about motivation, consider making it interactive. Using a news|games|music|video|art|photo trove, let people find things that would excite them. |
13:30
🔗
|
kiska |
themadpr0: Make a tweet on twitter, use the save page now feature, then delete the tweet in class/your session. Pull up wayback machine and show your tweet is still there |
13:31
🔗
|
kiska |
That was how I got some people interested in archiving stuff |
14:02
🔗
|
|
DogsRNice has joined #archiveteam-bs |
15:31
🔗
|
|
h3ndr1k has quit IRC (Quit: ) |
15:31
🔗
|
|
h3ndr1k has joined #archiveteam-bs |
15:48
🔗
|
|
themadpr0 has quit IRC (Ping timeout: 260 seconds) |
16:54
🔗
|
|
bluefoo has joined #archiveteam-bs |
17:32
🔗
|
Dash |
JAA: can your WARC grabber bypass cloudflare's email protection? |
17:32
🔗
|
Dash |
picosong has a lot of songs that cloudflare thinks have emails in the filename |
17:34
🔗
|
Dash |
and how'd you get banned from their S3 bucket? I was able to grab single-threaded for a solid week or two and not get nabbed |
17:49
🔗
|
markedL |
on the wiki, what's the difference between Project Status and Archiving Status ? |
18:03
🔗
|
JAA |
Dash: I didn't get banned, but some downloads returned 403s. Do you have an example for such an "email" file? |
18:04
🔗
|
JAA |
markedL: "Project Status" is the status of the website/service/whatever, i.e. whether it's still online, endangered, or offline. "Archiving Status" is the status of the archival, e.g. saved/upcoming/not saved yet. |
18:44
🔗
|
Dash |
JAA: I suppose most of the issue would be in the pages themselves, but look here: https://picosong.com/S9mc/ |
18:44
🔗
|
Dash |
If you view source, the filename shows up as SLOW DEATH [email protected] |
18:53
🔗
|
|
bluefoo has quit IRC (Ping timeout: 252 seconds) |
19:11
🔗
|
JAA |
Dash: Ah, yeah, that's just decoded by some JS. There's nothing special needed for it. |
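For readers wondering what "decoded by some JS" involves: the sketch below shows the XOR scheme that Cloudflare's email obfuscation is commonly described as using, where the protected text is stored as a hex string whose first byte is the key and every following byte is the plaintext byte XORed with that key. This is an assumption based on public descriptions, not something confirmed in the log or checked against picosong's pages, and the function names are hypothetical.

```python
def decode_cfemail(encoded_hex: str) -> str:
    """Decode a hex string in the assumed format: key byte, then XORed bytes."""
    data = bytes.fromhex(encoded_hex)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def encode_cfemail(text: str, key: int = 0x5A) -> str:
    """Inverse of decode_cfemail; included only to make the example testable."""
    raw = text.encode("utf-8")
    return bytes([key] + [b ^ key for b in raw]).hex()


# Round trip with a made-up filename fragment (not taken from picosong).
sample = encode_cfemail("track@live.mp3")
print(sample)
print(decode_cfemail(sample))
```

Because decoding happens client-side, the obfuscated value is all that appears in the served HTML, which fits JAA's remark that nothing special is needed to capture the page.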
19:17
🔗
|
|
bluefoo has joined #archiveteam-bs |
19:34
🔗
|
|
JAA sets mode: -b *!*ShellyRol@*.hsd1.wa.comcast.net |
19:48
🔗
|
|
icedice has joined #archiveteam-bs |
19:57
🔗
|
|
HashbangI has quit IRC (Remote host closed the connection) |
20:05
🔗
|
|
HashbangI has joined #archiveteam-bs |
20:48
🔗
|
|
icedice has quit IRC (Ping timeout: 252 seconds) |
20:51
🔗
|
|
icedice has joined #archiveteam-bs |
20:52
🔗
|
|
icedice has quit IRC (Connection closed) |
20:52
🔗
|
|
icedice has joined #archiveteam-bs |
20:52
🔗
|
|
icedice has quit IRC (Connection closed) |
21:03
🔗
|
|
icedice has joined #archiveteam-bs |
21:03
🔗
|
|
icedice has quit IRC (Connection closed) |
21:03
🔗
|
|
icedice has joined #archiveteam-bs |
21:24
🔗
|
|
killsushi has joined #archiveteam-bs |
21:28
🔗
|
|
BlueMax has joined #archiveteam-bs |
22:33
🔗
|
|
Shen has quit IRC (Quit: wheeee) |
22:40
🔗
|
|
Shen has joined #archiveteam-bs |
23:03
🔗
|
|
schbirid has quit IRC (Remote host closed the connection) |
23:55
🔗
|
|
icedice has quit IRC (Read error: Operation timed out) |