[00:00] *** acridAxid has joined #archiveteam-bs
[00:58] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[01:43] *** Raccoon has quit IRC (Remote host closed the connection)
[02:14] *** HashbangI has quit IRC (Remote host closed the connection)
[02:25] *** HashbangI has joined #archiveteam-bs
[02:42] *** ivan- has joined #archiveteam-bs
[02:42] *** Fusl____ sets mode: +o ivan-
[02:42] *** Fusl sets mode: +o ivan-
[02:42] *** Fusl_ sets mode: +o ivan-
[02:49] *** godane has quit IRC (Ping timeout: 360 seconds)
[02:52] *** ivan_ has quit IRC (Ping timeout: 746 seconds)
[03:00] *** ivan- is now known as ivan_
[03:15] *** godane has joined #archiveteam-bs
[03:26] *** qw3rty has joined #archiveteam-bs
[03:34] *** odemgi_ has joined #archiveteam-bs
[03:35] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[03:36] *** odemgi has quit IRC (Ping timeout: 252 seconds)
[03:47] *** DopefishJ has joined #archiveteam-bs
[03:51] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[03:51] *** DFJustin has quit IRC (Ping timeout: 745 seconds)
[04:00] *** SynMonger has quit IRC (Quit: Wait, what?)
[04:01] *** SynMonger has joined #archiveteam-bs
[04:56] *** Raccoon has joined #archiveteam-bs
[05:15] *** BlueMax has joined #archiveteam-bs
[08:03] *** VADemon has quit IRC (Read error: Connection reset by peer)
[08:09] *** schbirid has joined #archiveteam-bs
[09:33] *** zhongfu has quit IRC (Quit: cya losers)
[09:37] *** zhongfu has joined #archiveteam-bs
[10:32] *** kiskabak has quit IRC (Ping timeout (120 seconds))
[10:33] *** kiskabak has joined #archiveteam-bs
[10:33] *** Fusl____ sets mode: +o kiskabak
[10:33] *** Fusl sets mode: +o kiskabak
[10:33] *** Fusl_ sets mode: +o kiskabak
[10:33] *** TC01_ has quit IRC (Read error: Operation timed out)
[10:33] *** dxrt has quit IRC (Remote host closed the connection)
[10:33] *** dxrt has joined #archiveteam-bs
[10:33] *** Fusl____ sets mode: +o dxrt
[10:33] *** TC01 has joined #archiveteam-bs
[10:33] *** Fusl sets mode: +o dxrt
[10:33] *** Fusl_ sets mode: +o dxrt
[10:33] *** asdf0101 has quit IRC (Read error: Operation timed out)
[10:34] *** asdf0101 has joined #archiveteam-bs
[10:46] <JAA> Soo, I have a size estimate for picosong: 72 million requests, 9.9 TB rx, 4.9 TB WARCs
[10:49] <JAA> Test for random 1 ‰ of the IDs took 9 minutes, so if they let me run at this speed continuously, the whole thing would take about 6 days.
[10:49] *** m007a83_ has joined #archiveteam-bs
[10:51] *** m007a83 has quit IRC (Ping timeout: 252 seconds)
[10:53] <JAA> Oh, actually, some of the downloads failed because I got banned already from the S3 bucket. So it'll be a bit larger and might not work at that speed.
[10:55] <JAA> 1358 of the 29822 attempted song IDs exist.
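The figures JAA quotes above can be scaled up with simple arithmetic. The following is a minimal sketch, assuming the 1 ‰ sample is representative of the whole ID space and that the crawl speed stays constant; the function names are hypothetical and not part of any grabber.

```python
# Numbers taken from the log: a random 1 per mille sample of IDs took
# 9 minutes, and 1358 of the 29822 attempted IDs existed.
SAMPLE_PER_MILLE = 1
SAMPLE_MINUTES = 9
SAMPLE_IDS = 29822
SAMPLE_HITS = 1358


def full_run_days(sample_minutes, sample_per_mille):
    """Scale the sample duration to the full ID space, in days."""
    total_minutes = sample_minutes * 1000 / sample_per_mille
    return total_minutes / (60 * 24)


def scale_to_full(sample_count, sample_per_mille):
    """Scale a count observed in the sample to the full ID space."""
    return sample_count * 1000 // sample_per_mille


# Fraction of attempted IDs that exist (assumed to hold for all IDs).
hit_rate = SAMPLE_HITS / SAMPLE_IDS

print(f"full run: {full_run_days(SAMPLE_MINUTES, SAMPLE_PER_MILLE):.2f} days")
print(f"ID space: about {scale_to_full(SAMPLE_IDS, SAMPLE_PER_MILLE):,} IDs")
print(f"existing: about {scale_to_full(SAMPLE_HITS, SAMPLE_PER_MILLE):,} songs")
print(f"hit rate: {hit_rate:.1%}")
```

Under these assumptions the run comes out at 6.25 days, which matches the "about 6 days" estimate, with roughly 1.36 million existing songs. Failed downloads, such as the 403s mentioned in the log, would push the real numbers up.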
[10:58] <JAA> Didn't get banned from S3, just some random requests failed with a 403 for whatever reason. I actually saw that earlier in tests but couldn't reproduce it. Maybe I need to add a delay there.
[10:58] <JAA> Will look into this further in the evening.
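The "add a delay" idea for the sporadic 403s could look roughly like the sketch below. It is not JAA's actual grabber: `fetch_with_retry`, its parameters, and the `fetch` callable (assumed to return a `(status, body)` pair) are all hypothetical, and whether the S3 bucket responds well to backoff is untested.

```python
import time


def fetch_with_retry(fetch, url, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) and retry with exponential backoff while it returns 403.

    fetch is any callable returning (status, body); it is a stand-in for
    whatever HTTP client the grabber really uses (an assumption).
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status != 403:
            return status, body
        # Wait 1s, 2s, 4s, ... before the next attempt, but not after the last.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still 403 after all attempts: hand the last response back to the caller.
    return status, body
```

Passing `sleep` as a parameter keeps the function testable without real waiting; a caller would normally leave it at the default.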
[11:46] <arkiver> ah forgot about that one
[11:46] <arkiver> JAA: I see deadline is at october 21
[11:48] <arkiver> JAA: can you please add the info to https://www.archiveteam.org/index.php?title=Picosong ?
[12:12] *** themadpr0 has joined #archiveteam-bs
[12:13] <themadpr0> hey hey hey
[12:16] <themadpr0> I'm planning on starting a circle at my school for teaching people about web archiving
[12:17] <themadpr0> Any tips for what might make for some good topics?
[12:34] <markedL> I don't have the examples, but it would be interesting to go into examples of unexpected uses by researchers.
[12:39] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[12:43] <asie> this, and also just teaching people that (a) they should be archiving stuff they value (and why), and (b) how to do it
[12:58] <themadpr0> asie yeah that's basically what I'm going for
[12:59] <themadpr0> though I do plan to also throw in some discussion on 'why some stuff in particular doesn't stay on the internet forever' (censorship, digital distribution etc.)
[13:00] <themadpr0> but again, I'm looking for some nice openers
[13:00] <themadpr0> the WHY 
[13:01] <themadpr0> I'm going to get a lot of "how much money is there in this" and "aren't there people already doing this for us" or "maybe it's for the best that these do vanish"
[13:02] <themadpr0> so how do I keep people entertained long enough to avoid derailing (o-o)
[13:02] <asie> you won't
[13:02] <asie> you'd need to be a high-grade comedian to sustain an audience for that long
[13:02] <asie> let alone make them care
[13:03] <Kaz> the 'why' is different for everyone
[13:17] <markedL> Most people know the Nazis' burning of books and art but it's portrayed as an anomaly when it's not unusual.  If you're concerned about motivation, consider making it interactive.  Using a news|games|music|video|art|photo trove let people find things that would excite them.
[13:30] <kiska> themadpr0: Make a tweet on twitter, use the save page now feature, then delete the tweet in class/your session. Pull up wayback machine and show your tweet is still there
[13:31] <kiska> That was how I got some people interested in archiving stuff
[14:02] *** DogsRNice has joined #archiveteam-bs
[15:31] *** h3ndr1k has quit IRC (Quit:  )
[15:31] *** h3ndr1k has joined #archiveteam-bs
[15:48] *** themadpr0 has quit IRC (Ping timeout: 260 seconds)
[16:54] *** bluefoo has joined #archiveteam-bs
[17:32] <Dash> JAA: can your WARC grabber bypass cloudflare's email protection?
[17:32] <Dash> picosong has a lot of songs that cloudflare thinks have emails in the filename
[17:34] <Dash> and how'd you get banned from their S3 bucket? I was able to grab single-threaded for a solid week or two and not get nabbed
[17:49] <markedL> on the wiki, what's the difference between Project Status and Archiving Status ?
[18:03] <JAA> Dash: I didn't get banned, but some downloads returned 403s. Do you have an example for such an "email" file?
[18:04] <JAA> markedL: "Project Status" is the status of the website/service/whatever, i.e. whether it's still online, endangered, or offline. "Archiving Status" is the status of the archival, e.g. saved/upcoming/not saved yet.
[18:44] <Dash> JAA: I suppose most of the issue would be in the pages themselves, but look here: https://picosong.com/S9mc/
[18:44] <Dash> If you view source, the filename shows up as SLOW DEATH [email protected]
[18:53] *** bluefoo has quit IRC (Ping timeout: 252 seconds)
[19:11] <JAA> Dash: Ah, yeah, that's just decoded by some JS. There's nothing special needed for it.
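For reference, the decoding that the page's JS performs can be reproduced offline. This is a minimal sketch, assuming the usual Cloudflare scheme in which the protected string is stored as hex (typically in a `data-cfemail` attribute), the first byte is an XOR key, and the remaining bytes are the text XORed with that key. The function names are hypothetical, and the example value is made up rather than taken from picosong.

```python
def decode_cfemail(encoded: str) -> str:
    """Decode a Cloudflare-obfuscated string given as hex.

    Assumed layout: byte 0 is the XOR key, bytes 1.. are the text
    XORed with that key.
    """
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def encode_cfemail(text: str, key: int) -> str:
    """Inverse of decode_cfemail, used here only to build a test value."""
    return bytes([key] + [b ^ key for b in text.encode("utf-8")]).hex()


# Round trip with a made-up value, not a real picosong filename.
sample = encode_cfemail("user@example.com", 0x5A)
print(sample, "->", decode_cfemail(sample))
```

Since the raw HTML in a WARC contains the encoded value, a step like this would only matter when extracting filenames from the archived pages without running the JS.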
[19:17] *** bluefoo has joined #archiveteam-bs
[19:34] *** JAA sets mode: -b *!*ShellyRol@*.hsd1.wa.comcast.net
[19:48] *** icedice has joined #archiveteam-bs
[19:57] *** HashbangI has quit IRC (Remote host closed the connection)
[20:05] *** HashbangI has joined #archiveteam-bs
[20:48] *** icedice has quit IRC (Ping timeout: 252 seconds)
[20:51] *** icedice has joined #archiveteam-bs
[20:52] *** icedice has quit IRC (Connection closed)
[20:52] *** icedice has joined #archiveteam-bs
[20:52] *** icedice has quit IRC (Connection closed)
[21:03] *** icedice has joined #archiveteam-bs
[21:03] *** icedice has quit IRC (Connection closed)
[21:03] *** icedice has joined #archiveteam-bs
[21:24] *** killsushi has joined #archiveteam-bs
[21:28] *** BlueMax has joined #archiveteam-bs
[22:33] *** Shen has quit IRC (Quit: wheeee)
[22:40] *** Shen has joined #archiveteam-bs
[23:03] *** schbirid has quit IRC (Remote host closed the connection)
[23:55] *** icedice has quit IRC (Read error: Operation timed out)