[00:00] *** acridAxid has joined #archiveteam-bs
[00:58] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[01:43] *** Raccoon has quit IRC (Remote host closed the connection)
[02:14] *** HashbangI has quit IRC (Remote host closed the connection)
[02:25] *** HashbangI has joined #archiveteam-bs
[02:42] *** ivan- has joined #archiveteam-bs
[02:42] *** Fusl____ sets mode: +o ivan-
[02:42] *** Fusl sets mode: +o ivan-
[02:42] *** Fusl_ sets mode: +o ivan-
[02:49] *** godane has quit IRC (Ping timeout: 360 seconds)
[02:52] *** ivan_ has quit IRC (Ping timeout: 746 seconds)
[03:00] *** ivan- is now known as ivan_
[03:15] *** godane has joined #archiveteam-bs
[03:26] *** qw3rty has joined #archiveteam-bs
[03:34] *** odemgi_ has joined #archiveteam-bs
[03:35] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[03:36] *** odemgi has quit IRC (Ping timeout: 252 seconds)
[03:47] *** DopefishJ has joined #archiveteam-bs
[03:51] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[03:51] *** DFJustin has quit IRC (Ping timeout: 745 seconds)
[04:00] *** SynMonger has quit IRC (Quit: Wait, what?)
[04:01] *** SynMonger has joined #archiveteam-bs
[04:56] *** Raccoon has joined #archiveteam-bs
[05:15] *** BlueMax has joined #archiveteam-bs
[08:03] *** VADemon has quit IRC (Read error: Connection reset by peer)
[08:09] *** schbirid has joined #archiveteam-bs
[09:33] *** zhongfu has quit IRC (Quit: cya losers)
[09:37] *** zhongfu has joined #archiveteam-bs
[10:32] *** kiskabak has quit IRC (Ping timeout (120 seconds))
[10:33] *** kiskabak has joined #archiveteam-bs
[10:33] *** Fusl____ sets mode: +o kiskabak
[10:33] *** Fusl sets mode: +o kiskabak
[10:33] *** Fusl_ sets mode: +o kiskabak
[10:33] *** TC01_ has quit IRC (Read error: Operation timed out)
[10:33] *** dxrt has quit IRC (Remote host closed the connection)
[10:33] *** dxrt has joined #archiveteam-bs
[10:33] *** Fusl____ sets mode: +o dxrt
[10:33] *** TC01 has joined #archiveteam-bs
[10:33] *** Fusl sets mode: +o dxrt
[10:33] *** Fusl_ sets mode: +o dxrt
[10:33] *** asdf0101 has quit IRC (Read error: Operation timed out)
[10:34] *** asdf0101 has joined #archiveteam-bs
[10:46] Soo, I have a size estimate for picosong: 72 million requests, 9.9 TB rx, 4.9 TB WARCs
[10:49] Test for random 1 ‰ of the IDs took 9 minutes, so if they let me run at this speed continuously, the whole thing would take about 6 days.
[10:49] *** m007a83_ has joined #archiveteam-bs
[10:51] *** m007a83 has quit IRC (Ping timeout: 252 seconds)
[10:53] Oh, actually, some of the downloads failed because I got banned already from the S3 bucket. So it'll be a bit larger and might not work at that speed.
[10:55] 1358 of the 29822 attempted song IDs exist.
[10:58] Didn't get banned from S3, just some random requests failed with a 403 for whatever reason. I actually saw that earlier in tests but couldn't reproduce it. Maybe I need to add a delay there.
[10:58] Will look into this further in the evening.
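Editorial note: the "about 6 days" figure above follows from scaling the 1 ‰ sample up by a factor of 1000. The sketch below only redoes that arithmetic with the numbers quoted in the log; the variable and function names are made up for illustration, and it assumes the sample was uniformly random and that the sample speed could be sustained for the full run.

```python
# Back-of-the-envelope extrapolation from the 1-per-mille sample in the log.
# All names here are illustrative; only the input numbers come from the chat.

SCALE = 1000            # sample covered 1 per mille of the ID space
SAMPLE_MINUTES = 9      # wall-clock time of the sample run
ATTEMPTED_IDS = 29822   # song IDs tried in the sample
EXISTING_IDS = 1358     # song IDs that actually existed


def extrapolate(sample_value, scale=SCALE):
    """Scale a quantity measured on the sample up to the full ID space."""
    return sample_value * scale


total_minutes = extrapolate(SAMPLE_MINUTES)
total_days = total_minutes / (60 * 24)          # minutes -> days
hit_rate = EXISTING_IDS / ATTEMPTED_IDS         # fraction of IDs that exist
estimated_songs = extrapolate(EXISTING_IDS)     # rough count of existing songs

print(f"full run: {total_days:.2f} days")
print(f"hit rate: {hit_rate:.2%}")
print(f"estimated existing songs: {estimated_songs}")
```

The 403 failures mentioned at 10:53 and 10:58 mean the sample undercounts both data volume and time, so these numbers are a lower bound rather than a forecast.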
[11:46] ah forgot about that one
[11:46] JAA: I see deadline is at october 21
[11:48] JAA: can you please add the info to https://www.archiveteam.org/index.php?title=Picosong ?
[12:12] *** themadpr0 has joined #archiveteam-bs
[12:13] hey hey hey
[12:16] I'm planning on starting a circle at my school for teaching people about web archiving
[12:17] Any tips for what might make for some good topics?
[12:34] I don't have the examples, but it would be interesting to go into examples of unexpected uses by researchers.
[12:39] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[12:43] this, and also just teaching people that (a) they should be archiving stuff they value (and why), and (b) how to do it
[12:58] asie yeah that's basically what I'm going for
[12:59] though I do plan to also throw in some discussion on 'why some stuff in particular doesn't stay on the internet forever' (censorship, digital distribution etc.)
[13:00] but again, I'm looking for some nice openers
[13:00] the WHY
[13:01] I'm going to get a lot of "how much money is there in this" and "aren't there people already doing this for us" or "maybe it's for the best that these do vanish"
[13:02] so how do I keep people entertained long enough to avoid derailing (o-o)
[13:02] you won't
[13:02] you'd need to be a high-grade comedian to sustain an audience for that long
[13:02] let alone make them care
[13:03] the 'why' is different for everyone
[13:17] Most people know the Nazis' burning of books and art but it's portrayed as an anomaly when it's not unusual. If you're concerned about motivation, consider making it interactive. Using a news|games|music|video|art|photo trove let people find things that would excite them.
[13:30] themadpr0: Make a tweet on twitter, use the save page now feature, then delete the tweet in class/your session.
Pull up wayback machine and show your tweet is still there
[13:31] That was how I got some people interested in archiving stuff
[14:02] *** DogsRNice has joined #archiveteam-bs
[15:31] *** h3ndr1k has quit IRC (Quit: )
[15:31] *** h3ndr1k has joined #archiveteam-bs
[15:48] *** themadpr0 has quit IRC (Ping timeout: 260 seconds)
[16:54] *** bluefoo has joined #archiveteam-bs
[17:32] JAA: can your WARC grabber bypass cloudflare's email protection?
[17:32] picosong has a lot of songs that cloudflare thinks have emails in the filename
[17:34] and how'd you get banned from their S3 bucket? I was able to grab single-threaded for a solid week or two and not get nabbed
[17:49] on the wiki, what's the difference between Project Status and Archiving Status ?
[18:03] Dash: I didn't get banned, but some downloads returned 403s. Do you have an example for such an "email" file?
[18:04] markedL: "Project Status" is the status of the website/service/whatever, i.e. whether it's still online, endangered, or offline. "Archiving Status" is the status of the archival, e.g. saved/upcoming/not saved yet.
[18:44] JAA: I suppose most of the issue would be in the pages themselves, but look here: https://picosong.com/S9mc/
[18:44] If you view source, the filename shows up as SLOW DEATH [email protected]
[18:53] *** bluefoo has quit IRC (Ping timeout: 252 seconds)
[19:11] Dash: Ah, yeah, that's just decoded by some JS. There's nothing special needed for it.
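Editorial note: the "decoded by some JS" remark refers to Cloudflare's email obfuscation, which swaps anything that looks like an address for an "[email protected]" placeholder plus a hex string. The sketch below assumes the commonly described scheme (first byte of the hex string is an XOR key, the remaining bytes are the text XORed with that key); this is not confirmed by the log, and the function names `cf_decode` and `cf_encode` are hypothetical helpers, not part of any grabber mentioned here.

```python
# Sketch of the XOR scheme commonly attributed to Cloudflare's email
# obfuscation. Assumption: the hex payload is <key byte><text bytes XOR key>.
# Function names are illustrative only.


def cf_decode(encoded_hex: str) -> str:
    """Recover the original text from an obfuscated hex payload."""
    data = bytes.fromhex(encoded_hex)
    key = data[0]                                  # first byte is the XOR key
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def cf_encode(text: str, key: int = 0x5A) -> str:
    """Inverse of cf_decode, used here only to build a test payload."""
    raw = text.encode("utf-8")
    return bytes([key] + [b ^ key for b in raw]).hex()


payload = cf_encode("user@example.com")
print(payload)
print(cf_decode(payload))
```

Because the payload sits in the page source, a WARC of the page already contains everything needed to recover the filename later, which matches the point that nothing special is needed at grab time.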
[19:17] *** bluefoo has joined #archiveteam-bs
[19:34] *** JAA sets mode: -b *!*ShellyRol@*.hsd1.wa.comcast.net
[19:48] *** icedice has joined #archiveteam-bs
[19:57] *** HashbangI has quit IRC (Remote host closed the connection)
[20:05] *** HashbangI has joined #archiveteam-bs
[20:48] *** icedice has quit IRC (Ping timeout: 252 seconds)
[20:51] *** icedice has joined #archiveteam-bs
[20:52] *** icedice has quit IRC (Connection closed)
[20:52] *** icedice has joined #archiveteam-bs
[20:52] *** icedice has quit IRC (Connection closed)
[21:03] *** icedice has joined #archiveteam-bs
[21:03] *** icedice has quit IRC (Connection closed)
[21:03] *** icedice has joined #archiveteam-bs
[21:24] *** killsushi has joined #archiveteam-bs
[21:28] *** BlueMax has joined #archiveteam-bs
[22:33] *** Shen has quit IRC (Quit: wheeee)
[22:40] *** Shen has joined #archiveteam-bs
[23:03] *** schbirid has quit IRC (Remote host closed the connection)
[23:55] *** icedice has quit IRC (Read error: Operation timed out)