#archiveteam-bs 2019-09-28,Sat

Time Nickname Message
00:00 🔗 acridAxid has joined #archiveteam-bs
00:58 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
01:43 🔗 Raccoon has quit IRC (Remote host closed the connection)
02:14 🔗 HashbangI has quit IRC (Remote host closed the connection)
02:25 🔗 HashbangI has joined #archiveteam-bs
02:42 🔗 ivan- has joined #archiveteam-bs
02:42 🔗 Fusl____ sets mode: +o ivan-
02:42 🔗 Fusl sets mode: +o ivan-
02:42 🔗 Fusl_ sets mode: +o ivan-
02:49 🔗 godane has quit IRC (Ping timeout: 360 seconds)
02:52 🔗 ivan_ has quit IRC (Ping timeout: 746 seconds)
03:00 🔗 ivan- is now known as ivan_
03:15 🔗 godane has joined #archiveteam-bs
03:26 🔗 qw3rty has joined #archiveteam-bs
03:34 🔗 odemgi_ has joined #archiveteam-bs
03:35 🔗 qw3rty2 has quit IRC (Ping timeout: 745 seconds)
03:36 🔗 odemgi has quit IRC (Ping timeout: 252 seconds)
03:47 🔗 DopefishJ has joined #archiveteam-bs
03:51 🔗 DogsRNice has quit IRC (Read error: Connection reset by peer)
03:51 🔗 DFJustin has quit IRC (Ping timeout: 745 seconds)
04:00 🔗 SynMonger has quit IRC (Quit: Wait, what?)
04:01 🔗 SynMonger has joined #archiveteam-bs
04:56 🔗 Raccoon has joined #archiveteam-bs
05:15 🔗 BlueMax has joined #archiveteam-bs
08:03 🔗 VADemon has quit IRC (Read error: Connection reset by peer)
08:09 🔗 schbirid has joined #archiveteam-bs
09:33 🔗 zhongfu has quit IRC (Quit: cya losers)
09:37 🔗 zhongfu has joined #archiveteam-bs
10:32 🔗 kiskabak has quit IRC (Ping timeout (120 seconds))
10:33 🔗 kiskabak has joined #archiveteam-bs
10:33 🔗 Fusl____ sets mode: +o kiskabak
10:33 🔗 Fusl sets mode: +o kiskabak
10:33 🔗 Fusl_ sets mode: +o kiskabak
10:33 🔗 TC01_ has quit IRC (Read error: Operation timed out)
10:33 🔗 dxrt has quit IRC (Remote host closed the connection)
10:33 🔗 dxrt has joined #archiveteam-bs
10:33 🔗 Fusl____ sets mode: +o dxrt
10:33 🔗 TC01 has joined #archiveteam-bs
10:33 🔗 Fusl sets mode: +o dxrt
10:33 🔗 Fusl_ sets mode: +o dxrt
10:33 🔗 asdf0101 has quit IRC (Read error: Operation timed out)
10:34 🔗 asdf0101 has joined #archiveteam-bs
10:46 🔗 JAA Soo, I have a size estimate for picosong: 72 million requests, 9.9 TB rx, 4.9 TB WARCs
10:49 🔗 JAA Test for random 1 ‰ of the IDs took 9 minutes, so if they let me run at this speed continuously, the whole thing would take about 6 days.
10:49 🔗 m007a83_ has joined #archiveteam-bs
10:51 🔗 m007a83 has quit IRC (Ping timeout: 252 seconds)
10:53 🔗 JAA Oh, actually, some of the downloads failed because I got banned already from the S3 bucket. So it'll be a bit larger and might not work at that speed.
10:55 🔗 JAA 1358 of the 29822 attempted song IDs exist.
10:58 🔗 JAA Didn't get banned from S3, just some random requests failed with a 403 for whatever reason. I actually saw that earlier in tests but couldn't reproduce it. Maybe I need to add a delay there.
10:58 🔗 JAA Will look into this further in the evening.
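The extrapolation in JAA's messages above can be checked with a little arithmetic. This is only a sketch using the figures quoted in the log; the function names are made up for illustration, and it assumes the test speed could be sustained for the whole run, which JAA says is not certain.

```python
# Back-of-the-envelope check of the picosong estimate.
# All numbers come from the log; the helper names are hypothetical.

MINUTES_PER_DAY = 60 * 24
SECONDS_PER_DAY = 60 * 60 * 24


def full_run_days(sample_minutes, sample_permille):
    """Scale the time taken by a sample of N permille of the IDs to all IDs."""
    total_minutes = sample_minutes * 1000 / sample_permille
    return total_minutes / MINUTES_PER_DAY


def hit_rate(existing_ids, attempted_ids):
    """Fraction of attempted song IDs that actually exist."""
    return existing_ids / attempted_ids


def requests_per_second(total_requests, days):
    """Average request rate implied by a run of the given length."""
    return total_requests / (days * SECONDS_PER_DAY)


days = full_run_days(sample_minutes=9, sample_permille=1)
print(f"full run: {days:.2f} days")
print(f"existing IDs: {hit_rate(1358, 29822):.2%}")
print(f"implied rate: {requests_per_second(72_000_000, days):.0f} requests/s")
```

The 9-minute sample scales to 9000 minutes, i.e. 6.25 days, which matches the "about 6 days" in the log.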
11:46 🔗 arkiver ah forgot about that one
11:46 🔗 arkiver JAA: I see deadline is at october 21
11:48 🔗 arkiver JAA: can you please add the info to https://www.archiveteam.org/index.php?title=Picosong ?
12:12 🔗 themadpr0 has joined #archiveteam-bs
12:13 🔗 themadpr0 hey hey hey
12:16 🔗 themadpr0 I'm planning on starting a circle at my school for teaching people about web archiving
12:17 🔗 themadpr0 Any tips for what might make for some good topics?
12:34 🔗 markedL I don't have the examples, but it would be interesting to go into examples of unexpected uses by researchers.
12:39 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
12:43 🔗 asie this, and also just teaching people that (a) they should be archiving stuff they value (and why), and (b) how to do it
12:58 🔗 themadpr0 asie yeah that's basically what I'm going for
12:59 🔗 themadpr0 though I do plan to also throw in some discussion on 'why some stuff in particular doesn't stay on the internet forever' (censorship, digital distribution etc.)
13:00 🔗 themadpr0 but again, I'm looking for some nice openers
13:00 🔗 themadpr0 the WHY
13:01 🔗 themadpr0 I'm going to get a lot of "how much money is there in this" and "aren't there people already doing this for us" or "maybe it's for the best that these do vanish"
13:02 🔗 themadpr0 so how do I keep people entertained long enough to avoid derailing (o-o)
13:02 🔗 asie you won't
13:02 🔗 asie you'd need to be a high-grade comedian to sustain an audience for that long
13:02 🔗 asie let alone make them care
13:03 🔗 Kaz the 'why' is different for everyone
13:17 🔗 markedL Most people know about the Nazis' burning of books and art, but it's portrayed as an anomaly when it's not unusual. If you're concerned about motivation, consider making it interactive. Using a news|games|music|video|art|photo trove, let people find things that would excite them.
13:30 🔗 kiska themadpr0: Make a tweet on twitter, use the save page now feature, then delete the tweet in class/your session. Pull up wayback machine and show your tweet is still there
13:31 🔗 kiska That was how I got some people interested in archiving stuff
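For kiska's demo above, the two Wayback Machine addresses involved can be built as below. This is a minimal sketch: the tweet URL is a made-up placeholder, the helper names are hypothetical, and nothing is fetched here. It only assumes the public `web.archive.org/save/` and `web.archive.org/web/` URL patterns.

```python
# Build the URLs used in the "tweet, save, delete, look it up" demo.
# The tweet address below is a placeholder, not a real tweet.

WAYBACK = "https://web.archive.org"


def save_page_now_url(target_url):
    """URL that asks the Wayback Machine to capture target_url."""
    return f"{WAYBACK}/save/{target_url}"


def latest_capture_url(target_url):
    """URL that should lead to the most recent capture of target_url."""
    return f"{WAYBACK}/web/{target_url}"


tweet = "https://twitter.com/example_user/status/1234567890"
print(save_page_now_url(tweet))    # open before deleting the tweet
print(latest_capture_url(tweet))   # open after deleting the tweet
```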
14:02 🔗 DogsRNice has joined #archiveteam-bs
15:31 🔗 h3ndr1k has quit IRC (Quit: )
15:31 🔗 h3ndr1k has joined #archiveteam-bs
15:48 🔗 themadpr0 has quit IRC (Ping timeout: 260 seconds)
16:54 🔗 bluefoo has joined #archiveteam-bs
17:32 🔗 Dash JAA: can your WARC grabber bypass cloudflare's email protection?
17:32 🔗 Dash picosong has a lot of songs that cloudflare thinks have emails in the filename
17:34 🔗 Dash and how'd you get banned from their S3 bucket? I was able to grab single-threaded for a solid week or two and not get nabbed
17:49 🔗 markedL on the wiki, what's the difference between Project Status and Archiving Status ?
18:03 🔗 JAA Dash: I didn't get banned, but some downloads returned 403s. Do you have an example for such an "email" file?
18:04 🔗 JAA markedL: "Project Status" is the status of the website/service/whatever, i.e. whether it's still online, endangered, or offline. "Archiving Status" is the status of the archival, e.g. saved/upcoming/not saved yet.
18:44 🔗 Dash JAA: I suppose most of the issue would be in the pages themselves, but look here: https://picosong.com/S9mc/
18:44 🔗 Dash If you view source, the filename shows up as SLOW DEATH [email protected]
18:53 🔗 bluefoo has quit IRC (Ping timeout: 252 seconds)
19:11 🔗 JAA Dash: Ah, yeah, that's just decoded by some JS. There's nothing special needed for it.
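As JAA says, the grabber needs nothing special because the raw page is stored as served and the decoding happens in the browser. For anyone reading such pages by hand, here is a sketch of the commonly described scheme for Cloudflare's obfuscated strings: a hex string whose first byte is an XOR key for the remaining bytes. That scheme is an assumption here, not something stated in the log, and the sample string is invented; the real filename from the picosong page is not reproduced.

```python
# Sketch of the widely documented XOR scheme behind "[email protected]".
# Assumption: the hex payload is <key byte><text bytes XOR key>.


def decode_cfemail(encoded_hex):
    """Decode a hex payload: first byte is the key, the rest is XORed text."""
    data = bytes.fromhex(encoded_hex)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def encode_cfemail(text, key=0x42):
    """Inverse of decode_cfemail, used here only to build a test payload."""
    raw = text.encode("utf-8")
    return bytes([key] + [b ^ key for b in raw]).hex()


# Round trip with an invented, email-looking string.
payload = encode_cfemail("user@example.com")
print(payload)
print(decode_cfemail(payload))
```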
19:17 🔗 bluefoo has joined #archiveteam-bs
19:34 🔗 JAA sets mode: -b *!*ShellyRol@*.hsd1.wa.comcast.net
19:48 🔗 icedice has joined #archiveteam-bs
19:57 🔗 HashbangI has quit IRC (Remote host closed the connection)
20:05 🔗 HashbangI has joined #archiveteam-bs
20:48 🔗 icedice has quit IRC (Ping timeout: 252 seconds)
20:51 🔗 icedice has joined #archiveteam-bs
20:52 🔗 icedice has quit IRC (Connection closed)
20:52 🔗 icedice has joined #archiveteam-bs
20:52 🔗 icedice has quit IRC (Connection closed)
21:03 🔗 icedice has joined #archiveteam-bs
21:03 🔗 icedice has quit IRC (Connection closed)
21:03 🔗 icedice has joined #archiveteam-bs
21:24 🔗 killsushi has joined #archiveteam-bs
21:28 🔗 BlueMax has joined #archiveteam-bs
22:33 🔗 Shen has quit IRC (Quit: wheeee)
22:40 🔗 Shen has joined #archiveteam-bs
23:03 🔗 schbirid has quit IRC (Remote host closed the connection)
23:55 🔗 icedice has quit IRC (Read error: Operation timed out)
