#archiveteam-ot 2019-04-01,Mon


Time Nickname Message
00:04 🔗 robbierut has quit IRC (Read error: Operation timed out)
00:11 🔗 LowLevelM has joined #archiveteam-ot
00:15 🔗 LowLevelM When this google+ thing is over I have a project idea
00:16 🔗 JAA We already have at least three projects in the queue I think, but what's your idea?
00:17 🔗 Flashfire What are the projects in the queue
00:17 🔗 Flashfire ?
00:18 🔗 JAA JamiiForums, Reddit, and HardForum (in no particular order)
00:19 🔗 LowLevelM JAA: The project is to re-archive Thingiverse. I have already made a python script to download things, but it is broken, and far too large for me to do on my own.
00:19 🔗 Flashfire We really really need to get to Jamii Forums because it's still at risk
00:19 🔗 Flashfire Reddit is massive and needs to be looked into further
00:19 🔗 JAA LowLevelM: How large is it?
00:19 🔗 Flashfire HardForum I have no idea what it is
00:19 🔗 LowLevelM 3.5 million things at the moment
00:19 🔗 LowLevelM plus the forum
00:20 🔗 LowLevelM It will be super easy to archive, as the ids are an integer, and it has a JSON api.
00:20 🔗 VADemon hardforum is a background job by new standards and not high priority
00:21 🔗 JAA Flashfire: https://hardforum.com/ Hardware discussions, 15 million posts, and at risk of disappearing because the owner no longer works at HardOCP.
00:21 🔗 marked has quit IRC (Quit: WeeChat 2.2)
00:22 🔗 JAA https://www.hardocp.com/article/2019/03/19/goodbye_hardocp_hello_intel/
00:23 🔗 JAA LowLevelM: I meant more in terms of data size. 3.5 million doesn't sound too bad, but if each thing is measured in megabytes, it gets messy.
00:23 🔗 VADemon JAA he said no changes will be coming to the website/forum, will change the owner and keep going
00:23 🔗 JAA VADemon: Ah, that's good to hear.
00:23 🔗 LowLevelM each thing plus the photos is a few megabytes
00:24 🔗 phiresky JAA: isn't reddit already archived by some guy?
00:24 🔗 JAA VADemon: In that case, it might not even be worth a warrior project but just an independent long-term grab.
00:24 🔗 phiresky this one i mean: http://files.pushshift.io/reddit/
00:24 🔗 JAA phiresky: Yes, kind of, but not in a format that is accessible to most people.
00:24 🔗 JAA We want to grab it such that it can be viewed in the Wayback Machine.
00:25 🔗 LowLevelM Thingiverse has gotten super slow in the past months, and shows signs of being forgotten by its parent company, Makerbot.
00:25 🔗 phiresky are the warcs from google+ etc accessible in the wayback machine?
00:25 🔗 JAA Hmm, why are we having this conversation in -ot? Let's move this to -bs.
00:25 🔗 VADemon JAA it's a Xenforo forum and I'd like to grab bukkit.org forums, it's years of Minecraft administration and we've seen what Wikia has done to MC Forums
00:25 🔗 marked has joined #archiveteam-ot
00:26 🔗 VADemon imho worth to make a new grab script for Xenforo forum-types alone
00:26 🔗 JAA -> #archiveteam-bs
00:45 🔗 marked has quit IRC (Read error: Operation timed out)
00:49 🔗 marked has joined #archiveteam-ot
00:58 🔗 killsushi has joined #archiveteam-ot
00:59 🔗 Evie has joined #archiveteam-ot
01:30 🔗 LowLevelM has left
01:50 🔗 robbierut has joined #archiveteam-ot
02:07 🔗 Exairnous has joined #archiveteam-ot
02:46 🔗 ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
02:57 🔗 rnduser_ has joined #archiveteam-ot
02:58 🔗 Despatche has quit IRC (Quit: Read error: Connection reset by deer)
03:01 🔗 rnduser has quit IRC (Ping timeout: 252 seconds)
03:09 🔗 DustinV has joined #archiveteam-ot
03:33 🔗 qw3rty115 has joined #archiveteam-ot
03:36 🔗 qw3rty114 has quit IRC (Ping timeout: 600 seconds)
03:47 🔗 IanR has joined #archiveteam-ot
04:02 🔗 Stiletto has quit IRC ()
04:03 🔗 odemg has quit IRC (Ping timeout: 615 seconds)
04:04 🔗 ephemer0l has joined #archiveteam-ot
04:08 🔗 Stiletto has joined #archiveteam-ot
04:09 🔗 odemg has joined #archiveteam-ot
04:22 🔗 DustinV has quit IRC (Read error: Connection reset by peer)
04:32 🔗 m007a83_ is now known as m007a83
04:50 🔗 dhyan_nat has joined #archiveteam-ot
05:08 🔗 DustinV has joined #archiveteam-ot
05:36 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
05:37 🔗 IanR any way to poke a warrior out of throttle redirect sleep?
05:50 🔗 IanR tell me more about this soft ratelimit? a power outage deprived me of this chat
05:51 🔗 IanR or if this isn't on topic enough I'll go back to main with this question ;-)
06:04 🔗 jut has quit IRC (Ping timeout: 252 seconds)
06:05 🔗 icedice has quit IRC (Quit: Leaving)
06:11 🔗 jut has joined #archiveteam-ot
06:20 🔗 cutepillo has joined #archiveteam-ot
06:20 🔗 cutepillo is this where the anime is
06:21 🔗 IanR anime and ignored questions about google soft rate limiting ips
06:24 🔗 Kaz easy answer is no
06:24 🔗 Kaz harder answer is I don't k(no)w
06:29 🔗 BlueMax has quit IRC (Quit: Leaving)
06:33 🔗 IanR I appreciate the response
06:34 🔗 Exairnous has quit IRC (Read error: Operation timed out)
06:38 🔗 kiska Damn how many upcoming projects do we have?!
06:39 🔗 IanR well, I hear a live action big robot marathon is on the horizon
06:41 🔗 kiska I hear the noise of rsync targets crying in a corner
06:42 🔗 IanR under or over loaded?
06:42 🔗 IanR I've lost my pile of Grafana tabs in an outage
06:42 🔗 Kaz targets are doing very well, don't think we're hitting slot limits anywhere
06:45 🔗 deevious has joined #archiveteam-ot
06:59 🔗 julientm has joined #archiveteam-ot
07:07 🔗 BlueMax has joined #archiveteam-ot
07:16 🔗 robbierut has joined #archiveteam-ot
07:21 🔗 eythian has joined #archiveteam-ot
07:31 🔗 kiska I remember when I did the tumblr project we made FoS cry
08:06 🔗 Joseph__ has joined #archiveteam-ot
08:07 🔗 VerifiedJ has quit IRC (Read error: Connection reset by peer)
08:17 🔗 julientm has quit IRC (Remote host closed the connection)
08:19 🔗 IanR any thoughts and theories on the rate limiting are welcome
08:21 🔗 robbierut Is it still happening?
08:32 🔗 DustinVF has joined #archiveteam-ot
08:32 🔗 ivan what's the nature of Google's blocking?
08:32 🔗 julientm has joined #archiveteam-ot
08:33 🔗 DustinV has quit IRC (Ping timeout: 252 seconds)
08:33 🔗 ivan Google has an edge firewall that looks at HTTP request headers (and no-routes you if unhappy) and application firewalls with things like request-per-day limits
08:34 🔗 julientm has quit IRC (Remote host closed the connection)
08:36 🔗 DustinVF has quit IRC (Read error: Operation timed out)
08:37 🔗 julientm has joined #archiveteam-ot
08:39 🔗 IanR has quit IRC (Read error: Connection reset by peer)
08:39 🔗 DustinV has joined #archiveteam-ot
08:40 🔗 julientm has quit IRC (Remote host closed the connection)
08:40 🔗 IanR has joined #archiveteam-ot
08:40 🔗 julientm has joined #archiveteam-ot
08:56 🔗 IanR has quit IRC (Read error: Connection reset by peer)
08:56 🔗 IanR has joined #archiveteam-ot
08:58 🔗 julientm has quit IRC (Remote host closed the connection)
08:59 🔗 IanR firefox and the long form of the google minus tracker seem to have conspired against my system, had to reset out of a swap storm
08:59 🔗 julientm has joined #archiveteam-ot
09:12 🔗 julientm has quit IRC (Remote host closed the connection)
09:12 🔗 julientm has joined #archiveteam-ot
09:19 🔗 MR9K has quit IRC (Read error: Connection reset by peer)
09:21 🔗 MR9K has joined #archiveteam-ot
09:24 🔗 eythian IanR: I have had that happen also
09:47 🔗 ryry has quit IRC (Ping timeout: 260 seconds)
09:54 🔗 DustinV ye, that page seems to have an issue with memory leaking (particularly if you hit show all)
10:00 🔗 julientm has quit IRC (Remote host closed the connection)
10:11 🔗 julientm has joined #archiveteam-ot
10:11 🔗 BlueMax has quit IRC (Quit: Leaving)
10:25 🔗 jesso has joined #archiveteam-ot
10:36 🔗 IanR chrome hasn't nuked me yet, but also doesn't work as well, exciting choices!
10:44 🔗 Oddly has joined #archiveteam-ot
11:07 🔗 jesso has quit IRC (Quit: jesso)
11:10 🔗 jesso has joined #archiveteam-ot
11:31 🔗 killsushi has quit IRC (Quit: Leaving)
11:46 🔗 dhyan_nat has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 Mateon1 has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 SketchCow has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 ats has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 betamax has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 noirscape has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 argus has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 asie has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 Tenebrae has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 MrRadar2 has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 BnAboyZ has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 Frogging has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 jodizzle has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 VoynichCr has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 t2t2 has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 wp494 has quit IRC (hub.efnet.us irc.efnet.nl)
11:46 🔗 Hintswen has quit IRC (hub.efnet.us irc.efnet.nl)
11:50 🔗 KoalaBear has quit IRC (Read error: Operation timed out)
11:51 🔗 julientm Does anyone know a good way to go about scraping a profiled media resource from a portal in my web browser? Opening the video media URL and saving it by numeric sequence is what I want to automate. Any idea how I can go about that?
11:53 🔗 dhyan_nat has joined #archiveteam-ot
11:53 🔗 t2t2 has joined #archiveteam-ot
11:53 🔗 Mateon1 has joined #archiveteam-ot
11:53 🔗 wp494 has joined #archiveteam-ot
11:53 🔗 Hintswen has joined #archiveteam-ot
11:53 🔗 SketchCow has joined #archiveteam-ot
11:53 🔗 ats has joined #archiveteam-ot
11:53 🔗 betamax has joined #archiveteam-ot
11:53 🔗 noirscape has joined #archiveteam-ot
11:53 🔗 argus has joined #archiveteam-ot
11:53 🔗 asie has joined #archiveteam-ot
11:53 🔗 Tenebrae has joined #archiveteam-ot
11:53 🔗 MrRadar2 has joined #archiveteam-ot
11:53 🔗 BnAboyZ has joined #archiveteam-ot
11:53 🔗 Frogging has joined #archiveteam-ot
11:53 🔗 jodizzle has joined #archiveteam-ot
11:53 🔗 VoynichCr has joined #archiveteam-ot
11:53 🔗 Fusl sets mode: +o SketchCow
11:59 🔗 julientm https://i.imgur.com/9XuBgDg.png I always need to pull the same resource, a.mp4; it comes with a policy and a referral code in the URL. How can I automate this?
12:02 🔗 JAA Well, where are those Policy and referrer values coming from?
12:04 🔗 julientm JAA, they are not really important since I don't abuse any server loads, it's just automating a small batch that is the issue
12:05 🔗 julientm Right now, I'm manually opening the link, and instead of the JS video player, I get the html5 firefox player, and can click save-as
12:05 🔗 JAA julientm: But you probably need those values to download the correct video.
12:05 🔗 julientm okay well it is from my local library
12:06 🔗 julientm I am using the online services, to view some videos, with safari
12:06 🔗 julientm Yeah I always use them
12:06 🔗 julientm just looking to automate the flow instead of doing it manually
12:07 🔗 JAA Mhm
12:08 🔗 JAA Well, you need to figure out how to construct the URL that gives you the video file.
12:09 🔗 julientm it randomly changes, every segment of video, and they split the video into every topic, so one book can have 72 5 minute videos. I am looking to download locally and then just add in a vlc playlist.
12:09 🔗 julientm here is what I am working with
12:10 🔗 JAA Does the player download a .m3u or .m3u8 file?
12:11 🔗 julientm let me get you some graphics 2 secs
12:26 🔗 julientm JAA, https://youtu.be/ER3isecQ334 https://ghostbin.com/paste/6cg72/raw
12:26 🔗 julientm has quit IRC (Read error: Connection reset by peer)
12:34 🔗 julientm_ has joined #archiveteam-ot
12:34 🔗 julientm_ JAA, sorry irc rebooted
12:35 🔗 julientm_ JAA, were you able to take a look at it? , https://youtu.be/ER3isecQ334 https://ghostbin.com/paste/6cg72/raw ?
12:39 🔗 Despatche has joined #archiveteam-ot
12:48 🔗 julientm_ has quit IRC (Remote host closed the connection)
12:48 🔗 julientm_ has joined #archiveteam-ot
12:52 🔗 JAA julientm_: Yeah, but I can't really help you much with that information. As I said, you need to figure out where that a.mp4 URL comes from since there's no way you'll guess it. Chances are it's either in a playlist file (.m3u or .m3u8) or somehow retrieved with JavaScript. But without direct access to the website, being able to see all requests with every detail, and playing around with them, there's
12:52 🔗 JAA no way I can tell you how you have to do it.
13:22 🔗 julientm_ okay thank you JAA
13:25 🔗 phiresky can also try youtube-dl it has some logic for common hosting methods afaik
13:27 🔗 Wizzito has joined #archiveteam-ot
13:45 🔗 julientm_ phiresky, JAA so I managed to get it down by manually clicking on the videos and exporting a .har file, then using bash to process the text and curl to get the videos
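
A minimal sketch of the HAR step julientm_ describes, assuming the browser export follows the standard HAR layout (`log` → `entries` → `request` → `url`). The actual bash and curl commands were not shared in the log, so the function names, the `.mp4` suffix filter, and the zero-padded naming scheme below are illustrative assumptions only.

```python
import json
from urllib.parse import urlsplit


def media_urls_from_har(har_text, suffix=".mp4"):
    """Return request URLs from a HAR export whose path ends with `suffix`.

    URLs keep their query string (e.g. policy values), stay in capture
    order, and are de-duplicated.
    """
    har = json.loads(har_text)
    seen = set()
    urls = []
    for entry in har.get("log", {}).get("entries", []):
        url = entry.get("request", {}).get("url", "")
        # Compare against the path only, so query parameters do not
        # hide the file extension.
        if urlsplit(url).path.endswith(suffix) and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def numbered_filenames(urls, width=3, suffix=".mp4"):
    """Pair each URL with a sequential, zero-padded output filename."""
    return [(f"{i:0{width}d}{suffix}", url) for i, url in enumerate(urls, start=1)]
```

Each (filename, URL) pair could then be passed to a downloader, for example `curl -o <filename> <url>`, which matches the "save by numeric sequence" goal mentioned earlier in the conversation.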
13:50 🔗 robbierut has quit IRC (Read error: Operation timed out)
13:50 🔗 robbierut has joined #archiveteam-ot
13:56 🔗 JAA Ahaha, nice one: https://marc.info/?l=openbsd-tech&m=155407864604288&w=2
13:56 🔗 JAA (Context for those who missed it: https://twitter.com/RedTeamPT/status/1110843396657238016 )
13:59 🔗 phiresky haha
14:00 🔗 julientm_ has quit IRC (Read error: Connection reset by peer)
14:04 🔗 julientm has joined #archiveteam-ot
14:09 🔗 robbierut has quit IRC (Ping timeout: 360 seconds)
14:09 🔗 robbierut has joined #archiveteam-ot
14:12 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
14:13 🔗 robbierut has joined #archiveteam-ot
14:18 🔗 DustinVF has joined #archiveteam-ot
14:22 🔗 DustinVFP has joined #archiveteam-ot
14:22 🔗 deevious has quit IRC (Quit: deevious)
14:22 🔗 DustinVFP is now known as otherDust
14:23 🔗 DustinVF has quit IRC (Read error: Operation timed out)
14:27 🔗 DustinV has quit IRC (Read error: Operation timed out)
14:27 🔗 otherDust is now known as DustinV
14:33 🔗 deevious has joined #archiveteam-ot
14:40 🔗 Wizzito has quit IRC (Quit: Leaving)
14:50 🔗 DustinV has quit IRC (Remote host closed the connection)
14:51 🔗 DustinV has joined #archiveteam-ot
14:51 🔗 DustinV has quit IRC (Read error: Connection reset by peer)
14:52 🔗 DustinV has joined #archiveteam-ot
15:14 🔗 cutepillo has quit IRC (Read error: Operation timed out)
15:36 🔗 julientm has quit IRC (Ping timeout: 252 seconds)
15:40 🔗 julientm has joined #archiveteam-ot
15:44 🔗 julientm has quit IRC (Remote host closed the connection)
15:45 🔗 julientm has joined #archiveteam-ot
16:14 🔗 Dj-Wawa has joined #archiveteam-ot
16:23 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
16:27 🔗 robbierut has quit IRC (Read error: Operation timed out)
16:27 🔗 robbierut has joined #archiveteam-ot
16:54 🔗 Joseph__ has quit IRC (Read error: Connection reset by peer)
16:55 🔗 VerifiedJ has joined #archiveteam-ot
17:49 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
17:50 🔗 robbierut has joined #archiveteam-ot
17:56 🔗 Oddly has quit IRC (Ping timeout: 257 seconds)
18:20 🔗 marked has quit IRC (Read error: Operation timed out)
18:22 🔗 Oddly has joined #archiveteam-ot
18:22 🔗 Exairnous has joined #archiveteam-ot
18:25 🔗 marked has joined #archiveteam-ot
18:47 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
18:47 🔗 robbierut has joined #archiveteam-ot
18:50 🔗 Exairnous has quit IRC (Remote host closed the connection)
18:51 🔗 Exairnous has joined #archiveteam-ot
18:52 🔗 icedice has joined #archiveteam-ot
18:56 🔗 Oddly has quit IRC (Ping timeout: 255 seconds)
19:00 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
19:02 🔗 Odd0002_ has joined #archiveteam-ot
19:02 🔗 julientm has quit IRC (Read error: Connection reset by peer)
19:02 🔗 robbierut has joined #archiveteam-ot
19:03 🔗 Despatche has quit IRC (Read error: Operation timed out)
19:04 🔗 Exairnous has quit IRC (Read error: Operation timed out)
19:06 🔗 Odd0002 has quit IRC (Ping timeout: 615 seconds)
19:06 🔗 Odd0002_ is now known as Odd0002
19:12 🔗 Exairnous has joined #archiveteam-ot
19:17 🔗 dhyan_nat has joined #archiveteam-ot
19:22 🔗 Exairnous has quit IRC (Ping timeout: 615 seconds)
19:29 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
19:30 🔗 DustinV has quit IRC (Ping timeout: 600 seconds)
19:31 🔗 robbierut has joined #archiveteam-ot
19:43 🔗 simon816 has quit IRC (Read error: Operation timed out)
19:43 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:45 🔗 ivan has quit IRC (Ping timeout: 246 seconds)
19:45 🔗 JAA has quit IRC (Ping timeout: 246 seconds)
19:45 🔗 ivan has joined #archiveteam-ot
19:46 🔗 logres133 has joined #archiveteam-ot
19:46 🔗 dashcloud has joined #archiveteam-ot
19:54 🔗 Stilett0 has joined #archiveteam-ot
19:57 🔗 julientm has joined #archiveteam-ot
19:58 🔗 julientm has quit IRC (Remote host closed the connection)
19:58 🔗 Stilett0 has quit IRC (Ping timeout: 252 seconds)
19:58 🔗 Stiletto has quit IRC (Read error: Operation timed out)
19:58 🔗 Stiletto has joined #archiveteam-ot
19:58 🔗 julientm has joined #archiveteam-ot
19:59 🔗 julientm has quit IRC (Remote host closed the connection)
20:01 🔗 julientm has joined #archiveteam-ot
20:03 🔗 Stilett0 has joined #archiveteam-ot
20:05 🔗 Stiletto has quit IRC (Ping timeout: 255 seconds)
20:10 🔗 robbierut has quit IRC (Read error: Operation timed out)
20:10 🔗 robbierut has joined #archiveteam-ot
20:27 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
20:27 🔗 robbierut has joined #archiveteam-ot
20:29 🔗 Despatche has joined #archiveteam-ot
20:37 🔗 martini has joined #archiveteam-ot
20:43 🔗 simon816 has joined #archiveteam-ot
20:44 🔗 Stiletto has joined #archiveteam-ot
20:44 🔗 JAA has joined #archiveteam-ot
20:44 🔗 Fusl sets mode: +o JAA
20:45 🔗 bakJAA sets mode: +o JAA
20:48 🔗 Stilett0 has quit IRC (Ping timeout: 615 seconds)
20:49 🔗 tuluu_ has quit IRC (Ping timeout: 265 seconds)
20:49 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
20:59 🔗 tuluu has joined #archiveteam-ot
21:10 🔗 VADemon That's the future of Internet of Things.
21:25 🔗 robbierut has quit IRC (Read error: Connection reset by peer)
21:25 🔗 jrwr http://time.spacescience.tech/ this is for reddit's new april fools thing
21:25 🔗 jrwr /r/sequence/new/ auto timelapse
21:25 🔗 robbierut has joined #archiveteam-ot
21:27 🔗 kode54 has quit IRC (Quit: ZNC 1.7.2 - https://znc.in)
21:35 🔗 kode54 has joined #archiveteam-ot
21:49 🔗 Kaz jrwr: tl;dr this subreddit
21:49 🔗 Kaz ?
21:52 🔗 JAA Kaz: April Fools event by Reddit I believe.
22:09 🔗 rnduser has joined #archiveteam-ot
22:09 🔗 BlueMax has joined #archiveteam-ot
22:12 🔗 rnduser_ has quit IRC (Ping timeout: 252 seconds)
22:32 🔗 Exairnous has joined #archiveteam-ot
22:43 🔗 Exairnous has quit IRC (Ping timeout: 615 seconds)
22:46 🔗 rnduser has quit IRC (Read error: Connection reset by peer)
22:46 🔗 rnduser has joined #archiveteam-ot
22:53 🔗 martini has quit IRC (Quit: No Reasson)
22:54 🔗 rnduser has quit IRC (Read error: Connection reset by peer)
22:54 🔗 rnduser has joined #archiveteam-ot
