#archiveteam-bs 2018-07-21,Sat


Time Nickname Message
00:03 🔗 Jens has quit IRC (Remote host closed the connection)
00:04 🔗 Jens has joined #archiveteam-bs
00:07 🔗 vectr0n has quit IRC (ZNC - https://znc.in)
00:24 🔗 vectr0n has joined #archiveteam-bs
00:29 🔗 vectr0n has quit IRC (ZNC - https://znc.in)
00:37 🔗 vectr0n has joined #archiveteam-bs
00:40 🔗 vectr0n has quit IRC (Client Quit)
00:43 🔗 BlueMax has joined #archiveteam-bs
01:48 🔗 vectr0n has joined #archiveteam-bs
02:38 🔗 Selavi has quit IRC (Read error: Connection reset by peer)
02:38 🔗 superkuh has quit IRC (Read error: Operation timed out)
02:38 🔗 Pixi has joined #archiveteam-bs
02:39 🔗 superkuh has joined #archiveteam-bs
02:39 🔗 ivan has quit IRC (Read error: Operation timed out)
02:39 🔗 zyphlar has quit IRC (Read error: Operation timed out)
02:40 🔗 jspiros has quit IRC (Read error: Operation timed out)
02:40 🔗 Petri152 has quit IRC (Read error: Operation timed out)
02:40 🔗 JAA has quit IRC (Read error: Operation timed out)
02:40 🔗 wabu has quit IRC (Read error: Operation timed out)
02:40 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
02:40 🔗 Stilett0 has joined #archiveteam-bs
02:41 🔗 Pixi` has quit IRC (Read error: Operation timed out)
02:42 🔗 ivan has joined #archiveteam-bs
02:43 🔗 svchfoo3 sets mode: +o ivan
02:44 🔗 wp494 has quit IRC (Read error: Operation timed out)
02:44 🔗 wp494 has joined #archiveteam-bs
02:54 🔗 Selavi has joined #archiveteam-bs
03:08 🔗 ta9le has quit IRC (Quit: Connection closed for inactivity)
03:16 🔗 m007a83 has quit IRC (Quit: Leaving)
03:33 🔗 rcunning_ has quit IRC (Connection closed for inactivity)
03:40 🔗 JAA has joined #archiveteam-bs
03:40 🔗 swebb sets mode: +o JAA
03:40 🔗 bakJAA sets mode: +o JAA
03:40 🔗 Petri152 has joined #archiveteam-bs
03:40 🔗 wabu has joined #archiveteam-bs
03:41 🔗 zyphlar has joined #archiveteam-bs
03:42 🔗 archodg_ has joined #archiveteam-bs
03:43 🔗 archodg has quit IRC (Read error: Operation timed out)
03:44 🔗 jspiros has joined #archiveteam-bs
03:45 🔗 odemg has quit IRC (Ping timeout: 268 seconds)
03:57 🔗 m007a83 has joined #archiveteam-bs
03:57 🔗 odemg has joined #archiveteam-bs
05:27 🔗 cf has left Bye
05:28 🔗 cf has joined #archiveteam-bs
05:28 🔗 cf has left Bye.
05:45 🔗 Pixi` has joined #archiveteam-bs
05:47 🔗 Pixi has quit IRC (west.us.hub irc.Prison.NET)
05:47 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
05:47 🔗 Mateon1 has quit IRC (west.us.hub irc.Prison.NET)
06:18 🔗 Mateon1 has joined #archiveteam-bs
06:18 🔗 achip has joined #archiveteam-bs
07:00 🔗 BlueMaxim has joined #archiveteam-bs
07:04 🔗 dxrt- has joined #archiveteam-bs
07:04 🔗 dxrt has quit IRC (ZNC - http://znc.sourceforge.net)
07:08 🔗 BlueMax has quit IRC (Ping timeout: 604 seconds)
07:08 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
07:09 🔗 BlueMax has joined #archiveteam-bs
07:26 🔗 schbirid has joined #archiveteam-bs
08:37 🔗 wp494 has quit IRC (Read error: Operation timed out)
08:39 🔗 wp494 has joined #archiveteam-bs
08:48 🔗 dxrt- is now known as dxrt
08:49 🔗 dxrt has quit IRC (Quit: ZNC - http://znc.sourceforge.net)
08:50 🔗 dxrt has joined #archiveteam-bs
08:56 🔗 Mateon1 has quit IRC (Ping timeout: 255 seconds)
08:56 🔗 Mateon1 has joined #archiveteam-bs
09:10 🔗 godane so dtic.mil is now not funny anymore
09:11 🔗 godane just trying to upload a file will cause a 403 error to dtic.mil
09:11 🔗 godane cause i try to scrape metadata from their website so the files can have metadata
09:33 🔗 jschwart has joined #archiveteam-bs
09:37 🔗 godane one possible theory is that when i'm curling the metadata it causes a 403 error cause i don't have firefox as user-agent
09:38 🔗 godane it's a best guess cause i remember i could download pdfs with Firefox as user-agent also
09:51 🔗 Flashfire Archivebot it?
10:13 🔗 godane ok i can access the website again
10:16 🔗 godane i think i got it working again
10:17 🔗 godane it was just that the 403 error blocking was making no sense for the amount i was grabbing
10:18 🔗 godane cause someone just browsing the website could get blocked, given that in my case it was like 1 url being scraped 6 times
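The two ideas in the messages above (send a browser-like User-Agent, and avoid requesting the same URL several times) can be sketched roughly as follows. This is only an illustration, not godane's actual script: the `FIREFOX_UA` string, the function names, and the `fetch` callable are all assumptions made for the example, and nothing here is confirmed to be what dtic.mil checks for.

```python
import time
import urllib.request

# Assumption: an illustrative Firefox-like User-Agent string, not a value from the log.
FIREFOX_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0"


def build_request(url, user_agent=FIREFOX_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_each_once(urls, fetch, delay=0.0):
    """Call fetch(request) once per distinct URL, pausing between requests.

    `fetch` is a hypothetical callable supplied by the caller, for example
    urllib.request.urlopen; repeated URLs are skipped instead of re-fetched.
    """
    results = {}
    for url in urls:
        if url in results:
            continue  # already fetched; do not hit the server again
        if results and delay:
            time.sleep(delay)  # be polite between distinct requests
        results[url] = fetch(build_request(url))
    return results
```

Passing `urllib.request.urlopen` as `fetch` would perform real requests; a stub can be passed instead to try the deduplication without touching the network.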
10:21 🔗 godane one of the newer ones: https://archive.org/details/DTIC_ADA497001
10:21 🔗 godane i have been lacking in uploading those this month cause i have tapes to digitize and upload
10:25 🔗 BlueMax has quit IRC (Quit: Leaving)
10:54 🔗 VoynichCr has joined #archiveteam-bs
10:55 🔗 VoynichCr has anyone thought about archiving all youtube metadata?
10:56 🔗 VoynichCr and maybe some frames or the main thumb
11:04 🔗 fenn 07:42 < archodg_> SketchCow, arkiver I'm working on this, https://old.reddit.com/r/DataHoarder/comments/906884/youtube_metadata_archive_because_working_with/ something that
11:04 🔗 fenn er, sorry for the highlights
11:04 🔗 archodg_ heh
11:08 🔗 archodg_ it's going well, I'm up to 600,500,000+ video ids
11:11 🔗 VoynichCr amazing
11:12 🔗 fenn looks heavily biased to french language videos
11:35 🔗 ta9le has joined #archiveteam-bs
11:49 🔗 odemg fenn, yeah that was the test file I was using from a french guy on the-eyes discord, the other lists I'm working with are 99% english
13:11 🔗 REiN^ has joined #archiveteam-bs
13:25 🔗 plue has quit IRC (Remote host closed the connection)
13:28 🔗 REiN^ has quit IRC (Read error: Connection reset by peer)
13:30 🔗 REiN^ has joined #archiveteam-bs
14:34 🔗 kiska To the person editing imgur's page, please replace the source with this: https://www.reddit.com/r/patreon/comments/7x4wx1
14:43 🔗 JAA kiska: Thanks. I knew there was a better link out there but couldn't find it.
14:58 🔗 plue has joined #archiveteam-bs
14:58 🔗 plue has quit IRC (Client Quit)
14:58 🔗 plue has joined #archiveteam-bs
15:15 🔗 m007a83 has quit IRC (Leaving)
15:17 🔗 kiska Thanks JAA
15:17 🔗 kiska btw JAA it was the topic for #imgone
15:24 🔗 Mateon1 has quit IRC (Remote host closed the connection)
15:24 🔗 Mateon1 has joined #archiveteam-bs
15:28 🔗 JAA Whoops
15:34 🔗 m007a83 has joined #archiveteam-bs
16:39 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
16:55 🔗 JAA Oh FFS, Twitter's new site also uses that awful scrolling thing where off-screen elements are removed from the DOM. Sigh.
17:01 🔗 JAA They also nuked the non-JS mobile site, mobile.twitter.com.
17:01 🔗 JAA Unless that's now UA-dependent or something.
17:04 🔗 JAA Ah, it needs a cookie. You get asked whether you want the legacy site when you access mobile.twitter.com without JS, which then sets the relevant cookie(s). Afterwards, it serves you the non-JS page.
17:13 🔗 ivan it might be interesting to design a generic mitigation that no-ops the removal of DOM elements that are not in the viewport
17:16 🔗 achip has joined #archiveteam-bs
17:56 🔗 SoniEx2 has quit IRC (Ping timeout: 264 seconds)
17:56 🔗 vectr0n_ has joined #archiveteam-bs
18:02 🔗 vectr0n has quit IRC (Read error: Operation timed out)
18:02 🔗 vectr0n_ is now known as vectr0n
18:08 🔗 SoniEx2 has joined #archiveteam-bs
18:29 🔗 SoniEx2 has quit IRC (Ping timeout: 360 seconds)
18:44 🔗 SoniEx2 has joined #archiveteam-bs
19:03 🔗 schbirid JAA: was already UA dependent iirc, i used an older opera mobile UA
19:59 🔗 SoniEx2 has quit IRC (Ping timeout: 264 seconds)
20:12 🔗 SoniEx2 has joined #archiveteam-bs
20:22 🔗 JAA schbirid: I'm pretty sure I was able to access it without a special UA with Firefox on Linux previously. As in, a few months ago or so.
20:27 🔗 JAA I'm scraping various sources for TalkTalk sites currently. Haven't quite figured out yet what to do with the pages I find though. Maybe I'll just !a < them.
20:28 🔗 JAA Bing appears to be fairly scraping-friendly. At least they don't insta-ban you like many other services if you use a reasonable delay between requests.
20:33 🔗 betamax has joined #archiveteam-bs
21:13 🔗 m007a83 has quit IRC (Leaving)
22:52 🔗 JAA Does anyone have any search term suggestions for TalkTalk? So far, I've searched for a plain site:talktalk.net and together with these terms: family history, genealogy, club, society, clan, company. That yielded 1243 websites through Bing. There must be more though.
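The query pattern JAA describes (a plain site: search plus the same search combined with each term) could be generated along these lines. This is a minimal sketch, not JAA's tooling: the function names are made up for the example, and the Bing search URL with a `q` parameter is the only external detail assumed.

```python
from urllib.parse import urlencode

# Search terms listed in the message above.
TERMS = ["family history", "genealogy", "club", "society", "clan", "company"]


def build_queries(site, terms):
    """Return the plain site: query followed by one query per extra term."""
    queries = [f"site:{site}"]
    queries += [f"site:{site} {term}" for term in terms]
    return queries


def search_url(query, base="https://www.bing.com/search"):
    """Build a search URL for one query string (hypothetical helper)."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    for q in build_queries("talktalk.net", TERMS):
        print(search_url(q))
```

Adding a new suggestion from the channel then only means appending a term to `TERMS`; any actual scraping would still need a reasonable delay between requests, as mentioned earlier in the log.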
22:57 🔗 jut has joined #archiveteam-bs
23:10 🔗 BlueMax has joined #archiveteam-bs
23:41 🔗 Flashfire I propose a warrior project for grabbing steam profiles, with the bans constantly sweeping over. What if we grab the numerical profiles?
23:47 🔗 m007a83 has joined #archiveteam-bs
23:48 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
