#archiveteam-bs 2019-07-28,Sun


Time Nickname Message
00:37 ndiddy has quit IRC (Quit: WeeChat 1.4)
00:38 ndiddy has joined #archiveteam-bs
01:12 RichardG_ has joined #archiveteam-bs
01:18 RichardG has quit IRC (Read error: Operation timed out)
01:36 RichardG_ is now known as RichardG
01:45 ShellyRol has joined #archiveteam-bs
01:57 kiskabak has quit IRC (Remote host closed the connection)
01:57 kiskabak has joined #archiveteam-bs
01:57 Fusl__ sets mode: +o kiskabak
01:57 Fusl sets mode: +o kiskabak
01:57 Fusl_ sets mode: +o kiskabak
02:36 killsushi has quit IRC (Quit: Leaving)
03:21 DogsRNice has quit IRC (Read error: Connection reset by peer)
03:48 wyatt8740 has quit IRC (Read error: Operation timed out)
03:49 wyatt8740 has joined #archiveteam-bs
03:52 qw3rty119 has joined #archiveteam-bs
03:56 qw3rty118 has quit IRC (Read error: Operation timed out)
04:11 wyatt8740 has quit IRC (Read error: Operation timed out)
04:14 wyatt8740 has joined #archiveteam-bs
05:01 wyatt8740 has quit IRC (Read error: Operation timed out)
07:45 Flashfloo has quit IRC (Remote host closed the connection)
07:45 Flashfire has quit IRC (Remote host closed the connection)
07:45 kiska has quit IRC (Remote host closed the connection)
07:45 Flashfloo has joined #archiveteam-bs
07:45 kiska has joined #archiveteam-bs
07:45 Fusl__ sets mode: +o kiska
07:45 Fusl sets mode: +o kiska
07:45 Fusl_ sets mode: +o kiska
07:45 Flashfire has joined #archiveteam-bs
08:08 VADemon_ has joined #archiveteam-bs
08:08 JH88 has quit IRC (Read error: Connection reset by peer)
08:08 wyatt8740 has joined #archiveteam-bs
08:09 JH88 has joined #archiveteam-bs
08:14 VADemon has quit IRC (Read error: Operation timed out)
09:18 schbirid has joined #archiveteam-bs
09:24 m007a83 has quit IRC (Read error: Operation timed out)
09:49 DigiDigi has quit IRC (Remote host closed the connection)
10:03 DigiDigi has joined #archiveteam-bs
12:24 BlueMax has quit IRC (Read error: Connection reset by peer)
13:43 Dragnog has quit IRC (Quit: The Lounge - https://thelounge.chat)
13:50 Dragnog has joined #archiveteam-bs
14:47 benjinsmi has quit IRC (Remote host closed the connection)
14:48 benjinsmi has joined #archiveteam-bs
14:56 Dragnog Thanks @arkiver. Would it be useful if I made a list of the forums that have been migrated and sent that over?
14:56 arkiver yes
14:56 arkiver :)
14:56 * arkiver is back in 30 mins
14:56 JAA arkiver: I don't see how an AB job would work for this since the forum listing redirects to the new forums instead.
14:57 JAA Need to bruteforce the thread IDs probably.
14:57 arkiver Dragnog is going to try to get a list
14:57 arkiver and
14:58 arkiver we can try to get a list of subforums from, for example, https://web.archive.org/web/20190601121723/https://us.battle.net/forums/en/wow/
14:58 arkiver then do an !a < job including the https://us.battle.net/forums/en/wow/ URL, and then I think it should get all pages?
14:58 arkiver I could be wrong, I'm not very much into archievbot
14:58 arkiver archivebot*
15:02 JAA Oh, the subforum listings are still available, right. Yeah, might work.
15:06 JAA Actually, no, --no-parent will still interfere.
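Note: the "bruteforce the thread IDs" idea at 14:57 could be sketched roughly as below. The URL template is purely hypothetical, since the log does not say how thread URLs on these forums are formed; the function only generates candidate URLs and does not fetch anything.

```python
from typing import Iterator

# HYPOTHETICAL pattern: the log does not state the real thread URL format.
THREAD_URL_TEMPLATE = "https://us.battle.net/forums/en/wow/topic/{thread_id}"


def candidate_thread_urls(
    start: int, stop: int, template: str = THREAD_URL_TEMPLATE
) -> Iterator[str]:
    """Yield one candidate URL for every thread ID in the range [start, stop)."""
    for thread_id in range(start, stop):
        yield template.format(thread_id=thread_id)


if __name__ == "__main__":
    # Print a small sample; a real run would cover the full ID range.
    for url in candidate_thread_urls(1, 4):
        print(url)
```

A list produced this way could be saved to a text file and used as input for a list-based job such as the `!a <` job mentioned at 14:58, assuming the real URL pattern is substituted first.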
15:26 JAA Do our tracker stats get archived anywhere? Trying to figure out how big Vidme was, but the trackers for it (vidme and vidme2) were deleted at some point. It would be quite ironic if we don't still have that data somewhere...
15:35 JAA (For Vidme specifically, there are some snapshots in the WBM, but the stats.json is missing for vidme2, so that's still not helping.)
15:38 JAA (And the stats.json for vidme is from while the project was still active, so that's useless as well.)
15:38 JAA So yeah, who archives the archivists?
15:45 Dragnog The Overwatch forum roots are currently not redirecting. Here is the list of those forums.
15:46 Dragnog_ has joined #archiveteam-bs
15:46 Dragnog_ Overwatch https://us.battle.net/forums/en/overwatch/ https://eu.battle.net/forums/en/overwatch/ https://eu.battle.net/forums/de/overwatch/ https://eu.battle.net/forums/fr/overwatch/ https://eu.battle.net/forums/ru/overwatch/ https://eu.battle.net/forums/es/overwatch/ https://eu.battle.net/forums/it/overwatch/ https://kr.battle.net/forums/ko/overwatch/ https://tw.battle.net/forums/zh/overwatch/ https://us.battle.ne
15:46 Dragnog_ Sorry, still getting used to IRC. Last one got cut off: https://eu.battle.net/forums/pl/overwatch/
16:04 JAA Dragnog_: It cut off before that. Your message ended with "/zh/overwatch/ https://us.battle.ne".
16:04 JAA The webchat thingy is awful. Consider using a proper client instead.
16:12 Dragnog Ah sorry. I installed The Lounge on AWS but VNC wouldn't let me copy from my local machine. I've stuck them in a pastebin instead. https://pastebin.com/BCib0ZXP
16:13 Dragnog_ has quit IRC (Quit: Page closed)
16:22 Dragnog Hearthstone links https://pastebin.com/4QhGYM1h
17:04 Dragnog Here is the complete list. It appears it's only the WoW forums which are redirecting. The rest seem to be intact atm https://pastebin.com/Q2Y3Q2FL
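Note: the message truncated at 15:46 is a typical result of IRC's line length limit (the protocol caps a line at 512 bytes including the trailing CR-LF, and part of that is taken up by the command and prefix). A minimal sketch of splitting a URL list into several shorter messages follows; the 400-byte budget is an assumed safety margin, not a value from the log, and `chunk_urls` is a hypothetical helper name.

```python
from typing import Iterable, List


def chunk_urls(urls: Iterable[str], max_bytes: int = 400) -> List[str]:
    """Group URLs into space-separated messages of at most max_bytes UTF-8 bytes.

    A single URL longer than max_bytes is still emitted on its own line,
    since it cannot be split without breaking it.
    """
    messages: List[str] = []
    current = ""
    for url in urls:
        candidate = url if not current else current + " " + url
        if not current or len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            # Adding this URL would exceed the budget: start a new message.
            messages.append(current)
            current = url
    if current:
        messages.append(current)
    return messages


if __name__ == "__main__":
    sample = [
        "https://us.battle.net/forums/en/overwatch/",
        "https://eu.battle.net/forums/en/overwatch/",
        "https://eu.battle.net/forums/de/overwatch/",
    ]
    for line in chunk_urls(sample, max_bytes=90):
        print(line)
```

For long lists, posting a pastebin link as was done at 16:12 remains the simpler option.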
17:48 VerifiedJ has joined #archiveteam-bs
19:11 ShellyRol I'm currently using grab-site to archive a site. The URL of the site has a string of randomly generated characters that keeps changing, so the DUPE detector keeps thinking it's grabbing a new page when it is not. Here's an example URL: http://www.novaworld2.com/index.php?idtag=5d3df300d3ebd&do=/public/forums/ Has anyone dealt with something like this before?
19:12 ShellyRol Not sure if this goes in -bs or -ot
19:24 Fusl ShellyRol: if they return the same data, add them to ignores; if they are different, that's okay since they are different anyways
19:28 ShellyRol But some links have unique IDs, such as the forums, for example: http://www.novaworld2.com/index.php?idtag=5d3df70f29757&do=/public/forums/display_topic/id_13078 but my concern is that the ?idtag keeps changing. It would be hard to whitelist every unique link since there are so many
19:42 schbirid where does that idtag come from? maybe it's a session id and you can get rid of it by allowing cookies?
19:51 ShellyRol I am using cookies to grab the site since a login is required to view most of it. Here's the command I'm using: grab-site "http://www.novaworld2.com/index.php?idtag=5d3cf5e46b94c&do=/public/forums/" --wpull-args=--load-cookies=/home/user/share/Novaworld2/cookies.txt --ua "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" --no-offsite-links
19:51 ShellyRol --import-ignores /home/user/share/Novaworld2/ignore --import-ignores /home/user/share/Novaworld2/ignore
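Note: one way to check offline whether two crawled URLs differ only by the changing `idtag` value is to drop that query parameter and compare what is left. The sketch below uses only Python's standard `urllib.parse`; `strip_idtag` is a hypothetical helper name, and this does not change how grab-site itself detects duplicates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_idtag(url: str, param: str = "idtag") -> str:
    """Return the URL with the given query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # safe="/" keeps the slashes in values such as do=/public/forums/ unescaped.
    return urlunsplit(parts._replace(query=urlencode(kept, safe="/")))


if __name__ == "__main__":
    a = "http://www.novaworld2.com/index.php?idtag=5d3df300d3ebd&do=/public/forums/"
    b = "http://www.novaworld2.com/index.php?idtag=5d3cf5e46b94c&do=/public/forums/"
    # Both reduce to the same URL once idtag is removed.
    print(strip_idtag(a) == strip_idtag(b))
```

URLs that reduce to the same string are candidates for the "same data" case Fusl describes; whether the server really returns identical pages for them would still need to be checked.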
20:41 DogsRNice has joined #archiveteam-bs
20:48 schbirid has quit IRC (Remote host closed the connection)
22:30 VerifiedJ has quit IRC (Quit: Leaving)
23:12 BlueMax has joined #archiveteam-bs
