Time |
Nickname |
Message |
00:14
🔗
|
|
RichardG_ has joined #archiveteam-bs |
00:15
🔗
|
|
Lord_Nigh has joined #archiveteam-bs |
00:20
🔗
|
|
RichardG has quit IRC (Ping timeout: 615 seconds) |
00:29
🔗
|
|
Enthaniel has joined #archiveteam-bs |
00:33
🔗
|
Enthaniel |
Hi, I'm trying to find a Google+ post, but it isn't on the wayback machine. Does this mean that the ArchiveTeam project didn't scrape it, or could it just not be imported to the wayback machine yet? |
00:53
🔗
|
|
RichardG_ is now known as RichardG |
01:39
🔗
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
01:40
🔗
|
|
Mateon1 has joined #archiveteam-bs |
01:54
🔗
|
atphoenix |
Enthaniel, could be either. Someone else may be able to answer more closely. |
01:55
🔗
|
atphoenix |
it'd probably also need to have been a public post to have a fighting chance of being saved. |
02:02
🔗
|
Enthaniel |
Thanks! It was a public post I have a link to, and I was optimistic about it being saved because of the 98.6% figure on the wiki |
02:05
🔗
|
balrog |
can you share the link to the post? |
02:09
🔗
|
atphoenix |
last I heard not all the G+ data has been sent to IA yet, simply due to the scale of the project |
02:28
🔗
|
Enthaniel |
makes sense |
02:28
🔗
|
Enthaniel |
Here's what the link to the post was: https://plus.google.com/u/1/114478321563556097225/posts/f4wotPaxgFZ |
02:48
🔗
|
|
Maylay has quit IRC (Ping timeout: 258 seconds) |
02:51
🔗
|
kiska |
Well I would guess wait a few more months, cause there is still 500T of data yet to be uploaded to IA :D |
03:18
🔗
|
|
Maylay has joined #archiveteam-bs |
03:29
🔗
|
|
Smiley has joined #archiveteam-bs |
03:34
🔗
|
|
SmileyG has quit IRC (Ping timeout: 745 seconds) |
03:36
🔗
|
|
polymorph has joined #archiveteam-bs |
03:39
🔗
|
|
polymorph has quit IRC (Client Quit) |
03:52
🔗
|
|
Enthaniel has quit IRC (Ping timeout: 260 seconds) |
04:27
🔗
|
|
qw3rty__ has joined #archiveteam-bs |
04:31
🔗
|
|
qw3rty_ has quit IRC (Ping timeout: 276 seconds) |
04:44
🔗
|
|
Enthaniel has joined #archiveteam-bs |
04:45
🔗
|
Enthaniel |
Alright, I'll wait a while. Thanks for answering my questions (: |
05:15
🔗
|
OrIdow6 |
The League of Legends boards euw region has English listed twice in its language menu, and when you click them, they take you to the same page |
05:16
🔗
|
OrIdow6 |
In the JSON list of languages, the second is labeled as "en_PL" (Polish English?); it shows up in the interface as "English (EUNE)", where EUNE is a different region |
05:17
🔗
|
OrIdow6 |
Hopefully this isn't NoScript or something like that causing me to miss a language |
05:20
🔗
|
|
Enthaniel has quit IRC (Ping timeout: 260 seconds) |
05:40
🔗
|
OrIdow6 |
So if you give it any valid language, but there's no specific forum for that language, it will give you the "default" language for that region, in most cases English |
05:55
🔗
|
OrIdow6 |
French for both euw and eune looks to have already been shut down, and redirects somewhere |
05:58
🔗
|
|
Raccoon has quit IRC (Remote host closed the connection) |
06:08
🔗
|
OrIdow6 |
Spanish and English are shared across all regions that have them - it's the same set of threads - I assume that to prevent broken links, you'd want to get all of the duplicates, perhaps as a low priority thing if the rest gets done in time |
06:08
🔗
|
OrIdow6 |
At least, I think it's the same set - it reports the same count of threads for each, which is what I'm going by |
06:09
🔗
|
OrIdow6 |
Total 1338089 boards threads if you don't count duplicates; 4646752 |
06:09
🔗
|
OrIdow6 |
if you do |
06:11
🔗
|
OrIdow6 |
If someone is going to qwarc it, it is my belief that {"br": {"pt"}, "eune": {"cs", "el", "en", "hu", "pl", "ro"}, "euw": {"de", "en", "es", "it"}, "lan" : {"es"}, "las": {"es"}, "na": {"en"}, "oce": {"en"}, "ru": {"ru"}, "tr": {"tr"}, "jp": {"ja-jp"}, "pbe": {"en"}} are all valid regions and languages within regions |
06:15
🔗
|
OrIdow6 |
Or if anyone is going to get it through any other means, for that matter |
06:18
🔗
|
OrIdow6 |
*is all |
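For anyone scripting against the list above, here is a minimal sketch that flattens the region/language mapping into individual board roots. The mapping is copied from the message; the URL pattern is an assumption inferred from the boards.pbe link quoted elsewhere in this log, and `board_roots` is a hypothetical helper name.

```python
# Region -> language codes, copied from the message above.
REGION_LANGUAGES = {
    "br": {"pt"},
    "eune": {"cs", "el", "en", "hu", "pl", "ro"},
    "euw": {"de", "en", "es", "it"},
    "lan": {"es"},
    "las": {"es"},
    "na": {"en"},
    "oce": {"en"},
    "ru": {"ru"},
    "tr": {"tr"},
    "jp": {"ja-jp"},
    "pbe": {"en"},
}


def board_roots(mapping=REGION_LANGUAGES):
    """Return sorted (region, language, url) tuples.

    The URL layout is an assumption, not something confirmed in the log.
    """
    return [
        (region, lang, f"https://boards.{region}.leagueoflegends.com/{lang}/")
        for region in sorted(mapping)
        for lang in sorted(mapping[region])
    ]


if __name__ == "__main__":
    for region, lang, url in board_roots():
        print(region, lang, url)
```

Sorting both levels keeps the output order stable between runs, which helps when resuming a partial crawl.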
07:50
🔗
|
|
SilSte has quit IRC (Ping timeout: 186 seconds) |
08:04
🔗
|
|
godane has joined #archiveteam-bs |
08:04
🔗
|
|
d5f4a3622 has quit IRC (Read error: Connection reset by peer) |
08:04
🔗
|
|
d5f4a3622 has joined #archiveteam-bs |
08:05
🔗
|
godane |
latest digitize tapes : https://www.patreon.com/posts/digitize-tapes-34542507 |
08:06
🔗
|
|
d5f4a3622 has quit IRC (Read error: Operation timed out) |
08:06
🔗
|
|
d5f4a3622 has joined #archiveteam-bs |
08:11
🔗
|
|
DigiDigi has quit IRC (Read error: Operation timed out) |
08:13
🔗
|
|
DigiDigi has joined #archiveteam-bs |
09:24
🔗
|
|
Raccoon has joined #archiveteam-bs |
09:33
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
10:38
🔗
|
|
bitbit has quit IRC (Quit: Leaving) |
14:20
🔗
|
OrIdow6 |
*2128988 if you don't count duplicates; 1338089 is the number of threads in languages other than English and Spanish |
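The three figures quoted so far can be reconciled with a little arithmetic. This is only a sketch; it assumes the earlier "with duplicates" figure of 4646752 still stands after the correction above.

```python
UNIQUE_THREADS = 2128988     # corrected total, duplicates not counted
NON_EN_ES_THREADS = 1338089  # threads in languages other than English and Spanish
WITH_DUPLICATES = 4646752    # total when every region's copy is counted (assumed unchanged)

# Unique English plus Spanish threads, i.e. the sets shared across regions.
shared_unique = UNIQUE_THREADS - NON_EN_ES_THREADS

# Extra fetches needed if every duplicate copy is grabbed as well.
duplicate_overhead = WITH_DUPLICATES - UNIQUE_THREADS

if __name__ == "__main__":
    print(shared_unique)       # 790899
    print(duplicate_overhead)  # 2517764
```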
15:19
🔗
|
|
dashcloud has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
15:25
🔗
|
|
dashcloud has joined #archiveteam-bs |
15:28
🔗
|
|
dashcloud has quit IRC (Client Quit) |
16:13
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
16:30
🔗
|
|
systwi_ has joined #archiveteam-bs |
16:32
🔗
|
|
systwi has quit IRC (Read error: Operation timed out) |
16:52
🔗
|
|
Craigle has quit IRC (Quit: The Lounge - https://thelounge.chat) |
16:52
🔗
|
|
Craigle has joined #archiveteam-bs |
16:54
🔗
|
|
dashcloud has joined #archiveteam-bs |
17:00
🔗
|
|
DogsRNice has joined #archiveteam-bs |
17:04
🔗
|
Ryz |
How is progress on attempting to archive the League of Legends forums? Are we gonna start archiving when the forums don't accept new posts anymore? |
17:32
🔗
|
|
morgandaw has joined #archiveteam-bs |
17:41
🔗
|
|
MeeDee has quit IRC (Ping timeout: 745 seconds) |
17:53
🔗
|
|
Lord_Nigh has quit IRC (Quit: ZNC - http://znc.in) |
18:04
🔗
|
|
igloo25 has joined #archiveteam-bs |
18:34
🔗
|
|
HP_Archiv has quit IRC (Quit: Leaving) |
18:46
🔗
|
|
systwi_ is now known as systwi |
18:48
🔗
|
OrIdow6 |
The "forums" (as opposed to the "boards") look to be the bulk of it, and they haven't had anything new since 2014 |
18:51
🔗
|
|
phillipsj has quit IRC (Remote host closed the connection) |
19:03
🔗
|
|
thuban1 has joined #archiveteam-bs |
19:06
🔗
|
|
thuban has quit IRC (Read error: Operation timed out) |
19:22
🔗
|
OrIdow6 |
JAA or anyone else with the capability: what's the status on QWarc or similar? Need something like 9 threads/second avg at this point to finish both in time |
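A rough way to sanity-check a rate like this, as a sketch only: the log does not state the shutdown deadline or the thread count of the "forums", so both function names and the inputs below are illustrative.

```python
SECONDS_PER_DAY = 86400


def required_rate(total_items: int, seconds_remaining: float) -> float:
    """Average items per second needed to finish within the remaining time."""
    if seconds_remaining <= 0:
        raise ValueError("seconds_remaining must be positive")
    return total_items / seconds_remaining


def days_needed(total_items: int, rate_per_second: float) -> float:
    """Days of continuous crawling at a fixed average rate."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return total_items / rate_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Boards only, duplicates excluded, at the 9 threads/second quoted above.
    print(round(days_needed(2128988, 9), 2))
```

At 9 threads per second the 2128988 unique boards threads alone would take a bit under three days of uninterrupted crawling.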
19:30
🔗
|
|
thuban1 has quit IRC (Read error: Connection reset by peer) |
19:31
🔗
|
|
thuban1 has joined #archiveteam-bs |
19:45
🔗
|
|
TC01_ has quit IRC (Quit: No Ping reply in 180 seconds.) |
19:45
🔗
|
|
TC01 has joined #archiveteam-bs |
20:48
🔗
|
JAA |
The "forums" should be easy enough. The "boards" are trickier. |
20:50
🔗
|
JAA |
OrIdow6: Are there different language versions of the forums as well? |
20:51
🔗
|
JAA |
9 threads per second is easy with qwarc if their servers can handle it and don't ban us. |
20:59
🔗
|
OrIdow6 |
JAA: I haven't seen any indication of it |
21:04
🔗
|
OrIdow6 |
That is to say, no |
21:10
🔗
|
JAA |
Ah, it's all in the same system there. This is one of the German forums, for example: http://forums.euw.leagueoflegends.com/board/forumdisplay.php?f=40 |
21:10
🔗
|
JAA |
And of course those are not linked on http://forums.euw.leagueoflegends.com/board/ |
21:24
🔗
|
|
kris33 has joined #archiveteam-bs |
22:05
🔗
|
JAA |
I don't see any way to enumerate threads on the "boards". |
22:05
🔗
|
JAA |
OrIdow6: How did you find that hidden board https://boards.pbe.leagueoflegends.com/en/c/champions-gameplay-feedback ? |
22:19
🔗
|
OrIdow6 |
JAA: a web search |
22:20
🔗
|
OrIdow6 |
I assumed that you could find these by going through the recent threads list, & this seems to be correct - curl "https://boards.pbe.leagueoflegends.com/api/B0C6E0DB1E1358AE8B45243A6146FC3D667031EF/discussions?sort_type=recent&num_loaded=50000" | grep "gameplay-feedback" |
22:21
🔗
|
OrIdow6 |
Turn on JS, go to the "new" etc. lists - they're all there |
22:21
🔗
|
OrIdow6 |
All threads |
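The recent-threads check described above could be scripted roughly as follows. This is a sketch under assumptions: the endpoint layout and query parameters are taken from the curl command in the log, the response is treated as opaque text (its format is not described here), and `discussions_url`, `fetch_text` and `mentions` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def discussions_url(region, board_id, num_loaded=50000, sort_type="recent"):
    """Build the discussions-list URL in the shape used by the curl command."""
    query = urlencode({"sort_type": sort_type, "num_loaded": num_loaded})
    return (
        f"https://boards.{region}.leagueoflegends.com"
        f"/api/{board_id}/discussions?{query}"
    )


def fetch_text(url, timeout=60):
    """Download a URL and decode it as UTF-8, replacing undecodable bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def mentions(body, needle):
    """Equivalent of the grep step: does the response mention the needle?"""
    return needle in body


if __name__ == "__main__":
    url = discussions_url("pbe", "B0C6E0DB1E1358AE8B45243A6146FC3D667031EF")
    print(mentions(fetch_text(url), "gameplay-feedback"))
```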
22:26
🔗
|
OrIdow6 |
To extract the board ID (here a long hex string): boardID = getOnlyItem(re.findall(r'(?<=data-href="/api/)\w+(?=/discussions)', r.text)) |
22:27
🔗
|
OrIdow6 |
getOnlyItem being an unimportant caller of "assert" |
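A self-contained version of that extraction might look like the sketch below. The regex is the one quoted in the log; `get_only_item` is a guess at what getOnlyItem does based on the description above, and the sample HTML is made up for illustration.

```python
import re

# Same pattern as in the log, written as a raw string so \w is not
# treated as a string escape.
BOARD_ID_PATTERN = re.compile(r'(?<=data-href="/api/)\w+(?=/discussions)')


def get_only_item(items):
    """Return the single element of items, asserting there is exactly one."""
    assert len(items) == 1, f"expected exactly one item, got {len(items)}"
    return items[0]


def extract_board_id(html):
    """Pull the hex board ID out of a boards page's HTML."""
    return get_only_item(BOARD_ID_PATTERN.findall(html))


# Hypothetical snippet; the real page markup may differ.
SAMPLE_HTML = (
    '<div data-href="/api/B0C6E0DB1E1358AE8B45243A6146FC3D667031EF'
    '/discussions"></div>'
)

if __name__ == "__main__":
    print(extract_board_id(SAMPLE_HTML))
```

Failing loudly when there are zero or several matches makes a layout change on the site show up immediately instead of silently producing a wrong ID.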
22:28
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
22:40
🔗
|
JAA |
Yeah, the general pagination was going to be my next test. Good. |
22:48
🔗
|
|
BlueMax has joined #archiveteam-bs |
23:25
🔗
|
|
eythian has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
23:25
🔗
|
|
eythian has joined #archiveteam-bs |