| Time |
Nickname |
Message |
|
00:05
🔗
|
|
Raccoon has joined #archiveteam-bs |
|
00:21
🔗
|
Ryz |
Time to work on writing that M&A article soon - I have a list of sources I picked up while constantly archiving those goods~ |
|
00:40
🔗
|
|
bitbit has quit IRC (Quit: Leaving) |
|
00:55
🔗
|
|
Pixi has quit IRC (Read error: Operation timed out) |
|
01:06
🔗
|
|
Pixi has joined #archiveteam-bs |
|
01:16
🔗
|
|
VADemon has joined #archiveteam-bs |
|
01:42
🔗
|
|
pew has quit IRC (Ping timeout: 276 seconds) |
|
01:54
🔗
|
|
pew has joined #archiveteam-bs |
|
04:21
🔗
|
|
qw3rty__ has joined #archiveteam-bs |
|
04:29
🔗
|
|
qw3rty_ has quit IRC (Read error: Operation timed out) |
|
06:28
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
06:28
🔗
|
|
HP_Archiv has quit IRC (Read error: Connection reset by peer) |
|
06:29
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
06:30
🔗
|
|
HP_Archiv has quit IRC (Client Quit) |
|
06:30
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
06:32
🔗
|
|
HP_Archiv has quit IRC (Client Quit) |
|
06:33
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
06:33
🔗
|
|
HP_Archiv has quit IRC (Read error: Connection reset by peer) |
|
06:33
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
06:33
🔗
|
|
HP_Archiv has quit IRC (Client Quit) |
|
07:01
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
|
07:01
🔗
|
|
HP_Archiv has quit IRC (Read error: Connection reset by peer) |
|
10:07
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
|
11:57
🔗
|
|
morgandaw has joined #archiveteam-bs |
|
12:03
🔗
|
|
MeeDee has quit IRC (Read error: Operation timed out) |
|
13:37
🔗
|
|
icedice has joined #archiveteam-bs |
|
13:38
🔗
|
|
MeeDee has joined #archiveteam-bs |
|
13:39
🔗
|
|
morgandaw has quit IRC (Ping timeout: 276 seconds) |
|
13:50
🔗
|
|
VADemon has quit IRC (left4dead) |
|
14:05
🔗
|
|
themadpro has joined #archiveteam-bs |
|
14:20
🔗
|
atphoenix |
quite the list here https://www.zdnet.com/article/a-list-of-security-conferences-canceled-or-postponed-due-to-coronavirus-concerns/ |
|
14:23
🔗
|
|
VADemon has joined #archiveteam-bs |
|
14:32
🔗
|
nicolas17 |
atphoenix: oh cool a list of cv-related sites |
|
14:32
🔗
|
nicolas17 |
for your consideration https://coronainfo.xyz/ I don't know about its reliability |
|
14:44
🔗
|
|
bitbit has joined #archiveteam-bs |
|
14:58
🔗
|
|
Mateon1 has quit IRC (Read error: Connection reset by peer) |
|
14:58
🔗
|
|
Mateon1 has joined #archiveteam-bs |
|
15:30
🔗
|
|
themadpro has quit IRC (Quit: This computer has gone to sleep) |
|
15:35
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
|
15:37
🔗
|
|
themadpro has joined #archiveteam-bs |
|
15:39
🔗
|
|
themadpro has quit IRC (Client Quit) |
|
15:41
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
|
15:51
🔗
|
atphoenix |
another cv-related tech conference change list https://www.zdnet.com/article/sxsw-canceled-due-to-coronavirus-all-the-2020-tech-conference-cancellations-and-travel-bans/ |
|
15:52
🔗
|
|
themadpro has joined #archiveteam-bs |
|
16:17
🔗
|
|
themadpro has quit IRC (Quit: This computer has gone to sleep) |
|
16:23
🔗
|
|
themadpro has joined #archiveteam-bs |
|
16:45
🔗
|
|
themadpro has quit IRC (Quit: This computer has gone to sleep) |
|
16:56
🔗
|
|
themadpro has joined #archiveteam-bs |
|
18:13
🔗
|
|
mtntmnky has quit IRC (Remote host closed the connection) |
|
18:13
🔗
|
|
mtntmnky has joined #archiveteam-bs |
|
18:27
🔗
|
|
Dallas has joined #archiveteam-bs |
|
18:41
🔗
|
|
Video_ has joined #archiveteam-bs |
|
19:54
🔗
|
|
d5f4a3622 has quit IRC (Read error: Connection reset by peer) |
|
19:56
🔗
|
|
d5f4a3622 has joined #archiveteam-bs |
|
20:22
🔗
|
|
bitbit has quit IRC (Remote host closed the connection) |
|
20:34
🔗
|
betamax |
regarding the discussion on #archiveteam re: BBC Mixital, I was the one who initially made the page on the wiki for it |
|
20:34
🔗
|
betamax |
then subsequently realised I just didn't have the time to work on it |
|
20:35
🔗
|
betamax |
I don't think the discussion around our approach to archiving it ever got past the "let's decide on a name for the channel" :) |
|
20:35
🔗
|
betamax |
a lot will be difficult to get as it's all JS |
|
20:36
🔗
|
betamax |
but the fanfic, reports and videos should be do-able |
|
20:37
🔗
|
OrIdow6 |
A full archive looks like it would require a lot of human effort |
|
20:37
🔗
|
betamax |
cc: themadpro, JAA, PurpleSym (tagging you all as you've edited the wiki page) |
|
20:37
🔗
|
OrIdow6 |
Might have to split off into subgroups to work on each of the different channels (which seem to be more or less separate pieces of software, with a few superficial qualities and e.g. the view counter and comments in common) |
|
20:38
🔗
|
OrIdow6 |
Which porbably won't happen |
|
20:38
🔗
|
OrIdow6 |
*probably |
|
20:38
🔗
|
betamax |
shall we create a channel to avoid clogging up -bs? how about #mixdown ? |
|
20:39
🔗
|
betamax |
(although I'm trying to avoid becoming too active in this since I have very little free time) |
|
20:44
🔗
|
themadpro |
same here |
|
20:45
🔗
|
themadpro |
at least for this week |
|
20:45
🔗
|
themadpro |
#mixdown does sound nice but hear me out |
|
20:45
🔗
|
themadpro |
#missal |
|
20:45
🔗
|
|
balrog has quit IRC (Quit: Bye) |
|
20:46
🔗
|
themadpro |
btw I couldn't finish the wiki page but in summary they seem to come in three types: |
|
20:48
🔗
|
themadpro |
1) Text |
|
20:48
🔗
|
themadpro |
1) Text, 2) Media (Photo/Video/Artwork), 3) "Interactives" (Moviemakers, level designers and a "dancing robot maker") |
|
20:48
🔗
|
|
balrog has joined #archiveteam-bs |
|
20:51
🔗
|
themadpro |
The first two categories should be pretty trivial but I fear the third might be particularly challenging |
|
20:52
🔗
|
themadpro |
And they knew, the FAQ both mentions how to download your stuff and how to "screen-cap" specifically for one of the interactives |
|
21:01
🔗
|
OrIdow6 |
They say that there are millions, but I only count 32350 with the search API |
|
21:02
🔗
|
themadpro |
yeah, I don't think it's in the millions, it shouldn't be too heavy of a task |
|
21:03
🔗
|
|
Lord_Nigh has quit IRC (Quit: ZNC - http://znc.in) |
|
21:04
🔗
|
|
Lord_Nigh has joined #archiveteam-bs |
|
22:17
🔗
|
JAA |
phuzion: Thanks, I've thrown it into ArchiveBot. That will not be complete though as in-thread pagination of replies as well as posts on user profiles are done with JS. |
|
22:18
🔗
|
JAA |
(That's for http://activedir.org/ for reference.) |
|
22:27
🔗
|
|
Video_ has quit IRC (Leaving) |
|
22:29
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
|
22:36
🔗
|
Sanqui |
where www. and non-www domains exist, which one do we prefer |
|
22:36
🔗
|
Sanqui |
and does wayback show when the "other version" is available? |
|
22:37
🔗
|
atphoenix |
Sanqui, I have not seen WBM offer alternative equivalent domains even when they exist. IME you need to know about them. |
|
22:37
🔗
|
Sanqui |
darn |
|
22:37
🔗
|
JAA |
The WBM strips the protocol, any authentication data, and a leading "www." from all URLs when you access it. |
|
22:38
🔗
|
JAA |
It's just not very obvious. |
|
22:38
🔗
|
Sanqui |
oh! |
|
22:38
🔗
|
Sanqui |
that's good to know |
|
22:38
🔗
|
Sanqui |
that weakly implies we should omit www. too |
|
22:39
🔗
|
|
bitbit has joined #archiveteam-bs |
|
22:39
🔗
|
JAA |
For example, https://web.archive.org/web/20200308101216/http://example.org/ can also be accessed as https://web.archive.org/web/20200308101216/something://foo:bar@www.example.org/ |
|
22:39
🔗
|
JAA |
Which means that in the rare case of the www domain serving different content from the non-www, you get it all mixed on the WBM. |
|
22:40
🔗
|
Sanqui |
I know WBM also ignores case, which is arguably a bigger issue. |
|
22:40
🔗
|
Sanqui |
And probably http and https. |
|
22:40
🔗
|
JAA |
Yep |
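(Editor's sketch, not part of the log.) The collapsing described above can be illustrated roughly as follows. This is only a minimal approximation of the behaviour as stated in the chat (protocol, authentication data, a leading "www." and case are all ignored); the function name `wbm_style_key` is hypothetical, and this is not the Wayback Machine's actual canonicalization code.

```python
from urllib.parse import urlsplit


def wbm_style_key(url: str) -> str:
    """Collapse a URL roughly as described in the chat (illustrative only)."""
    parts = urlsplit(url)
    # .hostname drops any "user:pass@" prefix and the port, and lowercases.
    host = parts.hostname or ""
    # Drop a single leading "www." label.
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep an explicit port, since the chat says nothing about ports.
    if parts.port is not None:
        host += f":{parts.port}"
    key = host + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    # The scheme is never added to the key; lowercasing the whole key
    # mimics the case collapsing mentioned above.
    return key.lower()


a = wbm_style_key("http://example.org/")
b = wbm_style_key("something://foo:bar@www.example.org/")
print(a == b)
```

Under these assumptions both spellings from JAA's example map to the same key, which is also why distinct www and non-www content, or paths differing only in case, would end up mixed together.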
|
22:40
🔗
|
Sanqui |
But well, as long as the underlying WARCs are sane, it can all be fixed one day. |
|
22:40
🔗
|
Sanqui |
I'm going to be stripping out www. then |
|
22:40
🔗
|
JAA |
(Narrator: It was never fixed.) |
|
22:40
🔗
|
Sanqui |
lol. |
|
22:41
🔗
|
JAA |
The case collapsing is a serious issue, cf. Picosong for example. |
|
22:41
🔗
|
VADemon |
There was an article in favor of keeping "www" but i cant find it because search engines |
|
22:41
🔗
|
Sanqui |
There are articles against keeping www, for keeping www, and for using double www.www. |
|
22:42
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
|
22:42
🔗
|
JAA |
http://no-www.org/ and https://web.archive.org/web/20081217064517/http://www.www.extra-www.org/ from back in the day. :-) |
|
22:43
🔗
|
JAA |
I'm in the former camp. |
|
22:43
🔗
|
JAA |
But for archival, well, whatever's appropriate for the particular site. Sometimes there are hard links with a particular spelling in the HTML and the site is served on both, so that then prescribes which to use. |
|
22:43
🔗
|
JAA |
If both versions are used, welp. |
|
22:44
🔗
|
JAA |
(At least when talking about AB. Custom tools can of course do recursion on a set of domains instead.) |
|
22:47
🔗
|
Sanqui |
Yup, yup. |
|
22:47
🔗
|
Sanqui |
Cheers. |
|
22:53
🔗
|
atphoenix |
betamax, themadpro OrIdow6 : on mixital: #mixedup #mixitdown #mixitnone #unmixed #unmixit |
|
22:53
🔗
|
bitbit |
urls should be case sensitive? |
|
22:54
🔗
|
nicolas17 |
URLs are case sensitive except for the hostname |
|
22:54
🔗
|
nicolas17 |
seems WBM works differently :) |
|
22:54
🔗
|
atphoenix |
URL case sensitivity depends on the web server. Some web servers can be configured to be case-agnostic |
|
22:55
🔗
|
nicolas17 |
a webserver running on Windows and serving static files will probably be case-insensitive in practice |
|
22:55
🔗
|
bitbit |
yikes. I think I messed up with a proxy I wrote a while ago for a workplace then |
|
22:55
🔗
|
nicolas17 |
(I think IIS used to allow backslashes in place of slashes too D:) |
|
22:55
🔗
|
JAA |
*URLs* are always case sensitive. What the web server does with a particular HTTP request is unrelated to URL syntax. |
|
22:55
🔗
|
nicolas17 |
^ |
|
22:56
🔗
|
|
themadpro has quit IRC (Quit: This computer has gone to sleep) |
|
22:56
🔗
|
atphoenix |
^that says it better |
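(Editor's sketch, not part of the log.) The point about URL syntax can be shown with a small comparison that follows nicolas17's statement: scheme and hostname compare case-insensitively, everything else compares exactly. The helper name `same_url` is hypothetical, and whether a given server happens to return the same content for differently cased paths is a separate question this does not cover.

```python
from urllib.parse import urlsplit


def same_url(a: str, b: str) -> bool:
    """Compare two URLs: case-insensitive scheme/host, exact everything else."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (
        # urlsplit already lowercases the scheme; .hostname is lowercased too.
        pa.scheme == pb.scheme
        and pa.hostname == pb.hostname
        and pa.port == pb.port
        # Userinfo, path, query and fragment are compared byte for byte.
        and pa.username == pb.username
        and pa.password == pb.password
        and pa.path == pb.path
        and pa.query == pb.query
        and pa.fragment == pb.fragment
    )


print(same_url("HTTP://EXAMPLE.org/Foo", "http://example.org/Foo"))
print(same_url("http://example.org/Foo", "http://example.org/foo"))
```

A proxy or archive index that lowercases the whole URL before comparing would treat the second pair as identical, which is the failure mode discussed above.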
|
22:56
🔗
|
bitbit |
double www sounds pretty crazy |
|
22:56
🔗
|
atphoenix |
and double www is crazy |
|
22:57
🔗
|
nicolas17 |
yesterday I reviewed *.kde.org and there's some www -> no www redirects and some the other way |
|
22:58
🔗
|
nicolas17 |
er |
|
22:58
🔗
|
nicolas17 |
kde domains* |
|
22:58
🔗
|
JAA |
Yes, double www is crazy. One should use as many wwws as DNS allows. |
|
22:58
🔗
|
nicolas17 |
calligra.org -> www.calligra.org; www.kde.org -> kde.org |
|
22:58
🔗
|
atphoenix |
anyhow Apache webservers can do lots of fancy stuff with mod_redirect |
|
22:59
🔗
|
atphoenix |
I mean mod_rewrite |
|
23:00
🔗
|
nicolas17 |
mod_rewrite is voodoo |
|
23:00
🔗
|
JAA |
And this discussion is getting into -ot territory. |
|
23:00
🔗
|
nicolas17 |
according to its official documentation |
|
23:00
🔗
|
bitbit |
I'm pretty surprised that "we" managed to get the entire world to even accept "https://www." ... it's such an awkward technical thing to have on every billboard, etc. |
|
23:00
🔗
|
nicolas17 |
JAA: true |
|
23:13
🔗
|
betamax |
in the interests of making progress with Mixital, I've made a channel on hackint: #missitall |
|
23:13
🔗
|
betamax |
cc: themadpro, OrIdow6, JAA, PurpleSym |
|
23:14
🔗
|
JAA |
On hackint, I hope? |
|
23:14
🔗
|
betamax |
yup |
|
23:14
🔗
|
JAA |
:-) |
|
23:16
🔗
|
|
jamrs has quit IRC (Read error: Operation timed out) |
|
23:21
🔗
|
|
BlueMax has joined #archiveteam-bs |