#archiveteam-bs 2020-06-19,Fri


Time Nickname Message
00:23 🔗 SilSte has quit IRC (Ping timeout: 745 seconds)
00:27 🔗 Raccoon has joined #archiveteam-bs
00:53 🔗 jason0597 has quit IRC (Read error: Operation timed out)
01:03 🔗 odemgi JAA, another please: https://www.raphnet.net/
01:03 🔗 JAA Ryz: ^ Can you take care of this please?
01:05 🔗 Ryz Hello odemgi, what is your reason for wanting https://www.raphnet.net/ to be archived?
01:05 🔗 odemgi valuable old tech guides/hacks
01:07 🔗 odemgi we need reasons now, are we running out of storage? xD
01:09 🔗 Ryz It's running right now; it's more of trying to get a better idea of said goods to be archived
01:09 🔗 Ryz Can't just blindly take in the requested archive
01:11 🔗 odemgi I'll stick me contacts in, *blinks*, ehh, she's a small one. Someone asked me to mirror it at the-eye but I wanted to make sure it's on wayback too :)
01:31 🔗 BlueMax has quit IRC (Remote host closed the connection)
01:31 🔗 BlueMax has joined #archiveteam-bs
01:32 🔗 BlueMax has quit IRC (Remote host closed the connection)
01:33 🔗 BlueMax has joined #archiveteam-bs
01:34 🔗 BlueMax has quit IRC (Remote host closed the connection)
01:35 🔗 BlueMax has joined #archiveteam-bs
01:35 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
01:36 🔗 BlueMax has joined #archiveteam-bs
01:37 🔗 BlueMax has quit IRC (Remote host closed the connection)
01:37 🔗 BlueMax has joined #archiveteam-bs
01:38 🔗 BlueMax has quit IRC (Remote host closed the connection)
01:39 🔗 BlueMax has joined #archiveteam-bs
01:55 🔗 OrIdow6 has quit IRC (Remote host closed the connection)
01:57 🔗 OrIdow6 has joined #archiveteam-bs
02:32 🔗 OrIdow6 Mapillary:
02:34 🔗 OrIdow6 Started in 2013 as a "crowdsourced" website where people upload geotagged photos (cf. Panoramio)
02:35 🔗 OrIdow6 Pictures licensed under some CC license that made commercial use difficult (they changed from NC to ND in 2014); the company made money by selling exceptions
02:37 🔗 OrIdow6 Used a lot by the OpenStreetMap people
02:38 🔗 OrIdow6 It was bought by Facebook recently, and they've changed the terms to allow commercial use without charging anyone
02:39 🔗 OrIdow6 To try to answer: "Mapillary is shutting down???"
02:39 🔗 OrIdow6 People speculate that the reason Facebook is doing this is to try to compete with Google Street View
02:41 🔗 OrIdow6 Looking at their homepage, they've deemphasized the crowdsourcing aspect; there's a lot of emphasis on how (as far as I can tell) they're getting imagery by putting cameras on ridesharing cars and paying the drivers
02:41 🔗 OrIdow6 Though how successful they've been in doing this, I don't know
02:42 🔗 OrIdow6 And I think the most worrying thing is that Facebook has made all the images free, which means that they're no longer making money the old way
02:45 🔗 OrIdow6 Worst case is that Facebook puts all the stress on the ridesharing thing, crowdsourcing becomes an artifact of the past, and they close it all down
02:45 🔗 nicolas17 https://www.archiveteam.org/index.php?title=Mapillary
02:47 🔗 OrIdow6 Best case is that they see crowdsourcing - presumably by people who don't have Facebook accounts - as being viable, and continue with it
02:47 🔗 nicolas17 I don't think the homepage has changed in a while
02:47 🔗 OrIdow6 nicolas17: I know, I'm trying to see if there's any discussion to be had here
02:48 🔗 OrIdow6 nicolas17: Are you one of the OSM people?
02:48 🔗 nicolas17 yes, and I was also sent Mapillary stuff (such as windshield phone mounts) to distribute
02:48 🔗 nicolas17 and I have uploaded 100k photos
02:49 🔗 OrIdow6 What do you think Facebook is going to do?
02:50 🔗 nicolas17 I said I don't think there is an *imminent* risk of disappearing, and got this reply
02:50 🔗 nicolas17 <Ryz> on the topic of Mapillary and whether things will immediately change when it was announced they were acquired, eeh, I feel there should be some proactive action, because from experience archiving companies being acquired and their internet presence, they just bothered to change or outright delete information
02:51 🔗 nicolas17 they are likely to change their marketing text (and we'd want to ensure the old version of the website is archived) but not likely to kill the platform in the near future
02:51 🔗 nicolas17 what will happen a year from now, I don't know
02:52 🔗 nicolas17 either way, I don't think AT has the resources to archive that much data :/
02:53 🔗 JAA ~300 TB would probably be feasible *if* IA wants it.
02:55 🔗 OrIdow6 Yeah, it's more whether the IA wants to spend millions storing the data
02:56 🔗 nicolas17 it's crazy scale
02:56 🔗 nicolas17 OrIdow6: if you could get a list of all photo IDs and their lat/lon coordinates, and store it in efficient binary format (eg. 8 byte doubles, not text), the *list* would take 36GB
02:58 🔗 JAA nicolas17: We're all reading the main channel. :-)
02:58 🔗 nicolas17 I didn't know if OrIdow6 was here before
02:58 🔗 JAA If that 300 TB estimate is accurate, it's big but far from the biggest we've done (Google+ at 1.4 PiB).
02:59 🔗 OrIdow6 I'm always here
02:59 🔗 OrIdow6 Usually
03:02 🔗 nicolas17 the photo count is on the home page (1,195,679,315), the "300KB/photo" figure is my guess from looking at very few actual filesizes so I don't know how accurate it is
03:03 🔗 lennier2 has joined #archiveteam-bs
03:04 🔗 OrIdow6 And yeah, the 36 GB list may have impressed me in the past, but not anymore
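The 36 GB figure for the ID/coordinate list can be sanity-checked with a fixed-width binary record. This is a minimal sketch, assuming each photo key is a 22-character URL-safe base64 string that decodes to 16 bytes (an assumption based on the shape of the user key in the curl command later in this log, not a documented property of photo keys), stored alongside two 8-byte doubles for latitude and longitude. The record layout and the names `RECORD` and `pack_photo` are hypothetical.

```python
import base64
import struct

PHOTO_COUNT = 1_195_679_315  # photo count quoted from the Mapillary home page in this log

# Hypothetical record layout: 16-byte binary key, latitude, longitude
# (little-endian, no padding between fields).
RECORD = struct.Struct("<16sdd")


def pack_photo(key_b64: str, lat: float, lon: float) -> bytes:
    """Pack one photo entry; assumes a 22-char URL-safe base64 key."""
    raw_key = base64.urlsafe_b64decode(key_b64 + "==")  # restore stripped padding
    if len(raw_key) != 16:
        raise ValueError("expected a key that decodes to 16 bytes")
    return RECORD.pack(raw_key, lat, lon)


total_bytes = PHOTO_COUNT * RECORD.size
print(f"{RECORD.size} bytes/record, {total_bytes / 2**30:.1f} GiB total")
```

Under these assumptions each entry takes 32 bytes, which puts the whole list in the same ballpark as the 35 to 36 GB mentioned in the channel.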
03:08 🔗 OrIdow6 In any case, I think that 300 TB is too big for a proactive project, given the lack of anything concrete
03:08 🔗 OrIdow6 Someone should probably throw it into AB, though
03:09 🔗 OrIdow6 But it's something to monitor
03:11 🔗 lennier1 has quit IRC (Read error: Operation timed out)
03:11 🔗 lennier2 is now known as lennier1
03:19 🔗 nicolas17 I just downloaded the first picture of the most recent 140 sequences I uploaded, average size 410KB
03:20 🔗 OrIdow6 So about 446 TiB
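The size estimates in this exchange are the photo count multiplied by an average file size. A small sketch of that arithmetic, assuming "KB" means 1000 bytes and that the sampled averages are representative (nicolas17 notes they may not be); `total_size` is a hypothetical helper name.

```python
PHOTO_COUNT = 1_195_679_315  # from the home page, per the log


def total_size(avg_bytes_per_photo: int) -> tuple[float, float]:
    """Return the estimated total as (decimal TB, binary TiB)."""
    total = PHOTO_COUNT * avg_bytes_per_photo
    return total / 10**12, total / 2**40


for label, avg in (("300 KB guess", 300_000), ("410 KB sample", 410_000)):
    tb, tib = total_size(avg)
    print(f"{label}: {tb:.0f} TB = {tib:.0f} TiB")
```

With the 410 KB average this comes to roughly 446 TiB, matching the figure above; the earlier 300 KB guess gives a bit under 360 TB.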
03:23 🔗 tar-xvf_ has joined #archiveteam-bs
03:25 🔗 OrIdow6 has quit IRC (Quit: Leaving.)
03:25 🔗 Datechnom If there are no copies of it on IA then we should really get it in there
03:25 🔗 odemgi has quit IRC (Ping timeout: 265 seconds)
03:27 🔗 nicolas17 I'm also not sure how to enumerate all photos
03:28 🔗 nicolas17 maybe cover the whole map doing searches by geographical area
03:32 🔗 JAA Is there any way to browse their data without signing up or using special software?
03:35 🔗 nicolas17 yes
03:36 🔗 nicolas17 to get my last 140 sequences as I did earlier, I opened my browser dev tools and went to the tab with my uploads, then messed with the API parameters and ended up with this:
03:36 🔗 nicolas17 curl -v --globoff 'https://a.mapillary.com/v3/model.json?client_id=MkJKbDA0bnZuZlcxeTJHTmFqN3g1dzo1YTM0NjRkM2EyZGU5MzBh&paths=[["sequencesByUserKey","aoZLD95CuCzEWNLXWNiihw",{"from":0,"to":140},["first_key","key"]]]&method=get'
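The same request URL can be assembled programmatically instead of hand-editing the query string. A hedged sketch that only builds the URL from the pieces visible in the curl command above (endpoint, `client_id`, `paths`, `method`); it does not send anything, and whether the endpoint still answers is not something this log can confirm. `build_sequences_url` and `YOUR_CLIENT_ID` are hypothetical names.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://a.mapillary.com/v3/model.json"  # endpoint as it appears in the log


def build_sequences_url(client_id: str, user_key: str, count: int) -> str:
    """Build the model.json URL listing a user's sequences from 0 to count."""
    paths = [[
        "sequencesByUserKey",
        user_key,
        {"from": 0, "to": count},
        ["first_key", "key"],
    ]]
    query = {
        "client_id": client_id,
        # Compact JSON, mirroring the bracketed structure in the curl command.
        "paths": json.dumps(paths, separators=(",", ":")),
        "method": "get",
    }
    return f"{BASE_URL}?{urlencode(query)}"


print(build_sequences_url("YOUR_CLIENT_ID", "aoZLD95CuCzEWNLXWNiihw", 140))
```

`urlencode` percent-encodes the brackets and quotes, so `--globoff` is not needed if the printed URL is passed to curl; the server should see the same decoded `paths` value.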
03:37 🔗 JAA Ah yeah, I found it at https://www.mapillary.com/app/
03:37 🔗 nicolas17 oh you meant as an end user :P
03:38 🔗 JAA Yeah, to see how to search for e.g. all images within some lat/lon box.
03:42 🔗 nicolas17 afaik when you scroll through the map they use mapbox vector tiles for the background map, and another map layer in the same vector tile format for the images
03:43 🔗 JAA Yeah, I don't think we need to care about the map at all since that's just a render of OSM data.
03:43 🔗 nicolas17 indeed, and that doesn't even come from their server anyway
03:43 🔗 qw3rty_ has joined #archiveteam-bs
03:44 🔗 nicolas17 just pointing out that the map layer with photos (which does come from mapillary's server) is in the same format
03:44 🔗 JAA Right :-)
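One way to "cover the whole map" by geographical area is to walk a standard slippy-map tile grid and turn each tile into a lat/lon bounding box. This is a sketch of the generic OpenStreetMap tile math only; it makes no assumption about which Mapillary endpoint would accept these boxes or tiles, and the zoom level is an arbitrary illustration. `tile_bbox` and `all_tiles` are hypothetical names.

```python
import math
from typing import Iterator


def tile_bbox(z: int, x: int, y: int) -> tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for slippy-map tile z/x/y."""
    n = 2 ** z

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        # Inverse Web Mercator: row 0 is the northern edge of the map.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return lon(x), lat(y + 1), lon(x + 1), lat(y)


def all_tiles(z: int) -> Iterator[tuple[int, int, int, tuple[float, float, float, float]]]:
    """Yield every tile at zoom z together with its bounding box."""
    n = 2 ** z
    for x in range(n):
        for y in range(n):
            yield z, x, y, tile_bbox(z, x, y)


for z, x, y, bbox in all_tiles(1):
    print(z, x, y, [round(v, 4) for v in bbox])
```

The number of tiles grows as 4**z, so a real enumeration would likely subdivide only tiles that actually contain images rather than walking a deep zoom level exhaustively.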
03:46 🔗 kiska Calculate the cost for the data at $1700 per TB of data to be stored at IA forever
03:46 🔗 OrIdow6 has joined #archiveteam-bs
03:46 🔗 kiska 13:46:27 <kiska> Calculate the cost for the data at $1700 per TB of data to be stored at IA forever
03:47 🔗 kiska XD
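The cost kiska asks for is a single multiplication once a size is fixed. A minimal sketch, assuming the quoted $1700 applies per decimal TB and using the size estimates floated earlier in the channel; `storage_cost_usd` is a hypothetical helper name.

```python
COST_PER_TB_USD = 1700  # figure quoted by kiska; TB assumed to be 10**12 bytes


def storage_cost_usd(size_tb: float) -> float:
    """Estimated cost of storing size_tb terabytes at the quoted rate."""
    return size_tb * COST_PER_TB_USD


# 300 TB: JAA's round figure; 359 TB: 300 KB/photo; 490 TB: 410 KB/photo.
for size_tb in (300, 359, 490):
    print(f"{size_tb} TB -> ${storage_cost_usd(size_tb):,.0f}")
```

At that rate the three estimates land between about half a million and a bit over 800 thousand dollars.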
03:47 🔗 JAA First step would anyway be to gather metadata, and that would be fairly small.
03:48 🔗 JAA Then we can ask IA whether they want a billion pictures of the world with a size estimate. :-P
03:48 🔗 kiska If anyone wants to upload a 35GB file to me... Don't
03:48 🔗 kiska I don't have the storage for that xD
03:48 🔗 OrIdow6 cat /dev/urandom | curl
03:49 🔗 JAA cat /dev/zero | curl
03:49 🔗 kiska XD
03:49 🔗 JAA There's no compression or similar involved. :-P
03:49 🔗 kiska ./transfersh --basedir /dev/null
03:49 🔗 JAA Oof
03:49 🔗 JAA Anyway
03:51 🔗 qw3rty has quit IRC (Read error: Operation timed out)
04:29 🔗 Ryz I've launched archives of https://www.mapillary.com/ in regards to their social media, their GitHub account, and their sub-domains, but not the main domain that is https://www.mapillary.com/ yet (and their forums)
04:30 🔗 Ryz A bit curious to see they didn't mention being acquired by Facebook on their social media when it happened
05:08 🔗 nicolas17 yeah interesting that there was no tweet from them
05:08 🔗 OrIdow6 Thanks Ryz
05:09 🔗 Ryz Again, I haven't run the main domain and their forum yet, mainly because I don't know how the latter is, and don't really want ArchiveBot presence to be too much for 'em when running the sub-domains I guess <#>;
05:10 🔗 nicolas17 https://www.openstreetmap.org/user/jesolem/diary/393358
05:21 🔗 OrIdow6 Interesting that Facebook has already "invested" somewhat in OpenStreetMap
05:22 🔗 OrIdow6 Supports the "Google Maps competitor" idea
05:31 🔗 OrIdow6 And seems to me to give Mapillary a better chance, if that's what this is about
05:31 🔗 lennier2 has joined #archiveteam-bs
05:35 🔗 lennier1 has quit IRC (Ping timeout: 272 seconds)
05:35 🔗 lennier2 is now known as lennier1
05:36 🔗 nicolas17 has quit IRC (Quit: Konversation terminated!)
05:36 🔗 pew has quit IRC (Ping timeout: 265 seconds)
05:36 🔗 pew has joined #archiveteam-bs
06:29 🔗 Rush has joined #archiveteam-bs
06:34 🔗 Rush has quit IRC (Ping timeout: 252 seconds)
06:37 🔗 HCross kiska: this is the reason my 100TB box is being built
07:53 🔗 VoynichCr have you seen the "Add successor" github feature?
07:54 🔗 VoynichCr https://help.github.com/en/github/setting-up-and-managing-your-github-user-account/maintaining-ownership-continuity-of-your-user-accounts-repositories
09:08 🔗 wessel152 has quit IRC (Read error: Operation timed out)
09:20 🔗 Datechnom HCross Where is that going to be hosted? Built and then colocated in a DC, I assume?
09:21 🔗 nyany has quit IRC (Read error: Operation timed out)
09:26 🔗 nyany has joined #archiveteam-bs
09:26 🔗 nyany has quit IRC (Excess Flood)
09:27 🔗 nyany has joined #archiveteam-bs
09:36 🔗 HCross Datechnom: yes, hopefully this month
09:40 🔗 Datechnom That's awesome. Guessing it will be one big all rsync target and worker node?
09:40 🔗 Datechnom What are the specs, HCross?
09:40 🔗 HCross yeah, 100TB of disk, 2x decent E5, RAM x LOTS (192GB afaik)
09:41 🔗 Datechnom Datechnoman approves of this purchase :D
10:45 🔗 wessel152 has joined #archiveteam-bs
11:08 🔗 lunik13 has quit IRC (Quit: :x)
11:11 🔗 lunik13 has joined #archiveteam-bs
11:27 🔗 HCross Datechnom: mainly not target, that's my next plan - this is Long Term Storage
12:30 🔗 dashcloud has quit IRC (Ping timeout: 610 seconds)
12:45 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
13:26 🔗 jason0597 has joined #archiveteam-bs
14:46 🔗 Arcorann has quit IRC (Read error: Connection reset by peer)
15:27 🔗 dashcloud has joined #archiveteam-bs
15:35 🔗 tar-xvf_ has quit IRC (Quit: Leaving)
15:38 🔗 systwi_ has quit IRC (Read error: Connection reset by peer)
15:39 🔗 systwi has joined #archiveteam-bs
15:42 🔗 dashcloud has quit IRC (Read error: Operation timed out)
15:51 🔗 sirvy has quit IRC (Quit: WeeChat 2.3)
15:54 🔗 sirvy has joined #archiveteam-bs
16:10 🔗 sHATNER has quit IRC (Ping timeout: 272 seconds)
17:15 🔗 dashcloud has joined #archiveteam-bs
17:23 🔗 nicolas17 has joined #archiveteam-bs
17:32 🔗 dashcloud has quit IRC (Read error: Operation timed out)
18:06 🔗 lennier2 has joined #archiveteam-bs
18:09 🔗 mgrytbak has quit IRC (Ping timeout: 272 seconds)
18:09 🔗 synm0nger has joined #archiveteam-bs
18:10 🔗 lennier1 has quit IRC (Ping timeout: 272 seconds)
18:10 🔗 Laverne has quit IRC (Ping timeout: 272 seconds)
18:10 🔗 lennier2 is now known as lennier1
18:10 🔗 brayden has quit IRC (Ping timeout: 272 seconds)
18:10 🔗 SynMonger has quit IRC (Ping timeout: 272 seconds)
18:12 🔗 mgrytbak has joined #archiveteam-bs
18:20 🔗 dashcloud has joined #archiveteam-bs
18:31 🔗 tomaspark has joined #archiveteam-bs
18:41 🔗 BartoCH has quit IRC (Quit: WeeChat 2.7)
18:47 🔗 dashcloud has quit IRC (Read error: Operation timed out)
18:51 🔗 dashcloud has joined #archiveteam-bs
19:06 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:15 🔗 Stilett0 has joined #archiveteam-bs
19:15 🔗 Stilett0 has quit IRC (Client Quit)
19:17 🔗 brayden has joined #archiveteam-bs
19:18 🔗 Laverne has joined #archiveteam-bs
19:42 🔗 larryv has joined #archiveteam-bs
19:47 🔗 jason0597 has quit IRC (Read error: Operation timed out)
19:51 🔗 Aoede has joined #archiveteam-bs
19:53 🔗 sHATNER has joined #archiveteam-bs
20:17 🔗 lennier2 has joined #archiveteam-bs
20:23 🔗 lennier1 has quit IRC (Read error: Operation timed out)
20:23 🔗 lennier2 is now known as lennier1
20:52 🔗 dashcloud has joined #archiveteam-bs
21:02 🔗 lennier2 has joined #archiveteam-bs
21:08 🔗 lennier1 has quit IRC (Read error: Operation timed out)
21:08 🔗 lennier2 is now known as lennier1
21:34 🔗 jason0597 has joined #archiveteam-bs
22:08 🔗 bsmith093 has quit IRC (Read error: Operation timed out)
22:22 🔗 bsmith093 has joined #archiveteam-bs
22:43 🔗 ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
22:53 🔗 ephemer0l has joined #archiveteam-bs
23:35 🔗 BlueMax has joined #archiveteam-bs
23:36 🔗 Maylay has quit IRC (Ping timeout: 610 seconds)
23:37 🔗 HP_Archiv has joined #archiveteam-bs
23:46 🔗 wyatt8740 has joined #archiveteam-bs
23:47 🔗 wyatt8750 has joined #archiveteam-bs
23:47 🔗 wyatt8740 has quit IRC (Read error: Operation timed out)
23:47 🔗 wyatt8750 is now known as wyatt8740
23:54 🔗 Maylay has joined #archiveteam-bs
23:54 🔗 Maylay has quit IRC (Remote host closed the connection!)
