| Time |
Nickname |
Message |
|
00:05
🔗
|
|
mismatch has joined #archiveteam-bs |
|
00:58
🔗
|
|
JesseW has joined #archiveteam-bs |
|
01:01
🔗
|
|
xXx_ndidd has quit IRC (Ping timeout: 633 seconds) |
|
01:20
🔗
|
bsmith093 |
JesseW: how goes the upload? and the csv |
|
01:20
🔗
|
JesseW |
the csvs have finished -- I need to load them into a database. |
|
01:22
🔗
|
JesseW |
the upload has also finished, after a total of about 63 hours. |
|
01:26
🔗
|
JesseW |
I broke the csvs into separate files per directory -- I'm calculating the total size now. |
|
01:34
🔗
|
godane |
so i'm close to having all of gawker.com sitemap |
|
01:37
🔗
|
|
bwn_ has joined #archiveteam-bs |
|
01:39
🔗
|
JesseW |
godane: nice! |
|
01:49
🔗
|
|
bwn has quit IRC (Read error: Operation timed out) |
|
01:57
🔗
|
JesseW |
bsmith093: so the total size of the CSVs is 3.5GB |
|
01:58
🔗
|
bsmith093 |
holy crap. will anything even read that massive sql thing?! |
|
02:06
🔗
|
JesseW |
3.5GB of data in a sql database isn't particularly large. |
|
02:07
🔗
|
JesseW |
It might be painful for sqlite (maybe), but not for other databases. |
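A minimal sketch of loading one of the per-directory CSVs into a SQL database in batches. SQLite is used here only because it ships with Python, not because it is the database JesseW chose. The table name `census`, the function name `load_csv`, and the presence of a header row in each CSV are all assumptions, not details from the log.

```python
import csv
import sqlite3


def load_csv(conn, csv_path, table="census", batch_size=10_000):
    """Insert the rows of one CSV file into `table`, batch by batch.

    Assumes the first row of the file is a header naming the columns
    and that every later row has the same number of fields.
    Returns the number of data rows inserted.
    """
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
        insert_sql = f'INSERT INTO "{table}" VALUES ({marks})'

        batch = []
        for row in reader:
            batch.append(row)
            if len(batch) >= batch_size:
                # Flush a full batch so memory use stays bounded.
                conn.executemany(insert_sql, batch)
                total += len(batch)
                batch.clear()
        if batch:
            # Flush whatever is left after the last full batch.
            conn.executemany(insert_sql, batch)
            total += len(batch)
    conn.commit()
    return total
```

Calling `load_csv(sqlite3.connect("census.db"), path)` once per CSV file would append every file to the same table; only one batch of rows is held in memory at a time, so the 3.5GB total never has to fit in RAM.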
|
02:35
🔗
|
* |
JesseW would like help/a listening chatroom (does that make sense?) in figuring out how to filter leaf nodes out of an adjacency list... |
|
02:36
🔗
|
JesseW |
I have a list of IA identifier -> collection it is in, and I want to filter out non-collections (i.e. leaf nodes) without loading the whole thing into a graph system |
|
03:02
🔗
|
Frogging |
being a CS student that sounds like something I should know |
|
03:02
🔗
|
Frogging |
but alas |
|
03:39
🔗
|
JesseW |
heh |
|
04:07
🔗
|
|
tomwsmf-a has quit IRC (Ping timeout: 258 seconds) |
|
04:09
🔗
|
JesseW |
Amusing item: https://archive.org/metadata/ia-das/metadata -- a collection which is a member of itself. |
|
04:38
🔗
|
|
bwn has joined #archiveteam-bs |
|
04:48
🔗
|
|
bwn_ has quit IRC (Read error: Operation timed out) |
|
05:17
🔗
|
remsen |
.j #justsolve |
|
05:17
🔗
|
remsen |
Shit. |
|
05:34
🔗
|
JesseW |
remsen: ? |
|
05:35
🔗
|
* |
JesseW has solved the graph problem I mentioned; it turned out just listing all the internal nodes, then running over the list again worked fine. |
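The two-pass approach JesseW describes could look roughly like the sketch below. It assumes the adjacency list is a sequence of `(identifier, collection)` pairs that can be read twice; the function name `collection_edges` and the sample identifiers (other than `ia-das`) are made up for illustration, and this is not JesseW's actual code.

```python
def collection_edges(rows):
    """Drop leaf nodes from an (identifier, collection) adjacency list.

    `rows` must be re-iterable (a list, or a file that is re-read),
    because it is scanned twice.
    """
    # Pass 1: every identifier that appears as a parent is an
    # internal node, i.e. a collection.
    internal = {parent for _child, parent in rows}

    # Pass 2: keep only the rows whose child is itself a collection.
    return [(child, parent) for child, parent in rows if child in internal]


rows = [
    ("item-a", "coll-1"),      # leaf item, dropped
    ("item-b", "coll-1"),      # leaf item, dropped
    ("coll-1", "coll-root"),   # collection inside a collection, kept
    ("ia-das", "ia-das"),      # a collection that is a member of itself, kept
]
result = collection_edges(rows)
```

Only the set of collection identifiers has to stay in memory (about 16,000 entries in this case), so no graph library is needed; a self-referencing collection such as `ia-das` is handled without special-casing because nothing is traversed.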
|
05:35
🔗
|
JesseW |
I was going through the IA census collections data -- it turns out there are about 16,000 collections. |
|
05:35
🔗
|
remsen |
JesseW, command fuckup! I now need to log off and toss my modem into the garbage. |
|
05:38
🔗
|
JesseW |
sympathy. modems -> :-( |
|
05:40
🔗
|
remsen |
My new modem/router combo (!!!) from TWC is actually an upgrade from the Linksys one I bought myself. |
|
05:41
🔗
|
remsen |
Well, the one that was purchased for the household. |
|
05:43
🔗
|
remsen |
It actually has decent default security too. Good on Arris. |
|
05:44
🔗
|
remsen |
It's obviously leased so I can't flash it. |
|
05:53
🔗
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
|
05:55
🔗
|
|
JesseW has quit IRC (Remote host closed the connection) |
|
05:56
🔗
|
|
Honno has joined #archiveteam-bs |
|
05:57
🔗
|
|
JesseW has joined #archiveteam-bs |
|
06:01
🔗
|
|
Sk1d has joined #archiveteam-bs |
|
06:26
🔗
|
godane |
i'm starting to upload Sky & Telescope: https://archive.org/details/Sky_and_Telescope_1941-11-cbr |
|
06:27
🔗
|
godane |
the cbr files |
|
06:28
🔗
|
godane |
you're going to get cbr and pdf collections of it |
|
06:29
🔗
|
godane |
this is mostly cause there could be gaps in both collections |
|
07:09
🔗
|
|
Start has quit IRC (Ping timeout: 260 seconds) |
|
07:36
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
07:41
🔗
|
|
JesseW has joined #archiveteam-bs |
|
08:05
🔗
|
|
schbirid has joined #archiveteam-bs |
|
08:09
🔗
|
|
mismatch has quit IRC (Ping timeout: 633 seconds) |
|
08:16
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
08:54
🔗
|
|
marvinw has quit IRC (Ping timeout: 633 seconds) |
|
09:21
🔗
|
godane |
I'm starting to upload gawker.com sitemap for 2005 |
|
10:01
🔗
|
|
marvinw has joined #archiveteam-bs |
|
10:04
🔗
|
|
bwn has quit IRC (Read error: Operation timed out) |
|
10:12
🔗
|
|
bwn has joined #archiveteam-bs |
|
10:20
🔗
|
|
marvinw has quit IRC (Read error: Connection reset by peer) |
|
10:29
🔗
|
|
marvinw has joined #archiveteam-bs |
|
10:45
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
10:59
🔗
|
godane |
SketchCow: the collection for this item is an item: https://archive.org/details/Lifehacker_Extra_17 |
|
11:00
🔗
|
godane |
https://archive.org/details/lifehacker |
|
11:00
🔗
|
godane |
may want to change it to lifehacker-extra or rev3-lifehacker-extra |
|
11:02
🔗
|
godane |
SketchCow: also some got darked after being marked as spam: https://archive.org/details/Lifehacker_Extra_1 |
|
11:16
🔗
|
arkiver |
godane: that's awesome! https://archive.org/details/Sky_and_Telescope_1941-11-cbr |
|
11:16
🔗
|
arkiver |
will you upload all years? |
|
11:23
🔗
|
godane |
up to 2009 |
|
11:25
🔗
|
godane |
i'm this far along: https://archive.org/details/Sky_and_Telescope_1960-12-cbr |
|
11:29
🔗
|
godane |
so looks like cbr files have no gaps |
|
11:29
🔗
|
godane |
it was only 1949 in the pdf format that had a gap |
|
11:29
🔗
|
godane |
mostly cause there is only 1 1949 magazine in pdf format |
|
13:18
🔗
|
|
Stiletto has quit IRC (Ping timeout: 260 seconds) |
|
13:19
🔗
|
|
Famicoma1 has quit IRC (Ping timeout: 260 seconds) |
|
13:19
🔗
|
|
Muad-Dib has quit IRC (Ping timeout: 260 seconds) |
|
13:19
🔗
|
|
Stiletto has joined #archiveteam-bs |
|
13:21
🔗
|
|
Famicoma1 has joined #archiveteam-bs |
|
13:26
🔗
|
|
vitzli has joined #archiveteam-bs |
|
13:31
🔗
|
|
Famicoma1 has quit IRC (Remote host closed the connection) |
|
13:31
🔗
|
|
Famicoma1 has joined #archiveteam-bs |
|
13:32
🔗
|
|
Muad-Dib has joined #archiveteam-bs |
|
14:37
🔗
|
|
metalcamp has joined #archiveteam-bs |
|
15:24
🔗
|
|
JesseW has joined #archiveteam-bs |
|
15:38
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
|
16:05
🔗
|
|
altlabel has joined #archiveteam-bs |
|
16:11
🔗
|
|
zino has quit IRC (Read error: Operation timed out) |
|
16:12
🔗
|
|
Start has joined #archiveteam-bs |
|
16:17
🔗
|
|
vitzli has quit IRC (Leaving) |
|
19:36
🔗
|
|
bwn has quit IRC (Read error: Operation timed out) |
|
20:03
🔗
|
|
bwn has joined #archiveteam-bs |
|
20:31
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
|
21:02
🔗
|
|
mksplg has quit IRC (Ping timeout: 260 seconds) |
|
21:05
🔗
|
|
Rickster has quit IRC (Remote host closed the connection) |
|
21:07
🔗
|
|
Rickster has joined #archiveteam-bs |
|
21:14
🔗
|
|
mksplg has joined #archiveteam-bs |
|
21:15
🔗
|
|
zino has joined #archiveteam-bs |
|
21:22
🔗
|
|
Stiletto is now known as Stilett0 |
|
21:39
🔗
|
|
bauruine has quit IRC (Ping timeout: 260 seconds) |
|
21:42
🔗
|
|
Stilett0 has quit IRC (Read error: Operation timed out) |
|
21:48
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
|
21:53
🔗
|
|
bauruine has joined #archiveteam-bs |
|
22:02
🔗
|
|
Stiletto has joined #archiveteam-bs |
|
22:06
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
22:06
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
|
22:18
🔗
|
|
Honno has quit IRC (Ping timeout: 492 seconds) |
|
23:14
🔗
|
|
toad2 has quit IRC (Read error: Operation timed out) |
|
23:14
🔗
|
|
toad1 has joined #archiveteam-bs |
|
23:27
🔗
|
|
Stiletto has quit IRC (Ping timeout: 260 seconds) |
|
23:32
🔗
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
|
23:39
🔗
|
|
JetBalsa has joined #archiveteam-bs |
|
23:59
🔗
|
|
ndiddy has joined #archiveteam-bs |