Time |
Nickname |
Message |
01:57
🔗
|
|
dashcloud has joined #archiveteam-ot |
03:00
🔗
|
|
systwiALT has joined #archiveteam-ot |
03:09
🔗
|
|
ephemer0l has joined #archiveteam-ot |
03:54
🔗
|
systwiALT |
ivan_: So I got postgres installed and running under El Capitan and I am slowly learning how things work (like I said, first time messing with SQL stuff :-L ) |
03:57
🔗
|
systwiALT |
My only question is, how do I organize my hierarchical data in this? Spreadsheet-style databases (there's probably a better term for them, idk) don't really [seem] to fit the bill |
03:58
🔗
|
|
m007a83_ has joined #archiveteam-ot |
03:59
🔗
|
|
m007a83_ has quit IRC (Client Quit) |
04:00
🔗
|
|
m007a83 has quit IRC (Ping timeout: 252 seconds) |
04:49
🔗
|
systwiALT |
Additionally, is there really much of a problem using hyphens in column names? I tried researching but wasn't exactly certain. I can use ALTER TABLE youtube_db ADD "foo-bar" varchar; with no problems, but it sounds like it could be? |
05:57
🔗
|
|
m007a83 has joined #archiveteam-ot |
06:20
🔗
|
|
Raccoon has quit IRC (Ping timeout: 258 seconds) |
06:23
🔗
|
|
Raccoon has joined #archiveteam-ot |
06:29
🔗
|
|
Raccoon has quit IRC (Ping timeout: 360 seconds) |
07:12
🔗
|
ivan_ |
I gave you an outline of a schema earlier |
07:13
🔗
|
ivan_ |
Yes, they're "spreadsheet style" |
07:16
🔗
|
ivan_ |
You can use more than one table to express different entities |
07:16
🔗
|
ivan_ |
You have channels, videos, and files |
07:18
🔗
|
ivan_ |
You can look up videos for a channel, then look up files for a video, and this is all fast with indexes |
07:19
🔗
|
ivan_ |
The primary key for a table is important and should be based on what makes the entity unique for that table |
07:19
🔗
|
ivan_ |
Re: dashes I would just use underscores |
08:07
🔗
|
|
Dragnog2 has joined #archiveteam-ot |
08:21
🔗
|
systwiALT |
I'm sorry I wasn't online when you explained, my connection dropped and all previous messages cleared. |
08:27
🔗
|
ivan_ |
systwiALT: |
08:27
🔗
|
ivan_ |
<JAA> I'd just store that in a simple relational DB with three tables (or more, depending on what else you want to store): channels, videos, and files. |
08:27
🔗
|
ivan_ |
<ivan_> to flesh that out: channels (channel_id, username, display_name) PK channel_id; videos (video_id, channel_id, video_title, ... .info.json stuff) PK video_id; files (video_id, file) PK (video_id, file) |
08:27
🔗
|
ivan_ |
<ivan_> add a channel_id index on videos |
08:27
🔗
|
ivan_ |
<ivan_> you can query to get videos in a channel, files for a video, etc |
08:27
🔗
|
ivan_ |
<ivan_> PK is primary key |
08:29
🔗
|
ivan_ |
when you have a primary key on something, it is indexed and you can do fast queries on it |
08:30
🔗
|
ivan_ |
you can also add indexes that aren't the primary key |
08:30
🔗
|
ivan_ |
also, when you have a primary key or index on (a, b, c) you can do fast lookups on (a, b), and (a) as well because of how the index is structured |
08:39
🔗
|
|
deevious has quit IRC (Remote host closed the connection) |
09:41
🔗
|
|
dhyan_nat has quit IRC (Read error: Operation timed out) |
09:43
🔗
|
|
bluefoo has quit IRC (Ping timeout: 615 seconds) |
09:58
🔗
|
|
dhyan_nat has joined #archiveteam-ot |
10:15
🔗
|
|
dhyan_nat has quit IRC (Read error: Operation timed out) |
10:56
🔗
|
JAA |
systwiALT: Also, logs of (some) AT IRC channels: http://archive.fart.website/bin/irclogger_logs |
11:06
🔗
|
|
Dragnog2 has quit IRC (Quit: Connection closed for inactivity) |
11:41
🔗
|
|
dhyan_nat has joined #archiveteam-ot |
12:19
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
13:44
🔗
|
|
dhyan_nat has quit IRC (Read error: Operation timed out) |
16:03
🔗
|
ivan_ |
systwiALT: another thing to think about is whether you want multiple variations of a video and how you're going to store that on disk and in the database |
16:04
🔗
|
ivan_ |
you would presumably want a way to have multiple rows in videos would mean something like a (video_id text, variant smallint) PK |
16:14
🔗
|
|
ShellyRol has quit IRC (Ping timeout: 745 seconds) |
16:38
🔗
|
JAA |
+ retrieval date most likely |
16:38
🔗
|
JAA |
Although the video files themselves can't get changed (I think?), the remaining info can I believe. |
16:38
🔗
|
JAA |
You'd probably want to keep track of the timeline. |
16:55
🔗
|
|
systwiALT has quit IRC (Read error: Operation timed out) |
16:59
🔗
|
|
ShellyRol has joined #archiveteam-ot |
17:28
🔗
|
ivan_ |
JAA: videos can be trimmed by anyone, modified in other ways with some kind of youtube partner status |
17:28
🔗
|
ivan_ |
yeah (video_id, retrieval_date) is a better key |
17:40
🔗
|
JAA |
Huh, interesting. |
17:41
🔗
|
|
DogsRNice has joined #archiveteam-ot |
17:44
🔗
|
ivan_ |
and sometimes youtube reencodes videos with a new broken encoder and either the video or audio don't play properly any more :-) :-) |
18:57
🔗
|
|
killsushi has joined #archiveteam-ot |
20:24
🔗
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
20:57
🔗
|
|
coderobe has quit IRC (Remote host closed the connection) |
21:37
🔗
|
|
coderobe has joined #archiveteam-ot |
22:28
🔗
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
22:28
🔗
|
|
Mateon1 has joined #archiveteam-ot |
22:42
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 745 seconds) |
22:43
🔗
|
|
Mateon1 has joined #archiveteam-ot |
23:06
🔗
|
|
godane has quit IRC (Ping timeout: 258 seconds) |
23:10
🔗
|
|
bluefoo has joined #archiveteam-ot |
23:13
🔗
|
|
t3 has joined #archiveteam-ot |
23:14
🔗
|
|
t3 has quit IRC (Client Quit) |
23:29
🔗
|
|
BlueMax has joined #archiveteam-ot |
23:29
🔗
|
|
godane has joined #archiveteam-ot |
23:29
🔗
|
|
t3 has joined #archiveteam-ot |