Time |
Nickname |
Message |
00:05
🔗
|
|
Sanqui has joined #archiveteam-bs |
00:05
🔗
|
|
Kaz has joined #archiveteam-bs |
00:05
🔗
|
|
abstract has joined #archiveteam-bs |
00:05
🔗
|
|
svchfoo1 sets mode: +o Kaz |
00:05
🔗
|
|
PotcFdk has joined #archiveteam-bs |
00:06
🔗
|
|
apache2 has joined #archiveteam-bs |
00:06
🔗
|
|
acridAxid has joined #archiveteam-bs |
00:08
🔗
|
|
atg has joined #archiveteam-bs |
00:09
🔗
|
|
riking_ has joined #archiveteam-bs |
00:09
🔗
|
|
lenary has joined #archiveteam-bs |
00:09
🔗
|
|
lenary has quit IRC (Read error: Connection reset by peer) |
00:09
🔗
|
|
bashNinja has joined #archiveteam-bs |
00:10
🔗
|
|
diggan has joined #archiveteam-bs |
00:10
🔗
|
|
diggan has quit IRC (Read error: Connection reset by peer) |
00:10
🔗
|
|
riking_ has quit IRC (Read error: Connection reset by peer) |
00:11
🔗
|
|
kpcyrd has joined #archiveteam-bs |
00:11
🔗
|
|
revi has joined #archiveteam-bs |
00:12
🔗
|
|
justcool4 has joined #archiveteam-bs |
00:13
🔗
|
|
riking_ has joined #archiveteam-bs |
00:15
🔗
|
|
diggan has joined #archiveteam-bs |
00:17
🔗
|
|
lenary has joined #archiveteam-bs |
00:19
🔗
|
|
mattl has joined #archiveteam-bs |
00:25
🔗
|
|
justcool4 is now known as justcool3 |
00:30
🔗
|
|
abartov__ has joined #archiveteam-bs |
00:31
🔗
|
|
picklefac has joined #archiveteam-bs |
00:34
🔗
|
|
DrasticAc has joined #archiveteam-bs |
00:35
🔗
|
|
pnJay has joined #archiveteam-bs |
00:38
🔗
|
|
amelia386 has joined #archiveteam-bs |
00:39
🔗
|
|
tchaypo_ has joined #archiveteam-bs |
00:39
🔗
|
|
starlord has joined #archiveteam-bs |
00:40
🔗
|
|
jesse-s has joined #archiveteam-bs |
00:41
🔗
|
|
alex73 has joined #archiveteam-bs |
00:41
🔗
|
|
Vito` has joined #archiveteam-bs |
00:42
🔗
|
|
JSharp___ has joined #archiveteam-bs |
00:43
🔗
|
|
ThisAsYou has joined #archiveteam-bs |
00:45
🔗
|
|
xit has joined #archiveteam-bs |
00:46
🔗
|
|
fallenoak has joined #archiveteam-bs |
00:47
🔗
|
|
tech234a has joined #archiveteam-bs |
00:47
🔗
|
|
hook54321 has joined #archiveteam-bs |
00:47
🔗
|
|
deathy__ has joined #archiveteam-bs |
00:47
🔗
|
|
Iglooop1 sets mode: +o hook54321 |
00:47
🔗
|
|
Ctrl-S___ has joined #archiveteam-bs |
00:50
🔗
|
|
Ivy has joined #archiveteam-bs |
00:50
🔗
|
|
c0mpass has joined #archiveteam-bs |
00:51
🔗
|
|
HCross has joined #archiveteam-bs |
00:51
🔗
|
|
horkermon has joined #archiveteam-bs |
01:28
🔗
|
|
pew has quit IRC (Ping timeout: 276 seconds) |
01:37
🔗
|
|
coderobe has joined #archiveteam-bs |
01:39
🔗
|
|
pew has joined #archiveteam-bs |
02:38
🔗
|
|
ShellyRol has quit IRC (Ping timeout: 492 seconds) |
02:42
🔗
|
|
ShellyRol has joined #archiveteam-bs |
03:17
🔗
|
|
qw3rty__ has joined #archiveteam-bs |
03:21
🔗
|
|
qw3rty_ has quit IRC (Ping timeout: 276 seconds) |
03:22
🔗
|
|
wp494 has joined #archiveteam-bs |
03:30
🔗
|
|
Fionera has joined #archiveteam-bs |
03:31
🔗
|
|
MillerBOS has quit IRC (Read error: Operation timed out) |
03:31
🔗
|
|
Fionera_ has quit IRC (Read error: Operation timed out) |
03:31
🔗
|
|
halt_ has quit IRC (Read error: Operation timed out) |
03:31
🔗
|
|
MillerBO- has joined #archiveteam-bs |
03:32
🔗
|
|
MillerBO- is now known as MillerBOS |
03:32
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
03:32
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
03:33
🔗
|
|
halt_ has joined #archiveteam-bs |
04:43
🔗
|
|
vitzli has joined #archiveteam-bs |
04:46
🔗
|
|
ShellyRol has quit IRC (Read error: Operation timed out) |
04:48
🔗
|
|
lennier1 has quit IRC (Read error: Operation timed out) |
04:49
🔗
|
|
lennier1 has joined #archiveteam-bs |
04:56
🔗
|
|
lennier2 has joined #archiveteam-bs |
05:01
🔗
|
|
ShellyRol has joined #archiveteam-bs |
05:01
🔗
|
|
lennier1 has quit IRC (Ping timeout: 496 seconds) |
05:02
🔗
|
|
lennier2 is now known as lennier1 |
08:45
🔗
|
|
Ryz has quit IRC (Read error: Connection reset by peer) |
08:45
🔗
|
|
kiska1825 has quit IRC (Read error: Connection reset by peer) |
08:45
🔗
|
|
Ryz8 has joined #archiveteam-bs |
08:45
🔗
|
|
kiska1825 has joined #archiveteam-bs |
08:57
🔗
|
|
vitzli has quit IRC (Leaving) |
09:34
🔗
|
|
BlueMax has joined #archiveteam-bs |
09:43
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
09:49
🔗
|
|
lennier2 has joined #archiveteam-bs |
09:49
🔗
|
|
Muad-Dib has joined #archiveteam-bs |
09:51
🔗
|
|
godane has quit IRC (Read error: Connection reset by peer) |
09:53
🔗
|
|
BlueMax has quit IRC (Ping timeout: 745 seconds) |
09:55
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
09:57
🔗
|
|
lennier1 has quit IRC (Ping timeout: 745 seconds) |
10:06
🔗
|
|
lennier2_ has joined #archiveteam-bs |
10:06
🔗
|
|
lennier2_ is now known as lennier1 |
10:09
🔗
|
|
lennier2_ has joined #archiveteam-bs |
10:10
🔗
|
|
lennier2 has quit IRC (Ping timeout: 745 seconds) |
10:14
🔗
|
|
lennier1 has quit IRC (Read error: Operation timed out) |
10:14
🔗
|
|
lennier2_ is now known as lennier1 |
10:21
🔗
|
|
ivan has joined #archiveteam-bs |
10:23
🔗
|
|
mgrytbak has joined #archiveteam-bs |
10:29
🔗
|
|
Jens has joined #archiveteam-bs |
10:30
🔗
|
|
Jens has quit IRC (Client Quit) |
10:32
🔗
|
|
Jens has joined #archiveteam-bs |
10:57
🔗
|
|
Jens has quit IRC (Quit: Jens) |
10:58
🔗
|
|
Jens has joined #archiveteam-bs |
11:02
🔗
|
|
ShellyRol has quit IRC (Read error: Operation timed out) |
11:03
🔗
|
|
ShellyRol has joined #archiveteam-bs |
11:43
🔗
|
|
Meroje has joined #archiveteam-bs |
12:06
🔗
|
|
tomaspark has joined #archiveteam-bs |
12:53
🔗
|
|
Raccoon has quit IRC (Remote host closed the connection) |
12:59
🔗
|
|
Raccoon has joined #archiveteam-bs |
13:02
🔗
|
|
Raccoon has quit IRC (Remote host closed the connection) |
13:31
🔗
|
|
balrog has quit IRC (Ping timeout: 492 seconds) |
14:00
🔗
|
|
balrog has joined #archiveteam-bs |
14:48
🔗
|
|
godane has joined #archiveteam-bs |
15:33
🔗
|
|
tempnicl has joined #archiveteam-bs |
15:34
🔗
|
tempnicl |
Hi. I need to archive the past few days of a twitter account, including its responses to other tweets. Can you fine people give me some advice on best practices? |
15:34
🔗
|
tempnicl |
@JAA recommended the use of snscrape + grab-site. |
15:35
🔗
|
tempnicl |
This will give me a WARC. How can I export the tweets in a text format as well, one that can easily be published to e.g. a gist? |
15:35
🔗
|
JAA |
Ah, if you just want the contents directly, not a proper archive, then just snscrape. |
15:36
🔗
|
JAA |
Specifically, with --format, you can produce any output you like. |
15:37
🔗
|
JAA |
It's not really documented currently, but here's what you can use: https://github.com/JustAnotherArchivist/snscrape/blob/b6cc3180d97f1f9e9004f52e832333678d8c46f7/snscrape/modules/twitter.py#L15-L24 |
15:37
🔗
|
tempnicl |
Ah, interesting. I thought snscrape only returns urls to the tweets which still have to be processed by some other tool |
15:37
🔗
|
JAA |
That's what it does by default, yes. |
15:37
🔗
|
|
ShellyRol has quit IRC (Read error: Operation timed out) |
15:38
🔗
|
JAA |
Example usage: |
15:38
🔗
|
JAA |
> snscrape -n 1 --format '{content!r}' twitter-user textfiles |
15:38
🔗
|
|
ShellyRol has joined #archiveteam-bs |
15:38
🔗
|
JAA |
'@0x29 @polm23 I have a flux reader if they want to send it along.' |
15:38
🔗
|
JAA |
The --format value is a Python formatting string, and the variables are what I linked above. |
15:39
🔗
|
JAA |
(Beware of linebreaks etc.) |
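A minimal sketch of the linebreak caveat, assuming the `--format` value is applied much like Python's `str.format` with the tweet's fields as keyword arguments. The `tweet` dict and its values are hypothetical; only the field names `url` and `content` come from the conversation.

```python
# Hypothetical tweet record; the real field set is whatever snscrape exposes.
tweet = {
    "url": "https://twitter.com/example/status/1",
    "content": "first line\nsecond line",
}

# A template without any conversion inserts the text verbatim.
template = "{url} {content}"
line = template.format(**tweet)

print(line)
# The raw linebreak inside the tweet splits one record across several
# output lines, which breaks any "one tweet per line" assumption.
print(len(line.splitlines()))
```

With a template like this, a multi-line tweet no longer fits on a single output line, so line-oriented tools would miscount records.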
15:42
🔗
|
tempnicl |
@JAA, I really don't grok snscrape's help output. How can I get the actual tweets by a user? I used `snscrape twitter-user someuser > urls`. But, what now? Can you help? |
15:42
🔗
|
tempnicl |
I.e. each tweet's content |
15:42
🔗
|
JAA |
Example above. |
15:43
🔗
|
JAA |
I don't know what information you want exactly and in what format. |
15:43
🔗
|
tempnicl |
Ah, overlooked that. Thx |
15:44
🔗
|
JAA |
If you need it even more customisable (e.g. JSON output), you'd have to use snscrape as a package through Python directly. That's completely undocumented though. |
15:48
🔗
|
tempnicl |
@JAA, best output would be .csv. How can I add the url to `--format '{content!r}'` ? I tried with ` --format '{content,url!r}'` which seems to be wrong... |
15:49
🔗
|
tempnicl |
What does the `!r` do? |
15:49
🔗
|
JAA |
'{url},{content!r}' or similar |
15:50
🔗
|
JAA |
!r is the same as repr(). It transforms the value into something that could be used again directly as a string in Python, in this case. |
15:50
🔗
|
JAA |
So linebreaks get translated into \n for example. |
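The `!r` behaviour described above can be checked in plain Python, independent of snscrape. This is only an illustrative sketch; the sample text is made up.

```python
import ast

content = "line one\nline two"  # made-up tweet text containing a linebreak

via_conversion = "{content!r}".format(content=content)
via_repr = repr(content)

# Both spellings produce the same quoted, escaped string literal.
print(via_conversion == via_repr)

# The real linebreak is gone; a backslash followed by "n" takes its place.
print("\n" in via_conversion, "\\n" in via_conversion)

# The result is a valid Python literal that evaluates back to the original.
print(ast.literal_eval(via_conversion) == content)
```

Because the output is a Python string literal, it always stays on one line, at the cost of surrounding quotes and escape sequences in the text.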
15:51
🔗
|
tempnicl |
Ah, thx! |
15:51
🔗
|
JAA |
I'm not sure you can produce completely valid CSV like this, by the way. If a tweet's text contains a comma, for example, that might be problematic. |
15:52
🔗
|
tempnicl |
Ah true |
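One way around the comma problem, sketched here under the assumption that the URL and text of each tweet are already available as Python values (for example after post-processing snscrape's output), is to let the standard `csv` module do the quoting. The rows below are invented sample data.

```python
import csv
import io

# Invented sample rows: (url, content). Real data would come from elsewhere.
rows = [
    ("https://twitter.com/example/status/1", "hello, world"),
    ("https://twitter.com/example/status/2", 'she said "hi"\nand left'),
]

buffer = io.StringIO()
writer = csv.writer(buffer)  # default dialect quotes fields when needed
writer.writerow(("url", "content"))
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers commas, quotes and linebreaks unchanged.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed[1][1])
```

The writer wraps any field containing a comma, quote, or linebreak in double quotes and doubles embedded quotes, so the file stays parseable without hand-written escaping.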
16:21
🔗
|
|
lennier2 has joined #archiveteam-bs |
16:27
🔗
|
|
lennier1 has quit IRC (Read error: Operation timed out) |
16:27
🔗
|
|
lennier2 is now known as lennier1 |
16:37
🔗
|
|
Pixi` has quit IRC (Quit: Leaving) |
16:37
🔗
|
|
Pixi has joined #archiveteam-bs |
17:40
🔗
|
|
Maylay has quit IRC (Read error: Operation timed out) |
18:04
🔗
|
|
Maylay has joined #archiveteam-bs |
18:04
🔗
|
|
tempnicl has quit IRC (Quit: Page closed) |
18:11
🔗
|
|
HP_Archiv has quit IRC (Ping timeout: 276 seconds) |
18:12
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
18:21
🔗
|
|
Ajay1 has joined #archiveteam-bs |
19:13
🔗
|
|
Mateon1 has joined #archiveteam-bs |
19:24
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 622 seconds) |
19:27
🔗
|
|
Mateon1 has joined #archiveteam-bs |
19:46
🔗
|
|
lennier2 has joined #archiveteam-bs |
19:48
🔗
|
|
lennier1 has quit IRC (Read error: Operation timed out) |
19:48
🔗
|
|
lennier2 is now known as lennier1 |
20:26
🔗
|
|
godane has quit IRC (Read error: Operation timed out) |
20:50
🔗
|
|
systwi_ has joined #archiveteam-bs |
20:57
🔗
|
|
systwi has quit IRC (Read error: Operation timed out) |
21:59
🔗
|
|
robogoat has quit IRC (Read error: Operation timed out) |
22:35
🔗
|
|
Ryz8 is now known as Ryz |
23:05
🔗
|
|
ndiddy_ is now known as ndiddy |
23:53
🔗
|
|
BlueMax has joined #archiveteam-bs |
23:59
🔗
|
|
HP_Archiv has quit IRC (Quit: Leaving) |