| Time | Nickname | Message |
| --- | --- | --- |
| 01:23 | Coderjoe | https://sphotos-b.xx.fbcdn.net/hphotos-snc6/10239_532076040155187_2068485922_n.jpg |
| 03:52 | underscor | chronomex: lolumad? |
| 03:52 | underscor | ;D |
| 03:53 | chronomex | wut |
| 03:58 | SketchCow | ME GUSTA |
| 03:58 | SketchCow | So, we've now set up to ingest almost all archiveteam WARC-based items into the wayback machine. |
| 03:59 | SketchCow | Mid-month the wayback machine will ideally begin showing these more recent sites. |
| 04:01 | godane | now thats cool |
| 04:01 | GLaDOS | That's quite hot. |
| 04:03 | godane | i'm tempted to make a livecd of wayback machine |
| 04:04 | godane | just download warc.gz to a folder and it starts working |
| 04:29 | Nintendud | Nice. |
| 04:54 | godane | luckily i'm getting images with my theregister.co.uk dumps |
| 04:55 | godane | this dump is meant to be more for data mining |
| 04:56 | godane | its mostly just going to be the articles |
| 04:56 | godane | by year |
| 05:08 | godane | looks like i have 4 theblazetv glenn beck shows that are 4 hours i have to grab |
| 05:08 | godane | :-( |
| 05:08 | godane | once my 2005 run of theregister is done i will go to windows and start grabbing that |
| 05:09 | godane | good news is it looks like the rnc specials on theblazetv is up now |
| 10:39 | SmileyG | lol this is fun |
| 10:39 | SmileyG | rocking out to utterly random music selections on the archive |
| 11:36 | SmileyG | Actually alard its starting to make sense now I understand how its called lol |
| 11:36 | SmileyG | So it pulls the "normal" profile page |
| 11:36 | SmileyG | it then grabs various parts and "table.inserts" them into.... well a table |
| 11:36 | SmileyG | which will give you the list of download items? |
| 11:50 | Soojin | http://www.youtube.com/watch?v=cKcWyQ5aJlY&feature=g-user-u |
| 11:54 | C-Keen | ~. |
| 14:01 | alard | SmileyG: The table (in lua everything is a 'table', but this table is really just a list) is the list of urls that gets added to Wget's queue. |
| 14:01 | alard | So it downloads the profile page, looks inside, finds pagination links, queues the page urls, the album urls etc. |
| 14:05 | SmileyG | nice |
| 14:05 | joepie91 | alard: you may have an idea on this... my script fetched about 1 million usernames so far |
| 14:05 | joepie91 | and that's it |
| 14:05 | joepie91 | however |
| 14:05 | joepie91 | https://www.google.nl/search?sugexp=chrome,mod=8&sourceid=chrome&ie=UTF-8&q=site%3Acommunity.webshots.com+inurl%3Auser |
| 14:05 | joepie91 | approx 11.400.000 results |
| 14:06 | alard | Should we start a #webshots channel? |
| 14:06 | joepie91 | since there's very little duplicates in the gathered list of users, I suspect I only have small (popular) subset |
| 14:06 | alard | Where did you look for your current list? |
| 14:06 | joepie91 | yes, probably |
| 14:06 | joepie91 | it starts at the community category index, then gets the top users for each category |
| 14:06 | joepie91 | but the top users list is limited to 100 pages |
| 14:06 | joepie91 | per category |
| 14:06 | joepie91 | that's 100 * 100 users per category |
| 14:07 | joepie91 | (max) |
| 15:41 | SmileyG | yeah errr zee fuxed. |
| 20:13 | chronomex | alard is a hero in a city of heroes |
| 20:59 | undersco2 | holy balls |
| 20:59 | undersco2 | http://fastdesign7.com/ |
| 20:59 | chronomex | http://jalopnik.com/5875229/what-happens-when-you-put-four-donuts-on-a-c63-amg |
| 20:59 | SmileyG | dear lord. |
| 22:09 | godane | i can grab some old webuser magazines now |
| 22:10 | godane | it got on usenet in the last week |
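
The flow alard describes at 14:01 (download the profile page, look inside, find pagination and album links, add them to a list that feeds the download queue) can be roughly sketched as below. This is only an illustration in Python, not the actual Lua script used with Wget: the names `LinkCollector` and `urls_to_queue` are made up, and the `?page=` and `/album/` URL patterns are placeholder assumptions rather than the real Webshots URL layout.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def urls_to_queue(page_url, html, seen):
    """Return the new pagination and album URLs found in one downloaded page.

    The "?page=" and "/album/" patterns are assumptions for illustration only.
    `seen` is a set of URLs already queued, so nothing is queued twice.
    """
    parser = LinkCollector()
    parser.feed(html)
    queue = []  # plays the role of the Lua table: really just a list of URLs
    for href in parser.hrefs:
        url = urljoin(page_url, href)  # resolve relative links against the page
        if ("?page=" in url or "/album/" in url) and url not in seen:
            seen.add(url)
            queue.append(url)
    return queue
```

Called once per downloaded page with a shared `seen` set, the returned list is what would be handed to the downloader as additional URLs to fetch.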