
Nice trove to pore through when I find the time.

I like to use Twitter to analyze HN data. That mostly limits me to links, but links are what I'm mostly after anyway.

https://twitter.com/newsycombinator, https://twitter.com/HackerNews, and a few other accounts. Try to avoid Bitly-wrapped links.

Use something like Greptweet to harvest the tweets and parse out any noise.
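
As a rough sketch of the "parse out any noise" step, assuming the harvested tweets are already available as plain text strings (the function name, the regex, and the shortener list are my own guesses, not anything Greptweet provides), something like this pulls out the links, skips Bitly-style wrappers, and drops duplicates:

```python
import re
from urllib.parse import urlparse

# Loose URL matcher; an assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")

# Hypothetical list of shortener hosts to skip; extend as needed.
SHORTENERS = {"bit.ly", "bitly.com", "j.mp"}


def extract_links(tweets):
    """Return unique, non-shortened links in first-seen order."""
    seen = set()
    links = []
    for text in tweets:
        for url in URL_RE.findall(text):
            # Trim punctuation that often trails a link in a sentence.
            url = url.rstrip(".,;:!?")
            host = (urlparse(url).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if host in SHORTENERS or url in seen:
                continue
            seen.add(url)
            links.append(url)
    return links
```

Skipping shortened links outright loses whatever they point to; resolving them first would keep more data but costs a network request per link.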



That will not get you accurate results for HN data analysis, since a) those accounts only tweet important links, so the analysis will be biased, and b) you can only get the 3,200 most recent tweets from an account (a Twitter API limitation).

You have to look at both the good and the bad.


I hear you. Raw unfiltered links always have hidden gems.

One thing though: Greptweet has an archive somewhere with a huge trove of tweets that users of the service have searched for, which were logged and kept. (Some accounts even go over the 3,200 limit.) It's a massive tarball, so set aside time to download it and parse out the boring/noisy links.

A lot of HN links are tech-press posts that consist of hearsay and merely proxy the thoughts of others. The recent shift on HN toward more academia-style posts is refreshing.



