Golden Hour of Publishing Comments

Hacker News is a site similar to Reddit where user-submitted stories (known as “posts”) are voted on and commented on. In the tech and startup worlds, Hacker News is immensely popular, and pieces that reach the top of the site’s listings can get hundreds of thousands of views.

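This analysis focuses on two kinds of posts: ‘Ask HN’ posts, where users ask the community a question, and ‘Show HN’ posts, where users share a project or product. We’ll compare these two types of posts to determine the following: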

1. Do ‘Ask HN’ or ‘Show HN’ posts receive more comments on average?

2. Do ‘Ask HN’ posts created at a certain hour of the day receive more comments?

import pandas as pd

# read input file
data = pd.read_csv('HN_posts_year_to_Sep_26_2016.csv')

# keep posts that received at least one comment, then randomly sample 20000 rows
sample = data[data['num_comments']>0].sample(20000)
sample

	id	title	url	num_points	num_comments	author	created_at
76795	11902973	Why Team Happiness Can Be the Wrong Thing to A...	https://vimeo.com/143894732	91	44	michaelfeathers	6/14/2016 16:10
9300	12494791	There's only one business worth starting	https://medium.com/hi-my-name-is-jon/theres-on...	3	1	hccampos	9/14/2016 7:25
79128	11878535	Video streaming for all	http://getlawd.com/	3	7	stoufa88	6/10/2016 18:10
275008	10307145	Google and Microsoft make patent peace	http://www.zdnet.com/article/google-and-micros...	89	56	tanglesome	9/30/2015 21:02
132643	11423114	What is your story of finding your cofounder, ...	NaN	2	5	PeterTMayer	4/4/2016 16:24
...	...	...	...	...	...	...	...
174186	11082832	The Godfather of Digital Maps	http://www.forbes.com/sites/miguelhelft/2016/0...	30	8	dll	2/11/2016 20:13
117737	11546882	MongoDB Twitter Spam Campaign	http://imgur.com/a/iY5C7	1	1	snurk	4/22/2016 3:29
245098	10523794	The unmanned aerial drones of WW2	https://en.wikipedia.org/wiki/Operation_Outward	1	2	fivedogit	11/7/2015 4:58
230676	10633042	How Cereal Is Made	http://luckypeach.com/how-cereal-is-made/	23	13	zdw	11/26/2015 14:14
60515	12045437	Italian banking is the next shoe to drop	http://marginalrevolution.com/marginalrevoluti...	40	6	jseliger	7/6/2016 19:56

20000 rows × 7 columns
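Because the sample is drawn at random, the numbers below change from run to run. A minimal sketch of one way to make the draw repeatable is to pass a fixed `random_state` to `sample`. The tiny `toy` frame and the seed value 42 are assumptions chosen for illustration; the original analysis does not set a seed.

```python
import pandas as pd

# Made-up comment counts for illustration only
toy = pd.DataFrame({'num_comments': [0, 5, 2, 0, 9, 1, 4]})
commented = toy[toy['num_comments'] > 0]

# The seed 42 is arbitrary; any fixed integer gives a repeatable draw
first = commented.sample(3, random_state=42)
second = commented.sample(3, random_state=42)

# Both draws pick the same rows in the same order
print(first.equals(second))
```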

# keep only rows that contain 'ask hn'
sampleAsk = sample[['ask hn' in i.lower() for i in sample['title']]]
sampleAsk

	id	title	url	num_points	num_comments	author	created_at
188898	10969705	Ask HN: Are GMail's new features making spam e...	NaN	3	2	aleem	1/25/2016 20:27
66651	11992582	Ask HN: Why is academic language so redundant?	NaN	2	4	50CNT	6/28/2016 10:14
184006	11006270	Ask HN: What are some examples of great B2B la...	NaN	5	3	bossx	1/31/2016 13:08
242932	10539626	Ask HN: Accused of email hacking	NaN	1	1	dfraser992	11/10/2015 14:56
287609	10215239	Ask HN: Is it common knowledge that Yahoo bene...	NaN	1	1	superplussed	9/14/2015 14:34
...	...	...	...	...	...	...	...
40330	12219043	Ask HN: Do you still play with VR actively?	NaN	105	108	billconan	8/3/2016 16:14
271208	10335717	Ask HN: Hackintosh vs Mac Mini	NaN	2	6	siquick	10/5/2015 23:24
228610	10650345	Ask HN: Learning path for React and Flux	NaN	3	1	rufus42	11/30/2015 16:56
120244	11526536	Ask HN: Is this a Python bug?	NaN	3	4	wslh	4/19/2016 12:47
149820	11276960	Ask HN: Am I getting old?	NaN	11	5	greenspot	3/13/2016 9:24

1693 rows × 7 columns

# keep only rows that contain 'show hn'
sampleShow = sample[['show hn' in i.lower() for i in sample['title']]]
sampleShow

	id	title	url	num_points	num_comments	author	created_at
288163	10211892	Show HN: Best places to work remotely by actua...	https://workfrom.co	17	9	darrenbuckner	9/13/2015 16:33
170868	11109744	Show HN: The fastest way to discover fashion o...	http://frowse.fashion/home/3	1	2	xShirase	2/16/2016 13:50
139946	11360582	SHOW HN: Left-Pad could be the next FizzBuzz s...	https://www.educative.io/collection/page/10370...	4	1	fahimulhaq	3/25/2016 15:27
261028	10411041	Show HN: Microphone Self-Announcing .NET Serv...	https://github.com/rogeralsing/Microphone	54	6	RogerAlsing	10/19/2015 4:02
247565	10506163	Show HN: Vivilio Discover books peers and inf...	https://www.vivilio.com	9	4	soumitrasg	11/4/2015 13:05
...	...	...	...	...	...	...	...
143733	11327679	Show HN: Micro a microservice toolkit	https://blog.micro.mu/2016/03/20/micro.html	101	32	chuhnk	3/21/2016 12:52
179074	11043959	Show HN: Swift and VR Google Cardboard Ported...	https://github.com/nzff/cardboard-swift	64	19	nzff	2/5/2016 19:31
42857	12197474	Show HN: Trading platform for Pokemon Go	https://medium.com/@deadlocked_d/pok%C3%A9mon-...	1	1	liongate2	7/31/2016 16:02
126956	11471060	Show HN: Musicsaur Multi-room audio synchroni...	http://www.musicsaur.com/	3	2	qrv3w	4/11/2016 12:36
77493	11896000	Show HN: Golf Tradr fantasy golf with a stock...	https://golftradr.com	3	1	rob_zim	6/13/2016 18:21

1285 rows × 7 columns
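The two title filters above can also be written with pandas' vectorized string methods. The sketch below uses a small made-up DataFrame (the titles are illustrative, not taken from the dataset). It also shows that a substring match and a prefix match can select different rows, which matters when a title only mentions "Ask HN" somewhere in the middle.

```python
import pandas as pd

# Hypothetical titles for illustration only; not taken from the dataset
toy = pd.DataFrame({
    'title': [
        'Ask HN: How do you learn pandas?',
        'Show HN: My weekend project',
        'SHOW HN: Shouting works too',
        'Why I stopped reading Ask HN threads',
        'A plain story',
    ],
    'num_comments': [4, 2, 1, 7, 3],
})

# Case-insensitive substring match, equivalent to `'ask hn' in title.lower()`
ask_contains = toy[toy['title'].str.contains('ask hn', case=False, regex=False)]

# Stricter variant: keep only titles that start with the tag
ask_prefix = toy[toy['title'].str.lower().str.startswith('ask hn')]
show_prefix = toy[toy['title'].str.lower().str.startswith('show hn')]

print(len(ask_contains), len(ask_prefix), len(show_prefix))
```

Whether the substring or the prefix rule is the right one depends on how strictly a post should count as ‘Ask HN’ or ‘Show HN’. The analysis here uses the substring rule.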

1. Question: Do ‘Show HN’ or ‘Ask HN’ posts receive more comments on average?

# divide by the actual number of rows (1693) instead of a hard-coded count
averageComAsk = sampleAsk['num_comments'].mean()
print(round(averageComAsk, 2))

12.78

# divide by the actual number of rows (1285) instead of a hard-coded count
averageComShow = sampleShow['num_comments'].mean()
print(round(averageComShow, 2))

8.48

1. Answer: On average, ‘Ask HN’ posts receive 12.78 comments, while ‘Show HN’ posts receive 8.48 comments. In this sample, ‘Ask HN’ posts therefore draw more discussion than ‘Show HN’ posts.
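An average is a sum divided by the number of rows, and letting pandas count the rows avoids keeping a hand-typed divisor in step with the data. The sketch below shows this on made-up numbers. The `toy` frame and its `kind` column are assumptions for illustration; the analysis itself keeps the two groups in separate DataFrames.

```python
import pandas as pd

# Made-up comment counts; `kind` is a hypothetical label column
toy = pd.DataFrame({
    'kind': ['ask', 'ask', 'ask', 'show', 'show'],
    'num_comments': [10, 20, 6, 3, 5],
})

# mean() divides by the actual number of rows in each group
avg_by_kind = toy.groupby('kind')['num_comments'].mean()
print(avg_by_kind)

# Same result by hand, using len() rather than a typed-in constant
ask = toy[toy['kind'] == 'ask']
manual_ask_avg = ask['num_comments'].sum() / len(ask)
```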

2. Question: Focusing on ‘Ask HN’ posts (since they receive more comments), how many posts are created in each hour of the day, and how many comments do they receive in total?

# parse the timestamps so the hour of each post can be extracted
created = pd.to_datetime(sampleAsk['created_at'], format='%m/%d/%Y %H:%M')

# make a DataFrame of hour (of the post) and (its) num_comments
hourCom = pd.DataFrame({
    'hour': created.dt.strftime('%H').to_numpy(),  # two-digit hour, e.g. '09'
    'num_comments': sampleAsk['num_comments'].to_numpy(),
})
hourCom

	hour	num_comments
0	20	2
1	10	4
2	13	3
3	14	1
4	14	1
...	...	...
1688	16	108
1689	23	6
1690	16	1
1691	12	4
1692	09	5

1693 rows × 2 columns

# total of comments per hour
sumCom = hourCom.groupby('hour').sum()
sumCom.sort_values(['num_comments'], ascending=False).head()

	num_comments
hour
15	4091
17	1286
20	1273
13	1149
18	1108

2. Answer: By total comments, the top five hours are 15, 17, 20, 13 and 18. Totals can mislead, though, because hours with more posts naturally collect more comments. To verify the ranking, we need the average number of comments per post for each hour.
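To see why a ranking by totals needs checking, consider a small worked example in plain Python. The `(hour, num_comments)` pairs are made up for illustration: one hour has many lightly commented posts, another has a single busy post.

```python
from collections import defaultdict

# Made-up (hour, num_comments) pairs for illustration only
posts = [('15', 2), ('15', 3), ('15', 1), ('15', 2), ('05', 6)]

totals = defaultdict(int)
counts = defaultdict(int)
for hour, n in posts:
    totals[hour] += n   # total comments for posts created in this hour
    counts[hour] += 1   # number of posts created in this hour

# Average comments per post for each hour
averages = {h: totals[h] / counts[h] for h in totals}

busiest_by_total = max(totals, key=totals.get)
busiest_by_average = max(averages, key=averages.get)
print(busiest_by_total, busiest_by_average)
```

In this toy data, hour 15 wins on total comments simply because it has more posts, while hour 05 wins on comments per post.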

# number of posts per hour
countCom = hourCom.groupby('hour').count()
countCom

	num_comments
hour
00	57
01	52
02	48
03	65
04	47
05	39
06	47
07	51
08	47
09	33
10	64
11	63
12	67
13	69
14	91
15	115
16	104
17	93
18	123
19	87
20	100
21	103
22	69
23	59

# average comments per post for each hour; the division aligns on the hour index,
# which avoids looking up string labels ('00'..'23') by integer position
averageCom = pd.concat([sumCom, countCom, sumCom / countCom], axis=1)
averageCom.columns = ['sum_comments', 'count_comments', 'average_comments']
averageCom

	sum_comments	count_comments	average_comments
hour
00	493	57	8.649123
01	485	52	9.326923
02	333	48	6.937500
03	523	65	8.046154
04	402	47	8.553191
05	916	39	23.487179
06	644	47	13.702128
07	693	51	13.588235
08	692	47	14.723404
09	146	33	4.424242
10	763	64	11.921875
11	523	63	8.301587
12	806	67	12.029851
13	1149	69	16.652174
14	972	91	10.681319
15	4091	115	35.573913
16	1049	104	10.086538
17	1286	93	13.827957
18	1108	123	9.008130
19	558	87	6.413793
20	1273	100	12.730000
21	1088	103	10.563107
22	1021	69	14.797101
23	626	59	10.610169
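The three columns above can also be produced in a single step with a named aggregation. This is a hedged sketch on made-up rows shaped like `hourCom`, not the code used to produce the table. The output column names are reused from the table for readability.

```python
import pandas as pd

# Made-up rows shaped like hourCom; values are illustrative
toy = pd.DataFrame({
    'hour': ['09', '15', '15', '09', '20'],
    'num_comments': [4, 30, 10, 2, 7],
})

# One pass over the groups: total, number of posts, and mean per hour
summary = toy.groupby('hour')['num_comments'].agg(
    sum_comments='sum',
    count_comments='count',
    average_comments='mean',
)
print(summary.sort_values('average_comments', ascending=False))
```

Computing all three statistics from the same `groupby` keeps them aligned on the hour index without a separate loop or `concat`.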

averageCom.sort_values(by=['average_comments'], ascending=False).head()

	sum_comments	count_comments	average_comments
hour
15	4091	115	35.573913
05	916	39	23.487179
13	1149	69	16.652174
22	1021	69	14.797101
08	692	47	14.723404

2. Answer: Ranked by average comments per post, the top five hours are 15, 05, 13, 22 and 08, not 15, 17, 20, 13 and 18. Hour 15 nevertheless remains the top golden hour on both measures.

Based on both calculations (total comments per hour and average comments per post per hour), we conclude that ‘Ask HN’ posts created during hour 15 generally receive the most comments from site users.

@ rushi | Sunday, Jun 19, 2022 | 6 minutes read | Update at Sunday, Jun 19, 2022