Filter for top 10 highest values of group in dataset (in R)
Context: I am trying to find the top 10 highest values of count in my data frame conditional on them falling within the years 1970-1979. My data frame looks as below:
id lemma year count
1 word1 1970 737
2 word2 1971 767
3 word3 1972 988
So far I have been filtering one decade at a time:
df_n_maxcount_1970s <- df_n %>% filter(year < 1980) %>% slice_max(count, n = 40)
df_n_maxcount_1980s <- df_n %>% filter(year %in% 1980:1989) %>% slice_max(count, n = 40)
This has worked reasonably well, but it involves manual work, and for the 1990s I had to increase n to 200 because there were many duplicates (i.e., the same word appeared many times, so searching for the top 10 with n = 10 did not give me 10 unique words).
Question: Can I automate the code so that I end up with one data frame arranged as below? (Of course, word1 in 1970 might not be the same word as word1 in 1980, and there would be 10 rows for each decade: the top 10 words arranged by count.) Or, failing that, 5 separate data frames with the top 10 words by count per decade?
decade lemma count
1970 word1 100
1970 word2 99
1970 word3 98
1980 word1 100
1990 word1 100
2000 word1 100
2010 word1 100
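For reference, here is a minimal sketch of one way the desired layout might be produced with dplyr. It is not a confirmed solution. It assumes the decade can be derived from year by integer division, and that duplicate words within a decade should be collapsed to their single highest count (summing them would be an equally plausible choice). The toy values, the df_top name, and the top_n_per_decade variable are made up for illustration.

```r
library(dplyr)

# Toy data standing in for df_n (values are illustrative, not real)
df_n <- data.frame(
  id    = 1:8,
  lemma = c("word1", "word2", "word1", "word3",
            "word1", "word2", "word2", "word3"),
  year  = c(1970, 1971, 1972, 1979, 1980, 1985, 1989, 1991),
  count = c(737, 767, 988, 120, 400, 350, 500, 90)
)

top_n_per_decade <- 2  # hypothetical setting; use 10 on the real data

df_top <- df_n %>%
  # 1972 -> 1970, 1989 -> 1980, and so on
  mutate(decade = (year %/% 10) * 10) %>%
  # Assumption: keep each word once per decade, at its highest count
  group_by(decade, lemma) %>%
  summarise(count = max(count), .groups = "drop") %>%
  # Top n unique words within each decade
  group_by(decade) %>%
  slice_max(count, n = top_n_per_decade, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(decade, desc(count))

print(df_top)
```

If separate data frames per decade are preferred, split(df_top, df_top$decade) returns a list with one data frame per decade.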
Topic data-wrangling r
Category Data Science