posts

`number_of_words(community)`

Adds the number of words in each post as int to community.posts.
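
As an illustration only, a whitespace-based word count could be sketched with pandas as follows. The `posts` frame is a hypothetical stand-in for community.posts, and both the `text` column name and the whitespace tokenization are assumptions, not the library's documented behavior.

```python
import pandas as pd

# Hypothetical stand-in for community.posts; the "text" column name is an assumption.
posts = pd.DataFrame({"text": ["Hello there, world", "One", ""]})

# Split each post on whitespace and count the resulting tokens.
# Assumption: words are whitespace-separated; the actual preprocessor may tokenize differently.
posts["number_of_words"] = posts["text"].str.split().str.len().astype(int)

print(posts)
```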

`post_position_in_thread(community)`

Adds each post's position in thread (as int, starting with 1) to community.posts.
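
One way to derive such a 1-based position is a grouped cumulative count, sketched below. The `thread_id` and `date` column names are assumptions made for this example, as is ordering posts within a thread by date.

```python
import pandas as pd

# Hypothetical stand-in for community.posts; column names are assumptions.
posts = pd.DataFrame({
    "thread_id": ["a", "b", "a", "a", "b"],
    "date": pd.to_datetime([
        "2021-01-03", "2021-01-01", "2021-01-01", "2021-01-02", "2021-01-05",
    ]),
})

# Order posts chronologically within each thread, then number them from 1.
posts = posts.sort_values(["thread_id", "date"])
posts["post_position_in_thread"] = posts.groupby("thread_id").cumcount() + 1

# Restore the original row order.
posts = posts.sort_index()

print(posts)
```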

`preprocessed_text(community, n_topics=10)`

This preprocessor supplies cleaned text, text statistics (computed with Textacy), and sentiment statistics (computed with TextBlob). It adds the following columns to community.posts:

clean
all_words
words_no_stop
n_words_no_stop
frac_uppercase
frac_punctuation_marks
avg_syllables_per_word
sentiment_polarity
sentiment_subjectivity
n_words
n_chars
n_long_words
n_unique_words
n_syllables
n_syllables_per_word
entropy
ttr
segmented_ttr
hdd
automated_index
flesch_reading_ease
smog_index
coleman_liau_index
flesch_kincaid_grade_level
gunning_fog_index

Parameters:

Name	Type	Description	Default
`community`			required
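
To give a feel for what a few of the simpler columns measure, here is a rough standard-library sketch. It does not call Textacy or TextBlob, and the definitions used (whitespace tokens, case-insensitive uniqueness, character-level fractions) are assumptions that may differ from what the preprocessor actually computes. The helper name `simple_text_stats` is hypothetical.

```python
import string


def simple_text_stats(text):
    """Approximate a handful of per-post statistics (assumed definitions)."""
    words = text.split()
    n_words = len(words)
    n_chars = len(text)

    # Distinct words, ignoring case and surrounding punctuation.
    unique_words = {w.strip(string.punctuation).lower() for w in words} - {""}
    n_unique_words = len(unique_words)

    n_upper = sum(1 for ch in text if ch.isupper())
    n_punct = sum(1 for ch in text if ch in string.punctuation)

    return {
        "n_words": n_words,
        "n_chars": n_chars,
        "n_unique_words": n_unique_words,
        # Type-token ratio: unique words divided by total words.
        "ttr": n_unique_words / n_words if n_words else 0.0,
        "frac_uppercase": n_upper / n_chars if n_chars else 0.0,
        "frac_punctuation_marks": n_punct / n_chars if n_chars else 0.0,
    }


stats = simple_text_stats("The cat saw THE dog.")
print(stats)
```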

`rounded_date(community, round_dates_to='7D')`

Rounds the post dates to the specified frequency. If round_dates_to is None, this preprocessor does nothing.

Parameters:

Name	Description	Default
`community`		required
`round_dates_to`	Frequency to which the post dates are rounded, given as a pandas frequency string; see <https://pandas.pydata.org/docs/user_guide/timeseries.html>	`'7D'`
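
The effect of a frequency string can be sketched with pandas' `Series.dt.round`. The `date` and `rounded_date` column names and the helper `add_rounded_date` are hypothetical, and it is an assumption that the preprocessor rounds to the nearest multiple rather than flooring or ceiling.

```python
import pandas as pd

# Hypothetical stand-in for community.posts; the "date" column name is an assumption.
posts = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-01-01 10:00", "2021-01-05 23:30", "2021-01-09 08:15",
    ]),
})


def add_rounded_date(posts, round_dates_to="7D"):
    """Return a copy with a rounded date column; do nothing if the frequency is None."""
    if round_dates_to is None:
        return posts
    out = posts.copy()
    # Round each timestamp to the nearest multiple of the given frequency.
    out["rounded_date"] = out["date"].dt.round(round_dates_to)
    return out


weekly = add_rounded_date(posts)          # 7-day buckets
daily = add_rounded_date(posts, "1D")     # nearest calendar day
print(weekly)
print(daily)
```

Coarser frequencies such as `'7D'` collapse nearby posts onto the same timestamp, which is convenient for grouping posts into weekly bins.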