
basic

Basic metrics based on counts, dates, and similar properties of posts and contributors.

By level of observation / concept:

topics

community

agg_number_of_posts_per_interval(community, interval)

Number of posts per interval.

Total number of posts in the community for each interval; the interval length is set by the interval parameter.

Parameters:

Name Type Description Default
community pici.Community required
interval str The interval over which to aggregate. See pandas.Timedelta (https://pandas.pydata.org/docs/user_guide/timedeltas.html) required

Returns:

Name Type Description
results dict of str:int
  • number of posts per <interval>
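A minimal sketch of the per-interval count, assuming the community's posts are available as a pandas DataFrame with `user` and `date` columns (a hypothetical layout, not necessarily how pici.Community stores them). Which aggregates the function reports is not stated above; mean, min, and max are used here only for illustration, and `summarize_posts_per_interval` is a made-up name.

```python
import pandas as pd

# Hypothetical stand-in for a community's posts; column names are assumptions.
posts = pd.DataFrame({
    "user": ["alice", "bob", "alice", "carol"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-09", "2021-01-20"]
    ),
})


def summarize_posts_per_interval(posts, interval):
    """Count posts in consecutive bins of length `interval`, then aggregate."""
    pd.Timedelta(interval)  # raises ValueError for strings pandas cannot parse
    by_date = posts.set_index("date").sort_index()
    counts = by_date["user"].resample(interval).size()
    summary = {
        f"mean number of posts per {interval}": float(counts.mean()),
        f"min number of posts per {interval}": int(counts.min()),
        f"max number of posts per {interval}": int(counts.max()),
    }
    return counts, summary


counts, summary = summarize_posts_per_interval(posts, "7D")
print(counts)
print(summary)
```

Bins start at midnight of the first post's day, which is pandas' default resampling origin; intervals without posts are kept with a count of zero.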

agg_posts_per_topic(community)

Min, max, and average number of posts authored per topic.

Parameters:

Name Type Description Default
community required

Returns:

Name Type Description
results dict of str:int
  • <agg> posts per topic
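The aggregation described above can be sketched with a pandas group-by, assuming a hypothetical posts table with one row per post and a `topic` column. The key names mirror the `<agg> posts per topic` pattern, but the exact keys returned by the real function are not confirmed here.

```python
import pandas as pd

# Hypothetical posts table: one row per post; "topic" is an assumed column name.
posts = pd.DataFrame({"topic": ["a", "a", "a", "b", "c", "c"]})

# Number of posts in each topic.
posts_per_topic = posts.groupby("topic").size()

# Min, max, and average of the per-topic counts.
summary = {
    "min posts per topic": int(posts_per_topic.min()),
    "max posts per topic": int(posts_per_topic.max()),
    "mean posts per topic": float(posts_per_topic.mean()),
}
print(summary)
```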

contributors_per_interval(community, interval)

Number of users who have authored at least one post in each time interval.

TODO
  • document
  • add to TOC

Parameters:

Name Type Description Default
community required
interval required
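Since this function is still undocumented, the following is only a sketch of what the one-line description suggests, assuming a hypothetical posts DataFrame with `user` and `date` columns and an interval given as a pandas.Timedelta-style string. The helper name `distinct_contributors_per_interval` is invented for the example.

```python
import pandas as pd

# Hypothetical posts table; column names are assumptions.
posts = pd.DataFrame({
    "user": ["alice", "bob", "alice", "bob"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-09"]
    ),
})


def distinct_contributors_per_interval(posts, interval):
    """Count distinct post authors in consecutive bins of length `interval`."""
    by_date = posts.set_index("date").sort_index()
    return by_date["user"].resample(interval).nunique()


contributors = distinct_contributors_per_interval(posts, "7D")
print(contributors)
```

A user who posts several times within one interval is counted once for that interval.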

lorenz(community)

Distribution of posts across users (in analogy to the Lorenz curve). Returns (x, y), where x is the bottom x% of users (the least-contributing ones) and y is the proportion of all posts made by them.

Parameters:

Name Type Description Default
community required

Returns:

Name Type Description
report
  • % contributors
  • % posts
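To make the (x, y) pairs concrete, here is a minimal pure-Python sketch of a Lorenz-style curve over post counts. It assumes the input is simply a list with one author name per post; `lorenz_points` is a hypothetical helper, and the real function may differ in scaling (fractions versus percentages) or output format.

```python
from collections import Counter


def lorenz_points(post_authors):
    """Return (x, y): cumulative share of users vs. cumulative share of posts.

    Users are ordered from least to most contributing, so y[i] is the share
    of all posts written by the bottom x[i] share of users.
    """
    counts = sorted(Counter(post_authors).values())
    total_posts = sum(counts)
    n_users = len(counts)
    x, y = [0.0], [0.0]  # the curve starts at the origin
    running = 0
    for i, count in enumerate(counts, start=1):
        running += count
        x.append(i / n_users)
        y.append(running / total_posts)
    return x, y


# Four users with 1, 1, 2, and 4 posts respectively.
authors = ["a", "b", "c", "c", "d", "d", "d", "d"]
x, y = lorenz_points(authors)
print(list(zip(x, y)))
```

In this example the bottom 75% of users account for 50% of the posts; a perfectly even community would give y equal to x at every point.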

number_of_contributors_per_topic(community)

Number of distinct contributors who have authored at least one post in a topic (thread).

Parameters:

Name Type Description Default
community pici.Community required

Returns:

Name Type Description
results dict of str:int
  • number of contributors
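As an illustration, counting distinct contributors per topic might look like this in pandas, assuming a hypothetical posts table with `topic` and `user` columns.

```python
import pandas as pd

# Hypothetical posts table; column names are assumptions.
posts = pd.DataFrame({
    "topic": ["t1", "t1", "t1", "t2", "t2"],
    "user": ["alice", "bob", "alice", "carol", "carol"],
})

# Each user is counted once per topic, however many posts they wrote there.
contributors_per_topic = posts.groupby("topic")["user"].nunique()
print(contributors_per_topic)
```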

number_of_posts(community)

Total number of posts authored by community.

TODO
  • document

Parameters:

Name Type Description Default
community required

number_of_posts_per_topic(community)

Number of posts per topic.

TODO
  • add to TOC

Parameters:

Name Type Description Default
community required

Returns:

Name Type Description
report
  • number of posts

number_of_words(community)

Number of words in each post, counted after HTML markup has been removed.

Parameters:

Name Type Description Default
community pici.Community required

Returns:

Name Type Description
results dict of str:int
  • number of words
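One possible way to count words after stripping markup, using only the Python standard library. How the actual function removes HTML and splits words is not documented here, so both choices below (collecting text nodes with `html.parser` and splitting on whitespace) are assumptions, as is the helper name `count_words`.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def count_words(html_text):
    """Count whitespace-separated words in the visible text of a post."""
    extractor = _TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    # Join with spaces so words in adjacent block elements do not merge.
    text = " ".join(extractor.parts)
    return len(text.split())


post = "<p>Hello <b>brave</b> new world</p><p>Second paragraph</p>"
print(count_words(post))
```

Because text nodes are joined with spaces, punctuation that directly follows an inline tag is counted as its own token; a stricter tokenizer would be needed to avoid that.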

post_dates_per_topic(community)

Dates of the first, second, and last post in each topic.

Parameters:

Name Type Description Default
community pici.Community required

Returns:

Name Type Description
results dict of str:date
  • first post date
  • second post date
  • last post date

post_delays_per_topic(community)

Delays (in days) between first and second post, and first and last post.

Parameters:

Name Type Description Default
community pici.Community required

Returns:

Name Type Description
results dict of str:int
  • delay first last post
  • delay first second post
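The delays can be sketched from per-topic post dates as follows, assuming a hypothetical posts table with `topic` and `date` columns. Whole days are used, so partial days are truncated; whether the real function rounds the same way is not stated. `post_delays` is an invented helper name.

```python
import pandas as pd

# Hypothetical posts table; column names are assumptions.
posts = pd.DataFrame({
    "topic": ["a", "a", "a", "b"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-03", "2021-01-11", "2021-02-01"]
    ),
})


def post_delays(posts):
    """Days from each topic's first post to its second and to its last post."""
    ordered = posts.sort_values("date")
    dates = ordered.groupby("topic")["date"]
    first, last = dates.min(), dates.max()
    # Position of each post within its topic (0 = first, 1 = second, ...).
    position = ordered.groupby("topic").cumcount()
    second = ordered.loc[position == 1].set_index("topic")["date"]
    second = second.reindex(first.index)  # NaT for single-post topics
    return pd.DataFrame({
        "delay first second post": (second - first).dt.days,
        "delay first last post": (last - first).dt.days,
    })


delays = post_delays(posts)
print(delays)
```

Topics with a single post have no second post, so their first-to-second delay is missing (NaN) while their first-to-last delay is 0.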

posts_per_interval(community, interval)

Number of posts authored by community per time interval.

TODO
  • document
  • add to TOC

Parameters:

Name Type Description Default
community required
interval required

posts_word_occurrence(community, words, normalize=True)

Counts the occurrence of a set of words in each post.

Parameters:

Name Type Description Default
community pici.Community required
words list of str List of words to count in post texts. required
normalize bool Normalize occurrence count by text length. True

Returns:

Name Type Description
results dict of str:int
  • occurrence of <word> for each provided word
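A rough sketch of per-post word counting with optional normalization. The description says counts are normalized "by text length" without specifying the unit; this example assumes length means the number of word tokens, and it matches whole words case-insensitively. Both choices, and the helper name `word_occurrence`, are assumptions rather than the library's documented behavior.

```python
import re


def word_occurrence(text, words, normalize=True):
    """Count how often each word in `words` occurs in `text`.

    With normalize=True, counts are divided by the number of word tokens
    in the text (an assumed definition of "text length").
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = {word: tokens.count(word.lower()) for word in words}
    if normalize and tokens:
        return {word: count / len(tokens) for word, count in counts.items()}
    return counts


text = "Print the part, then print again"
raw = word_occurrence(text, ["print", "part"], normalize=False)
relative = word_occurrence(text, ["print", "part"])
print(raw)
print(relative)
```

With normalization the values are fractions rather than integers, which makes posts of different lengths comparable.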