cached_metrics
This is a collection of all cacheable functions that are used in the
calculation of indicators. The cache is implemented using
`functools.lru_cache` with `maxsize=None`. Caching is commonly done at
least at the community level (`pici.Community` is hashable). Examples of when
using a cache makes sense:
- calculating the similarity of post texts (done once for all combinations)
- generating "temporal networks" (filtered representations of networks, depending on dates of posts)
It is recommended to define cached parts of indicators here.
_comments_by_contributor(community, contributor, date_limit=None)
Get all comments posted by contributor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `contributor` | | User name | required |
| `date_limit` | | Date in string format, e.g. '2020-01-15' | `None` |
Returns:
| Type | Description |
|---|---|
| | The comments posted by the specified user (before the specified date_limit). |
_contribution_regularity(community, contributor, start, end)
Get the contribution regularity of contributor as the percentage of
days that contributor posted in the forum, between the dates start and end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `contributor` | | | required |
| `start` | | | required |
| `end` | | | required |
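The regularity measure can be illustrated with a small standard-library sketch. This is not the module's implementation: the function name, the inclusive handling of both end dates, and the 0-100 scale are assumptions made for illustration.

```python
from datetime import date


def contribution_regularity(post_dates, start, end):
    """Percentage (0-100) of days in [start, end] with at least one post.

    Assumes both boundary dates are included in the window.
    """
    total_days = (end - start).days + 1
    if total_days <= 0:
        raise ValueError("end must not be before start")
    # A set collapses several posts on the same day into one active day.
    active_days = {d for d in post_dates if start <= d <= end}
    return 100.0 * len(active_days) / total_days


posts = [date(2020, 1, 1), date(2020, 1, 1), date(2020, 1, 3), date(2020, 2, 1)]
regularity = contribution_regularity(posts, date(2020, 1, 1), date(2020, 1, 10))
```

In this example the contributor was active on 2 of the 10 days in the window, so `regularity` is 20.0; the post on 2020-02-01 falls outside the window and is ignored.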
_initial_post_author_network_metric(initial_post, community, metric, kind)
Get a cached network metric for the author of an initial post.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `initial_post` | | | required |
| `metric` | | | required |
| `community` | | | required |
| `thread_date` | | | required |
| `kind` | | | required |
Returns:
| Type | Description |
|---|---|
| | The value of the metric. |
_replies_to_own_topics(community, contributor, date_limit=None)
The number of replies made to initial posts by specified contributor in community.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `contributor` | | | required |
| `date_limit` | | Date in string format, e.g. '2020-01-15' | `None` |
Returns:
| Type | Description |
|---|---|
| | The number of replies to initial posts by the contributor. If date_limit is provided, only threads and replies posted before the date limit are considered. |
_temporal_text_similarity_dict(community, date, text_col='preprocessed_text__words_no_stop', similarity_metric='token_sort_ratio')
Returns a dictionary that maps each post text to its 1×n similarity matrix, for the similarity subgraph filtered by date.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `date` | | | required |
| `text_col` | | | `'preprocessed_text__words_no_stop'` |
| `similarity_metric` | | | `'token_sort_ratio'` |
_temporal_text_similarity_network(community, date, text_col='preprocessed_text__words_no_stop', similarity_metric='token_sort_ratio', only_initial_posts=True)
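Create a subview graph of the text similarity network created by
`_text_similarity_network()` by filtering out all nodes (= posts)
where post.date > date.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `date` | | | required |
| `text_col` | | | `'preprocessed_text__words_no_stop'` |
| `similarity_metric` | | | `'token_sort_ratio'` |
| `only_initial_posts` | | | `True` |
Returns:
| Type | Description |
|---|---|
| | The subview graph, restricted to posts with post.date <= date. |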
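The date filter can be sketched with plain dictionaries. This is only an illustration: the real function works on a graph object and returns a view, whereas this hypothetical `filter_by_date` helper copies the data, and the adjacency layout and ISO date strings are assumptions.

```python
def filter_by_date(adjacency, post_dates, date):
    """Return a new adjacency dict without posts dated after `date`.

    Dates are ISO strings ('YYYY-MM-DD'), which compare correctly as text.
    """
    keep = {post for post in adjacency if post_dates[post] <= date}
    return {
        post: {nbr: w for nbr, w in nbrs.items() if nbr in keep}
        for post, nbrs in adjacency.items()
        if post in keep
    }


# Hypothetical similarity network: post -> {neighbour: similarity weight}
similarity = {
    "p1": {"p2": 0.8, "p3": 0.4},
    "p2": {"p1": 0.8, "p3": 0.6},
    "p3": {"p1": 0.4, "p2": 0.6},
}
post_dates = {"p1": "2020-01-10", "p2": "2020-01-15", "p3": "2020-02-01"}

snapshot = filter_by_date(similarity, post_dates, "2020-01-15")
```

Here `p3` is dated after the cut-off, so it and every edge touching it are absent from `snapshot`. A view-based approach avoids the copy, which matters when many dates are queried against one large network.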
_text_similarity_network(community, text_col='preprocessed_text__words_no_stop', similarity_metric='token_sort_ratio', only_initial_posts=True)
Create a text-similarity network for all posts in community, using
`textacy.representations.network.build_similarity_network()`.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `community` | | | required |
| `text_col` | | | `'preprocessed_text__words_no_stop'` |
| `similarity_metric` | | | `'token_sort_ratio'` |
| `only_initial_posts` | | | `True` |
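To show why pairwise similarity is worth caching, here is a rough standard-library sketch of a token-sort similarity network. It does not call textacy: `token_sort_ratio` below is a stand-in built on `difflib.SequenceMatcher`, and `build_similarity_edges` is a hypothetical helper, so scores will differ from the library's metric.

```python
from difflib import SequenceMatcher
from itertools import combinations


def token_sort_ratio(a, b):
    """Order-insensitive similarity in [0, 1]: sort the tokens, then compare."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return SequenceMatcher(None, sorted_a, sorted_b).ratio()


def build_similarity_edges(texts):
    """Map each unordered pair of text indices to its similarity weight."""
    return {
        (i, j): token_sort_ratio(texts[i], texts[j])
        for i, j in combinations(range(len(texts)), 2)
    }


texts = ["cache the network", "network the cache", "unrelated post about printers"]
edges = build_similarity_edges(texts)
```

The number of comparisons grows quadratically with the number of posts (n·(n−1)/2 pairs), which is why computing the network once per community and reusing it is attractive.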
_threads_by_contributor(community, contributor, date_limit=None)
Get all threads initiated by contributor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
community |
required | ||
contributor |
User name |
required | |
date_limit |
Date in string format, e.g. '2020-01-15' |
None
|
specified user (before the specified date_limit).