helpers

`aggregate(dict_of_series, aggregations=[np.mean, np.min, np.max, np.std, np.sum])`

Applies each of the given aggregations to every series supplied as a value in dict_of_series. Keys are the names of the series; the name of each aggregation is appended to the series name as "(agg-name)".

Parameters:

Name	Type	Description	Default
`aggregations`		list of aggregation functions	`[np.mean, np.min, np.max, np.std, np.sum]`
`dict_of_series`		dict of indicator_name:Pandas.Series	required

Returns:

Type	Description
	dict of formatted indicator_name: aggregated series
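The naming scheme described above can be illustrated with a minimal sketch. `aggregate_sketch` is a hypothetical stand-in, not the library's implementation; the exact key format (a space before the parenthesised aggregation name, taken from the function's `__name__`) is an assumption, and the sketch uses only two of the default aggregations.

```python
import numpy as np
import pandas as pd


def aggregate_sketch(dict_of_series, aggregations=(np.mean, np.sum)):
    """Hypothetical re-implementation for illustration only."""
    results = {}
    for name, series in dict_of_series.items():
        for agg in aggregations:
            # Assumed key format: "<series name> (<aggregation name>)".
            results[f"{name} ({agg.__name__})"] = agg(series)
    return results


# Made-up input: one indicator with three values.
example = aggregate_sketch({"number of words": pd.Series([10, 20, 30])})
```

Here `example` holds one entry per combination of series and aggregation, keyed `"number of words (mean)"` and `"number of words (sum)"`.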

`apply_to_initial_posts(community, new_cols, func)`

Applies func to initial posts (community.posts where post_position_in_thread==1). Returns a DataFrame with the topic_column field as index. Columns in the returned DataFrame are named according to the strings in new_cols, and their values follow the order of the values returned by func.

Parameters:

Name	Description	Default
`community`	pici.Community	required
`new_cols`	list of strings	required
`func`	function to apply to each initial post from community.posts	required

The returned DataFrame is indexed by thread-ids.
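A rough sketch of this behaviour on a plain DataFrame may help. `apply_to_initial_posts_sketch` is hypothetical: it takes the posts table and the name of the topic column directly instead of a pici.Community, and the column names `thread_id` and `text` are made up for the example.

```python
import pandas as pd


def apply_to_initial_posts_sketch(posts, topic_column, new_cols, func):
    """Hypothetical illustration, not the library's code."""
    # Keep only the first post of every thread.
    initial = posts[posts["post_position_in_thread"] == 1]
    # func returns one value per entry in new_cols, in the same order.
    rows = [func(post) for _, post in initial.iterrows()]
    return pd.DataFrame(rows, columns=new_cols, index=initial[topic_column])


posts = pd.DataFrame({
    "thread_id": ["t1", "t1", "t2"],
    "post_position_in_thread": [1, 2, 1],
    "text": ["hello", "re", "hi there"],
})
result = apply_to_initial_posts_sketch(
    posts,
    topic_column="thread_id",
    new_cols=["n_chars", "upper_text"],
    func=lambda post: (len(post["text"]), post["text"].upper()),
)
```

The reply in thread `t1` is ignored, so `result` has one row per thread.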

`as_table(func)`

Decorator that returns results as a table indexed by community name. TODO: document

Parameters:

Name	Type	Description	Default
`func`			required

`create_co_contributor_graph(link_data, node_data, node_col, group_col, node_attributes, connected=True)`

Creates a networkx.Graph whose nodes are users, with an edge between two users if both have contributed to the same thread. The weight of an edge is the number of threads in which the two users co-contributed.

Parameters:

Name	Type	Description	Default
`link_data`			required
`node_data`			required
`node_col`			required
`group_col`			required
`node_attributes`			required
`connected`			`True`
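The edge-weight rule can be sketched without any graph library. The function name, the `(thread, user)` record layout, and the data below are assumptions made for illustration; the resulting pairs could be handed to a networkx.Graph, for example via `add_weighted_edges_from`.

```python
from collections import Counter
from itertools import combinations


def co_contributor_weights(contributions):
    """Hypothetical helper: count shared threads per pair of users."""
    users_by_thread = {}
    for thread, user in contributions:
        users_by_thread.setdefault(thread, set()).add(user)

    weights = Counter()
    for users in users_by_thread.values():
        # Each unordered pair of users in a thread counts once for that thread.
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up (thread, user) contribution records.
contributions = [
    ("t1", "alice"), ("t1", "bob"), ("t1", "alice"),
    ("t2", "alice"), ("t2", "bob"), ("t2", "carol"),
]
weights = co_contributor_weights(contributions)
```

Alice and Bob share two threads, so their edge gets weight 2, while the pairs involving Carol get weight 1. Repeated posts by the same user in one thread do not increase the weight.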

`create_commenter_graph(link_data, node_data, node_col, group_col, node_attributes, conntected=True)`

Creates a networkx.DiGraph whose nodes are users, with a directed edge a->b if a has replied to an initial post by b. The weight of an edge is the number of comments.

Parameters:

Name	Type	Description	Default
`link_data`			required
`node_data`			required
`node_col`			required
`group_col`			required
`node_attributes`			required
`conntected`			`True`
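The directed weights can be sketched in the same way. The function name, the `(thread, position, user)` record layout, and the choice to skip replies to one's own thread are all assumptions of this sketch, not documented behaviour.

```python
from collections import Counter


def commenter_weights(posts):
    """Hypothetical helper: count comments per (commenter, thread starter)."""
    # The author of the post at position 1 started the thread.
    initial_author = {thread: user for thread, pos, user in posts if pos == 1}

    weights = Counter()
    for thread, pos, user in posts:
        starter = initial_author[thread]
        # Assumption: replies to one's own initial post are not counted.
        if pos > 1 and user != starter:
            weights[(user, starter)] += 1
    return dict(weights)


# Made-up (thread, position_in_thread, user) records.
posts = [
    ("t1", 1, "bob"), ("t1", 2, "alice"), ("t1", 3, "alice"), ("t1", 4, "bob"),
    ("t2", 1, "alice"), ("t2", 2, "bob"),
]
weights = commenter_weights(posts)
```

Each key is a directed pair `(commenter, initial poster)`, so `("alice", "bob")` and `("bob", "alice")` are separate edges.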

`flat(df, columns='community_name')`

Returns a pivoted version of df with flattened index.

Parameters:

Name	Type	Description	Default
`df`	`pd.DataFrame`	Pandas.DataFrame	required
`columns`	`str`	Column name to pivot on.	`'community_name'`

`generate_indicator_results(posts, initial_post, feedback, indicator_text, column, aggs=[np.sum, np.mean, np.min, np.max, np.std])`

Returns the values of column in the DataFrames posts, initial_post, and feedback under different aggregations (sum, mean, ...). The initial post is only aggregated as a sum. The output is a dict that maps each DataFrame/aggregation combination to its value, e.g. "posts indicator_text (mean)": value.

Parameters:

Name	Type	Description	Default
`posts`			required
`initial_post`			required
`feedback`			required
`indicator_text`			required
`column`			required
`aggs`			`[np.sum, np.mean, np.min, np.max, np.std]`

`join_df(func)`

Decorator that joins results to the existing DataFrame in the community. TODO: document

Parameters:

Name	Type	Description	Default
`func`			required

`merge_dfs(dfs, only_unique=False)`

Wrapper for Pandas.merge(). Merges DataFrames, so that

TODO: document

Parameters:

Name	Type	Description	Default
`dfs`	`Iterable[pd.DataFrame]`		required
`only_unique`	`bool`		`False`

`num_words(text)`

Counts the number of words in a text. HTML tags and comments are not included in the count.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text to count words in.	required

Returns:

Name	Type	Description
`count`	`int`	Number of words.
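One way such a count could work is sketched below. `num_words_sketch` is hypothetical; stripping markup with regular expressions and splitting on whitespace are assumptions of this sketch, and the library may tokenise differently.

```python
import re


def num_words_sketch(text):
    """Hypothetical illustration: count words, ignoring HTML markup."""
    # Drop HTML comments first, including their content.
    without_comments = re.sub(r"<!--.*?-->", " ", text, flags=re.DOTALL)
    # Then drop the tags themselves, keeping the text between them.
    without_tags = re.sub(r"<[^>]+>", " ", without_comments)
    return len(without_tags.split())


count = num_words_sketch("<p>Hello <b>brave</b> new world</p><!-- hidden note -->")
```

Only the four visible words are counted; the tags and the comment text do not contribute.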

`series_most_common(series)`

Returns the most common element of a Pandas.Series.

Parameters:

Name	Type	Description	Default
`series`	`pd.Series`	Pandas.Series	required
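A minimal sketch of this lookup, assuming it can be expressed with `value_counts`. `series_most_common_sketch` is a hypothetical name, and how the library resolves ties is not documented here.

```python
import pandas as pd


def series_most_common_sketch(series):
    """Hypothetical illustration, not the library's code."""
    # value_counts sorts by frequency, most frequent first.
    return series.value_counts().index[0]


most_common = series_most_common_sketch(pd.Series(["a", "b", "a", "c", "a", "b"]))
```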

`where_all(conditions)`

Combines the given logical conditions with a logical AND.

Parameters:

Name	Type	Description	Default
`conditions`			required
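The combination can be sketched as a fold over the conditions. This assumes the conditions are boolean pandas Series (masks); `where_all_sketch` and the example columns are made up for illustration.

```python
from functools import reduce
import operator

import pandas as pd


def where_all_sketch(conditions):
    """Hypothetical illustration: AND together any number of boolean masks."""
    return reduce(operator.and_, conditions)


df = pd.DataFrame({"words": [5, 50, 500], "replies": [0, 3, 7]})
mask = where_all_sketch([df["words"] > 10, df["replies"] > 1, df["words"] < 100])
selected = df[mask]
```

Only the middle row satisfies all three conditions, so `selected` contains that single row.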

`word_occurrences(text, words)`

Counts the number of occurrences of specified words in text.

Parameters:

Name	Type	Description	Default
`text`	`str`	A text with words.	required
`words`	`list of str`	Words.	required

Returns:

Name	Type	Description
`occurrences`	`dict of str:int`	A `word (str), number of occurrences (int)` dictionary
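The shape of the returned dictionary can be shown with a small sketch. `word_occurrences_sketch` is hypothetical; case-insensitive matching and tokenising on `\w+` are assumptions, and the library may match words differently.

```python
import re
from collections import Counter


def word_occurrences_sketch(text, words):
    """Hypothetical illustration, not the library's code."""
    # Assumption: matching ignores case and punctuation.
    tokens = Counter(re.findall(r"\w+", text.lower()))
    return {word: tokens[word.lower()] for word in words}


occurrences = word_occurrences_sketch(
    "The cat saw the other cat.", ["the", "cat", "dog"]
)
```

Words that do not appear in the text are still present in the result, with a count of 0.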