labelling
InnovationLabels
Bases: Labels
TODO: add documentation
from_limesurvey(limesurvey_results, drop_labellers=None)
Adds label entries from the Limesurvey results format. Limesurvey results can contain multiple labelled threads per response. For each thread i and its associated url and labels, the data must contain one column each, e.g.: "thread1" (=url), "labelA1", "labelB1", ..., "thread2", "labelA2", "labelB2", ...
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| limesurvey_results |  | String (path to file) or Pandas.DataFrame | required |
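For illustration, a hypothetical results frame in this wide layout might look as follows (the URLs and label names are made up):

```python
import pandas as pd

# Hypothetical Limesurvey export: one row per response, two labelled
# threads per response, each thread with two labels ("A" and "B").
limesurvey_results = pd.DataFrame({
    "thread1": ["https://example.org/t/101", "https://example.org/t/102"],
    "labelA1": [1, 0],
    "labelB1": [0, 1],
    "thread2": ["https://example.org/t/103", "https://example.org/t/104"],
    "labelA2": [1, 1],
    "labelB2": [0, 0],
})
```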
LabelCollection
TODO: add documentation
all_label_names()
property
TODO: add documentation
by_level(level)
TODO: Add documentation
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| level |  |  | required |
labels()
property
TODO: add documentation
LabelStats
This class provides metrics and visualizations to analyze the annotations made by the labellers.
Available metrics are:
- % agreement ("a_0")
- Cohen's kappa (two labellers)
- Fleiss' kappa (multiple labellers)
- Krippendorff's alpha (multiple labellers, missing data)
Labellers' annotations can furthermore be evaluated against a subsample of "goldstandard" annotations, allowing labellers to be associated with a quality score.
TODO refactor --> move visualizations to visualizations.py ?
See [1] for a comparison of inter-rater agreement metrics.
[1] Xinshu Zhao, Jun S. Liu & Ke Deng (2013) Assumptions behind Intercoder Reliability Indices, Annals of the International Communication Association, 36:1, 419-480, DOI: 10.1080/23808985.2013.11679142
_melt_goldstandard_agreement(data)
Prepares the interrater_agreement dataframe for plotting (e.g., with seaborn): converts wide format to long format and shortens label names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data |  | Pandas.DataFrame with label names as index and a "labellers" column | required |
Returns:

| Type | Description |
|---|---|
|  | Pandas.DataFrame with columns: index (shortened label names), labellers (labeller names), variable (metric name), value (metric value) |
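The wide-to-long step corresponds to a pandas melt; a minimal sketch, assuming the wide frame has (unnamed) label names as index, a "labellers" column, and an arbitrary 15-character cutoff for shortening:

```python
import pandas as pd

def melt_agreement(wide: pd.DataFrame) -> pd.DataFrame:
    """Wide to long: one row per (label, labeller, metric)."""
    long = wide.reset_index().melt(
        id_vars=["index", "labellers"],  # keep label and labeller names
        var_name="variable",             # metric name
        value_name="value",              # metric value
    )
    # Shorten label names for readable plots (the cutoff is arbitrary).
    long["index"] = long["index"].str.slice(0, 15)
    return long
```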
cohen_kappa()
Get Cohen's kappa for all labels, using the scikit-learn implementation sklearn.metrics.cohen_kappa_score.
Returns NaN if the number of labellers != 2.
Returns:

| Type | Description |
|---|---|
|  | dict of (label name, kappa) |
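The underlying scikit-learn call, shown on toy ratings from two labellers for a single label:

```python
from sklearn.metrics import cohen_kappa_score

# Two labellers' ratings for one label on the same six cases.
rater_1 = [1, 0, 1, 1, 0, 1]
rater_2 = [1, 0, 1, 0, 0, 1]
print(f"Cohen's kappa: {cohen_kappa_score(rater_1, rater_2):.2f}")
```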
complete_agreement()
Get percentage of cases where all labellers agree (per label).
Returns:

| Name | Type | Description |
|---|---|---|
| agreement |  | Pandas.DataFrame with columns 'label', '% perfect agreement', '% n' |
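One way to compute this percentage for a single label with plain pandas (a sketch; the column names are made up):

```python
import pandas as pd

# Toy ratings for one label: rows are cases, columns are labellers.
ratings = pd.DataFrame({
    "labeller_a": [1, 0, 1, 1],
    "labeller_b": [1, 0, 0, 1],
    "labeller_c": [1, 0, 1, 1],
})
# A case counts as perfect agreement if all labellers chose the same value.
perfect = ratings.nunique(axis=1).eq(1)
print(f"% perfect agreement: {100 * perfect.mean():.1f}")
```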
fleiss_kappa()
Get Fleiss' kappa for all labels, based on the Statsmodels implementation (statsmodels.stats.inter_rater.fleiss_kappa).
Returns NaN if the number of labellers < 2.
Returns:

| Type | Description |
|---|---|
|  | dict of (label name, kappa) |
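The Statsmodels function expects a subjects × categories count table, which aggregate_raters can build from raw ratings (toy data):

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Toy ratings: rows are cases (subjects), columns are labellers.
ratings = np.array([
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
])
table, _ = aggregate_raters(ratings)  # subjects x categories count table
print(f"Fleiss' kappa: {fleiss_kappa(table):.2f}")
```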
interrater_agreement()
Calculates the overall interrater agreement for all labellers in the data. If the number of labellers > 2, all values for Cohen's/Fleiss' kappa will be NaN.
Returns:

| Type | Description |
|---|---|
|  | agreement dataframe |
krippendorff_alpha()
Get Krippendorff alphas using the `krippendorff` package.
See also: Andrew F. Hayes & Klaus Krippendorff (2007) Answering the Call for a Standard Reliability Measure for Coding Data, Communication Methods and Measures, 1:1, 77-89, DOI: 10.1080/19312450709336664
Returns:

| Type | Description |
|---|---|
| Pandas.DataFrame |  |
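Usage of the underlying package on toy data; reliability_data is raters × units, with np.nan marking missing ratings:

```python
import numpy as np
import krippendorff

# Rows: labellers; columns: cases; np.nan = case not rated by that labeller.
reliability_data = np.array([
    [1, 0, 1, np.nan, 0],
    [1, 0, 1, 1, np.nan],
    [np.nan, 0, 1, 1, 0],
])
alpha = krippendorff.alpha(reliability_data=reliability_data,
                           level_of_measurement="nominal")
print(f"Krippendorff's alpha: {alpha:.2f}")
```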
pairwise_interrater_agreement(goldstandard=None, min_comparisons=1)
Calculates the agreement metrics for all combinations of two labellers. If goldstandard is set (the name of a labeller), only comparisons with the goldstandard are calculated.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| goldstandard |  | Name of labeller | None |
| min_comparisons |  | Minimum number of shared labelled cases | 1 |
Returns:

| Type | Description |
|---|---|
|  | Agreement dataframe with a "labellers" column that contains the names of the compared labeller pair |
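A minimal sketch of the pairwise scheme, reduced to the % agreement metric for brevity (the helper name is hypothetical; the actual method computes the full metric set above):

```python
from itertools import combinations
import pandas as pd

def pairwise_a0(ratings: pd.DataFrame, min_comparisons: int = 1) -> pd.DataFrame:
    """ratings: cases x labellers for one label; NaN = case not rated."""
    rows = []
    for a, b in combinations(ratings.columns, 2):
        shared = ratings[[a, b]].dropna()  # cases rated by both labellers
        if len(shared) < min_comparisons:
            continue  # too few shared cases for a meaningful comparison
        rows.append({
            "labellers": f"{a}/{b}",
            "a_0": (shared[a] == shared[b]).mean(),  # % agreement
            "n": len(shared),
        })
    return pd.DataFrame(rows)
```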
plot_goldstandard_agreement(kind='label_boxplots', goldstandard=None, data=None)
Plot the labellers' agreement with the goldstandard. Provides different plots through kind:
- label_boxplots: each labeller's agreement with the goldstandard as values; x: metric, y: boxplot of values
- labellers_points: each labeller's agreement with the goldstandard as values; a grid with one column per metric, x: labels, y: values
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| kind |  | One of {'label_boxplots', 'labellers_points'} | 'label_boxplots' |
| goldstandard |  | Name of labeller to use as goldstandard (used if data is None to generate the data) | None |
| data |  | agreement dataframe generated by pairwise_interrater_agreement | None |
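A seaborn sketch of both kinds, assuming a long-format frame with the columns produced by _melt_goldstandard_agreement (toy values):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Toy long-format agreement frame (see _melt_goldstandard_agreement).
long = pd.DataFrame({
    "index": ["labelA", "labelA", "labelB", "labelB"],
    "labellers": ["ann", "bob", "ann", "bob"],
    "variable": ["a_0"] * 4,
    "value": [0.90, 0.80, 0.70, 0.95],
})

# label_boxplots: one box per metric over the labellers' agreement values.
sns.boxplot(data=long, x="variable", y="value")
plt.show()

# labellers_points: one facet per metric, labels on x, agreement on y.
sns.catplot(data=long, col="variable", x="index", y="value")
plt.show()
```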
Labels
Bases: ABC
TODO: add documentation
__init__(data=None, cols=DEFAULT_COLS, filter=None)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame |  | None |
| cols | dict |  | DEFAULT_COLS |
append(data, cols=DEFAULT_COLS, drop_labellers=None)
TODO: add documentation
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame |  | required |
| cols | dict |  | DEFAULT_COLS |
| drop_labellers |  |  | None |
data_by_label(format='sklearn', dropna=False)
TODO: add documentation
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dropna |  |  | False |
labellers()
property
TODO: add documentation
rating_table(label_name, communities=None, custom_filter=None, allow_missing_data=False)
Get the rating table for one label, to be used, e.g., with statsmodels.stats.inter_rater.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
communities |
List of communities to include in table, |
None
|
|
label_name |
label to be returned in table |
required | |
allow_missing_data |
whether to drop columns with missing ratings |
False
|
Returns:

| Type | Description |
|---|---|
|  | rating table: labels as a 2-dim table with raters (labellers) in rows and ratings in columns |
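A sketch of the hand-off to Statsmodels: since raters are in rows here, transpose so that subjects are in rows before aggregating:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Toy rating table as described above: raters in rows, ratings in columns.
table = np.array([
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
])
# Statsmodels expects subjects in rows, so transpose before aggregating.
counts, _ = aggregate_raters(table.T)
print(f"Fleiss' kappa: {fleiss_kappa(counts):.2f}")
```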
set_filter(f)
TODO: add documentation
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| f |  |  | required |