elliptic_toolkit package
Elliptic Bitcoin Dataset Toolkit
A comprehensive Python toolkit for working with the Elliptic Bitcoin dataset, providing utilities for data loading, temporal analysis, graph neural network modeling, and evaluation.
- elliptic_toolkit.download_dataset(root: str = 'elliptic_bitcoin_dataset', raw_file_names=['elliptic_txs_features.csv', 'elliptic_txs_edgelist.csv', 'elliptic_txs_classes.csv'], force: bool = False, url: str = 'https://data.pyg.org/datasets/elliptic')[source]
Download the Elliptic Bitcoin dataset from PyTorch Geometric’s dataset repository.
- Parameters:
root (str, optional) – The root directory where the dataset will be stored. Defaults to “elliptic_bitcoin_dataset”.
raw_file_names (list, optional) – List of raw file names to download. Defaults to ['elliptic_txs_features.csv', 'elliptic_txs_edgelist.csv', 'elliptic_txs_classes.csv'].
force (bool, optional) – Whether to force re-download the dataset if it already exists. Defaults to False.
url (str, optional) – The base URL for the dataset files. Defaults to ‘https://data.pyg.org/datasets/elliptic’.
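A minimal usage sketch; per the parameter descriptions above, the download is a no-op when the files already exist unless force=True:

```python
from elliptic_toolkit import download_dataset

download_dataset(root="elliptic_bitcoin_dataset")  # skipped if files exist
download_dataset(force=True)                       # re-download regardless
```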
- elliptic_toolkit.process_dataset(folder_path: str = 'elliptic_bitcoin_dataset', features_file: str = 'elliptic_txs_features.csv', classes_file: str = 'elliptic_txs_classes.csv', edges_file: str = 'elliptic_txs_edgelist.csv')[source]
Loads, validates, and processes the Elliptic Bitcoin dataset.
- Returns:
nodes_df (pandas.DataFrame) – DataFrame with shape (203769, 167). Columns:
- 'time': discrete time step (int)
- 'feat_0' … 'feat_164': node features (float)
- 'class': node label (int: 1 for illicit, 0 for licit, -1 for unknown/missing)
The 'class' column uses -1 to indicate missing labels (transductive setting). The 'txId' column is dropped in the returned DataFrame; the row order matches the input file.
edges_df (pandas.DataFrame) – DataFrame with shape (234355, 2). Columns:
- 'txId1': source node index (int, row index in nodes_df)
- 'txId2': target node index (int, row index in nodes_df)
Each row represents a directed edge in the transaction graph, with node indices corresponding to rows in nodes_df.
Notes
All IDs in ‘edges_df’ are mapped to row indices in ‘nodes_df’.
The function performs strict validation on shapes, unique values, and label distribution.
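A short sketch of loading and filtering; the tuple unpacking is an assumption based on the Returns section above:

```python
from elliptic_toolkit import download_dataset, process_dataset

download_dataset()                      # ensure the raw CSVs are present
nodes_df, edges_df = process_dataset()  # assumed tuple return, per Returns above

print(nodes_df.shape)   # (203769, 167): time, feat_0..feat_164, class
print(edges_df.shape)   # (234355, 2): txId1, txId2 as row indices

# Keep only labeled nodes (-1 marks unknown labels).
labeled = nodes_df[nodes_df["class"] != -1]
```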
- elliptic_toolkit.temporal_split(times, test_size=0.2)[source]
- elliptic_toolkit.temporal_split(times: ndarray, test_size=0.2)
- elliptic_toolkit.temporal_split(times: Tensor, test_size=0.2)
- elliptic_toolkit.temporal_split(nodes_df: DataFrame, test_size=0.2, return_X_y=True)
Split data into temporal train/test sets based on unique time steps.
- Parameters:
times (np.ndarray, torch.Tensor, or pandas.DataFrame) – The time information or data to split. For DataFrames, must contain a ‘time’ column.
test_size (float, default=0.2) – Proportion of unique time steps to include in the test split (between 0.0 and 1.0).
- Returns:
For array/tensor input –
- train_indices, test_indices (array-like) – Indices for the training and test sets.
For DataFrame input with return_X_y=True (the default) –
- (X_train, y_train), (X_test, y_test) – tuple of tuples, where:
  - X_train (pandas.DataFrame) – Training features (all columns except 'class').
  - y_train (pandas.Series) – Training labels (the 'class' column).
  - X_test (pandas.DataFrame) – Test features (all columns except 'class').
  - y_test (pandas.Series) – Test labels (the 'class' column).
For DataFrame input with return_X_y=False –
- train_df, test_df (pandas.DataFrame) – The full training and test DataFrames, already sliced by time.
Type-specific behavior:
- np.ndarray – Uses numpy operations to split by unique time values.
- torch.Tensor – Uses torch operations to split by unique time values (no CPU/GPU transfer).
- pandas.DataFrame – Splits on the 'time' column. If return_X_y=True, unpacks X and y based on the 'class' column; otherwise returns the sliced DataFrames.
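A short sketch covering the array and DataFrame overloads; the toy data here is purely illustrative:

```python
import numpy as np
import pandas as pd

from elliptic_toolkit import temporal_split

# Array input: row indices split so the last 20% of unique time steps
# form the test set.
times = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
train_idx, test_idx = temporal_split(times, test_size=0.2)

# DataFrame input: must contain 'time' and 'class' columns.
df = pd.DataFrame({
    "time": times,
    "feat_0": np.random.randn(10),
    "class": np.random.randint(0, 2, 10),
})
(X_train, y_train), (X_test, y_test) = temporal_split(df, test_size=0.2)
train_df, test_df = temporal_split(df, test_size=0.2, return_X_y=False)
```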
- elliptic_toolkit.load_labeled_data(test_size=0.2, root='elliptic_bitcoin_dataset')[source]
Utility function to load the dataset, keep only labeled data, and split it temporally into train and test sets.
- Parameters:
test_size (float, default=0.2) – Proportion of unique time steps to include in the test split (between 0.0 and 1.0).
root (str, optional) – The root directory where the dataset is stored. Defaults to "elliptic_bitcoin_dataset".
- Returns:
(X_train, y_train), (X_test, y_test) – X_train, y_train: training features and labels; X_test, y_test: test features and labels.
- Return type:
tuple of tuples
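A minimal sketch; LogisticRegression is a stand-in estimator here, and whether X keeps its 'time' column is an assumption (see DropTime below for removing it inside a pipeline):

```python
from sklearn.linear_model import LogisticRegression

from elliptic_toolkit import load_labeled_data

(X_train, y_train), (X_test, y_test) = load_labeled_data(test_size=0.2)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))
```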
- class elliptic_toolkit.GNNBinaryClassifier(data, model, hidden_dim=64, num_layers=3, dropout=0.5, norm=None, jk='last', learning_rate_init=0.01, weight_decay=0.0005, balance_loss=True, max_iter=200, verbose=False, n_iter_no_change=10, tol=0.0001, device='auto', heads=None, **kwargs)[source]
Bases: ClassifierMixin, BaseEstimator
Graph Neural Network Binary Classifier with early stopping.
A scikit-learn compatible binary classifier that wraps PyTorch Geometric GNN models. Currently supports transductive, full-batch models (GCN, GAT).
The training loss is monitored and the model is considered converged if the loss does not improve for n_iter_no_change consecutive iterations by at least tol. This early stopping mechanism is always enabled, similar to MLPClassifier in scikit-learn.
- Parameters:
data (torch_geometric.data.Data) – Graph data object containing node features (x), edge indices (edge_index), and node labels (y).
model (torch.nn.Module) – The GNN model class to instantiate for training.
hidden_dim (int, default=64) – Number of hidden units in each layer.
num_layers (int, default=3) – Number of layers in the neural network.
dropout (float, default=0.5) – Dropout probability for regularization.
norm (str or Callable, default=None) – Normalization layer; forwarded to the underlying PyG model constructor.
jk (str, default='last') – Jumping Knowledge mode; forwarded to the underlying PyG model constructor.
learning_rate_init (float, default=0.01) – Initial learning rate for the Adam optimizer.
weight_decay (float, default=5e-4) – L2 regularization strength.
balance_loss (bool, default=True) – Whether to balance the loss function by weighting positive samples. If True, uses positive class weighting in BCEWithLogitsLoss based on class frequencies. If False, uses unweighted loss.
max_iter (int, default=200) – Maximum number of training iterations.
verbose (bool, default=False) – Whether to print training progress.
n_iter_no_change (int, default=10) – Number of consecutive iterations with no improvement to trigger early stopping.
tol (float, default=1e-4) – Tolerance for improvement. Training stops if loss improvement is less than this value.
device (str or torch.device, default='auto') – Device to use for computation. Can be ‘cpu’, ‘cuda’, ‘auto’, or a torch.device object. If ‘auto’, will use CUDA if available, otherwise CPU.
heads (int, default=None) – Number of attention heads for GAT models. Only applicable when model=GAT. Ignored with a warning for other model types.
**kwargs (dict) – Additional keyword arguments passed to the model constructor.
- Attributes:
loss_curve (list) – List of loss values at each training iteration.
model – The trained GNN model after calling fit.
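A hedged end-to-end sketch on a toy graph; the Data construction and index handling are illustrative assumptions, while the estimator calls follow the method signatures documented below:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCN

from elliptic_toolkit import GNNBinaryClassifier, temporal_split

# Toy transductive graph: 8 nodes, 4 features, a simple chain of edges.
x = torch.randn(8, 4)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 5, 6],
                           [1, 2, 3, 4, 5, 6, 7]])
y = torch.tensor([0, 1, 0, 1, 0, 1, 0, 1])
data = Data(x=x, edge_index=edge_index, y=y)

# Temporal split on per-node time steps (see temporal_split above).
times = torch.tensor([1, 1, 2, 2, 3, 3, 4, 4])
train_idx, test_idx = temporal_split(times, test_size=0.25)

clf = GNNBinaryClassifier(data, GCN, hidden_dim=32, num_layers=2, max_iter=50)
clf.fit(train_idx)                   # X is node indices into `data`
proba = clf.predict_proba(test_idx)  # ndarray of shape (n_test, 2)
pred = clf.predict(test_idx)         # 0/1 labels
```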
- __init__(data, model, hidden_dim=64, num_layers=3, dropout=0.5, norm=None, jk='last', learning_rate_init=0.01, weight_decay=0.0005, balance_loss=True, max_iter=200, verbose=False, n_iter_no_change=10, tol=0.0001, device='auto', heads=None, **kwargs)[source]
- fit(X, y=None)[source]
Fit the GNN model to the training data.
Training automatically stops when the loss stops improving for n_iter_no_change consecutive iterations, similar to MLPClassifier.
- Parameters:
X (array-like) – Indices of the training samples in the graph (passed as X for sklearn compatibility).
y (array-like, default=None) – Target values (ignored, present for sklearn compatibility).
- Returns:
self – Returns self for method chaining.
- Return type:
GNNBinaryClassifier
- Warns:
UserWarning – If training stops due to max_iter being reached without convergence.
- predict(X)[source]
Predict class labels for the given node indices.
- Parameters:
X (array-like) – Indices of the test samples in the graph.
- Returns:
predictions – Predicted class labels (0 or 1).
- Return type:
ndarray of shape (n_samples,)
- Raises:
ValueError – If the classifier has not been fitted yet.
- predict_proba(X)[source]
Predict class probabilities for the given node indices.
- Parameters:
X (array-like) – Indices of the test samples in the graph.
- Returns:
probabilities – Predicted class probabilities. First column contains probabilities for class 0, second column for class 1.
- Return type:
ndarray of shape (n_samples, 2)
- Raises:
ValueError – If the classifier has not been fitted yet.
- property classes_
The class labels known to the classifier.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → GNNBinaryClassifier
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class elliptic_toolkit.MLPWrapper(num_layers=2, hidden_dim=16, hidden_layer_sizes=None, alpha=0.0001, learning_rate_init=0.001, batch_size='auto', max_iter=1000)[source]
Bases: MLPClassifier
Wrapper around sklearn's MLPClassifier that allows specifying the number of layers and hidden dimension directly. This is useful for hyperparameter tuning, where hyperparameters need to be independent. Some parameters of the base MLPClassifier are fixed to ensure consistent behavior:
- shuffle=False: disables shuffling to maintain temporal order.
- early_stopping=False: disables the internal train/validation split (validation-loss based early stopping) in favor of training-loss based early stopping.
- Parameters:
num_layers (int, default=2) – Number of hidden layers in the MLP.
hidden_dim (int, default=16) – Number of units in each hidden layer.
hidden_layer_sizes (tuple or None, default=None) – If provided, this overrides num_layers and hidden_dim. Should be a tuple specifying the size of each hidden layer.
alpha (float, default=0.0001) – L2 regularization term.
learning_rate_init (float, default=0.001) – Initial learning rate.
batch_size (int or 'auto', default='auto') – Size of minibatches for stochastic optimizers.
max_iter (int, default=1000) – Maximum number of iterations.
- __init__(num_layers=2, hidden_dim=16, hidden_layer_sizes=None, alpha=0.0001, learning_rate_init=0.001, batch_size='auto', max_iter=1000)[source]
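A small sketch; the claim that num_layers=3 with hidden_dim=32 is equivalent to hidden_layer_sizes=(32, 32, 32) is an assumption based on the parameter descriptions above:

```python
from sklearn.datasets import make_classification

from elliptic_toolkit import MLPWrapper

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# num_layers/hidden_dim are expanded into an MLP architecture
# (assumed equivalent to hidden_layer_sizes=(32, 32, 32)).
mlp = MLPWrapper(num_layers=3, hidden_dim=32).fit(X, y)

# An explicit tuple overrides num_layers and hidden_dim.
mlp = MLPWrapper(hidden_layer_sizes=(64, 32)).fit(X, y)
```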
- set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → MLPWrapper
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_partial_fit_request(*, classes: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → MLPWrapper
Configure whether metadata should be requested to be passed to the partial_fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the classes parameter in partial_fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in partial_fit.
- Returns:
self – The updated object.
- Return type:
MLPWrapper
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → MLPWrapper
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_params(**params)[source]
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that it's possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
- class elliptic_toolkit.DropTime(drop=True)[source]
Bases: BaseEstimator, TransformerMixin
Transformer for dropping the ‘time’ column from a DataFrame. Useful in scikit-learn pipelines.
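A minimal pipeline sketch; StandardScaler and the MLPWrapper settings are illustrative choices, and it is an assumption here that X keeps its 'time' column up to this step:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

from elliptic_toolkit import DropTime, MLPWrapper

# DropTime removes the 'time' column before the downstream steps see it.
pipe = Pipeline([
    ("drop_time", DropTime()),
    ("scale", StandardScaler()),
    ("mlp", MLPWrapper(num_layers=2, hidden_dim=16)),
])
```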
- class elliptic_toolkit.TemporalRollingCV(n_splits=5, *, test_size=None, max_train_size=None, gap=0, time_col='time')[source]
Bases: TimeSeriesSplit
Time-based cross-validation iterator that extends scikit-learn’s TimeSeriesSplit to work with data that has explicit time step values (like the Elliptic Bitcoin dataset).
This class inherits from TimeSeriesSplit and adds functionality to handle datasets where multiple samples can belong to the same time step. It maps the time step indices to actual row indices in the dataset, allowing it to be used with datasets like the Elliptic Bitcoin dataset.
This CV strategy ensures that for each fold:
1. Training data comes from earlier time periods.
2. The test set is a continuous time window following the training data.
3. Each fold expands the training window and shifts the test window forward.
- Parameters:
n_splits (int, default=5) – Number of splits to generate.
test_size (int, default=None) – Size of the test window in time steps. If None, it is calculated from n_splits.
max_train_size (int, default=None) – Maximum number of time steps to use for training. If None, all available time steps are used.
gap (int, default=0) – Number of time steps to skip between training and test sets.
time_col (str, default='time') – Name of the column containing time step information.
- split(X, y=None, groups=None)[source]
Generate indices to split data into training and test sets.
Unlike standard TimeSeriesSplit, this method works with explicit time step values and maps them to actual row indices in the dataset. This allows it to handle datasets where multiple samples can belong to the same time step.
- Parameters:
X (array-like or DataFrame) – Training data. If a DataFrame, must contain the column specified by time_col; otherwise, time values must be passed through the groups parameter.
y (array-like, optional) – Targets for the training data (ignored).
groups (array-like, optional) – Time values for each sample if X doesn't have the time column specified by time_col.
- Yields:
train_index (ndarray) – Indices of rows in the training set.
test_index (ndarray) – Indices of rows in the test set.
Notes:
The yielded indices refer to rows in the original dataset, not time steps. This makes the cross-validator compatible with scikit-learn’s model selection tools.
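A hedged sketch combining the pieces above; keeping DropTime inside the pipeline means the splitter still sees the 'time' column while the estimator never does:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

from elliptic_toolkit import DropTime, MLPWrapper, TemporalRollingCV

cv = TemporalRollingCV(n_splits=5, gap=1)  # reads the 'time' column from X

pipe = Pipeline([("drop_time", DropTime()), ("mlp", MLPWrapper())])

search = GridSearchCV(
    pipe,
    param_grid={"mlp__hidden_dim": [16, 32], "mlp__num_layers": [1, 2]},
    cv=cv,
    scoring="average_precision",
)
# X_train must keep its 'time' column (e.g. as loaded by load_labeled_data):
# search.fit(X_train, y_train)
```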
- elliptic_toolkit.plot_evals(est, X_test, y_test, y_train, *, time_steps_test=None)[source]
Generate two evaluation plots for a classifier:
1. Precision-Recall curve on the test set.
2. Rolling/cumulative AP and illicit rate by time step.
- Parameters:
est (classifier) – Trained classifier with predict_proba method.
X_test (pd.DataFrame, array-like) – Test features. Must contain a ‘time’ column unless time_steps_test is provided.
y_test (numpy.ndarray) – Test labels (binary).
y_train (numpy.ndarray) – Training labels (binary), used for reference illicit rate.
time_steps_test (numpy.ndarray, optional) – Time step values for test set. If None, will use X_test[‘time’].
- Returns:
pr_fig (matplotlib.figure.Figure) – Figure for the precision-recall curve.
temporal_fig (matplotlib.figure.Figure) – Figure for the rolling/cumulative AP and illicit rate by time step.
Notes
This function assumes arrays are numpy ndarrays. X_test may be a torch.Tensor, but est.predict_proba must return numpy arrays.
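A short usage sketch; est, X_test, y_test, and y_train stand in for a fitted classifier and the splits from load_labeled_data above, and it is assumed X_test keeps its 'time' column so time_steps_test can be omitted:

```python
from elliptic_toolkit import plot_evals

# est: any fitted classifier exposing predict_proba.
pr_fig, temporal_fig = plot_evals(est, X_test, y_test, y_train)
pr_fig.savefig("pr_curve.png")
temporal_fig.savefig("temporal_eval.png")
```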
- elliptic_toolkit.plot_marginals(cv_results, max_ticks=10)[source]
For each hyperparameter in cv_results, plot the marginal mean and standard deviation (error bar) of the test scores. The marginal mean/std for each hyperparameter value is computed by averaging, across all other hyperparameters, the per-fold mean/std (i.e., the averages of the mean_test_score and std_test_score columns).
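A hedged sketch; `search` is the fitted GridSearchCV from the TemporalRollingCV example above, and it is assumed cv_results is accepted as a DataFrame carrying the mean_test_score / std_test_score columns:

```python
import pandas as pd

from elliptic_toolkit import plot_marginals

cv_results = pd.DataFrame(search.cv_results_)
plot_marginals(cv_results, max_ticks=10)
```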
- elliptic_toolkit.parse_search_cv_logs(file_path, trim=True)[source]
Parse the hyperparameter search results. If trim is True, only return columns with more than one unique value.
- Parameters:
file_path (str) – Path to the log file containing the search output.
trim (bool, default=True) – If True, only return columns with more than one unique value.
- Returns:
res – DataFrame with hyperparameter results.
- Return type:
pandas.DataFrame
Notes
Assumes each relevant line in the log file contains 'END', the CV number as [CV x/y], and hyperparameters in the format param=value.
Example line:
[CV 1/5] END accuracy=0.95, learning_rate=0.01, num_layers=3,; acc=0.95, total time=3min
The specific regex patterns can be adjusted in the regex_map dictionary.
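A closing sketch tying the two utilities together; "search.log" is a hypothetical file of captured verbose search output, and feeding the parsed frame to plot_marginals is an assumption that it carries the score columns the plot relies on:

```python
from elliptic_toolkit import parse_search_cv_logs, plot_marginals

res = parse_search_cv_logs("search.log")  # trim=True keeps varying params only
print(res.head())

plot_marginals(res)  # assumed: parsed frame is accepted as cv_results
```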