Package 'iC10TrainingData' reference manual

Title:	Training Datasets for iC10 Package
Description:	Training datasets for iC10, which implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package.
Authors:	Oscar M Rueda and Jose Antonio Seoane Fernandez
Maintainer:	Oscar M. Rueda <[email protected]>
License:	GPL-3
Version:	2.0.1
Built:	2025-02-24 05:33:01 UTC
Source:	https://github.com/cran/iC10TrainingData

Training Datasets for iC10 Package

Description

Training datasets for iC10, which implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014), building on the subgroups reported in the METABRIC paper 'The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups' (Curtis et al., Nature 2012). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package.

Details

The DESCRIPTION file:

Package:	iC10TrainingData
Type:	Package
Title:	Training Datasets for iC10 Package
Version:	2.0.1
Date:	2024-07-16
Author:	Oscar M Rueda and Jose Antonio Seoane Fernandez
Maintainer:	Oscar M. Rueda <[email protected]>
Description:	Training datasets for iC10, which implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package.
License:	GPL-3
Packaged:	2024-07-16 07:15:09 UTC; oscar
NeedsCompilation:	no
Date/Publication:	2024-07-16 08:00:02 UTC
Depends:	R (>= 3.5.0)
Repository:	https://rueda-lab.r-universe.dev
RemoteUrl:	https://github.com/cran/iC10TrainingData
RemoteRef:	HEAD
RemoteSha:	1e41b8cc1497e8183f2f07ae0f3aca4522734642

Index of help topics:

IntClustMemb            Class Membership for the training set
Map.All                 Probe mapping of the complete set of features
                        of the training set
Map.CN                  Probe mapping of the copy number features of
                        the training set.
Map.Exp                 Probe mapping of the Expression features of the
                        training set
iC10TrainingData-package
                        Training Datasets for iC10 Package
train.CN                Copy number data for the training set
train.Exp               Expression data for the training set.

Author(s)

Oscar M Rueda and Jose Antonio Seoane Fernandez

Maintainer: Oscar M. Rueda <[email protected]>

References

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
Tibshirani et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS 2002; 99(10):6567-6572.

Examples

data(train.CN)
data(train.Exp)
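
After loading, one might want to confirm that the two matrices and the class labels refer to the same samples. The sketch below is only an illustration and not part of the package: `samples_match` is a hypothetical helper, the toy objects are made up, and it assumes that sample IDs are stored as the column names of the matrices and as the names of the label factor.

```r
# Hypothetical helper (not part of iC10TrainingData): TRUE when the column
# names of `mat` and the names of `labels` hold exactly the same sample IDs.
samples_match <- function(mat, labels) {
  ids <- colnames(mat)
  !is.null(ids) && !is.null(names(labels)) && setequal(ids, names(labels))
}

# Toy stand-ins with made-up sample IDs, so the sketch runs without the package.
toy_mat <- matrix(1:6, nrow = 2,
                  dimnames = list(c("f1", "f2"), c("S1", "S2", "S3")))
toy_lab <- factor(c(S3 = "2", S1 = "1", S2 = "1"))
toy_ok <- samples_match(toy_mat, toy_lab)
print(toy_ok)

# With the real data, assuming the package is installed.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data(list = c("train.CN", "train.Exp", "IntClustMemb"),
       package = "iC10TrainingData")
  print(samples_match(train.CN, IntClustMemb))
  print(samples_match(train.Exp, IntClustMemb))
}
```

The check compares sets of IDs only; it does not verify that samples appear in the same order.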

Class Membership for the training set

Description

iC10 assignment for the METABRIC training dataset (997 samples).

Usage

data(IntClustMemb)

Format

The format is: Factor w/ 10 levels "1","2","3","4",..: 2 9 3 3 8 6 7 7 7 3 ... - attr(*, "names")= chr [1:997] "MB.0135" "MB.0167" "MB.0136" "MB.3403" ...

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(IntClustMemb)
barplot(table(IntClustMemb))
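
Since the memberships are stored as a factor, class sizes and proportions can be tabulated directly. The following is a minimal sketch; `class_summary` is a hypothetical helper and the toy labels are made up for illustration.

```r
# Hypothetical helper: counts and proportions for each level of a factor.
class_summary <- function(labels) {
  counts <- table(labels)
  data.frame(group = names(counts),
             n = as.vector(counts),
             proportion = as.vector(counts) / sum(counts),
             stringsAsFactors = FALSE)
}

# Made-up labels using the ten documented levels "1" to "10".
toy_labels <- factor(c("1", "2", "2", "3"), levels = as.character(1:10))
toy_summary <- class_summary(toy_labels)
print(toy_summary)

# With the real data, assuming the package is installed.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data("IntClustMemb", package = "iC10TrainingData")
  print(class_summary(IntClustMemb))
}
```

Levels with no samples are kept with a count of zero, which makes it easier to spot empty groups in a subset.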

Probe mapping of the complete set of features of the training set

Description

Probe mapping of the complete set of features of the training set

Usage

data(Map.All)

Format

A data frame with 714 observations on the following 17 variables:

Probe_ID: a character vector with the Illumina probe IDs that flank the features
Gene_symbol: a factor with the HUGO gene symbols
Ensembl_ID: a factor with the Ensembl IDs
Cytoband: a factor with the cytobands (on hg18)
Genomic_location_hg18: a factor with the genomic locations on hg18
chromosome_name_hg18: a numeric vector with the chromosome on hg18
start_position_hg18: a numeric vector with the start position on hg18
end_position_hg18: a numeric vector with the end position on hg18
Synonyms_0: a character vector with the gene name synonyms of the feature
Gene.Chosen: a character vector (YES or NO) specifying the probe chosen for gene-based selection
Genomic_location_hg19: a factor with the genomic locations on hg19
chromosome_name_hg19: a numeric vector with the chromosome on hg19
start_position_hg19: a numeric vector with the start position on hg19
end_position_hg19: a numeric vector with the end position on hg19
chromosome_name_hg38: a numeric vector with the chromosome on hg38
start_position_hg38: a numeric vector with the start position on hg38
end_position_hg38: a numeric vector with the end position on hg38

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(Map.All)
head(Map.All)
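
The `Gene.Chosen` column marks the probe selected for gene-based selection, so a gene-level view of the mapping can be obtained by filtering on it. This is a minimal sketch: `chosen_probes` is a hypothetical helper, and the probe IDs and gene symbols in the toy data frame are made up.

```r
# Hypothetical helper: keep only the rows flagged "YES" in Gene.Chosen.
# %in% treats missing values as non-matches, so NA rows are dropped.
chosen_probes <- function(map) {
  map[map$Gene.Chosen %in% "YES", , drop = FALSE]
}

# Made-up mapping with two probes for gene "A" and one for gene "B".
toy_map <- data.frame(Probe_ID = c("P1", "P2", "P3"),
                      Gene_symbol = c("A", "A", "B"),
                      Gene.Chosen = c("YES", "NO", "YES"),
                      stringsAsFactors = FALSE)
toy_chosen <- chosen_probes(toy_map)
print(toy_chosen)

# With the real data, assuming the package is installed.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data("Map.All", package = "iC10TrainingData")
  print(nrow(chosen_probes(Map.All)))
}
```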

Probe mapping of the copy number features of the training set.

Description

Probe mapping of the copy number features of the training set.

Usage

data(Map.CN)

Format

A data frame with 38 observations on the following 15 variables.

Probe_ID: a character vector with the Illumina probe IDs that flank the features
Gene_symbol: a factor with the HUGO gene symbols
Ensembl_ID: a factor with the Ensembl IDs
Cytoband: a factor with the cytobands (on hg18)
Genomic_location_hg18: a factor with the genomic locations on hg18
chromosome_name_hg18: a numeric vector with the chromosome on hg18
start_position_hg18: a numeric vector with the start position on hg18
end_position_hg18: a numeric vector with the end position on hg18
Genomic_location_hg19: a factor with the genomic locations on hg19
chromosome_name_hg19: a numeric vector with the chromosome on hg19
start_position_hg19: a numeric vector with the start position on hg19
end_position_hg19: a numeric vector with the end position on hg19
chromosome_name_hg38: a numeric vector with the chromosome on hg38
start_position_hg38: a numeric vector with the start position on hg38
end_position_hg38: a numeric vector with the end position on hg38

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(Map.CN)
head(Map.CN)
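
Because the mapping carries coordinates for several genome builds, it can be convenient to turn one build's columns into region strings. The sketch below assumes the column naming pattern listed under Format (`chromosome_name_<build>`, `start_position_<build>`, `end_position_<build>`); `region_strings` is a hypothetical helper and the toy coordinates are made up.

```r
# Hypothetical helper: build "chr<name>:<start>-<end>" strings for one build.
region_strings <- function(map, build = c("hg18", "hg19", "hg38")) {
  build <- match.arg(build)
  chr   <- map[[paste0("chromosome_name_", build)]]
  start <- map[[paste0("start_position_", build)]]
  end   <- map[[paste0("end_position_", build)]]
  if (is.null(chr) || is.null(start) || is.null(end)) {
    stop("coordinate columns for build '", build, "' not found")
  }
  sprintf("chr%s:%d-%d", chr, as.integer(start), as.integer(end))
}

# Made-up coordinates, for illustration only.
toy_cn_map <- data.frame(chromosome_name_hg19 = c(1, 17),
                         start_position_hg19 = c(1000, 500),
                         end_position_hg19 = c(2000, 900))
toy_regions <- region_strings(toy_cn_map, "hg19")
print(toy_regions)

# With the real data, assuming the package is installed.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data("Map.CN", package = "iC10TrainingData")
  print(head(region_strings(Map.CN, "hg19")))
}
```

Asking for a build whose columns are absent raises an error instead of silently returning empty strings.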

Probe mapping of the Expression features of the training set

Description

Probe mapping of the Expression features of the training set

Usage

data(Map.Exp)

Format

A data frame with 711 observations on the following 14 variables.

Probe_ID: a character vector with the Illumina probe IDs that flank the features
Gene_symbol: a factor with the HUGO gene symbols
Ensembl_ID: a factor with the Ensembl IDs
Cytoband: a factor with the cytobands (on hg18)
Genomic_location_hg18: a factor with the genomic locations on hg18
chromosome_name_hg18: a numeric vector with the chromosome on hg18
start_position_hg18: a numeric vector with the start position on hg18
end_position_hg18: a numeric vector with the end position on hg18
Synonyms_0: a character vector with the gene name synonyms of the feature
Gene.Chosen: a character vector (YES or NO) specifying the probe chosen for gene-based selection
Genomic_location_hg19: a factor with the genomic locations on hg19
chromosome_name_hg19: a numeric vector with the chromosome on hg19
start_position_hg19: a numeric vector with the start position on hg19
end_position_hg19: a numeric vector with the end position on hg19

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(Map.Exp)
head(Map.Exp)
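
One use of this mapping is to check how many of the expression features are present in a new dataset before classification. The following is a rough sketch of a probe-level check and does not reproduce the matching done by iC10; `feature_coverage` is a hypothetical helper and all IDs in the toy example are made up.

```r
# Hypothetical helper: report how many probes in `map` appear in `user_ids`.
feature_coverage <- function(user_ids, map) {
  probes <- as.character(map$Probe_ID)
  found <- probes %in% user_ids
  list(n_total = length(probes),
       n_found = sum(found),
       missing = probes[!found])
}

# Made-up probe IDs standing in for Map.Exp and for a user's row names.
toy_exp_map <- data.frame(Probe_ID = c("P1", "P2", "P3", "P4"),
                          stringsAsFactors = FALSE)
toy_user_ids <- c("P2", "P4", "P9")
toy_cov <- feature_coverage(toy_user_ids, toy_exp_map)
print(toy_cov)
```

With the real mapping, `user_ids` would typically be the row names of the expression matrix to be classified.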

Copy number data for the training set

Description

Copy number data for the training set

Usage

data(train.CN)

Format

A matrix with 714 rows and 997 columns. Rows are features and columns are training samples.

Details

Each row corresponds to one copy number feature for all samples in the training set. Note that it includes all features in the classifier. Note also that, depending on the data available and the type of matching (gene or probe) only some of the features will be used.

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(train.CN)
summary(train.CN)
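
Note that `summary()` on a matrix summarises each column, which here means one summary per sample. A per-feature view can be sketched by applying a summary over rows instead. `feature_summary` below is a hypothetical helper and the toy matrix is made up.

```r
# Hypothetical helper: one row of summary statistics per feature (matrix row).
feature_summary <- function(mat) {
  t(apply(mat, 1, function(x) {
    c(min = min(x, na.rm = TRUE),
      median = median(x, na.rm = TRUE),
      max = max(x, na.rm = TRUE),
      n_missing = sum(is.na(x)))
  }))
}

# Made-up features x samples matrix with one missing value.
toy_cn <- matrix(c(1, 2, 3,
                   4, NA, 6),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("f1", "f2"), c("S1", "S2", "S3")))
toy_feat <- feature_summary(toy_cn)
print(toy_feat)

# With the real data, assuming the package is installed.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data("train.CN", package = "iC10TrainingData")
  print(head(feature_summary(train.CN)))
}
```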

Expression data for the training set.

Description

Expression data for the training set.

Usage

data(train.Exp)

Format

A matrix with 714 rows and 997 columns. Rows are features and columns are training samples.

Details

Each row corresponds to one expression feature for all samples in the training set. Note that it includes all features in the classifier. Note also that, depending on the data available and the type of matching (gene or probe), only some of the features will be used.

Source

Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.

Examples

data(train.Exp)
summary(train.Exp)
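
The classifier trained on these data is a shrunken-centroid model (pamr), which starts from the mean of each feature within each class. The sketch below computes only those plain, unshrunken class centroids and is not pamr's implementation. `class_centroids` is a hypothetical helper, the toy data are made up, and the real-data call assumes the labels are stored in the same sample order as the matrix columns.

```r
# Hypothetical helper: per-class mean of every feature (rows = features,
# columns = classes). No shrinkage is applied.
class_centroids <- function(mat, labels) {
  labels <- droplevels(as.factor(labels))
  stopifnot(ncol(mat) == length(labels))
  groups <- levels(labels)
  cents <- lapply(groups, function(g) {
    rowMeans(mat[, labels == g, drop = FALSE], na.rm = TRUE)
  })
  names(cents) <- groups
  do.call(cbind, cents)
}

# Made-up features x samples matrix with two classes of two samples each.
toy_exp <- matrix(c(1, 3, 5, 7,
                    2, 2, 4, 8),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("f1", "f2"), paste0("S", 1:4)))
toy_classes <- factor(c("1", "1", "2", "2"))
toy_cents <- class_centroids(toy_exp, toy_classes)
print(toy_cents)

# With the real data, assuming the package is installed and that labels and
# columns are in the same sample order.
if (requireNamespace("iC10TrainingData", quietly = TRUE)) {
  data(list = c("train.Exp", "IntClustMemb"), package = "iC10TrainingData")
  print(dim(class_centroids(train.Exp, IntClustMemb)))
}
```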
