| Title: | Training Datasets for iC10 Package |
|---|---|
| Description: | Training datasets for iC10, which implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package. |
| Authors: | Oscar M Rueda and Jose Antonio Seoane Fernandez |
| Maintainer: | Oscar M. Rueda <[email protected]> |
| License: | GPL-3 |
| Version: | 2.0.1 |
| Built: | 2026-06-07 09:05:08 UTC |
| Source: | https://github.com/cran/iC10TrainingData |
The DESCRIPTION file:

| Package: | iC10TrainingData |
|---|---|
| Type: | Package |
| Title: | Training Datasets for iC10 Package |
| Version: | 2.0.1 |
| Date: | 2024-07-16 |
| Author: | Oscar M Rueda and Jose Antonio Seoane Fernandez |
| Maintainer: | Oscar M. Rueda <[email protected]> |
| Description: | Training datasets for iC10, which implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package. |
| License: | GPL-3 |
| Packaged: | 2024-07-16 07:15:09 UTC; oscar |
| NeedsCompilation: | no |
| Depends: | R (>= 3.5.0) |
| Repository: | https://rueda-lab.r-universe.dev |
| Date/Publication: | 2024-07-17 02:45:56 UTC |
| RemoteUrl: | https://github.com/cran/iC10TrainingData |
| RemoteRef: | HEAD |
| RemoteSha: | 1e41b8cc1497e8183f2f07ae0f3aca4522734642 |
Index of help topics:

| Topic | Description |
|---|---|
| iC10TrainingData-package | Training Datasets for iC10 Package |
| IntClustMemb | Class Membership for the training set |
| Map.All | Probe mapping of the complete set of features of the training set |
| Map.CN | Probe mapping of the copy number features of the training set. |
| Map.Exp | Probe mapping of the Expression features of the training set |
| train.CN | Copy number data for the training set |
| train.Exp | Expression data for the training set. |
Training datasets for iC10, which implements the classifier described in the METABRIC paper 'The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups' (Curtis et al., Nature 2012). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group.
Oscar M Rueda and Jose Antonio Seoane Fernandez
Maintainer: Oscar M. Rueda <[email protected]>
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
Tibshirani et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS 2002; 99(10):6567-6572.
iC10
data(train.CN)
data(train.Exp)
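The package description mentions training a pamr classifier on the available features and predicting a group. A minimal sketch of that general train-then-predict pattern follows. It uses simulated toy data in place of train.Exp and IntClustMemb, the class labels "A" and "B" and the threshold value are arbitrary assumptions, and it is not the iC10 package's own code path. The block runs only if the pamr package is installed.

```r
# Sketch only: toy data stands in for the training matrix and class labels.
if (requireNamespace("pamr", quietly = TRUE)) {
  set.seed(1)
  # Toy matrix: 50 features (rows) x 40 samples (columns),
  # the same orientation as the training matrices
  x <- matrix(rnorm(50 * 40), nrow = 50,
              dimnames = list(paste0("feat", 1:50), paste0("s", 1:40)))
  y <- factor(rep(c("A", "B"), each = 20))
  # Shift the first five features in class B so the classes differ
  x[1:5, y == "B"] <- x[1:5, y == "B"] + 2

  # Fit nearest shrunken centroids, then predict classes at one threshold
  fit <- pamr::pamr.train(list(x = x, y = y))
  pred <- pamr::pamr.predict(fit, newx = x, threshold = 1, type = "class")
  print(table(predicted = pred, truth = y))
}
```

With the real data, x would be a feature-by-sample matrix such as train.Exp and y the matching class labels, restricted to whichever features the new data set also provides.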
iC10 assignment for the Metabric training dataset (997 samples).
data(IntClustMemb)
The format is: Factor w/ 10 levels "1","2","3","4",..: 2 9 3 3 8 6 7 7 7 3 ... - attr(*, "names")= chr [1:997] "MB.0135" "MB.0167" "MB.0136" "MB.3403" ...
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(IntClustMemb)
barplot(table(IntClustMemb))
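Since IntClustMemb is documented as a named factor, group sizes and per-sample lookups need only base R. The sketch below uses a four-sample stand-in built from the first values shown in the format description. The level set "1" to "10" is an assumption based on the truncated level listing. With the package installed, data(IntClustMemb) can replace the toy object.

```r
# Toy stand-in with the documented structure: a factor named by sample ID.
# Levels "1".."10" are assumed from the truncated listing in the format text.
memb <- factor(c("2", "9", "3", "3"), levels = as.character(1:10))
names(memb) <- c("MB.0135", "MB.0167", "MB.0136", "MB.3403")

# Number of samples per group (empty groups are kept as zero counts)
counts <- table(memb)
print(counts)

# Group of a single sample, looked up by name
group_0167 <- as.character(memb[["MB.0167"]])
print(group_0167)

# Sample IDs belonging to one group
in_group_3 <- names(memb)[memb == "3"]
print(in_group_3)
```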
Probe mapping of the complete set of features of the training set
data(Map.All)
A data frame with 714 observations on the following 10 variables:

| Variable | Description |
|---|---|
| Probe_ID | a character vector with the Illumina probe IDs that flank the features |
| Gene_symbol | a factor with the HUGO gene names |
| Ensembl_ID | a factor with the Ensembl IDs |
| Cytoband | a factor with the cytobands (on hg18) |
| Genomic_location_hg18 | a factor with the genomic locations on hg18 |
| chromosome_name_hg18 | a numeric vector with the chromosome on hg18 |
| start_position_hg18 | a numeric vector with the start position on hg18 |
| end_position_hg18 | a numeric vector with the end position on hg18 |
| Synonyms_0 | a character vector with the gene name synonyms of the feature |
| Gene.Chosen | a character vector (YES or NO) specifying the probe chosen for gene-based selection |
| Genomic_location_hg19 | a factor with the genomic locations on hg19 |
| chromosome_name_hg19 | a numeric vector with the chromosome on hg19 |
| start_position_hg19 | a numeric vector with the start position on hg19 |
| end_position_hg19 | a numeric vector with the end position on hg19 |
| chromosome_name_hg38 | a numeric vector with the chromosome on hg38 |
| start_position_hg38 | a numeric vector with the start position on hg38 |
| end_position_hg38 | a numeric vector with the end position on hg38 |
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(Map.All)
head(Map.All)
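The Gene.Chosen variable marks the probe selected for gene-based matching. The following sketch shows one way to filter on it. The data frame is a toy stand-in that reuses a few of the documented column names, and every probe ID, gene symbol and chromosome value in it is hypothetical.

```r
# Toy stand-in for Map.All; all values below are made up for illustration.
map <- data.frame(
  Probe_ID = c("probe_1", "probe_2", "probe_3", "probe_4"),
  Gene_symbol = factor(c("GENE_A", "GENE_A", "GENE_B", "GENE_C")),
  chromosome_name_hg19 = c(1, 1, 8, 17),
  Gene.Chosen = c("YES", "NO", "YES", "YES"),
  stringsAsFactors = FALSE
)

# Keep only the probes flagged for gene-based selection
chosen <- map[map$Gene.Chosen == "YES", ]
print(chosen)

# Count the chosen probes per chromosome (hg19 coordinates)
per_chr <- table(chosen$chromosome_name_hg19)
print(per_chr)
```

In this toy table each gene keeps exactly one probe after filtering. Whether that holds for the real mapping should be checked on the data rather than assumed.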
Probe mapping of the copy number features of the training set.
data(Map.CN)
A data frame with 38 observations on the following 8 variables.

| Variable | Description |
|---|---|
| Probe_ID | a character vector with the Illumina probe IDs that flank the features |
| Gene_symbol | a factor with the HUGO gene names |
| Ensembl_ID | a factor with the Ensembl IDs |
| Cytoband | a factor with the cytobands (on hg18) |
| Genomic_location_hg18 | a factor with the genomic locations on hg18 |
| chromosome_name_hg18 | a numeric vector with the chromosome on hg18 |
| start_position_hg18 | a numeric vector with the start position on hg18 |
| end_position_hg18 | a numeric vector with the end position on hg18 |
| Genomic_location_hg19 | a factor with the genomic locations on hg19 |
| chromosome_name_hg19 | a numeric vector with the chromosome on hg19 |
| start_position_hg19 | a numeric vector with the start position on hg19 |
| end_position_hg19 | a numeric vector with the end position on hg19 |
| chromosome_name_hg38 | a numeric vector with the chromosome on hg38 |
| start_position_hg38 | a numeric vector with the start position on hg38 |
| end_position_hg38 | a numeric vector with the end position on hg38 |
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(Map.CN)
head(Map.CN)
Probe mapping of the Expression features of the training set
data(Map.Exp)
A data frame with 711 observations on the following 10 variables.

| Variable | Description |
|---|---|
| Probe_ID | a character vector with the Illumina probe IDs that flank the features |
| Gene_symbol | a factor with the HUGO gene names |
| Ensembl_ID | a factor with the Ensembl IDs |
| Cytoband | a factor with the cytobands (on hg18) |
| Genomic_location_hg18 | a factor with the genomic locations on hg18 |
| chromosome_name_hg18 | a numeric vector with the chromosome on hg18 |
| start_position_hg18 | a numeric vector with the start position on hg18 |
| end_position_hg18 | a numeric vector with the end position on hg18 |
| Synonyms_0 | a character vector with the gene name synonyms of the feature |
| Gene.Chosen | a character vector (YES or NO) specifying the probe chosen for gene-based selection |
| Genomic_location_hg19 | a factor with the genomic locations on hg19 |
| chromosome_name_hg19 | a numeric vector with the chromosome on hg19 |
| start_position_hg19 | a numeric vector with the start position on hg19 |
| end_position_hg19 | a numeric vector with the end position on hg19 |
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(Map.Exp)
head(Map.Exp)
Copy number data for the training set
data(train.CN)
A matrix with 714 rows and 997 columns. Rows are features and columns are training samples.
Each row corresponds to one copy number feature for all samples in the training set. Note that it includes all features in the classifier. Note also that, depending on the data available and the type of matching (gene or probe), only some of the features will be used.
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(train.CN)
summary(train.CN)
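The details above note that only some features are used, depending on what the new data provides. A minimal sketch of that idea is to restrict both matrices to their shared features before any training. It assumes the row names of the matrices identify the features, which this page does not state, and it uses toy matrices with hypothetical feature and sample names. It is not the matching procedure of the iC10 package itself.

```r
# Toy training matrix: 4 features (rows) x 3 samples (columns)
train <- matrix(seq_len(12), nrow = 4,
                dimnames = list(c("f1", "f2", "f3", "f4"),
                                c("MB.a", "MB.b", "MB.c")))

# Toy new data set that covers only some training features, plus one extra
new_data <- matrix(0, nrow = 3, ncol = 2,
                   dimnames = list(c("f2", "f4", "f9"), c("new1", "new2")))

# Features present in both matrices, in training-set order
shared <- intersect(rownames(train), rownames(new_data))

# Restrict both matrices to the shared features, keeping matrix shape
train_sub <- train[shared, , drop = FALSE]
new_sub <- new_data[shared, , drop = FALSE]

print(shared)
print(dim(train_sub))
```

Using drop = FALSE keeps the result a matrix even when a single feature is shared, so later steps that expect a feature-by-sample matrix still work.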
Expression data for the training set.
data(train.Exp)
A matrix with 714 rows and 997 columns. Rows are features and columns are training samples.
Each row corresponds to one expression feature for all samples in the training set. Note that it includes all features in the classifier. Note also that, depending on the data available and the type of matching (gene or probe), only some of the features will be used.
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
data(train.Exp)
summary(train.Exp)