Title: | A Copy Number and Expression-Based Classifier for Breast Tumours |
---|---|
Description: | Implementation of the classifier described in the paper Ali HR et al (2014) <doi:10.1186/s13059-014-0431-1>. It uses copy number and/or expression data from breast cancer samples, trains Tibshirani's 'pamr' classifier with the features available and predicts the iC10 group. |
Authors: | Oscar M Rueda [aut, cre] |
Maintainer: | Oscar M Rueda <[email protected]> |
License: | GPL-3 |
Version: | 2.0.2 |
Built: | 2024-11-17 05:10:13 UTC |
Source: | https://github.com/cran/iC10 |
The DESCRIPTION file:
Package: | iC10 |
Type: | Package |
Title: | A Copy Number and Expression-Based Classifier for Breast Tumours |
Version: | 2.0.2 |
Date: | 2024-07-16 |
Authors@R: | person("Oscar M", "Rueda", , "[email protected]", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-0008-4884")) |
Maintainer: | Oscar M Rueda <[email protected]> |
Description: | Implementation of the classifier described in the paper Ali HR et al (2014) <doi:10.1186/s13059-014-0431-1>. It uses copy number and/or expression data from breast cancer samples, trains Tibshirani's 'pamr' classifier with the features available and predicts the iC10 group. |
License: | GPL-3 |
Imports: | pamr, impute, iC10TrainingData |
Packaged: | 2024-07-19 06:32:13 UTC; oscar |
NeedsCompilation: | no |
Date/Publication: | 2024-07-19 09:00:26 UTC |
Author: | Oscar M Rueda [aut, cre] (<https://orcid.org/0000-0003-0008-4884>) |
Repository: | https://rueda-lab.r-universe.dev |
RemoteUrl: | https://github.com/cran/iC10 |
RemoteRef: | HEAD |
RemoteSha: | 6aaca03625bf9bc41acec4d9e6db1888c1422bd8 |
Index of help topics:
compare | Compare results of the iC10 classifier |
getCNfeatures | Internal function for matching copy number features. |
getExpfeatures | Internal function for matching expression features. |
goodnessOfFit | Goodness of fit results of the iC10 classifier |
iC10 | A copy number and expression-based classifier for breast cancers |
iC10-package | A Copy Number and Expression-Based Classifier for Breast Tumours |
matchFeatures | Matching features from the classifier to the test data. |
normalizeFeatures | Normalization of expression features |
plot.iC10 | Plot results of the iC10 classifier |
print.iC10 | Print results of the iC10 classifier |
summary.iC10 | Summary results of the iC10 classifier |
iC10 implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression data from breast cancer samples, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group.
Oscar M Rueda [aut, cre] (<https://orcid.org/0000-0003-0008-4884>)
Maintainer: Oscar M Rueda <[email protected]>
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352. Tibshirani et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS 2002; 99(10):6567-6572.
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    summary(res)
    goodnessOfFit(res, newdata=features)
    compare(res, iC10=1:2, newdata=features)
    compare(res, iC10=2:4, newdata=features)
This function plots the centroids of the training set versus the average profiles of the new data classified in each group.
    compare(obj, iC10=1:10, newdata, name.test="Test", ...)

    ## S3 method for class 'iC10'
    compare(obj, iC10=1:10, newdata, name.test="Test", ...)
obj | An object of class iC10 |
iC10 | Groups to plot |
newdata | Set of features of the new data to compare. They must be the same samples classified and contained in obj |
name.test | Name of the new data set to appear in the text of the plot |
... | Additional arguments passed to |
A figure is produced with two plots for each group requested.
Oscar M. Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
iC10, plot.iC10, matchFeatures, normalizeFeatures
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    compare(res, 1:3, newdata=features)
This function should not be called directly
    getCNfeatures(CN, Probes, Map, by.feat, ref, Synonyms)
CN | CN features matrix |
Probes | Vector with the probes to match |
Map | data.frame with the genomic description of the features to match |
by.feat | "probe" or "gene", indicating if match should be done by probe position or gene name. |
ref | hg18 or hg19 (only relevant if matching is done by probe position). |
Synonyms | data.frame with available synonym gene names to match (only relevant if matching is done by gene name). |
A matrix with the copy number features
Oscar M Rueda
Internal function for matching expression features.
    getExpfeatures(Exp, Probes, Synonyms, by.feat)
Exp | Matrix of expression features |
Probes | Vector of probes to match |
Synonyms | Vector of synonyms for gene names |
by.feat | Either "probe" or "gene" |
A matrix with the Probes in Exp.
This function is not meant to be called directly; use matchFeatures instead.
Oscar M Rueda
Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
Goodness of fit results of the iC10 classifier: this function computes correlations between the signatures of the training dataset and the classified features.
    goodnessOfFit(obj, iC10=1:10, newdata=NULL, ...)

    ## S3 method for class 'iC10'
    goodnessOfFit(obj, iC10=1:10, newdata=NULL, ...)
obj | An object of class iC10 |
iC10 | Groups to compute goodness of fit. |
newdata | The feature data to compute the goodness of fit. Must be the samples classified in obj |
... | Additional arguments passed to |
It prints the correlation for each iC10 group.
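To make the idea of correlating training signatures with classified features concrete, here is a minimal base R sketch on toy data. It is one plausible reading of the description above, not the package's implementation: the objects `centroids`, `samples` and `classes` are hypothetical, and the choice of correlating each centroid with the mean profile of its assigned samples is an assumption.

```r
# Toy centroids: 6 features x 3 groups (hypothetical values).
set.seed(1)
centroids <- matrix(rnorm(6 * 3), nrow = 6,
                    dimnames = list(paste0("f", 1:6), paste0("g", 1:3)))

# Toy classified samples: each sample is its group's centroid plus small noise.
classes <- rep(colnames(centroids), each = 4)
samples <- centroids[, classes] + matrix(rnorm(6 * 12, sd = 0.1), nrow = 6)
colnames(samples) <- paste0("s", seq_along(classes))

# For each group, correlate the centroid with the mean profile of the
# samples assigned to that group (assumed definition of "goodness of fit").
group_cor <- sapply(colnames(centroids), function(g) {
  mean_profile <- rowMeans(samples[, classes == g, drop = FALSE])
  cor(centroids[, g], mean_profile)
})
print(round(group_cor, 3))
```

A correlation close to 1 for a group suggests that the samples assigned to it resemble the training signature; low values would point to a poor fit.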
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
iC10
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    goodnessOfFit(res, newdata=features)
iC10 implements the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression data from breast cancer samples, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group.
    iC10(x, seed=25435)
x | An object with class |
seed | Seed to initialize the random number generator. It is passed to |
This function trains a pamr classifier and predicts the iC10 group for the set of samples. The shrinkage parameter is chosen by cross-validation, so different runs can give different results unless a seed is specified.
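The effect of the seed can be illustrated with a small base R sketch. It does not call pamr; `assign_folds` is a hypothetical helper that only mimics the random assignment of samples to cross-validation folds, which is the step that makes a cross-validated parameter vary between runs.

```r
# Hypothetical helper: assign n samples to k cross-validation folds at random.
assign_folds <- function(n, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample(rep(seq_len(k), length.out = n))
}

# With the same seed, the fold assignment (and hence anything tuned on it,
# such as a shrinkage threshold) is identical across runs.
folds_a <- assign_folds(20, 5, seed = 25435)
folds_b <- assign_folds(20, 5, seed = 25435)
identical(folds_a, folds_b)
```

Calling the helper without a seed would generally give a different assignment each time, which is why a fixed seed is needed for reproducible classification.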
An object of class iC10. A list with the following elements:
class | Predicted classes for the samples |
posterior | Probabilities for each sample to belong to each of the 10 groups |
centroids | Shrunken centroids for each of the 10 groups. |
fitted | Normalized features for the samples classified. |
map.cn | Annotation data for the copy number features |
map.exp | Annotation data for the expression features |
Oscar M. Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352. Tibshirani et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS 2002; 99(10):6567-6572.
See pamr.train, pamr.cv and pamr.predict in package pamr.
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
This function matches the available copy number and/or expression features to the training signatures, using either genomic position or HUGO gene name for copy number features and either Illumina probe name or HUGO gene name for expression features.
    matchFeatures(CN = NULL, Exp = NULL, CN.by.feat = c("gene", "probe"),
                  Exp.by.feat = c("gene", "probe"), ref="hg19")
CN | Data must be log2 copy number ratios. Two formats are allowed: - a matrix where each row represents a gene and each column a sample. In this case |
Exp | Matrix with the expression data to classify. Each row must be a gene or an Illumina probe, and each column must correspond to a sample. Rownames must be either Illumina probes, in which case |
CN.by.feat | Either "probe" or "gene". Default is "probe". |
Exp.by.feat | Either "probe" or "gene". Default is "gene". |
ref | Either "hg18", "hg19" or "hg38". It is used to match the copy number probes if |
At least one of CN or Exp must be non-NULL.
If matching is done by gene, the HGNC gene name is used to match the row names of the features. A list of synonym gene names is also used (see Map.All).
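Matching by gene name with a synonym list can be sketched in base R as follows. This is only an illustration under stated assumptions: the `synonyms` table and its column names `alias` and `symbol` are hypothetical and do not reflect the actual structure of Map.All or the package's matching code.

```r
# Hypothetical training gene names and a synonym table (alias -> official).
train_genes <- c("ERBB2", "ESR1", "CCND1")
synonyms <- data.frame(alias = c("HER2", "NEU", "ER"),
                       symbol = c("ERBB2", "ERBB2", "ESR1"),
                       stringsAsFactors = FALSE)

# Row names of the data to classify; HER2 is an alias, FOO is unknown.
test_genes <- c("HER2", "ESR1", "FOO")

# Translate aliases to official symbols, leaving other names untouched.
idx <- match(test_genes, synonyms$alias)
resolved <- ifelse(is.na(idx), test_genes, synonyms$symbol[idx])

# Keep only the training features that can be found in the test data.
matched <- intersect(train_genes, resolved)
matched
```

Features of the training set that cannot be matched (here CCND1) would simply be absent from the classifier, which is why the training set depends on the features matched.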
For copy number features matched by probe, the maximum log ratio in absolute value inside the limits of the feature is used. If there is no copy number probe in that region, the value of the preceding probe is used.
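The rule for summarising probe-level copy number within a feature can be sketched as below. `feature_value` is a hypothetical helper written for illustration, not a function of the package; it assumes probes sorted by position on a single chromosome, returns the signed value whose absolute log ratio is largest, and returns NA when there is no preceding probe (both of these choices are assumptions).

```r
# Hypothetical helper: summarise probe-level log2 ratios for one feature.
# pos, value: sorted probe positions and log2 ratios for one sample;
# start, end: genomic limits of the feature.
feature_value <- function(pos, value, start, end) {
  inside <- which(pos >= start & pos <= end)
  if (length(inside) > 0) {
    # Keep the signed value with the largest absolute log ratio.
    return(value[inside[which.max(abs(value[inside]))]])
  }
  before <- which(pos < start)
  if (length(before) == 0) return(NA_real_)  # no preceding probe (assumption)
  value[max(before)]  # fall back to the probe just before the feature
}

pos   <- c(100, 200, 300, 400, 900)
value <- c(0.1, -0.8, 0.3, 0.05, 0.6)
feature_value(pos, value, 150, 350)  # probes at 200 and 300 fall inside
feature_value(pos, value, 500, 800)  # no probe inside: uses the probe at 400
```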
A list with the following elements is returned:
CN | copy number data to classify |
train.CN | copy number training data |
Exp | expression data to classify |
train.Exp | expression training data |
train.iC10 | iC10 assignments for the training data |
map.cn | annotation data for the copy number features |
map.exp | annotation data for the expression features |
Note that the training set will differ depending on the features matched. Genomic annotation for the training dataset has been obtained from Mark Dunning's illuminaHumanv3.db package.
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe", ref="hg18")
    str(features)
Normalization of expression features. Several methods available in the package CONOR can be used.
    normalizeFeatures(x, method=c("none", "scale"))
x | An object result of a call to |
method | One of "none" (no normalization is done) or "scale" (each expression feature is scaled to have zero mean and standard deviation 1). |
No further normalization is needed on the copy number, as log2 ratios are comparable between platforms.
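The "scale" method described above corresponds to standardising each feature across samples. A minimal base R sketch on a toy matrix follows; it illustrates the transformation only and makes no claim about how the package applies it to the training and test sets. The matrix `expr` is hypothetical.

```r
# Toy expression matrix: rows are features, columns are samples.
set.seed(42)
expr <- matrix(rnorm(5 * 8, mean = 7, sd = 2), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:8)))

# scale() standardises columns, so transpose to standardise each feature (row).
expr_scaled <- t(scale(t(expr)))

round(rowMeans(expr_scaled), 10)  # each feature now has mean 0
apply(expr_scaled, 1, sd)         # and standard deviation 1
```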
A list in the same format as the one returned by matchFeatures, but with train.Exp and Exp normalized.
As the CONOR package is no longer maintained, its methods are temporarily unavailable. We will include more normalization methods in the next version of this package.
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe", ref="hg18")
    features <- normalizeFeatures(features, "scale")
Plot results of the iC10 classifier, in two different formats: either the signatures of the training set or the signatures of the new data classified.
    ## S3 method for class 'iC10'
    plot(x, sample.name=1, newdata = NULL, ...)
x | An object of class iC10 |
sample.name | Number of sample to plot (if |
newdata | An object resulting from a call to |
... | Additional arguments passed to |
Two types of plots can be produced. If newdata is NULL, a 6x2 panel is drawn with the 10 profiles of the signatures of the training set, the profile of the features of sample.name, and the distribution of the probabilities of classification to each iC10 group for that sample.
If newdata is not NULL, a 6x2 panel (with the 11th panel empty) is drawn with the 10 profiles of the newdata samples and their distribution into the clusters.
The features are sorted by type: copy number features (if available) are drawn first, in grey, followed by expression features. Within each type, features are sorted by genomic position.
A 6x2 plot is produced.
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
iC10
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    plot(res, sample.name=10)
    plot(res, newdata=features)
Print results of the iC10 classifier
    ## S3 method for class 'iC10'
    print(x, ...)
x | An object of class iC10 |
... | Additional arguments passed to |
It returns a call to str.
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352.
iC10
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    res
Summary results of the iC10 classifier: shows the distribution of samples classified into each iC10 group and a summary of the maximum posterior probability for each sample. Small values pinpoint samples with no clear group assigned.
    ## S3 method for class 'iC10'
    summary(object, ...)
object | An object of class iC10 |
... | Additional arguments passed to |
The function prints a table of the classification and a summary of the maximum posterior probability for each sample.
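The two quantities reported can be reproduced on a toy posterior matrix with base R. This is a sketch of the idea rather than the method's actual code; the matrix `posterior` and its values are hypothetical, and only three groups are used for brevity.

```r
# Toy posterior matrix: 5 samples x 3 groups, rows sum to 1 (hypothetical).
posterior <- rbind(
  s1 = c(0.90, 0.05, 0.05),
  s2 = c(0.10, 0.80, 0.10),
  s3 = c(0.40, 0.35, 0.25),
  s4 = c(0.05, 0.05, 0.90),
  s5 = c(0.20, 0.70, 0.10)
)
colnames(posterior) <- c("1", "2", "3")

# Predicted class: the group with the highest posterior probability.
pred <- factor(colnames(posterior)[max.col(posterior, ties.method = "first")],
               levels = colnames(posterior))
max_post <- apply(posterior, 1, max)

table(pred)        # distribution of samples across groups
summary(max_post)  # low values flag ambiguous samples, such as s3
```

Here sample s3 has a maximum posterior of only 0.40, so its assignment is far less certain than that of the other samples.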
Oscar M Rueda
Ali HR et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biology 2014; 15:431. Curtis et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012; 486:346-352. Tibshirani et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. PNAS 2002; 99(10):6567-6572.
See iC10, and pamr.train, pamr.cv and pamr.predict in package pamr.
    require(iC10TrainingData)
    data(train.CN)
    data(train.Exp)
    features <- matchFeatures(Exp=train.Exp, Exp.by.feat="probe", ref="hg18")
    features <- normalizeFeatures(features, "scale")
    res <- iC10(features)
    summary(res)