Open Collaboration, Shared Success
Focus on Innovation
Yu Hong (俞鸿)
Deputy General Manager
Mobile: [1**********]
E-mail: [email protected]
ESTs (Expressed Sequence Tags) are short single-pass sequence reads from randomly picked clones of a cDNA library. They offer a low-cost alternative to whole-genome sequencing.
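Gene discovery
Complementing genome sequences
Comparative analysis of expression levels
Support for gene structure identification
Alternative splicing analysis
SNP analysis
Proteomics: mass spectrometry database searching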
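cDNA library construction
Library sequencing
Sequence preprocessing
Clustering and assembly
Database matching
Functional annotation
Other analyses: expression profiling, alternative splicing analysis, SSR analysis
Resource database
cDNA library sequencing: 454, Illumina HiSeq 2000, SOLiD, 3730, …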
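cDNA library
The collection of clones formed by ligating all the fragments transcribed by an organism at a given developmental stage into a vector.
Produced by reverse transcription of mRNA
Tissue- and cell-specific
Much smaller than a genomic library
The cDNA is already spliced, with introns removed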
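Library types
Non-normalized cDNA library
Normalized cDNA library
Subtractive cDNA library
Suppression subtractive cDNA library
Library sequencing
Single-direction sequencing
Bidirectional sequencing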
cDNA
Traditional Sanger sequencing: 3730
Second-generation sequencing: Roche/454 Genome Sequencer FLX, Illumina
Analysis of transcriptome sequencing results from traditional sequencing
Phred scores
Q = 20 corresponds to 99% base-calling accuracy
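The relationship between a Phred score and base-calling accuracy can be sketched as follows. The conversion Q = -10 log10(P) is the standard Phred definition; the end-trimming helper and its default threshold of 20 are illustrative assumptions, not the procedure used in the slides.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p."""
    return -10 * math.log10(p)


def trim_low_quality_ends(quals, min_q=20):
    """Hypothetical helper: return (start, end) slice bounds that remain
    after removing bases with quality below min_q from both read ends."""
    start, end = 0, len(quals)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return start, end
```

For example, `phred_to_error_prob(20)` gives an error probability of 0.01, i.e. the 99% accuracy quoted above.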
Vector sequence masking
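Stringent clustering
Produces shorter consensus sequences
Lower coverage of the EST data for expressed genes
Clusters therefore contain fewer distinct transcript forms of the same gene
High sequence fidelity
Example: TIGR Gene Indices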
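Loose clustering
Produces longer consensus sequences
Higher coverage of the EST data for expressed genes
Clusters contain different transcript forms of the same gene, such as alternative splice variants
Each cluster may include transcripts of paralogous expressed genes
Lower sequence fidelity
Example: stackPACK
The UniGene clustering method lies between the two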
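The effect of the clustering threshold can be illustrated with a toy single-linkage clustering. This is only a sketch: the shared k-mer similarity, the thresholds, and the sequences in `EXAMPLE_ESTS` are all hypothetical, and it does not reproduce how TIGR Gene Indices, stackPACK, or UniGene actually cluster.

```python
from itertools import combinations


def shared_kmer_fraction(a: str, b: str, k: int = 4) -> float:
    """Fraction of the smaller k-mer set that is shared by both sequences."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    if not kmers_a or not kmers_b:
        return 0.0
    return len(kmers_a & kmers_b) / min(len(kmers_a), len(kmers_b))


def cluster(seqs: dict, threshold: float, k: int = 4):
    """Single-linkage clustering: two ESTs are joined when their
    similarity is at least `threshold` (union-find bookkeeping)."""
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if shared_kmer_fraction(seqs[a], seqs[b], k) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: est1 and est3 overlap almost completely,
# est2 shares only its first "exon" with est1, est4 is unrelated.
EXAMPLE_ESTS = {
    "est1": "ATGGCCAAGTTCCGGATTAC",
    "est2": "ATGGCCAAGTGGTACCTTGA",
    "est3": "GGCCAAGTTCCGGATTACGG",
    "est4": "CTCTAGAGCATGCAAATCGC",
}

stringent = cluster(EXAMPLE_ESTS, threshold=0.8)
loose = cluster(EXAMPLE_ESTS, threshold=0.4)
```

With the stringent threshold the splice-variant-like `est2` stays in its own cluster; with the loose threshold it is merged with `est1` and `est3`, which mirrors the trade-off between fidelity and coverage described above.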
Assemblies: contigs and singletons
Number
Total length
Length distribution
Contig depth statistics
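The assembly statistics listed above could be computed along these lines. This is a minimal sketch: the function names, the 500 bp bin size, and the approximation of depth as total read bases divided by contig length are assumptions made for illustration.

```python
from collections import Counter


def assembly_summary(contigs: dict, singletons: dict, bin_size: int = 500) -> dict:
    """Count contigs and singletons, sum contig length, and bin contig lengths.

    `contigs` and `singletons` map sequence names to sequence strings.
    """
    contig_lengths = [len(seq) for seq in contigs.values()]
    # Each bin is labelled by its lower bound, e.g. 500 means 500-999 bp.
    bins = Counter((n // bin_size) * bin_size for n in contig_lengths)
    return {
        "n_contigs": len(contigs),
        "n_singletons": len(singletons),
        "contig_total_length": sum(contig_lengths),
        "length_distribution": dict(sorted(bins.items())),
    }


def mean_depth(read_lengths, contig_length: int) -> float:
    """Approximate average depth: total read bases / contig length."""
    return sum(read_lengths) / contig_length
```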
Sequence similarity search
BLAST, BLAT
NR, UniRef100, genome sequences, etc.
Domain and motif search
InterProScan, Pfam
GO functional classification and enrichment analysis
Blast2GO, etc.
Basic statistics
Number of SNPs
SNP frequency
Non-synonymous and synonymous SNPs
Other statistics
S is the number of SNPs detected in the contig, L is the contig sequence length and D is the sequencing depth.
β is useful as a relative measurement to compare the nucleotide diversity between contigs generated within this project.
Criteria: a coding sequence measuring more than 200 bp and an average sequencing depth of at least 10 reads/nt.
Three β parameters were calculated for each contig:
• βT, which estimates the diversity on the entire contig, including its non-coding regions;
• βN, which estimates the diversity in non-synonymous sites;
• βS, which estimates the diversity in synonymous sites.
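The formula for β did not survive in the slide text. The sketch below assumes β has the form of Watterson's estimator with the sequencing depth D standing in for the sample size, i.e. β = S / (L · Σ 1/i for i = 1 … D-1). That form, the rounding of an average depth to an integer, and the function names are assumptions, not something the slides confirm.

```python
def harmonic_number(n: int) -> float:
    """Sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))


def beta(num_snps: int, contig_length: int, depth: int) -> float:
    """Assumed Watterson-style diversity estimate per site.

    num_snps      -- S, SNPs detected in the contig (or in a site class)
    contig_length -- L, number of sites considered
    depth         -- D, sequencing depth as an integer (assumption:
                     a fractional average depth is rounded beforehand)
    """
    if depth < 2:
        raise ValueError("depth must be at least 2")
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return num_snps / (harmonic_number(depth - 1) * contig_length)


def passes_filter(cds_length: int, average_depth: float) -> bool:
    """Selection criteria quoted above: CDS > 200 bp, depth >= 10 reads/nt."""
    return cds_length > 200 and average_depth >= 10
```

Under this assumption, βT, βN and βS would be obtained by calling `beta` with the SNP count and site count of the whole contig, the non-synonymous sites, and the synonymous sites respectively.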
The GS Reference Mapper (454 Life Sciences), PyroBayes
Roche 454 transcriptome data analysis
lake sturgeon (Acipenser fulvescens): the relative merits of normalization …
Hale MC, McCormick CR, Jackson JR, DeWoody JA.
BMC Genomics. 2009 Apr 29;10:203.
PMID: 19402907 [PubMed - indexed for MEDLINE]
5 libraries
Normalized libraries: 1-2
Native libraries: 3-5
Assembler: PCAP, not Newbler
Best BLAST hit: an e-value ≤ 1 × 10^-3 and a bit score > 40 was considered a significant match
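The significance rule above (e-value ≤ 1 × 10^-3 and bit score > 40) can be applied to BLAST tabular output as sketched here. The sketch assumes the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the e-value is column 11 and the bit score is column 12; the function name is hypothetical, and choosing the best hit by highest bit score is an assumption.

```python
def best_significant_hits(lines, max_evalue=1e-3, min_bitscore=40.0):
    """Return {query: (subject, evalue, bitscore)} for each query's best
    hit that satisfies evalue <= max_evalue and bitscore > min_bitscore."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue or bitscore <= min_bitscore:
            continue  # not a significant match
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Usage: pass an open file handle or any iterable of lines, for example `best_significant_hits(open("hits.tsv"))`, where `hits.tsv` is a placeholder file name.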
877 candidate SNPs
~1 SNP/460 bp
one in every 192 bp in Eucalypt and one in 214 bp in maize
Indel-type errors
Classification statistics
18 SNPs were unique to one sex
Males 16, females 2
Ts/Tv ratio
Ts: transitions
Tv: transversions
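A Ts/Tv ratio can be computed from a list of SNP alleles as in this sketch. Transitions are purine-to-purine (A↔G) or pyrimidine-to-pyrimidine (C↔T) changes, and all other substitutions are transversions; the input format of (reference, alternate) pairs and the function names are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Return (transitions, transversions, ratio) for (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    ratio = ts / tv if tv else float("inf")
    return ts, tv, ratio
```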
Every 5.2 reads (on average) resulted in a different significant BLAST hit.
Data format conversion
Map reads onto the genome (8-10 h/sample)
Transcript assembly and quantification
Differential expression test
Pathway mapping
RNA-seq data analysis
Nat Biotechnol. 2009, 27(5):455
Genome Biology 2009, 10:R25
Bioinformatics 2009, 25(9):1105
• Genes with low expression level (FPKM
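For reference, FPKM (fragments per kilobase of transcript per million mapped fragments) can be computed as sketched below. The formula is the standard definition; the function name is hypothetical, and the slide's cutoff for "low expression" is truncated in the source, so no threshold is assumed here.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments).

    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6)
    scaling.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

For example, 100 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments corresponds to an FPKM of 5.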
Thanks to Yu Yao (余曜) and Xu Hao (徐昊)!
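Shanghai Shengzi Information Technology Co., Ltd. (上海生咨信息技术有限公司) is a high-tech company for outsourced biomedical R&D, founded by researchers who returned from Singapore. The company focuses on drug design, bioinformatics services, development and distribution of biological software, and biological database development. Its main business is to provide pharmaceutical R&D companies and research institutes with professional R&D software, drug optimization and screening, validation of drug mechanisms, and experimental design, data analysis and result validation for gene chips, protein chips, mass spectrometry and large-scale sequencing, so that clients receive coherent, integrated consulting and solutions.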
Address: Room 1001, Building 2, Lane 446, Zhaojiabang Road, Fudan Fenglin Science Park, Shanghai
Tel: 021-64261235
Website:
Contact: Yu Hong (俞鸿)
Mobile: [1**********]
E-mail: [email protected]
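Training
454 and Solexa sequencing services
Drug design analysis services
Biomedical platform building services
Software sales
BIOREFER business
Custom services
Biological software and database development
Bioinformatics outsourcing services
Bioinformatics analysis services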