forked from duty-machine/duty-machine
Commit
Showing 1 changed file with 15 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
title: "I'm planning some pan-cancer analysis, so I've bookmarked these ways to find datasets!" | ||
date: 2025-01-18T15:10:44Z | ||
draft: ["false"] | ||
tags: [ | ||
"fetched", | ||
"生信碱移" | ||
] | ||
categories: ["Acdemic"] | ||
--- | ||
I'm planning some pan-cancer analysis, so I've bookmarked these ways to find datasets! by 生信碱移 | ||
------ | ||
<div><section data-tool="markdown编辑器" data-website="https://markdown.com.cn/editor"><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><section><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><img data-imgfileid="100010999" data-ratio="1.0324675324675325" data-src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&wxfrom=5&wx_lazy=1" data-type="gif" data-w="154" src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&wxfrom=5&wx_lazy=1"></section></section></section></section></section><section><section powered-by="xiumi.us"><section><p>Friends, tap the blue text to <strong>follow me</strong></p></section></section></section><section><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><img data-ratio="1.0324675324675325" data-type="gif" data-w="154" data-src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&wxfrom=5&wx_lazy=1" data-imgfileid="100011001" src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&wxfrom=5&wx_lazy=1"></section></section></section></section></section></section></section></section></section><section data-mpa-powered-by="yiban.io" data-style='white-space: normal; max-width: 100%; letter-spacing: 0.544px; text-size-adjust: auto; background-color: rgb(255, 255, 255); font-family: "Helvetica Neue", Helvetica, "Hiragino Sans GB", "Microsoft YaHei", Arial, sans-serif; box-sizing: border-box !important; overflow-wrap: break-word !important;'><section><section><section><section data-id="85660" data-custom="rgb(117, 117, 118)" data-color="rgb(117, 117, 118)"><section data-style="margin-top: 2em; padding-top: 0.5em; padding-bottom: 0.5em; max-width: 
100%; border-style: solid none; text-decoration: inherit; border-top-color: rgb(204, 204, 204); border-bottom-color: rgb(204, 204, 204); border-top-width: 1px; border-bottom-width: 1px; box-sizing: border-box !important; overflow-wrap: break-word !important;"><p><span>生信碱移</span></p><section><strong>Pan-cancer data</strong></section></section></section></section></section></section></section><p><span>For a good study, <span><strong>the most important prerequisite</strong></span> is finding suitable data. When I first started analyzing public databases, I also got by on a handful of keywords, wandering from one major database to the next</span><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"><span>.</span></p><p><span><strong>I have recently been putting together pan-cancer datasets.</strong> The old brute-force approach is, of course, to search databases such as <span><strong>GEO</strong></span> by keyword for relevant datasets:</span></p><section><img data-galleryid="" data-imgfileid="100011783" data-ratio="0.5285714285714286" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNF8icA4lMt6TlNNo9OibLibCqj8autQ2nIm59icLmtiaUibnpd3Cgia06jqDdA/640?wx_fmt=png&from=appmsg" data-type="png" data-w="700" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNF8icA4lMt6TlNNo9OibLibCqj8autQ2nIm59icLmtiaUibnpd3Cgia06jqDdA/640?wx_fmt=png&from=appmsg"></section><p data-class="mbImgTitle"><span>▲ Searching GEO by keyword for pan-cancer datasets.</span></p><p><span><strong>This looks like it turns up plenty of datasets, but it often misses some</strong>, and the hits have problems of their own. For example, dataset quality can be <span><strong>uneven</strong></span>: some lack clinical features, treatment response, or prognosis information, so they have to be screened and integrated by hand. <strong>In short, data wrangling takes too long and trial and error costs too much</strong><strong><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png">.</strong></span></p><section><span><strong>Of course, you can also use </strong><span><strong>Xena</strong></span><strong>, a database that hosts multi-omics data for many cancer types, ready to use once downloaded:</strong></span></section><ul 
data-tool="markdown.com.cn编辑器"><li><section>https://xenabrowser.net/datapages/</section></li></ul><section><img data-galleryid="" data-imgfileid="100011784" data-ratio="0.8212962962962963" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN0LTHRZUmI7yf1I24hGA4Fsia7ZIxSic371JvUfXjLcdgVKUP5dNxOBLA/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN0LTHRZUmI7yf1I24hGA4Fsia7ZIxSic371JvUfXjLcdgVKUP5dNxOBLA/640?wx_fmt=png&from=appmsg"></section><p data-class="mbImgTitle"><span>▲ Integrated multi-omics data for 20 cancer datasets in Xena.</span></p><p data-tool="markdown.com.cn编辑器"><span>I also recently came across the <span><strong>TCGA</strong></span> database's <strong>publications page</strong>:</span></p><ul data-tool="markdown.com.cn编辑器"><li><section>https://gdc.cancer.gov/about-data/publications</section></li></ul><p data-tool="markdown.com.cn编辑器"><span>This page lists a large number of published papers from the TCGA project, nearly all in flagship journals or their sister journals. <span><strong>Most importantly, the data </strong></span><span><strong>used in each paper </strong></span><span><strong>is uploaded to this page!</strong></span><br></span></p><p data-tool="markdown.com.cn编辑器"><span>For example, its <strong>PanCanAtlas collection</strong><strong><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"> </strong>holds the data used in the analyses of <span><strong>21</strong></span> papers:</span></p><p><img data-galleryid="" data-imgfileid="100011785" data-ratio="0.8703703703703703" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOZiacBQJ7AZANRvEYnVr4lYVVAgibEJMIyOFsJ4E8w9ardmHDXZPGxJg/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOZiacBQJ7AZANRvEYnVr4lYVVAgibEJMIyOFsJ4E8w9ardmHDXZPGxJg/640?wx_fmt=png&from=appmsg"></p><section><span>▲TCGA 
PanCanAtlas collection: data from the papers, all in top journals<strong>.</strong></span></section><p data-tool="markdown.com.cn编辑器"><span>Take the first paper above: it covers more than <span><strong>eight thousand</strong></span> tumor samples with <strong>alternative splicing</strong> data:</span></p><p><img data-galleryid="" data-imgfileid="100011788" data-ratio="1.0740740740740742" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN62cQ7nGImapWs5pAicIxkObeWFaQtW3f8LickzexHfYh3ic64ccc6RhvA/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN62cQ7nGImapWs5pAicIxkObeWFaQtW3f8LickzexHfYh3ic64ccc6RhvA/640?wx_fmt=png&from=appmsg"></p><section><span>▲ Alternative splicing data for eight thousand samples, and even protein mass spectrometry!</span></section><p><img data-galleryid="" data-imgfileid="100011790" data-ratio="0.9" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNUG31qv6sWDZXRzDZ40h6OqmBnLX6Z4XTOQq5uYgVvZ5PHvYYSl1icgQ/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNUG31qv6sWDZXRzDZ40h6OqmBnLX6Z4XTOQq5uYgVvZ5PHvYYSl1icgQ/640?wx_fmt=png&from=appmsg"></p><p data-class="mbImgTitle"><span>▲ All the data can be downloaded directly, and it is thoughtfully provided in several formats.</span></p><p data-tool="markdown.com.cn编辑器"><span>You can also filter by year to see the most recently published data, such as the <strong>3D genome</strong> dataset below:</span></p><p><img data-galleryid="" data-imgfileid="100011793" data-ratio="0.9990740740740741" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOvXDnkcvOGWTrib9oFXjWGJBfw6CzJJ3qkmmy1LTxb8Wg0rWSgjdYyg/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOvXDnkcvOGWTrib9oFXjWGJBfw6CzJJ3qkmmy1LTxb8Wg0rWSgjdYyg/640?wx_fmt=png&from=appmsg"></p><p data-class="mbImgTitle"><span>▲ Filtering for data published in 2024; the first entry is a pan-cancer dataset on the 3D genome.</span></p><p><img data-galleryid="" data-imgfileid="100011796" data-ratio="0.812962962962963" data-s="300,640" 
data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNYfGicvG8YNT22oIcJyGtGAIaf8NJXFWSsSWKmbgiaNEcN8QOStfnAXkg/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNYfGicvG8YNT22oIcJyGtGAIaf8NJXFWSsSWKmbgiaNEcN8QOStfnAXkg/640?wx_fmt=png&from=appmsg"></p><section><span>▲ This dataset contains CNV, SV, SNV, and indel variants from whole-genome sequencing (WGS), plus HiChIP data capturing the relationship between 3D chromatin structure and gene regulation.</span></section><p data-tool="markdown.com.cn编辑器"><span>Beyond the options above, you can also search <strong><span>Google Scholar</span></strong> or <strong><span>PubMed</span></strong> directly for related studies. For example, find another pan-cancer study and see which datasets it used. <strong>Below is a pan-cancer analysis published in 2024 by a Fudan University / Shanghai Jiao Tong University team, which drew on quite a few datasets</strong><strong>:</strong></span></p><section><img data-galleryid="" data-imgfileid="100011800" data-ratio="0.5048828125" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNMDHttbytMjlKYaerCEA5QrrowsicJujBU4eeibCAtX1gpjGbQDTMFzXw/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1024" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNMDHttbytMjlKYaerCEA5QrrowsicJujBU4eeibCAtX1gpjGbQDTMFzXw/640?wx_fmt=png&from=appmsg"></section><p data-class="mbImgTitle"><span>▲ DOI: 10.1186/s13046-024-03042-7.</span></p><p><img data-imgfileid="100011801" data-ratio="0.47314814814814815" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNJPhPM2Hcooyic5bBDFzLxGHicm7Y7FRGiaM9vDhgsXkicKUtVOiapqIju9A/640?wx_fmt=png&from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNJPhPM2Hcooyic5bBDFzLxGHicm7Y7FRGiaM9vDhgsXkicKUtVOiapqIju9A/640?wx_fmt=png&from=appmsg"></p><section><span>▲ A 2024 pan-cancer data analysis; the workload is enormous.</span></section><p><strong><span><strong><span><strong><span>Datasets really are not hard to find</span><span> </span></strong><strong><span><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" 
src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"></span></strong></span></strong></span></strong></p><section><span><strong>Start by making the dataset big</strong></span><strong><strong></strong></strong></section><section><span><strong><strong>At least the workload speaks for itself</strong></strong></span></section><section><span><strong><strong><strong>Even the reviewers will have to soften</strong></strong></strong></span></section><section><span><strong>Everyone is welcome to follow<img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"><img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/Expression/[email protected]" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/Expression/[email protected]"><img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Worship.png" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Worship.png"></strong></span></section><section><mp-common-profile data-id="MzkyNTIzMzYyMA==" data-pluginname="mpprofile" data-headimg="http://mmbiz.qpic.cn/mmbiz_png/LvUIqvYKCeXYZNMxRMnjiaicO2a27jDZ2FgQga8TdeQcsGRJRIn2IInkKtfcbbMXOBSViaPXpTOBulUlNzd11pzow/0?wx_fmt=png" data-nickname="生信碱移" data-alias="liudoufu307" data-signature="春来秋至,分享我的所见与所识" data-from="2" data-weuitheme="light"></mp-common-profile></section><section><br></section><section><strong><span>END~</span></strong><br></section><p><span><strong>For followers' reference only</strong></span></p><section><strong>If anything here infringes or is wrong, please get in touch so it can be removed or corrected~</strong></section><section data-class="_mbEditor" data-id="32689"><section><section><section><p><span><strong mpa-from-tpl="t">Recommended reads, click to view:</strong></span></p><p><a target="_blank" 
href="http://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&mid=2247490399&idx=1&sn=bab116f9b25cfef81c690cfa566ec034&chksm=c1c8e3e4f6bf6af2ac22ee98441129f72698139a2fc5cba2607a7cad46449da498e2a37e732f&scene=21#wechat_redirect" textvalue="【教程】一文读懂孟德尔随机化!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">[Tutorial] Mendelian randomization explained in one article!</a><br></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&mid=2247495382&idx=1&sn=33c7576a1cdd370361e3667b4346dcc8&scene=21#wechat_redirect" textvalue="【工具】NBT: 单细胞可解释张量分解算法sclTD!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2"><span>[Tool] Still using non-negative matrix factorization for single-cell data? Generalized binary covariance decomposition plus disease heterogeneity pulls far ahead again! (GBCD package)</span></a></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&mid=2247495398&idx=1&sn=ee8b90bc53b8885d39f9005936984f88&scene=21#wechat_redirect" textvalue="【文献】这篇顶刊纯生信领先常规生信分析一个版本!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2">[Paper] Transformer + graph representation learning: database-only deep learning across 16 cancer types lands in a top journal yet again!</a></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&mid=2247495197&idx=1&sn=c62596642fbefba5bb5c6e7070af84f7&scene=21#wechat_redirect" textvalue="【期刊】无版面费低分 SCI 期刊,近期发表多篇纯网药研究!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2"><span>[Journal] Broad scope! Impact factor 3+, CAS Tier 3, and recently publishing pure bioinformatics studies!</span></a></p></section></section></section></section></section><p><mp-style-type data-value="3"></mp-style-type></p></div> | ||
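
The GEO keyword search described in the article can also be scripted. The following is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint with `db=gds` (GEO DataSets) and JSON output; the helper names `build_geo_search_url`, `parse_esearch`, and `search_geo`, and the example search term, are illustrative assumptions and not something the original article uses.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint; db="gds" targets GEO DataSets.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Build an esearch URL for a keyword query against GEO DataSets."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


def parse_esearch(payload):
    """Return (total hit count, list of record UIDs) from an esearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def search_geo(term, retmax=20):
    """Run the query over the network (hypothetical helper, needs internet access)."""
    with urlopen(build_geo_search_url(term, retmax), timeout=30) as resp:
        return parse_esearch(resp.read().decode("utf-8"))
```

As a usage example, `search_geo('pan-cancer AND "Homo sapiens"[Organism] AND gse[Entry Type]')` would return the hit count and the first 20 UIDs. The search term is only an illustration, and, as the article notes, the hits still have to be checked by hand for clinical annotation and quality.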
<hr> | ||
<a href="https://mp.weixin.qq.com/s/y2gbdm3V3IjoM2gxTtGkUA" target="_blank" rel="noopener noreferrer">Original article link</a> |