Commit

I'm planning some pan-cancer analysis, so I've bookmarked all these ways to find datasets!
ixxmu committed Jan 18, 2025
1 parent 5aaa975 commit 84d3a6c
Showing 1 changed file with 15 additions and 0 deletions.
@@ -0,0 +1,15 @@
---
title: "I'm planning some pan-cancer analysis, so I've bookmarked all these ways to find datasets!"
date: 2025-01-18T15:10:44Z
draft: ["false"]
tags: [
"fetched",
"生信碱移"
]
categories: ["Academic"]
---
I'm planning some pan-cancer analysis, so I've bookmarked all these ways to find datasets! by 生信碱移
------
<div><section data-tool="markdown编辑器" data-website="https://markdown.com.cn/editor"><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><section><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><img data-imgfileid="100010999" data-ratio="1.0324675324675325" data-src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&amp;wxfrom=5&amp;wx_lazy=1" data-type="gif" data-w="154" src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&amp;wxfrom=5&amp;wx_lazy=1"></section></section></section></section></section><section><section powered-by="xiumi.us"><section><p>Hey folks, tap the blue text to <strong>follow me</strong></p></section></section></section><section><section powered-by="xiumi.us"><section><section powered-by="xiumi.us"><section><img data-ratio="1.0324675324675325" data-type="gif" data-w="154" data-src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&amp;wxfrom=5&amp;wx_lazy=1" data-imgfileid="100011001" src="https://mmbiz.qpic.cn/mmbiz_gif/lN9Tp5oiaqHFn9Rg6MwMU3ukMR9ROPh7bf7QWHEMwhUBUwSUKFsV8oK9noHic3jLaeJVQewHJcLq1cTXVAat35Tw/640?wx_fmt=gif&amp;wxfrom=5&amp;wx_lazy=1"></section></section></section></section></section></section></section></section></section><section data-mpa-powered-by="yiban.io" data-style='white-space: normal; max-width: 100%; letter-spacing: 0.544px; text-size-adjust: auto; background-color: rgb(255, 255, 255); font-family: "Helvetica Neue", Helvetica, "Hiragino Sans GB", "Microsoft YaHei", Arial, sans-serif; box-sizing: border-box !important; overflow-wrap: break-word !important;'><section><section><section><section data-id="85660" data-custom="rgb(117, 117, 118)" data-color="rgb(117, 117, 118)"><section data-style="margin-top: 2em; padding-top: 0.5em; 
padding-bottom: 0.5em; max-width: 100%; border-style: solid none; text-decoration: inherit; border-top-color: rgb(204, 204, 204); border-bottom-color: rgb(204, 204, 204); border-top-width: 1px; border-bottom-width: 1px; box-sizing: border-box !important; overflow-wrap: break-word !important;"><p><span>生信碱移</span></p><section><strong>Pan-cancer data</strong></section></section></section></section></section></section></section><p><span>For a good study, <span><strong>the most important prerequisite</strong></span> is finding suitable data. When I first started analyzing public databases, I too just wandered through the major databases armed with a handful of keywords</span><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"><span>.</span></p><p><span><strong>I have recently been planning to put together a collection of pan-cancer datasets</strong>. The old brute-force approach is, of course, to search databases such as <span><strong>GEO</strong></span> by keyword:</span></p><section><img data-galleryid="" data-imgfileid="100011783" data-ratio="0.5285714285714286" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNF8icA4lMt6TlNNo9OibLibCqj8autQ2nIm59icLmtiaUibnpd3Cgia06jqDdA/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="700" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNF8icA4lMt6TlNNo9OibLibCqj8autQ2nIm59icLmtiaUibnpd3Cgia06jqDdA/640?wx_fmt=png&amp;from=appmsg"></section><p data-class="mbImgTitle"><span>▲ Searching GEO by keyword for pan-cancer datasets.</span></p><p><span><strong>Although this seems to turn up plenty of datasets, it often leaves gaps</strong>. For example, dataset quality can be <span><strong>uneven</strong></span>, and some datasets lack clinical features, treatment response, or prognosis information, so they have to be screened and merged by hand. <strong>In short, curation takes too long and trial and error costs too much</strong><strong><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png">.</strong></span></p><section><span><strong>Of course, you can also use </strong><span><strong>Xena</strong></span><strong>, a database that hosts multi-omics data for many cancer types, ready to use after download:</strong></span></section><ul 
data-tool="markdown.com.cn编辑器"><li><section>https://xenabrowser.net/datapages/</section></li></ul><section><img data-galleryid="" data-imgfileid="100011784" data-ratio="0.8212962962962963" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN0LTHRZUmI7yf1I24hGA4Fsia7ZIxSic371JvUfXjLcdgVKUP5dNxOBLA/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN0LTHRZUmI7yf1I24hGA4Fsia7ZIxSic371JvUfXjLcdgVKUP5dNxOBLA/640?wx_fmt=png&amp;from=appmsg"></section><p data-class="mbImgTitle"><span>▲ Integrated multi-omics data for 20 cancer datasets in Xena.</span></p><p data-tool="markdown.com.cn编辑器"><span>I also recently discovered that the <span><strong>TCGA</strong></span> database has a <strong>publications page</strong>:</span></p><ul data-tool="markdown.com.cn编辑器"><li><section>https://gdc.cancer.gov/about-data/publications</section></li></ul><p data-tool="markdown.com.cn编辑器"><span>This page lists a large number of published papers from the TCGA project, nearly all of them in flagship journals or their sister journals. <span><strong>Best of all, the data </strong></span><span><strong>used in each paper</strong></span><span><strong> is uploaded to this page!</strong></span><br></span></p><p data-tool="markdown.com.cn编辑器"><span>For example, its <strong>PanCanAtlas collection</strong><strong><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"></strong> includes the data used in the analyses of <span><strong>21</strong></span> papers:</span></p><p><img data-galleryid="" data-imgfileid="100011785" data-ratio="0.8703703703703703" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOZiacBQJ7AZANRvEYnVr4lYVVAgibEJMIyOFsJ4E8w9ardmHDXZPGxJg/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOZiacBQJ7AZANRvEYnVr4lYVVAgibEJMIyOFsJ4E8w9ardmHDXZPGxJg/640?wx_fmt=png&amp;from=appmsg"></p><section><span>▲ Paper data in the TCGA 
PanCanAtlas collection, all from top journals<strong>.</strong></span></section><p data-tool="markdown.com.cn编辑器"><span>The first paper listed above, for instance, covers more than <span><strong>8,000</strong></span> tumor samples with <strong>alternative splicing</strong> data:</span></p><p><img data-galleryid="" data-imgfileid="100011788" data-ratio="1.0740740740740742" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN62cQ7nGImapWs5pAicIxkObeWFaQtW3f8LickzexHfYh3ic64ccc6RhvA/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rN62cQ7nGImapWs5pAicIxkObeWFaQtW3f8LickzexHfYh3ic64ccc6RhvA/640?wx_fmt=png&amp;from=appmsg"></p><section><span>▲ Alternative splicing data for 8,000 samples, plus even protein mass spectrometry!</span></section><p><img data-galleryid="" data-imgfileid="100011790" data-ratio="0.9" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNUG31qv6sWDZXRzDZ40h6OqmBnLX6Z4XTOQq5uYgVvZ5PHvYYSl1icgQ/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNUG31qv6sWDZXRzDZ40h6OqmBnLX6Z4XTOQq5uYgVvZ5PHvYYSl1icgQ/640?wx_fmt=png&amp;from=appmsg"></p><p data-class="mbImgTitle"><span>▲ All of the data can be downloaded directly, and it is thoughtfully provided in several formats.</span></p><p data-tool="markdown.com.cn编辑器"><span>You can also filter by year to see the most recently published data, such as this <strong>3D genome</strong> dataset:</span></p><p><img data-galleryid="" data-imgfileid="100011793" data-ratio="0.9990740740740741" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOvXDnkcvOGWTrib9oFXjWGJBfw6CzJJ3qkmmy1LTxb8Wg0rWSgjdYyg/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNOvXDnkcvOGWTrib9oFXjWGJBfw6CzJJ3qkmmy1LTxb8Wg0rWSgjdYyg/640?wx_fmt=png&amp;from=appmsg"></p><p data-class="mbImgTitle"><span>▲ Filtering for data published in 2024; the first entry is a pan-cancer dataset on the 3D genome.</span></p><p><img data-galleryid="" data-imgfileid="100011796" data-ratio="0.812962962962963" data-s="300,640" 
data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNYfGicvG8YNT22oIcJyGtGAIaf8NJXFWSsSWKmbgiaNEcN8QOStfnAXkg/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNYfGicvG8YNT22oIcJyGtGAIaf8NJXFWSsSWKmbgiaNEcN8QOStfnAXkg/640?wx_fmt=png&amp;from=appmsg"></p><section><span>▲ This dataset contains CNV, SV, SNV, and indel variants from whole-genome sequencing (WGS), as well as HiChIP data, which captures how 3D chromatin structure relates to gene regulation.</span></section><p data-tool="markdown.com.cn编辑器"><span>Beyond the options above, you can also look up related studies directly on <strong><span>Google Scholar</span></strong> or <strong><span>PubMed</span></strong>. For example, find another pan-cancer study and see which datasets it used. <strong>Here is a pan-cancer analysis published last year by a Fudan / Shanghai Jiao Tong University team, which drew on quite a few datasets</strong><strong>:</strong></span></p><section><img data-galleryid="" data-imgfileid="100011800" data-ratio="0.5048828125" data-s="300,640" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNMDHttbytMjlKYaerCEA5QrrowsicJujBU4eeibCAtX1gpjGbQDTMFzXw/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1024" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNMDHttbytMjlKYaerCEA5QrrowsicJujBU4eeibCAtX1gpjGbQDTMFzXw/640?wx_fmt=png&amp;from=appmsg"></section><p data-class="mbImgTitle"><span>▲ DOI: 10.1186/s13046-024-03042-7.</span></p><p><img data-imgfileid="100011801" data-ratio="0.47314814814814815" data-src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNJPhPM2Hcooyic5bBDFzLxGHicm7Y7FRGiaM9vDhgsXkicKUtVOiapqIju9A/640?wx_fmt=png&amp;from=appmsg" data-type="png" data-w="1080" src="https://mmbiz.qpic.cn/sz_mmbiz_png/LvUIqvYKCeWibs9iahRLfy1icEyF9mut7rNJPhPM2Hcooyic5bBDFzLxGHicm7Y7FRGiaM9vDhgsXkicKUtVOiapqIju9A/640?wx_fmt=png&amp;from=appmsg"></p><section><span>▲ A 2024 pan-cancer data analysis; the workload is enormous.</span></section><p><strong><span><strong><span><strong><span>Datasets really are not hard to find</span><span> </span></strong><strong><span><img data-ratio="1" data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-w="128" 
src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"></span></strong></span></strong></span></strong></p><section><span><strong>The first step is to go big on datasets</strong></span><strong><strong></strong></strong></section><section><span><strong><strong>At least the workload speaks for itself</strong></strong></span></section><section><span><strong><strong><strong>Even reviewers will have to go easy</strong></strong></strong></span></section><section><span><strong>Follows are welcome<img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Yellowdog.png"><img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/Expression/[email protected]" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/Expression/[email protected]"><img data-src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Worship.png" data-ratio="1" data-w="128" src="https://res.wx.qq.com/t/wx_fed/we-emoji/res/v1.3.10/assets/newemoji/Worship.png"></strong></span></section><section><mp-common-profile data-id="MzkyNTIzMzYyMA==" data-pluginname="mpprofile" data-headimg="http://mmbiz.qpic.cn/mmbiz_png/LvUIqvYKCeXYZNMxRMnjiaicO2a27jDZ2FgQga8TdeQcsGRJRIn2IInkKtfcbbMXOBSViaPXpTOBulUlNzd11pzow/0?wx_fmt=png" data-nickname="生信碱移" data-alias="liudoufu307" data-signature="春来秋至,分享我的所见与所识" data-from="2" data-weuitheme="light"></mp-common-profile></section><section><br></section><section><strong><span>END~</span></strong><br></section><p><span><strong>For readers' reference only</strong></span></p><section><strong>If anything here infringes or is wrong, please get in touch so it can be removed or corrected.</strong></section><section data-class="_mbEditor" data-id="32689"><section><section><section><p><span><strong mpa-from-tpl="t">Recommended reading (click to view):</strong></span></p><p><a target="_blank" 
href="http://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&amp;mid=2247490399&amp;idx=1&amp;sn=bab116f9b25cfef81c690cfa566ec034&amp;chksm=c1c8e3e4f6bf6af2ac22ee98441129f72698139a2fc5cba2607a7cad46449da498e2a37e732f&amp;scene=21#wechat_redirect" textvalue="【教程】一文读懂孟德尔随机化!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2" hasload="1">[Tutorial] Mendelian randomization explained in one article!</a><br></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&amp;mid=2247495382&amp;idx=1&amp;sn=33c7576a1cdd370361e3667b4346dcc8&amp;scene=21#wechat_redirect" textvalue="【工具】NBT: 单细胞可‍解释张量分解算法sclTD!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2"><span>[Tool] Still using single-cell non-negative matrix factorization? Generalized binary covariance decomposition plus disease heterogeneity pulls far ahead again! (GBCD package)</span></a></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&amp;mid=2247495398&amp;idx=1&amp;sn=ee8b90bc53b8885d39f9005936984f88&amp;scene=21#wechat_redirect" textvalue="【文献】这篇顶刊纯生信领先常规生信分‍析一个版本!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2">[Paper] Transformer + graph representation learning: database-only deep learning across 16 cancer types lands in a top journal yet again!</a></p><p><a target="_blank" href="https://mp.weixin.qq.com/s?__biz=MzkyNTIzMzYyMA==&amp;mid=2247495197&amp;idx=1&amp;sn=c62596642fbefba5bb5c6e7070af84f7&amp;scene=21#wechat_redirect" textvalue="【期刊】无版面费低分 SCI 期刊‍,近期发表多篇纯网药研究!" linktype="text" imgurl="" imgdata="null" data-itemshowtype="0" tab="innerlink" data-linktype="2"><span>[Journal] Broad scope! Impact factor 3+, CAS Tier 3, and recently publishing pure bioinformatics studies!</span></a></p></section></section></section></section></section><p><mp-style-type data-value="3"></mp-style-type></p></div>
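
The keyword searches of GEO and PubMed described in the post can also be scripted. The following is only a minimal sketch, assuming NCBI's public E-utilities ESearch endpoint; the helper names `build_esearch_url` and `fetch_ids` and both search terms are illustrative assumptions, not something the original article provides.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint (serves GEO DataSets as db="gds" and PubMed as db="pubmed").
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for an Entrez database such as 'gds' or 'pubmed'."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def fetch_ids(url: str, timeout: float = 30.0) -> list[str]:
    """Run the query and return the matching record IDs (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Hypothetical search terms; tune them to the cancer types and assays you need.
geo_url = build_esearch_url(
    "gds", "pan-cancer AND Homo sapiens[Organism] AND gse[Entry Type]"
)
pubmed_url = build_esearch_url("pubmed", "pan-cancer[Title] AND multi-omics")

if __name__ == "__main__":
    print(geo_url)
    print(pubmed_url)
    # Uncomment to query NCBI; unauthenticated requests are rate-limited.
    # print(fetch_ids(geo_url))
```

The returned IDs still need the manual screening the post describes (clinical annotations, treatment response, prognosis), so a script like this only speeds up the first pass.
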
<hr>
<a href="https://mp.weixin.qq.com/s/y2gbdm3V3IjoM2gxTtGkUA" target="_blank" rel="noopener noreferrer">Original article</a>
