NBDC Research ID: hum0197.v18

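Study summary

Aims: To elucidate disease pathology through multi-layer omics analysis, to conduct GWAS in the Japanese population and GWAS meta-analyses across multiple populations, and to elucidate the mechanisms underlying severe COVID-19

Methods: Metagenomic sequencing, genome-wide association study (GWAS), small RNA-seq analysis, eQTL analysis

Participants/Materials: Metagenomic sequencing data of the gut microbiome from the Japanese population (95 + 103 + 227 + 30 + 136 individuals)

GWAS data from 198 patients with pulmonary alveolar proteinosis and 395 controls

GWAS data for 220 traits from BioBank Japan (179,000 individuals), UK Biobank (361,000 individuals), and FinnGen (136,000 individuals)

Per-individual miRNA read count data quantified by small RNA-seq analysis of 141 individuals from the Japanese population, and eQTL analysis data obtained by analyzing them together with whole-genome sequencing data

Metagenomic sequencing data from inflammatory bowel disease cases (35 ulcerative colitis cases and 39 Crohn's disease cases) and 40 healthy controls

GWAS data from 133 patients with intracranial germ cell tumors and 762 controls

GWAS data for 9 traits from BioBank Japan (161,801 individuals) and UK Biobank (377,583 individuals)

scRNA-seq data generated from RNA extracted from peripheral blood mononuclear cells (PBMCs) of 30 + 43 COVID-19 patients and 31 + 44 healthy controls in the Japanese population

Metagenome-assembled genomes (MAGs) of microbial genomes, viral genome sequences, and CRISPR spacer sequences

Shotgun sequencing data from 88 Japanese individuals and 73 healthy individuals, and deep shotgun sequencing data from 5 Japanese individuals

GWAS data for 15 traits from BioBank Japan (180,215 individuals) and UK Biobank (377,441 individuals), and meta-analysis data that also include summary statistics from FinnGen, the Breast Cancer Association Consortium (BCAC), and the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) (breast cancer: 648,746 individuals; prostate cancer: 482,080 individuals)

GWAS data from 144 patients with Hunner-type interstitial cystitis and 41,516 controls

GWAS data for the gut microbiome (524 individuals from the Japanese population, 423 microbes)

GWAS data for blood metabolites (362 individuals from the Japanese population, 306 metabolites)

GWAS data for KEGG Gene Ortholog and KEGG Pathway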

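Dataset ID	Type of Data	Criteria	Release Date
JGAS000205	Metagenome	Controlled-access (Type I)	2019/11/15
hum0197.v2.gwas.v1	GWAS of pulmonary alveolar proteinosis	Unrestricted-access	2020/11/27
JGAS000260	Metagenome	Controlled-access (Type I)	2020/11/27
hum0197.v3.gwas.v1	GWAS of 215 traits	Unrestricted-access	2021/03/22
JGAS000316	Metagenome	Controlled-access (Type I)	2021/10/12
JGAS000415	Metagenome	Controlled-access (Type I)	2021/12/10
hum0197.v5.gwas.v1	GWAS of 10 traits	Unrestricted-access	2021/12/21
hum0197.v5.finemap.v1	Fine-mapping of 79 traits	Unrestricted-access	2021/12/21
JGAS000504	miRNA read counts	Controlled-access (Type I)	2022/02/08
hum0197.v6.eqtl.v1	eQTL analysis data	Unrestricted-access	2022/02/08
JGAS000530	Metagenome	Controlled-access (Type I)	2022/05/23
JGAS000531	Metagenome	Controlled-access (Type I)	2022/06/03
hum0197.v9.gwas.GCT.v1	GWAS of intracranial germ cell tumors	Unrestricted-access	2022/06/10
hum0197.v10.gwas.v1	GWAS of 9 traits	Unrestricted-access	2022/06/16
JGAS000543	scRNA-seq data	Controlled-access (Type I)	2022/07/21
hum0197.v12	Microbial MAGs, viral genome sequences, and CRISPR spacer sequences	Unrestricted-access	2022/12/01
JGAS000543 (data added)	Clinical information	Controlled-access (Type I)	2023/02/14
JGAS000593	scRNA-seq data, clinical information	Controlled-access (Type I)	2023/02/14
hum0197.v3.gwas.v1 (data added)	GWAS of 5 traits	Unrestricted-access	2023/02/16
JGAS000600	Metagenome	Controlled-access (Type I)	2023/03/29
hum0197.v16.gwas.v1	GWAS of 15 traits	Unrestricted-access	2023/06/06
hum0197.v17.hic-gwas.v1	GWAS of Hunner-type interstitial cystitis	Unrestricted-access	2023/06/27
hum0197.v18.gwas.v1	GWAS of gut microbiome; GWAS of blood metabolites; GWAS of KEGG Gene Ortholog and KEGG Pathway	Unrestricted-access	2023/10/02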

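* See the release notes for release information.

* An application is required to use controlled-access data. See the application procedure for details.

* When publishing results that include data downloaded from the database, please cite the references listed below, or state in the Acknowledgements that data registered in the NBDC Human Database were used. See the example acknowledgement.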

Molecular data

JGAS000205／JGAS000260／JGAS000316／JGAS000415／JGAS000530／JGAS000531


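Participants/Materials	Japanese population: 95 + 103 + 227 + 30 + 136 individuals; inflammatory bowel disease cases: ulcerative colitis (ICD10: K519), 35 cases, and Crohn's disease (ICD10: K509), 39 cases; healthy controls: 40 individuals
Targets	Metagenome
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 3000, NovaSeq 6000]
Library Source	DNA extracted from stool
Sample Information (if purchased)	-
Library Construction (kit name)	KAPA Hyper Prep Kit
Fragmentation Methods	Ultrasonic fragmentation (Covaris)
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp
Japanese Genotype-phenotype Archive Dataset ID	JGAD000290 (Japanese population: 95 individuals); JGAD000363 (Japanese population: 103 individuals); JGAD000427 (Japanese population: 227 individuals); JGAD000532 (Japanese population: 30 individuals); JGAD000649 (inflammatory bowel disease cases); JGAD000650 (Japanese population: 136 individuals)
Total Data Volume	JGAD000290: 477 GB (fastq); JGAD000363: 408 GB (fastq); JGAD000427: 881.2 GB (fastq); JGAD000532: 106.7 GB (fastq); JGAD000649: 374.6 GB (fastq); JGAD000650: 541.4 GB (fastq)
Comments (Policies)	NBDC policy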

hum0197.v2.gwas.v1


Participants/Materials	Pulmonary alveolar proteinosis (ICD10: J840): 198 cases; controls: 395 individuals
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	Illumina [Infinium Asian Screening Array]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	Infinium Asian Screening Array
Genotype Call Methods (software)	GenomeStudio for genotyping, shapeit2 for haplotype phasing, and minimac3 for imputation
Association Analysis (software)	PLINK2
Filtering Methods	Sample QC: We excluded samples with low genotyping call rates (call rate < 98%) and samples in close genetic relation (PI_HAT > 0.175). We included samples of the estimated East Asian ancestry. Variant QC: We excluded variants with (1) genotyping call rate < 98%, (2) P value for Hardy–Weinberg equilibrium < 1.0 × 10⁻⁶, (3) minor allele count < 5, or (4) > 10% frequency difference with the imputation reference panel.
Marker Number (after QC)	12,153,232 autosomal variants and 242,876 X-chromosomal variants after QC.
NBDC Dataset ID	hum0197.v2.gwas.v1 (Click the Dataset ID above to download the data) Dictionary file
Total Data Volume	390 MB for autosomes (txt.gz) and 19 MB for the X chromosome (txt.gz)
Comments (Policies)	NBDC policy
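
The four variant QC criteria listed for hum0197.v2.gwas.v1 can be read as one exclusion rule per variant. The sketch below is a minimal illustration of that reading and is not the pipeline used by the data providers. The `Variant` fields and the function name are hypothetical, and the "> 10% frequency difference" criterion is assumed to mean an absolute allele frequency difference above 0.10.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary used only for this illustration."""
    variant_id: str
    call_rate: float           # fraction of samples with a genotype call (0 to 1)
    hwe_p: float               # Hardy-Weinberg equilibrium P value
    minor_allele_count: int
    freq: float                # allele frequency in the study samples
    ref_panel_freq: float      # frequency of the same allele in the reference panel


def passes_variant_qc(v, min_call_rate=0.98, min_hwe_p=1.0e-6,
                      min_mac=5, max_freq_diff=0.10):
    """Return True if the variant survives all four exclusion criteria."""
    if v.call_rate < min_call_rate:          # (1) low genotyping call rate
        return False
    if v.hwe_p < min_hwe_p:                  # (2) deviation from HWE
        return False
    if v.minor_allele_count < min_mac:       # (3) too few minor alleles
        return False
    # (4) assumed to be an absolute difference in allele frequency
    if abs(v.freq - v.ref_panel_freq) > max_freq_diff:
        return False
    return True


variants = [
    Variant("var_ok", 0.995, 0.2, 40, 0.21, 0.25),
    Variant("var_low_call", 0.97, 0.2, 40, 0.21, 0.25),
    Variant("var_hwe", 0.995, 1.0e-8, 40, 0.21, 0.25),
    Variant("var_mac", 0.995, 0.2, 3, 0.21, 0.25),
    Variant("var_freq", 0.995, 0.2, 40, 0.40, 0.25),
]
kept = [v.variant_id for v in variants if passes_variant_qc(v)]
```

Only `var_ok` survives in this toy input, since each of the other records violates exactly one criterion.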

hum0197.v3.gwas.v1


Participants/Materials	BioBank Japan (179,000 individuals), UK Biobank (361,000 individuals), FinnGen (136,000 individuals); number of traits: 220
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]; FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array, etc.]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array; FinnGen: FinnGen1 ThermoFisher Array, etc.
Genotype Call Methods (software)	BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4; FinnGen: beagle4.1
Association Analysis (software)	For binary traits, SAIGE software was used with age, age², sex, age×sex, age²×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM or PLINK software was used with the same covariates.
Filtering Methods	BBJ: We included imputed variants with Rsq > 0.7. UK Biobank: We excluded the variants with (i) INFO score ≤ 0.8, (ii) MAF ≤ 0.0001 (except for missense and protein-truncating variants annotated by VEP, which were excluded if MAF ≤ 1 × 10⁻⁶), and (iii) P_HWE ≤ 1 × 10⁻¹⁰. FinnGen: We excluded variants with an imputation INFO score < 0.8 or MAF < 0.0001.
Marker Number (after QC)	BBJ: 13,530,797 variants; UK Biobank: 13,791,467 variants; FinnGen: 16,859,359 variants
NBDC Dataset ID	hum0197.v3.gwas.v1 (Click the Dataset ID above, then click each Dataset ID on the destination site to download the data) Dictionary file (BBJ, EUR, META)
Total Data Volume	BBJ: ~1.5 GB for autosomes and ~33 MB for chrX; UK Biobank: ~1.5 GB for autosomes and ~15 MB for chrX; FinnGen: ~740 MB for autosomes and ~20 MB for chrX
Comments (Policies)	NBDC policy
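
The covariate set named for the association analysis (age, age², sex, age×sex, age²×sex, and the top 20 principal components) can be written out explicitly for one individual. This is a minimal sketch for orientation only: the function name is hypothetical, and the 0/1 coding of sex is an assumption that the record does not state.

```python
def build_covariates(age, sex, pcs, n_pcs=20):
    """Assemble the covariate vector for one individual.

    age: age in years
    sex: assumed to be coded 0/1 (the record does not state the coding)
    pcs: genotype principal components; the first n_pcs are used
    """
    if len(pcs) < n_pcs:
        raise ValueError("fewer principal components than requested")
    age_sq = age ** 2
    return [age, age_sq, sex, age * sex, age_sq * sex, *pcs[:n_pcs]]


# Example: a 50-year-old individual with sex coded 1 and all PCs set to 0.0
row = build_covariates(50, 1, [0.0] * 20)
```

The resulting vector has 25 entries: the five age and sex terms followed by the 20 principal components.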

hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1


Participants/Materials	BioBank Japan (179,000 individuals); number of traits: 79
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip
Genotype Call Methods (software)	Eagle, Minimac3
Association Analysis (software)	GWAS: For binary traits, SAIGE software was used with age, age², sex, age×sex, age²×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM was used with the same covariates. Fine-mapping: FINEMAP and SuSiE were used with GWAS summary statistics and in-sample dosage LD, allowing up to 10 causal variants per region.
Filtering Methods	GWAS: We included imputed variants with Rsq > 0.7. For binary traits, variants with MAC < 10 were additionally excluded. Fine-mapping: We defined fine-mapping regions based on a 3 Mb window around each lead variant and merged regions if they overlapped. We excluded the major histocompatibility complex (MHC) region (chr 6: 25–36 Mb) from analysis due to extensive LD structure in the region. For each method, we only included variants from successfully fine-mapped regions while excluding those from failed regions (e.g., due to convergence failure or available memory restrictions).
Marker Number (after QC)	13,531,752 variants (ref: hg19)
NBDC Dataset ID	hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1 (Click the Dataset ID above, then click each Dataset ID on the destination site to download the data) Dictionary file
Total Data Volume	14 GB
Comments (Policies)	NBDC policy
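
The region definition used for fine-mapping (a 3 Mb window around each lead variant, overlapping windows merged, MHC excluded) amounts to a standard interval merge. The sketch below illustrates that idea under two assumptions that the record does not spell out: the window is centered on the lead variant (±1.5 Mb), and any merged region that overlaps chr 6: 25–36 Mb is dropped as a whole. The function name and the input layout are hypothetical.

```python
MHC = ("6", 25_000_000, 36_000_000)  # chr 6: 25-36 Mb, as stated in the record


def fine_mapping_regions(leads, window=3_000_000):
    """Build merged fine-mapping regions from (chrom, position) lead variants."""
    half = window // 2  # assumption: window centered on the lead variant
    by_chrom = {}
    for chrom, pos in leads:
        by_chrom.setdefault(chrom, []).append((max(0, pos - half), pos + half))

    regions = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end:   # windows overlap, so extend the last region
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        for start, end in merged:
            # assumption: drop every region that overlaps the MHC
            if chrom == MHC[0] and start < MHC[2] and end > MHC[1]:
                continue
            regions.append((chrom, start, end))
    return regions


leads = [("1", 10_000_000), ("1", 12_000_000), ("1", 50_000_000), ("6", 30_000_000)]
regions = fine_mapping_regions(leads)
```

In this toy input the two nearby lead variants on chromosome 1 collapse into a single region, the distant one keeps its own window, and the chromosome 6 lead variant is discarded because its window lies inside the MHC.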

JGAS000504


Participants/Materials	Japanese population: 141 individuals
Targets	small RNA-seq
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500]
Library Source	RNA extracted from peripheral blood mononuclear cells
Sample Information (if purchased)	-
Library Construction (kit name)	SMARTer smRNA-Seq Kit
Fragmentation Methods	-
Spot Type	Single-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	100 bp
Mapping Methods	bowtie (GRCh37)
Read Count Detection Methods (software)	featureCounts + miRBase v22
Filtering Methods (QC)	We performed adapter trimming using Cutadapt v1.8 and removed reads with a low quality score (Phred quality score < 20 in >20% of total bases) using fastp v0.20.0. Also, we removed reads with a length of >29 bp or <15 bp, which are not expected to be mature miRNAs. Mature miRNAs detected with ≥1 read in at least half of the individuals were included in the dataset.
Number of miRNAs	343
Japanese Genotype-phenotype Archive Dataset ID	JGAD000621
Total Data Volume	54.7 KB (txt)
Comments (Policies)	NBDC policy
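
The read-level and miRNA-level filters described for JGAS000504 can be expressed as three small predicates. The following is an illustrative sketch only: the actual processing used Cutadapt and fastp, and the function names and the dictionary layout of the count table are hypothetical.

```python
def is_low_quality(phred_scores, min_phred=20, max_low_fraction=0.20):
    """True if more than 20% of the bases have a Phred score below 20."""
    low = sum(1 for q in phred_scores if q < min_phred)
    return low > max_low_fraction * len(phred_scores)


def has_mirna_length(read_length, min_len=15, max_len=29):
    """Reads shorter than 15 bp or longer than 29 bp are removed."""
    return min_len <= read_length <= max_len


def detected_mirnas(counts):
    """Keep miRNAs with >= 1 read in at least half of the individuals.

    counts: dict mapping a miRNA name to a list of per-individual read counts.
    """
    kept = []
    for name, per_individual in counts.items():
        n_detected = sum(1 for c in per_individual if c >= 1)
        if 2 * n_detected >= len(per_individual):
            kept.append(name)
    return kept


counts = {
    "mirna_a": [0, 3, 5, 1],   # detected in 3 of 4 individuals
    "mirna_b": [0, 0, 0, 7],   # detected in 1 of 4 individuals
    "mirna_c": [1, 0, 2, 0],   # detected in exactly half
}
kept = detected_mirnas(counts)
```

With these toy counts, `mirna_a` and `mirna_c` are retained and `mirna_b` is dropped.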

hum0197.v6.eqtl.v1


Participants/Materials	Japanese population: 141 individuals
Targets	eQTL
Target Loci for Capture Methods	-
Platform	small RNA-seq: Illumina [HiSeq 2500]; whole-genome sequencing: Illumina [HiSeq X Ten]
Source	Read count data from JGAS000504 and whole-genome sequencing data generated from genomic DNA extracted from whole blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	small RNA-seq: see JGAS000504; whole-genome sequencing: TruSeq DNA PCR-Free Library Preparation Kit
Genotype/Read Count Detection Methods (software)	Read counts: see JGAS000504. Whole-genome sequencing data were aligned to GRCh37 using BWA-MEM v0.7.13 and analyzed with GATK v3.8-0 following the best practices.
Filtering Methods	Read counts: see JGAS000504. For the whole-genome sequencing data, variants with genotype call rate < 90%, ExcessHet > 60, or Hardy-Weinberg equilibrium test P value < 1.0 × 10⁻¹⁰ were excluded, and genotype imputation was then performed with Beagle v5.1.
Marker Number (after QC)	Read counts: see JGAS000504. Whole-genome sequencing data: 12,171,854 variants
eQTL Detection Methods	We analyzed the association between genetic variants with minor allele frequency (MAF) ≥ 0.01 within a cis-window around each miRNA (±1 Mb of the mature miRNA) and normalized expression values using MatrixEQTL v2.3.
NBDC Dataset ID	hum0197.v6.eqtl.v1 (Click the Dataset ID above to download the data) Dictionary file
Total Data Volume	1.1 MB (txt)
Comments (Policies)	NBDC policy
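
The cis-window used for eQTL detection (variants with MAF ≥ 0.01 within ±1 Mb of the mature miRNA) can be illustrated by a simple selection step. This sketch is not MatrixEQTL itself and covers only the choice of candidate variants. The tuple layout and the function name are hypothetical, and the window is assumed to be measured from the start and end coordinates of the mature miRNA.

```python
def cis_variants(variants, chrom, mirna_start, mirna_end,
                 window=1_000_000, min_maf=0.01):
    """Select IDs of variants inside the cis-window of one mature miRNA.

    variants: iterable of (variant_id, chrom, position, maf) tuples.
    """
    lower = mirna_start - window   # assumption: window extends from both ends
    upper = mirna_end + window
    return [
        vid
        for vid, v_chrom, pos, maf in variants
        if v_chrom == chrom and lower <= pos <= upper and maf >= min_maf
    ]


variants = [
    ("v1", "1", 1_500_000, 0.20),   # inside the window
    ("v2", "1", 3_200_000, 0.20),   # beyond +1 Mb
    ("v3", "1", 2_000_500, 0.005),  # MAF below 0.01
    ("v4", "2", 2_000_000, 0.30),   # different chromosome
    ("v5", "1", 1_000_000, 0.01),   # exactly on the window boundary
]
selected = cis_variants(variants, "1", 2_000_000, 2_000_021)
```

Here `v1` and `v5` are selected; the other three fail the distance, frequency, or chromosome condition.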

hum0197.v9.gwas.GCT.v1


Participants/Materials	Intracranial germ cell tumors (ICD10: C719): 133 cases; controls: 762 individuals
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	Illumina [Infinium Asian Screening Array]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	Infinium Asian Screening Array
Genotype Call Methods (software)	Genotyping: GenomeStudio; haplotype phasing: shapeit2; imputation: minimac3
Association Analysis (software)	PLINK2
Filtering Methods	Sample QC: samples were excluded if (1) per-sample genotyping call rate < 0.97, (2) PI_HAT > 0.17, or (3) non-East Asian ancestry. Variant QC: variants were excluded if (1) genotyping call rate < 0.99, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium in controls < 1.0 × 10⁻⁵, or (4) > 10% allele frequency difference with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: variants were excluded if (1) Rsq < 0.7 or (2) minor allele frequency < 0.5%.
Marker Number (after QC)	7,803,874 autosomal variants; 181,867 X-chromosomal variants
NBDC Dataset ID	hum0197.v9.gwas.GCT.v1 (Click the Dataset ID above to download the data) Dictionary file
Total Data Volume	248 MB (txt)
Comments (Policies)	NBDC policy

hum0197.v10.gwas.v1


Participants/Materials	BioBank Japan (161,801 individuals), UK Biobank (377,581 individuals). Cases: autoimmune diseases [rheumatoid arthritis (ICD10: M05), Graves' disease (ICD10: E050), type 1 diabetes (ICD10: E10)]; allergic diseases [bronchial asthma (ICD10: J45), atopic dermatitis (ICD10: L20), pollinosis (ICD10: J301)]. Controls: participants without autoimmune or allergic diseases (samples overlap among disease groups within each cohort). Number of traits: 9
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array
Genotype Call Methods (software)	BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4
Association Analysis (software)	SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis, adjusting for sample overlap between GWAS summary data.
Filtering Methods	Variants with Rsq < 0.7 and variants with MAF < 0.005 were excluded.
Marker Number (after QC)	BBJ: 8,374,220 autosomal variants for individual trait / 8,369,174 autosomal variants for meta-analysis; UK Biobank: 10,864,380 autosomal variants for individual trait / 10,858,065 autosomal variants for meta-analysis; BBJ + UK Biobank: 5,965,154 autosomal variants for meta-analysis
NBDC Dataset ID	hum0197.v10.gwas.v1 (Click the Dataset ID above, then click the file download link on the destination site to download the data) Dictionary file
Total Data Volume	BBJ: ~760 MB for individual trait / ~430 MB for multi-trait meta-analysis; UK Biobank: ~1.1 GB for individual trait / ~550 MB for multi-trait meta-analysis; BBJ + UK Biobank: ~310 MB for multi-trait meta-analysis
Comments (Policies)	NBDC policy

JGAS000543 / JGAS000593


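Participants/Materials	COVID-19 patients (ICD10: U071): 30 + 43 cases; healthy controls: 31 + 44 individuals
Targets	scRNA-seq
Target Loci for Capture Methods	-
Platform	Illumina [NovaSeq 6000]
Library Source	RNA extracted from peripheral blood mononuclear cells
Sample Information (if purchased)	-
Library Construction (kit name)	Chromium Next GEM Single Cell 5' Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A
Fragmentation Methods	Enzymatic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	91 bp
Japanese Genotype-phenotype Archive Dataset ID	JGAD000662; JGAD000722
Total Data Volume	1.3 + 2.0 TB (fastq, xlsx [clinical information])
Comments (Policies)	NBDC policy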

hum0197.v12.MAG.v1


Participants/Materials	Japanese gut microbial genomes, obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and public data (DRA006684)
Targets	Metagenome
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500/3000, NovaSeq 6000]
Library Source	Sequences obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and DRA006684
Sample Information (if purchased)	-
MAG Construction Methods	de novo assembly with metaSPAdes, followed by binning with DAS Tool (MetaBAT2, MaxBin2, CONCOCT)
DDBJ Sequence Read Archive ID	JGA MAG: 20220531NSUB000031HIGH_JGA_JMAG_GENOME_*.acclist (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014188 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014191 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014192 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531). TPA MAG: EMNX01000001-EMNX01000025, EMNY01000001-EMNY01000068, EMNZ01000001-EMNZ01000149, EMOA01000001-EMOA01000067 (DRA006684); DRA014184 (DRA006684)
Total Data Volume	JGA MAG: 153 GB (fasta); DRA014186: 11.5 GB (fasta); DRA014188: 11.9 GB (fasta); DRA014191: 12.2 GB (fasta); DRA014192: 5.75 GB (fasta); TPA MAG: 11.9 MB (fasta); DRA014184: 3.65 GB (fasta)
Comments (Policies)	NBDC policy

hum0197.v12.VIRUS.v1


Participants/Materials	Japanese gut microbial genomes, obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and public data (DRA006684)
Targets	NGS (WGS)
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500/3000, NovaSeq 6000]
Library Source	Sequences obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and DRA006684
Sample Information (if purchased)	-
Viral Genome Construction Methods	de novo assembly with metaSPAdes, followed by detection of viral genomes with VirFinder and VirSorter.
DDBJ Sequence Read Archive ID	JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: BRDB01000001-BRDB01028816; DRA006684: EMNW01000001-EMNW01002579
Total Data Volume	JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: 1.09 GB (fasta); DRA006684: 98.3 MB (fasta)
Comments (Policies)	NBDC policy

hum0197.v12.CRISPR.v1


Participants/Materials	Japanese gut microbial genomes, obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and public data (DRA006684)
Targets	Metagenome
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500/3000, NovaSeq 6000]
Library Source	Sequences obtained from JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531 and DRA006684
Sample Information (if purchased)	-
CRISPR Sequence Construction Methods	MinCED was applied to the MAG sequences.
DDBJ Sequence Read Archive ID	DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014184 (DRA006684)
Total Data Volume	DRA014184: 17.9 MB (fasta); DRA014186: 1.43 MB (fasta)
Comments (Policies)	NBDC policy

JGAS000600


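Participants/Materials	Japanese individuals: 88 (shotgun sequencing); healthy individuals: 73 (120 samples) (shotgun sequencing), comprising 73 samples with DNA extracted by the phenol-chloroform method and 47 samples with DNA extracted using the DNeasy PowerSoil Pro Kit; Japanese individuals: 5 (deep shotgun sequencing)
Targets	Metagenome
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 3000, NovaSeq 6000]
Library Source	DNA extracted from stool
Sample Information (if purchased)	-
Library Construction (kit name)	KAPA Hyper Prep Kit
Fragmentation Methods	Ultrasonic fragmentation (Covaris)
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp
Japanese Genotype-phenotype Archive Dataset ID	JGAD000729
Total Data Volume	2.6 TB (fastq)
Comments (Policies)	NBDC policy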

hum0197.v16.gwas.v1


Participants/Materials	BioBank Japan (180,215 individuals), UK Biobank (377,441 individuals); meta-analysis including summary statistics from FinnGen, BCAC, and PRACTICAL (breast cancer: 648,746 individuals; prostate cancer: 482,080 individuals). Cases: biliary tract cancer (ICD10: C22.1, 23-24), breast cancer (ICD10: C50), cervical cancer (ICD10: C53), colorectal cancer (ICD10: C18-20), endometrial cancer (ICD10: C54), esophageal cancer (ICD10: C15), gastric cancer (ICD10: C16), hepatocellular carcinoma (ICD10: C22.0), lung cancer (ICD10: C34), non-Hodgkin lymphoma (ICD10: C82-83), ovarian cancer (ICD10: C56), pancreatic cancer (ICD10: C25), prostate cancer (ICD10: C61). Controls: participants without cancer (samples overlap among disease groups within each cohort). Number of traits: 15
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]; FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array, etc.]; BCAC: Illumina [iCOGS OncoArray]; PRACTICAL: Illumina [iCOGS OncoArray]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array; FinnGen: FinnGen1 ThermoFisher Array, etc.; BCAC: Infinium OncoArray-500K v1.0 BeadChip Kit; PRACTICAL: Infinium OncoArray-500K v1.0 BeadChip Kit
Genotype Call Methods (software)	BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4; FinnGen: beagle4.1; BCAC: IMPUTE2; PRACTICAL: IMPUTE2
Association Analysis (software)	SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis, adjusting for sample overlap between GWAS summary data.
Filtering Methods	Sample QC and variant QC for each dataset: please refer to the Read me file. Variants with Rsq < 0.7 and variants with MAF < 0.01 were excluded.
Marker Number (after QC)	BBJ: 13MN (7,398,798), each cancer (7,442,557 (7,420,485-7,444,681)); UK Biobank: 13MN (9,602,853), each cancer (9,620,786 (9,620,343-9,620,935)); BBJ + UK Biobank: 13MN (5,374,018), each cancer (5,696,155 (5,677,934-5,698,357)); BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 5,104,756; BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 5,105,796; BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 5,100,089 (for each cancer, values are given as mean (min-max))
NBDC Dataset ID	hum0197.v16.gwas.v1 (Click the Dataset ID above, then click the file download link on the destination site to download the data) Dictionary file
Total Data Volume	BBJ: 13MN (287 MB), each cancer (625 (605-633) MB); UK Biobank: 13MN (362 MB), each cancer (841 (814-859) MB); BBJ + UK Biobank: 13MN (202 MB), each cancer (260 (255-264) MB); BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 242 MB; BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 243 MB; BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 253 MB (for each cancer, values are given as mean (min-max))
Comments (Policies)	NBDC policy

hum0197.v17.hic-gwas.v1


Participants/Materials	Hunner-type interstitial cystitis (ICD10: N301): 144 cases; controls: 41,516 individuals
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	Illumina [Infinium Asian Screening Array]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	Infinium Asian Screening Array
Genotype Call Methods (software)	Genotyping: GenomeStudio; haplotype phasing: shapeit4; imputation: minimac4
Association Analysis (software)	SAIGE
Filtering Methods	Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Japanese ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10⁻¹⁰, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%.
Marker Number (after QC)	7,909,790 variants (hg19)
NBDC Dataset ID	hum0197.v17.hic-gwas.v1 (Click the Dataset ID above to download the data) Dictionary file
Total Data Volume	700 MB (txt)
Comments (Policies)	NBDC policy
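
The post-imputation QC for hum0197.v17.hic-gwas.v1 excludes imputed variants with Rsq < 0.7 and those with minor allele frequency < 0.5%. Read as a keep rule, a variant must satisfy both Rsq ≥ 0.7 and MAF ≥ 0.005. The sketch below assumes this "exclude if either condition holds" interpretation and derives MAF from the frequency of the alternate allele. The function name is hypothetical.

```python
def passes_post_imputation_qc(rsq, alt_freq, min_rsq=0.7, min_maf=0.005):
    """Keep an imputed variant only if it is well imputed and common enough.

    rsq: imputation quality metric (Rsq)
    alt_freq: frequency of the alternate allele (0 to 1)
    """
    maf = min(alt_freq, 1.0 - alt_freq)   # minor allele frequency
    return rsq >= min_rsq and maf >= min_maf


examples = {
    "well_imputed_common": passes_post_imputation_qc(0.9, 0.20),
    "poorly_imputed": passes_post_imputation_qc(0.6, 0.20),
    "rare": passes_post_imputation_qc(0.9, 0.003),
    "rare_reference_allele": passes_post_imputation_qc(0.9, 0.998),
}
```

Only the first example passes. The last one fails because the minor allele is the reference allele, with a frequency of about 0.002.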

hum0197.v18.gwas.v1


Participants/Materials	Japanese population: 524 individuals (423 microbes); Japanese population: 362 individuals (306 metabolites); Japanese population: 524 individuals (KEGG Gene Ortholog and KEGG Pathway)
Targets	genome wide SNPs
Target Loci for Capture Methods	-
Platform	SNP array: Illumina [Infinium Asian Screening Array]; whole genome sequencing: Illumina [HiSeq X Ten]; metagenome shotgun sequencing: Illumina [HiSeq 2500/3000, NovaSeq 6000]
Source	DNA extracted from peripheral blood
Sample Information (if purchased)	-
Reagents (Kit, Version)	SNP array: Infinium Asian Screening Array; whole genome sequencing: TruSeq DNA PCR-Free Library Preparation Kit; metagenome shotgun sequencing: KAPA Hyper Prep Kit
Genotype Call Methods (software)	SNP array: genotyping: GenomeStudio; haplotype phasing: shapeit4; imputation: minimac4. WGS: BWA-MEM v0.7.13, GATK v3.8-0
Association Analysis (software)	PLINK2
Filtering Methods	SNP array data: Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Asian ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10⁻¹⁰, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 1%. WGS: We excluded variants with genotype call rate < 90%, ExcessHet > 60, or Hardy-Weinberg P < 1.0 × 10⁻¹⁰. After imputation with Beagle v5.1, we excluded imputed variants with minor allele frequency < 1%.
Marker Number (after QC)	Gut microbiome / KEGG (SNP array): 7,213,470 variants (hg19); blood metabolites (WGS): 6,840,258 variants (GRCh37)
NBDC Dataset ID	hum0197.v18.gwas.v1 (gut microbiome, blood metabolites, KEGG) (Click the link above, then click the file download link on the destination site to download the data) Dictionary file
Total Data Volume	Gut microbiome: 206 GB; blood metabolites: 90.7 GB; KEGG: 300 MB
Comments (Policies)	NBDC policy

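Data provider information

Principal Investigator: Yukinori Okada

Affiliation: Department of Statistical Genetics, Osaka University Graduate School of Medicine

Project / Group Name: -

Funds / Grants (Research Project Number):

Name	Title	Project Number
Japan Agency for Medical Research and Development (AMED), Advanced Research and Development Programs for Medical Innovation, solo type (PRIME)	Crosstalk among the microbiome, host, disease, and drug discovery unraveled by statistical genetics	JP19gm6010001
Japan Agency for Medical Research and Development (AMED), Advanced Research and Development Programs for Medical Innovation, step type (FORCE)	Elucidation of disease-specific microbiomes by metagenome-wide association studies and implementation of personalized medicine	JP20gm4010006
Japan Agency for Medical Research and Development (AMED), Practical Research Project for Rare/Intractable Diseases	Elucidation of the pathology of pulmonary alveolar proteinosis and in silico drug repositioning using trans-omics analysis	JP20ek0109413
Japan Agency for Medical Research and Development (AMED), Practical Research Project for Allergic Diseases and Immunology	Promotion of nucleic acid genome drug discovery for autoimmune diseases using disease genome information	JP19ek0410041
Japan Agency for Medical Research and Development (AMED), Practical Research Project for Allergic Diseases and Immunology	Realization of genomic personalized medicine for rheumatoid arthritis through cross-sectional integration of immune omics information	JP21ek0410075
Japan Agency for Medical Research and Development (AMED), Platform Program for Promotion of Genome Medicine	Implementation of genomic personalized medicine for the Japanese population based on statistical genetics	JP21km0405211
Japan Agency for Medical Research and Development (AMED), Platform Program for Promotion of Genome Medicine	Elucidation of disease pathology, personalized medicine, and drug discovery for psoriasis through next-generation genomics research	JP21km0405217
KAKENHI Grant-in-Aid for Scientific Research (A)	Elucidation of disease pathology and tissue specificity using trans-omics analysis and whole-genome sequencing	19H01021
KAKENHI Grant-in-Aid for Scientific Research (A)	Elucidation of the dynamics of immune and allergic diseases by integrated sequencing analysis	22H00476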

Related publications

	Title	DOI	Dataset ID
1	Metagenome-wide association study of gut microbiome revealed novel aetiology of rheumatoid arthritis in the Japanese population.	doi: 10.1136/annrheumdis-2019-215743	JGAD000290
2	Genetic determinants of risk in autoimmune pulmonary alveolar proteinosis.	doi: 10.1038/s41467-021-21011-y	hum0197.v2.gwas.v1
3	A metagenome-wide association study of gut microbiome in patients with multiple sclerosis revealed novel disease pathology.	doi: 10.3389/fcimb.2020.585973	JGAD000363
4	A cross-population atlas of genetic associations for 220 human phenotypes.	doi: 10.1038/s41588-021-00931-x	hum0197.v3.gwas.v1
5	Metagenome-wide association study revealed disease-specific landscape of the gut microbiome of systemic lupus erythematosus in Japanese	doi: 10.1136/annrheumdis-2021-220687	JGAD000427
6	Whole gut virome analysis of 476 Japanese revealed a link between phage and autoimmune disease	doi: 10.1136/annrheumdis-2021-221267	JGAD000532
7	Insights from complex trait fine-mapping across diverse populations	doi: 10.1101/2021.09.03.21262975	hum0197.v5.gwas.v1 hum0197.v5.finemap.v1
8	Genetic architecture of microRNA expression and its link to complex diseases in the Japanese population.	doi: 10.1093/hmg/ddab361	JGAD000621 hum0197.v6.eqtl.v1
9	Multi-trait and cross-population genome-wide association studies across autoimmune and allergic diseases identify shared and distinct genetic components.	doi: 10.1136/annrheumdis-2022-222460	hum0197.v10.gwas.v1
10	DOCK2 is involved in the host genetics and biology of severe COVID-19	doi: 10.1038/s41586-022-05163-5	JGAD000662
11	Prokaryotic and viral genomes recovered from 787 Japanese gut metagenomes revealed microbial features linked to diets, populations, and diseases	doi: 10.1016/j.xgen.2022.100219	hum0197.v12
12	Reconstruction of the personal information from human genome reads in gut metagenome sequencing data	doi: 10.1038/s41564-023-01381-3	JGAD000729
13	Pan-cancer and cross-population genome-wide association studies dissect shared genetic backgrounds underlying carcinogenesis	doi: 10.1038/s41467-023-39136-7	hum0197.v16.gwas.v1
14	Genome-wide association analysis identifies susceptibility loci within the major histocompatibility complex region for Hunner-type interstitial cystitis	doi: 10.1016/j.xcrm.2023.101114	hum0197.v17.hic-gwas.v1
15	Analysis of gut microbiome, host genetics, and plasma metabolites reveals gut microbiome-host interactions in the Japanese population	doi: 10.1016/j.celrep.2023.113324	hum0197.v18.gwas.v1

Users of controlled-access data

Principal Investigator	Affiliation	Country/Region	Research Title	Data in Use (Dataset ID)	Period of Data Use
Ilana Brito	Cornell University Meinig School of Biomedical Engineering	United States of America	Comparative metagenomics of lupus patients' microbiomes	JGAD000290, JGAD000363, JGAD000427, JGAD000532	2022/05/12-2024/05/04
Yongxin Li	Department of Chemistry, The University of Hong Kong	Hong Kong	Comparison of gut bacterial diversity and composition in MS/EAE	JGAD000363	2022/09/19-2024/07/01
Tina Fuchs	Institute for Clinical Chemistry, Medical Faculty Mannheim, Heidelberg University	Germany	Investigating the clonality of VIREM cells in COVID-19 patients	JGAD000662, JGAD000722	2024/02/26-2024/12/31
